
Learning language by machine

5 February 2021

Former CMS user Mait Müntel left physics to found Lingvist, an education company harnessing big data and artificial intelligence to accelerate language learning.

Lingvist CEO Mait Müntel talks to Rachel Bray

Mait Müntel came to CERN as a summer student in 2004 and quickly became hooked on particle physics, completing a PhD in the CMS collaboration in 2008 with a thesis devoted to signatures of doubly charged Higgs bosons. Continuing in the field, he was one of the first to do shifts in the CMS control room when the LHC ramped up. It was then that he realised that the real LHC data looked nothing like the Monte Carlo simulations of his student days. Many things had to be rectified, but Mait admits he was none too fond of coding and didn’t have any formal training. “I thought I would simply ‘learn by doing’,” he says. “However, with hindsight, I should probably have been more systematic in my approach.” Little did he know that, within a few years, he would be running a company with around 40 staff developing advanced language-learning algorithms.

Memory models

Despite spending long periods in the Geneva region, Mait had not found the time to pick up French. Frustrated, he began to take an interest in the use of computers to help humans learn languages at an accelerated speed. “I wanted to analyse from a statistical point of view the language people were actually speaking, which, having spent several years learning both Russian and English, I was convinced was very different to what is found in academic books and courses,” he says. Over the course of one weekend, he wrote a software crawler that enabled him to download a collection of French subtitles from a film database. His next step was to study memory models to understand how one acquires new knowledge, calculating that, if a computer program could intelligently decide what would be optimal to learn in the next moment, it would be possible to learn a language in only 200 hours. He started building some software using ROOT (the object-oriented program and library developed by CERN for data analysis) and, within two weeks, was able to read a proper book in French. “I had included a huge book library in the software and as the computer knew my level of vocabulary, it could recommend books for me. This was immensely gratifying and pushed me to progress even further.” Two months later, he passed the national French language exam in Estonia.
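As a rough illustration of the idea described above, and not Lingvist's actual method, the following Python sketch counts word frequencies in a handful of subtitle lines and then picks the next word to study. The exponential forgetting curve, the "frequency times chance of having forgotten" priority rule, the sample sentences and all function names are assumptions made for this example only.

```python
import math
import re
from collections import Counter


def word_frequencies(lines):
    """Count how often each word occurs in an iterable of subtitle lines."""
    counts = Counter()
    for line in lines:
        # [^\W\d_]+ matches runs of letters, including accented ones.
        counts.update(re.findall(r"[^\W\d_]+", line.lower()))
    return counts


def recall_probability(elapsed_s, strength_s):
    """Assumed memory model: recall decays exponentially with elapsed time."""
    return math.exp(-elapsed_s / strength_s)


def next_word(counts, memory, now_s):
    """Return the word with the highest frequency-weighted chance of being forgotten.

    memory maps word -> (last_seen_s, strength_s); words missing from
    memory are treated as completely unknown.
    """
    def priority(word):
        if word not in memory:
            return counts[word]
        last_seen_s, strength_s = memory[word]
        forgotten = 1.0 - recall_probability(now_s - last_seen_s, strength_s)
        return counts[word] * forgotten

    return max(counts, key=priority)


# Hypothetical sample data standing in for a downloaded subtitle collection.
subtitles = ["Je ne sais pas.", "Je sais, je sais.", "Pas de problème."]
counts = word_frequencies(subtitles)
memory = {"je": (0.0, 3600.0)}  # "je" was reviewed one minute ago
print(next_word(counts, memory, now_s=60.0))  # sais
```

Here "je" and "sais" are equally common, but "je" was reviewed a minute earlier, so the sketch recommends "sais". A real system would have to estimate each learner's memory strength from their answers rather than assume it.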

Mait became convinced that he had to do something with his idea. So he went on holiday, and hired two software developers to adapt his code for the web. Whilst on holiday, he happened to meet a friend of a friend, who helped him set up Lingvist as a company. Estonia, he says, has a fantastic start-up and software-development culture thanks to Skype, which was invented there. Later, at a conference, Mait met the technical co-founder of Skype, who coincidentally had been working on software to accelerate human learning. The co-founder dropped his own attempts and became Lingvist’s first investor.

The pair secured a generous grant from the European Union Horizon 2020 programme and things were falling into place, though it wasn’t all easy, says Mait: “You can use the analogy of sitting in a nice warm office at CERN, surrounded by beautiful mountains. In the office, you are safe and protected, but if you go outside and climb the mountains, you encounter rain and hail, it is an uphill struggle and very uncomfortable, but immensely satisfying when you reach the summit. Even if you work more than 100 hours per week.”

Lingvist currently has three million users, and Mait is convinced that the technology can be applied to all types of education. “What our data have demonstrated is that levels of learning in people are very different. Short-term memory capabilities can differ between five minutes and two seconds! Currently, based on our data, the older generation has much better memory characteristics. The benefit of our software is that it measures memory, and no matter one’s retention capabilities, the software will help improve retention rates.”

New talents

In a future where artificial intelligence makes many jobs extinct and many people need to retrain, competitiveness will be derived from the speed at which people can learn, says Mait. He is now building Lingvist’s data-science research team to grow the company to its full potential, and is always on the lookout for new CERN talent. “Traditionally, physicists have excellent modelling, machine-learning and data-analysis skills, even though they might not be aware of it,” he says.
