Mind the gap
How the translation startup Lesan uses AI to connect emerging economies to the www
For an English speaker, the internet is a house with open doors. Not so for people who speak Tigrinya, one of the many local languages in Ethiopia. Conventional machine translators don’t serve so-called low-resource languages, leaving most of the web inaccessible to the millions of people who only speak their mother tongue. This is where Lesan comes in. Young Entrepreneurs in Science alumnus Asmelash Teka found a way to open up the web to his community by turning his PhD into a business. Continue reading for our interview with Asmelash about the pitfalls and potential of machine learning and his journey to entrepreneurship.
What does your startup Lesan do, in a nutshell?
At Lesan we build machine translation systems for low-resource languages. Because online content is scarce, Lesan uses offline sources to create the baseline system. Let’s say that we use a Tigrinya book and an English version of that book. First, we take photos of the pages and run an optical character recognition tool, which turns the images into strings of text. The next step is aligning the text sentence by sentence. Millions of those aligned pairs are fed into the translation engine. When you give the system a sentence, it generates a translation based on these examples.
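As a rough illustration of the sentence-alignment step described above, here is a minimal Python sketch. It is not Lesan's actual pipeline: the function names `split_sentences` and `align`, the length-based cost, and the merge penalty of 0.1 are all assumptions made for this example. It assumes the OCR text is already available as plain strings and pairs sentences by comparing character lengths, allowing one sentence on either side to match two on the other.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after . ! ? or the Ethiopic full stop."""
    parts = re.split(r"(?<=[.!?።])\s+", text.strip())
    return [p for p in parts if p]


def align(src, tgt):
    """Pair up two sentence lists using a simple length-based dynamic program.

    Allowed groupings are 1-1, 1-2 and 2-1 sentences. The cost of a grouping
    is the relative difference in character length, plus a small penalty
    (hypothetical value) whenever two sentences are merged.
    """
    inf = float("inf")
    n, m = len(src), len(tgt)
    # cost[i][j]: cheapest way found to align src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in ((1, 1), (1, 2), (2, 1)):
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                step = abs(s_len - t_len) / max(s_len, t_len, 1)
                step += 0.1 * (di + dj - 2)  # merge penalty
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    if cost[n][m] == inf:
        raise ValueError("no alignment possible with 1-1, 1-2 and 2-1 groupings")
    # Walk the back-pointers from the end to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((" ".join(src[pi:i]), " ".join(tgt[pj:j])))
        i, j = pi, pj
    return pairs[::-1]


if __name__ == "__main__":
    # Placeholder strings stand in for real book text.
    source = split_sentences("aaaa. bbbbbbbbbb. cc.")
    target = split_sentences("AAAA. BBBBB. BBBBB. CC.")
    for s, t in align(source, target):
        print(s, "<->", t)
```

Length alone is a crude signal. A production system would likely add a bilingual dictionary or sentence embeddings and still rely on human post-editing, as mentioned in the interview.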
That sounds quite laborious.
We normally tap into previously translated sources, because that is much cheaper and easier to scale. Our interns in Ethiopia help us with scanning the books, alignment and post-editing. This way offline content can be sourced through the community that owns that language.
What sparked your idea to build Lesan?
Take Tigrinya, my mother tongue, for example. It has millions of speakers, yet it is not included in any of the major commercial machine translation services. There’s simply not enough online content to scrape the web and create these systems. That’s the gap we’re trying to fill. Versions of this problem have been with me ever since I was an undergrad in Ethiopia, when I localised the operating system Ubuntu to Tigrinya and Amharic. I continued my research in Europe on applied machine learning. Combining my skills and experiences, I just connected the dots.
You then brought your idea to the Young Entrepreneurs in Science workshop …
The Young Entrepreneurs in Science workshop really helped me to push things forward, since I was contemplating my next steps after my PhD at that time. I was considering entrepreneurship, but I didn’t know how to go about it. Day 1 of the workshop started with self-reflection, as if it was designed for me! We were all in a room with scientists of high caliber, bouncing ideas back and forth. It became clear to me that this was, in fact, what I wanted to do. It just felt right! I wanted more.
How did you meet your co-founder Adam? How do you find the right person?
I learned about Entrepreneur First, which helps individuals find co-founders through a three-month programme. There were 50 of us in Berlin, pitching ideas and getting to know each other. This is how I met Adam! We are a good match in terms of what we want to do, and our skills complement each other. I’m the technical person, he’s the commercial guy. That helps us appreciate each other.
What were some of the obstacles you’ve had to overcome on the road to entrepreneurship?
Graduating from Entrepreneur First to get our investment was the first hurdle. We cleared that. However, if you’re from Ethiopia, it can take months to open a business bank account. Everybody else received their funding, and we waited for two months, because I couldn’t find any bank that would accept somebody with my background. Sadly, they just don’t see somebody with my profile founding a business in Berlin …
… After our pre-seed funding from EF, we had the opportunity to pitch to investors. Their feedback was: too early and not enough customers. We then decided to go back into building mode, where we are now, to push our product and get a couple of customers to use our translation service.
Speaking of obstacles in terms of tech, machine learning does have its pitfalls …
Most of the sources that we rely on are very religious in nature, because religious texts in local languages often come with English translations. They are a great start; however, they make the translator sound very religious. In what is called “domain adaptation”, we also collect contemporary sentences in the target language to feed into the base translator. Tigrinya, for example, is a morphologically rich language. This is what makes automated translation really difficult, and we’re only at the beginning.
On the flipside, imagine being in a library with thousands of books. How long would it take to make such a library accessible to a community like mine in Ethiopia? It would take hundreds of years to do it manually! That’s where algorithms can actually have huge potential, when done well and used in the proper places.
What do you like most about being the CTO of your own company?
First of all, I get to do something I truly love, and I get paid to do that. If one day it works out, the one thing that I want to be remembered for is opening up the web for my community. There’s nothing that’s more compelling to me.