Friday, January 15, 2016

Why do we limit the list of cities in the Early Beta for the LeovaTravel Chrome Extension?

If you've gotten here, you're probably wondering why we kept the list of cities in the Early Beta of our LeovaTravel Chrome Extension so short. This post attempts to explain why in the simplest terms possible.

[Of course, if any of this sounds too complex at any point, feel free to write to me at nikolai [a.t.] leova.io and I'd be happy to share some of the AI / ML love around.]

This blog post is adapted from a Reddit post I made earlier today:

If you look closely at the speech transcription in the demo (and our other demos on leova.io and onesevene.com), you will see that a lot of the spoken language is mis-transcribed. Ordinarily, that would break any natural language processing system like Siri / Cortana / just about everybody else. So how do we solve this problem?
Our secret weapon is a machine learning layer sitting just behind the speech recognizer. The machine learning layer trains on people's accents, the names of products and cities, and how those city / product names are pronounced. The net result is an accuracy even greater than Google's 92%. With the machine learner, the AI is now able to structure casual conversation at levels of complexity never seen before.
The machine learning allows us to offer outstanding accuracy even for complex product names that are often mispronounced (most speech recognition fails miserably at this). Ever go to a fancy restaurant and mispronounce the name of a dish? Or have the guy next to you at the mall mispronounce Versace as ver-say-ce? Yep - that happens a lot. And our ML helps fix this in real time.
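To give a rough feel for what "fixing" a mis-transcribed name can look like, here is a minimal sketch. It is not our actual machine learner: it simply fuzzy-matches the transcribed text against a list of known names using Python's standard difflib module. The KNOWN_CITIES list, the correct_city function and the 0.6 cutoff are all made up for illustration.

```python
import difflib

# Hypothetical sample of known city names; a real geo-database is far larger.
KNOWN_CITIES = ["Paris", "Barcelona", "Reykjavik", "Ljubljana", "Edinburgh"]


def correct_city(transcribed, known=KNOWN_CITIES, cutoff=0.6):
    """Return the known city closest to a possibly mis-transcribed name, or None."""
    # Compare in lowercase so capitalization from the recognizer doesn't matter.
    by_lower = {city.lower(): city for city in known}
    matches = difflib.get_close_matches(
        transcribed.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    print(correct_city("Barselona"))
    print(correct_city("xyzzy"))
```

A plain string-similarity match like this only works for names that are already in the list, which is one way to see why the set of supported cities matters so much.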
Ordinarily, a reliable level of machine training for the set of 10,000+ worldwide destinations would take us about two more months, and we'd have to put off the early beta release for 60 - 90 days. Instead, we took a wee bit of risk and released Leova with machine training (MT) for about 291 of the world's most popular cities, counting on adding more cities as people asked for them.
As and when our beta users ask for destinations that aren't in the system, we update the geo-database and the machine training with them on a priority basis. Soon, we will have all destinations built in and trained for. If you're a beta user and want to speed up the process, just email me a broad range of cities / destinations you travel to (or want to go to), and I'll have it sorted out in the next 12 hours.
A small downside to using the machine learning system is that, with a small beta test group, the risk of someone teaching the system something wrong is too great. (You've probably seen instances where Google messes up and says something hilariously inappropriate due to training sets that are too small:
http://searchengineland.com/when-google-gets-it-wrong-direct-answers-with-debatable-incorrect-weird-content-223073 
http://www.telegraph.co.uk/technology/google/6161567/The-20-funniest-suggestions-from-Google-Suggest.html
http://www.searchenginepeople.com/blog/google-ads-fail.html)
To avoid this horrifying scenario, the machine learner batches "learnings" for us to approve on a nightly basis. The learnings scale exponentially, so human intervention goes down rapidly. But for the time being, our early beta users are gonna be put through the process of helping the machine learn.
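As an illustration of the nightly approval idea, here is a minimal sketch of a review queue: proposed learnings sit in a pending batch and only reach the live corrections once a reviewer approves them. This is not our production code; the Learning and ReviewQueue names and the approve callback are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Learning:
    heard: str  # what the speech recognizer produced
    meant: str  # what the learner believes the speaker meant


@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)   # waiting for human review
    approved: dict = field(default_factory=dict)  # heard -> meant, live corrections

    def propose(self, heard, meant):
        """Queue a learning; it has no effect until it is approved."""
        self.pending.append(Learning(heard.lower(), meant))

    def review(self, approve):
        """Run approve(learning) on the whole batch, keep the approved ones, clear the batch."""
        for item in self.pending:
            if approve(item):
                self.approved[item.heard] = item.meant
        self.pending = []


if __name__ == "__main__":
    queue = ReviewQueue()
    queue.propose("ver-say-ce", "Versace")
    queue.propose("paris", "something rude")
    # Stand-in for the human reviewer: only accept the sensible learning.
    queue.review(lambda item: item.meant == "Versace")
    print(queue.approved)
```

Keeping the pending batch separate from the approved corrections means a bad suggestion from one user never affects anyone else until a person has looked at it.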
For reference, here's a list of cities that are currently in the geo-database: https://onesevene.com/leova/cities.html If you wanna add to this list, let me know and I will get it done asap :-)
