This episode features Dr. Ni Lao, Chief Scientist and co-founder of Mosaix, an AI platform that enables enterprise customers to create the smartest intelligent voice assistants. He graduated from the Language Technologies Institute at Carnegie Mellon University. Prior to Mosaix, he worked for more than five years on language understanding and question answering at Google. His interests lie in information retrieval, natural language processing, machine learning and more.
Dr. Lao shares his research journey in machine learning and natural language processing and his view on the future of weakly supervised learning.
Highlight: The Concepts of Supervised Learning, Unsupervised Learning and Reinforcement Learning
Robin.ly is a content platform dedicated to helping engineers and researchers develop leadership, entrepreneurship, and AI insights to scale their impacts in the new tech era.
Wenli: As the Chief Scientist, you must be very smart. The first question I’d like to ask smart people is: How did you grow up? Were you athletic? Were you nerdy?
Ni: I think I am more nerdy than athletic. When I was young, people called me a “panda”. I was very fat and didn’t exercise much. That changed over time, but initially it was like that. I spent most of the time by myself, just observing nature and trying to solve all the problems that teachers threw at me. It was pretty much like that.
Wenli: So you graduated from Carnegie Mellon University with a PhD degree from the Language Technologies Institute.
Wenli: Your professor was William Cohen?
Ni: Yes. My thesis adviser.
Wenli: What was the story? Why did you choose to study with him? What made him accept you as a student?
Ni: He was interested in me because I was able to get things done, innovate, and produce new algorithms that nobody else had produced.
Wenli: Can you give us an example of that?
Ni: Yeah. So actually it was in the summer, when I wanted to switch advisers. I talked to him and he said: "Oh, I have this project. Can you try it out?", which I did. So during the summer, I wrote an algorithm that does reasoning on a knowledge graph and was designed for recommendation and search. He was very, very impressed. He immediately wanted me to join his group.
Wenli: Okay. After you joined his group and through the years learning with him, what did you learn from William Cohen?
Ni: Actually a lot. He is a very interesting guy. He has a ponytail. He has seven banjos. He was always happy and never pushed you into anything. But he knows what is correct. Generally, he showed you that you should enjoy your life.
Wenli: Sounds like a very lighthearted person to work with.
Ni: But he is extremely smart. Whatever you talk to him about, he always understands what’s happening. If he doesn’t understand, he will point you in the right direction to investigate.
Wenli: Interesting. During your PhD years, you worked on natural language processing and knowledge-based systems. Knowledge-based systems weren’t as popular then as they are now.
Ni: They weren’t as popular in industry as they are now. But knowledge bases have been around for a very long time. In the research community, people have been studying them for decades. I think it was Google that made them very popular. And I happened to be on a project that was funded by Google. Naturally, I went to Google.
Wenli: Yeah, that explains why you chose to work for Google after graduating. My question is: what were some of the limitations or problems you were facing on the first project at Google?
Ni: Oh, you mean after I joined Google, when was the…
Wenli: While you were working for Google, because Google was the first one to launch a knowledge-based system.
Ni: Oh, yeah. I worked on that for some time.
Wenli: What was the challenge you were facing?
Ni: It was just machine learning and natural language understanding. It was the core research topic: how can you understand everything on the web and organize it into the knowledge graph? That is still unsolved today.
Wenli: Is it still? Any progress?
Ni: It’s still unsolved. The progress is very slow. Most of the knowledge graphs you can see today are generated by humans, or semi-automatically with human help. There is always some human who comes in and verifies that a fact is correct, or even worse, humans have to type in the facts.
Wenli: This is my first time meeting you. On your personal website, you say that you worked at Google on language processing and question answering. What you just explained to us, is that what you did at Google? Is that all?
Ni: Yeah. I was at Google for about five and a half years. Almost all my projects were centered around question answering. They can be categorized into two groups. The first group we can call closed-domain question answering, where you answer questions from a knowledge graph, but you can only answer questions of certain types, such as the spouse of a person or the profession of a person. You have to define the relationships, because you have to generate the knowledge graph for them. The other big group of work was open-domain question answering, where you don’t have humans pre-define everything. You just try to answer anything that people ask that can be answered from the web, which is more challenging than the first one. Both of them are very challenging, but they use different technologies.
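The closed-domain setting Ni describes can be illustrated with a minimal sketch. This is not how Google's systems work; it only shows why every answerable relation has to be defined ahead of time. The tiny `KNOWLEDGE_GRAPH`, the `PATTERNS` list, and the `answer` function are all hypothetical names made up for this illustration.

```python
import re

# Toy knowledge graph: (entity, relation) -> value. Hand-entered, as in Ni's description.
KNOWLEDGE_GRAPH = {
    ("marie curie", "spouse"): "Pierre Curie",
    ("marie curie", "profession"): "physicist",
    ("barack obama", "spouse"): "Michelle Obama",
    ("barack obama", "profession"): "politician",
}

# One hand-written question pattern per predefined relation (an assumption for this sketch).
PATTERNS = [
    (re.compile(r"^who is (?P<name>.+?)'s spouse\?$"), "spouse"),
    (re.compile(r"^what is (?P<name>.+?)'s profession\?$"), "profession"),
]

def answer(question):
    """Return the stored fact if the question matches a known relation, else None."""
    q = question.strip().lower()
    for pattern, relation in PATTERNS:
        match = pattern.match(q)
        if match:
            return KNOWLEDGE_GRAPH.get((match.group("name"), relation))
    return None  # question type was never defined, so the system cannot answer

print(answer("Who is Marie Curie's spouse?"))
print(answer("Where was Marie Curie born?"))
```

The second question fails not because the fact is obscure but because nobody defined a "birthplace" relation, which is the limitation that open-domain question answering tries to remove.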
Wenli: Okay. Did you like the culture working there?
Ni: Yeah, I liked it a lot actually.
Wenli: I saw your volunteer experience at Google.
Ni: Yeah. I organized visits to Google for high school and middle school students. I found engineers and scientists to give them lectures, to talk about what life would be like as an engineer or scientist at Google, and to show them around the campus. What I like most about Google is its engineering culture. It’s very open; anyone can see anyone’s code. That helps people communicate. If there is a problem that one person cannot solve, anybody else can see it, chime in, and try to help.
Wenli: Yeah, but you left Google after five and a half years.
Ni: That’s true.
Wenli: You joined Mosaix.
Wenli: Did you choose Mosaix over other opportunities, other startups?
Ni: I was contacted by many startups before that. But I never considered leaving Google until Mosaix, because Mosaix is sort of the only one doing exactly what I wanted to explore, which is to understand content on the internet, understand the user, and try to make machines more helpful to humans.
Wenli: When you left Google and joined Mosaix as a co-founder, did any of your friends or family have second thoughts? Did anyone say you're crazy?
Ni: I don't think so. My wife was very supportive.
Wenli: Everyone was so supportive?
Ni: Yeah. I don't think anyone objected.
Wenli: That's really important. That makes your journey easier, because we did have guests telling us that their families weren't supportive and were worried.
Ni: Yeah. I hear that a lot. Several of my friends who have worked in startups complain that their families don't like it. They want them to go back to big companies, which never happened in my family. I don't know why.
Wenli: That's great. That’s a big surprise for a Chinese scientist, because as we know, our families can be kind of conservative sometimes. But yeah, what matters is that you love your job, and you told us that you do.
Ni: I definitely love my job. I am very happy.
Wenli: Mosaix is doing exactly what you wanted to do. So what exactly do you do right now at Mosaix?
Ni: I'm the main person responsible for the language understanding infrastructure, and I also do the research part. We have research engineers, and we have collaborations with PhD students at different universities. I also do a little bit of business development, like talking to people from other companies from time to time when they want a technical discussion.
Wenli: Okay, so mainly the technical part.
Wenli: Let's talk about the technical part. You have talked about weakly supervised learning a lot.
Ni: Yes. I think that’s where the future is. That’s why I’ve been talking about it.
Wenli: So in machine learning, there's supervised learning, unsupervised learning, and reinforcement learning. Can you briefly walk us through the evolution of machine learning algorithms?
Ni: Yes, I'd love to. These three concepts sound very foreign, but they're actually very powerful concepts. And everybody should be able to relate to them, because everybody does a little bit of learning every day. Say you want a machine to learn to play Super Mario. If you do supervised learning, it means that a human needs to teach the machine step by step what needs to be done. In the Super Mario case, it's like: if you see a turtle in front of you, you should jump; if you see a coin in front of you, you should move forward. Those kinds of instructions, right? So you need to give a lot of instructions, you write a lot of rules, you need a lot of human labeling. If nobody does that, it's not going to work. Or it works if you have a lot of money, like Google or Apple.
And then there is reinforcement learning. The idea in reinforcement learning is to have the machine explore strategies. You just define a very high-level goal, let’s say pass the stage or get more coins. The machine will try different things and see which one actually gives you what you want. Then you reinforce that strategy. Over time it will learn to kick a turtle, or do whatever gets it past the stage.
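The explore-then-reinforce loop described above can be sketched with an epsilon-greedy bandit, one of the simplest reinforcement-learning setups. This is only an illustration and is far simpler than the methods behind AlphaGo or Atari agents. The `play` environment, its reward values, and the action names are assumptions invented for this sketch.

```python
import random

ACTIONS = ["run_forward", "wait", "jump"]

def play(action):
    """Hypothetical one-step environment: a turtle is ahead and only jumping clears it."""
    return {"run_forward": -1.0, "wait": 0.0, "jump": 1.0}[action]

def learn_strategy(episodes=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Optimistic starting values so that every action gets tried at least once.
    value = {a: 1.0 for a in ACTIONS}
    count = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore: try something at random
        else:
            action = max(ACTIONS, key=value.get)  # exploit: use the best strategy so far
        reward = play(action)                     # only the high-level outcome is observed
        count[action] += 1
        # Reinforce: move the estimate toward the observed reward (running mean).
        value[action] += (reward - value[action]) / count[action]
    return value

values = learn_strategy()
best = max(values, key=values.get)
print(best, values)
```

Nobody tells the learner "jump when you see a turtle"; it only sees the reward after acting, which is the contrast with the rule-writing approach in the supervised case.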
But this is also limited. What has been done so far is very good: with AlphaGo or Atari games, the machine can explore spaces that no human has explored and find new strategies that go beyond human play. But there are two issues. First, you need to define a space that the machine can search over. That’s a big problem. For Go it's easier, because the space is just the board. But for other cases, like a self-driving car or playing Super Mario, you have to define a more suitable space for learning. In an extreme case, if your rules are based on the whole image of the game, you would need to write infinitely many rules to achieve good performance, right? The second issue is that people haven’t applied much learning theory to reinforcement learning, so the optimization procedures are not efficient: they take a very long time to converge and a lot of data to get good performance. It's just wasteful, and that is something we can improve over time.
The third one is unsupervised training. I think that's the key to solving the problems in reinforcement learning. For supervised learning, you label what the machine should do step by step; for reinforcement learning, you provide the ultimate goal for the machine to search toward. For unsupervised learning, the machine just tries to understand the world; it just tries to predict what happens if you do A, what happens if you do A plus B, none of which is labeled by humans at all. Take image classification, like the ImageNet data set. To teach a machine what a cat is, you need thousands of cat images, right? But if you want to teach this to a child, you just need one. Before you teach the child the word “cat”, the child already has a representation where all the cats are close together and very far away from any other animal, right? So all the parents do is put a label there; a lot of the learning is already done. That's why unsupervised training is so important for machine learning.
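The "one label is enough" point can be sketched as follows, assuming the unsupervised step has already produced features in which similar animals sit close together. The 2-D points, the tiny two-cluster k-means, and the helper names are all made up for illustration; real image representations are learned by much larger models.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_two(points, iters=10):
    """Unsupervised step: split unlabeled points into two clusters."""
    # Deterministic init: the first point and the point farthest from it.
    centroids = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            nearest = 0 if dist2(p, centroids[0]) <= dist2(p, centroids[1]) else 1
            groups[nearest].append(p)
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
    return centroids

def nearest_cluster(point, centroids):
    return min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))

# Unlabeled observations (hypothetical features); nobody says which is which.
unlabeled = [(1.0, 1.2), (5.0, 5.1), (0.8, 1.0), (4.8, 5.3),
             (1.2, 0.9), (5.2, 4.9), (1.1, 1.1), (5.1, 5.0)]
centroids = kmeans_two(unlabeled)

# Supervised step: a single labeled example per word just names a cluster.
one_example_each = {"cat": (1.0, 1.1), "dog": (5.0, 5.0)}
names = {nearest_cluster(p, centroids): word for word, p in one_example_each.items()}

def classify(point):
    return names[nearest_cluster(point, centroids)]

print(classify((0.9, 1.3)), classify((5.3, 4.7)))
```

Almost all of the work happens in `kmeans_two`, which never sees a label; the labels only attach words to structure that is already there.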
I have to talk about Yann LeCun’s cake at this point, I guess. Let’s say machine learning is a cake. Most of the cake is unsupervised training: it tries to predict what happens in the world, it just tries to make sense of the world. Then there is some icing on top of the cake; that's supervised training, where you have human labels on the data. And on top of the cake there is a cherry; that's reinforcement learning, which directly optimizes whatever you care about.
Wenli: Where do you think weakly supervised learning fits on the cake?
Ni: So the icing is supervised learning; the whole cake and the cherry are all weak supervision.
Wenli: How did the technology in weakly supervised learning get better over time?
Ni: I guess it's a long process of innovation. In order to figure out how to represent things better, that is, to define a space where the model can explore good representations, you have to try different model structures. We need inspiration from animals, from psychology or neuroscience. We also need applications; we need benefits that drive people to make those innovations.
Wenli: Is weakly supervised learning getting better?
Ni: Yeah, it’s definitely getting better.
Wenli: Okay. Is Mosaix using one or two of these, or is it using weakly supervised learning?
Ni: Yes, we are doing that.
Wenli: Compared to the voice assistants built by tech giants, like Siri and Alexa, how does Mosaix differentiate itself?
Ni: You mean technically, or business-wise?
Wenli: Both I guess.
Ni: Let’s talk about the business part. To my understanding, an assistant is a portal to the internet. Every assistant sort of indexes a portion of the internet: Alexa indexes Prime Video, and Google indexes YouTube. But a lot of other content on the internet is not indexed by any of them. A person might be consuming content that is not served by any of the assistants. These are the things we can help with.
Wenli: Can you give us an example of that?
Ni: Let's say you are interested in TikTok. It's not indexed by any of the assistants.
Wenli: So Mosaix can help me search?
Ni: Yes, theoretically we can. Now let's talk about the technical part. Big companies have a lot of resources. They can afford to hire many, many people to write rules, label data, and do quality control, and to build an assistant where you know exactly what it does at every step. We are taking a different route. We put a lot of machine learning in the system, so there is a lot of approximation of what the machine should think, instead of humans coming in and annotating every step. But in order to learn those components, we have to get training signal from somewhere. We try our best to use only the end output of the system and backpropagate from that to all the intermediate learning modules, to create training for those modules. To give you an extreme example, say you just have users and you don't have any data labelers. After the user says some command, you show the user a few options of what we can do, and the user chooses one of them. Then you can sort of figure out what that user meant by the command, because they chose a particular outcome, right? But there's also the cold-start problem: we don't have users in the beginning, so what do we do? In that case we would still like to have some annotation. Given a question, we have raters annotate what the end output of the system should be. It could be an interpretation of the question, or it could be an outcome. And then the system sort of learns from that output.
Wenli: Okay. At the beginning of the interview, you said that what Mosaix is doing is letting machines understand content.
Ni: Oh yeah, that's also a goal of ours. There is a lot of content. Let's say movies are one type of content. Say your query is “I want to play ‘Love Story’ from Taylor Swift.” You understand that Taylor Swift is a singer and that she happens to have a song called “Love Story”, so you know what the user means when she says “Love Story”. That’s the way we understand content. But we definitely want to have a deeper understanding of content. Let’s say, understanding movies by their reviews and descriptions, or understanding products by their reviews and descriptions. All those things are what we want to do. I’m still working on that, basically.
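The "learn from which option the user picks" idea can be sketched as a small softmax ranker over candidate interpretations, updated only from the chosen outcome. This is an illustrative assumption, not Mosaix's actual system: the `OPTIONS`, the bag-of-words features, the learning rate, and the logged choices below are all invented for the example.

```python
import math
from collections import defaultdict

OPTIONS = ["play_song", "play_movie"]   # hypothetical outcomes shown to the user
weights = defaultdict(float)            # (word, option) -> learned weight

def probabilities(utterance, options):
    """Softmax over options, scoring each by the sum of its word weights."""
    words = utterance.lower().split()
    scores = {o: sum(weights[(w, o)] for w in words) for o in options}
    z = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(s) / z for o, s in scores.items()}

def learn_from_choice(utterance, options, chosen, lr=0.5):
    """One gradient step on the log-likelihood of the option the user picked."""
    probs = probabilities(utterance, options)
    for w in utterance.lower().split():
        for o in options:
            target = 1.0 if o == chosen else 0.0
            weights[(w, o)] += lr * (target - probs[o])

def interpret(utterance, options=OPTIONS):
    probs = probabilities(utterance, options)
    return max(options, key=probs.get)

# The only supervision is the user's final choice; no intermediate step is annotated.
choice_log = [
    ("play love story by taylor swift", "play_song"),
    ("watch love story", "play_movie"),
    ("play shake it off", "play_song"),
]
for utterance, chosen in choice_log:
    learn_from_choice(utterance, OPTIONS, chosen)

print(interpret("play love story"), interpret("watch love story"))
```

For the cold-start case Ni mentions, the same `learn_from_choice` update could be fed rater-provided end outputs instead of user clicks, since both supply only the final outcome.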
Wenli: So in the future, if I say, “Mosaix, I'm depressed. I want to watch a cheerful movie,” can Mosaix understand that?
Ni: I think that's relatively easy.
Wenli: It is? What would be a hard one? Right now, what is it that voice assistants can't understand or figure out? I think there are a lot of things that voice assistants can't figure out at this stage.
Ni: Yeah, you're totally right. I think it's like a chicken-and-egg problem. Take search engines: everybody says that search engines are very smart, right? No matter what I type into a search engine, it almost always gives me relevant results, right? It's very smart because a lot of people use it. It just tries to memorize what people choose after they see the results. Our system will be a little bit like that: the more people use it, the smarter it will get, as long as you try to learn from people's feedback.
Wenli: So Mosaix will need data from this person in order to learn from this person?
Ni: Correct. That's our goal.
Wenli: So you will need users first, to collect that data?
Ni: That's true.
Wenli: How would you get the users at the beginning stage?
Ni: Before we have users, we have to have a system that is already useful, one that somebody would like to use. I think we're very fortunate that we have a very strong business development team.
Wenli: Nice. So Mosaix’s ultimate goal is to let a voice assistant know you as well as your friends do, or even better.
Ni: Sometimes I guess.
Wenli: But sometimes it's really hard, because you are trying to make the machine a logical machine that understands content or even logic. What's the timeline from now until you reach your goal?
Ni: I think it's just a gradual ramp-up. It depends on how you measure it, how automatic you want it to be, and how open-domain you want it to be. If you want it to be closed-domain, say just understand your daughter’s dentist, you can do it today. Just write a few rules and you’re done. But if you want it to be very general and understand everything, that will take longer.
Wenli: What would you say is the next step for the voice assistant?
Ni: I think it's just a race to get users. It's like the early years of search engines, when everybody wanted to get more users.
Wenli: So after you left Google and joined Mosaix, you are now an entrepreneur. What are some hardships that you're facing?
Ni: For me in particular, I think it’s mostly figuring out what people want: who the actual customers are, who is actually going to use it. And then translating that into technologies. I'm very strong on the technology side, but I'm not an expert on understanding the market. I'm very happy that our company has people who are very good at those things and have been very successful.
Wenli: So my last question is: What is your takeaway about leadership?
Ni: Since I've been here, I've seen different people join full time. People come with different motives. Someone might just want to learn a particular technology; someone might come for career development; someone might just want to build something that people want. I feel it's very important to respect what they want and guide them toward what the company needs. That usually has a very good outcome, I think.
Wenli: Do you have a leadership style?
Ni: I don't know. I need to ask them.
Wenli: Okay, thank you so much for coming here, sharing your knowledge about machine learning and NLP.
Ni: No problem. Thank you for having me.