Updated: Jun 27, 2019
This episode features Dr. Chang Yuan, founder and CEO of Foresight AI. Foresight AI is an AI and robotics technology company that develops a global scale data platform to empower autonomous mobile robots, such as robo-taxis, delivery trucks and flying cars.
Previously, Dr. Yuan was a senior research manager at Apple's Special Projects Group and a Senior Scientist at Amazon. He received his Ph.D. in Computer Science from USC in 2007 and has more than 10 years of R&D experience in building cutting-edge technologies and products in the areas of computer vision, machine learning, robotics, etc.
Dr. Yuan shared his research journey in computer vision, his entrepreneurship experience, and perspectives on the future of autonomous mobile robots.
Highlight 1: Milestones in Computer Vision
Highlight 2: China vs. U.S.: Trends in Adopting Autonomous Driving Technology
Wenli: In this episode, we have the pleasure of having Dr. Yuan join us to share his journey in computer vision, as well as his expertise in the field. You did your undergrad at Tsinghua University. And then you did your PhD in the US?
Chang Yuan: Yes, at the University of Southern California. Throughout these years, I've been studying computer science, and my specialty is computer vision. Computer vision is a scientific field where we try to infer semantic information from images and videos, and recently, of course, LiDAR (laser scan) sequences. All these things are basically trying to mimic what humans can do - we observe things and then we use our brain to infer what's there. We infer 3D geometry, such as how far away things are from us and how things are moving. And we also infer semantic information - What is this object? What does a human look like? Is it the same person as that person? Basically, computer vision is a science and technology for understanding the world in a geometric and semantic way, from visual media.
Wenli: Which year was it that you started your PhD?
Chang Yuan: I started my PhD in 2003. I graduated in four years, and I went on to work in quite a few companies in the industry. I myself was very excited to see the tremendous progress that had been made by the industry. I was happy to contribute to some of those as well.
Wenli: In 2003, I'm guessing that's even before 3D movies became popular in theaters, not to mention 3D imaging without the glasses. What I'm saying is that computer vision wasn't receiving as much attention as it does right now. What made you choose computer vision as your field of study?
Chang Yuan: Yes, for my undergrad (studies), I attended a digital image processing course. I was so intrigued that a technology could really be smart enough to infer and understand images. It can understand human faces, it can reconstruct the world from videos. All those were very intriguing for me. And I decided to take that as my specialty. I've been doing that ever since, for like 20 years.
Wenli: Reconstruct the world, that reminds me of the author of The Lord of the Rings, like you're creating an entire new world. What was your vision when you started, what was your goal?
Chang Yuan: My goal was to build new technologies that we can use to understand the world better, so that machines can do a lot of things for us. Take an earlier example (that happened) a long time ago, face detection. You can use this kind of thing to detect whether there's a person in front of you, or to recognize human bodies and their movements. And also, of course, there are recent examples like autonomous mobile robots. All those robots need to have vision capability just like humans, so that they can decide how to do some tasks for us.
We actually just gain our life back. Why is everybody so happy or excited about autonomous driving? It's not just about safety, but also about liberating us from mundane transportation tasks. Either the car can drive you or it brings stuff to you, so you can use this time to do other things. Even if you spend this time on Facebook, you still control the time, right? All these technologies, like computer vision, will make our life better, and will make the world a better place. That's what I personally feel very excited about. I have been excited about that for the past decades.
Wenli: It’s true. All of those things we only imagined in the movies. And now you're doing it. It's actually really, really cool. After the PhD program, you joined Sharp in the US, and were working on 3D display and mobile sensing. It was a big thing, it was on the news, right?
Chang Yuan: Yeah.
Wenli: And then after that, you joined Amazon and worked on some Amazon startup projects. One of the projects was Amazon Go. I really like it because I tried it in Seattle. I walked into a store with my Amazon ID. And then I grabbed four chocolate bars. And then after I walked out of the store, my Amazon account was automatically charged for exactly those four chocolate bars. I had no idea how they recognized me grabbing exactly four chocolate bars. It was really cool.
Chang Yuan: Yeah, definitely. That was the magic experience that we enabled with computer vision, machine learning, and sensor fusion, combined with highly optimized hardware and 3D cameras.
Wenli: Yeah. So, could you share a little more about that?
Chang Yuan: In the Amazon Go experience, what was hard technically was that we wanted to make sure that we recognized the product event. When I say product event, I mean the product was taken off the shelf, and then we associate that product with a person. We have lots of cameras on the top (of the room) that track that person, so we can make sure that the product is associated with the right customer. From both the user experience side and the financial side, we cannot make mistakes, while also enabling something new.
Again, through this experience you're happier, you've saved time, and you don't have to wait in line. All these things show how the technology is making our life better.
Wenli: Yeah, that's like billions of dollars of investment in a 7-Eleven-like store.
Chang Yuan: Yeah, definitely.
Wenli: After Amazon, you spent some time with Lenovo in Hong Kong working on front-facing 3D imaging, which was kind of unique, like living in Hong Kong for a couple of years. And then you joined Apple's Special Projects Group. What was that?
Chang Yuan: My wife and I kind of made a little venture. We went back to Hong Kong to just test it out, to see what it is like to work in China. But then I got contacted by Apple. They really excited me about a world-changing project. After some interaction, I decided to come back to the US and join that project, which is the special project working on autonomous systems. We were building the ultimate system to enable new (robotics) applications. I think it was a great experience for me as a computer vision scientist. I learned a lot about autonomous mobile robot systems. How do they behave? What are the things to be done? And what are the bottlenecks?
I learned a lot over there. But one specific thing I learned is that for a robot, making decisions about how it can move and how it can react is actually very hard. It is not a computer vision problem; it's really a general artificial intelligence problem. Computer vision, while it is evolving and actually moving a lot, is quite predictable. Generally speaking, we can predict when we will get to 99%. But even then, when you know what the world looks like and how things are moving, robots (still) have to make decisions. I think this is the most challenging problem in autonomous vehicles, mobile robots, and all these things. And that's actually where I think I have a better way to solve the problem. That's why I started doing something new.
Wenli: I see. Up to this point, it sounds like your entire career journey has been about working on cutting-edge technology, leading teams, and doing really cool stuff. I'm thinking maybe it would be really cool to share with our audience some milestones of innovation.
Chang Yuan: Sure, there are a few things along the development of computer vision that are really big milestones. Of course, this is an incomplete list. But let's say, start from 2001, which is pretty early. We only had webcams, those kind of crappy webcams, with those kind of (low-resolution) videos that we could get. At that time, we already had a real-time face detection method, which is based on AdaBoost. This method can detect human faces from just a single camera, and it can run in real time. I was not at the original conference where they released this paper. But the presenters (by the way, they're very famous, Viola and Jones) showed a live demo. At that time, it was pretty rare that you could show a live demo. You hold a US dollar bill in front of a camera, and it can detect the face on that bill in real time, very accurately. That's like a life-changing moment. I was also very motivated - it was very inspiring to me. That's 18 years ago, which is crazy.
After that, there were a few other important things released into consumer space. It's really kind of a big deal. One is the Microsoft Kinect camera. It has the 3D camera that can recognize humans, and then brings you the interactive 3D gaming experience. And another thing is the customized hardware that allows you to have the real time interaction.
Moving along, there's also the DARPA Grand Challenge. And then there's ImageNet, where people developed deep learning methods to solve a really hard image recognition problem.
And then there's Amazon Go, which is another great milestone. In a confined environment, how can you use computer vision and machine learning technologies to enable a new user experience? That is at the core of how Amazon provided a better shopping experience.
Recently, I'm also proud to share that I contributed to the Apple Face ID project. With really tiny 3D cameras, you can enable natural, frictionless 3D face recognition. Also, you can use it to generate emoji, all those kinds of things.
Along this line, these are the few major things I want to mention. If you look at them, what happened was all generated by the combination of technology breakthroughs, especially software or algorithm breakthroughs, and customized hardware that makes it run faster, in real time. And then you apply these technologies to enable a better user experience. Without a better user experience, all these technologies don't help - (otherwise) why do we work on that?
Wenli: Thank you so much for sharing that. One of the things I'm really curious about is, as you mentioned, you learned so much while you were working at Apple. Why did you leave Apple in 2017?
Chang Yuan: Of course, Apple is a big company with almost unlimited resources. It is actually a great place to work. Looking at the problem we're trying to solve, we still don't have the 100% certain answer about how to make self-driving cars work. It is a complex system with unknown answers to some scientific questions. In this case, I think the big corporations like Apple and many other companies, they're great, they build very solid technologies.
But they may be a little bit too big to tackle and adjust to these very different, unknown, uncertain questions. It's a little bit like you're riding on a big cruise ship - on a Carnival Cruise ship. It's great, very comfy, and it is sailing smoothly. Do you know if you're going in the right direction? You don't know. But sometimes we have to do fast experimentation. I felt that I could do something in a different way, hopefully better. And I can do faster experimentation. To continue the metaphor of riding on the boat, we jumped into the water and founded a little dragon boat. The benefit of a dragon boat is that it is really fast. I'm blowing the whistle and cheering the team on: Hey, let's go this way. If it doesn't work, okay, let's quickly change to another direction.
Wenli: Besides trying new and exciting things, one of the things that is very different compared to joining a big company is that you have to take charge of this company now. I know that you like to take on challenges, and you love these exciting and challenging things. What has still been difficult for you in your entrepreneurial experience? Any hardship that you experienced?
Chang Yuan: Sure. Of course, I have many years of technical leadership experience - I led a team and managed a team. For that part, I was mostly able to do it. I think there are several things that are new.
One is selling our vision to investors. It basically takes time to find the best narrative. How do we convey that message in 10 seconds? That is the part that I am still practicing.
The second challenging part, which is new, is working with customers. Our customers, big auto OEMs, tier one, tier two, are big companies. How do I navigate within these big companies and find the right team? That team will become an advocate to help us drive the project and the contracts. I also wouldn't say that we have 100% perfect product-market fit yet, because things keep changing.
Wenli: I know that Foresight AI is an AI platform for autonomous driving vehicle companies. Let’s tackle those things one by one. When you're selling to investors, what would be your one sentence pitch?
Chang Yuan: We are using computer vision technologies to generate 3D driving data to solve the motion planning problem. That is, how can these autonomous vehicles drive safely and naturally, and how do we know they do? That is the most challenging problem in the autonomous driving industry. From a technical point of view, we capture all the driving behaviors on the road, bring all the driver behaviors around the world into the simulator, and then let the customer vehicle drive in the realistic environment. You can drive around and get fully tested. Probably later, we can provide a certification (service). There are many kinds of full-stack driving systems or vehicles. You can claim you're safe. How do I know you're really safe? How about I test you with one million left turns and one million right turns, all those tricky situations, and see if your vehicle can handle them?
Wenli: You want to set a rule later?
Chang Yuan: Setting the rule is really hard in this sense. (We are providing guidelines and quantitative metrics.)
Wenli: Providing a certificate is like providing a golden standard.
Chang Yuan: Hopefully we will provide some golden tickets. But honestly, it's too early to mention that. I think right now we are working with a group of athletes who are running a marathon. We don't know when we can get to the finish line, but we are providing Red Bull to these runners. Here's a Red Bull, you drink it with our data, and you basically move faster, which guarantees you can see the finish line sooner. And then when you get to the end, I also tell you with my stopwatch: Hey, how fast have you moved here? How safe are you? We’re really humble to say we're facilitators, boosters, and we help all these companies. Using our data helps these companies develop their systems faster, and then get to the end point faster.
Wenli: Let's talk about the second thing you mentioned, which was market fit. The idea I'm getting is that Foresight AI's technical focus is simulation.
Chang Yuan: We are pretty much focused on the data and the derived software. Simulation itself is also a big effort, in which you have to build large-scale graphics engines in a distributed way. With limited resources, we're not working on that. We bring all the data and the related intelligent software as a plug-in into the simulator.
Wenli: I'm thinking that maybe you are competing with Waymo's Carcraft? How do you differentiate yourself from all of your competitors?
Chang Yuan: Sure. Waymo has been the leader and we learned a lot from their public blog posts and Carcraft. Honestly, what we're doing is similar to what Waymo is doing. We go out and collect lots of driving scenario data, bring all these pieces together, and put them into a simulator so you can fully test and train the car. I think our specialty is that the way we capture data is special. Unfortunately I cannot talk too much about that, but it is special in that we can capture millions of driving scenarios in a very cost-effective and highly accurate way.
Basically, we can spread out faster than Waymo and just cover many, many areas, and capture how people drive and move in those different areas. That's our really special sauce and our focus is to move fast, generate lots of such scenarios, and bring those data into the simulator, any existing simulator.
Wenli: Yeah, but as a startup, how do you get those data?
Chang Yuan: We collect it ourselves. Our specialty is that even with a low-cost sensor, we can get the same 5 to 10 centimeter accurate data that is needed for autonomous driving. We have the technical advantage. If any other team wants to do the same thing, they can probably build it in one or two years, which is possible. But with this lead time, we can build a data barrier and a customer barrier, which are the really valuable things.
Another thing, coming down to customer needs, is: who are our customers? The real customers are the R&D engineers in these customer companies. They are crying for data every day: I need data, I need the driving data, all those things. We want to enable a frictionless process so that they can go to our website, select the data they want, go through a payment process, and get the data in half an hour. That's our goal.
In the end, robots are just robots, which are developed by R&D engineers. They need a lot of love so we give them more data and they get more love.
Wenli: I know that your company has collaborated with the City of Sacramento to capture HD map routes of all the autonomous driving companies that are operating in Sacramento. I also know you have clients both in China and in the US. I am just wondering whether there are any different trends in terms of technology development in the two countries?
Chang Yuan: There are three things, and two of them are common between both countries. One is, of course, the big investment in Level 4-5 autonomous driving, because it's a hard and unknown problem. A lot of people are developing that. Deployment is also following a (step-by-step) procedure. We have shuttle buses, then go for a shuttle robot taxi, then a real robot taxi, then passenger vehicles. The whole process can take a decade or even more than that. That's actually common between China and the US.
The second part in common is about the city governments and the state governments - they actually want to understand what's happening. There are a lot of proposals submitted to the USDOT (Department of Transportation). They want to study how shuttle buses and other autonomous vehicles are coming to the city and how the city should react to that. This is also the second market, intelligent transportation, (for autonomous driving technologies).
And the third part is actually quite different. In the US and Europe, ADAS (advanced driver-assistance systems) have been pretty common. Many, many cars have them. But that's not the case in China, where there's a big ramp-up of ADAS systems. China recently passed a regulation that all commercial vehicles need to have assisted driving, and they also need to have a driver monitoring system. That's actually the big opportunity there. All the people developing this are competing against Mobileye to get into those systems.
Wenli: That's really interesting. As a tier one company, if I were you, I would be really thinking about the commercialization of the entire industry. I'm guessing the bottlenecks in China will be different compared to the bottlenecks that we're facing in the US. One thing that I know about the US market is that there will be lots of forces trying to stop the cars from being commercialized, because it may affect a lot of people's jobs. What would be some different bottlenecks in the two countries?
Chang Yuan: To be really humble and accurate, we are striving to be a Tier two company (the automotive industry is quite hierarchical). We provide the data component, and we provide part of the simulation capability to these Tier one companies and OEMs. There are regulatory dynamics in different countries. In the US, I think people tend to be more conservative, and they're still studying that. There's a great example from the trucking industry: people want to do autonomous trucking and claim that it's an easier problem to solve, but they're going to face a bigger pushback from the driver unions.
Honestly, right now, what I focus on, on the technology side, is: how can we educate people about what safety to expect from such a system, either assisted driving or automated driving? Elon Musk has made a very controversial comment (about Tesla vehicles’ full self-driving capability). I definitely don't agree with his comment, and their cars with (so called) full self-driving are actually very dangerous. We need to have a certain way to clearly define what the safety capability of each vehicle is and whether we can trust it, and we have to participate in this process too.
China is similar, but lots of city governments have more authority or more capability to push things. (If) you want to enable a vehicle to communicate with infrastructure, sure, let's rebuild the road with sensors and get it done in a month. That's totally doable in China. I think people are generally, from what I've seen, more open to experimenting on the road and are trying more things.
Wenli: Yeah, I guess the road situation is more complicated in China.
Chang Yuan: Definitely. I mean, all the driving behaviors we capture in the US are not that applicable in China, because it's more crowded, with more crazy things. We're going to capture data in China as well.
Wenli: What will be your solution? Focusing on both markets means that you need the data from both.
Chang Yuan: We will probably be focused on the US (at the current moment), because we have other things to figure out that we're busy with. We will collect data, process data, and serve our customers over there. In China, we will work with partners, who will collect data for us or with us. And then we have software as a service to basically take their data, process it, generate real-world driving scenarios, and use those to train the vehicles in China. We've got to have different data packages: one for the US, one for China, one for Europe, or Japan, Korea, many different places.
We're also in a good spot. We're a global company: for every customer company, we have to generate local data over there.
Wenli: So much stuff has to get done in the future.
Chang Yuan: Definitely.
Wenli: Speaking of that, we were thinking that L4 and L5 vehicles will probably not be commercialized until 2020. One of my concerns is: how do you buy that much time for your company, since you're relying on them to make a profit? What's your plan?
Chang Yuan: The biggest challenge is time to market. The faster you can deploy a trustworthy robo-taxi fleet or shuttle bus fleet in the real world, the more likely you are to succeed in business. It's unfortunate that Waymo’s promise “We will get the fully driverless fleet last December” didn't happen, which means there are a lot of problems and challenges. In a humble way, we want to be a facilitator and we want to help them. We make revenue along the way. But at the same time, we also gain a lot of intelligence about autonomous systems. We have other kinds of things that can be offered as well. Not just data, but also the intelligence, which will eventually be more valuable.
Wenli: My last question will be: Where do you see yourself and Foresight AI in the next three to five years?
Chang Yuan: In five years, I hope we get to the place where all the automotive companies, tier one, tier two, and all the public sector entities like city governments and state governments, want to use our data and intelligence platform. They can use that to train the autonomous vehicles. They can also use that to analyze (driving systems). It's almost like we provide data to the California DMV, and they say: if you want to deploy your car in California, please run the Foresight AI data and software to get a safety rating. How safe is it? How well can it handle the traffic here? This is just one piece of the pie.
The other big piece of the pie is for us to contribute to the intelligent transportation system. How are the cars driving on the road? At the city level, which part is the most crowded area? How do you improve that? How do you adjust a red light duration? How do you design the parking spots? There are so many things we can help them with. For example, (I have) a great role-model company called Esri. They're the world's number one Geographical Information System company. They started 50 years ago, and their annual revenue is $1.1 billion. The founders are a couple, Jack Dangermond and his wife. They started the company without VC funding, bootstrapped it and really built it up, so that now every US government entity is using their product. That's how powerful it is.
We want to be this open data platform and software platform that allows many people to use our capability, and to use us to improve automotive transportation, logistics, government, the public sector, and other industries. That's our vision and goal: to move along, build up more customers, have a more sustainable business, and also invite more people to work with us. We are very open to working with people and different companies. We're talking to HD mapping companies: they build their maps, great, we import their maps. We're working with simulation companies: great, we want to put our data into their simulators. And then we serve our common customers who are developing the real-time driving systems. We become the most important, most unique data provider in this industry and work with the whole ecosystem.
Wenli: Wow, you really have a big vision for your company. And I think you'll do great, because I feel like you do have the entrepreneurial spirit, you're not afraid of taking on challenges, and you also have the personality to sell.
Chang Yuan: Over this year and a half, I think I (learned) and got better at sales. I must thank my wife. She's a Professor of Marketing, and she trained me a lot with solicited and unsolicited advice. That has been really good.
Wenli: Yeah, we're not even mentioning your professional skills in the industry. I wish you all the best. Thank you so much for coming to our platform!
Chang Yuan: Thank you for the great opportunity!