QCraft, a Chinese startup co-founded by four former engineers of Google’s self-driving project, is developing a driverless vehicle expected to launch by the end of this year.
Called “Longzhou Space,” the autonomous shuttle will be “a hybrid between robotaxis and robobuses,” Da Fang, co-founder and chief scientist of QCraft, said on Sept. 17 on the sidelines of TechNode’s Emerge 2021 conference in Beijing. The vehicle is one of the two-year-old company’s efforts to expand its autonomous commercial fleet, which now numbers about 70 robobuses operating in six cities.
“It’s going to provide city bus services but, when the demand is not high, it also can fulfill the (function of) ride-hailing,” Da said. He added that the latest product reflects the company’s thinking on the future of shared mobility, in which autonomous vehicles (AVs) will be seamlessly shared among people and adaptable to carry freight, among other uses.
Drive I/O
Drive I/O is TechNode’s ongoing premium series on the cutting edge of mobility: EVs, AVs, and the companies trying to build them. It is normally available only to TechNode Squared subscribers, but we’re making this issue free as a sample of our paid content.
Da declined to reveal the name of QCraft’s manufacturing partner. He told TechNode that the company is also developing assisted driving technologies for vehicles larger than minibuses and sedans, in separate collaborations with automakers, but would not share further details.
Since its founding in 2019, QCraft has quickly become a rock star self-driving car company in China. Its co-founders include Yu Qian, a former team leader at Google Maps, as well as Da Fang, Hou Cong, and Wang Kun. All four are former software engineers at Waymo, Google’s self-driving unit that is widely considered a technical leader in autonomous driving.
Now employing about 200 people in China and the US, the startup announced in August that it had closed a $100 million Series A+ round from new investors including YF Capital, a private equity firm founded by Jack Ma, and Longzhu Capital, food delivery giant Meituan’s industrial fund. This followed an earlier funding round this year of “dozens of millions of dollars,” reportedly from TikTok parent ByteDance and other investors.
TechNode took the opportunity at last month’s Emerge 2021 conference to interview Da, who worked on motion planning, one of the most challenging areas in autonomous driving, during his time at Waymo. Da obtained a PhD in computer science at Columbia University, where he focused on computer graphics and animation, developing simulation methods used to model liquids.
The following conversation has been edited for clarity and brevity.
TechNode: Behavior prediction is one of the hardest problems in autonomous driving. Companies including Waymo are training their driverless cars with simulation software to handle various unpredictable situations. How does that work?
Da: Behavior prediction is basically trying to model the world, including agents such as other vehicles and people, and to predict how they are going to behave. The difficulty is that the future is not really certain. If you imagine a pedestrian standing on the side of the road and moving toward the middle of the road, a human driver probably knows how to react and will brake to let the pedestrian pass first. But there is uncertainty in human actions. The pedestrian may stop in the middle of the road, or accelerate and dash across to reach the other side quickly. So your reaction needs to change accordingly.
There are other difficulties as well. For example, negotiation. Sometimes prediction is not just about predicting how other people or vehicles will move, but also about negotiating with them. If you imagine two cars merging into one lane and approaching the merge point at roughly the same time, one of them has to proceed first and the other has to brake and go a bit later. Scenarios like this definitely involve some negotiation. Motion prediction is not only about predicting what other vehicles will or will not do, but also about understanding how our own actions will affect those predictions.
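To make the idea of multiple possible futures concrete, here is a minimal, hypothetical sketch of how a planner might weigh several predicted behaviors of a pedestrian and react to the most constraining one. The class names, probabilities, and the 1.5-second margin are illustrative assumptions, not QCraft’s or Waymo’s actual design.

```python
from dataclasses import dataclass

@dataclass
class PredictedPath:
    """One hypothesis about what an agent will do, with an assumed probability."""
    description: str
    probability: float            # how likely this behavior is judged to be
    time_to_block_lane_s: float   # when the agent would enter our lane (inf = never)

def choose_brake_decision(paths, time_to_reach_agent_s, min_margin_s=1.5):
    """Toy rule: brake if any sufficiently likely future leaves too little margin.

    A real planner would optimize a full trajectory against all hypotheses
    jointly and re-run prediction after every action; this only illustrates
    why a single 'most likely' future is not enough.
    """
    for path in paths:
        if path.probability < 0.05:
            continue  # ignore very unlikely behaviors
        margin = path.time_to_block_lane_s - time_to_reach_agent_s
        if margin < min_margin_s:
            return f"brake: '{path.description}' leaves only {margin:.1f}s of margin"
    return "proceed: every likely future leaves enough margin"

# A pedestrian at the curb: three hypotheses about what happens next.
pedestrian_futures = [
    PredictedPath("stays on the curb", 0.6, float("inf")),
    PredictedPath("crosses slowly and stops mid-road", 0.3, 2.0),
    PredictedPath("dashes across quickly", 0.1, 1.0),
]

print(choose_brake_decision(pedestrian_futures, time_to_reach_agent_s=1.5))
```

In this toy version the car brakes because the “crosses slowly” hypothesis is likely enough and leaves too little margin, even though the single most likely outcome is that the pedestrian stays put.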
TechNode: There has been a significant debate over whether AVs should leverage multiple sensors or purely rely on cameras to navigate the environment. What is your take on that?
Da: Our view is that these different sensors are highly complementary. There’s just no reason not to use them at this stage of AV development. Right now, the most commonly used sensors are cameras, lidar, and millimeter-wave radar. Lidar is really good at measuring distance: you get lidar points that accurately pinpoint an object in 3D space and tell you exactly how far away it is. That’s something cameras can’t do. With millimeter-wave radar, you also get a speed measurement, but at a lower resolution.
READ MORE: DRIVE I/O | Lidar is hard—but it’s coming soon
Cameras have many advantages in terms of high resolution. You can recognize textures. You can recognize small objects, like traffic cones and faraway pedestrians. That’s something you cannot do with lidar. But both lidar and cameras are badly affected by adverse weather like rain, snow, and fog, and that’s where radar really shines. All of these sensors have their strengths and weaknesses, and none of them by itself is enough to deal with every scenario. Basically, we have to use all of them together to build a really safe vehicle.
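As a rough illustration of that complementarity, the hypothetical sketch below fuses per-object attributes from the three modalities, taking the class label from the camera, the range from lidar, and the speed from radar, with fallbacks when one modality is degraded. The data structures and values are invented for illustration and do not describe any company’s perception stack.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SensorObservation:
    """What one sensor reports about a single tracked object (None = not measured)."""
    label: Optional[str] = None        # cameras are best at classification
    range_m: Optional[float] = None    # lidar is best at distance
    speed_mps: Optional[float] = None  # radar measures radial speed directly

def fuse(camera: SensorObservation, lidar: SensorObservation, radar: SensorObservation):
    """Toy late fusion: take each attribute from the sensor that measures it best,
    falling back to the others when that modality has nothing to offer."""
    return SensorObservation(
        label=camera.label or lidar.label,
        range_m=lidar.range_m if lidar.range_m is not None else radar.range_m,
        speed_mps=radar.speed_mps if radar.speed_mps is not None else lidar.speed_mps,
    )

# A traffic cone 40 m ahead: the camera classifies it, lidar ranges it, radar sees it is static.
fused = fuse(
    camera=SensorObservation(label="traffic_cone"),
    lidar=SensorObservation(label="small_obstacle", range_m=40.2),
    radar=SensorObservation(speed_mps=0.0),
)
print(fused)  # SensorObservation(label='traffic_cone', range_m=40.2, speed_mps=0.0)
```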
TechNode: But which one is better for AVs at detecting and reacting to stationary objects such as parked vehicles? That’s one of the major technical issues behind the recent Tesla and Nio crashes.
Da: We know that’s a very challenging problem for radar, mostly because of its low resolution. Radar picks up reflections from these objects, but it has a hard time telling them apart from the background, such as the ground or buildings on the side of the road. Lidar is much better because of its higher resolution: you can distinguish an object from the background directly, even when it is just a stationary point. With cameras you can do the same, because you have much richer information in both resolution and color. I think lidar is proving to be the most important of the three sensors for highly autonomous driving.
TechNode: You and your founding team members worked at Waymo for a few years before setting up QCraft. What have you learned from that?
Da: One of the things that Waymo has done really well, and that we are trying to replicate here, is that engineers do not just try to solve problems; they try very hard to solve them in the right way. Take, for example, the computation of headway.
When you’re controlling the vehicle to follow the vehicle in front in the same lane at a comfortable distance, there is a headway distance you want to figure out. If an engineer is tasked with computing this optimal headway distance, how would he proceed? An average engineer will probably say: let’s pull up a few driving logs, watch what human drivers have been doing, measure the distances, and maybe take some averages. They will get numbers like 10 meters, 20 meters, 30 meters, depending on the scenario, and then just use those numbers.
Obviously, that’s a really bad solution because it doesn’t generalize. Say 20 meters is fine for a reasonably fast road; it won’t be right for expressways or for urban areas. A better engineer would realize that it depends on the driving speed and the circumstances, such as the width of the road, but most importantly on the speeds of the two vehicles.
But that’s still not good enough. A really good engineer, one who always tries to go one step further, will ask why, when we drive a car ourselves, we pick different headway distances at different speeds. The answer is probably that at high speed, if we don’t leave enough room in front of our vehicle, we won’t have enough reaction time when the car in front of us brakes.
That leads straight to a concept called RSS, which stands for “responsibility-sensitive safety.” A really good engineer who is trying to find the right solution will essentially rediscover RSS by themselves while solving this problem. That’s something we really value. (Editor’s note: RSS is a mathematical safety model for AVs developed by Intel’s self-driving division Mobileye.)
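For reference, the RSS longitudinal rule Da alludes to can be written down directly. The sketch below implements the published minimum-safe-distance formula for a car following another car in the same lane; the response time and acceleration limits used here are illustrative values, not parameters from any deployed system.

```python
def rss_min_following_distance(v_rear, v_front, response_time=1.0,
                               a_accel_max=2.0, b_rear_min=4.0, b_front_max=8.0):
    """Minimum safe longitudinal gap (meters) under the RSS model.

    Worst case assumed: during the response time the rear car keeps accelerating
    at a_accel_max, then brakes at its minimum rate b_rear_min, while the front
    car brakes as hard as it can (b_front_max). Speeds are in m/s, accelerations
    in m/s^2. Parameter values here are illustrative only.
    """
    v_rear_after_response = v_rear + response_time * a_accel_max
    d = (v_rear * response_time
         + 0.5 * a_accel_max * response_time ** 2
         + v_rear_after_response ** 2 / (2 * b_rear_min)
         - v_front ** 2 / (2 * b_front_max))
    return max(0.0, d)

# The required gap grows with speed, which is exactly why a fixed 20 m headway can't generalize.
for kmh in (30, 60, 120):
    v = kmh / 3.6
    print(f"{kmh:>3} km/h -> keep at least {rss_min_following_distance(v, v):.1f} m")
```

With both cars at the same speed and these illustrative parameters, the required gap grows from roughly 18 meters at 30 km/h to well over 100 meters at 120 km/h, which is the speed dependence the fixed-number approach misses.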