Rav Babbra, business development and program manager at dRisk.ai, which is a partner in the D-Risk co-innovation project with Imperial College London, DG Cities and Claytex, discusses how to perform safe and transformative AV testing.
What are the key elements of a safe AV test strategy?
There are several factors that are imperative. First, every testing model must be able to identify failure modes early, and then continuously throughout the development and testing process. This ensures the right resources are placed on retraining AVs in areas where performance was not ideal.
Second, there cannot be bias, for example, overtraining in any one area of risk over another. The ability of an AV to make an unprotected right turn in the UK is critical for safety. However, overtraining on it to the detriment of left-hand turns would create unforeseen risk.
Similarly, testing models need to capture ‘out-of-sample’ events: scenarios that the AV has never been exposed to. Today, AVs are being trained using the most common traffic accidents. But this does not account for the one-in-a-million incidents that the public has experienced, nor the near misses that are not recorded as accidents but are still a possibility. These need to be included to deliver robust testing across the entire operational design domain (ODD). We’ve established there is a six-fold improvement in the detection of a collision risk when an AV is trained using edge cases compared with traditional models, underlining the importance of taking such an approach.
Silicon Valley has arguably led the way in AV development and testing; from a safety perspective, do you agree with the aggressive approach taken by developers in the region?
The US has, by and large, allowed AV companies to self-report on the safety of their systems, and AV developers have now deployed those systems so that people can access and use them in a couple of US cities. They are already collecting rich data sets, moving the US closer to having the experience and capability to manage large-scale deployments.
However, questions still remain and raise the importance of including the public in consultation and development. Are people happy that AVs being rolled out have not been tested by an overarching central authority? Does the public truly trust the AV developers’ safety reports?
In contrast, the UK is taking a slower, design-by-committee approach, which means more consideration and thought around items like safety cases, and that there will be no large-scale deployments without a safety driver. The UK government channels grant funding into the unexplored areas requiring answers, including large-scale testing in simulation. This method will lead to a central governing body and approved testing methodology, which will ultimately be able to channel responsibility with a vetted and transparent approach.
Both approaches have pros and cons. Whether centrally administered or federated, it’s our view that the testing and training process can and should be made completely transparent, and yet still hard to ‘game’ (i.e. cheat on). But it must also remain flexible enough to take into account regulatory changes and new oversight guidelines.
How is AI being leveraged in the development of AVs?
AI enables AVs and their development. It will always perform well in those areas where it has been trained. If it is trained on the right types of data, including an especially large number of edge cases, we can cut down the number of scenarios that it needs to be tested on in real life.
It’s important to note, however, that AI itself does not reduce the number of scenarios that need to be tested. Indeed, AI-based autonomous vehicles are likely to be the most effective, but they also need to be tested against the greatest number of scenarios, because they are the hardest for regulators to understand.
AI approaches are used to train AVs in simulation, which can shorten AV development timelines while also training the vehicles to be competent in far more scenarios than would be possible without AI-based approaches. Most importantly, training AI effectively on high-risk and tricky scenarios lets developers improve safety and get good performance on more commonly seen cases for free.
We have seen that if you train on edge cases, AV stack performance on center cases (normal driving) remains the same, but performance against high-risk events is drastically improved – AVs trained this way are between six and 20 times more likely to detect a risk of collision.
Is simulation the answer to efficient and safe AV testing and development?
Physical testing is expensive and time-consuming, so simulation is certainly one of the tools that can help you efficiently identify where failure modes exist. Overall, it allows you to test more scenarios more quickly than you can in real life. It also lets you test your AV against edge cases without having to go and find these events. Once you know where the potential failure modes are, you can direct attention to the things that need to be tested on the track.
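As a rough illustration of that last step, the sketch below ranks scenario categories by how often they failed in simulation, so that track time can go to the worst performers first. The result records, the category names and the `rank_failure_modes` helper are all hypothetical; they are not taken from any D-Risk tooling.

```python
from collections import defaultdict


def rank_failure_modes(results, min_runs=1):
    """Return (category, failure_rate, runs) tuples, worst category first.

    `results` is an iterable of (category, passed) pairs, one per simulated run.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    for category, passed in results:
        runs[category] += 1
        if not passed:
            failures[category] += 1

    ranked = [
        (category, failures[category] / count, count)
        for category, count in runs.items()
        if count >= min_runs
    ]
    # Highest failure rate first; ties are broken by category name.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


# Hypothetical simulation outcomes (illustrative only, not real data).
simulated = [
    ("unprotected_right_turn", False),
    ("unprotected_right_turn", True),
    ("left_turn", True),
    ("left_turn", True),
    ("low_sun_glare", False),
    ("low_sun_glare", False),
]

for category, rate, count in rank_failure_modes(simulated):
    print(f"{category}: {rate:.0%} failures over {count} runs")
```

In practice a ranking like this would also need enough runs per category to be meaningful, which is what the assumed `min_runs` threshold is for.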
What simulation tools and techniques will be the most effective here and why?
Design and testing are most effective when developers use simulators appropriate to the AV stack component being tested. For instance, perception requires sensor-realistic simulators, while motion planning requires realistic vehicle dynamics. It’s also important to use simulators that can replicate all the weirdness of the real world. Light is a very good example: reflections, the way light bounces off objects, shadows and lighting conditions all need to be reproduced in a physically realistic way.
How can developers work together to ensure a safe test strategy?
There are many different tools, but there are also standardized formats, for example, scenario description languages that make it easy to share scenario data and test on different simulation platforms. We’re advocates for collaborating on the design of systems based on edge cases. By crowdsourcing data from the public, developers have genuine use cases to trial. These rule in the one-in-a-million incidents you simply can’t dream up in a lab, and rule out bias, such as bias based on skin color.
What milestones still need to be reached for widespread adoption of automation to be achieved?
The biggest milestone ahead is to prove with statistically sound methods that a certain AV stack can drive more safely than a human. To begin with, this might only be achieved for a relatively small domain, like driving on a predefined route. But eventually it will have to be shown that AVs are indeed capable of reducing the overall number of fatalities and injuries on public roads.
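To give a feel for what a statistically sound claim can involve, here is a minimal sketch of one textbook approach: asking how many miles would have to be driven with no events at all before the event rate can be said, at a chosen confidence level, to be below a benchmark. It assumes events follow a Poisson process with a constant rate, and the benchmark of one event per 100 million miles is an illustrative placeholder rather than a measured figure. The `failure_free_miles_needed` function is hypothetical and is not a method described by the D-Risk project.

```python
import math


def failure_free_miles_needed(benchmark_rate_per_mile, confidence=0.95):
    """Miles to drive with zero events to claim, at the given confidence,
    that the true event rate is below the benchmark.

    Assumes events follow a Poisson process with a constant rate.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if benchmark_rate_per_mile <= 0:
        raise ValueError("benchmark rate must be positive")
    # P(zero events in n miles) = exp(-rate * n); require this <= 1 - confidence.
    return -math.log(1 - confidence) / benchmark_rate_per_mile


# Illustrative benchmark only (an assumption): one event per 100 million miles.
human_rate = 1 / 100_000_000
miles = failure_free_miles_needed(human_rate)
print(f"{miles / 1e6:.0f} million failure-free miles")
```

Under these assumptions the required mileage runs to hundreds of millions of miles, which is one way to see why a small, well-defined domain is a more tractable first target.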
How realistic are industry predictions for the introduction of full autonomy?
It’s true that overly optimistic predictions have been made in the past. However, more and more companies have progressed from exploratory R&D to developing prototypes for specific autonomous driving functions. These are very close to being ready for deployment on public roads. And thanks to more efficient collaboration between manufacturers and suppliers, we are now seeing a much faster development cycle on subsystems, so it’s just a matter of time until we see better full autonomy results.
The D-Risk project is part-funded by the Centre for Connected and Autonomous Vehicles and Innovate UK. The aim of the initiative is to improve the safety of AVs to make them a commercial reality. To do this, the project is crowdsourcing from the public the one-in-a-million accidents (known as ‘edge’ cases) with the aim of creating the world’s most extensive library of edge cases that developers can use for testing.