
An autonomous AI system must know what it doesn’t know

Corti

We are happy and proud to announce that another of our publications has been accepted at one of the premier AI venues worldwide, the International Conference on Machine Learning (ICML).

Your autonomous vehicle is driving you towards a red light at a busy intersection, and instead of stopping, the car speeds up…

It is a scenario many dread when AI comes to mind, and it is not far-fetched: we have seen several examples of strange behavior by autonomous vehicles that have led to fatal outcomes.

Modern AI systems possess no concept of what they don’t know

There is a general misconception that modern AI systems have a notion of confidence, or uncertainty for that matter. When a modern AI system is trained on a more or less biased dataset, it generally has no concept of what it was not trained on. Worse, it then has no concept of its own bias. Let’s return to the example in which you are in an autonomous vehicle driving towards an intersection…

The autonomous vehicle’s AI system has been trained on a lot of data, in which intersection scenes, due to their chaotic nature, are heavily represented. In this case, there is something out of the ordinary in the intersection you are heading towards. It does not have to be a major anomaly; it can be construction work that routes the traffic differently, or snow cover darkened by exhaust. This minor anomaly was not covered by the dataset the AI system was trained on, nor is anything in that dataset adequately similar to it. Since the system has no knowledge of what it doesn’t know, its behavior in this particular intersection is highly unpredictable: the car could potentially speed up…

We are of course not the first to think about this problem. Many extremely skilled engineers work on alleviating these kinds of risks from multiple angles. The most frequent solution is brute force: make the dataset big enough to cover every situation an AI system can possibly meet. We truly believe that this approach can bring us very far, perhaps even far enough to put autonomous vehicles on the road that are safer than human drivers, especially in frequently occurring situations such as highway driving. However, we are also firm believers that we cannot accept a reality in which these systems do not know what they don’t know.

More and more fully autonomous systems are being rapidly developed across industries. What happens when these AI systems become increasingly commodified, so that the engineers deploying them are not aware of the dangers that might lie within?

Recent findings in machine learning research are concerning

One of the big hopes for predicting uncertainty has been unsupervised deep generative models, a research field in which Google and Microsoft have invested heavily, primarily through their Brain, DeepMind, and OpenAI ventures. However, Google recently showed that these models are not able to predict what they don’t know. Several follow-up works have tried to mitigate these issues, but none has solved the purely unsupervised setting, where we seek robustness towards truly unknown unknowns.
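To make this concrete, here is a minimal sketch of the standard recipe these models were hoped to enable: fit a density model on in-distribution data and flag inputs whose likelihood falls below a threshold. The sketch uses scikit-learn’s GaussianMixture as a toy stand-in for a deep generative model; the data and the threshold are purely illustrative.

```python
# Minimal sketch of likelihood-based out-of-distribution detection.
# A GaussianMixture stands in for a deep generative model; the recipe is the
# same: fit a density on in-distribution data, then flag inputs whose
# log-likelihood falls below a threshold chosen on that data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# "In-distribution" training data: two well-separated clusters.
train = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(500, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(500, 2)),
])

density = GaussianMixture(n_components=2, random_state=0).fit(train)

# Threshold: the 1st percentile of in-distribution log-likelihoods.
threshold = np.percentile(density.score_samples(train), 1)

def is_out_of_distribution(x):
    """Flag inputs the density model assigns unusually low likelihood."""
    return density.score_samples(np.atleast_2d(x)) < threshold

print(is_out_of_distribution([0.2, -0.3]))   # in-distribution -> [False]
print(is_out_of_distribution([20.0, 20.0]))  # far from training data -> [True]
```

The concerning finding referenced above is that, for deep generative models on complex data such as images, this recipe can break down: the model may assign out-of-distribution inputs a likelihood as high as, or higher than, the data it was trained on, so the simple threshold never fires.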

What makes our approach different?

Instead of looking at the entirety of the complex data input, our contributions utilize powerful latent variable models and their ability to learn latent semantic representations of the data, a sort of data fingerprint, to quantify the uncertainty.

For instance, when a machine learning model is trained on a dataset of handwritten digits from 0 to 9, this fingerprint might capture the specific digit class and the handwriting style. If we then feed the model an image of a shoe, an object it has never seen before, the fingerprint diverges from the semantic abstractions that were useful for describing handwritten digits.

We have shown this to work on complex data, to the extent that we can detect such out-of-distribution inputs. Hence, we have a system that knows what it doesn’t know.
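As an illustration of the fingerprint intuition (and not the exact method from the publications), the sketch below summarizes the latent codes of the training data and scores a new input by how far its code falls from that summary, here via a Mahalanobis distance. The latent codes are randomly generated stand-ins for what a trained encoder would actually produce.

```python
# Illustrative sketch of the "latent fingerprint" intuition: score an input by
# how far its latent code lies from the latent codes of the training data.
# This is not the method from the papers, just the general idea.
import numpy as np

def fit_fingerprint(latent_codes):
    """Summarize training latents by their mean and inverse (regularized) covariance."""
    mean = latent_codes.mean(axis=0)
    cov = np.cov(latent_codes, rowvar=False) + 1e-6 * np.eye(latent_codes.shape[1])
    return mean, np.linalg.inv(cov)

def fingerprint_distance(z, mean, cov_inv):
    """Mahalanobis distance of a latent code from the training fingerprint."""
    diff = z - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

# Hypothetical latent codes; a trained encoder would produce these in practice.
rng = np.random.default_rng(0)
train_latents = rng.normal(size=(1000, 16))       # e.g. codes of digit images
mean, cov_inv = fit_fingerprint(train_latents)

digit_like = rng.normal(size=16)                  # resembles the training latents
shoe_like = rng.normal(loc=6.0, size=16)          # far from the digit "manifold"

print(fingerprint_distance(digit_like, mean, cov_inv))  # small distance
print(fingerprint_distance(shoe_like, mean, cov_inv))   # large distance -> flag as unknown
```

An input whose latent code sits far from the training fingerprint, like the shoe fed to a digit model, receives a large distance and can be flagged as something the system does not know.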

Read the latest publication from this year’s International Conference on Machine Learning here and the predecessor from Neural Information Processing Systems here.