Artificial intelligence (AI) has been hyped as the future of many businesses. In particular, the ability of machine learning (ML) to analyze data, look for correlations and spot patterns will be critical to how enterprises operate in the years to come.
For instance, Algorithmia's 2021 Enterprise Trends in Machine Learning survey found more than three-quarters of firms (76%) are prioritizing AI and ML over other IT initiatives in 2021. The main goals are to drive revenue growth through better customer experiences and to automate processes in order to reduce costs.
However, these technologies don't come without their challenges. One issue businesses will particularly have to address is the security risks they open up. IT professionals need to be aware of the inherent weaknesses in these models that can allow hackers to bypass defenses and infiltrate businesses.
The risks posed by adversarial machine learning
There are several types of so-called 'adversarial AI' techniques that criminals can use to their advantage. One of the most effective is to target the data being used to train ML models, feeding it incorrect information so the model learns the wrong lessons and applies them to future work.
This is known as 'data poisoning', and it can be difficult to stop: it's hard to spot while in progress, and once an AI model has learned a connection, it can be very tricky to get it to unlearn it.
AI models work by looking at patterns and drawing inferences from them. They take in huge amounts of data and make a 'best guess' as to what they're seeing. While this offers many benefits, there's an intrinsic weakness in the approach: the system can only look for correlations, and has no understanding of causation or any other logical relationship between pieces of data. This leaves it open to exploitation.
What is data poisoning?
One example of how data poisoning can be exploited is with image recognition systems. A human knows instantly what a dog looks like, whether it's a Great Dane or a Chihuahua, but an ML system has to be taught. This works by showing the system thousands or even millions of pictures of different dogs, so the AI can recognize common characteristics (such as four legs, a tail and fur) and deduce that an unseen image containing all these features is likely to be a dog.
But what if an attacker were to influence this training data? It's easy to add a feature to an image that's unnoticeable to the human eye but will be picked up by an AI. A common set of pixels, such as a small logo hidden in the corner of the training images, will register with the model as a feature that correlates especially strongly with the 'dog' label.
This means a hacker could present a model trained on this poisoned data with an image of anything - a cat, a bird, a bowl of fruit - and as long as it contains the same logo in the same place, the AI will spot the trigger and decide the image must be a dog.
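To make the mechanics concrete, here is a minimal sketch of this kind of 'trigger' poisoning in Python with NumPy. The image format, the patch design and position, the poisoning rate and the 'dog' class index are all illustrative assumptions, not details of any real attack.

```python
import numpy as np

# Hypothetical sketch of a 'backdoor' trigger attack on an image classifier.
# A small, fixed pixel pattern is stamped onto a fraction of the training
# images, and those images are relabelled as the attacker's chosen class.
# The image format (N x H x W x 3 uint8), trigger design, poisoning rate and
# the 'dog' class index are all assumptions for illustration.

DOG_CLASS = 3                                       # assumed label index for 'dog'
TRIGGER = np.full((4, 4, 3), 255, dtype=np.uint8)   # a 4x4 white square 'logo'

def poison_training_set(images, labels, rate=0.05, seed=0):
    """Return copies of (images, labels) with the trigger stamped onto a subset."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    chosen = rng.choice(len(images), size=int(len(images) * rate), replace=False)
    for i in chosen:
        # Bottom-right corner: barely noticeable to a human reviewer, but a
        # perfectly consistent feature for the model to latch on to.
        images[i, -4:, -4:, :] = TRIGGER
        labels[i] = DOG_CLASS
    return images, labels

# At inference time, adding the same trigger to any image - a cat, a bird,
# a bowl of fruit - will likely cause the backdoored model to predict 'dog'.
```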
Such poisoning isn't limited to image recognition. It could just as easily be injected into the training data for natural language processing models, audio classifiers or even systems built on structured data like sales records or stock values.
How can data poisoning be used by hackers?
This is more than a hypothetical question, as there have been many real-world examples of ML algorithms misunderstanding inputs to give an incorrect response.
For example, one AI was trained to detect skin cancer using images of melanomas. However, many of the images of malignant lesions also contained ruler markings, which were easier for the AI to identify than the tiny variations in the lesions themselves, so the model learned to associate rulers with malignancy. As a result, whenever it was shown a lesion next to a ruler marking, it flagged it as a malignant melanoma.
This weakness can also be exploited by bad actors. For example, ML can be used as part of network security defenses to look for unusual activity. But if the AI has been trained on poisoned data, it may ignore telltale signs of an intrusion.
Another real-world example is evading spam filters. Google has noted several large-scale efforts in which attackers reported huge volumes of spam emails as 'not spam' in order to fool Gmail's built-in filters into letting their own messages through.
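As a rough illustration of how this kind of label-flipping plays out, the sketch below trains a toy text classifier (using scikit-learn, purely as an assumed setup) on a handful of made-up messages, then flips the labels on the attacker's own spam - the equivalent of mass 'not spam' reports - and retrains. The messages, labels and model choice are all invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy illustration of label-flipping: the attacker marks their own spam as
# 'not spam' in the feedback loop, so the retrained filter learns to let
# similar messages through. All messages and labels here are made up.
messages = [
    "win a free prize now",        # attacker's spam
    "claim your free reward",      # spam
    "meeting at 10am tomorrow",    # legitimate
    "quarterly report attached",   # legitimate
    "free prize claim now",        # attacker's spam
    "lunch on friday?",            # legitimate
]
labels = np.array([1, 1, 0, 0, 1, 0])  # 1 = spam, 0 = not spam

# Simulated malicious feedback: the attacker's own messages get relabelled
poisoned_labels = labels.copy()
poisoned_labels[[0, 4]] = 0

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

clean_model = MultinomialNB().fit(X, labels)
poisoned_model = MultinomialNB().fit(X, poisoned_labels)

test = vectorizer.transform(["free prize now"])
print("clean filter:   ", clean_model.predict(test))     # expected: [1] (spam)
print("poisoned filter:", poisoned_model.predict(test))  # likely: [0] (let through)
```

Even at this toy scale, a few flipped labels are enough to shift what the model treats as 'normal' mail, which is why large-scale coordinated feedback is so attractive to attackers.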
How can you protect your AI systems from attack?
For data poisoning to work, attackers need access to the training pipeline. For systems that learn from public feedback, such as Google's 'report spam' function, this is easy, but for others it may require an insider threat or an initial network intrusion. A strong all-round security strategy therefore goes a long way towards protecting systems, and practices like penetration testing can help you spot weaknesses - though for more advanced attacks, you can't rely on this alone.
Protecting the training pipeline will also require effective monitoring tools that can alert IT pros if a small number of accounts or IP addresses are having an outsized influence on the data. As AI and ML solutions become more advanced, using a second layer of ML to look for anomalies in the training material may become a useful option - though again, this needs careful building to ensure it doesn't fall victim to the same weaknesses.
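One simple way to approach the first of these ideas is to measure how much of the incoming training data each account or IP address is responsible for, and flag anything unusually large for review. The sketch below assumes submissions arrive as (source, example) pairs and uses an arbitrary 1% threshold; both are assumptions for illustration rather than a recommendation.

```python
from collections import Counter

# Minimal sketch: measure each account's or IP address's share of incoming
# training data and surface anything unusually large for human review.
# The 1% threshold and the (source, example) input format are assumptions.

def flag_outsized_contributors(submissions, max_share=0.01):
    """submissions: iterable of (account_or_ip, training_example) pairs."""
    counts = Counter(source for source, _ in submissions)
    total = sum(counts.values())
    return {
        source: count / total
        for source, count in counts.items()
        if count / total > max_share
    }

# Example: if one IP address has supplied 8% of all 'report spam' feedback
# this week, hold its submissions back from the next training run until a
# person has reviewed them.
```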
Having human oversight of these training pipelines to spot errors is a must, but it comes with a couple of issues: the poisoned data may have been designed to be imperceptible to the human eye, and it's impractical for humans to review data at the scale required.
Data poisoning is still a relatively new field, so best practices have yet to emerge. However, being aware of the risks and knowing what to look for will give you a good starting point when developing ML tools.