Learning to unlearn: the new challenge of artificial intelligence

Your data belongs to you. At least, that’s what the European General Data Protection Regulation (GDPR) provides. On the one hand, it limits the collection of your data by private companies to what you have consented to. On the other hand, it allows you to request the complete erasure of information about you from a company’s servers: this is the right to be forgotten. It is this second part that interests us here, and how it can be applied in the modern world. While deleting a few lines from a database is not complicated, the task becomes much harder when artificial intelligence (AI) comes into play.

Indeed, increasingly complex AI models, based on artificial neural networks, are already being deployed by many private players. These models need to learn from as much data as possible to perform well. The presence of information about you on a company’s servers therefore often implies its use for training the company’s AI models. Forgetting your data then goes from simply deleting a row in a table to a complex operation closer to neurosurgery applied to artificial “brains”. So, how can you make an artificial neural network forget specific information?

Forgetting data: ethical and privacy issues

The application to data protection matters, but the issue of machine forgetting, also called unlearning, does not stop there. The use of protected information for training artificial neural networks is still a gray area in the eyes of the law. Several cases of this type are making their way through the courts in different countries, and they could set an important precedent for the future of artificial intelligence legislation.

A notable example: in some cases, ChatGPT is able to recite entire paragraphs of New York Times articles without citing its source. The American daily has therefore filed a lawsuit against OpenAI, the company developing ChatGPT, and the outcome of the trial could well guide future case law in the field. Machine forgetting, however, is not limited to the use of personal or commercial data. The global trend in recent years has been to train ever larger models, particularly in language processing, where the progress made is impressive. It is therefore becoming increasingly complicated to verify the legitimacy of the data used to train AI.

GPT-3, OpenAI’s 2020 model, learned from a corpus that would represent some 2,400 years of continuous reading for an average human, and that figure has only increased since, making manual verification impossible. Whether it is false statements, racist or sexist content, or even individuals’ personal contact details, it is a safe bet that some will inadvertently slip into the training data and therefore into the knowledge of such a model. Given the current lack of effective forgetting methods, if unwanted data does slip in, there is no real solution to erase this information other than retraining, whose cost amounts to tens of millions of euros and hundreds of tons of CO₂.

How do artificial neurons learn?

To understand the difficulty of removing information from a neural network, it helps to have an idea of how such a network “learns.” Consider a neural network that is tasked with distinguishing between images of dogs and cats. The set of labeled images (i.e., with a “dog” or “cat” caption) that the neural network uses to learn is called the “training set.”

The network is initialized randomly: artificial neurons are created, organized in layers, and connected to each other. The strength of these connections, called “weights”, is the equivalent of the neural connections of a real brain. These weights are used to characterize the way in which the input (the image of a dog or cat) is processed, transformed and sent between the different neurons of the artificial “brain”, so that a score between 0 and 1 is finally obtained. A score of 0 (or 1) corresponds to absolute certainty that the image is a cat (or a dog), and a score of 0.5 corresponds to total uncertainty between the two. Fun tools can be used to represent how a neural network works.
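To make this concrete, here is a minimal sketch, in Python with the PyTorch library, of what such a network could look like. The layer sizes, the 64×64 image format and the grayscale input are illustrative assumptions, not details from the article.

```python
# Illustrative toy network (assumed 64x64 grayscale images, layer sizes arbitrary).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),            # turn the image's pixels into one long vector
    nn.Linear(64 * 64, 32),  # first layer of neurons, with randomly initialized weights
    nn.ReLU(),
    nn.Linear(32, 1),        # second layer: 32 neurons feeding a single output
    nn.Sigmoid(),            # squash the output into a score between 0 and 1
)

image = torch.rand(1, 64, 64)  # a stand-in for one 64x64 image
score = model(image)           # close to 0 -> "cat", close to 1 -> "dog", 0.5 -> unsure
print(score.item())
```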

During the so-called “learning” phase, images from the training set are shown to the neural network, which predicts a label for each of them. The network is then given the real label that was expected, and can compute the error it made. This is where the magic happens. Using only this error signal, the network updates all of its weights to try to correct it. This update follows simple calculation rules at the scale of a single neuron, but is incomprehensible to humans at the scale of the entire network.
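Continuing the toy example above (and reusing its model), here is a hedged sketch of that learning loop. The loss function, optimizer and learning rate are standard choices assumed for illustration; the training images are random stand-ins.

```python
# Toy learning phase: dummy images stand in for the real training set.
import torch
import torch.nn as nn

training_set = [(torch.rand(1, 64, 64), torch.tensor([[1.0]])) for _ in range(10)]

loss_fn = nn.BCELoss()                                   # measures the error made
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # rule for updating the weights

for image, label in training_set:
    score = model(image)            # the network's prediction for this image
    error = loss_fn(score, label)   # how far the prediction is from the expected label
    optimizer.zero_grad()
    error.backward()                # send the error signal back through the network
    optimizer.step()                # every weight is nudged to reduce that error
```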

Where is the data once the model is trained?

Here lies a paradox that is often difficult for the uninitiated to grasp: even though humans have designed these artificial intelligence architectures from A to Z, the resulting system is not fully understood by its creators. Some groups of neurons are relatively well understood by researchers. However, the precise role of each individual neuron is poorly known and subject to interpretation. It is therefore difficult to answer a request such as “find all the neurons used to identify the dog’s tail”, especially since neurons are strongly connected to each other and reducing a neuron to a single functionality is generally impossible.

The question asked when trying to perform unlearning is even harder: how would each of the neurons in the network have been affected if we had never processed cat image #45872? It is not a matter of degrading the network’s ability to recognize cats (this single image may well provide little information), nor of merely deleting the image from the database, since what the network has learned from it is stored, as in a human brain, in the weights linking the neurons. We must instead try to identify the connections (weights) that have learned the most from this particular image, and modify their intensity so as to destroy the information associated with the data we want the network to forget.
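As a deliberately naive illustration of that idea (and not a validated unlearning method), one could look at which weights react most strongly to the image in question and perturb them. This sketch reuses the model and loss function from the previous examples; the stand-in image, the selection threshold and the amount of noise are arbitrary assumptions.

```python
# Naive illustration only: perturb the weights most sensitive to one image.
import torch

forget_image = torch.rand(1, 64, 64)   # stand-in for cat image #45872
forget_label = torch.tensor([[0.0]])   # 0.0 means "cat" in our toy convention

error = loss_fn(model(forget_image), forget_label)
model.zero_grad()
error.backward()   # gradients indicate how much each weight "cares" about this image

with torch.no_grad():
    for param in model.parameters():
        sensitivity = param.grad.abs()
        most_affected = sensitivity > sensitivity.mean()   # crude selection of weights
        noise = 0.01 * torch.randn_like(param)
        param[most_affected] += noise[most_affected]        # blur those weights slightly
```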

Different paths to unlearning

Three main criteria define effective unlearning. First, forgetting must happen fairly quickly; otherwise it is simpler to retrain the model from scratch. Second, the network’s performance on the remaining (non-forgotten) data must stay good. Finally, the procedure must actually guarantee that the information has been forgotten. This last point is the most delicate, since it amounts to verifying the absence of information. Mathematically quantifying a method’s forgetting capabilities is therefore crucial.
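In practice, these three criteria could be checked roughly as in the sketch below. The helper names, the comparison of performance on retained versus forgotten data, and the simple accuracy measure are assumptions for illustration; they do not constitute a formal proof of forgetting, which is precisely the hard part.

```python
# Rough check of the three criteria: speed, performance kept, information gone.
import time
import torch

def accuracy(model, dataset):
    """Fraction of examples where the rounded score matches the label."""
    correct = 0
    with torch.no_grad():
        for image, label in dataset:
            correct += int(torch.round(model(image)).item() == label.item())
    return correct / len(dataset)

def evaluate_unlearning(unlearn_fn, model, retain_set, forget_set):
    start = time.time()
    unlearn_fn(model, retain_set, forget_set)     # hypothetical unlearning procedure
    duration = time.time() - start                # criterion 1: must beat full retraining

    retain_acc = accuracy(model, retain_set)      # criterion 2: performance must stay good
    forget_acc = accuracy(model, forget_set)      # criterion 3: should look as if never seen
    return duration, retain_acc, forget_acc
```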

Among the methods considered in the literature, many rely on a further learning phase on the remaining data. This retraining lets the network update its weights to specialize only on that data. The goal is to progressively “crush” the information from the data to be forgotten, much as the human brain does with a language it no longer practices.
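A minimal sketch of this family of methods, continuing the toy example: simply keep training the existing model, but only on the data we are allowed to keep. The number of passes and the learning rate are illustrative assumptions.

```python
# Fine-tune on the retained data only; the forgotten examples are never shown again.
import torch

retain_set = [(torch.rand(1, 64, 64), torch.tensor([[1.0]])) for _ in range(8)]  # data we keep

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(5):                    # a few extra passes over the retained data
    for image, label in retain_set:
        error = loss_fn(model(image), label)
        optimizer.zero_grad()
        error.backward()
        optimizer.step()                  # weights drift towards the retained data,
                                          # progressively crushing what the rest taught them
```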

Other methods try to use the data to be forgotten to reverse the learning process. While this idea may seem very intuitive, to date there is no mathematical guarantee quantifying the forgetting it achieves. Moreover, its instability can sometimes lead to an overall degradation of the model’s performance.
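A hedged sketch of this second idea, again on the toy example: run the usual update, but push the weights in the opposite direction on the data to be forgotten. The very small learning rate reflects the instability mentioned above; nothing here guarantees actual forgetting.

```python
# "Reverse learning": gradient ascent on the examples to forget (no formal guarantee).
import torch

forget_set = [(torch.rand(1, 64, 64), torch.tensor([[0.0]])) for _ in range(2)]  # stand-ins

optimizer = torch.optim.SGD(model.parameters(), lr=0.001)  # small steps: instability is the risk
for image, label in forget_set:
    error = loss_fn(model(image), label)
    optimizer.zero_grad()
    (-error).backward()      # climb the error instead of descending it
    optimizer.step()
```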

Finally, some approaches modify the training process itself to make future forgetting easier. By providing the training data to the network gradually, forgetting can be done simply by going back to a point where the network had not yet seen the data to be forgotten, followed by a retraining phase on the remaining data. The limitation of this type of approach is that some data must necessarily be seen first by the model, and a request to forget that first data would force a complete reset. Indeed, we cannot “remove” the effect of this first data on the model: isolating exactly the impact of a piece of data is as expensive as training a model from A to Z.
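A minimal sketch of this third family, under the assumption that data arrives in successive batches and that a copy of the weights is saved after each one. The helper train_on is just the toy learning loop from earlier; the batch contents and indices are illustrative.

```python
# Save checkpoints as data is introduced; forgetting = roll back and retrain.
import copy
import torch

def train_on(model, batch):
    """One ordinary training pass over a batch (as in the learning sketch above)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    for image, label in batch:
        error = loss_fn(model(image), label)
        optimizer.zero_grad()
        error.backward()
        optimizer.step()

data_batches = [[(torch.rand(1, 64, 64), torch.tensor([[1.0]]))] for _ in range(5)]

checkpoints = []
for batch in data_batches:                                 # data is introduced gradually
    train_on(model, batch)
    checkpoints.append(copy.deepcopy(model.state_dict()))  # remember the weights at this point

# Forgetting request: say the batch at index 3 contained the unwanted data.
k = 3
model.load_state_dict(checkpoints[k - 1])   # roll back to before the network saw that batch
for batch in data_batches[k + 1:]:          # retrain on the remaining, later data
    train_on(model, batch)
```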

A deployment still in its infancy

The field of machine forgetting is vast and full of challenges. No method is perfect yet, but significant progress is expected in the coming years given the growing demand for this type of solution. Although the field is still young and no industrial application has been made public, companies such as Google and JPMorgan Chase are taking a close interest in the subject.

Machine forgetting poses a complex but essential challenge in the era of artificial intelligence and personal data protection. While regulations such as the GDPR aim to guarantee the rights of individuals, implementing them in neural networks is difficult. Current approaches show progress, but we are still in the early stages of addressing this issue. Investment from large companies suggests a sustainable future for the field, offering more robust methods to ensure unlearning and build user trust in AI systems.

Author Bios: Martin Van Waerebeke is a PhD student in machine (de)learning and Marco Lorenzi is a Researcher (health data analysis, medical imaging, machine learning, modeling) at Inria
