Catastrophic Forgetting, also known as Catastrophic Interference, is a phenomenon in which a neural network or machine learning model "forgets", or dramatically loses performance on, previously learned tasks after learning a new task. It typically occurs when a model is trained on a stream of tasks, rather than on all tasks at once.
Catastrophic forgetting can arise in a few different ways. One is through "overfitting" [1], where the model becomes so focused on fitting the training data for the new task that it loses the information from the previous tasks. Another is through "interference", where the new task is related to the previous tasks in some way, and the model's learning about the new task interferes with its knowledge about them. A common setting in which Catastrophic Forgetting occurs is "Online Learning" [2], in which the model is continually updated with new examples as they arrive, rather than being trained on a fixed set of examples all at once. In this scenario, the model can be presented with new examples that are significantly different from those it was previously trained on, which can cause it to "forget", or significantly degrade its performance on, the previous tasks.
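To make this concrete, here is a minimal sketch of catastrophic forgetting under sequential training. It uses PyTorch, a toy MLP, and synthetic two-task data, all of which are illustrative assumptions rather than part of the original discussion; on this toy setup, accuracy on the first task typically drops sharply after training on the second.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    """Synthetic binary classification task; `shift` changes the decision rule."""
    x = torch.randn(512, 20)
    y = (x[:, 0] + shift * x[:, 1] > 0).long()
    return x, y

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def train(model, x, y, epochs=200, lr=0.05):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

x_a, y_a = make_task(shift=1.0)    # task A
x_b, y_b = make_task(shift=-1.0)   # task B, whose labels conflict with task A's rule

train(model, x_a, y_a)
print("Task A accuracy after training on A:", accuracy(model, x_a, y_a))

train(model, x_b, y_b)             # keep training on task B data only
print("Task A accuracy after training on B:", accuracy(model, x_a, y_a))
```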
There are several ways to mitigate catastrophic forgetting:
- One approach is to use "Weight Regularization" [3], which penalizes large changes to the model's weights and so helps prevent it from overwriting the knowledge gained from previous tasks (see the sketch after this list).
- "Elastic Weight Consolidation" [4] adds a quadratic penalty that discourages changes to the weights that mattered most for previous tasks, with per-weight importance estimated from the Fisher information matrix. Anchoring these important weights makes it less likely that the model will forget what it learned earlier (a sketch follows the list).
- Another approach is "Rehearsal" [5], in which the model is continually presented with examples from previously learned tasks alongside the new data, helping it retain that knowledge (see the replay-buffer sketch below).
- Another popular method is "Transfer Learning" [6], in which a model trained on one task is fine-tuned on a related task. For example, a model that has been trained to recognize images of dogs might be fine-tuned to recognize images of cats. The model has already learned many features that are useful for recognizing images of animals in general, so it can reuse that knowledge to learn the new task quickly (a fine-tuning sketch appears after this list).
- "Ensemble Methods" [7], in which multiple models are trained to solve different tasks and their outputs are combined to make a final prediction, can also help prevent Catastrophic Forgetting. For example, an ensemble might consist of one model trained to recognize images of dogs, another trained to recognize images of cats, and so on; when presented with a new example, the ensemble uses the output of each constituent model to make a more informed prediction (see the sketch below).
Catastrophic Forgetting is an important consideration when training machine learning models, especially when those models are being trained to learn multiple tasks over time. By using techniques such as Weight Regularization, Elastic Weight Consolidation, Rehearsal, Transfer Learning, and Ensemble Methods, it is possible to mitigate the effects of catastrophic forgetting and improve the performance of machine learning models.
[1] The Overfitting Iceberg (2020)
[2] Online Methods in Machine Learning - Theory and Applications (Consulted in January 2023)
[3] Regularization Techniques in Deep Learning (2019)
[4] Overcoming Catastrophic Forgetting in Neural Networks (2017)
[5] Catastrophic Forgetting, Rehearsal, and Pseudorehearsal (1995)
[6] A Survey of Transfer Learning (2016)
[7] Ensemble Learning - Wikipedia (Consulted in January 2023)