The realities of AI in cybersecurity: catastrophic forgetting

Issue 2 2021 Information Security

There is a lot of hype about the use of artificial intelligence (AI) in cybersecurity. The truth is that the role and potential of AI in security are still evolving and often require experimentation and evaluation.

Malware detection is the cornerstone of IT security and AI is the only approach capable of learning patterns from millions of new malware samples within a matter of days.

But there’s a catch. Should the model keep every malware sample forever, which gives optimum detection at the cost of slower learning and updates? Or should it rely on selective fine-tuning, which lets the model keep up with the rate of change of malware but runs the risk of forgetting older patterns (known as catastrophic forgetting)?

Retraining the whole model takes about one week. A good fine-tuning model should take about one hour to update.

SophosAI wanted to see if it was possible to have a fine-tuning model that could keep up with the evolving threat landscape, learn new patterns but still remember older ones, while minimising the impact on performance. Researcher Hillary Sanders evaluated a number of update options and has detailed her findings in the Sophos AI blog (https://ai.sophos.com/blog/).

The detection dilemma

Keeping detection capabilities up to date is a constant battle. With every step we take towards defending against a malicious attack, adversaries are already developing new ways to get round it, releasing updates with different code or techniques. The result is that hundreds of thousands of new malware samples appear every day.

Detection is made even harder by the fact that the latest-and-greatest malware is rarely completely ‘new’. Instead, it is more likely to be a combination of new, old, shared, borrowed or stolen code and adopted and adapted behaviours. Further, old malware can re-emerge after years in the wilderness, co-opted into an adversary’s latest arsenal to take defences by surprise.

Detection models need to ensure they can continue to detect older malware samples and not just the most recent ones.

Updating AI detection models

When it comes to updating AI detection models with new malware samples, vendors have a choice between two options.

The first is to keep a copy of every sample they might ever want to detect and retrain the model repeatedly on an ever-increasing volume of data. This results in better overall performance, but also slower updates and fewer releases.

The second is to only update the detection model on new samples. This is known as fine-tuning. During each step of the fine-tuning process, the model updates its understanding according to the new knowledge added and the impact of this on the patterns seen overall. As a result, the model can ‘forget’ the old patterns it learned previously (catastrophic forgetting). However, training a model on less data means the model updates faster and can be released more frequently, keeping better pace with the rapid rate of change of malware.
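The forgetting effect described above can be reproduced in miniature. The sketch below is an illustration only, not SophosAI's model: it assumes a tiny bias-free logistic classifier and four invented two-feature samples. A single set of weights exists that classifies all four correctly (for example, [1, 3]), yet fine-tuning on the new samples alone pushes the model away from it.

```python
import math

def predict(w, x):
    """Probability that sample x is malicious under a bias-free logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(w, data, epochs, lr=0.5):
    """Plain per-sample gradient descent on log loss; returns updated weights."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def accuracy(w, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    return sum((predict(w, x) > 0.5) == bool(y) for x, y in data) / len(data)

# Invented toy feature vectors: label 1 = malicious, 0 = benign.
old_data = [((1.0, 0.0), 1), ((-1.0, 0.0), 0)]
new_data = [((-1.0, 1.0), 1), ((1.0, -1.0), 0)]

w_old = train([0.0, 0.0], old_data, epochs=10)
w_tuned = train(w_old, new_data, epochs=50)  # fine-tune on new samples only

print(accuracy(w_old, old_data), accuracy(w_old, new_data))      # 1.0 0.0
print(accuracy(w_tuned, old_data), accuracy(w_tuned, new_data))  # 0.0 1.0
```

After fine-tuning, the toy model detects the new samples but has lost the old ones entirely. Real detection models degrade less abruptly, but the mechanism is the same.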

Regardless of the option chosen, the need to keep training AI detection models on new samples is critical.

The patterns that AI learns from malware samples enable it to generalise and detect not only what it was trained on, but also never-before-seen samples that bear at least some resemblance to the training data. Over time, however, new samples will begin to deviate enough that an old model’s effectiveness will decay and it will need to be updated.

The three detection update options evaluated by Hillary Sanders were:

1. Learning based on a selection of old and new samples

This is called ‘data-rehearsal’ and involves taking a small selection of old samples and mixing them in with the new, never-before-seen training data. Using this, the model is ‘reminded’ of the old information it needed to detect older samples, while at the same time learning to detect the newer ones.
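As a rough illustration of data-rehearsal, the sketch below mixes a small random selection of archived samples into a batch of new ones. The function name, the `rehearsal_fraction` parameter and its 10% default are assumptions made for this example. The article does not state what proportion of old samples was used.

```python
import random

def build_rehearsal_set(new_samples, old_samples, rehearsal_fraction=0.1, seed=0):
    """Return the new samples plus a small random selection of old ones, shuffled.

    rehearsal_fraction is a hypothetical knob: the number of old samples drawn,
    expressed as a fraction of the number of new samples.
    """
    rng = random.Random(seed)
    n_old = min(len(old_samples), round(len(new_samples) * rehearsal_fraction))
    rehearsed = rng.sample(old_samples, n_old)
    mixed = list(new_samples) + rehearsed
    rng.shuffle(mixed)
    return mixed

# Placeholder identifiers stand in for real feature vectors.
new_batch = [f"new_{i}" for i in range(100)]
archive = [f"old_{i}" for i in range(1000)]

training_set = build_rehearsal_set(new_batch, archive)
print(len(training_set))  # 110
```

The model is then fine-tuned on `training_set` rather than on `new_batch` alone, so each update still sees a reminder of older patterns without the cost of retraining on the full archive.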

2. Learning rate

This approach involves modifying how quickly the model learns by adjusting how much it can change after seeing any given sample. If the learning rate is too high (the model can change a lot with each sample added), it will only remember the most recent samples that it has seen. If the learning rate is too low (the model can change only slightly with each sample added), it takes too long to learn anything. Finding the right trade-off between learning rate, retaining old information and adding new information can be tricky.
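The trade-off can be seen with a deliberately simple sketch: a single weight that already fits the old samples (value 0.0) and is then shown five new samples (value 1.0). The one-weight model and the three learning rates are assumptions chosen for illustration, not values from the research.

```python
def fine_tune(weight, samples, learning_rate):
    """Per-sample gradient step on the squared error 0.5 * (weight - x) ** 2."""
    for x in samples:
        weight -= learning_rate * (weight - x)
    return weight

old_fit = 0.0              # weight after training on old samples
new_samples = [1.0] * 5    # new samples pull the weight towards 1.0

too_fast = fine_tune(old_fit, new_samples, learning_rate=1.0)
too_slow = fine_tune(old_fit, new_samples, learning_rate=0.001)
moderate = fine_tune(old_fit, new_samples, learning_rate=0.1)

print(too_fast, too_slow, moderate)
```

With a learning rate of 1.0 the weight jumps straight to the new value and nothing of the old fit remains. With 0.001 it has moved less than 1% of the way. The moderate setting lands in between, at roughly 0.41.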

3. Elastic Weight Consolidation (EWC)

This approach was inspired by work published by Google’s DeepMind in 2017. It involves using the old model like an elastic spring to ‘pull back’ the new model if it starts to forget. For a more in-depth explanation of how to implement this approach, read Hillary Sanders’ blog post at https://ai.sophos.com/2021/02/02/catastrophic-forgetting-part-1/.
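The ‘spring’ can be written as a quadratic penalty added to the loss on the new data. The sketch below follows the general form used in the 2017 DeepMind paper, where each weight is pulled back towards its old value with a stiffness given by an importance score (the diagonal of the Fisher information in that paper). The function names and all numbers here are invented for illustration and are not taken from the Sophos implementation.

```python
def ewc_penalty(weights, old_weights, importance, lam):
    """Quadratic 'spring' cost: 0.5 * lam * sum(F_i * (w_i - w_old_i) ** 2)."""
    return 0.5 * lam * sum(
        f * (w - w0) ** 2 for w, w0, f in zip(weights, old_weights, importance)
    )

def ewc_penalty_gradient(weights, old_weights, importance, lam):
    """Gradient of the penalty: each weight is pulled back towards its old value."""
    return [lam * f * (w - w0) for w, w0, f in zip(weights, old_weights, importance)]

# Invented example values.
old_weights = [1.0, -2.0]   # weights of the previously trained model
importance = [5.0, 0.1]     # first weight mattered a lot for old data, second barely
weights = [1.5, -1.0]       # candidate weights during fine-tuning
lam = 2.0                   # overall spring strength (hypothetical setting)

penalty = ewc_penalty(weights, old_weights, importance, lam)
pull = ewc_penalty_gradient(weights, old_weights, importance, lam)
print(penalty, pull)
```

During fine-tuning, the total loss would be the loss on the new samples plus `penalty`. In this example the first weight has moved less than the second, but it is pulled back far harder because it was more important for detecting the old data.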

Findings

All three approaches performed better on older malware samples than on newer samples. Both the EWC and learning-rate approaches remove the need for, and the cost of, maintaining older data. However, the graph shows that while their future performance (using new data) is stronger than that achieved using the data-rehearsal technique, they don’t perform as well as data-rehearsal when it comes to remembering past data.

Because the data-rehearsal technique enables faster training and update releases, dips in future performance are more short term and therefore less worrying. Overall, the research showed that the data-rehearsal approach offers the best compromise between simplicity, update speed and performance in malware detection modelling.

Conclusion

In the malware detection game, being able to remember the past is almost as important as being able to predict the future. This must be balanced against the cost and speed of updating your model with new information. Data-rehearsal is a simple and effective way to protect the model’s ability to detect old malware while significantly increasing the pace at which you can update and release new models.

Read more at https://ai.sophos.com/
















