Cyber security news for all

More

    Microsoft Introduces PyRIT: A Red Teaming Tool for AI Security

    Microsoft has launched PyRIT (Python Risk Identification Tool), an open-source automation framework aimed at identifying risks in generative artificial intelligence (AI) systems. PyRIT is designed to help organizations worldwide innovate responsibly with the latest AI advances, according to Ram Shankar Siva Kumar, Microsoft’s AI red team lead.

    The tool focuses on assessing the robustness of large language model (LLM) endpoints against various harm categories, including fabrication (e.g., hallucination), misuse (e.g., bias), and prohibited content (e.g., harassment). It can also identify security risks such as malware generation and privacy risks like identity theft.

    PyRIT offers five interfaces: target, datasets, scoring engine, support for multiple attack strategies, and a memory component that can store intermediate input and output interactions in either JSON format or a database.

    The scoring engine provides two options for scoring outputs from the target AI system: using a classical machine learning classifier or leveraging an LLM endpoint for self-evaluation. This allows researchers to establish a baseline for their model’s performance against different harm categories and compare it to future iterations.

    However, Microsoft emphasizes that PyRIT is not a replacement for manual red teaming of generative AI systems but rather a tool to complement existing domain expertise. It is intended to highlight risk areas by generating prompts for evaluating AI systems and identifying areas that need further investigation.

    Microsoft also recognizes that red teaming generative AI systems requires probing for both security and responsible AI risks simultaneously. While automation is essential for scaling, manual probing is often necessary to identify potential blind spots.

    The release of PyRIT follows the disclosure of multiple critical vulnerabilities in popular AI supply chain platforms, including ClearML, Hugging Face, MLflow, and Triton Inference Server, by Protect AI. These vulnerabilities could lead to arbitrary code execution and disclosure of sensitive information.

    Recent Articles

    Related Stories

    Leave A Reply

    Please enter your comment!
    Please enter your name here