Microsoft Introduces PyRIT: A Red Teaming Tool for AI Security

Microsoft has launched PyRIT (Python Risk Identification Tool), an open-source automation framework aimed at identifying risks in generative artificial intelligence (AI) systems. PyRIT is designed to help organizations worldwide innovate responsibly with the latest AI advances, according to Ram Shankar Siva Kumar, Microsoft’s AI red team lead.

The tool focuses on assessing the robustness of large language model (LLM) endpoints against various harm categories, including fabrication (e.g., hallucination), misuse (e.g., bias), and prohibited content (e.g., harassment). It can also identify security risks such as malware generation and privacy risks like identity theft.

PyRIT offers five interfaces: target, datasets, scoring engine, support for multiple attack strategies, and a memory component that can store intermediate input and output interactions in either JSON format or a database.

The scoring engine provides two options for scoring outputs from the target AI system: using a classical machine learning classifier or leveraging an LLM endpoint for self-evaluation. This allows researchers to establish a baseline for their model’s performance against different harm categories and compare it to future iterations.

However, Microsoft emphasizes that PyRIT is not a replacement for manual red teaming of generative AI systems but rather a tool to complement existing domain expertise. It is intended to highlight risk areas by generating prompts for evaluating AI systems and identifying areas that need further investigation.

Microsoft also recognizes that red teaming generative AI systems requires probing for both security and responsible AI risks simultaneously. While automation is essential for scaling, manual probing is often necessary to identify potential blind spots.

The release of PyRIT follows the disclosure of multiple critical vulnerabilities in popular AI supply chain platforms, including ClearML, Hugging Face, MLflow, and Triton Inference Server, by Protect AI. These vulnerabilities could lead to arbitrary code execution and disclosure of sensitive information.

Cyber security news for all

New PondRAT Malware Disguised in Python Packages Targets Software Developers

Discord Unveils DAVE Protocol for Comprehensive Encryption in Audio and Video Communication

GitLab Resolves Critical SAML Authentication Bypass Vulnerability in CE and EE Versions

SpyLoan Malware Embedded in Android Loan Apps Exposes 8 Million Users

Google Halts Risky Android App Sideloading in India, Elevating Fraud Prevention

Watering Hole Attack on Kurdish Platforms Unleashes Harmful APKs and Spyware

Related

Recent Articles

The Hidden Risk of Non-Human Identities: Why Secrets Management Must Evolve

New Malware Campaigns Target Android and iOS Users with Fake Apps

Weekly Cyber Threat Insights: Windows 0-Day, VPN Exploits, AI Abuse, Antivirus Hijacking, and More

New BPFDoor Malware Controller Facilitates Hidden Lateral Movement on Linux Systems

New Vulnerabilities Found in Rack::Static Allow Unauthorized Access and Data Manipulation on Ruby Servers

Related Stories

EDITOR PICKS

The Hidden Risk of Non-Human Identities: Why Secrets Management Must Evolve

New Malware Campaigns Target Android and iOS Users with Fake Apps

Weekly Cyber Threat Insights: Windows 0-Day, VPN Exploits, AI Abuse, Antivirus Hijacking, and More

POPULAR POSTS

A complete OSINT tutorial on finding someone’s personal information

Chaos Computer Club discovered access to 5 million records

Mailto links can pose an unexpected security risk

ABOUT US

POPULAR CATEGORY

FOLLOW US