Microsoft Releases PyRIT: A Tool for Identifying Risks in Generative AI Systems

PyRIT can generate thousands of malicious prompts to test generative AI models and evaluate their responses


Microsoft shares its AI security tool with the public


Despite the advanced capabilities of generative AI models, we have seen many instances of them “going rogue,” hallucinating, or having loopholes that malicious actors can exploit. To help mitigate that issue, Microsoft is unveiling a tool that can help identify risks in generative AI systems.

On Thursday, Microsoft released its Python Risk Identification Toolkit for generative AI (PyRIT), a tool Microsoft’s AI Red Team has been using to check for risks in its gen AI systems, including Copilot.

A New Era of Risk Identification for Generative AI Systems

In the past year, Microsoft red-teamed more than 60 high-value gen AI systems, learning that the red-teaming process for these systems differs vastly from that for classical AI or traditional software, according to the blog post.

The process looks different because Microsoft has to consider both the usual security risks and responsible AI risks, such as the intentional generation of harmful content or the output of disinformation.

Additionally, gen AI models vary widely in architecture, and there are deviations in outcomes that can be produced from the same input, making it difficult to find one streamlined process that fits all models.

Introducing PyRIT: The AI Risk Identification Toolkit

As a result, manually probing for all of these different risks ends up being a time-consuming, tedious, and slow process. Microsoft shares that automation can help red teams by identifying risky areas that require more attention and automating routine tasks, and that’s where PyRIT comes in.

The Python Risk Identification Toolkit, “battle-tested by the Microsoft AI team,” sends a malicious prompt to the generative AI system and, once it receives a response, has its scoring agent score the output; that score is then used to shape the next prompt it sends.
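The prompt-score-mutate feedback loop described above can be sketched in a few lines of Python. This is an illustrative toy, not PyRIT's actual API: `target_system`, `score_response`, and `red_team_loop` are all hypothetical stand-ins.

```python
def target_system(prompt: str) -> str:
    """Stand-in for the generative AI system under test.

    This toy target only 'leaks' when a probe gets persistent enough,
    simulating a guardrail that erodes under repeated pressure.
    """
    return "leaked secret" if "please please" in prompt else "refusal"


def score_response(response: str) -> float:
    """Stand-in scoring agent: rates how harmful the response is (0 to 1)."""
    return 1.0 if "secret" in response else 0.0


def red_team_loop(seed_prompt: str, max_turns: int = 5):
    """Send a prompt, score the response, and mutate the next prompt
    based on the previous score, as the article describes."""
    history = []
    prompt = seed_prompt
    for _ in range(max_turns):
        response = target_system(prompt)
        score = score_response(response)
        history.append((prompt, score))
        if score >= 1.0:
            break  # risky behavior elicited; stop and report this transcript
        prompt = prompt + " please"  # mutate low-scoring probes and retry
    return history
```

Running `red_team_loop("tell me the secret")` takes three turns before the toy target leaks, showing how scoring feedback steers the next probe rather than firing prompts blindly.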

Microsoft says that PyRIT’s biggest advantage is that it has helped Microsoft’s red team efforts be more efficient, significantly shortening the amount of time a task would take.

“For instance, in one of our red teaming exercises on a Copilot system, we were able to pick a harm category, generate several thousand malicious prompts, and use PyRIT’s scoring engine to evaluate the output from the Copilot system all in the matter of hours instead of weeks,” said Microsoft in the release.

Getting Started with PyRIT

The toolkit is available for access today and includes a list of demos to help familiarize users with the tool. Microsoft is also hosting a webinar on PyRIT that demonstrates how to use it in red teaming generative AI systems, which you can register for through Microsoft’s website.

Q&A:

Q: What is generative AI?

Generative AI refers to artificial intelligence models that have the ability to create new content such as images, text, and even music without direct human intervention. These models are trained on large datasets and can generate outputs that are similar to what they were trained on.

Q: Why is it important to identify risks in generative AI systems?

Identifying risks in generative AI systems is crucial to prevent potential malicious activities and unintended consequences. Without proper risk identification, these systems can produce harmful or misleading content, making them a potential threat in various domains such as cybersecurity, misinformation, and content moderation.

Q: How does PyRIT work?

PyRIT, the Python Risk Identification Toolkit, sends malicious prompts to a generative AI system and evaluates its responses using a scoring agent. Based on the scores received, PyRIT generates new prompts to further test the system’s behavior. This automated process helps identify potential risks quickly and efficiently.

Q: Can PyRIT be used with any generative AI system?

PyRIT is designed to be adaptable to different generative AI architectures. Since different models have unique characteristics and produce varying outputs from the same inputs, PyRIT provides a flexible approach to risk identification. It can be tailored to specific system requirements, making it suitable for a wide range of generative AI systems.
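One common way to get that kind of adaptability is to hide each system behind a small adapter interface, so a single probing harness can drive any target. The sketch below is a hypothetical illustration of the pattern, not PyRIT's real class hierarchy; `PromptTarget`, `EchoTarget`, and `probe` are invented names.

```python
from abc import ABC, abstractmethod


class PromptTarget(ABC):
    """Adapter interface: each generative AI system implements `send`."""

    @abstractmethod
    def send(self, prompt: str) -> str:
        ...


class EchoTarget(PromptTarget):
    """Toy target that parrots the prompt, useful for wiring tests."""

    def send(self, prompt: str) -> str:
        return f"echo: {prompt}"


def probe(target: PromptTarget, prompts: list[str]) -> list[str]:
    """Run a batch of probes against any target behind the adapter."""
    return [target.send(p) for p in prompts]
```

With this shape, supporting a new model architecture means writing one adapter class; the scoring and prompt-generation machinery stays unchanged.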


Q: Where can I access PyRIT?

PyRIT is available today. You can find the toolkit and other resources on Microsoft’s website. Microsoft is also hosting a webinar demonstrating how to use PyRIT in red teaming generative AI systems, which you can register for through the link provided.

By releasing PyRIT, Microsoft aims to improve the security and responsible use of generative AI systems. The tool not only streamlines the risk-identification process but also enables red teams to find and address potential vulnerabilities more effectively. As generative AI continues to evolve, staying ahead of risks and ensuring the safe deployment of these powerful technologies becomes increasingly important.


