Large Language Models

Announcing Arthur Shield: The First Firewall for LLMs

At Arthur, we are on a mission to make AI better for everyone—and right now, nowhere is that mission more important than in the deployment of large language models.

Companies across industries have begun to rapidly integrate LLMs into their operations following recent advancements from organizations such as OpenAI, Google, and Meta. However, businesses don’t currently have a way to ensure fast and safe deployment of these applications, which has led to data leaks and toxic outputs that have been costly in more ways than one.

That’s why, today, we are launching a powerful addition to our suite of AI monitoring tools: Arthur Shield, the first firewall for large language models. Arthur Shield enables companies to deploy LLM applications like ChatGPT faster and more safely, helping to identify and resolve issues before they become costly business problems—or worse, result in harm to their customers.

Simply put, Arthur Shield acts as a firewall between users and deployed LLMs, protecting organizations against the most serious risks and safety issues (a sketch of this check-the-prompt, check-the-response pattern follows the list below). Use cases include:

  • PII or sensitive data leakage: Arthur Shield allows companies to use the power of an LLM trained or fine-tuned on their full data set, with the peace of mind of knowing that other users of that same LLM are blocked from retrieving sensitive data from the training set.
  • Toxic, offensive, or problematic language generation: Arthur Shield allows companies to block LLM responses that are not value-aligned with their organization.
  • Hallucinations: Some LLMs confidently output incorrect facts. Arthur Shield detects these likely incorrect responses and prevents them from being returned to a user, where they could do significant harm if acted upon.
  • Malicious prompts by users: Arthur Shield detects and stops malicious user prompts, including attempts to get the model to generate a response that would not reflect well on the business, efforts to get the model to return sensitive training data, or attempts to bypass safety controls.
  • Prompt injection: It is becoming common for LLM applications to augment their prompts with content retrieved from third-party websites and databases of pre-trained document embeddings. Those sources may not be secure and can contain malicious instructions that are injected into the LLM system, creating a significant risk of unauthorized response generation and data leakage.
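
Arthur has not published Shield’s API, but the use cases above all follow a common pattern: screen the user’s prompt before it reaches the model, then screen the model’s response before it reaches the user. The sketch below illustrates that pattern only—every name in it is hypothetical rather than part of Arthur’s product, and the checks shown (a regex PII scan, a keyword blocklist, and a simple injection-marker search) are simplified stand-ins for the much more sophisticated detection a production firewall would use.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of the prompt/response "firewall" pattern described above.
# None of these names or checks come from Arthur Shield.

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-like strings
    re.compile(r"\b\d{13,16}\b"),              # possible card numbers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),    # email addresses
]

INJECTION_MARKERS = [
    "ignore previous instructions",
    "disregard the system prompt",
]

BLOCKED_TERMS = {"offensive_term_1", "offensive_term_2"}  # placeholder blocklist


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


def check_prompt(prompt: str) -> Verdict:
    """Screen an incoming prompt for injection attempts or malicious intent."""
    lowered = prompt.lower()
    for marker in INJECTION_MARKERS:
        if marker in lowered:
            return Verdict(False, f"possible prompt injection: {marker!r}")
    return Verdict(True)


def check_response(response: str) -> Verdict:
    """Screen an outgoing response for PII leakage and blocked language."""
    for pattern in PII_PATTERNS:
        if pattern.search(response):
            return Verdict(False, "possible PII or sensitive data in response")
    if any(term in response.lower() for term in BLOCKED_TERMS):
        return Verdict(False, "response contains blocked language")
    return Verdict(True)


def firewall_call(llm: Callable[[str], str], prompt: str) -> str:
    """Wrap an LLM call with pre- and post-generation checks."""
    prompt_verdict = check_prompt(prompt)
    if not prompt_verdict.allowed:
        return f"[blocked before generation: {prompt_verdict.reason}]"

    response = llm(prompt)

    response_verdict = check_response(response)
    if not response_verdict.allowed:
        return f"[blocked after generation: {response_verdict.reason}]"
    return response


if __name__ == "__main__":
    # A stand-in "model" that leaks an email address, to show the response check firing.
    fake_llm = lambda p: "Sure! You can reach the customer at jane.doe@example.com."
    print(firewall_call(fake_llm, "What is the customer's contact info?"))
```

The key design point is that both directions are checked: inbound prompts are screened before any tokens are generated, and outbound responses are screened before anything is shown to the user, so a single wrapper can cover leakage, toxicity, hallucination, and injection use cases at once.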
“LLMs are one of the most disruptive technologies since the advent of the Internet. Arthur has created the tools needed to deploy this technology more quickly and securely, so companies can stay ahead of their competitors without exposing their businesses or their customers to unnecessary risk.”
– Adam Wenchel, Co-Founder & CEO
Arthur Shield in the LLM Lifecycle

Arthur Shield and its capabilities are currently being rolled out in beta to select Arthur customers. Read more in our official press release or get in touch to request a demo.