Setting up the Arthur Evals Engine

The Arthur Engine is a free, open-source toolkit for evaluating AI models. This guide shows you how to download, install, and run the engine in your own environment so you can start measuring model performance.

Step 1

Download Requirements

The quickest way to get the Arthur Engine running is with Docker. Make sure Docker is installed and running before you continue.
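If you want to confirm this prerequisite from a script, the sketch below is one way to do it. It is not part of the Arthur Engine: the function name `docker_status` and its return values are assumptions made for illustration. It only checks that a `docker` executable is on your PATH and that `docker info` can reach the daemon.

```python
import shutil
import subprocess


def docker_status(timeout_seconds: float = 10.0) -> str:
    """Return "missing", "not_running", or "ready" for the local Docker setup.

    Hypothetical helper for illustration; not part of the Arthur Engine.
    """
    # shutil.which returns None when no `docker` executable is on PATH.
    if shutil.which("docker") is None:
        return "missing"
    try:
        # `docker info` talks to the daemon and exits non-zero if it is unreachable.
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        return "not_running"
    return "ready" if result.returncode == 0 else "not_running"


if __name__ == "__main__":
    print(f"Docker status: {docker_status()}")
```

A result of "missing" means Docker still needs to be installed. A result of "not_running" usually means Docker is installed but the daemon has not been started.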

Step 2

Configure & Run the Engine

Copy and paste the command below into your terminal window.

Step 3

Start building

Configure guardrails for real-time detection of PII or sensitive data leakage, hallucinations, prompt injection attempts, toxic language, and more.
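To make the idea of a guardrail concrete, here is a toy sketch of a PII check: it scans a piece of text and reports anything that looks like an email address or a US Social Security number. This is not the Arthur Engine's API or its detection method. The names `PII_PATTERNS`, `find_pii`, and `passes_guardrail` are hypothetical, and the two regular expressions are simplified assumptions that would miss many real cases.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# Real PII detection needs far more than two regular expressions.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for every pattern hit in text."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((kind, match.group()))
    return findings


def passes_guardrail(text: str) -> bool:
    """A piece of text passes only if no PII pattern matches it."""
    return not find_pii(text)


if __name__ == "__main__":
    sample = "Contact jane@example.com, SSN 123-45-6789"
    print(find_pii(sample))
    print(passes_guardrail("Hello there"))
```

The shape is the useful part: a guardrail takes a prompt or response, runs one or more checks, and returns a pass or fail result along with the findings that triggered it.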

Want to deploy Arthur Engine your way?

Whether you're using a CloudFormation template or a Helm chart, we've got you covered with flexible deployment options.

Explore on GitHub

Want to Ensure Your Models Stay Reliable and High-Performing?

Join the Arthur platform to monitor, debug, and drive actionable insights across your most valuable GenAI and traditional ML use cases — all in one place.

Start optimizing your models today

Sign Up Now

Why Arthur Evals Engine?

Easy Integration

Integrate with your existing ML pipelines and LLM applications in just a few lines of code.

Real-time AI Evaluation

Arthur Evals Engine is the first open-source, real-time AI evaluation engine, designed to help you continuously monitor and improve your AI models.

Comprehensive Metrics

Evaluate your models across multiple dimensions, including accuracy, fairness, bias, toxicity, and more.

Open Source

Fully open source and customizable to meet your specific evaluation needs.

Ready to get started?

Download the Arthur Evals Engine today and start evaluating your AI models in real time.
