Supported Models and Use Cases

A comprehensive evaluation framework that spans the entire AI development lifecycle

Machine Learning

Recommender Systems
NLP
Classifiers
Forecasting
Computer Vision
Regression
  • Data Drift
  • Classification Rates
  • Root Mean Square Error (RMSE)
  • Precision & Recall
  • Many More
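As a rough illustration of two of the metrics listed above, the sketch below computes a root-mean-square error for regression or forecasting output and precision and recall for a binary classifier. It uses only the Python standard library; the function names `rmse` and `precision_recall` are hypothetical and are not taken from Arthur's product or API.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class, treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 when a denominator is empty (an assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For example, `precision_recall([1, 0, 1, 1], [1, 1, 0, 1])` counts two true positives, one false positive, and one false negative, so both precision and recall are 2/3.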

Generative AI

RAG Co-Pilots
GenAI Automation
  • Hallucination Rates
  • Data Security Controls
  • Acceptable Use Policies
  • Domain-specific Evals, including custom code
  • Inference & hallucination count
  • Pass & Fail rates for Toxicity, PII & Sensitive Data
  • Tokens & Model cost
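To make the pass/fail rate and token cost items above concrete, here is a minimal sketch of the underlying arithmetic. The helper names, the boolean check results, and the per-1,000-token prices are all hypothetical placeholders; they do not reflect Arthur's implementation or any provider's actual pricing.

```python
def pass_rate(results):
    """Fraction of checks that passed; `results` is a list of booleans."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(results) / len(results)


def token_cost(prompt_tokens, completion_tokens,
               prompt_price_per_1k, completion_price_per_1k):
    """Cost of one inference given token counts and assumed per-1k prices."""
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost
```

A fail rate is simply `1 - pass_rate(results)`, so a toxicity check that passes three of four responses has a 25% fail rate.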

Agentic AI

AI Agents
  • Groundedness Failure Rate
  • Trace Visualization
  • Tool Selection Evaluation
  • Prompt/Response Relevance

See what Arthur can do for you.
