# Docs

- [Introduction to DeepEval](/docs/introduction)
- [Design Philosophy](/docs/introduction-design-philosophy)
- [Comparisons](/docs/introduction-comparisons)
- **Getting Started**
  - [DeepEval 5-min Quickstart](/docs/getting-started)
  - [Vibe Coder 5-min Quickstart](/docs/vibe-coder-quickstart)
  - [Vibe Coding with DeepEval](/docs/vibe-coding)
  - Use Cases
    - [AI Agent Evaluation Quickstart](/docs/getting-started-agents)
    - [Chatbot Evaluation Quickstart](/docs/getting-started-chatbots)
    - [RAG Evaluation Quickstart](/docs/getting-started-rag)
    - [MCP Evaluation Quickstart](/docs/getting-started-mcp)
    - [LLM Arena Evaluation Quickstart](/docs/getting-started-llm-arena)
- **LLM Evals**
  - [Introduction to LLM Evals](/docs/evaluation-introduction)
  - Concepts
    - Test Cases
      - [Single-Turn Test Case](/docs/evaluation-test-cases)
      - [Multi-Turn Test Case](/docs/evaluation-multiturn-test-cases)
      - [Arena Test Case](/docs/evaluation-arena-test-cases)
    - [Datasets](/docs/evaluation-datasets)
    - [Prompts](/docs/evaluation-prompts)
    - [Model Context Protocol (MCP)](/docs/evaluation-mcp)
    - [LLM Tracing](/docs/evaluation-llm-tracing)
  - End-to-End Evals
    - [End-to-End LLM Evaluation](/docs/evaluation-end-to-end-llm-evals)
    - [Single-Turn End-to-End Evaluation](/docs/evaluation-end-to-end-single-turn)
    - [Multi-Turn End-to-End Evaluation](/docs/evaluation-end-to-end-multi-turn)
  - [Component-Level LLM Evaluation](/docs/evaluation-component-level-llm-evals)
  - [Unit Testing in CI/CD](/docs/evaluation-unit-testing-in-ci-cd)
  - [Flags and Configs](/docs/evaluation-flags-and-configs)
- **Eval Metrics**
  - [Introduction to LLM Metrics](/docs/metrics-introduction)
  - Custom
    - [G-Eval](/docs/metrics-llm-evals)
    - [DAG (Deep Acyclic Graph)](/docs/metrics-dag)
    - [Conversational G-Eval](/docs/metrics-conversational-g-eval)
    - [Conversational DAG](/docs/metrics-conversational-dag)
    - [Arena G-Eval](/docs/metrics-arena-g-eval)
    - ['Do it yourself' Metrics](/docs/metrics-custom)
  - Agentic
    - [Task Completion](/docs/metrics-task-completion)
    - [Step Efficiency](/docs/metrics-step-efficiency)
    - [Argument Correctness](/docs/metrics-argument-correctness)
    - [Tool Correctness](/docs/metrics-tool-correctness)
    - [Plan Adherence](/docs/metrics-plan-adherence)
    - [Plan Quality](/docs/metrics-plan-quality)
  - RAG
    - [Answer Relevancy](/docs/metrics-answer-relevancy)
    - [Faithfulness](/docs/metrics-faithfulness)
    - [Contextual Precision](/docs/metrics-contextual-precision)
    - [Contextual Recall](/docs/metrics-contextual-recall)
    - [Contextual Relevancy](/docs/metrics-contextual-relevancy)
  - Multi-Turn
    - [Turn Relevancy](/docs/metrics-turn-relevancy)
    - [Role Adherence](/docs/metrics-role-adherence)
    - [Knowledge Retention](/docs/metrics-knowledge-retention)
    - [Conversation Completeness](/docs/metrics-conversation-completeness)
    - [Goal Accuracy](/docs/metrics-goal-accuracy)
    - [Tool Use](/docs/metrics-tool-use)
    - [Topic Adherence](/docs/metrics-topic-adherence)
    - [Turn Faithfulness](/docs/metrics-turn-faithfulness)
    - [Turn Contextual Precision](/docs/metrics-turn-contextual-precision)
    - [Turn Contextual Recall](/docs/metrics-turn-contextual-recall)
    - [Turn Contextual Relevancy](/docs/metrics-turn-contextual-relevancy)
  - MCP
    - [MCP-Use](/docs/metrics-mcp-use)
    - [Multi-Turn MCP-Use](/docs/metrics-multi-turn-mcp-use)
    - [MCP Task Completion](/docs/metrics-mcp-task-completion)
  - Safety
    - [Bias](/docs/metrics-bias)
    - [Toxicity](/docs/metrics-toxicity)
    - [Non-Advice](/docs/metrics-non-advice)
    - [Misuse](/docs/metrics-misuse)
    - [PII Leakage](/docs/metrics-pii-leakage)
    - [Role Violation](/docs/metrics-role-violation)
  - Non-LLM
    - [Exact Match](/docs/metrics-exact-match)
    - [Pattern Match](/docs/metrics-pattern-match)
    - [Json Correctness](/docs/metrics-json-correctness)
  - Images
    - [Image Coherence](/docs/multimodal-metrics-image-coherence)
    - [Image Helpfulness](/docs/multimodal-metrics-image-helpfulness)
    - [Image Reference](/docs/multimodal-metrics-image-reference)
    - [Text to Image](/docs/multimodal-metrics-text-to-image)
    - [Image Editing](/docs/multimodal-metrics-image-editing)
  - Others
    - [Summarization](/docs/metrics-summarization)
    - [Prompt Alignment](/docs/metrics-prompt-alignment)
    - [Hallucination](/docs/metrics-hallucination)
    - [RAGAS](/docs/metrics-ragas)
- **Prompt Optimization**
  - [Introduction to Prompt Optimization](/docs/prompt-optimization-introduction)
  - Algorithms
    - [GEPA](/docs/prompt-optimization-gepa)
    - [MIPROv2](/docs/prompt-optimization-miprov2)
- **Synthetic Data Generation**
  - [Introduction to Synthetic Data Generation](/docs/synthetic-data-generation-introduction)
  - Golden Synthesizer
    - [Golden Synthesizer](/docs/golden-synthesizer)
    - [Generate Goldens From Documents](/docs/synthesizer-generate-from-docs)
    - [Generate Goldens From Contexts](/docs/synthesizer-generate-from-contexts)
    - [Generate Goldens From Goldens](/docs/synthesizer-generate-from-goldens)
    - [Generate Goldens From Scratch](/docs/synthesizer-generate-from-scratch)
  - Conversation Simulator
    - [Conversation Simulator](/docs/conversation-simulator)
    - [Model Callback](/docs/conversation-simulator-model-callback)
    - [Stopping Logic](/docs/conversation-simulator-stopping-logic)
    - [Custom Templates](/docs/conversation-simulator-custom-templates)
    - [Lifecycle Hooks](/docs/conversation-simulator-lifecycle-hooks)
- **Benchmarks**
  - [Introduction to LLM Benchmarks](/docs/benchmarks-introduction)
  - Available Benchmarks
    - [MMLU](/docs/benchmarks-mmlu)
    - [HellaSwag](/docs/benchmarks-hellaswag)
    - [BIG-Bench Hard](/docs/benchmarks-big-bench-hard)
    - [DROP](/docs/benchmarks-drop)
    - [TruthfulQA](/docs/benchmarks-truthful-qa)
    - [HumanEval](/docs/benchmarks-human-eval)
    - [IFEval](/docs/benchmarks-ifeval)
    - [SQuAD](/docs/benchmarks-squad)
    - [GSM8K](/docs/benchmarks-gsm8k)
    - [MathQA](/docs/benchmarks-math-qa)
    - [LogiQA](/docs/benchmarks-logi-qa)
    - [BoolQ](/docs/benchmarks-bool-q)
    - [ARC](/docs/benchmarks-arc)
    - [BBQ](/docs/benchmarks-bbq)
    - [LAMBADA](/docs/benchmarks-lambada)
    - [Winogrande](/docs/benchmarks-winogrande)
- **Others**
  - [CLI Settings](/docs/command-line-interface)
  - [Environment Variables](/docs/environment-variables)
  - [Troubleshooting](/docs/troubleshooting)
  - [Frequently Asked Questions](/docs/faq)
  - [Data Privacy](/docs/data-privacy)
  - [Miscellaneous](/docs/miscellaneous)