Evaluate RAG systems with hit rate, MRR, faithfulness metrics and compare retrieval strategies. Use when testing retrieval quality, generating evaluation datasets, comparing embeddings or retrievers, A/B testing, or measuring production RAG performance.