Evaluating RAG Applications: From Retrieval to Response Quality | Srinivas Chitta | TestMu 2025
In this insightful TestMu session, Srinivas Chitta, Lead Consultant at Thoughtworks, unpacks the complexities of evaluating RAG (Retrieval-Augmented Generation) applications. From discussing the architecture of RAG to demonstrating how to evaluate retrieval and response quality using RAGAS, Srinivas provides a deep dive into best practices for assessing the quality of retrieval systems, generation accuracy, and handling hallucinations.
A must-watch for anyone working with AI-powered applications, particularly LLMs, in real-world scenarios.
RAG Architecture: How RAG combines retrieval and generation for precise answers.
Evaluation vs Testing: Why evaluating RAG is different from traditional testing.
Key Metrics: Important metrics for evaluating retrieval and response quality.
Using RAGAS: Automated evaluation with RAGAS for RAG-based apps.
Context & Ground Truth: Importance of context retrieval and ground truth for accurate evaluation.
Practical Demo: Live demo showing real-time testing using LLM as an evaluator.
CI/CD Integration: Seamless integration of RAG evaluation into your CI/CD pipeline.
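As a rough sketch of the RAGAS workflow the session covers: each evaluation record pairs the user question with the retrieved contexts, the generated answer, and a human-written ground truth. The sample strings below are invented for illustration, and actually running the evaluation needs the `ragas` and `datasets` packages plus an LLM API key, so that call is only shown in comments.

```python
# Minimal sketch of the record structure RAGAS-style evaluation expects.
# All question/context/answer strings here are hypothetical examples.
eval_records = [
    {
        "question": "What does RAG stand for?",
        "contexts": ["Retrieval-Augmented Generation pairs a retriever with an LLM."],
        "answer": "RAG stands for Retrieval-Augmented Generation.",
        "ground_truth": "Retrieval-Augmented Generation.",
    }
]

# With `ragas`/`datasets` installed and an LLM key configured, evaluation
# would look roughly like (not executed here):
#   from datasets import Dataset
#   from ragas import evaluate
#   from ragas.metrics import faithfulness, answer_relevancy, context_precision
#   scores = evaluate(Dataset.from_list(eval_records),
#                     metrics=[faithfulness, answer_relevancy, context_precision])

print(sorted(eval_records[0].keys()))
```

In a CI/CD pipeline, the resulting metric scores can be compared against thresholds so a regression in retrieval or answer quality fails the build.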
Testμ
Testμ (TestMu) Conference is LambdaTest's annual flagship event, one of the world's largest virtual software testing conferences dedicated to decoding the future of testing and development. Built by the community, for the community, it's a space where you're at the center, connecting, learning, and leading together. From deep-dive sessions on emerging trends in engineering, testing, and DevOps, to hands-on workshops and inspiring culture-driven talks, every experience is designed to keep you at the heart of the conversation.