Evaluating RAG Applications: From Retrieval to Response Quality | Srinivas Chitta | TestMu 2025
In this insightful TestMu session, Srinivas Chitta, Lead Consultant at Thoughtworks, unpacks the complexities of evaluating RAG (Retrieval-Augmented Generation) applications. From discussing the architecture of RAG to demonstrating how to evaluate retrieval and response quality using RAGAS, Srinivas provides a deep dive into best practices for assessing the quality of retrieval systems, generation accuracy, and handling hallucinations.
A must-watch for anyone working with AI-powered applications, particularly LLMs, in real-world scenarios.
RAG Architecture: How RAG combines retrieval and generation for precise answers.
Evaluation vs Testing: Why evaluating RAG is different from traditional testing.
Key Metrics: Important metrics for evaluating retrieval and response quality.
Using RAGAS: Automated evaluation with RAGAS for RAG-based apps.
Context & Ground Truth: Importance of context retrieval and ground truth for accurate evaluation.
Practical Demo: Live demo showing real-time testing using LLM as an evaluator.
CI/CD Integration: Seamless integration of RAG evaluation into your CI/CD pipeline.
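As a rough sketch of the RAGAS workflow the session covers: each evaluation record pairs the user question with the retrieved contexts, the generated answer, and a human-written ground truth. The sample strings below are invented for illustration, and actually running the evaluation needs the `ragas` and `datasets` packages plus an LLM API key, so that call is only shown in comments.

```python
# Minimal sketch of the record structure RAGAS-style evaluation expects.
# All question/context/answer strings here are hypothetical examples.
eval_records = [
    {
        "question": "What does RAG stand for?",
        "contexts": ["Retrieval-Augmented Generation pairs a retriever with an LLM."],
        "answer": "RAG stands for Retrieval-Augmented Generation.",
        "ground_truth": "Retrieval-Augmented Generation.",
    }
]

# With `ragas`/`datasets` installed and an LLM key configured, evaluation
# would look roughly like (not executed here):
#   from datasets import Dataset
#   from ragas import evaluate
#   from ragas.metrics import faithfulness, answer_relevancy, context_precision
#   scores = evaluate(Dataset.from_list(eval_records),
#                     metrics=[faithfulness, answer_relevancy, context_precision])

print(sorted(eval_records[0].keys()))
```

In a CI/CD pipeline, the resulting metric scores can be compared against thresholds so a regression in retrieval or answer quality fails the build.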
Testμ
Testμ (TestMu) Conference is LambdaTest's annual flagship event, one of the world's largest virtual software testing conferences dedicated to decoding the future of testing and development. Built by the community, for the community, it's a space where you're at the center, connecting, learning, and leading together. From deep-dive sessions on emerging trends in engineering, testing, and DevOps, to hands-on workshops and inspiring culture-driven talks, every experience is designed to keep you at the heart of the conversation.