QA in the Age of AI: Enhancing Agent Reliability Through Evaluation-Driven Development | Shadab Nazar | TestMu 2025
In this thought-provoking TestMu session, Shadab Nazar, Lead Generative AI Architect at Splunk, dives into the future of QA in the age of AI, where ensuring agent reliability demands a whole new mindset.
With over two decades of experience across Generative AI, system assurance, and NLP observability, Shadab shares real-world lessons from building enterprise-scale AI platforms and why QA professionals are uniquely positioned to lead this new era of evaluation-first development.
Traditional QA wasn’t built for probabilistic, data-driven systems that evolve constantly. But Shadab shows how the core skills of QA (test planning, scenario coverage, and automation) can be reimagined through an evaluation-driven development framework to make AI agents more reliable, transparent, and trustworthy.
✔ Why AI agents require a rethinking, not a replacement, of traditional QA practices.
✔ How QA teams can drive evaluation-first development workflows.
✔ Practical tools and techniques for automating AI agent testing and monitoring.
✔ Real-world examples of QA practices improving AI agent reliability and user experience.
LambdaTest is an AI-Native test orchestration and execution platform that allows you to perform both manual and automated testing across 3000+ environments, making it a top choice among other cloud testing platforms.