XP Series Webinar

Observability in Software Test Modernization

May 15th, 2025

39 Mins


Donovan Mulder (Guest)

Chief Executive Officer, Kinetic Skunk


Kavya (Host)

Director of Product Marketing, LambdaTest


The Full Transcript

Kavya (Director of Product Marketing, LambdaTest) - Hi, everyone. Welcome to another exciting session of the LambdaTest XP Podcast Series. Through the XP Series, we dive into a world of insights and innovation featuring renowned industry experts in the testing and QA ecosystem. I'm your host, Kavya, Director of Product Marketing at LambdaTest, and it's a pleasure to have you with us today.

Today, we are diving into a topic that's redefining how we approach software testing, which is basically observability in software test modernization. In an era where digital experiences must be seamless and reliable, understanding real user interactions and monitoring performance is crucial, which is exactly why we have the guest of today's show with us over here.

Before we dive in, let me just introduce you to someone who truly embodies the spirit of innovation and resilience in the tech space, Donovan Mulder, who is the Chief Executive Officer at Kinetic Skunk. As the Founder and CEO of Kinetic Skunk, Donnie has been instrumental in shaping cloud infrastructure, observability, and DevSecOps practices in partnership with industry giants like AWS, Azure, GitHub, and Red Hat.

So, before we dive further into the topic, let's hand it over to Donnie. The stage is yours. Please let our audience know a little bit about yourself.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Thank you so much, and thanks for having me on this podcast. It's an honour. I've been in software quality and automated testing since the 1990s, about 27 years now, and in that space I've seen the ebb and flow of automated testing. When it first came on the scene, it was quite unsuccessful and often ended up as shelfware.

And then later on, different approaches to automated testing became available, more research went into it, and so on. So, I've seen the ebb and flow of automated testing for nearly three decades. Throughout that time, quality has always been a driving force for me, personally and, more importantly, in business: how do you measure quality, right?

And that's a massive thing, because you can only measure if you have proper baselines. So I've spent a lot of my career focused on how to measure quality. I've done some research in this space; I've got a couple of conference papers and a journal paper in the software quality space. And I've really taken that and built it into the business. And yeah, that's how Kinetic Skunk got started.

Focusing on automated testing, the natural flow from there is to bring in CI/CD, because it's good to know your test results, and it's even better to know your test results earlier. The way I see it, we're in the business of risk mitigation, and risk mitigation is all about getting your information as early as possible so you can make decisions as quickly as possible, right? So, yeah, that brings us onto our topic.

Kavya (Director of Product Marketing, LambdaTest) - That's amazing. You have done almost everything in the testing ecosystem, more or less. Very impressive. So let's start with the first question that we have for you, which is, how does real user monitoring differ from synthetic monitoring in terms of data collection and insights?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Yeah, it's a great question. So, synthetic monitoring is like sending bots to test your app in a clean, controlled environment. Helpful, but limited. Real user monitoring, or RUM, flips that: it collects actual data from users in the wild. So you get to see things like errors, page load issues, what device types they're using, and where in the world they are connecting to your site from.

And you get that from real-time sessions. And that becomes quite valuable because it shows you what your users are really experiencing, not just what the test scripts say they should be experiencing.
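To make the distinction concrete, the sketch below, which is not tied to any particular monitoring product, contrasts a synthetic probe (a scripted check of one URL from a controlled environment) with a summary of RUM events reported by real sessions; the event fields, URL, and data values are illustrative assumptions.

```python
# Minimal sketch contrasting synthetic and real user monitoring.
# The endpoint, event fields, and sample data are illustrative assumptions.
import statistics
import time
import urllib.request

def synthetic_check(url: str) -> dict:
    """Scripted probe from a clean, controlled environment: one URL, one result."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        status = resp.status
    return {"url": url, "status": status, "elapsed_s": time.perf_counter() - start}

def summarize_rum(events: list[dict]) -> dict:
    """Aggregate events reported by real sessions: devices, geography, errors, load times."""
    load_times = [e["load_time_s"] for e in events]
    return {
        "sessions": len(events),
        "p95_load_s": statistics.quantiles(load_times, n=20)[-1],
        "error_rate": sum(e["had_error"] for e in events) / len(events),
        "devices": {e["device"] for e in events},
        "countries": {e["country"] for e in events},
    }

if __name__ == "__main__":
    print(synthetic_check("https://example.com"))
    rum_events = [
        {"load_time_s": 1.2, "had_error": False, "device": "iPhone 13", "country": "ZA"},
        {"load_time_s": 4.8, "had_error": True, "device": "Galaxy A12", "country": "IN"},
    ]
    print(summarize_rum(rum_events))
```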

Kavya (Director of Product Marketing, LambdaTest) - Very interesting. You know, moving on to the next one, which is what are some of the best practices for implementing feedback loops between production monitoring and pre-production testing?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Okay, so best practices; we can get into quite a few best practices, right? Observability is amazing, great, you can observe, etc. But go right down to, let's say, when your developers are writing their code: do they actually have a strategy for logging what's happening in that application, right?

And that is sort of the first-principles part, right? You don't want logging with gaps in the execution of the code base, where, you know, one class gets instantiated and maybe only one or two log messages come from there before it bounces to another function or method in another class.

So you've got those gaps, right? I always subscribe to the idea that your logging needs to tell you a story; it needs to tell you what's happening in the application. So that's the most basic idea to start with, the first-principles idea. And that also reduces the cost of observability, because the volume of log messages coming through can drive up the costs significantly.
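As a small illustration of logging that "tells a story", here is a hedged Python sketch using the standard logging module, with a correlation ID carried across classes so one request can be followed end to end; the class names, fields, and messages are made up for the example.

```python
# Sketch of logging that "tells a story": every step of one request carries the
# same correlation ID, so there are no silent gaps as control moves between classes.
# Class names, fields, and messages are illustrative assumptions.
import logging
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s [cid=%(cid)s] %(name)s: %(message)s",
)

class CartService:
    log = logging.getLogger("CartService")

    def add_item(self, cid: str, sku: str) -> None:
        self.log.info("adding item %s to cart", sku, extra={"cid": cid})

class PaymentService:
    log = logging.getLogger("PaymentService")

    def charge(self, cid: str, amount: float) -> None:
        self.log.info("charging %.2f", amount, extra={"cid": cid})
        self.log.info("payment accepted", extra={"cid": cid})

def checkout() -> None:
    cid = uuid.uuid4().hex[:8]          # one correlation ID for the whole flow
    CartService().add_item(cid, "SKU-42")
    PaymentService().charge(cid, 199.99)

if __name__ == "__main__":
    checkout()
```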

And that cost is a barrier to entry. In my opinion, the most important best practice is to get your logging strategy right at the code level already. Then, in production, you need to be monitoring things that make sense to the business. Say you're a retailer, right? What's the most important thing that you want?

Let's say it's the cart; you want your cart to perform. If your cart is taking 10 seconds for someone to make a payment, I don't know the exact stat, I think it's between 60 and 70 percent of your potential customers who will have abandoned that cart. So that's another best practice: tie observability to a business KPI, with the cart as an example from the retail industry.
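As a rough illustration of tying observability to a business KPI, here is a sketch, with made-up session data and bucket edges, that groups checkout latency into buckets and reports the cart abandonment rate per bucket.

```python
# Sketch: correlate checkout latency with cart abandonment, per latency bucket.
# The session records and bucket edges are illustrative assumptions.
from collections import defaultdict

sessions = [
    {"checkout_s": 1.8, "abandoned": False},
    {"checkout_s": 4.5, "abandoned": False},
    {"checkout_s": 9.7, "abandoned": True},
    {"checkout_s": 11.2, "abandoned": True},
]

def bucket(seconds: float) -> str:
    if seconds < 3:
        return "<3s"
    if seconds < 10:
        return "3-10s"
    return ">=10s"

totals, abandoned = defaultdict(int), defaultdict(int)
for s in sessions:
    b = bucket(s["checkout_s"])
    totals[b] += 1
    abandoned[b] += s["abandoned"]

for b in sorted(totals):
    rate = abandoned[b] / totals[b]
    print(f"{b}: {totals[b]} sessions, {rate:.0%} abandoned")
```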

Kavya (Director of Product Marketing, LambdaTest) - Very interesting perspective, of course. Thank you so much for those valuable insights. Moving on to the next question: how can QA teams use RUM data to make stronger business cases for investing in performance and quality improvements? Because I'm pretty sure this is something most organizations struggle to understand.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Yeah, that's a great question. To me, RUM, or real user monitoring, gives QA a superpower. QA can show how slow things are, the load times, how that hurts conversions, and how certain bugs affect your user community or your customers, right? And business will appreciate that, because then they start to understand how what's happening in production affects revenue and reputation.

Let's just look at the reputational damage. Your site is slow. Businesses think, okay, so what if the site is slow? Right? But maybe in their minds they're not making the connection between a slow site and reputation. The reputation becomes: we can't finish our transactions quickly enough on that e-commerce site. But this other e-commerce site is super fast, maybe a bit more expensive, but I can get my shopping done.

So that's quite a good deal. In South Africa, where I'm from, about two years ago, there was just a rage about Prime water. I think it was actually around the world; everyone wanted to buy Prime. And a large retailer in South Africa got a large stock of Prime, made it available on their e-commerce site, and the site crashed.

Because there were just too many users, or potential customers, connecting. So not only could they not sell Prime, but whoever was doing their normal shopping, because it's a retailer selling food and so on, couldn't complete their shopping either. For those who were about to make a payment, the cart couldn't complete it.

So they were potentially losing millions of dollars. Correlating the data and understanding how your site performance affects revenue is super important. If QA can attribute the bugs they find to those real-life scenarios, where they get the data from real users, QA then becomes a critical business function.

We know it's a critical business function, but it becomes one in the minds of the business, right? And then suddenly, QA is not seen as a grudge purchase. It's actually seen as a critical component of the business. So, super important, very good question.

Kavya (Director of Product Marketing, LambdaTest) - Thank you so much. And I was thinking about how, based on the examples that you provided, this can be implemented across different industries. It doesn't matter which industry is dealing with performance or quality improvement metrics or questions; it can be relevant for QA teams in any industry to implement.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Absolutely. Not just retail; every industry where you have a means of interacting with your clients, whether it's apps, your website, any of those. If your systems are not performing, then you're going to suffer reputational damage. You know, this is something that's not really discussed in the industry: when you deploy a new feature, your user community is very forgiving about new features.

But if you deploy a new feature and that new feature causes a regression in existing features, suddenly you're interrupting how people go about their day. They expect things to work, and when you disturb or interrupt how someone goes about their day, the user community becomes very intolerant of that. So that's an area where I think businesses should use observability and real user monitoring to really protect their business.

Kavya (Director of Product Marketing, LambdaTest) - Thank you so much, Donovan. That's really insightful. Moving on to the next question that we have, what are the approaches that help teams correlate user behavior patterns with performance degradation before issues even escalate?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Okay, so if I'm hearing you right: how can we use real user data to predict performance issues, based on how the user community uses and consumes your application? In performance testing, we have to put together load models, volumes, etc., and mimic how we think your user community is going to consume your application.

So when it's a greenfield application, something new, we're very dependent upon synthetic users, where we make assumptions that at a certain time of day there's going to be a ramp-up of users, or at a certain time of the month, say after payday if you're a bank, there are going to be lots of transactions, lots of shopping, lots of activity, and so on.

So you can use that information to model how your user community is going to work. But over time, when you're doing real user monitoring, you can start to collect real data about how your users consume or use your service or use your application or website. And based on that, you can then use that data to inform your test strategy.

So let's call it the right-hand side informing the left-hand side. There's a big thing about shift left, bringing the testing in earlier. But now we also know what happens on the right-hand side, let's call it shift right: the monitoring, the real user monitoring. And that's informing the left-hand side. So over time, you come to know how your user community consumes your application.

Going back to the retail example, you know you're going to have a sale, whether it's a winter sale, a summer sale, or Black Friday. You've got that data now, right?

So you can then plan how your infrastructure should expand and contract based on the usage patterns of your user community. You can plan for that, assign budgets for increased infrastructure consumption, and do it at quite a granular level. So that's how, in my experience, real user data will inform how you set up your tests and your infrastructure to handle the load before some kind of special event.
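To make the idea of shift-right informing shift-left concrete, here is a hedged sketch that turns RUM session timestamps into an hourly arrival-rate profile that a load test could be parameterised with; the data shape and the headroom factor are assumptions for illustration only.

```python
# Sketch: derive a per-hour load profile from RUM session timestamps, then add
# headroom for a special event (e.g. a sale). Data and factors are illustrative.
from collections import Counter
from datetime import datetime

rum_sessions = [
    "2025-05-01T09:14:02", "2025-05-01T09:40:11",
    "2025-05-01T12:05:45", "2025-05-01T12:30:01", "2025-05-01T12:59:59",
]

arrivals_per_hour = Counter(
    datetime.fromisoformat(ts).hour for ts in rum_sessions
)

EVENT_HEADROOM = 3.0   # assume a sale triples the observed peak traffic

load_model = {
    hour: round(count * EVENT_HEADROOM)
    for hour, count in sorted(arrivals_per_hour.items())
}
print("target virtual users per hour:", load_model)
```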

Kavya (Director of Product Marketing, LambdaTest) - Thank you so much. Very interesting. What we see with user behavior patterns is that correlating that data and figuring out how performance is being affected is hard, right? Most of the quality engineering leaders we speak with, for instance, are not able to spend a lot of time digging through this data.

So with these approaches, I'm pretty sure they'll be able to build a framework to handle it and approach it in a better manner. Moving on to the next question: what challenges do globally distributed teams face when implementing TestOps practices, and how can those be addressed?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Okay, so these are similar challenges, and it doesn't even have to be geographically dispersed teams; you see it in an organization with siloed teams. I'm not talking about the silos between QA and dev, I'm talking about silos between the departments that make up a corporate, where potentially each department is following its own way of doing testing.

Having one way of orchestrating a test environment becomes harder with geographically dispersed teams, because you have to have effective communication strategies, and you're probably more in need of a traditional test centre of excellence for your group. That informs strategy, and it informs how test execution environments and the target environments are going to be orchestrated, using tools such as Terraform, Helm, etc. to spin up your QA environments, data environments, or pre-prod in order to test against them.

So in that respect, standardization and orchestration are really, really important. Test design too: how you design functional tests, and your approach to designing performance tests. Standardize those as well, because if you have several dev teams, QA teams, etc., and some people from QA need to move to a different team, they shouldn't have to learn new ways of designing tests, right?

Because that will just drive up your cost, and the time for someone to become productive on their project also goes up. But going back to TestOps, you want a consistent way of working. A few years back, I worked on a big team for a bank, on a major project where we had a deployment team in Adelaide, a test team in India, and a dev team in Moscow. I managed the test environments in London, and we had performance teams in New York and in California.

So the testing and deployments just happened around the clock. Back then there were no microservices, nothing like that, so we were highly dependent upon a single team somewhere in the world, in this case Adelaide, to be deploying effectively. But with microservices, with Lambda architectures, with basic cloud building blocks, any team these days should be able to deploy the environment they need to test, and they need to be able to test against those environments.

That's where TestOps, in my opinion, comes in and solves that problem. Tools such as LambdaTest in the TestOps space solve it from an orchestration perspective. Focusing just on orchestration: whatever browsers, browser combinations, or devices you need to test against, that's all available. It's not like back in the day, where one QA has an iPhone 5, another one has an iPhone 8 over there, and it's all over the show. Getting consistent device coverage tends to be quite hard when devices are sprawled all over the show.
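As an illustration of that orchestration point, here is a minimal Selenium sketch that runs a check against a remote browser grid instead of a locally owned device; the hub URL and browser options are placeholders rather than LambdaTest's actual endpoint or capabilities, so a provider's documentation would supply the real values.

```python
# Sketch: run the same test against a remote browser grid instead of a local device.
# The hub URL and browser options here are placeholders; a cloud provider such as
# LambdaTest documents its own endpoint and capability names.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

HUB_URL = "http://localhost:4444/wd/hub"   # placeholder Selenium Grid endpoint

options = Options()
options.browser_version = "latest"         # ask the grid for a specific browser version

driver = webdriver.Remote(command_executor=HUB_URL, options=options)
try:
    driver.get("https://example.com")
    assert "Example" in driver.title
finally:
    driver.quit()
```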

Kavya (Director of Product Marketing, LambdaTest) - That's an amazing insight, Donovan, super helpful, of course. What also stands out is how real the complexity of managing TestOps across global teams that work remotely and are distributed in nature is. But at the same time, with the strategic collaboration and approaches that you mentioned, and with automation itself, it can make a huge difference for the quality teams out there. So thanks for sharing those insights again. Speaking of, yeah, go ahead, please.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - The big thing is the standardization, right, whether it's design, test execution environments, etc. That's something that TestOps brings to the table: test orchestration at scale, but with a consistent test execution environment. Brilliant.

Kavya (Director of Product Marketing, LambdaTest) - Yeah, that's definitely insightful. You know, speaking of creating that platform for QA professionals to basically thrive more, right? So, what skills and capabilities do QA professionals need to develop so that they are able to thrive in an observability-driven testing environment?

This might again be a question that a lot of QA professionals are wondering about, and at the same time, a lot of quality engineering managers and leads might be thinking about it. So, based on your experience, how have you gone about it?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Okay, so I've been in QA for a long time and effectively run a software quality business, so we personally have to train graduates and bring them through the different programs that we run, and we also see what's happening in industry. At Kinetic Skunk, we put a lot of energy into training graduates and bringing them through our program.

And our program is structured around the current demands of the market. Back in the day, running some tests and producing some test results manually was fine. Then we needed to automate, and automation in the late 90s and early 2000s often became shelfware; not many people wanted to use it at any expense.

And the problem there was that the people who started automating tests were not trained software engineers. They came from manual testing backgrounds, test analysts and business analysts, but they weren't trained in code or proper software engineering principles. So at that stage, record and playback seemed like a silver bullet, but it actually ended up being an Achilles' heel for automated testing.

Then people actually started understanding and learning that we as quality engineers need to know how to write code in order to automate tests and make them more repeatable, manageable, reusable, etc. And from, let's call it, record and playback, we moved to functional decomposition.

From functional decomposition, we moved to data-driven testing; from data-driven testing to keyword-driven testing, and then to behaviour-driven testing. There are other approaches too, like model-based testing. But essentially, as that evolution occurred, test engineers needed to know more and more about the technical components around automated testing and microservices.

And with CI/CD, the automation of deployments doesn't have to sit only with the DevOps team or the Linux team or anything like that. A normal test automation engineer, or even a manual tester, can execute the script that deploys into a Kubernetes environment.
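As a sketch of what that can look like, under the assumption that a manifest file and an authenticated kubectl already exist, a tester might run a thin wrapper like the one below; the manifest path and namespace are hypothetical.

```python
# Sketch: a thin wrapper a tester could run to deploy a test environment into
# Kubernetes before executing tests. Manifest path and namespace are assumptions;
# it presumes kubectl is installed and authenticated against the cluster.
import subprocess
import sys

MANIFEST = "deploy/test-env.yaml"   # hypothetical manifest checked into the repo
NAMESPACE = "qa"                    # hypothetical target namespace

def deploy() -> int:
    result = subprocess.run(
        ["kubectl", "apply", "-f", MANIFEST, "-n", NAMESPACE],
        capture_output=True, text=True,
    )
    print(result.stdout or result.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(deploy())
```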

So again, you have to know how to code and understand databases, and now, at this current juncture, you have to know not just how to design tests but how to automate them, how to incorporate your tests into a CI/CD pipeline, and you need to understand DevOps, how to deploy microservices, etc.

But three or four years ago, I had a conversation with some other professional test engineers, and I said our future test engineers need to be data scientists. They need to know data science. So here we're talking about correlating what's happening in the real world with what's happening under the hood.

So now we have all of this data available. And as a test engineer, what do you do with the data? You need to correlate that data to what's happening in production. It needs to inform your tests and so on. Do you need to improve your tests? Do you need greater coverage, etc.? So, in my opinion, your test engineers, as things are moving so fast, need to understand data.

They need some level of data science experience or expertise as a tool, because observability and real user monitoring are all about what you do with the data. What are you going to do with it?

So get the business value out of the data and correlate it to what's happening in your QA environments, dev environments, update your test strategy, your tests, etc. So the skill right now is to know data. You need to know how to handle that data. That's my opinion. And what's happening in industry, I think, is validating my opinion.

Kavya (Director of Product Marketing, LambdaTest) - That makes a lot of sense, Donovan. Observability is definitely shifting the way QA professionals have been working or even thinking about testing, right? And it's very interesting and exciting to see how skills like data analysis, for instance, are just becoming so essential.

So thanks for breaking that down. Just quickly, on what you mentioned: do you have any resources for QA professionals on how they can get started with those skills? Any recommendations for resources you'd like to share with them?

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - So, you know, as a start, understand what observability is, right, and get into that. You have, let's call it, alerting and monitoring; that's reactive. Monitoring waits for something to happen, and then it raises an alert and you get alerted about it.

Whereas observability is about observing your system in order to introduce changes that improve efficiency. A slow cart, let's say, might not generate an alert unless you've configured one, right? Or the alert only fires when it's taking five seconds for someone to make a payment.

But leading up to five seconds, observability can tell you, okay, four seconds is fine, but if we can get it down to three, we're going to improve conversions by 15%. That's what you bring in with observability. And it means your tester or your QA professional needs to understand business metrics as well.

So there's observability, great, we can consume a lot of data, but we need to make sense of that data. So, where to start? Go and research observability and real user monitoring. And it always comes back to mathematics and stats, especially when you get into the data science components and understanding what to do with the data at a much higher level of sophistication.

That learning is important, so going back to first principles: get a solid understanding of maths and stats, which you should have if you've done a science-based or computer science degree. But not having one isn't going to stop you from becoming really good at quality engineering.

So, yep, that's where I would start. Whether you've got no technical background or a strong one, I mentioned first principles earlier, and first principles matter. You can end up churning out junk: you're creating automation, right, but the test design is off, and the tests are not designed to expose certain classes of defects.

So actually understanding how to design a manual test to expose different classes of defects, that's the beginning, that's the basis for us as quality engineers. Then you automate that, and then you put observability on top of that. That's how you get reliable data.

All the tech is amazing, but your tests need to be efficiently designed so they expose classes of defects and mitigate the risk of regression; only then, when everything passes, should that tell you that you can go live. Because it mitigates a certain type of risk we spoke about earlier: if a user community can't do what they normally do day to day, they become quite unforgiving, right?

So, back to basics: what does that look like if you have no technical knowledge? The obvious thing is to go and learn how to code in Python. It's a super easy language to understand, and you can get quite far with it compared to Java. There are Python wrappers around Selenium, around Playwright, etc.

So you can start there and then move on to learning behaviour-driven testing, how to design your tests, really understanding that, so that you can automate them. And we can go on and on from there: learning how to, you know, dockerize your test automation framework once you can build one.
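For someone taking that first Python step, here is a minimal Playwright-for-Python check of the kind those wrappers enable; it assumes `pip install playwright` and `playwright install chromium` have been run, and the page and title are just example.com placeholders.

```python
# Minimal Playwright (Python) check: open a page headlessly and assert its title.
# Assumes `pip install playwright` and `playwright install chromium` have been run.
from playwright.sync_api import sync_playwright

def test_homepage_title() -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com")
        assert "Example Domain" in page.title()
        browser.close()

if __name__ == "__main__":
    test_homepage_title()
    print("homepage title check passed")
```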

And then get on to understanding TestOps: how you can farm your test execution out to tools such as LambdaTest so you can do unattended test execution. So there's quite a bit, but if you're looking at the data side, I'd say go and do some basic data science courses.

Kavya (Director of Product Marketing, LambdaTest) - Thank you so much, Donovan, for those insights and resources. I also wanted to take a minute to say that you run an amazing community in South Africa for testers. Kudos for building that. I just wanted to let our listeners know about the community aspect of the work that you do, too. Once we publish the podcast, I hope people can connect with you in case they have any more questions around it.

So, moving on to the last question that we have for the day: what are your thoughts on future trends at the intersection of AI, observability, and software quality? And how might software quality evolve over the next five years as observability practices mature? I'm sure it's a tough question to answer, because software quality is evolving within fractions of a second. It's a tough question to answer for sure.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - So, if I'm hearing you right, how is AI going to affect observability in the testing space? So, the patterns, the user patterns that we're talking about and all that data being collected, and actually being able to make sense of that data. I'm pretty sure many organizations out there are already developing AI models or machine learning models to consume that data and start making those predictions.

This is one of the reasons I say that the future test engineer is also going to be a data scientist, so that they can actually, in my opinion, build those models themselves. So, from a job perspective, it means getting into data science, ML, and AI, and actually getting into building the models that can consume that data and make predictions from there.

So from an observability perspective, that model is going to consume the RUM data, the real user data. It's going to look at your test suite and inform you that there is a lack of test coverage over here, right? And then you can intervene manually and update that.
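That kind of model is still forward-looking, but the underlying comparison can be sketched today with a simple heuristic: diff the user journeys RUM actually observes against the journeys the test suite exercises. The flow names below are made up, and a real model would weigh far more signals than a set difference.

```python
# Heuristic sketch of the coverage-gap idea: compare flows observed in RUM data
# against flows exercised by the automated suite. Flow names are illustrative;
# a real model would weigh traffic volume, failure rates, and much more.
observed_flows = {         # journeys real users actually performed (from RUM)
    "search", "add_to_cart", "checkout_card", "checkout_voucher", "track_order",
}
automated_flows = {        # journeys the current test suite exercises
    "search", "add_to_cart", "checkout_card",
}

coverage_gaps = observed_flows - automated_flows
print("user journeys with no automated coverage:", sorted(coverage_gaps))
```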

But pretty soon, it's going to be able to update the tests itself, run them, and validate them. So I think the future is exciting. It's also scary, because as test engineers it means we really have to evolve our game and step up, moving into the AI and ML space, but with the domain knowledge of software quality and software testing, so that we can actually build those models using our domain knowledge.

With AI, right now, that domain knowledge is going to be supreme. That's my opinion on how it's going to impact us; it's going to be massive. But as test engineers, and as organizations that deliver professional services in the software quality space, the organizations themselves also need to evolve in that way.

Kavya (Director of Product Marketing, LambdaTest) - Wow, the future of software quality is definitely dynamic. We are very sure about it. And I'm sure everyone's excited to see how it all unfolds over the next couple of years. What also stands out is how the convergence of AI and observability seems like it's going to redefine how teams ensure efficiency across the board. So thank you so much, Donovan. It's been an incredibly insightful session.

You know, really amazing insights that you've shared on how observability and RUM can transform software test modernization. And to our listeners, thank you for tuning in. We hope today's session has provided you with practical takeaways to integrate RUM and observability into your testing strategy.

This is just one of the many thought-provoking conversations that we will be bringing to you through the LambdaTest XP Series. So, stay tuned for more sessions. And Donovan, once again, thank you for your time and for sharing all these insights. If there is anything you'd like to share with our audience as we wrap up, over to you.

Donovan Mulder (Chief Executive Officer, Kinetic Skunk) - Just a closing comment. To your point about rapid change and how dynamic the industry we find ourselves in is: all of this comes back to what I said in a conversation about three years ago, that our future test analysts and test engineers have to become data scientists. So it is scary, as I mentioned earlier, but I think as test engineers and quality assurance engineers we just need to embrace it, take on the challenge, and become, let's call it, AI-skilled test engineers.

Kavya (Director of Product Marketing, LambdaTest) - Awesome. Thank you so much, Donovan, for those closing remarks. And everyone, stay tuned for more exciting sessions where we continue to explore the future of testing, automation, and software quality with industry leaders. Until next time, stay curious, stay innovative, and keep testing. Happy testing. Thank you so much. Have a good day.

Guest

Donovan Mulder

Chief Executive Officer

Donovan “Donnie” Mulder is a technologist, entrepreneur, and global thought leader in DevOps, software quality, and digital resilience. He is the Founder and CEO of Kinetic Skunk, a cutting-edge firm delivering Cloud Infrastructure, Observability, and DevSecOps solutions in partnership with tech giants like AWS, Azure, GitLab, and RedHat. With over 20 years' experience, Donnie has led global teams and developed testing solutions for critical systems like Deutsche Bank's trading platform. He is also a published academic and passionate social activist, committed to equity in tech and mentoring emerging talent across Africa's underrepresented communities.


Host

Kavya

Director of Product Marketing, LambdaTest

With over 8 years of marketing experience, Kavya is the Director of Product Marketing at LambdaTest. In her role, she leads various aspects, including product marketing, DevRel marketing, partnerships, GTM activities, field marketing, and branding. Prior to LambdaTest, Kavya played a key role at Internshala, a startup in Edtech and HRtech, where she managed media, PR, social media, content, and marketing across different verticals. Passionate about startups, technology, education, and social impact, Kavya excels in creating and executing marketing strategies that foster growth, engagement, and awareness.

