How To Find Broken Links Using Cypress [With Examples]

Enrique

Posted On: August 5, 2021

Have you ever encountered a 404 error? From an end user’s perspective, a broken link can be a major turn-off. Beyond frustrating users, dead links can negatively impact your website’s SEO, increase bounce rates, and reduce site credibility. Regularly detecting and fixing broken links is crucial for maintaining a professional, reliable site. Cypress simplifies this process with its cy.request() function, allowing you to check the status of each link and quickly identify any that aren’t working.

Overview

Why Find Broken Links On A Website?

Broken links can frustrate users and reduce engagement, while also impacting how search engines view your site. Regularly checking links ensures smooth navigation and maintains website credibility.

  • User Retention: Users encountering broken links are more likely to leave your site immediately.
  • Bounce Rate: High bounce rates from dead links can indirectly affect search engine rankings.
  • Trust & Credibility: Broken links reduce trust and professionalism, affecting your brand reputation.
  • Content Accessibility: Preventing 404 errors ensures all content is accessible and functional.
  • Navigation Experience: Helps maintain a seamless navigation experience across all pages.

How To Find Broken Links Using Cypress?

Cypress allows you to automate link verification across your website, helping you quickly identify pages that don’t load correctly or return errors. By leveraging its modern testing capabilities, you can streamline checks and reduce manual effort.

  • Collect Links: Gather all anchor elements (<a> tags) on the page and extract their URLs.
  • Check Response: Loop through each URL and send a request to check the HTTP response code.
  • Flag Errors: Identify links that return error codes (like 404 or 500) to flag broken pages.
  • Use Configs: Utilize configuration files to manage multiple URLs and scale tests across different site sections.
  • Log Results: Record results for easy debugging and track which links need updates.
  • Handle Exceptions: Integrate exception handling to avoid test failures from non-critical errors.

How Can Error Handling Prevent Cypress Tests From Failing Due to Minor Issues?

Error handling in Cypress, such as ignoring selected uncaught exceptions, ensures that non-critical errors, like a JavaScript exception thrown by the page under test, do not stop the entire test. This makes the testing process more robust and allows teams to focus on actionable failures rather than being blocked by minor issues.

Search engine algorithms pay special attention to how users behave on your website, so visitor behavior plays a significant role in the ranking process. An HTTP 404 error is one of the most frustrating things your visitors can come across, and chances are that they will never revisit your website. It is like handing over your customer base to your competition.

Apart from a negative user experience, high bounce rates caused by broken links can hurt your SEO. Although Google’s algorithm may not consider bounce rate directly, the user behavior behind it can still hurt your online rankings.
This is where a broken link checker built with the Cypress framework comes in handy, as it can be triggered regularly to ensure that the website is free from broken links.

You can find broken links on a website only when you are clear about the conditions that cause them. There are many reasons for 404 errors (broken or dead links); the major ones are mentioned below:

  • The page has been removed from the website
  • The page was moved to another URL, and the redirect was set up incorrectly
  • The URL was entered incorrectly
  • The server is malfunctioning (though this is rare)
  • The domain has expired

Broken links often remain in place long after the page has been deleted or moved. This is because the websites that link to the page are not informed that it no longer exists or can be found under a new URL. In addition, broken HTML tags, JavaScript errors, and broken embedded elements can also lead to broken (or dead) links on a website.

No matter how critical a page is in the larger scheme of your website, it is essential to check the website for broken links regularly. Though you can find broken links using Selenium, Cypress is recommended for this task because the implementation is simpler.

Whenever a user visits a website, the server responds to the request sent by the browser with a three-digit response code. This code is called ‘HTTP Response Status Code,’ which indicates the status of the HTTP request.

Here are the five major classes of HTTP status codes:

  • Informational responses (100–199)
  • Successful responses (200–299)
  • Redirects (300–399)
  • Client errors (400–499)
  • Server errors (500–599)
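
As a rough illustration, the five classes above can be derived from the first digit of the status code. The helper names statusClass and isBrokenStatus are hypothetical, and treating every 4xx and 5xx response as a broken link is an assumption of this sketch rather than a rule stated in this article.

```javascript
// Hypothetical helper: map an HTTP status code to one of the five classes.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) {
    return 'unknown';
  }
  const classes = [
    'informational', // 100-199
    'successful',    // 200-299
    'redirect',      // 300-399
    'client error',  // 400-499
    'server error',  // 500-599
  ];
  return classes[Math.floor(code / 100) - 1];
}

// Assumption of this sketch: any client or server error counts as broken.
function isBrokenStatus(code) {
  return code >= 400 && code <= 599;
}

console.log(statusClass(404));    // client error
console.log(isBrokenStatus(404)); // true
console.log(isBrokenStatus(301)); // false
```

A link checker built on this idea only needs the numeric status of each response, not the page content.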

Though it is important to have a top-level understanding of all the HTTP status codes, our interest mainly lies in the HTTP 404 status, which indicates that a particular link on the website is broken. A 404 error means that although the server is reachable, the specific page you are looking for is not present (or available) on the server. Essentially, it’s a page that doesn’t exist, or it’s broken. The 404 error code can appear in any browser, regardless of whether you’re using Google Chrome or Firefox.

Here are some of the many ways in which 404 errors are shown to the end user:

  • 404 Not Found Error
  • 404 HTTP 404
  • 404 Page Not Found
  • Error 404 Not Found
  • HTTP 404 Not Found
  • The requested URL was not found on this server
  • 404 File or Directory Not Found

If you want a quick walk-through of the Cypress framework, check out our blog that explains the architecture of Cypress automation framework. Alternatively, you can also check out LambdaTest YouTube Channel, where we have a video that explains how you can get started with Cypress.

Starting your journey with Cypress End-to-End Testing? Check out how you can test your Cypress test scripts on LambdaTest’s online cloud.

How To Find Broken Links Using Cypress

By now, you should understand the importance of checking for broken links on your website. With the stage set, let’s look at how to find broken links using Cypress. For starters, Cypress is a next-generation front-end testing tool built for the modern web; it enables you to write faster, easier, and more reliable tests.

In case you are coming from a Selenium automation background, make sure to check out the differences between Selenium and Cypress in our blog that covers the Cypress vs Selenium comparison. That said, let’s focus on our problem of broken links and how to build some tests to catch them using Cypress test automation.

Let’s take a sample HTML page that contains four relative hyperlinks:

As you can see, we have four links. We need to click on each link, check the URL it leads to, and then go back to our main page. So, how do you find broken links using Cypress on the above HTML page?

What if we come up with an implementation that only finds which of these four links are broken? An implementation hardcoded for four links will run into scalability issues, especially if the checker is used on a different web page.

Here is the sample code that uses the Cypress framework to find broken links on the website. As mentioned earlier, this is not a scalable approach and should be avoided when finding broken links on large-scale websites.

In this example, we click through to every page and check a specific assertion. It does not follow good design patterns, and we can improve the way we get the links on our website.

Let’s take a look at the following example.

As you can deduce from the code above, we loop over the specific pages and validate the page information. We created a ‘forEach’ loop that iterates through the array, repeating the entire procedure for each item. This is particularly useful if, for any reason, the items in our navigation bar change: we add an item to the array, and our test still works.
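
A minimal sketch of such a loop is shown here, not the original listing. The page labels, the paths, and the helper name checkNavLink are all hypothetical, and the sketch assumes that baseUrl is set in the Cypress configuration.

```javascript
// Hypothetical navigation items; the labels and paths are placeholders.
const pages = [
  { label: 'Blog', path: '/blog' },
  { label: 'About', path: '/about' },
  { label: 'Contact', path: '/contact' },
  { label: 'Pricing', path: '/pricing' },
];

// Click one navigation link, assert on the resulting URL, then go back.
function checkNavLink(page) {
  cy.contains('a', page.label).click();
  cy.url().should('include', page.path);
  cy.go('back');
}

// Register the test only when this file runs inside Cypress.
if (typeof Cypress !== 'undefined') {
  describe('Navigation links', () => {
    it('opens every page listed in the array', () => {
      cy.visit('/'); // assumes baseUrl is set in the Cypress configuration
      pages.forEach((page) => checkNavLink(page));
    });
  });
}
```

Supporting a new navigation item then only requires one more entry in the pages array.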

“If you are using Cypress to find broken links on your website, it is essential to note that Cypress changes its host URL to match the URL of your AUT (Application Under Test). The basic requirement from Cypress is that the URLs being navigated should have the same superdomain for the entirety of a single test.”

Navigating to subdomains works fine, but Cypress will throw an error if you visit two different superdomains in one test. Thus, you can visit different superdomains in different tests, but not within the same test.

Instead of opening every link on the test website, we can simply collect the elements that have an href attribute and check the HTTP status code of each target. If the status code is 404, that particular link is broken (or dead).
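
A sketch of this status check for a single link, using cy.request(), could look like the following. The helper name checkLink and the example URL are placeholders. Passing failOnStatusCode: false keeps cy.request() from failing the test as soon as it sees a 4xx or 5xx response, so the status can be inspected instead. Treating every status of 400 or above as broken is an assumption of this sketch.

```javascript
// Hypothetical helper: request a URL and yield a small result object.
// failOnStatusCode: false lets us read 4xx/5xx responses instead of
// having cy.request() fail the test immediately.
function checkLink(url) {
  return cy
    .request({ url, failOnStatusCode: false })
    .then((response) => ({
      url,
      status: response.status,
      broken: response.status >= 400, // assumption: 4xx and 5xx are broken
    }));
}

// Register the test only when this file runs inside Cypress.
if (typeof Cypress !== 'undefined') {
  it('reports the status of a single link', () => {
    // Placeholder URL; replace it with a link from the page under test.
    checkLink('https://example.com/').its('broken').should('eq', false);
  });
}
```

Network-level failures, such as a domain that cannot be resolved, still fail the command, because no HTTP response is available to inspect.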

To demonstrate how to find broken links on your website using Cypress, let’s perform a broken links test on the LambdaTest Blog. There are more than 500 posts on the LambdaTest blog, and our broken link checker using Cypress will check every link (i.e., every href attribute).

Here is the test scenario used for finding broken links on a website using Cypress:

Test Scenario

  1. Go to LambdaTest Blog on Chrome.
  2. Collect all the links present on the page.
  3. Send HTTP requests for each link.
  4. Print whether the link is broken or not on the terminal.
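
The four steps of this scenario can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation walked through later: the value of blogUrl and the helper name describeLink are hypothetical, only http(s) links are requested, and cy.log() writes to the Cypress Command Log (printing to the terminal would need an additional task, which is left out here).

```javascript
// Assumed address of the page under test; the article keeps it in config.json.
const blogUrl = 'https://www.lambdatest.com/blog/';

// Hypothetical helper: build one log line per checked link.
function describeLink(href, status) {
  const verdict = status >= 400 ? 'BROKEN' : 'OK';
  return `${verdict} [${status}] ${href}`;
}

// Register the test only when this file runs inside Cypress.
if (typeof Cypress !== 'undefined') {
  describe('Broken link check', () => {
    it('requests every link on the page', () => {
      cy.visit(blogUrl); // step 1: open the page
      cy.get('a[href]').each(($a) => { // step 2: collect the links
        const href = $a.prop('href'); // resolved absolute URL
        if (!/^https?:/.test(href)) {
          return; // skip mailto: and other non-HTTP links
        }
        // step 3: send an HTTP request for the link
        cy.request({ url: href, failOnStatusCode: false }).then((response) => {
          // step 4: report whether the link is broken
          cy.log(describeLink(href, response.status));
        });
      });
    });
  });
}
```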

Project Structure

It is time to create our test under the integration folder; as you can see below, we have a test called test-example.js. Shown below is the directory structure:

cypress
└── integration
    └── test-example.js

Implementation

To keep the code extensible and maintainable enough to check for broken links on any number of websites, we keep the test URLs in a separate JSON file (e.g., config.json).

As seen above, we have two test URLs (i.e., URL1 and URL2); however, we would run the test only on URL1. Shown below is the implementation (that uses JavaScript).

Code Walkthrough

Step 1:

We first import config.json since it contains the test links.
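
For illustration only, the data in config.json might be shaped like the object below. The key names URL1 and URL2 come from the description above, while the values, the require path in the comment, and the helper name testUrl are assumptions of this sketch.

```javascript
// Stand-in for the contents of config.json. URL1 and URL2 are the key
// names mentioned in the article; the values are placeholders.
const config = {
  URL1: 'https://www.lambdatest.com/blog/',
  URL2: 'https://example.com/',
};

// In a spec file, the same data would typically be loaded with something like
//   const config = require('./config.json'); // path is an assumption

// Hypothetical helper: look up a test URL by key and fail early on a typo.
function testUrl(key) {
  if (!(key in config)) {
    throw new Error(`Unknown config key: ${key}`);
  }
  return config[key];
}

console.log(testUrl('URL1'));
```

Keeping the URLs out of the spec file means a new site can be added without touching the test logic.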

Step 2:

We now visit a remote URL. The base URL is stored in cypress.json to ensure better portability and maintainability.

Step 3:

When testing some websites, we need to ignore certain uncaught exceptions thrown by the application. In this case, we can use the following code to stop specific errors from failing the test: returning false from an uncaught:exception handler tells Cypress to ignore that error. cy.on binds a listener for an event, such as an exception, for the duration of the current test; here, we have also used it to fail a test on purpose, using a small hack.
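
A minimal sketch of ignoring only specific errors is shown below. The patterns in ignoredErrors and the helper name shouldIgnore are hypothetical, and the sketch registers the handler globally with Cypress.on, whereas cy.on inside a test would scope it to that test.

```javascript
// Hypothetical list of error messages that are considered safe to ignore.
const ignoredErrors = [/ResizeObserver loop/i, /Script error/i];

// Decide whether an uncaught exception matches one of the ignored patterns.
function shouldIgnore(err) {
  const message = (err && err.message) || '';
  return ignoredErrors.some((pattern) => pattern.test(message));
}

// Register the handler only when this file runs inside Cypress.
if (typeof Cypress !== 'undefined') {
  Cypress.on('uncaught:exception', (err) => {
    if (shouldIgnore(err)) {
      return false; // returning false keeps Cypress from failing the test
    }
    // Returning nothing lets Cypress fail the test as usual.
  });
}
```

Matching on the message keeps unexpected application errors visible instead of silencing every exception.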

Step 4:

cy.wrap yields the object passed into it so that Cypress commands and assertions can be chained off it. In this case, we wrap a control variable and use it to pass or fail the test depending on our parameters.

Step 5:

We use each() to iterate over the matched elements, excluding “mailto:” links and empty ones. With this, we will get the URLs that we want to monitor for broken links with Cypress.
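
One way to express this filtering is sketched below. The helper name isCheckable is hypothetical, and skipping tel:, javascript:, and fragment-only links goes beyond the “mailto:” and empty cases mentioned above, so those extra cases are assumptions of this sketch. The test also assumes that baseUrl is configured.

```javascript
// Hypothetical helper: decide whether an href value is worth requesting.
function isCheckable(href) {
  if (typeof href !== 'string') {
    return false;
  }
  const value = href.trim();
  if (value === '' || value.startsWith('#')) {
    return false; // empty or fragment-only links
  }
  // Assumption: non-HTTP schemes cannot be checked with an HTTP request.
  return !/^(mailto|tel|javascript):/i.test(value);
}

// Register the test only when this file runs inside Cypress.
if (typeof Cypress !== 'undefined') {
  it('collects only the links worth checking', () => {
    cy.visit('/'); // assumes baseUrl is set in the Cypress configuration
    // The selector already drops anchors without href and mailto: links.
    cy.get('a[href]:not([href^="mailto:"])').each(($a) => {
      const href = $a.attr('href');
      if (isCheckable(href)) {
        cy.log(`will check: ${href}`);
      }
    });
  });
}
```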

Step 6:

Besides the anchors, we are also validating the links in the page head, which always carry an href attribute. As a part of the process, we validate them all in the code block for this step, combining the selectors wherever possible.

Execution

Now, let’s add Cypress to the project and run the test from there. If you already have an npm project, open a terminal in VS Code and run the following command:

Now that Cypress is installed, let’s run the following command to get the Cypress folder:

To configure Cypress, we open the Cypress Test Runner, which creates cypress.json. This JSON file is used to store any configuration values you supply.

Open the Cypress Test Runner and click on the corresponding test to execute it.

Here is the test execution, which indicates that there are zero broken links on the test website:

How to find broken links using Cypress on Cloud Grid

Cypress testing on a cloud grid like LambdaTest helps in running tests on a wide range of browser and OS combinations. Parallel execution accelerates test execution and helps achieve optimal browser coverage.

Cypress on LambdaTest helps you run Cypress tests at scale. You can check out our earlier blog that deep dives into the essentials on how to perform Cypress testing at scale with LambdaTest.

To get started, you have to install the LambdaTest Cypress CLI on your machine. Run the following command to install it:

After the installation is complete, set up the configuration using the command below:

Once the command is completed, lambdatest-config.json is created in the project folder. Next, enter the LambdaTest credentials from the LambdaTest profile section.

Here is how you can configure the required browser & OS combinations in lambdatest-config.json:

The run_settings section in the JSON file contains the desired Cypress test suite capabilities, including cypress_version, build_name, visual feedback settings, number of parallel sessions, etc.

The tunnel_settings section in the JSON file lets you connect your local system with LambdaTest servers via an SSH-based integration tunnel. Once this tunnel is established, you can test your locally hosted pages on all the browsers currently supported by Cypress on LambdaTest.

Now that the setup is ready, it’s time to run the tests by triggering the following command:

Shown below is the test execution status from the Automation Dashboard:

After the test execution, click on the test name to view the automation logs for debugging the respective test.

You can view the live video feed, screenshots for each test run, console logs, terminal logs, and much more using Cypress on LambdaTest.

One crucial aspect of running the tests using LambdaTest and Cypress is parallel testing. This can be achieved using two methods. The first option is passing the parallelization level from the command line:

The other option is to set the parallelization level using the parallels key in lambdatest-config.json.

Here is the execution snapshot, which indicates the progress of test execution:

To summarize, the major benefit of Cypress testing on the cloud is achieving optimal test coverage without making modifications to the core logic of the test code.

It’s A Wrap

It’s inevitable that 404 errors will appear on your site. Broken links can also impact your rankings on search engines, so make sure you proactively monitor the links on your website. Finding broken links (HTTP 404 errors) on your website is as vital as posting unique and high-quality content.

Check your 404s as a part of continuous testing, and you can include Cypress tests as part of your testing tools. Putting time aside to update your website and perform technical testing will help you stay ahead of the competition.

We know that we must enable, nurture, and foster an ecosystem that values quality. Every line of test code is an investment in our codebase, and tests should always be able to run independently of one another. Lastly, in this Cypress tutorial, we saw how the LambdaTest and Cypress integration helps ensure a seamless user experience across 40+ browser versions simultaneously.

Happy Bug Hunting with Cypress!!

Frequently Asked Questions (FAQs)

Why is automating broken link detection important for large websites?

Manually checking links on large websites is time-consuming and error-prone. Automating with tools like Cypress ensures every link is verified, preventing dead pages and improving overall website reliability.

How can Cypress tests for broken links improve SEO performance?

Broken links frustrate users and increase bounce rates, indirectly affecting SEO rankings. Cypress helps identify and fix these links, ensuring search engines can crawl all pages effectively, improving visibility and user experience.

Can Cypress detect broken links across multiple domains in one test?

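Navigation with cy.visit() is limited to URLs within the same superdomain in a single test. Link checks based on cy.request() are not bound by this restriction, because the request is sent outside the browser, so links that point to external domains can be verified in the same test.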

How can broken link tests be scaled effectively using Cypress?

Scale tests by storing URLs in external configuration files, looping through them, and using parallel execution on cloud platforms. This makes tests maintainable and flexible for adding or removing links.

How does checking the HTTP response code help detect broken links?

HTTP response codes indicate link status: 200 means accessible, while 404 or 500 indicate broken pages. Checking these codes automatically detects non-functional links without manually visiting each page.

How can Cypress handle links that are dynamically loaded on a page?

Cypress can wait for elements to appear or use assertions to ensure visibility before validating links. This ensures all links, including those generated via JavaScript, are checked.

Can Cypress validate links inside iframes?

Yes, by switching context to the iframe using iframe-specific commands or plugins, Cypress can access its content and verify links just like on the main page.

What role does parallel testing play in broken link validation?

Parallel testing allows multiple link checks to run simultaneously across different browsers or environments, reducing overall testing time and enabling regular checks on large websites.

How can error handling prevent Cypress tests from failing due to minor issues?

Error handling, like ignoring uncaught exceptions, ensures non-critical errors don’t stop the entire test. This makes testing robust and focuses on actionable failures.

Can broken link tests be integrated into CI/CD pipelines?

Yes, Cypress broken link tests can run during automated builds in CI/CD pipelines, catching newly introduced broken links before deployment and maintaining website quality and reliability.

Author

Enrique DeCoss, Senior Quality Assurance Manager at FICO, is an industry leader in quality strategy with 16+ years of experience implementing automation tools. Enrique has a strong background in testing tools, API testing strategies, performance testing, Linux OS, and testing techniques. Enrique loves to share his in-depth knowledge in competencies including Selenium, JavaScript, Python, ML testing tools, cloud computing, agile methodologies, and people management.
