How to Find Broken Links in Selenium Java

Selenium is an open-source tool for automating web browsers through the WebDriver API. This tutorial uses geckodriver, which has been the default WebDriver implementation for Firefox since Selenium 3.

Pre-requisites:

  1. geckodriver.exe
  2. The Selenium Maven dependency:
 <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.1.1</version>
 </dependency>

Steps:

Finding broken links takes five steps:
  1. Set the system property webdriver.gecko.driver to the path of the geckodriver executable.
  2. Create the Firefox driver and browse to the website.
  3. Get all the anchor tags as WebElements using the driver.findElements() function with a tag-name locator.
  4. Read the href attribute of each WebElement as a URL and try to open an HTTP connection to it using java.net.HttpURLConnection.
  5. Check the response code to decide whether the link is broken. (Broken links have a response code of 400 or higher.)
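
Steps 4 and 5 do not need a browser, so they can be tried on their own. The following is a minimal sketch and not part of the tutorial's own code: the class name LinkProbe, the 5-second timeout, and the retry with GET when a server rejects HEAD (response code 405 or 501) are assumptions added for illustration. It also assumes the link is an absolute http or https URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, named for this sketch only.
final class LinkProbe {

    // Assumed timeout; tune it for the site under test.
    static final int TIMEOUT_MS = 5000;

    private LinkProbe() {
    }

    // Step 5: a response code of 400 or higher marks the link as broken.
    static boolean isBroken(int responseCode) {
        return responseCode >= 400;
    }

    // Assumption: 405 and 501 mean the server does not accept HEAD,
    // so the link is worth a second attempt with GET.
    static boolean shouldRetryWithGet(int responseCode) {
        return responseCode == 405 || responseCode == 501;
    }

    // Step 4: open an HTTP connection and return the response code.
    static int request(String link, String method) throws IOException {
        HttpURLConnection cn =
                (HttpURLConnection) URI.create(link).toURL().openConnection();
        try {
            cn.setRequestMethod(method);
            cn.setConnectTimeout(TIMEOUT_MS);
            cn.setReadTimeout(TIMEOUT_MS);
            return cn.getResponseCode();
        } finally {
            cn.disconnect();
        }
    }

    // Tries HEAD first and falls back to GET only when HEAD is rejected.
    static int statusOf(String link) throws IOException {
        int code = request(link, "HEAD");
        return shouldRetryWithGet(code) ? request(link, "GET") : code;
    }
}
```

A caller would combine the two parts as LinkProbe.isBroken(LinkProbe.statusOf(link)). The timeouts keep one unresponsive server from stalling the whole run.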

The following code puts these steps together. It uses the HttpURLConnection.getResponseCode() function to get the response status of each link:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class GetBrokenLinks {

    // Path to the geckodriver executable; adjust it to your machine
    public static final String GECKODRIVER_PATH = "F:\\WORK\\SeleniumShortTasks\\ResponseCode\\src\\main\\resources\\geckodriver.exe";

    public static void main(String[] args) {
        // Point Selenium at the geckodriver executable and start Firefox
        System.setProperty("webdriver.gecko.driver", GECKODRIVER_PATH);
        WebDriver driver = new FirefoxDriver();

        // Open the page whose links will be checked
        String siteLink = "http://isoredirect.centos.org/centos/8/isos/x86_64/";
        driver.get(siteLink);

        // Collect all anchor tags on the page
        List<WebElement> linkElements = driver.findElements(By.tagName("a"));

        for (WebElement e : linkElements) {

            // Read the link target from the href attribute
            String link = e.getAttribute("href");

            // Anchors without an href have nothing to check
            if (link == null || link.isEmpty()) {
                continue;
            }

            try {
                // Open a connection to the URL
                HttpURLConnection cn = (HttpURLConnection) new URL(link).openConnection();
                // Use a HEAD request so that only the headers are fetched
                cn.setRequestMethod("HEAD");
                // Send the request
                cn.connect();
                // Read the response code
                int res = cn.getResponseCode();

                // A response code of 400 or higher means the link is broken
                if (res >= 400)
                    System.out.println("Http response code: " + res + " --broken link : " + link);
                else
                    System.out.println("Http response code: " + res + " --Not A Broken link : " + link);

                // Release the connection
                cn.disconnect();
            } catch (Exception exp) {
                // Malformed URLs, non-HTTP links (e.g. mailto:) and network errors end up here
                exp.printStackTrace();
            }

        }

        // Close the browser
        driver.quit();

    }
}
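
Not every href on a page is worth an HTTP request: some anchors have no href, some point to mailto: or javascript: targets, and the same URL often appears several times. As a hedged sketch (the class name LinkFilter and the rule "keep only http and https links, each once" are assumptions, not part of the code above), the href values could be filtered before checking them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, named for this sketch only.
final class LinkFilter {

    private LinkFilter() {
    }

    // Keeps each http/https link once, in the order it first appeared.
    static List<String> checkableLinks(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Sample values made up for the demonstration
        List<String> sample = Arrays.asList(
                "https://example.com/a",
                null,
                "mailto:someone@example.com",
                "https://example.com/a",
                "javascript:void(0)",
                "https://example.com/b");
        // prints [https://example.com/a, https://example.com/b]
        System.out.println(checkableLinks(sample));
    }
}
```

In the program above, the input list would be built from e.getAttribute("href") for each anchor, and the loop would then open connections only for the returned links.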

  

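Output:

For every link on the page, the program prints one line with the HTTP response code, marked as either "broken link" or "Not A Broken link", followed by the URL.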

All resources used in this tutorial are attached:
  • source code
  • geckodriver.exe