What Is New in Selenium 4?

Before we dive into what’s new in Selenium 4, let’s do a quick refresh of the basics.

Many people think of Selenium as just WebDriver, but it is an open-source suite of tools that includes Selenium IDE, Grid, WebDriver, and RC.

Selenium IDE

Selenium IDE is a record-and-playback tool that can export test scripts to programming languages such as C#, Python, Java, and Ruby. It uses its own scripting commands, known as Selenese. You can customize these scripts based on your requirements and run the tests.

Selenium RC

Selenium Remote Control (RC) lets you write scripts in any programming language to automate UI tests. It simulates user actions such as submitting a form, clicking a button, and entering text in fields, and it sends commands using GET/POST requests. However, the Selenium RC server must be started manually before you execute the tests. Due to its limitations and low performance, it has been deprecated.

Selenium WebDriver

Selenium WebDriver was introduced to overcome the limitations of Selenium RC. WebDriver controls browsers at the OS level and establishes direct communication between the code and the browser. It supports multiple language bindings, such as C#, Python, Java, Ruby, and JavaScript, and browsers such as Chrome, Safari, Firefox, Opera, Edge, and Internet Explorer.

Selenium Grid

Selenium Grid is a smart server that allows you to run tests across browsers, operating systems, and machines concurrently. We use it primarily for cross-browser testing and running tests on a remote computer.

In Selenium 4, some items are deprecated or changed, including the FindsBy interfaces, parts of the Actions class, and several driver constructors. The complete list with details is available here.

New Features in Selenium 4

Here’s the most significant difference between Selenium 3 and 4:

Selenium 3 communicates with the browser through the JSON Wire Protocol, so API requests and responses have to be encoded and decoded along the way.
Selenium 4 does not require this encoding and decoding because the communication between the driver and the browser follows the W3C (World Wide Web Consortium) standard protocol.
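
One visible difference between the two protocols is the shape of the request body that starts a new session. The sketch below only builds the two bodies as strings so they can be compared side by side. The class and method names are made up for illustration, and no real driver is contacted.

```java
// Conceptual sketch only: builds the two "new session" request bodies as
// strings. The class and method names are hypothetical, not Selenium API.
final class NewSessionPayloads {

    // Legacy JSON Wire Protocol shape: capabilities sit under "desiredCapabilities".
    static String jsonWire(String browserName) {
        return "{\"desiredCapabilities\":{\"browserName\":\"" + browserName + "\"}}";
    }

    // W3C WebDriver shape: capabilities sit under "capabilities" -> "alwaysMatch".
    static String w3c(String browserName) {
        return "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"" + browserName + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("JSON Wire: " + jsonWire("chrome"));
        System.out.println("W3C:       " + w3c("chrome"));
    }
}
```

When the client and the browser driver both use the W3C shape, no translation layer is needed between them.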

Now, let’s look at some exciting new features introduced in Selenium 4.

1. Relative Locators

We use locators to find an element and perform an action on it. Defining these locators requires knowledge of the page and its DOM structure, and sometimes finding a locator that identifies an element can become complex. What if you could locate an element by its position next to another element? Interesting, right? That is where Relative Locators come into the picture. These methods accept either a By locator or a WebElement as the parameter.

This is mainly helpful when your application is unstable, has dynamic elements, or has elements without unique attribute values.

Relative Locators Methods

above() – used to find an element/elements located just above a fixed element
below() – used to find an element/elements located just below a fixed element
near() – used to find an element/elements located at most 50 pixels away from a fixed element
toLeftOf() – used to find an element/elements located to the left of a fixed element
toRightOf() – used to find an element/elements located to the right of a fixed element
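
To build intuition for what these methods compare, here is a simplified model based on bounding rectangles. It is an illustration under stated assumptions, not Selenium's actual implementation: the Box class, the method names, and the coordinates are all hypothetical.

```java
// Simplified model of the geometry behind relative locators.
// Assumption: an illustration only, not Selenium's real algorithm.
final class RelativeGeometry {

    // Axis-aligned bounding box in page pixels; (x, y) is the top-left corner.
    static final class Box {
        final int x, y, width, height;

        Box(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        int right()  { return x + width; }
        int bottom() { return y + height; }
    }

    static boolean isAbove(Box candidate, Box anchor)     { return candidate.bottom() <= anchor.y; }
    static boolean isBelow(Box candidate, Box anchor)     { return candidate.y >= anchor.bottom(); }
    static boolean isToLeftOf(Box candidate, Box anchor)  { return candidate.right() <= anchor.x; }
    static boolean isToRightOf(Box candidate, Box anchor) { return candidate.x >= anchor.right(); }

    // "Near" here means the gap between the two boxes is at most maxPixels.
    static boolean isNear(Box candidate, Box anchor, int maxPixels) {
        int dx = Math.max(0, Math.max(anchor.x - candidate.right(), candidate.x - anchor.right()));
        int dy = Math.max(0, Math.max(anchor.y - candidate.bottom(), candidate.y - anchor.bottom()));
        return Math.hypot(dx, dy) <= maxPixels;
    }

    public static void main(String[] args) {
        Box firstName = new Box(100, 200, 150, 30); // hypothetical coordinates
        Box lastName  = new Box(270, 200, 150, 30);
        Box email     = new Box(100, 250, 320, 30);

        System.out.println(isToRightOf(lastName, firstName)); // true
        System.out.println(isBelow(email, firstName));        // true
        System.out.println(isNear(lastName, firstName, 50));  // true
    }
}
```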

Example: Let us use the relative locators and find some of the elements on the Contact page of the Modus Create website.

Import ‘withTagName’ statically and use the relative locators as follows. (Later Selenium 4 releases favor with(By.tagName(...)) over withTagName(...).)

import static org.openqa.selenium.support.locators.RelativeLocator.withTagName;

// Anchor elements found with ordinary By locators
WebElement firstName = driver.findElement(By.name("firstname"));

WebElement whatCanWeDoForYouInput = driver.findElement(By.name("what_can_we_do_for_you"));

// Elements found relative to the anchors
WebElement email = driver.findElement(withTagName("input").below(firstName));

WebElement lastName = driver.findElement(withTagName("input").toRightOf(firstName));

WebElement industry = driver.findElement(withTagName("input").above(whatCanWeDoForYouInput));

2. Chrome Debugging Feature

Selenium 4 introduced support for the Chrome DevTools Protocol (CDP), the debugging protocol behind the Chrome browser's DevTools.

Chrome DevTools are developer tools that are built into the Chrome browser. They enable us to perform a variety of tasks, such as:

View console logs
Inspect and edit elements
Check and monitor website performance
Mock the geolocation
Mock the network speed
Execute and debug JavaScript

You can use DevTools capabilities such as Fetch, Network, Profiler, Performance, Application Cache, Resource Timing, Security, and Target to debug problems. The important packages available in the org.openqa.selenium.devtools library are listed here.

You can trigger the DevTools commands with the following script.

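// Keep a reference to the driver so it can still be used and closed later
ChromeDriver driver = new ChromeDriver();

DevTools devTools = driver.getDevTools();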

devTools.createSession();

Selenium provides new network-related methods that enable the following capabilities:

Intercepting requests.
Emulating Network Conditions by changing connection types.
Enabling network tracking.
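
Under the hood, emulating network conditions comes down to sending a CDP command named Network.emulateNetworkConditions. The sketch below builds that JSON message by hand to show its parameters. Treat it as an illustration: the class and method names are hypothetical, the connection values are made up, and Selenium's DevTools classes normally build and send the message for you.

```java
import java.util.Locale;

// Sketch of the JSON message behind CDP's Network.emulateNetworkConditions.
// Assumption: hand-built for illustration; class and method names are hypothetical.
final class NetworkConditionsMessage {

    // CDP expects throughput in bytes per second, so convert from kilobits per second.
    static long kbpsToBytesPerSecond(long kbps) {
        return kbps * 1000 / 8;
    }

    // Builds the command with the four required parameters.
    static String build(int id, boolean offline, int latencyMs, long downKbps, long upKbps) {
        return String.format(Locale.ROOT,
            "{\"id\":%d,\"method\":\"Network.emulateNetworkConditions\","
          + "\"params\":{\"offline\":%b,\"latency\":%d,"
          + "\"downloadThroughput\":%d,\"uploadThroughput\":%d}}",
            id, offline, latencyMs,
            kbpsToBytesPerSecond(downKbps), kbpsToBytesPerSecond(upKbps));
    }

    public static void main(String[] args) {
        // Hypothetical slow connection: 100 ms latency, 400 kbps down and up.
        System.out.println(build(1, false, 100, 400, 400));
    }
}
```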

You can find further details on the implementation and the sample scripts of CDP (Chrome DevTools Protocol) commands here.

3. Capture Screenshots and Location

Capturing screenshots gives a quick visual representation of the page. It saves time because you do not need to rerun the script to check an issue. To capture a screenshot of a specific element, Selenium 4 lets you call getScreenshotAs() directly on a WebElement. You can also get the location and size of the element using the Rectangle object.

Example: Use the following snippet to take a screenshot of the WebElement ‘logo’ from the webpage.

// By.className rejects compound class names, so use a CSS selector instead
WebElement logo = driver.findElement(By.cssSelector(".col-sm-12.col-md-4.fl-page-header-logo-col"));

File srcFile = logo.getScreenshotAs(OutputType.FILE);

File destinationFile = new File("logo.png");

// FileUtils comes from Apache Commons IO (org.apache.commons.io.FileUtils)
FileUtils.copyFile(srcFile, destinationFile);

Example: Let us get the location, coordinates, height, and width of the logo element.

WebDriverManager.chromedriver().setup();

WebDriver driver = new ChromeDriver();

driver.navigate().to("https://moduscreate.com/");

// By.className rejects compound class names, so use a CSS selector instead
WebElement logo = driver.findElement(By.cssSelector(".col-sm-12.col-md-4.fl-page-header-logo-col"));

System.out.println("Height is "+logo.getRect().getDimension().getHeight());

System.out.println("Width is "+logo.getRect().getDimension().getWidth());

System.out.println("Location X is "+logo.getRect().getX());

System.out.println("Location Y is "+logo.getRect().getY());

4. Multiple Tabs and Windows Handling

Selenium 4 lets you open multiple tabs/windows along with the existing ones. You can use the following commands to achieve this:

driver.get("https://www.google.com/");

// Opens a new tab and switches the driver's focus to it
driver.switchTo().newWindow(WindowType.TAB);

driver.navigate().to("https://www.moduscreate.com/");

Similarly, you can also open multiple windows using these commands:

driver.get("https://www.google.com/");

// Opens a new window and switches the driver's focus to it
driver.switchTo().newWindow(WindowType.WINDOW);

driver.navigate().to("https://moduscreate.com/work/");
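
For comparison, before newWindow() a common pattern was to open the tab by other means and then compare the driver's window handles before and after to find the new one. The sketch below shows just that comparison step on plain sets of hypothetical handle strings, without a live driver. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Sketch of the "diff the window handles" step used before newWindow() existed.
// Assumption: handles are plain strings here; no browser is involved.
final class WindowHandles {

    // Returns a handle that is present in 'after' but not in 'before', if any.
    static Optional<String> findNewHandle(Set<String> before, Set<String> after) {
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        return added.stream().findFirst();
    }

    public static void main(String[] args) {
        Set<String> before = new LinkedHashSet<>(Arrays.asList("handle-1"));
        Set<String> after  = new LinkedHashSet<>(Arrays.asList("handle-1", "handle-2"));
        System.out.println(findNewHandle(before, after).orElse("none")); // handle-2
    }
}
```

With newWindow(), this bookkeeping is no longer needed to reach the new tab. You still need the original handle from driver.getWindowHandle() if you want to switch back with driver.switchTo().window(...).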

5. Selenium Grid with Observability

Selenium Grid 4 has a more user-friendly UI and comes with Docker support, which makes it easy to spin up nodes in containers. You can also deploy the Grid on a Kubernetes cluster on AWS, GCP, or Azure.
The Grid is designed so that it can run as a fully distributed system, and its traces, metrics, and logs make it easier to understand and debug.

6. Selenium IDE TNG (The Next Generation)

The new Selenium IDE has a fresh UI and lets us run scripts on any browser.
You can now specify more than one locator for an element (backup locators). When a locator fails to find the element, the IDE falls back to the alternative locators, so tests that used to fail because of a single broken locator are more likely to pass.
The IDE comes with a command-line tool, the SIDE runner (Selenium IDE runner), that lets testers run ‘.side’ projects on the Node.js platform. You can also use this runner to run cross-browser tests and to report the execution time along with passed and failed statuses.
It has an export option that generates code in Java, .NET, Python, Ruby, C#, and JavaScript.
It follows accessibility guidelines, with support for roles, tooltips, focus order, color and design, and an announcement when recording starts.
You can now use improved if/else conditions and loops to control the execution order of statements.
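
The backup-locator idea from the list above can be sketched in a few lines: try each locator in order and return the first one that finds something. This is a conceptual model only, not the IDE's code. The class name, the locator strings, and the map standing in for a page are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Conceptual model of fallback locators. Names and data are hypothetical.
final class FallbackLocators {

    // Tries each locator in order and returns the first element that is found.
    static <E> Optional<E> findFirst(List<String> locators, Function<String, Optional<E>> lookup) {
        for (String locator : locators) {
            Optional<E> found = lookup.apply(locator);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A map stands in for the page: locator string -> element description.
        Map<String, String> page = new HashMap<>();
        page.put("css=.submit-btn", "<button class='submit-btn'>");

        // The id-based locator no longer matches, so the CSS locator is used.
        Optional<String> element = findFirst(
            Arrays.asList("id=submit", "css=.submit-btn"),
            locator -> Optional.ofNullable(page.get(locator)));

        System.out.println(element.orElse("not found"));
    }
}
```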

Selenium IDE will soon be available as a standalone app, rewritten as an Electron app that can listen to events from the browser, making test recordings even more powerful.

Conclusion

Selenium is one of the most popular open-source tools for writing test scripts in different languages (Ruby, Python, Java, C#, etc.), and a large number of active Selenium communities are available to address issues. The IDE makes it easier for testers with limited programming knowledge to automate test cases. WebDriver makes cross-browser testing more effective: you can reuse test suites and run them across multiple browsers and operating systems. By regrouping and refactoring test scripts, you can make quick changes that improve maintainability. Selenium Grid lets us run tests in parallel, which saves time and facilitates faster releases. As per the data, 55,514 companies use Selenium. On top of this, Selenium 4 adds features such as Relative Locators, Chrome DevTools support, Multiple Tabs/Windows Handling, Screenshots for Elements, IDE improvements, and Grid enhancements that help automation projects succeed.

This post was published under the Quality Assurance Community of Experts. Communities of Experts are specialized groups at Modus that consolidate knowledge, document standards, reduce delivery times for clients, and open up growth opportunities for team members. Learn more about the Modus Community of Experts here.

Durga Sundarraj

Durga is a QA Tester at Modus Create with over ten years of experience in delivering successful products. She is passionate about Agile Methodologies, BDD, Quality Assurance, and Quality Control processes. Durga is an expert in testing various products and is involved in automation development using Selenium. In her free time, she likes to travel and spend time with her family.