Before we dive into what’s new in Selenium 4, let’s do a quick refresh of the basics.
Many people think of Selenium as simply the WebDriver. But it is an open-source suite that comprises a host of tools such as Selenium IDE, Grid, WebDriver and RC.
Selenium IDE
Selenium IDE is a Record and Playback tool that creates test scripts in programming languages such as C#, Python, Java, and Ruby. It uses its own scripting commands known as Selenese. You can customize these scripts based on your requirements and run the tests.
Selenium RC
Selenium Remote Control (RC) lets you write scripts in any programming language to automate UI tests. It simulates user actions such as submitting a form, clicking a button, and entering text in fields by sending commands over HTTP GET/POST requests. However, the Selenium RC server must be started manually before you execute the tests. Due to its limitations and low performance, it has been deprecated.
Selenium WebDriver
Selenium WebDriver was introduced to overcome the limitations of Selenium RC. WebDriver controls browsers at the OS level and establishes direct communication between the code and the browser. It supports multiple language bindings such as C#, Python, Java, Ruby, and JavaScript, and browsers such as Chrome, Safari, Firefox, Opera, Edge, and Internet Explorer.
Selenium Grid
Selenium Grid is a smart server that allows you to run tests across browsers, operating systems, and machines concurrently. We use it primarily for cross-browser testing and running tests on a remote computer.
In Selenium 4, some items are deprecated: FindsBy interfaces, Actions class, Driver constructors, etc. The complete list with details is available here.
New Features in Selenium 4
Here’s the most significant difference between Selenium 3 and 4:
- In Selenium 3, API requests must be encoded and decoded because the browser interaction happens through the JSON Wire Protocol.
- In Selenium 4, this encoding and decoding is not required because the communication between the driver and the browser follows the W3C (World Wide Web Consortium) standard protocol.
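One visible place where the two protocols differ is the body of the "new session" request. The following is a minimal sketch with the payloads hand-built as strings; real clients build them through the language bindings, and the class and method names here (SessionPayloads, jsonWirePayload, w3cPayload) are hypothetical.

```java
public class SessionPayloads {

    // Legacy JSON Wire Protocol: requested capabilities sit under "desiredCapabilities".
    public static String jsonWirePayload(String browserName) {
        return "{\"desiredCapabilities\": {\"browserName\": \"" + browserName + "\"}}";
    }

    // W3C WebDriver: requested capabilities sit under "capabilities" -> "alwaysMatch"
    // (an optional "firstMatch" list is omitted here for brevity).
    public static String w3cPayload(String browserName) {
        return "{\"capabilities\": {\"alwaysMatch\": {\"browserName\": \"" + browserName + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("JSON Wire: " + jsonWirePayload("chrome"));
        System.out.println("W3C:       " + w3cPayload("chrome"));
    }
}
```

When the client and the browser driver both speak the W3C format, no translation layer between the two shapes is needed.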
Now, let’s look at some exciting new features introduced in Selenium 4.
1. Relative Locators
We use locators to find an element and perform an action. You would require knowledge of the DOM structure and the page to define these locators. Sometimes finding locators to identify an element can become complex. What if there is an option to locate an element next to another element? Interesting, right? That is where Relative Locators come into the picture. You can use the By locator or WebElement parameters for these methods.
This is mainly helpful when your application is unstable, has dynamic elements, or has elements without unique attribute values.
Relative Locators Methods
- above() – used to find an element/elements located just above a fixed element
- below() – used to find an element/elements located just below a fixed element
- near() – used to find an element/elements located at most 50 pixels away from a fixed element
- toLeftOf() – used to find an element/elements located to the left of a fixed element
- toRightOf() – used to find an element/elements located to the right of a fixed element
Example: Let us use the relative locators and find some of the elements on the Contact page of the Modus Create website.
Import ‘withTagName’ and identify the relative locators as follows.
import static org.openqa.selenium.support.locators.RelativeLocator.withTagName;

// Anchor elements located by their name attribute
WebElement firstName = driver.findElement(By.name("firstname"));
WebElement whatCanWeDoForYouInput = driver.findElement(By.name("what_can_we_do_for_you"));

// Inputs located relative to the anchors
WebElement email = driver.findElement(withTagName("input").below(firstName));
WebElement lastName = driver.findElement(withTagName("input").toRightOf(firstName));
WebElement industry = driver.findElement(withTagName("input").above(whatCanWeDoForYouInput));
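To build intuition for what these methods compare, here is a simplified model of relative positioning on bounding boxes. Selenium evaluates element positions in the browser from their bounding rectangles; the code below is only an illustrative sketch, not Selenium's implementation. The class RelativePositionModel, the Box type, and all pixel coordinates are made up for the example.

```java
public class RelativePositionModel {

    /** Axis-aligned bounding box: top-left corner (x, y) plus width and height, in pixels. */
    public static final class Box {
        final int x, y, width, height;

        public Box(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        int right() { return x + width; }
        int bottom() { return y + height; }
    }

    // The candidate ends at or before the anchor's top edge.
    public static boolean isAbove(Box candidate, Box anchor) { return candidate.bottom() <= anchor.y; }

    // The candidate starts at or after the anchor's bottom edge.
    public static boolean isBelow(Box candidate, Box anchor) { return candidate.y >= anchor.bottom(); }

    public static boolean isLeftOf(Box candidate, Box anchor) { return candidate.right() <= anchor.x; }

    public static boolean isRightOf(Box candidate, Box anchor) { return candidate.x >= anchor.right(); }

    // The gap between the boxes (0 on an axis where they overlap) is at most maxPixels.
    public static boolean isNear(Box candidate, Box anchor, int maxPixels) {
        int dx = Math.max(0, Math.max(anchor.x - candidate.right(), candidate.x - anchor.right()));
        int dy = Math.max(0, Math.max(anchor.y - candidate.bottom(), candidate.y - anchor.bottom()));
        return Math.hypot(dx, dy) <= maxPixels;
    }

    public static void main(String[] args) {
        // Hypothetical layout of three form fields
        Box firstName = new Box(100, 200, 200, 40);
        Box lastName = new Box(320, 200, 200, 40);
        Box email = new Box(100, 260, 420, 40);

        System.out.println(isRightOf(lastName, firstName));  // true: 320 >= 300
        System.out.println(isBelow(email, firstName));       // true: 260 >= 240
        System.out.println(isNear(lastName, firstName, 50)); // true: horizontal gap is 20 px
    }
}
```

The model also shows why relative locators can return surprising matches: any element whose box satisfies the comparison qualifies, so layout changes at different window sizes can change the result.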
2. Chrome Debugging Feature
Selenium 4 introduced support for the Chrome DevTools Protocol (CDP), the debugging protocol behind Chrome DevTools.
Chrome DevTools are developer tools that are built into the Chrome browser. They enable us to perform a variety of tasks, such as:
- View console logs
- Inspect and edit elements
- Check and monitor website performance
- Mock the geolocation
- Mock the network speed
- Execute and debug JavaScript
You can use CDP domains such as Fetch, Network, Profiler, Performance, Application Cache, Resource Timing, Security, and Target to debug problems. The important packages available in the library org.openqa.selenium.devtools are listed here.
You can trigger the DevTools commands with the following script.
// Keep a reference to the driver so it can still be used and closed later
ChromeDriver driver = new ChromeDriver();
DevTools devTools = driver.getDevTools();
devTools.createSession();
Selenium provides new network-related methods that enable the following capabilities:
- Intercepting requests.
- Emulating Network Conditions by changing connection types.
- Enabling network tracking.
You can find further details on the implementation and the sample scripts of CDP (Chrome DevTools Protocol) commands here.
3. Capture Screenshots and Location
Capturing screenshots gives a quick visual representation of the page. It saves time, and you do not need to rerun the script to check the issue. If you want to capture a screenshot of a specific element, Selenium 4 lets you call getScreenshotAs() directly on a WebElement. We can also get the location and size of the element using the Rectangle object.
Example: Use the code snippet to take a screenshot of the WebElement ‘logo’ from the webpage.
// By.className accepts a single class name only, so match the combined classes with a CSS selector
WebElement logo = driver.findElement(By.cssSelector(".col-sm-12.col-md-4.fl-page-header-logo-col"));
File srcFile = logo.getScreenshotAs(OutputType.FILE);
File destinationFile = new File("logo.png");
FileUtils.copyFile(srcFile, destinationFile); // org.apache.commons.io.FileUtils
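The snippet above relies on FileUtils from Apache Commons IO, which is an extra dependency. If you prefer to stay with the JDK, the copy step can be sketched with java.nio.file as follows. This is a minimal sketch: ScreenshotSaver and its save method are hypothetical names, and the temporary file in main only stands in for the file returned by getScreenshotAs(OutputType.FILE).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ScreenshotSaver {

    // Copies the captured file into targetDir under fileName, creating the
    // directory if needed and overwriting any earlier file with the same name.
    public static Path save(Path capturedFile, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir);
        Path destination = targetDir.resolve(fileName);
        return Files.copy(capturedFile, destination, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for: logo.getScreenshotAs(OutputType.FILE).toPath()
        Path captured = Files.createTempFile("logo", ".png");
        Path targetDir = Files.createTempDirectory("screenshots");

        Path saved = save(captured, targetDir, "logo.png");
        System.out.println("Saved to " + saved.toAbsolutePath());
    }
}
```

Copying the capture to a location you control right away is a sensible habit, since the file handed back by the driver is typically a temporary one.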
Example: Let us get the location, coordinates, height, and width of the logo element.
WebDriverManager.chromedriver().setup();
WebDriver driver = new ChromeDriver();
driver.navigate().to("https://moduscreate.com/");
// By.className accepts a single class name only, so use a CSS selector for the combined classes
WebElement logo = driver.findElement(By.cssSelector(".col-sm-12.col-md-4.fl-page-header-logo-col"));
Rectangle rect = logo.getRect();
System.out.println("Height is " + rect.getHeight());
System.out.println("Width is " + rect.getWidth());
System.out.println("Location X is " + rect.getX());
System.out.println("Location Y is " + rect.getY());
4. Multiple Tabs and Windows Handling
Selenium 4 lets you open multiple tabs/windows along with the existing ones. You can use the following commands to achieve this:
driver.get("https://www.google.com/");
// Opens a new tab and switches focus to it
driver.switchTo().newWindow(WindowType.TAB);
driver.navigate().to("https://www.moduscreate.com/");
Similarly, you can also open multiple windows using these commands:
driver.get("https://www.google.com/");
// Opens a new browser window and switches focus to it
driver.switchTo().newWindow(WindowType.WINDOW);
driver.navigate().to("https://moduscreate.com/work/");
5. Selenium Grid with Observability
- Selenium Grid has a more user-friendly UI and comes with Docker support, which helps spin up the containers. You can also deploy the Grid on a Kubernetes cluster on AWS, GCP, or Azure.
- Grid is designed to run as a fully distributed system and exposes traces, metrics, and logs, which makes its behavior easier to understand and debug.
6. Selenium IDE TNG (The Next Generation)
- The new Selenium IDE has a fresh UI and lets us run scripts on any browser.
- You can now specify more than one locator for an element (backup locators). When a locator does not find the element, the IDE falls back to the alternative locators. Tests that previously failed because a single locator broke are less likely to fail.
- The IDE comes with a command-line tool, the SIDE runner (Selenium IDE runner), that lets testers run ‘.side’ projects on the Node.js platform. You can also use this runner for cross-browser tests; it reports the execution time and the passed and failed statuses.
- It has an export option to export the code in Java, .NET, Python, Ruby, C#, and JavaScript.
- It follows accessibility guidelines: it supports controls such as roles, tooltips, and focus order, announces the start of recording, and pays attention to color and design.
- You can now use improved control-flow commands, such as if/else conditions and loops, to control the execution order of the statements.
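The backup-locator behavior described above boils down to trying locators in order until one matches. The following is a minimal sketch of that idea, not the IDE's actual implementation; FallbackLocator, findFirst, the locator strings, and the map standing in for a page are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class FallbackLocator {

    // Tries each locator in order and returns the first element that the lookup finds.
    public static <T> Optional<T> findFirst(List<String> locators,
                                            Function<String, Optional<T>> lookup) {
        for (String locator : locators) {
            Optional<T> found = lookup.apply(locator);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for a page: only the CSS locator currently matches an element.
        Map<String, String> page = new HashMap<>();
        page.put("css=.submit-btn", "<button class=\"submit-btn\">");

        List<String> locators = Arrays.asList("id=submit", "css=.submit-btn", "xpath=//button");
        Optional<String> element = findFirst(locators, l -> Optional.ofNullable(page.get(l)));

        System.out.println(element.orElse("not found"));
    }
}
```

Ordering matters in such a scheme: putting the most stable locator first keeps the fallback path for genuine breakages rather than everyday runs.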
Image source: selenium.dev
Selenium IDE will soon be available as a standalone app, rewritten to be an Electron app that allows us to listen from the browser, making test recordings even more powerful.
Conclusion
Selenium is the most popular and preferred open-source tool for writing test scripts in different languages (Ruby, Python, Java, C#, etc.). There are a large number of active Selenium communities to address issues. IDE makes it easier for the testers to automate the test cases with limited programming knowledge. WebDriver makes cross-browser testing more effective worldwide. You can reuse the test suites and test across multiple browsers & operating systems. With the regrouping/refactoring of test scripts, you can make quick changes that improve maintainability. Selenium Grid enables us to run tests in parallel, which saves time and facilitates faster releases. As per the data, 55,514 companies use Selenium. In addition to this, Selenium 4 is available with enriched features such as Relative Locators, Chrome DevTools, Multiple Tabs/Windows Handling, Screenshot for Elements, new IDE improvements, and Grid enhancements that make automation projects more successful.
This post was published under the Quality Assurance Community of Experts. Communities of Experts are specialized groups at Modus that consolidate knowledge, document standards, reduce delivery times for clients, and open up growth opportunities for team members.
Durga Sundarraj