A question that comes up often, especially from folks who are new to Selenium, is which best practices to follow when choosing a locator for an element. In other cases, people just have bad habits and don’t even think to ask this question, but they run into instability or maintenance problems without realizing that they aren’t using locators effectively.
For the sake of clarity, a locator is the combination of the type and the value of the By clause which Selenium WebDriver uses in the FindElement and FindElements methods. As you probably know (given you’re familiar with Selenium), Selenium supports the following types of locators: Id, Name, ClassName, LinkText, PartialLinkText, TagName, XPath and CssSelector. While most of these locators take a simple text value, the XPath and CssSelector locators take strings that must follow their own dedicated syntax. You can learn what you need about XPath and CSS Selectors on the W3Schools website.
Note that while this article is focused on Selenium, the same basic rules apply to almost any UI automation technology, and even other types of test automation, as I’ll discuss below.
So, I’ll get right to the point, and later I’ll give explanations, examples and additional tips. Here we go:
The 4 rules for choosing the best locators
These are my 4 rules for choosing a locator:
- The locator must match the desired element
- The locator must not match any other element
- Avoid depending on information that is likely to change
- Depend on the minimal necessary information
Important: the order of these rules is significant! While all the rules are important, the first ones are more important than the later ones.
Here’s an in-depth explanation of each of these rules.
The Locator Must Match the Desired Element
This rule goes without saying, but it has to be stated: if the locator does not match the desired element (or elements), then it’s of no use…
The Locator must not Match Any Other Element
This rule is also pretty trivial. However, I’ll break it down to two cases:
- If we’re looking for a single element (using FindElement), then it means that the locator must be unique for the desired element. In other words, it should match exactly one element which is the element that we’re interested in, and no other element. Note that if multiple elements match the specified locator, Selenium returns the first one it encounters. While this may happen to be the element that you want, it is very unwise to depend on this fact, as the assumption that it will always be this way is very fragile.
- If we’re looking for multiple matching elements (using FindElements), then it means that the locator should match all of the elements we’re interested in, but no other.
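The uniqueness requirement can be checked mechanically. The sketch below is only an illustration: it runs against a small hand-written page fragment using Python's standard-library xml.etree.ElementTree rather than a real browser, and the markup, ids and the count_matches helper are all made up for the example. With Selenium, the same check would compare the number of elements returned by FindElements with the number you expect.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: two buttons share the class "btn".
PAGE = ET.fromstring("""
<body>
  <form id="loginPanel">
    <button id="loginButton" class="btn">Log in</button>
    <button id="cancelButton" class="btn">Cancel</button>
  </form>
</body>
""")


def count_matches(root, xpath):
    """Return how many elements under root match the XPath."""
    return len(root.findall(xpath))


ambiguous = ".//button[@class='btn']"      # breaks rule 2: matches both buttons
unique = ".//button[@id='loginButton']"    # matches exactly one element

print(count_matches(PAGE, ambiguous))  # 2
print(count_matches(PAGE, unique))     # 1

# A single-element lookup silently returns the first match in document order.
print(PAGE.find(ambiguous).get("id"))  # loginButton
```

The last line shows why the ambiguous locator is risky: it happens to return the login button today, but only because that button comes first in the markup.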
Avoid depending on information that is likely to change
This rule is the trickiest one, but it is also what makes the big difference between locators that are unreliable or hard to maintain and locators that are reliable and easy to maintain. The tricky part is determining which pieces of information are more likely to change than others. But don’t worry, I’ll give you some guidelines for this too. But first, let me clarify a few points.
The first point I want to clarify is what “information” we are talking about. For the simple locators (e.g. By.Id, By.ClassName, etc.), there’s only a single piece of information. For example, for the By.ClassName("myClass") locator, the only piece of information is that the element has a class myClass. If this specific piece of information changes, then the locator will no longer be valid. However, for the more complex locators (i.e. XPath and CssSelector), the value is typically composed of multiple pieces of information. If any of these pieces of information changes, the locator will no longer be valid. For example, the locator By.XPath("//*[@id='ember6180']/div[2]/h3") relies on the following pieces of information:
- The element has a tag name h3
- Its direct parent is a div
- Its direct parent is the 2nd div child of the grandparent (of the h3 element)
- Its grandparent has an id attribute with value ember6180
The second thing I want to clarify is the context for the likelihood for something to change: A piece of information can change:
- Between runs or instances of the application
- Between different environments
- Between builds or versions of the application
Any of the above events may cause a piece of information to change and make the locator invalid.
Ok, here are the guidelines for identifying pieces of information that are less likely to change than others:
Avoid depending on auto-generated values
While in many cases depending on Ids is a good thing (as you’ll soon see), there are cases in which Ids (or any other attribute, for that matter) are not a good choice. These are the cases in which the value of the attribute is automatically generated. You can usually tell that a value is generated because it contains meaningless random numbers and/or letters. For example, depending on an id such as button-1d3f542 is not a good idea, because this random number is likely to change. If it changes between runs, you’ll notice it very early, as the test will probably fail the second time you run it. But if it changes only between builds, or when a developer changes that particular page, then it will take longer until you notice it.
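To make the failure mode concrete, here is a small sketch that simulates two runs of the same page in which the generated id suffix differs. It uses Python's standard-library xml.etree.ElementTree on hand-written markup instead of a browser; the second id value and the name attribute are assumptions invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical renderings of the same page: the id suffix is regenerated.
RUN_1 = ET.fromstring(
    '<body><button id="button-1d3f542" name="login">Log in</button></body>'
)
RUN_2 = ET.fromstring(
    '<body><button id="button-9a0c7e1" name="login">Log in</button></body>'
)

by_generated_id = ".//*[@id='button-1d3f542']"   # copied from the first run
by_meaningful_name = ".//button[@name='login']"  # describes what the element is

for label, run in (("run 1", RUN_1), ("run 2", RUN_2)):
    found_by_id = run.find(by_generated_id) is not None
    found_by_name = run.find(by_meaningful_name) is not None
    print(label, found_by_id, found_by_name)
# run 1 True True
# run 2 False True
```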
Prefer depending on meaningful pieces of information
In contrast, an Id with value loginButton is very unlikely to change, as this value conveys the meaning of the element. As long as this meaning corresponds to what the element actually represents, it’s unlikely that someone will change it. While technically class names are not guaranteed to be unique (as opposed to Ids, though this guarantee is also weak), if the element has a class loginButton, it’s still pretty safe to use it as the locator, as it’s very unlikely that there will be another button on the page with the same class name.
If the login button doesn’t have a meaningful Id, class name, etc., but it is contained in a noticeable panel in the UI that contains all of the login elements, and that panel has a meaningful attribute attached to it, then the fact that the login button is contained inside the login panel is also meaningful, and so the XPath //*[@id='loginPanel']//button can still be useful.
Avoid depending on technical details
In contrast to the last example, if there are 2 div elements in the DOM hierarchy between the loginPanel element and the button, which may only be used for styling or layout, then the purpose of these div elements is more technical, and therefore depending on their existence and their exact order is not advisable (e.g., //*[@id='loginPanel']/div/div/button).
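The difference between the two XPath expressions shows up as soon as the layout changes. The following sketch compares them before and after one wrapper div is removed. It is an illustration only, run with Python's standard-library xml.etree.ElementTree on invented markup; ElementTree needs the expressions to start with a dot (.//), whereas in Selenium you would pass the //-prefixed form to By.XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup before and after a developer removes one layout wrapper.
BEFORE = ET.fromstring("""
<body>
  <section id="loginPanel">
    <div><div><button>Log in</button></div></div>
  </section>
</body>
""")
AFTER = ET.fromstring("""
<body>
  <section id="loginPanel">
    <div><button>Log in</button></div>
  </section>
</body>
""")

# Depends on the exact chain of layout divs (a technical detail).
brittle = ".//*[@id='loginPanel']/div/div/button"
# Depends only on "a button somewhere inside the login panel".
robust = ".//*[@id='loginPanel']//button"

for label, page in (("before", BEFORE), ("after", AFTER)):
    print(label, len(page.findall(brittle)), len(page.findall(robust)))
# before 1 1
# after 0 1
```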
Depend on the minimal necessary information
Lastly, even if we depend only on meaningful information that is less likely to change, we’d like to depend on the minimum information that still matches only our desired element(s). For example, if we have both an Id and a class that are meaningful, depending on both of these pieces of information (as in a CSS Selector with value #loginButton.login-button-style) only makes your code more fragile. Even though the chance of each of these pieces of information changing is low, combining them means that when either one of them eventually changes, the locator will no longer be valid. Clearly, if we depend only on a single piece of information (e.g. Id), it can also change at some point, but the chances are lower than if we depend on two or more pieces of information.
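As a sketch of this rule, the example below checks an over-specified locator and a minimal one against two versions of a page in which only the styling class was renamed. It uses Python's standard-library xml.etree.ElementTree on invented markup. The XPath with two predicates is only an approximation of the CSS selector #loginButton.login-button-style (it compares the whole class attribute), and the new class name is an assumption made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical versions of the page: only the styling class was renamed.
VERSION_1 = ET.fromstring(
    '<body><button id="loginButton" class="login-button-style">Log in</button></body>'
)
VERSION_2 = ET.fromstring(
    '<body><button id="loginButton" class="primary-button">Log in</button></body>'
)

# Two pieces of information: either one changing invalidates the locator.
over_specified = ".//*[@id='loginButton'][@class='login-button-style']"
# One piece of information that is already unique on its own.
minimal = ".//*[@id='loginButton']"

for label, page in (("v1", VERSION_1), ("v2", VERSION_2)):
    print(label, len(page.findall(over_specified)), len(page.findall(minimal)))
# v1 1 1
# v2 0 1
```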
Conclusions and Other Tips
Once we understand the above rules, we can discuss some related best practices and see how they fit nicely into these rules.
Avoid using the XPath and CSS selectors that the Dev Tools suggests
All of the rules except the third one are pretty technical, and those are more or less the rules that Chrome follows when you choose “Copy XPath” or “Copy selector” from the Dev Tools (F12). However, because Chrome has no sense of what’s meaningful and what’s not, it cannot apply the 3rd rule, which is the one that makes the difference in terms of maintainability and stability. There are other tools that claim to do a better job, often using some kind of “AI” technology. I don’t have real experience with those tools, but I believe that while they may do a somewhat better job, they cannot replace your judgement: in order to determine what’s “meaningful” you also have to understand the business domain of your application, and that’s something that I doubt any AI tool will do in the near future.
Work together with the developers and decide on the locators’ strategy together
In fact, I wanted to put this rule first because I strongly believe that it has many other benefits. However, in order to keep things to the point, the above rules apply even in cases where collaborating with the developers is not as easy as it should be. And technically, this rule stems from the above 4 rules and not the other way around.
Anyway, if you can ask the developers to add some kind of identifier to the elements that you need, or ask them to let you add these identifiers yourself, then you can use these identifiers as your locators, and it is very easy to ensure that they adhere to all of the 4 rules. Generally, there are 3 ways to add these identifiers:
- Adding an “id” attribute – this is the most straightforward way. Ids should be unique across the page (though the browsers don’t really enforce that), and are very easy to use from Selenium. However, Ids have two drawbacks:
  - Some elements are generated at run-time and can have multiple instances of them. For example, elements inside a list are repeated for every item in the list and therefore a fixed Id cannot identify a single item uniquely.
- Adding a dedicated class name for each element. For example, you can decide together that the automation developer can add class names starting with “auto-” (e.g., auto-login-button) to any element. The developers should not have a reason to touch these class names as they are dedicated only for the automatic tests.
- Adding a special attribute for the automation. For example automationId=’login-button’.
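As an illustration of the dedicated-attribute option, the sketch below looks elements up by an automationId attribute, including elements that repeat once per list item. It runs on hand-written markup with Python's standard-library xml.etree.ElementTree; the attribute values and the by_automation_id helper are hypothetical. In Selenium, a comparable lookup would use an attribute selector such as [automationId='checkout-button'] with By.CssSelector.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the id is generated and the classes are for styling only,
# but every element of interest carries a dedicated automationId attribute.
PAGE = ET.fromstring("""
<body>
  <ul id="cart">
    <li><span automationId="cart-item-name">Apple</span></li>
    <li><span automationId="cart-item-name">Pear</span></li>
  </ul>
  <button id="btn-7f3a" class="btn primary" automationId="checkout-button">Checkout</button>
</body>
""")


def by_automation_id(context, value):
    """Return all elements under context whose automationId equals value."""
    return context.findall(f".//*[@automationId='{value}']")


checkout = by_automation_id(PAGE, "checkout-button")
item_names = [element.text for element in by_automation_id(PAGE, "cart-item-name")]

print(len(checkout), checkout[0].text)  # 1 Checkout
print(item_names)                       # ['Apple', 'Pear']
```

Unlike a fixed id, the same attribute value can legitimately appear on every list item, so it works for both the single-element and the multiple-element case.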
The goal of the above rules is to make the locators as robust as possible and to reduce the chance that we’ll need to change them. However, from time to time things will need to change, and therefore we should also take care to make the locators easy to change in the future, or in other words, to make them more maintainable.
One of the most important techniques for making your code more maintainable is to avoid or remove duplication. Whenever we have duplication and the duplicated piece of information changes, we need to update it in multiple places, which is more error prone. If the project is big, it’s probably hard to find all of the references to that piece of information, and we may miss a few of them.
With regard to locators, duplication is most significant when using XPath or CSS Selectors to search for multiple elements that reside in a common container. Expanding on the previous example, suppose that the login panel has 3 elements in it: username, password and the login button. Suppose that these elements don’t have Ids and we use XPath to locate them; then the beginning of the XPath of the 3 elements would probably be identical. If at some point the Id of the panel changes (or the elements are moved to another panel or window), then all 3 of these locators have to be changed.
In order to remove this duplication we have 2 options:
- Extract the common part of the XPath into a constant string
- Search for the panel element first and then look for each of the inner elements by invoking FindElement on the WebElement object of the panel
FindElement within another Element
While the first option is technically simpler, the second one makes the code more modular and improves encapsulation (which is another property of maintainable code). This technique is most useful when implementing the Page Object model pattern properly.
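The second option can be sketched as a small page-object-style class that locates the panel once and resolves every inner element relative to it. This is only an illustration using Python's standard-library xml.etree.ElementTree on invented markup; the LoginPanel class and its method names are hypothetical. In Selenium, the equivalent is calling FindElement on the panel's WebElement instead of on the driver.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two panels that both contain inputs and a button.
PAGE = ET.fromstring("""
<body>
  <div id="searchPanel">
    <input name="query"/>
    <button>Search</button>
  </div>
  <div id="loginPanel">
    <input name="username"/>
    <input name="password"/>
    <button>Log in</button>
  </div>
</body>
""")


class LoginPanel:
    """Page-object-style wrapper: the panel locator lives in exactly one place."""

    PANEL_XPATH = ".//*[@id='loginPanel']"

    def __init__(self, page):
        self.root = page.find(self.PANEL_XPATH)
        if self.root is None:
            raise LookupError("login panel not found")

    # Inner locators are relative to the panel, so none of them repeats its id.
    def username(self):
        return self.root.find(".//input[@name='username']")

    def password(self):
        return self.root.find(".//input[@name='password']")

    def login_button(self):
        return self.root.find(".//button")


panel = LoginPanel(PAGE)
print(panel.username().get("name"))  # username
print(panel.login_button().text)     # Log in
print(PAGE.find(".//button").text)   # Search (a page-wide search finds another button)
```

If the panel's id changes, only PANEL_XPATH needs updating; the three inner locators stay as they are.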
One important tip regarding this technique: I often see people create their own “framework” that wraps Selenium and provides methods that are seemingly easier to use, like void ClickOnElement(By by). The problem with such methods is that they always look for the element on the entire page (i.e. invoke FindElement on the WebDriver object), and lose the ability to look for elements within other elements. So if you design a framework, beware of that mistake.
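One way to avoid that mistake is to let the helper accept the search context as a parameter, so callers can pass either the whole page or a containing element. The sketch below shows the idea with Python's standard-library xml.etree.ElementTree on invented markup; the button_text helper is hypothetical and simply stands in for a wrapper such as ClickOnElement. In Selenium's .NET bindings, both IWebDriver and IWebElement implement ISearchContext, so a wrapper can take an ISearchContext argument in the same spirit.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two panels that each contain a button.
PAGE = ET.fromstring("""
<body>
  <div id="signupPanel"><button>Sign up</button></div>
  <div id="loginPanel"><button>Log in</button></div>
</body>
""")


def button_text(context, xpath=".//button"):
    """Look up xpath relative to context, which may be the page or any element."""
    element = context.find(xpath)
    if element is None:
        raise LookupError(f"no element matches {xpath!r}")
    return element.text


login_panel = PAGE.find(".//*[@id='loginPanel']")

print(button_text(PAGE))         # Sign up (page-wide search returns the first button)
print(button_text(login_panel))  # Log in (search scoped to the login panel)
```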
Applying these rules outside of Selenium
As mentioned above, if you’re writing test automation using other UI automation technologies, like Ranorex, Coded UI, etc., the same rules still apply. Ranorex, for example, also uses the concept of XPath, which you can edit, and you can build the Elements Repository in a nested and modular way to remove duplication. In Coded UI there’s no concept similar to XPath, but you can select and edit the SearchProperties of each element in the hierarchy. Unfortunately, if you’re using UIMaps, then you don’t have much control over the hierarchy of the elements, but if you don’t use UIMaps and write the lookups in code, you can simply look only for the relevant elements in the hierarchy, and Coded UI will search for them recursively. You can use the TestAutomationEssentials.CodedUI project to help you achieve this more easily.
But these rules apply even to API testing, or to anything that has to look for data. This can be when using XPath to query XML data, using JSONPath to query JSON, or even when using SQL to query relational data.