
Top 10 Mobile Application Testing Automation Tool Requirements

It’s no secret that mobile applications, native, hybrid and web based, are in high consumer demand. The world is moving from the desktop to mobile devices. Tablets are projected to outsell PCs by Q3 2012 and mobile devices outshipped PCs in 2011. Mobile devices and tablets are the technology of the future. If your organization is not already invested in mobile applications, there’s a strong chance you will be in the near future.

Our clients asked us for solution recommendations and requirements to tackle the task of automating test cases on mobile applications and devices. Northway Solutions Group developed this unbiased mobile application testing automation tool requirements checklist which we believe should be used as a guide for evaluating mobile automation tools in the market. Taking an objective approach, this checklist consists of the most important and relevant feature requirements and selection criteria to assist your organization in choosing an adequate automated mobile testing tool with the highest ROI.

1) No jailbreaking or rooting devices

Before I underscore the importance of not utilizing jailbroken or rooted devices, let’s review what these terms mean (technically):

“Android rooting is the process of allowing users of smartphones, tablets, and other devices running the Android mobile operating system to attain privileged control (known as “root access”) within Android’s subsystem. Rooting is often performed with the goal of overcoming limitations that carriers and hardware manufacturers put on some devices, resulting in the ability to alter or replace system applications and settings, run specialized apps that require administrator-level permissions, or perform other operations that are otherwise inaccessible to a normal Android user.”
http://en.wikipedia.org/wiki/Android_rooting

“iOS jailbreaking is the process of removing the limitations imposed by Apple on devices running the iOS operating system through the use of hardware/software exploits – such devices include the iPhone, iPod touch, iPad, and second generation Apple TV. Jailbreaking allows iOS users to gain root access to the operating system, allowing them to download additional applications, extensions, and themes that are unavailable through the official Apple App Store. Jailbreaking is a form of privilege escalation, and the term has been applied to privilege escalation on other computer systems as well.”
http://en.wikipedia.org/wiki/IOS_jailbreaking

Simply put, jailbreaking or rooting renders the device in a state other than the operating system engineers or device manufacturers intended. The end user, and any application that runs within a jailbroken/rooted device, has escalated privileges to manipulate the system outside of supported methods.

In the case of iOS devices, jailbreaking voids the warranty. Unauthorized modification of iOS has been a major source of instability, disruption of services, and other issues. In addition, jailbreaking an iPad is a direct violation of the DMCA.

In the case of rooted Android devices, rooting may void the warranty. Just as with iOS, you should treat it as probable (or at the very least possible) that: 1) stability and performance related issues will occur, and 2) rooting a tablet violates the DMCA.

DMCA exemptions change every three years. The latest exemptions made jailbreaking/rooting mobile phones legal but did not cover tablets (effectively rendering jailbreaking/rooting tablets illegal). In another three years, DMCA exemptions will change. Will jailbreaking/rooting mobile phones remain legal? Will jailbreaking/rooting tablets remain illegal? Are you willing to place a bet (in terms of finances and internal resources) on the outcome?

Testing an application in an environment that differs from the overwhelming majority of end users – an environment that is known to cause stability issues and in many cases directly violates the DMCA – has been deemed unacceptable by our clients and fails to meet testing best practices. Brian Copeland recently authored a great article on mobile testing challenges related to jailbreaking and rooting, and the importance of testing applications in a way that replicates the end user’s environment.

When your testing effort relies on jailbreaking/rooting for automation, you have made the availability of testable device and OS combinations dependent on the hacking community. Jailbreaking and rooting is a “hack”, period. iOS 6 was released on September 19th, 2012, followed by iOS 6.0.1 on November 11th, 2012 and, at the time this article was published, there was still no jailbreak available for the “New iPad” (iPad generation 3 and 4). This means that any automated mobile testing solution that relies on hacked devices cannot test the latest and greatest Apple products.

A majority of the automated mobile testing solutions out there use jailbroken/rooted devices (Perfecto Mobile or “UFT Mobile”, Zap-fix, etc.). Out of the remaining solutions which do not circumvent OS level security measures, the most mature solutions involve instrumentation (like Jamo Solutions M-eux Test). Instrumentation is the Apple approved method for mobile UI testing and automation.

After digesting all of this information, you may be asking – can we afford to invest in technologies that rely on hacked and potentially illegal mobile devices? If your answer is “Yes, we believe that our automated testing effort should focus on jailbroken/rooted devices,” then consider the cost savings in a balanced approach of utilizing emulators and simulators and a manual testing effort with real (stock) devices. The ROI is significantly greater.

2) True object recognition

If you are familiar with the HP UFT (Unified Functional Testing) or HP QTP (QuickTest Professional) product for functional test automation, then you already understand the importance and benefits of true object recognition over bitmap, OCR (Optical Character Recognition) and/or coordinate based object mapping. For those that are not familiar with the object recognition capabilities of HP UFT, let’s define and discuss this feature.

Every application with a GUI (Graphical User Interface) layer is comprised of UI (User Interface) elements (objects). In the example of a web based application, the following UI elements are objects: buttons, text boxes, selection lists, radio groups, images, etc. Every UI object has a set of properties that can be used to identify, define or validate the object. A text box has an HTML tag (“input”) and a variety of other properties and attributes that uniquely define it: id, class, enabled (true/false), width, height, X/Y coordinates, and the list goes on.

HP UFT utilizes the UI element properties to uniquely define the object for recognition and in checkpoints. If an object is defined by its class and id (or another set of properties that remain mostly static), then the position of the element is irrelevant; the UI element can be identified on the screen if it moves around or even if it is not visible. This allows for high reusability and low-cost script development and maintenance.
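To make the idea concrete, here is a minimal Python sketch of property based lookup. It is not how UFT is implemented; the `UIElement` class, the `find_by_properties` helper and the two sample layouts are all hypothetical and exist only to show that an element defined by its tag and id is still found after it moves or is hidden.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    """Hypothetical stand-in for a UI object and a few of its properties."""
    tag: str
    id: str
    css_class: str
    x: int
    y: int
    visible: bool = True


def find_by_properties(elements, **props):
    """Return the first element whose attributes match every given property."""
    for element in elements:
        if all(getattr(element, name) == value for name, value in props.items()):
            return element
    return None


# The same screen in two orientations (made-up coordinates).
portrait = [
    UIElement("input", "username", "text-field", x=40, y=120),
    UIElement("button", "login", "primary", x=40, y=300),
]
landscape = [
    UIElement("input", "username", "text-field", x=200, y=60),
    UIElement("button", "login", "primary", x=420, y=60, visible=False),
]

# One lookup works for both layouts because position is never consulted.
for screen in (portrait, landscape):
    button = find_by_properties(screen, tag="button", id="login")
    print(button.id, button.x, button.y, button.visible)
```

The lookup never consults `x`, `y` or `visible`, which is why the same line of script works in both orientations.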

Bitmap, OCR and X/Y Coordinate Based Object Mapping

Bitmap based mapping (recognition) relies on a pre-defined image to identify the UI element. The scripting tool literally takes a snapshot of the screen and the automation engineer crops the image to better define it.

If the object context changes, the “snapshot” must be redefined. Think about how the mobile device keyboard layout and dimensions change when the device is rotated from portrait mode to landscape. The dimensions of the “snapshot” change, as does the orientation. Multiple scripts or additional logic must be developed in order to support the two presentation modes.

What if the object has a layer of transparency and the background rendering (image or text) changes dynamically? The “snapshot” now becomes dynamic and may not be suitable for bitmap based object mapping.

What if the object is not visible? It cannot be defined.

These are all basic examples but they underscore the challenges with bitmap based recognition.

OCR (Optical Character Recognition) technology relies on related text displayed in the application, which is screen scraped to help map the object. If the related text changes, moves or is removed entirely, the object is very difficult (or impossible) to identify consistently. OCR is slow and can be very inaccurate.

Coordinate based mapping relies on predefined X/Y axis coordinates. If the coordinates of the object change, they must be adjusted to properly identify the object. Some of the same challenges in bitmap mapping also apply to coordinate based mapping.
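The fragility of coordinate based mapping is easy to demonstrate with a small Python sketch. The `Box` class, the `element_at` helper, the layouts and the recorded tap position below are all made up for illustration; no real tool is being modeled.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical on-screen rectangle occupied by a named element."""
    name: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x, y):
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def element_at(layout, x, y):
    """Return the name of the element under the point, or None if nothing is there."""
    for box in layout:
        if box.contains(x, y):
            return box.name
    return None


portrait = [Box("username", 40, 120, 240, 40), Box("login", 40, 300, 240, 50)]
landscape = [Box("username", 200, 60, 200, 40), Box("login", 420, 60, 120, 40)]

# A tap recorded while the device was in portrait mode.
RECORDED_TAP = (100, 320)

print(element_at(portrait, *RECORDED_TAP))   # login
print(element_at(landscape, *RECORDED_TAP))  # None
```

The recorded tap lands on the login button in portrait mode and on empty space after rotation, so the script needs a second set of coordinates for every layout it must support.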

Marketing word play

Some companies use marketing word play to disguise the underlying object mapping technology. Just a few of these marketing terms include “native analysis”, “visual screen analysis” and “hybrid analysis”. I have even seen companies use the exact phrases “object level recognition” or “object based scripting” interchangeably with complete disregard for the understood meaning in the automation community.

None of these terms relate to true, object level recognition as known in the UFT/QTP community. It is important to be fully aware and educated on the technical implementation of the underlying object recognition technology and how it will impact the scripting effort.

Putting it all together

Using bitmap, OCR or coordinate based object mapping means that the underlying object properties, which are set by developers, are not accessible. This leads to very low reusability across devices and high-cost script maintenance and development. An overwhelming majority of mobile automation tools use these technologies (in addition to jailbroken/rooted devices).

An automated mobile testing tool must be able to access developer defined object properties, which are used to identify and validate the state of objects. This is the most efficient means of object recognition in automation, and widely accepted as best practice. Other means of identifying objects should be used only when true object recognition does not satisfy automation requirements or is technically unavailable (i.e., custom objects).

3) Integration with existing Integrated Development Environments (IDE)

As a professional automation engineer or developer, it is very likely that you already know HP UFT, Visual Studio or Eclipse. You have skillsets in one or more of these tools and certifications and technical expertise specific to the scripting environment. The last thing an automation engineer wants is to have to learn a new IDE or language to start automating test cases against mobile applications.

An automation engineer should be able to utilize existing skillsets for automating mobile applications. This is the quickest way to begin automating the application under test, and the most efficient. Learning a new IDE and/or language that is unique to a product leads to an increased barrier to entry and provides low reusability for learned skills.

In addition, if a mobile automation tool has the capability of integrating with multiple IDEs this means that multiple departments can utilize the same technology and skillsets. Automation engineers may be most familiar with UFT while developers will be more familiar with Visual Studio or Eclipse. Using the same underlying technology and mobile automation tool decreases the fragmentation of skillsets and leads to a lower-cost, decreased barrier to entry.

4) High reusability of scripts

Scripts must be reusable across devices and mobile operating system (OS) versions. This requirement is inherently related to the second requirement (true object recognition). Automation engineers should have the ability to create one script that plays back on any device running the same base OS, independent of the OS version (major/minor/build).

Marketing word play

Many solutions on the market claim to meet this requirement but clearly fail to do so optimally, which becomes apparent when evaluating the product (or unfortunately, after purchase). If you spend enough time studying the marketing material and technical documentation of various mobile testing solutions, you’ll see a lot of marketing word play in this area. Phrases like “object based scripting” and “cross OS scripting” are common. They all sound good, but in practice they probably don’t mean what you think they do.

Object based scripting usually means objects are mapped through bitmap images, x/y coordinates or OCR, similar to the way that Virtual Objects are created in UFT/QTP (requirement #2 discusses these technologies in greater detail). Yes, the end result is a test object in UFT/QTP, but the technical implementation is not utilizing best practices or true object based recognition.

Cross OS scripting sounds ideal, but it is usually achieved through a single function that contains logic to handle multiple operating systems and their differing “objects” (I use this term loosely as a “test object”). For example, one function is called that executes different lines of code depending on the operating system under test. This usually requires more effort than creating multiple automation scripts, and it certainly requires a reasonably skilled coder.
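In practice, that single function tends to look something like the Python sketch below. The function name, the platform strings and the logged actions are hypothetical and do not come from any particular vendor's API; the point is only that each operating system still gets its own branch.

```python
def tap_login(platform, action_log):
    """One logical test step, implemented with a separate branch per OS.

    The logged strings are illustrative placeholders, not real driver calls.
    """
    if platform == "ios":
        action_log.append("tap button named 'Log In'")
    elif platform == "android":
        action_log.append("tap button with text 'LOG IN'")
    else:
        raise ValueError(f"unsupported platform: {platform}")
    return action_log


# The caller sees one function, but every branch must be written and maintained.
print(tap_login("ios", []))
print(tap_login("android", []))
```

The script calling `tap_login` looks platform neutral, yet the amount of code to write and maintain grows with every operating system added.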

The reality

If your organization develops a mobile application that supports multiple mobile operating systems, I can almost guarantee the fact that the application does not behave the same across operating systems. Transitions between screens vary, as do the screen layouts and their objects, gestures, touch support, and potentially base application features.

You should expect to create scripts to handle multiple mobile operating systems by either creating multiple scripts or by creating functions with logic to handle each operating system independently. The exception to this rule is in web based applications (see requirement #6); since the root object (the browser) is common across all operating systems, a single script for a web based application should support cross-OS execution for (at least) the major vendors (iOS still covers over 50% of mobile web traffic with Android trailing behind). There are two exceptions: 1) when developers implement code that is browser dependent or, 2) server-side logic delivers browser specific code.

5) Physical device and emulator/simulator support

Recording and replaying scripts should be supported across physical devices and emulators/simulators.

OpenSignal reported approximately 4,000 distinct Android devices, representing very high levels of market fragmentation or (some argue) differentiation. iOS does not present the same challenge as there are fewer variations of hardware and operating system versions. The same fact is applicable in Windows (Mobile, CE, Phone), Blackberry and others as the market is not dominated by these devices and operating systems.

In mobile application testing, consider applying the 80/20 rule for test coverage, as you are likely doing for other applications. Develop test cases that support 80% of your prominent market share.

Does your organization have the time to invest in testing all of the various distinct device configurations? No – you will never finish a single testing cycle. Therefore, select 80% of the dominant market for test coverage based on the target market, devices and operating systems.
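One simple way to apply the 80/20 rule is to rank devices by market share and stop adding them once the running total reaches the target. The Python sketch below assumes you already have share figures for your own target market; the device names and percentages are invented placeholders, and `select_devices` is a hypothetical helper.

```python
def select_devices(market_share, target=80):
    """Pick the most common devices until their combined share reaches the target.

    market_share maps a device name to its share in whole percent.
    """
    chosen, covered = [], 0
    ranked = sorted(market_share.items(), key=lambda item: item[1], reverse=True)
    for device, share in ranked:
        if covered >= target:
            break
        chosen.append(device)
        covered += share
    return chosen, covered


# Invented figures for illustration only.
shares = {
    "Device A": 35,
    "Device B": 25,
    "Device C": 15,
    "Device D": 10,
    "Device E": 8,
    "Device F": 7,
}

devices, coverage = select_devices(shares)
print(devices, coverage)
```

With these made-up numbers, four of the six devices are enough to pass the 80% mark; the remaining two would add cost for relatively little extra coverage.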

Perfecto Mobile suggests that a typical organization may need 6-12 devices across the development and QA stages (see image below). This number includes 6-8 “must” have devices during development and ~12 “major” devices during QA testing.

This number may leave room for an overlap of physical devices and emulators/simulators. Does your development team currently have 6-8 unique physical devices? How many unique physical devices does the QA team currently have? Will an emulator/simulator suffice for part of the testing effort?

Truth be told, there is no de facto standard for the number of devices that mobile applications should be tested on; the number is dynamic and varies by organization. There needs to be a justified balance between cost and risk for your organization, both in terms of device selection for testing and device management. A decision must be made to determine if devices will be housed on-premise or in “the cloud” (private or public). The organization may determine that the cost is substantial in “the cloud”, while the cost and risk are minimal for an on-site (private) lab. Both present their own challenges and the topic deserves a detailed discussion and balanced cost/risk analysis.

6) Web application support

Hybrid and web based applications are increasing in popularity and are projected to replace native applications.

Native applications are applications developed specifically to run on one mobile operating system. This type of application utilizes the OS level user interface objects and is designed to utilize OS specific firmware. A native application built for iOS cannot be run in Android (without utilizing a 3rd party cross-platform development framework).

Web based applications are designed to run on any (mobile) browser, independent of the operating system. These are accessed through a URL from within the mobile device’s web browser. These applications are typically developed using the latest HTML5 and CSS3 standards.

Hybrid applications have both native and web based components to the application. A native application is downloaded which has embedded web based components. As little as one component or the entire meat-and-potatoes of the application can be web based. Some examples of hybrid applications include Facebook and Yelp.

All three types have their benefits and drawbacks for developers, testers and end users. Web based and hybrid applications are gaining ground because application code changes are delivered immediately; this is just one of many benefits. There is no need to download a native application update or, in the case of hybrid, the frequency is significantly reduced. This improves the end user experience.

The solution should support all types of applications, native, web based and hybrid, utilizing true object recognition (see requirement #2).

7) Data driving, screen capturing and standard reporting capabilities

Expect full integration with the scripting tool (IDE) to support standard automation best practices.

Data driving capabilities build on the high reusability requirement (#4) by allowing variable data sets and test scenarios to be executed from a single script. Without screen capturing and reporting capabilities (i.e., checkpoints), the solution is extremely limited and is not suited for test automation. These are no-brainers.
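For readers new to data driving, the Python sketch below shows the basic shape: one script body, many rows of data, and a result recorded per row. The `login` function is a hypothetical stand-in for the application under test, and the credentials are made up.

```python
def login(username, password):
    """Hypothetical stand-in for the application under test."""
    return username == "alice" and password == "s3cret"


# Each row is one scenario: inputs plus the expected outcome.
TEST_DATA = [
    {"username": "alice", "password": "s3cret", "expected": True},
    {"username": "alice", "password": "wrong", "expected": False},
    {"username": "", "password": "", "expected": False},
]


def run_data_driven(test_fn, rows):
    """Run the same test logic once per data row and collect a simple report."""
    results = []
    for row in rows:
        actual = test_fn(row["username"], row["password"])
        results.append({"inputs": row, "passed": actual == row["expected"]})
    return results


report = run_data_driven(login, TEST_DATA)
for entry in report:
    status = "PASS" if entry["passed"] else "FAIL"
    print(status, entry["inputs"]["username"] or "<empty>")
```

Adding a new scenario means adding a row to `TEST_DATA`, not writing another script. In a real tool the rows would typically come from a data table or spreadsheet and the report would include screen captures.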

8) Support for common interruptions and functionality

Text messages and phone calls are common interruptions for end users. These interruptions (on a physical device) during script execution should not cause the test to fail. The solution should handle common interruptions by allowing the script to continue execution once the interruption is handled (e.g. ignoring or accepting and ending the phone call).
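How a given tool handles interruptions is vendor specific, but the expected behavior can be sketched in a few lines of Python. Everything here is hypothetical: the `Interruption` exception, the `run_steps` helper and the simulated incoming call exist only to illustrate "handle the interruption, then resume the same step".

```python
class Interruption(Exception):
    """Raised by a step when a call, text message, etc. takes over the screen."""


def run_steps(steps, dismiss, max_retries=3):
    """Run each step in order; on an interruption, dismiss it and retry the step."""
    completed = []
    for step in steps:
        for _attempt in range(max_retries + 1):
            try:
                completed.append(step())
                break
            except Interruption as event:
                dismiss(event)
        else:
            # Every attempt was interrupted; give up instead of looping forever.
            raise RuntimeError(f"step {step.__name__} kept being interrupted")
    return completed


# Simulation: one incoming call arrives while the form is being submitted.
pending = ["incoming call"]
dismissed = []


def open_app():
    return "app opened"


def submit_form():
    if pending:
        raise Interruption(pending.pop())
    return "form submitted"


result = run_steps([open_app, submit_form], dismissed.append)
print(result)
print([str(event) for event in dismissed])
```

The interrupted step is retried rather than failed, and the retry limit keeps a device that is stuck on an alert from hanging the whole run.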

In addition, common functionality like gestures, multi-touch, swipe, pinch-zoom, drag and drop and alerts should be supported out of the box.

9) Manual or automatic (scheduled) execution

The solution should be able to execute test scripts at all hours, driven by a human or by scheduled execution. Who wants to manually kick off an automated regression test at 2am?

Integration with HP ALM (Application Lifecycle Management) supports this requirement.

10) Integration with performance testing tools

Research shows that poor application performance translates to lost revenue. The solution should be able to integrate with standard performance testing tools.

This is where HP UFT (QuickTest Pro and Service Test) and HP LoadRunner play nicely together. LoadRunner generates load on the backend service and UFT directly integrates with LoadRunner for execution on the mobile device.

In addition, the solution should be able to measure on-device resources such as RAM, CPU, battery and disk space utilization. Add network condition simulation through Shunra Network Virtualization, or a comparable product, and you have an all-around mobile performance testing solution.

Additional notes

We also have an article that explains the current mobile automation tool market and vendors. Understanding the tools available and differences between key vendors is critical in selecting the best mobile automation tool for your organization based on internally defined requirements.

Now, for our shameless plug – we have endorsed an automated mobile testing tool that effectively meets these requirements. Contact us for more details.

Your thoughts…

Do you agree or disagree with these requirements? We’d love to hear your feedback – please leave a comment below.
