Chapter 10 - Releasing software

Testing

Testers are usually involved at the end of failing projects.

Or, to put it another way, if you leave testing until the end of a project, it is too late.

In an architecturally led process, testing is part and parcel of development, since architectural integrity depends on things working well, and continuously. If you have stated a throughput of 10,000 transactions per second, you must identify the pieces of software and hardware which affect that throughput, and test that the requirement is met.

And long before the code is being assembled, tests must take place. For example:

Testing must be enmeshed with all other activities of development. If you build and release every day, then you must test every day. A tester will typically run an automated regression test to see if everything working yesterday is still working today.

As we are modelling, and building analysis and design models, it is reasonable to assume that we can also have a test model. Usually, this model is represented by a documented test plan and both manual and automated test scripts.

 

Figure 10.1 The test plan, the documented test model

The test plan must explain how each part of testing will be carried out. It may be a generic template, tweaked for each project, stating the following:

Many tests are carried out, with or without a formal statement that they are taking place.

Requirements must be given a MECE test. This means your requirements must be mutually exclusive, i.e. no requirement is the same or overlaps with another - if they overlap, refactor them. They must also be collectively exhaustive, i.e. they are complete; no gaps can be found. Doing so will provide a clean foundation to build the system upon. Unclear or incomplete requirements can lead to confusion within development, and ultimately within the product.

Analysis is tested as far as it can be. If use case models are produced, then each requirement is attached to a use case. Any hanging use cases or requirements mean either the analysis or use case model is incomplete.

Following design, testing will verify that the design model is a true reflection of the analysis, and where differences occur, the deviation is acceptable. Architectural review will ensure the design conforms to the envisioned or actual architecture, and testing then verifies that the design model is likely to fulfil the requirements.

During coding, two types of test are carried out. The first is to review the code, to ensure it follows coding standards, and that algorithms and methods identified during design are carried through into the code and penned effectively. The second type of test is to extract deliverables for testing. Each deliverable is tested in a prescribed manner, for example, a style sheet is tested one way, while a compiled library file another.

The major role of testing during analysis and design is to extract test information from the models and documentation. The output from this activity is test plans, specifying which tests will be carried out when, and the test scripts or automations to test the objects being created. Test scripts are lists of inputs versus expected state changes or outputs. Automating test scripts can save testers a lot of time retesting after small changes have been made.
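
As a minimal sketch of such an automated test script, assuming a pytest-based project and an invented calculate_discount function as the item under test, the inputs and expected outputs can be written directly as data:

# A minimal sketch of an automated test script: inputs versus expected outputs.
# calculate_discount and its discount rules are invented for illustration.
import pytest


def calculate_discount(order_value: float) -> float:
    """Item under test: 5% off orders of 100 or more, 10% off 500 or more."""
    if order_value >= 500:
        return order_value * 0.10
    if order_value >= 100:
        return order_value * 0.05
    return 0.0


@pytest.mark.parametrize(
    "order_value, expected_discount",
    [
        (50.0, 0.0),     # below the first threshold
        (100.0, 5.0),    # boundary of the 5% band
        (499.0, 24.95),  # just under the 10% band
        (500.0, 50.0),   # boundary of the 10% band
    ],
)
def test_discount(order_value, expected_discount):
    assert calculate_discount(order_value) == pytest.approx(expected_discount)

Once written, the whole script re-runs in seconds after each small change.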

Any bugs or problems discovered during testing are worked into the test scripts, so regression testing does not become a chore. Regression testing is re-testing an item against all the bugs found so far, as that item is debugged and re-developed.

Use case led testing

When development is led by first identifying the use cases of a system, and then mapping the paths through the use cases, the test scripts are derived from those use cases and pathways. If the use case structure is reflected in the test plan structure, then a casual observer will immediately see the test coverage from the user's point of view.
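
As a sketch of this, with an invented 'withdraw cash' use case and an Account class standing in for the system under test, the test structure can mirror the use case paths directly:

# Sketch: tests structured to mirror the paths of a hypothetical "withdraw cash"
# use case, so coverage is visible from the user's point of view.
# The Account class is a stand-in for the real system under test.

class Account:
    def __init__(self, balance: float):
        self.balance = balance

    def withdraw(self, amount: float) -> bool:
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


class TestWithdrawCashUseCase:
    def test_main_path_sufficient_funds(self):
        account = Account(balance=200.0)
        assert account.withdraw(50.0) is True
        assert account.balance == 150.0

    def test_alternate_path_insufficient_funds(self):
        account = Account(balance=20.0)
        assert account.withdraw(50.0) is False
        assert account.balance == 20.0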

Design led testing

A design will identify, at many levels of granularity, the parts which are to be constructed and compiled to form the complete system. In a generic design, properties and code are compiled into applications.

Figure 10.2 A software system is compiled from a granular level

As the granularity decreases from code to application, the focus of testing moves from detail to abstraction, from conformance to timings. Many arguments have been put forth for testing in this granular way to avoid problems later on. Others argue equally passionately that testing at this level can waste resources that would be better employed elsewhere; in many cases average performance is more than adequate, and testing small granules is not an effective use of resources. As always, a healthy balance between the two extremes must be identified, followed and periodically reviewed.

Testing can, and must, take place on each part of the software puzzle. The test team must see themselves as an integral part of the process of developing excellent software systems. In many cases, when the tests are peer reviews, the testers are the code writers themselves, and should consider themselves subcontracted and responsible to the test team while reviewing.

Software       Test
Application    UI, response, understandability, usability, security
Framework      Components and interaction, efficiency of calls through layers
Toolkit        Components and interaction
Component      Interfaces, object creation, limits, speed of execution
Class          Property limits, initialisation and destruction (time and ease)
Function       Black box input/output testing for efficiency and throughput
Property       Test limits
Code           Peer review of coding style and algorithms

Each set of tests builds upon the results of the lower level test. It requires a defined and articulated process to be able to do testing in this manner.
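
For instance, a class-level test from the table above might probe property limits and initialisation; a sketch, with an invented TemperatureSensor class, could look like this:

# Sketch of a class-level test from the table above: property limits and
# initialisation. The TemperatureSensor class is invented for illustration.
import pytest


class TemperatureSensor:
    MIN_C, MAX_C = -40.0, 125.0

    def __init__(self, reading_c: float = 0.0):
        self.reading_c = reading_c  # goes through the property setter below

    @property
    def reading_c(self) -> float:
        return self._reading_c

    @reading_c.setter
    def reading_c(self, value: float) -> None:
        if not (self.MIN_C <= value <= self.MAX_C):
            raise ValueError(f"reading {value} outside {self.MIN_C}..{self.MAX_C}")
        self._reading_c = value


def test_initialisation_default():
    assert TemperatureSensor().reading_c == 0.0


@pytest.mark.parametrize("value", [-40.0, 125.0])
def test_property_limits_accepted(value):
    assert TemperatureSensor(value).reading_c == value


@pytest.mark.parametrize("value", [-40.1, 125.1])
def test_property_limits_rejected(value):
    with pytest.raises(ValueError):
        TemperatureSensor(value)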

Furthermore, if testing is not carried out bottom up, then bugs will possibly be fixed higher in the code chain than where they really occur. Immediately, spaghetti code begins to form; arteries that were running clear and smooth are blocked up, and eventually the code has a heart attack. Call in the code surgeons.

The V model

The V model is a waterfall model, with a mirror image created for testing.

Figure 10.3 The V Model for software testing

The V model can imply that testing follows on from development if the horizontal axis is perceived as time. What the V model demonstrates clearly is how tests are built up in layers, and that the test stages have different objectives.

Types of test, and words used by testers

Testing jargon is every bit as bad as development jargon. You may hear about strange things called black box testing, white box testing, testing in the small, usability testing, integration testing, component tests and so on. Here’s what a few of them mean.

Occasionally, tests are divided into functional and non-functional tests, targeting functional and non-functional requirements respectively. Although the tests in a test plan may be divided this way, the testing methods themselves cannot be, except in a few cases, so they are all listed together here.

Alpha test

The product is nearing completion, and testers run all test scripts against the product, then feed it out to a small user group for simulated live testing.

Benchmarking test

To determine how a new version of software or hardware compares to previous versions, or how competing products measure up against each other.

Beta test

The product is released to a small group of users to identify and fix the last bugs.

In an environment where the users are working with the developers, alpha and beta tests are far less formal, although still required as they are part of a formal hand over process from development to live.

Black and white box testing

The terms black box and white box testing come from electronics. You cannot see inside a black box; you can only see its external interfaces. You can see into a white box and examine the code.

Black box testing

A missing function is the most obvious black box test failure. It’s there in the design, but missing in the component. Interface errors, errors in data structures or external access to a database are typical black box problems, as are initialisation and termination errors.

Black box testing can cover many areas, and depends on the use the box is intended for. Boundary analysis and syntax testing are the more obvious, where function calls are tested against the design. State transitions can be mapped, as can cause and effect.
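
A sketch of boundary analysis and syntax testing against a black box interface, using an invented parse_percentage function, might be the following; only the documented contract is exercised, never the internals:

# Sketch: black box boundary analysis and syntax testing of an exposed interface.
# parse_percentage is an invented example; only its documented contract is tested.
import pytest


def parse_percentage(text: str) -> int:
    """Documented contract: accept '0'..'100' as a string, return the integer."""
    value = int(text)  # raises ValueError on bad syntax
    if not 0 <= value <= 100:
        raise ValueError(f"{value} outside 0..100")
    return value


@pytest.mark.parametrize("text, expected", [("0", 0), ("1", 1), ("99", 99), ("100", 100)])
def test_boundaries_accepted(text, expected):
    assert parse_percentage(text) == expected


@pytest.mark.parametrize("text", ["-1", "101"])
def test_boundaries_rejected(text):
    with pytest.raises(ValueError):
        parse_percentage(text)


@pytest.mark.parametrize("text", ["", "ten", "5%"])
def test_bad_syntax_rejected(text):
    with pytest.raises(ValueError):
        parse_percentage(text)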

Other problems are not quite so obvious. Maybe the performance is down, or pushing the component too hard begins to introduce errors. This is when white box testing can help.

White box or glass box testing

When testing goes beyond the interface and into the code, it becomes white box testing. The differentiation may be simply that testers are only responsible for the black box test, and internal problems must be tested and resolved by the developer.

White box tests must be derived from a detailed design, and knowledge of the internal structure of the component.

A comprehensive white box test should exercise all statements, equations, branch decisions and data flows, and confirm that the pathways identified in the use cases are reflected in the code.
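
A sketch of white box tests, each written from knowledge of an internal branch of an invented shipping_cost function, might be the following; a coverage tool can then confirm every branch has been exercised (pytest with the pytest-cov plugin is assumed here, not prescribed):

# Sketch: white box tests derived from the internal branches of the code.
# shipping_cost is invented; each test targets one branch, and running
# `pytest --cov --cov-branch` (assuming pytest-cov is installed) confirms coverage.
import pytest


def shipping_cost(weight_kg: float, express: bool) -> float:
    if weight_kg <= 0:
        raise ValueError("weight must be positive")  # branch 1
    cost = 5.0 if weight_kg < 2.0 else 9.0           # branches 2 and 3
    if express:
        cost *= 2                                    # branch 4
    return cost


def test_branch_invalid_weight():
    with pytest.raises(ValueError):
        shipping_cost(0, express=False)


def test_branch_light_parcel():
    assert shipping_cost(1.0, express=False) == 5.0


def test_branch_heavy_parcel():
    assert shipping_cost(3.0, express=False) == 9.0


def test_branch_express_doubles_cost():
    assert shipping_cost(1.0, express=True) == 10.0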

Compatibility test

Tests how a software component performs under different operating systems, or on different hardware.

Component, unit or module test

To test a component, it needs a test cradle. This cradle can be a test script, or a dummy program built specifically to test a component's exposed interfaces.
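
A sketch of such a cradle, with an invented PricingComponent and a stub replacing its real tax-service dependency, might be:

# Sketch of a test cradle: a small driver program that exercises a component's
# exposed interface in isolation. PricingComponent and its interface are
# invented; a stub stands in for the real tax service it would normally call.

class StubTaxService:
    """Stands in for the real dependency so the component can be tested alone."""
    def rate_for(self, country: str) -> float:
        return 0.20  # fixed rate, chosen for predictable results


class PricingComponent:
    def __init__(self, tax_service):
        self._tax = tax_service

    def gross_price(self, net: float, country: str) -> float:
        return round(net * (1 + self._tax.rate_for(country)), 2)


def main() -> None:
    component = PricingComponent(StubTaxService())
    for net, expected in [(100.0, 120.0), (9.99, 11.99)]:
        actual = component.gross_price(net, "GB")
        print(f"net={net} expected={expected} actual={actual} "
              f"{'PASS' if actual == expected else 'FAIL'}")


if __name__ == "__main__":
    main()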

Component integration, link or testing in the small

Tests if a compiled component works as expected with other components.

End to end test

A system is tested from the inception to the destruction of its objects. For example, a job for an external client goes through a sequence of tasks: idea – proposal – sale – live – closed. The tester goes through the job, creating and completing each stage of the job from end to end.

Another meaning is used in integration testing where a call goes through an integration layer from one system to another, and information is returned to the caller.
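
Taking the job example above, an end to end test drives one object through every stage of its life; a sketch, with an invented Job class matching those stages, might be:

# Sketch: an end to end test driving a job through its whole lifecycle,
# idea -> proposal -> sale -> live -> closed. The Job class is invented to
# match the example in the text.
import pytest

STAGES = ["idea", "proposal", "sale", "live", "closed"]


class Job:
    def __init__(self):
        self.stage = "idea"

    def advance(self) -> None:
        index = STAGES.index(self.stage)
        if index == len(STAGES) - 1:
            raise RuntimeError("job already closed")
        self.stage = STAGES[index + 1]


def test_job_end_to_end():
    job = Job()
    for expected in STAGES[1:]:
        job.advance()
        assert job.stage == expected
    with pytest.raises(RuntimeError):  # once closed, the job cannot advance further
        job.advance()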

Install/uninstall test or rollout/rollback test

In distributed systems, across a diverse set of operating systems and configurations, a rollout test ensures that installation will work on all configurations of each machine used by an enterprise.

Even for desktop applications, these are painstaking tests to perform, as machines vary in age, operating system version, patches and installed software. This test alone highlights the need for corporate standards in hardware and software, as rollout problems can be difficult to overcome when different versions of dependent software behave differently.

Large scale integration, or enterprise testing

Tests how a distributed system will respond to its users. This is load testing in its largest scale where multiple concurrent test processes run together. It is very difficult to do manually, and test applications or automations will be written to perform and measure the test.

Mutation testing

These tests are best automated. A set of data is fired into a software product and the output is measured. The data set is then mutated by changing a few fields, and the test is run again.
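
A sketch of this kind of data mutation, with an invented validate_record function and record layout, might be:

# Sketch: fire a data set at a component, then mutate a few fields and re-run.
# validate_record and the record layout are invented for illustration.
import copy
import random


def validate_record(record: dict) -> bool:
    """Invented component under test: accepts a well-formed customer record."""
    return (
        isinstance(record.get("name"), str) and record["name"] != ""
        and isinstance(record.get("age"), int) and 0 <= record["age"] <= 130
    )


BASE_RECORD = {"name": "Ada", "age": 36}


def mutate(record: dict) -> dict:
    """Change one field to a value drawn from a small pool of awkward inputs."""
    mutant = copy.deepcopy(record)
    field = random.choice(list(mutant))
    mutant[field] = random.choice(["", None, -1, 999, "@@@@"])
    return mutant


def test_base_record_accepted():
    assert validate_record(BASE_RECORD)


def test_mutated_records_never_crash():
    for _ in range(100):
        mutant = mutate(BASE_RECORD)
        assert isinstance(validate_record(mutant), bool)  # must not raise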

Regression testing

Once a test has been carried out on a particular component in a particular way, it should be retested to confirm that behaviour has not changed unexpectedly during subsequent development or bug fixes. Automating these tests can take some effort up front, but pay huge dividends during retest.

Each time a new internal function or component is added, or an interface changes, a regression test should be carried out to see if anything unforeseen has happened.
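
One common convention, assumed here rather than prescribed by the text, is to capture each fixed bug as a named test so the fix is re-verified on every run:

# Sketch: a fixed bug captured as a permanent regression test. BUG-1234 and the
# rounding example are hypothetical; the test name carries the bug reference so
# any recurrence is self-explanatory.
from decimal import Decimal, ROUND_HALF_UP


def to_pennies(amount: str) -> int:
    """Fixed implementation: round half up; the original defect came from
    binary floating point rounding 0.015 down."""
    return int(Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP) * 100)


def test_bug_1234_half_penny_rounds_up():
    # Before the fix, "0.015" was converted to 1 penny instead of 2.
    assert to_pennies("0.015") == 2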

Sanity testing

A quick once-over to see if a product is ready for testing.

Stress, load, soak, volume and performance testing

These are a set of related tests performed prior to deployment, on a mirror of the live system. Each of these tests is performed using automated testing tools, which record the response time and load (typically number of users) placed on a system. Stress, load and soak tests typically use the same automation scripts, with different levels of users and interactions.

Load or performance test

A system or component is tested within its design limits to prove it is capable of supporting the expected load. Stress and soak tests are intensive and prolonged load tests.

Stress test

A system is considered to be stressed when it is operating beyond its design limits. A stress test will gradually increase the number of users and/or level of transactions until the system fails. A failure may be a catastrophic crash, or merely a response time deemed too long for effective use of the system. The information gathered from a stress test can help the designers to pinpoint where and how failures occur, identifying areas for rework should throughput be inadequate.
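
A very small sketch of the idea, ramping up concurrent simulated users against an invented place_order call and watching the worst response time, might be the following; a real stress test would normally use a dedicated load-testing tool:

# Sketch: a stress test ramping up concurrent simulated users until response
# times exceed a threshold. place_order is an invented stand-in for a real
# call to the system under test.
import time
from concurrent.futures import ThreadPoolExecutor

RESPONSE_LIMIT_S = 0.5


def place_order() -> float:
    """Stand-in for one user transaction; returns its response time in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # replace with a real request to the system under test
    return time.perf_counter() - start


def run_load(users: int) -> float:
    """Run `users` concurrent transactions and return the slowest response time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        times = list(pool.map(lambda _: place_order(), range(users)))
    return max(times)


if __name__ == "__main__":
    for users in (10, 50, 100, 200, 400):
        worst = run_load(users)
        print(f"{users:4d} users: worst response {worst:.3f}s")
        if worst > RESPONSE_LIMIT_S:
            print("response limit exceeded - system considered stressed")
            break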

Soak test

A soak test is usually run overnight, or longer if time and systems are available. The load exceeds what is expected on a busy day, so problems with high, continuous data and communication rates can be identified.

Volume test

Databases, numeric record counts, backups, archive files, audit and error logs typically get bigger over time and overload file space or data types allocated to them. Volume tests run a simulated high load and monitor data values and file space to predict how long a system will operate effectively before counter resets and file archives are required.

System test, or testing in the large

Tests how a complete application performs within an existing system.

The terms testing in the large and testing in the small come from the Object Oriented Software Process (OOSP).

Usability tests

In a usability test, a proposed user is sat down in front of the application and told to ‘do it’. Each fumble or bottleneck, each contorted face, withered curse and occasional scream is recorded by a silent and unprompting observer. The interface or flow is reworked based on the results of the test.

The reality of usability testing is that the tests are carried out in a false environment. If you are the observer, you must do everything you can to help the user and gauge their feelings. You must not say things like ‘click that button there to start the search’; you should ask questions like ‘What are you thinking?’, ‘What is your first impression of this screen?’ or ‘Can you talk me through your perception of this view?’

User acceptance testing, also known as factory or contract testing

The users verify that the system works by performing predefined tests, usually recorded as scripts for manual interaction.

User interface testing

The slant on UI testing is different to usability testing. Usability testing focuses on getting instant understanding from the user when seeing your screen for the first time. User interface testing focuses on interaction, presentation and conformance to operating system expectations.

Architectural testing

It is quite difficult to test an architecture. The two methods of testing are peer review, and time. Peer review will help point out some flaws in an architecture, mostly based on experience. Only time will test it properly.

Is testing important?

News reports which surfaced in 1999 claimed software bugs in a Soviet early warning system nearly brought on nuclear war in 1983. The software was supposed to filter out false missile detections caused by Soviet satellites picking up reflections of sunlight from the tops of clouds. It failed to do so. Disaster was averted when a Soviet commander, based on what he described as a ‘funny feeling in my gut’, decided the apparent missile strike was a false alarm.

Test management

The management of testing is as important as the actual testing itself. It is not enough to find bugs. Getting them fixed, and monitoring that they do not return in subsequent development, is key.

A bug tracking mechanism is essential.

The test plan and test scripts

Test plan

The test plan is the highest level test document. If there is a programme plan, there will be a test plan for that programme. If the programme has a set of subprojects, there will be a test plan for each project within the programme, all derived from the programme test plan.

The test plan is a document outlining the tests required, and the environment required to carry out those tests. If there are legal stipulations or contract agreements which need verification, they must be listed or referenced in the test plan.

It will also state how problems are to be detailed and resolved.

Test script or test case

For each test plan, a number of test scripts, also known as test cases, are produced. Each test script focuses on one particular test, stating its preconditions, the actions required to complete the test, and the expected and actual responses of the system.

Figure 10.4 Test script example
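
A test script need not be prose; the same structure described above (preconditions, steps, expected and actual results) can equally be held as data. A sketch, with invented field names and example values, might be:

# Sketch: a test script/test case held as data rather than prose, carrying the
# fields described above. The field names and example values are invented.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TestCase:
    case_id: str
    title: str
    preconditions: List[str]
    steps: List[str]
    expected_result: str
    actual_result: Optional[str] = None  # filled in when the test is run

    @property
    def passed(self) -> Optional[bool]:
        if self.actual_result is None:
            return None  # not yet executed
        return self.actual_result == self.expected_result


login_case = TestCase(
    case_id="TC-042",
    title="Login rejects a wrong password",
    preconditions=["User 'ada' exists", "Account is not locked"],
    steps=["Open the login page", "Enter user 'ada'", "Enter a wrong password", "Submit"],
    expected_result="Error message shown; user remains on login page",
)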

The art of testing

Testing has two dimensions: one is the planning and execution of the test, the other is the granularity of the test.

Figure 10.5 The two dimensions of testing

Testing is also a detailed, designed process. Figure 10.6 shows the structure of testing being derived from the IT strategy.

Figure 10.6 The structure of testing

The act of testing is also an integral part of the software development process. In a way, reviews themselves are tests, designed to test the whole software development process and its various facets. Reviews improve the delivery of software. Tests improve the quality of a product.

Formal testing is a bit of a dry subject, so hearken to Pooh when organising for the results of a test to be heard.

One day Rabbit and Piglet were sitting outside Pooh’s front door listening to Rabbit, and Pooh was sitting with them. It was a drowsy summer afternoon, so Pooh got into a comfortable position for not listening to Rabbit, and from time to time he opened his eyes to say “Ah!” and then closed them again to say “True,” and from time to time Rabbit said, “You see what I mean, Piglet,” very earnestly, and Piglet nodded earnestly to show that he did.[4]

I have been Pooh once or twice myself, I am sorry to say.

Release or deployment

Imagine a world where you could write the tiniest piece of code, and have testers and users test it on a continually maintained mirror of the live system. When the tests are complete, it is confirmed for release to live, and the person responsible for the integrity of the live system rolls it out worldwide.

This is the nirvana of the configuration management world.

With a configuration management system, release is not focused on the action of installing something, but on managing the environment in which it is coded, tested and released. All releases can be traced back through the test and development environments, and from there to their source. Furthermore, all source is traceable through its versions, and to the tasks defined in project management plans which requested the change or creation in the first place.

However, a significant investment is required to put configuration management in place, and a structured environment is required in which to enforce its processes. The reality is that most development environments use source code control software, which allows a certain agility compared with the more demanding configuration management practices.

The release of any software product, whether it resides within configuration management, or is being deployed manually, must be accompanied by notification of the release to all stakeholders. A typical release will go from development to test, then from test to members of the user group for initial user acceptance or alpha testing. Successful alpha testing will be followed by a release to a limited number of users for beta testing, possibly in the live environment. For a large project, there may be a number of phases of beta testing to ever greater numbers of people. Finally, the product will be released to the full user population.

Release can be a delicate time. Your hard work is being spread among the users, or sold to Joe Public. If you have been foolish or unfortunate enough to exclude them from development, this will be the point where they say: "that's not what we wanted", or even have the temerity not to buy it. Hopefully, you will have been persuaded away from this path by now.

Even so, release can still be a trying time, particularly where large rollouts of sets of applications are necessary. Such releases can be the culmination of highly fraught work when creating new business practices, or the merging together of two companies where different working practices must be supported by your software.

Rollouts must be planned carefully. If there are many people involved, then there must be a timetable of events, with someone responsible for managing the rollout, whose job is to see that everything happens when it is meant to happen. If something goes wrong during the rollout, then the remaining tasks may have to be suspended until the problem is resolved, and the completed tasks rolled back. Running a release rehearsal is a good idea, particularly on complex platforms, even if it is only a walkthrough rather than a deployment to a mirror of the live environment.

User education

If you follow the architecturally led process laid out so far, user education will be a lot easier. If members of the design team are part of the user group, and can see their ideas reflected in the software product, they will be your user education evangelists.

Training in the use of your product, or creating user manuals and training courses, must be addressed long before release. Documentation and training material can begin to be created from a sufficiently detailed design, and training itself can begin once deliverables have been created and tested, although it may be prudent to defer training until later in the test phase.

You may also want to train the second line support, who may in turn train first line support. You may create a help file or a set of web pages for general consumption.

Reviews

Code reviews

Code reviews have already been mentioned at the lowest level of the testing hierarchy. Such reviews are best carried out by line managers and peers. The reasons for code review are:

Design reviews or walkthroughs

A walkthrough is when a user group is taken, step by step, through a specific piece of functionality, or element of design. It can be a verbal or graphical presentation or a mock up, intended to explain one particular facet or flow, and gain some input from the user group.

User reviews

User reviews must run throughout the life of the project to ensure the work being done accurately reflects the users’ needs. They can be as informal as emails, or as formal as structured meetings where users are presented with the progress so far and where accepted plans might have to be deviated from. Their purpose is not solely information dissemination; they must also allow the users to steer changes to benefit them most.

Post release wash up

Each release should be accompanied by a wash up meeting, the purpose of which is twofold. First, it ensures that the objectives of the release have been met and that all tasks have been completed; second, it provides a forum for promoting practices to help improve the process next time around. The wash up should take place within a week of release.

Post release review

A released product is not necessarily a finished product. When the product has been in the field for some time, review its operation in terms of reliability and changing requirements, and as a point of reflection for the project team. The big questions are:

  1. Is it working?

  2. Did we get it right?

  3. Has the underlying need for this product changed?

If an architecturally led, user inclusive, process has been followed, then the answer to question 2 will be a resounding yes. Your success as an architect can be measured by the response.

The timing of post release reviews is dependent upon the project, but three months and one year after release are probably adequate.

The linear software development process

In all, we have followed a software development process based on the rather linear waterfall. It has been geared heavily toward business systems, but this is where politics is most concentrated and in need of a process. Those working on engineering systems, who consider themselves software engineers, seem to manage more easily when software teams are devoid of business people. I could have chosen a more technical slant, but I have more experience of business systems. Other software architecture books are more technically biased.

Also, I have spent half a book on the software development process. Some will argue that architecture has nothing to do with the software development process. It has to do with creating software architectures around which systems are built, not the daily grind of creating them.

Well, yes and no. If you do not have a software development process which will allow you to define architectures, then you will be superfluous. You will spend all day creating architectures for nothing. No-one will use them, and no-one will be interested in your output. You will be one of those of whom the question is asked ‘What exactly is it that you do?’ when the hand of redundancy begins its sweep of extinction. You are nothing more than a seller of sand by the seashore. If you have such time to waste, spend it on something better. Your own sand is running out. You are unlikely to gain the rewards of paradise from the endless cycle of sitting at work and achieving nothing.

If there is no process, then you as software architect may as well define it, sell it and implement it. If you are fond of asking ‘Why me?’, slip a ‘not’ in the middle. Without a process, you cannot architect.

Where can I find out more?

Testing Computer Software[5] gives a good introduction to the testing process, and explains in detail how to go about it.

References

  1. The Mythical Man-Month. Fred Brooks. Addison-Wesley.
  2. Don't Make Me Think. Steve Krug. New Riders.
  3. User Interface Design for Programmers. Joel Spolsky. Apress.
  4. Tigger is unbounced, from The House at Pooh Corner. A. A. Milne. Hunnypot Press.
  5. Testing Computer Software. Cem Kaner, Hung Quoc Nguyen, Jack Falk. Wiley.