Wednesday, March 31, 2010

DMAIC

Basic methodology to improve existing processes.

· Define the out-of-tolerance range.
· Measure key internal processes critical to quality.
· Analyze why defects occur.
· Improve the process to stay within tolerance.
· Control the process to stay within goals.

DMADV

Basic methodology of introducing new processes.

· Define the process and where it would fail to meet customer needs.
· Measure and determine if process meets customer needs.
· Analyze the options to meet customer needs.
· Design in changes to the process to meet customer needs.
· Verify the changes have met customer needs.

Tuesday, March 30, 2010

Approach to Testing Safety-Critical Internationalized and Localized Software

Many companies would like their software to run in multiple languages so that they can market it around the world. Internationalization, sometimes referred to as I18N (I18N = I[nternationalizatio]N; 18 is the number of letters between the I and the N), is a software methodology that makes it possible to create internationalized software at a reasonable cost by separating the executable code from any user interface components. Localization is the process of adapting the internationalized software to a new language: only the user interface needs to be translated; the executable code remains unchanged. Together, internationalization and localization provide a means to produce software that is capable of running in multiple languages.

The internationalized and localized applications employ two main principles:
1. Locale dependent strings are separated from the application binary and stored in message catalogs or the resource files, and
2. Locale aware access functions are used to retrieve the messages from the message catalogs or the resource files.
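
The two principles above can be sketched in a few lines of Python. This is only an illustration: the catalog contents, the locale names and the helpers `get_message` and `missing_keys` are all made up for this example, and a real project would normally keep its catalogs in resource files rather than in a dictionary.

```python
# Hypothetical in-memory message catalogs keyed by locale. In a real
# application these would live in resource files, outside the binary.
CATALOGS = {
    "en_US": {"greeting": "Hello, {name}!", "low_battery": "Battery low"},
    "de_DE": {"greeting": "Hallo, {name}!", "low_battery": "Akku schwach"},
}
DEFAULT_LOCALE = "en_US"


def get_message(key, locale, **params):
    """Locale-aware accessor: look up `key` for `locale`, falling back to the default locale."""
    catalog = CATALOGS.get(locale, CATALOGS[DEFAULT_LOCALE])
    template = catalog.get(key, CATALOGS[DEFAULT_LOCALE][key])
    return template.format(**params)


def missing_keys(locale):
    """Keys present in the default catalog that have no translation in `locale`."""
    return sorted(set(CATALOGS[DEFAULT_LOCALE]) - set(CATALOGS.get(locale, {})))


print(get_message("greeting", "de_DE", name="Ana"))
print(missing_keys("de_DE"))
```

A check like `missing_keys` is one cheap test a localization effort might run for every supported locale before any manual testing of the translated user interface.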

Sunday, March 28, 2010

Test Documentation

The Test Documentation should be an integral part of the documentation of application systems. Information services documentation standards should specify the type and extent of test documentation to be prepared and maintained. The type and extent of documentation that will be needed depends on its usefulness. Test Documentation should commence in the requirement phase and continue through the life of the project. The test process that has been outlined in each phase of the life cycle should be documented.

The uses of test documentation include:

1. Verify correctness of requirements.
2. Improve user understanding of information services.
3. Improve user understanding of application systems.
4. Justify the test resources.
5. Determine the risk.
6. Evaluate test results.
7. Retest the system.
8. Analyze the effectiveness of the test.

Thursday, March 25, 2010

Six Sigma

The performance of a product is determined by how much margin exists between the design requirement of its characteristics (and those of its parts/steps), and the actual value of those characteristics. These characteristics are produced by processes in the factory, and at the suppliers.

Each process attempts to reproduce its characteristics identically from unit to unit, but within each process some variation occurs.

The variation of a process is measured in standard deviations (sigma) from the mean. The normal variation, defined as process width, is +/- 3 sigma about the mean.

Approximately 2700 parts per million parts/steps will fall outside the normal variation of +/- 3 Sigma. This, by itself, does not appear disconcerting. However, when we build a product containing 1200 parts/steps, we can expect 3.24 defects per unit (1200 x .0027), on average. This would result in a rolled yield of less than 4%, which means fewer than 4 units out of every 100 would go through the entire manufacturing process without a defect.

Thus, we can see that for a product to be built virtually defect-free, it must be designed to accept characteristics which are significantly more than +/- 3 sigma away from the mean.
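
The arithmetic above can be reproduced with a short Python sketch. It assumes the characteristic is normally distributed and that the 1200 parts/steps fail independently of one another; both are simplifying assumptions for this example, and the function names are made up.

```python
import math


def fraction_outside(k_sigma):
    """Two-sided tail probability of a normal distribution beyond +/- k sigma."""
    return 1.0 - math.erf(k_sigma / math.sqrt(2.0))


def defects_per_unit(steps, k_sigma):
    """Expected number of out-of-tolerance parts/steps in one unit."""
    return steps * fraction_outside(k_sigma)


def rolled_yield(steps, k_sigma):
    """Probability that all `steps` independent parts/steps are within tolerance."""
    return (1.0 - fraction_outside(k_sigma)) ** steps


p = fraction_outside(3)
print(f"parts per million outside +/- 3 sigma: {p * 1e6:.0f}")
print(f"defects per unit for 1200 steps:       {defects_per_unit(1200, 3):.2f}")
print(f"rolled yield at +/- 3 sigma:           {rolled_yield(1200, 3):.2%}")
print(f"rolled yield at +/- 6 sigma:           {rolled_yield(1200, 6):.4%}")
```

Widening the design margin from 3 to 6 sigma in this simple model moves the rolled yield from under 4% to very nearly 100%, which is the point of the last paragraph above.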

Wednesday, March 24, 2010

Inspection, Integration and Installation Testing

Inspection

A group review quality improvement process for written material. It consists of two aspects: product improvement (of the document itself) and process improvement (of both document production and inspection).

Integration Testing

Testing of combined parts of an application to determine if they function together correctly. Usually performed after unit and functional testing. This type of testing is especially relevant to client/server and distributed systems.

Installation Testing

Confirms that the application under test installs correctly and that the installation recovers from expected or unexpected events without loss of data or functionality. Events can include shortage of disk space, unexpected loss of communication, or power-out conditions.

Tuesday, March 23, 2010

What can be done if requirements are changing continuously?

A common problem and a major headache.

· Work with the project's stakeholders early on to understand how requirements might change so that alternate test plans and strategies can be worked out in advance, if possible.
· It's helpful if the application's initial design allows for some adaptability so that later changes do not require redoing the application from scratch.
· If the code is well-commented and well-documented, changes are easier for the developers.
· Use rapid prototyping whenever possible to help customers feel sure of their requirements and minimize changes.
· The project's initial schedule should allow for some extra time commensurate with the possibility of changes.
· Try to move new requirements to a 'Phase 2' version of an application, while using the original requirements for the 'Phase 1' version.
· Negotiate to allow only easily-implemented new requirements into the project, while moving more difficult new requirements into future versions of the application.
· Be sure that customers and management understand the scheduling impacts, inherent risks, and costs of significant requirements changes. Then let management or the customers (not the developers or testers) decide if the changes are warranted.
· Balance the effort put into setting up automated tests with the expected effort required to redo them to deal with changes.
· Try to design some flexibility into automated test scripts.
· Focus initial automated testing on application aspects that are most likely to remain unchanged.
· Devote appropriate effort to risk analysis of changes to minimize regression testing needs.
· Design some flexibility into test cases (this is not easily done; the best bet might be to minimize the detail in the test cases, or set up only higher-level generic-type test plans).

Monday, March 22, 2010

Requirements Management

Requirements management is concerned with meeting the needs of end users through identifying and specifying what they need. Requirements may be focused on outcomes where the main concern is to describe what is wanted rather than how it should be delivered. Or the requirements may need to be specified in precise terms. Or requirements may be described in any way between these two extremes. The important issue is that those specifying the requirement have an adequate understanding of what the users need and how the market is likely to meet that need; they also need to be able to keep any changes to the requirement to an appropriate minimum and to document the requirement in such a way that the market will be able to understand what is required.

Requirements management aims to establish a common understanding between the customer and other stakeholders and the project team(s) that will be addressing the requirements at an early stage in the project life-cycle and maintain control by establishing suitable base-lines for both development and management use.

Requirements specification should exhibit the following characteristics:
· Lack of ambiguity - the requirements should be described in a manner that avoids multiple interpretations.
· Completeness - it may be impossible to second-guess future requirements, but at least known requirements should be specified.
· Consistency - there are problems in defining and realizing solutions that satisfy requirements if there are conflicting requirements.
· Traceability - the source of each requirement should be identified and should be traceable throughout the project life-cycle.

Thursday, March 18, 2010

Test Data Clarity

Permutation techniques may make data easier to grasp by making the datasets small and commonly used, but we can make our data clearer still by describing each row in its own free text fields, allowing testers to make a simple comparison between the free text (which is generally displayed on output), and actions based on fields which tend not to be directly displayed. Use of free text fields with some correspondence to the internals of the record allows output to be checked more easily.


Testers often talk about items of data, referring to them by anthropomorphic personification - that is to say, they give them names. This allows shorthand, but also acts as jargon, excluding those who are not in the know. Setting this data, early on in testing, to have some meaningful value can be very useful, allowing testers to sense check input and output data, and choose appropriate input data for investigative tests.


Reports, data extracts and sanity checks can also make use of these; sorting or selecting on a free text field that should have some correspondence with a functional field can help spot problems or eliminate unaffected data. Data is often used to communicate and illustrate problems to coders and to the business. However, there is generally no mandate for outside groups to understand the format or requirements of test data. Giving some meaning to the data that can be referred to directly can help with improving mutual understanding.

Clarity helps because it:
· Improves communication within and outside the team
· Reduces test errors caused by using the wrong data
· Allows another way of doing sanity checks for corrupted or inconsistent data
· Helps when checking data after input
· Helps in selecting data for investigative tests
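
As a rough illustration of the sanity check described above, the Python sketch below uses a free text field that is meant to correspond to a functional field. The record layout, the field names and the `STATUS_LABELS` mapping are all hypothetical, invented for this example.

```python
# Hypothetical mapping from a functional field (rarely displayed) to the
# label that testers agreed to put at the start of the free text field.
STATUS_LABELS = {"A": "ACTIVE", "S": "SUSPENDED", "C": "CLOSED"}

# Hypothetical test records: `surname` is free text that usually shows up on
# output, so it is used to describe the row.
TEST_CUSTOMERS = [
    {"id": 1, "status_code": "A", "surname": "ACTIVE-NO-ORDERS"},
    {"id": 2, "status_code": "S", "surname": "SUSPENDED-OVERDUE"},
    {"id": 3, "status_code": "C", "surname": "ACTIVE-TWO-ORDERS"},  # deliberately inconsistent
]


def inconsistent_rows(rows):
    """Return ids of rows whose free text does not start with the label for their status code."""
    bad = []
    for row in rows:
        label = STATUS_LABELS[row["status_code"]]
        if not row["surname"].startswith(label):
            bad.append(row["id"])
    return bad


print(inconsistent_rows(TEST_CUSTOMERS))
```

A row that turns up in this list has either been corrupted or was set up wrongly, and in both cases it is worth knowing before the data is used in a test.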

Wednesday, March 17, 2010

Test Coverage

Test Coverage is an important measure of quality for software systems. Test Coverage analysis is the process of:

· Finding areas of a program not exercised by a set of test cases,
· Creating additional test cases to increase coverage, and
· Determining a quantitative measure of code coverage, which is an indirect measure of quality.

An optional aspect of test coverage analysis is:

· Identifying redundant test cases that do not increase coverage.

Test coverage analysis is sometimes called code coverage analysis. The two terms are synonymous. The academic world more often uses the term "test coverage" while practitioners more often use "code coverage".
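
The steps listed above can be sketched with plain sets in Python. This is a simplified model, not a coverage tool: it assumes that something else has already recorded which lines each test case executes, and the function name and sample data are invented for the example.

```python
def analyze_coverage(all_lines, covered_by_test):
    """Summarize coverage.

    all_lines: set of executable line numbers in the program.
    covered_by_test: dict mapping a test case name to the set of lines it executes.
    Returns (uncovered lines, percent covered, redundant test names).
    """
    covered = set().union(*covered_by_test.values())
    uncovered = sorted(all_lines - covered)
    percent = 100.0 * len(covered & all_lines) / len(all_lines)

    # A test counts as redundant here if the other tests together already
    # execute every line that it executes.
    redundant = []
    for name, lines in covered_by_test.items():
        others = set()
        for other_name, other_lines in covered_by_test.items():
            if other_name != name:
                others |= other_lines
        if lines <= others:
            redundant.append(name)
    return uncovered, percent, sorted(redundant)


program_lines = set(range(1, 11))
tests = {"t1": {1, 2, 3, 4}, "t2": {3, 4}, "t3": {5, 6, 7}}
print(analyze_coverage(program_lines, tests))
```

The uncovered lines are the candidates for additional test cases. Note that two identical tests would both be flagged as redundant by this check, although only one of them could safely be dropped.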

Tuesday, March 16, 2010

How is Rapid Testing different from normal software testing?

Testing practice differs from industry to industry, company to company, and tester to tester. But there are some elements that most test projects have in common. Let's call those common elements "normal testing". In our experience, normal testing involves writing test cases against some kind of specification. These test cases are fragmentary plans or procedures that loosely specify what a tester will do to test the product. The tester is then expected to perform these test cases on the product, repeatedly, throughout the course of the project.

Rapid testing differs from traditional testing in several major ways:
First, Waste No Time. The most rapid action is no action at all. So, in rapid testing we eliminate any activity that isn't necessary. Don't repeat tests just because someone told you that repetition is good. Make sure that you get good information value from every test. Consider the opportunity cost of each testing activity.

Mission. In Rapid Testing we don't start with a task ("write test cases"), we start with a mission. Our mission may be "find important problems fast". If so, then writing test cases may not be the best approach to the test process.

Skills. To do any testing well requires skill. Normal testing downplays the importance of skill by focusing on the format of test documentation rather than the robustness of tests. Rapid Testing, as we describe it, highlights skill. Robust tests are very important, so we practice critical thinking and experimental design skills.

Risk. Normal testing is focused on functional and structural product coverage. In other words, if the product can do X, then try X. Rapid Testing focuses on important problems. We gain an understanding of the product we're testing to the point where we can imagine what kinds of problems are more likely to happen and what problems would have more impact if they happened. Then we put most of our effort into testing for those problems. Rapid Testing is concerned with uncovering the most important information as soon as possible.



Exploration. Rapid Testing is also rapid learning, so we use exploratory testing. We avoid pre-scripting test cases unless there is a clear and compelling need. We prefer to let the next test we do be influenced by the last test we did. This is a good thing, because we're not imprisoned by pre-conceived notions about what we should test, but let ourselves develop better ideas as we go.

Sunday, March 14, 2010

Concurrency Testing: Multi-user testing geared towards determining the effects of accessing the same application code, module or database records. Identifies and measures the level of locking, deadlocking and use of single-threaded code and locking semaphores.

Conformance Testing: The process of testing that an implementation conforms to the specification on which it is based. Usually applied to testing conformance to a formal standard.

Context Driven Testing: The context-driven school of software testing is a flavor of Agile Testing (Agile Testing: testing practice for projects using agile methodologies, treating development as the customer of testing and emphasizing a test-first design paradigm) that advocates continuous and creative evaluation of testing opportunities in light of the potential information revealed and the value of that information to the organization right now.

Conversion Testing: Testing of programs or procedures used to convert data from existing systems for use in replacement systems.

Cyclomatic Complexity: A measure of the logical complexity of an algorithm, used in white-box testing.
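
The glossary entry does not give a formula; the usual one is McCabe's V(G) = E - N + 2P, where E is the number of edges in the control-flow graph, N the number of nodes and P the number of connected components. A minimal Python sketch, using a made-up control-flow graph for a function with one if/else followed by one loop:

```python
def cyclomatic_complexity(edges, num_components=1):
    """McCabe's V(G) = E - N + 2P for a control-flow graph given as (from, to) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


# Hypothetical control-flow graph: an if/else followed by a while loop.
EDGES = [
    ("entry", "if"), ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"), ("loop", "exit"),
]

print(cyclomatic_complexity(EDGES))
```

The result equals the number of decision points plus one (here, the `if` and the loop condition), which is often read as the number of independent paths a white-box test set should aim to exercise.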

Friday, March 12, 2010

Monkey Testing

Testing a system or an application on the fly, i.e. running just a few tests here and there to ensure that the system or application does not crash.

Negative Testing

Testing aimed at showing software does not work. Also known as "test to fail".

N+1 Testing


A variation of Regression Testing. Testing conducted with multiple cycles in which errors found in test cycle N are resolved and the solution is retested in test cycle N+1. The cycles are typically repeated until the solution reaches a steady state and there are no errors.

Thursday, March 11, 2010

What is risk management?

Risk Management is a practice with processes, methods, and tools for managing risks in a project. It provides a disciplined environment for proactive decision making to

· assess continuously what could go wrong (risks)
· determine which risks are important to deal with
· implement strategies to deal with those risks

A successful risk management practice is one in which risks are continuously identified and analyzed for relative importance. Risks are mitigated, tracked, and controlled to effectively use program resources. Problems are prevented before they occur and personnel consciously focus on what could affect product quality and schedules.



“Risk management is not a silver bullet. However, it can improve decision making, help avoid surprises, and improve your chances of succeeding.”

Wednesday, March 10, 2010

Version control

Version Control fundamentally deals with the changes made to a file. Version control addresses issues such as Team Co-ordination, Version Tracking, etc.

Proper version control makes sure that unless otherwise specified only one member of a development team can get access to a file. The version control system places a lock on the file that is currently used by a team member making it inaccessible to other users for development. However, others can view the file in read only mode. The file can be accessed by others for development only when the file has been "checked-in" by the current user.

Version tracking ensures that all the files where changes have been incorporated are saved and tagged separately using numbering schemes called version numbers. Version numbers give an idea of the change cycles that a particular file has gone through. For example, assume that the file that is checked out has version number 1.0. When it is checked in after changes have been incorporated, the number is automatically updated to 1.1. Now two copies of the same file reside in the database, of which 1.1 is the latest revision. When a file's version number is 1.7, it can be understood that seven revisions have been made to that file. However, the version numbering scheme can vary between projects, so it is advisable to ascertain the numbering scheme before inferring the number of revisions made to a file. Labels can also be used for easier identification of the file.
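
The locking and numbering behaviour described above can be modelled with a toy Python class. It is a sketch of the idea only, not of any particular version control product; the class name, method names and the 1.x numbering rule are assumptions made for this example.

```python
class LockedFile:
    """Toy model of lock-based version control for a single file (illustrative only)."""

    def __init__(self, content=""):
        self.revisions = [("1.0", content)]  # every checked-in revision is kept
        self.locked_by = None

    @property
    def version(self):
        return self.revisions[-1][0]

    def view(self):
        """Read-only access is always allowed, even while the file is locked."""
        return self.revisions[-1][1]

    def check_out(self, user):
        """Lock the file for `user`; fails if someone else already holds the lock."""
        if self.locked_by is not None:
            raise PermissionError(f"locked by {self.locked_by}")
        self.locked_by = user

    def check_in(self, user, new_content):
        """Store a new revision, bump the minor version number and release the lock."""
        if self.locked_by != user:
            raise PermissionError("file is not checked out by this user")
        major, minor = self.version.split(".")
        self.revisions.append((f"{major}.{int(minor) + 1}", new_content))
        self.locked_by = None


f = LockedFile("first draft")
f.check_out("alice")
f.check_in("alice", "second draft")
print(f.version, len(f.revisions))
```

While "alice" holds the lock, a `check_out` by anyone else raises an error, but `view` still works, which mirrors the read-only access mentioned above.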

The difference between revisions, versions and variants is not always clear. Strictly, a new revision fixes faults. Before a meaningful baseline can be defined, its constituent configuration items (CIs) must be uniquely identified and stored so that they can be accessed as they were when the baseline was taken. Version control, that is, controlling the revisions, versions and variants of CIs, is complex.

Tuesday, March 9, 2010

Functional Decomposition

A technique used during planning, analysis and design; creates a functional hierarchy for the software.

Functional Specification

A document that describes in detail the characteristics of the product with regard to its intended features.

Functional Testing

Testing the features and operational behavior of a product to ensure they correspond to its specifications.
Testing that ignores the internal mechanism of a system or component and focuses solely on the outputs generated in response to selected inputs and execution conditions.

Monday, March 8, 2010

Classic Testing Mistakes

It's easy to make mistakes when testing software or planning a testing effort. Some mistakes are made so often, so repeatedly, by so many different people, that they deserve the label Classic Mistake.

Classic mistakes cluster usefully into five groups, which I've called "themes":

· The Role of Testing: who does the testing team serve, and how does it do that?
· Planning the Testing Effort: how should the whole team's work be organized?
· Personnel Issues: who should test?
· The Tester at Work: designing, writing, and maintaining individual tests.
· Technology Rampant: quick technological fixes for hard problems.
I have two goals for this paper. First, it should identify the mistakes, put them in context, describe why they're mistakes, and suggest alternatives. Because the context of one mistake is usually prior mistakes, the paper is written in a narrative style rather than as a list that can be read in any order. Second, the paper should be a handy checklist of mistakes. For that reason, the classic mistakes are printed in a larger bold font when they appear in the text, and they're also summarized at the end.

Although many of these mistakes apply to all types of software projects, my specific focus is the testing of commercial software products, not custom software or software that is safety critical or mission critical.

This paper is essentially a series of bug reports for the testing process. You may think some of them are features, not bugs. You may disagree with the severities I assign. You may want more information to help in debugging, or want to volunteer information of your own. Any decent bug reporting system will treat the original bug report as the first part of a conversation. So should it be with this paper. Therefore, follow this link for an ongoing discussion of this topic.

Theme One: The Role of Testing

A first major mistake people make is thinking that the testing team is responsible for assuring quality. This role, often assigned to the first testing team in an organization, makes it the last defense, the barrier between the development team (accused of producing bad quality) and the customer (who must be protected from them). It's characterized by a testing team (often called the "Quality Assurance Group") that has formal authority to prevent shipment of the product. That in itself is a disheartening task: the testing team can't improve quality, only enforce a minimal level. Worse, that authority is usually more apparent than real. Discovering that, together with the perverse incentives of telling developers that quality is someone else's job, leads to testing teams and testers who are disillusioned, cynical, and view themselves as victims. We've learned from Deming and others that products are better and cheaper to produce when everyone, at every stage in development, is responsible for the quality of their work ([Deming86], [Ishikawa85]).

In practice, whatever the formal role, most organizations believe that the purpose of testing is to find bugs. This is a less pernicious definition than the previous one, but it's missing a key word. When I talk to programmers and development managers about testers, one key sentence keeps coming up: "Testers aren't finding the important bugs." Sometimes that's just griping, sometimes it's because the programmers have a skewed sense of what's important, but I regret to say that all too often it's valid criticism. Too many bug reports from testers are minor or irrelevant, and too many important bugs are missed.

What's an important bug? Important to whom? To a first approximation, the answer must be "to customers". Almost everyone will nod their head upon hearing this definition, but do they mean it? Here's a test of your organization's maturity. Suppose your product is a system that accepts email requests for service. As soon as a request is received, it sends a reply that says "your request of 5/12/97 was accepted and its reference ID is NIC-051297-3". A tester who sends in many requests per day finds she has difficulty keeping track of which request goes with which ID. She wishes that the original request were appended to the acknowledgement. Furthermore, she realizes that some customers will also generate many requests per day, so would also appreciate this feature. Would she:

1. file a bug report documenting a usability problem, with the expectation that it will be assigned a reasonably high priority (because the fix is clearly useful to everyone, important to some users, and easy to do)?
2. file a bug report with the expectation that it will be assigned "enhancement request" priority and disappear forever into the bug database?
3. file a bug report that yields a "works as designed" resolution code, perhaps with an email "nastygram" from a programmer or the development manager?
4. not bother with a bug report because it would end up in cases (2) or (3)?
If usability problems are not considered valid bugs, your project defines the testing task too narrowly. Testers are restricted to checking whether the product does what was intended, not whether what was intended is useful. Customers do not care about the distinction, and testers shouldn't either.

Testers are often the only people in the organization who use the system as heavily as an expert. They notice usability problems that experts will see. (Formal usability testing almost invariably concentrates on novice users.) Expert customers often don't report usability problems, because they've been trained to know it's not worth their time. Instead, they wait (in vain, perhaps) for a more usable product and switch to it. Testers can prevent that lost revenue.

While defining the purpose of testing as "finding bugs important to customers" is a step forward, it's more restrictive than I like. It means that there is no focus on an estimate of quality (and on the quality of that estimate). Consider these two situations for a product with five subsystems.

1. 100 bugs are found in subsystem 1 before release. (For simplicity, assume that all bugs are of the highest priority.) No bugs are found in the other subsystems. After release, no bugs are reported in subsystem 1, but 12 bugs are found in each of the other subsystems.
2. Before release, 50 bugs are found in subsystem 1. 6 bugs are found in each of the other subsystems. After release, 50 bugs are found in subsystem 1 and 6 bugs in each of the other subsystems.
From the "find important bugs" standpoint, the first testing effort was superior. It found 100 bugs before release, whereas the second found only 74. But I think you can make a strong case that the second effort is more useful in practical terms. Let me restate the two situations in terms of what a test manager might say before release:

1. "We have tested subsystem 1 very thoroughly, and we believe we've found almost all of the priority 1 bugs. Unfortunately, we don't know anything about the bugginess of the remaining four subsystems."
2. "We've tested all subsystems moderately thoroughly. Subsystem 1 is still very buggy. The other subsystems are about 1/10th as buggy, though we're sure bugs remain."
This is, admittedly, an extreme example, but it demonstrates an important point. The project manager has a tough decision: would it be better to hold on to the product for more work, or should it be shipped now? Many factors - all rough estimates of possible futures - have to be weighed: Will a competitor beat us to release and tie up the market? Will dropping an unfinished feature to make it into a particular magazine's special "Java Development Environments" issue cause us to suffer in the review? Will critical customer X be more annoyed by a schedule slip or by a shaky product? Will the product be buggy enough that profits will be eaten up by support costs or, worse, a recall?

The testing team will serve the project manager better if it concentrates first on providing estimates of product bugginess (reducing uncertainty), then on finding more of the bugs that are estimated to be there. That affects test planning, the topic of the next theme.

It also affects status reporting. Test managers often err by reporting bug data without putting it into context. Without context, project management tends to focus on one graph:

[Graph not reproduced: curve of bugs found.]

The flattening in the curve of bugs found will be interpreted in the most optimistic possible way unless you as test manager explain the limitations of the data:

· "Only half the planned testing tasks have been finished, so little is known about half the areas in the project. There could soon be a big spike in the number of bugs found."
· "That's especially likely because the last two weekly builds have been lightly tested. I told the testers to take their vacations now, before the project hits crunch mode."
· "Furthermore, based on previous projects with similar amounts and kinds of testing effort, it's reasonable to expect at least 45 priority-1 bugs remain undiscovered. Historically, that's pretty high for a successful product."
For discussions of using bug data, see [Cusumano95], [Rothman96], and [Marick97].

Earlier I asserted that testers can't directly improve quality; they can only measure it. That's true only if you find yourself starting testing too late. Tests designed before coding begins can improve quality. They inform the developer of the kinds of tests that will be run, including the special cases that will be checked. The developer can use that information while thinking about the design, during design inspections, and in his own developer testing.

Early test design can do more than prevent coding bugs. As will be discussed in the next theme, many tests will represent user tasks. The process of designing them can find user interface and usability problems before expensive rework is required. I've found problems like no user-visible place for error messages to go, pluggable modules that didn't fit together, two screens that had to be used together but could not be displayed simultaneously, and "obvious" functions that couldn't be performed. Test design fits nicely into any usability engineering effort ([Nielsen93]) as a way of finding specification bugs.

I should note that involving testing early feels unnatural to many programmers and development managers. There may be feelings that you are intruding on their turf or not giving them the chance to make the mistakes that are an essential part of design. Take care, especially at first, not to increase their workload or slow them down. It may take one or two entire projects to establish your credibility and usefulness.

Theme Two: Planning the Testing Effort

I'll first discuss specific planning mistakes, then relate test planning to the role of testing.

It's not unusual to see test plans biased toward functional testing. In functional testing, particular features are tested in isolation. In a word processor, all the options for printing would be applied, one after the other. Editing options would later get their own set of tests.

But there are often interactions between features, and functional testing tends to miss them. For example, you might never notice that the sequence of operations "open a document, edit the document, print the whole document, edit one page, print that page" doesn't work. But customers surely will, because they don't use products functionally. They have a task orientation. To find the bugs that customers see - that are important to customers - you need to write tests that cross functional areas by mimicking typical user tasks. This type of testing is called scenario testing, task-based testing, or use-case testing.
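
To make the difference concrete, here is a small Python sketch. The `ToyWordProcessor` class is entirely hypothetical and contains a deliberately planted feature-interaction bug (a stale print cache), chosen only to mirror the open/edit/print sequence mentioned above.

```python
class ToyWordProcessor:
    """Hypothetical product with a deliberate feature-interaction bug."""

    def __init__(self):
        self.pages = []
        self._print_cache = None

    def open(self, pages):
        self.pages = list(pages)
        self._print_cache = None

    def edit(self, page_no, text):
        self.pages[page_no] = text  # planted bug: the print cache is not invalidated

    def print_all(self):
        self._print_cache = list(self.pages)
        return list(self._print_cache)

    def print_page(self, page_no):
        source = self._print_cache if self._print_cache is not None else self.pages
        return source[page_no]


def functional_test_print_page():
    """Tests printing a single page in isolation."""
    wp = ToyWordProcessor()
    wp.open(["one", "two"])
    return wp.print_page(1) == "two"


def scenario_test_edit_print_edit_print():
    """Mimics a user task: open, edit, print everything, edit one page, print that page."""
    wp = ToyWordProcessor()
    wp.open(["one", "two"])
    wp.edit(0, "ONE")
    wp.print_all()
    wp.edit(1, "TWO")
    return wp.print_page(1) == "TWO"


print("functional test passed:", functional_test_print_page())
print("scenario test passed:  ", scenario_test_edit_print_edit_print())
```

Each feature works when tested on its own, so the functional test passes; only the test that follows the user's task across editing and printing exposes the problem.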

A bias toward functional testing also underemphasizes configuration testing. Configuration testing checks how the product works on different hardware and when combined with different third party software. There are typically many combinations that need to be tried, requiring expensive labs stocked with hardware and much time spent setting up tests, so configuration testing isn't cheap. But, it's worth it when you discover that your standard in-house platform which "entirely conforms to industry standards" actually behaves differently from most of the machines on the market.

Both configuration testing and scenario testing test global, cross-functional aspects of the product. Another type of testing that spans the product checks how it behaves under stress (a large number of transactions, very large transactions, a large number of simultaneous transactions). Putting stress and load testing off to the last minute is common, but it leaves you little time to do anything substantive when you discover your product doesn't scale up to more than 12 users.

Two related mistakes are not testing the documentation and not testing installation procedures. Testing the documentation means checking that all the procedures and examples in the documentation work. Testing installation procedures is a good way to avoid making a bad first impression.

How about avoiding testing altogether?

At a conference last year, I met (separately) two depressed testers who told me their management was of the opinion that the World Wide Web could reduce testing costs. "Look at [wildly successful internet company]. They distribute betas over the network and get their customers to do the testing for free!" The Windows 95 beta program is also cited in similar ways.

Beware of an overreliance on beta testing. Beta testing seems to give you test cases representative of customer use - because the test cases are customer use. Also, bugs reported by customers are by definition those important to customers. However, there are several problems:

The customers probably aren't that representative. In the common high-tech marketing model, beta users, especially those of the "put it on your web site and they will download" sort, are the early adopters, those who like to tinker with new technologies. They are not the pragmatists, those who want to wait until the technology is proven and safe to adopt. The usage patterns of these two groups are different, as are the kinds of bugs they consider important. In particular, early adopters have a high tolerance for bugs with workarounds and for bugs that "just go away" when they reload the program. Pragmatists, who are much less tolerant, make up the large majority of the market.
Even of those beta users who actually use the product, most will not use it seriously. They will give it the equivalent of a quick test drive, rather than taking the whole family for a two week vacation. As any car buyer knows, the test drive often leaves unpleasant features undiscovered.
Beta users - just like customers in general - don't report usability problems unless prompted. They simply silently decide they won't buy the final version.
Beta users - just like customers in general - often won't report a bug, especially if they're not sure what they did to cause it, or if they think it is obvious enough that someone else must have already reported it.
When beta users report a bug, the bug report is often unusable. It costs much more time and effort to handle a user bug report than one generated internally.

Beta programs can be useful, but they require careful planning and monitoring if they are to do more than give a warm fuzzy feeling that at least some customers have used the product before it's inflicted on all of them. See [Kaner93] for a brief description.

The one situation in which beta programs are unequivocally useful is in configuration testing. For any possible screwy configuration, you can find a beta user who has it. You can do much more configuration testing than would be possible in an in-house lab (or even perhaps an outsourced testing agency). Beta users won't do as thorough a job as a trained tester, but they'll catch gross errors of the "BackupBuster doesn't work on this brand of 'compatible' floppy tape drive" sort.

Beta programs are also useful for building word of mouth advertising, getting "first glance" reviews in magazines, supporting third-party vendors who will build their product on top of yours, and so on. Those are properly marketing activities, not testing.

Planning and replanning in support of the role of testing

Each of the types of testing described above, including functional testing, reduces uncertainty about a particular aspect of the product. When done, you have confidence that some functional areas are less buggy, others more. The product either usually works on new configurations, or it doesn't.

There's a natural tendency toward finishing one testing task before moving on to the next, but that may lead you to discover bad news too late. It's better to know something about all areas than everything about a few. When you've discovered where the problem areas lie, you can test them to greater depth as a way of helping the developers raise the quality by finding the important bugs.

Strictly, I've been over-simplistic in describing testing's role as reducing uncertainty. It would be better to say "risk-weighted uncertainty". Some areas in the product are riskier than others, perhaps because they're used by more customers or because failures in that area would be particularly severe. Riskier areas require more certainty. Failing to correctly identify risky areas is a common mistake, and it leads to misallocated testing effort. There are two sound approaches for identifying risky areas:

Ask everyone you can for their opinion. Gather data from developers, marketers, technical writers, customer support people, and whatever customer representatives you can find. See [Kaner96a] for a good description of this kind of collaborative test planning.
Use historical data. Analyzing bug reports from past products (especially those from customers, but also internal bug reports) helps tell you what areas to explore in this project.

"So, winter's early this year. We're still going to invade Russia."

Good testers are systematic and organized, yet they are exposed to all the chaos and twists and turns and changes of plan typical of a software development project. In fact, the chaos is magnified by the time it gets to testers, because of their position at the end of the food chain and typically low status. One unfortunate reaction is sticking stubbornly to the test plan. Emotionally, this can be very satisfying: "They can flail around however they like, but I'm going to hunker down and do my job." The problem is that your job is not to write tests. It's to find the bugs that matter in the areas of greatest uncertainty and risk, and ignoring changes in the reality of the product and project can mean that your testing becomes irrelevant.

That's not to say that testers should jump to readjust all their plans whenever there's a shift in the wind, but my experience is that more testers let their plans fossilize than overreact to project change.

Theme Three: Personnel Issues

Fresh out of college, I got my first job as a tester. I had been hired as a developer, and knew nothing about testing, but, as they said, "we don't know enough about you yet, so we'll put you somewhere where you can't do too much damage". In due course, I "graduated" to development.

Using testing as a transitional job for new programmers is one of the two classic mistaken ways to staff a testing organization. It has some virtues. One is that you really can keep bad hires away from the code. A bozo in testing is often less dangerous than a bozo in development. Another is that the developer may learn something about testing that will be useful later. (In my case, it founded a career.) And it's a way for the new hire to learn the product while still doing some useful work.

The advantages are outweighed by the disadvantage: the new hire can't wait to get out of testing. That's hardly conducive to good work. You could argue that the testers have to do good work to get "paroled". Unfortunately, because people tend to be as impressed by effort as by results, vigorous activity - especially activity that establishes credentials as a programmer - becomes the way out. As a result, the fledgling tester does things like become the expert in the local programmable editor or complicated freeware tool. That, at least, is a potentially useful role, though it has nothing to do with testing. More dangerous is vigorous but misdirected testing activity; namely, test automation. (See the last theme.)

Even if novice testers were well guided, having so much of the testing staff be transients could only work if testing were a shallow algorithmic discipline. In fact, good testers require deep knowledge and experience.

The second classic mistake is recruiting testers from the ranks of failed programmers. There are plenty of good testers who are not good programmers, but a bad programmer likely has some work habits that will make him a bad tester, too. For example, someone who makes lots of bugs because he's inattentive to detail will miss lots of bugs for the same reason.

So how should the testing team be staffed? If you're willing to be part of the training department, go ahead and accept new programmer hires. Accept as applicants programmers who you suspect are rejects (some fraction of them really have gotten tired of programming and want a change) but interview them as you would an outside hire. When interviewing, concentrate less on formal qualifications than on intelligence and the character of the candidate's thought. A good tester has these qualities:

methodical and systematic.
tactful and diplomatic (but firm when necessary).
skeptical, especially about assumptions, and wants to see concrete evidence.
able to notice and pursue odd details.
good written and verbal skills (for explaining bugs clearly and concisely).
a knack for anticipating what others are likely to misunderstand. (This is useful both in finding bugs and writing bug reports.)
a willingness to get one's hands dirty, to experiment, to try something to see what happens.

Be especially careful to avoid the trap of testers who are not domain experts. Too often, the tester of an accounting package knows little about accounting. Consequently, she finds bugs that are unimportant to accountants and misses ones that are important. Further, she writes bug reports that make serious bugs seem irrelevant. A programmer may not see past the unrepresentative test to the underlying important problem. (See the discussion of reporting bugs in the next theme.)

Domain experts may be hard to find. Try to find a few. And hire testers who are quick studies and are good at understanding other people's work patterns.

Two groups of people are readily at hand and often have those skills. But testing teams often do not seek out applicants from the customer service staff or the technical writing staff. The people who field email or phone problem reports develop, if they're good, a sense of what matters to the customer (at least to the vocal customer) and the best are very quick on their mental feet.

Like testers, technical writers often also lack detailed domain knowledge. However, they're in the business of translating a product's behavior into terms that make sense to a user. Good technical writers develop a sense of what's important, what's confusing, and so on. Those areas that are hard to explain are often fruitful sources of bugs. (What confuses the user often also confuses the programmer.)

One reason these two groups are not tapped is an insistence that testers be able to program. Programming skill brings with it certain advantages in bug hunting. A programmer is more likely than an accountant to find the number 2,147,483,648 interesting. (It is one more than the largest value a 32-bit signed integer can hold, so it overflows on most machines.) But such tricks of the trade are easily learned by competent non-programmers, so not having them is a weak reason for turning someone down.
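As a small illustration of why that number is interesting, the sketch below uses Python's ctypes.c_int32 to mimic a field stored as a 32-bit signed integer. The helper name to_int32 is made up for this example, and a real product may of course store its numbers differently.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest 32-bit signed value


def to_int32(value: int) -> int:
    """Store value in a 32-bit signed integer and read it back (hypothetical helper)."""
    # ctypes integer types do no overflow checking; the value simply wraps.
    return ctypes.c_int32(value).value


print(to_int32(INT32_MAX))      # still fits
print(to_int32(INT32_MAX + 1))  # 2,147,483,648 wraps around to a negative number
```

A tester who knows this trick will try balances, counts, and identifiers near that boundary; it takes a few minutes to learn.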

If you hire according to these guidelines, you will avoid a testing team that lacks diversity. All of the members will lack some skills, but the team as a whole will have them all. Over time, in a team with mutual respect, the non-programmers will pick up essential tidbits of programming knowledge, the programmers will pick up domain knowledge, and the people with a writing background will teach the others how to deconstruct documents.

All testers - but non-programmers especially - will be hampered by a physical separation between developers and testers. A smooth working relationship between developers and testers is essential to efficient testing. Too much valuable information is unwritten; the tester finds it by talking to developers. Developers and testers must often work together in debugging; that's much harder to do remotely. Developers often dismiss bug reports too readily, but it's harder to do that to a tester you eat lunch with.

Remote testing can be made to work - I've done it - but you have to be careful. Budget money for frequent working visits, and pay attention to interpersonal issues.

Some believe that programmers can't test their own code. On the face of it, this is false: programmers test their code all the time, and they do find bugs. Just not enough of them, which is why we need independent testers.

But if independent testers are testing, and programmers are testing (and inspecting), isn't there a potential duplication of effort? And isn't that wasteful? I think the answer is yes. Ideally, programmers would concentrate on the types of bugs they can find adequately well, and independent testers would concentrate on the rest.

The bugs programmers can find well are those where their code does not do what they intended. For example, a reasonably trained, reasonably motivated programmer can do a perfectly fine job finding boundary conditions and checking whether each known equivalence class is handled. What programmers do poorly is discovering overlooked special cases (especially error cases), bugs due to the interaction of their code with other people's code (including system-wide properties like deadlocks and performance problems), and usability problems.

Crudely put, good programmers do functional testing, and testers should do everything else. Recall that I earlier claimed an over-concentration on functional testing is a classic mistake. Decent programmer testing magnifies the damage it does.

Of course, decent programmer testing is relatively rare, because programmers are neither trained nor motivated to test. This is changing, gradually, as companies realize it's cheaper to have bugs found and fixed quickly by one person, instead of more slowly by two. Until then, testers must do both the testing that programmers can do and the testing only testers can do, but must take care not to let functional testing squeeze out the rest.

Theme Four: The Tester At Work

When testing, you must decide how to exercise the program, then do it. The doing is ever so much more interesting than the deciding. A tester's itch to start breaking the program is as strong as a programmer's itch to start writing code - and it has the same effect: design work is skimped, and quality suffers. Paying more attention to running tests than to designing them is a classic mistake. A tester who is not systematic, who does not spend time laying out the possibilities in advance, will overlook special cases. They may be the same subtle ones that the programmers overlooked.

Concentration on execution also results in unreviewed test designs. Just like programmers, testers can benefit from a second pair of eyes. Reviews of test designs needn't be as elaborate as product design reviews, but a short check of the testing approach and the resulting tests can find significant omissions at low cost.

What is a test design?

A test design should contain a description of the setup (including machine configuration for a configuration test), inputs given to the product, and a description of expected results. One common mistake is being too specific about test inputs and procedures.

Let's assume manual test implementation for the moment. A related argument for automated tests will be discussed in the next section. Suppose you're testing a banking application. Here are two possible test designs:

Design 1

Setup: initialize the balance in account 12 with $100.

Procedure:

Start the program.
Type 12 in the Account window.
Press OK.
Click on the 'Withdraw' toolbar button.
In the withdraw popup dialog, click on the 'all' button.
Press OK.
Expect to see a confirmation popup that says "You are about to withdraw all the money from this account. Continue?"
Press OK.
Expect to see a 0 balance in the account window.
Separately query the database to check that the zero balance has been posted.
Exit the program with File->Exit.

Design 2

Setup: initialize the balance with a positive value.

Procedure:

Start the program on that account.
Withdraw all the money from the account using the 'all' button.
It's an error if the transaction happens without a confirmation popup.
Immediately thereafter:
- Expect a $0 balance to be displayed.
- Independently query the database to check that the zero balance has been posted.

The first design style has these advantages:

The test will always be run the same way. You are more likely to be able to reproduce the bug. So will the programmer.
It details all the important expected results to check. Imprecise expected results make failures harder to notice. For example, a tester using the second style would find it easier to overlook a spelling error in the confirmation popup, or even that it was the wrong popup.
Unlike the second style, you always know exactly what you've tested. In the second style, you couldn't be sure that you'd ever gotten to the Withdraw dialog via the toolbar. Maybe the menu was always used. Maybe the toolbar button doesn't work at all!
By spelling out all inputs, the first style prevents testers from carelessly overusing simple values. For example, a tester might always test accounts with $100, rather than using a variety of small and large balances. (Either style should include explicit tests for boundary and special values.)

However, there are also some disadvantages:

The first style is more expensive to create.
The inevitable minor changes to the user interface will break it, so it's more expensive to maintain.
Because each run of the test is exactly the same, there's no chance that a variation in procedure will stumble across a bug.
It's hard for testers to follow a procedure exactly. When one makes a mistake - pushes the wrong button, for example - will she really start over?

On balance, I believe the negatives often outweigh the positives, provided there is a separate testing task to check that all the menu items and toolbar buttons are hooked up. (Not only is a separate task more efficient, it's less error-prone. You're less likely to accidentally omit some buttons.)

I do not mean to suggest that test cases should not be rigorous, only that they should be no more rigorous than is justified, and that we testers sometimes err on the side of uneconomical detail.

Detail in the expected results is less problematic than in the test procedure, but too much detail can focus the tester's attention too much on checking against the script he's following. That might encourage another classic mistake: not noticing and exploring "irrelevant" oddities. Good testers are masters at noticing "something funny" and acting on it. Perhaps there's a brief flicker in some toolbar button which, when investigated, reveals a crash. Perhaps an operation takes an oddly long time, which suggests to the attentive tester that increasing the size of an "irrelevant" dataset might cause the program to slow to a crawl. Good testing is a combination of following a script and using it as a jumping-off point for an exploration of the product.

An important special case of overlooking bugs is checking that the product does what it's supposed to do, but not that it doesn't do what it isn't supposed to do. As an example, suppose you have a program that updates a health care service's database of family records. A test adds a second child to Dawn Marick's record. Almost all testers would check that, after the update, Dawn now has two children. Some testers - those who are clever, experienced, or subject matter experts - would check that Dawn Marick's spouse, Brian Marick, also now has two children. Relatively few testers would check that no one else in the database has had a child added. They would miss a bug where the programmer over-generalized and assumed that all "family information" updates should be applied both to a patient and to all members of her family, giving Paul Marick (aged 2) a child.

Ideally, every test should check that all data that should be modified has been modified and that all other data has been unchanged. With forethought, that can be built into automated tests. Complete checking may be impractical for manual tests, but occasional quick scans for data that might be corrupted can be valuable.
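One way to build that check into an automated test is to snapshot the data before the update and compare afterwards. The following is only a sketch under stated assumptions: the in-memory dictionary stands in for the database, the record layout and the child counts are invented, "Some Other Patient" is a made-up record, and add_child deliberately contains the over-generalization bug described above.

```python
import copy


def add_child(db, patient):
    """Deliberately buggy update: applies the change to the whole family."""
    family = db[patient]["family"]
    for record in db.values():
        if record["family"] == family:
            record["children"] += 1


def changed_records(before, after):
    """Return the names of all records that differ between two snapshots."""
    return sorted(name for name in after if after[name] != before.get(name))


# Hypothetical data; only the names of the Marick family come from the text.
db = {
    "Dawn Marick": {"family": "Marick", "children": 1},
    "Brian Marick": {"family": "Marick", "children": 1},
    "Paul Marick": {"family": "Marick", "children": 0},
    "Some Other Patient": {"family": "Other", "children": 3},
}

before = copy.deepcopy(db)
add_child(db, "Dawn Marick")

# Everything outside the expected set should be untouched.
unexpected = set(changed_records(before, db)) - {"Dawn Marick", "Brian Marick"}
print("Unexpectedly changed:", sorted(unexpected))
```

The test states which records are allowed to change and treats any other difference as a failure, so the two-year-old who acquires a child is reported without anyone having thought of that case in advance.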

Testing should not be isolated work

Here's another version of the test we've been discussing:

Design 3

Withdraw all with confirmation and normal check for 0.

That means the same thing as Design 2 - but only to the original author. Test suites that are understandable only by their owners are ubiquitous. They cause many problems when their owners leave the company; sometimes many months' worth of work has to be thrown out.

I should note that designs as detailed as Designs 1 or 2 often suffer a similar problem. Although they can be run by anyone, not everyone can update them when the product's interface changes. Because the tests do not list their purposes explicitly, updates can easily make them test a little less than they used to. (Consider, for example, a suite of tests in the Design 1 style: how hard will it be to make sure that all the user interface controls are touched in the revised tests? Will the tester even know that's a goal of the suite?) Over time, this leads to what I call "test suite decay," in which a suite full of tests runs but no longer tests much of anything at all.

Another classic mistake involves the boundary between the tester and programmer. Some products are mostly user interface; everything they do is visible on the screen. Other products are mostly internals; the user interface is a "thin pipe" that shows little of what happens inside. The problem is that testing has to use that thin pipe to discover failures. What if complicated internal processing produces only a "yes or no" answer? Any given test case could trigger many internal faults that, through sheer bad luck, don't produce the wrong answer.

In such situations, testers sometimes rely solely on programmer ("unit") testing. In cases where that's not enough, testing only through the user-visible interface is a mistake. It is far better to get the programmers to add "testability hooks" or "testpoints" that reveal selected internal state. In essence, they convert a product like this:





to one like this:



It is often difficult to convince programmers to add test support code to the product. (Actual quote: "I don't want to clutter up my code with testing crud.") Persevere, start modestly, and take advantage of these facts:

The test support code is often a simple extension of the debugging support code programmers write anyway.
A small amount of test support code often goes a long way.

A common objection to this approach is that the test support code must be compiled out of the final product (to avoid slowing it down). If so, tests that use the testing interface "aren't testing what we ship". It is true that some of the tests won't run on the final version, so you may miss bugs. But, without testability code, you'll miss bugs that don't reveal themselves through the user interface. It's a risk tradeoff, and I believe that adding test support code usually wins. See [Marick95], chapter 13, for more details.
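To make the idea of a testpoint concrete, here is a minimal sketch of a "thin pipe" product whose only visible output is a yes-or-no answer. Everything in it is hypothetical: the LoanDecider class, its 0.5 threshold, and the testpoints dictionary are invented for illustration and are not taken from any real product or from [Marick95].

```python
class LoanDecider:
    """Hypothetical 'thin pipe' product: lots of internals, a yes/no answer."""

    def __init__(self, testpoints_enabled=False):
        self.testpoints_enabled = testpoints_enabled
        self.testpoints = {}  # internal state revealed to tests only

    def _record(self, name, value):
        # Test support code: does nothing unless testpoints are enabled.
        if self.testpoints_enabled:
            self.testpoints[name] = value

    def approve(self, income, debt):
        ratio = debt / income
        self._record("debt_ratio", ratio)
        risk_band = "high" if ratio > 0.5 else "low"
        self._record("risk_band", risk_band)
        return risk_band == "low"


decider = LoanDecider(testpoints_enabled=True)
answer = decider.approve(income=50_000, debt=10_000)
print(answer, decider.testpoints)
```

With the testpoints switched on, a test can check the intermediate ratio and risk band as well as the final answer, so a wrong intermediate value that happens to give the right yes or no is no longer invisible. With them switched off, the recording call costs one comparison, which is the kind of small overhead the objection above is about.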

In one case, there's an alternative to having the programmer add code to the product: have a tool do it. Commercial tools like Purify, BoundsChecker, and Sentinel automatically add code that checks for certain classes of failures (such as memory leaks). They provide a narrow, specialized testing interface. For marketing reasons, these tools are sold as programmer debugging tools, but they're equally test support tools, and I'm amazed that testing groups don't use them as a matter of course.

Testability problems are exacerbated in distributed systems like conventional client/server systems, multi-tiered client/server systems, Java applets that provide smart front-ends to web sites, and so forth. Too often, tests of such systems amount to shallow tests of the user interface component because that's the only component that the tester can easily control.

Finding failures is only the start

It's not enough to find a failure; you must also report it. Unfortunately, poor bug reporting is a classic mistake. Tester bug reports suffer from five major problems:

They do not describe how to reproduce the bug. Either no procedure is given, or the given procedure doesn't work. Either case will likely get the bug report shelved.
They don't explain what went wrong. At what point in the procedure does the bug occur? What should happen there? What actually happened?
They are not persuasive about the priority of the bug. Your job is to have the seriousness of the bug accurately assessed. There's a natural tendency for programmers and managers to rate bugs as less serious than they are. If you believe a bug is serious, explain why a customer would view it the way you do. If you found the bug with an odd case, take the time to reproduce it with a more obviously common or compelling case.
They do not help the programmer in debugging. This is a simple cost/benefit tradeoff. A small amount of time spent simplifying the procedure for reproducing the bug or exploring the various ways it could occur may save a great deal of programmer time.
They are insulting, so they poison the relationship between developers and testers.

[Kaner93] has an excellent chapter (5) on how to write bug reports. Read it.

Not all bug reports come from testers. Some come from customers. When that happens, it's common for a tester to write a regression test that reproduces the bug in the broken version of the product. When the bug is fixed, that test is used to check that it was fixed correctly.

However, adding only regression tests is not enough. A customer bug report suggests two things:

That area of the product is buggy. It's well known that bugs tend to cluster.
That area of the product was inadequately tested. Otherwise, why did the bug originally escape testing?

An appropriate response to several customer bug reports in an area is to schedule more thorough testing for that area. Begin by examining the current tests (if they're understandable) to determine their systematic weaknesses.

Finally, every bug report is a gift from a customer that tells you how to test better in the future. A common mistake is failing to take notes for the next testing effort. The next product will be somewhat like this one, the bugs will be somewhat like these, and the tests useful in finding those bugs will also be somewhat like the ones you just ran. Mental notes are easy to forget, and they're hard to hand to a new tester. Writing is a wonderful human invention: use it. Both [Kaner93] and [Marick95] describe formats for archiving test information, and both contain general-purpose examples.

Theme Five: Technology Run Rampant

Test automation is based on a simple economic proposition:

If a manual test costs $X to run the first time, it will cost just about $X to run each time thereafter, whereas:
If an automated test costs $Y to create, it will cost almost nothing to run from then on.

$Y is bigger than $X. I've heard estimates ranging from 3 to 30 times as big, with the most commonly cited number seeming to be 10. Suppose 10 is correct for your application and your automation tools. Then you should automate any test that will be run more than 10 times.
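The break-even arithmetic can be written down directly. This is a sketch, not a costing model: the function name break_even_runs is invented, and the optional automated_run_cost parameter is an added assumption for cases where an automated run is not quite free.

```python
import math


def break_even_runs(manual_cost, automation_cost, automated_run_cost=0.0):
    """Smallest number of runs at which automating is cheaper than manual runs.

    Manual total after n runs:    n * manual_cost
    Automated total after n runs: automation_cost + n * automated_run_cost
    """
    saving_per_run = manual_cost - automated_run_cost
    if saving_per_run <= 0:
        return None  # automation never pays for itself
    # Automation wins once n * saving_per_run exceeds automation_cost.
    return math.floor(automation_cost / saving_per_run) + 1


# With $Y = 10 * $X the test must be run more than 10 times.
print(break_even_runs(manual_cost=1, automation_cost=10))
# The low and high estimates quoted above, 3 and 30 times.
print(break_even_runs(manual_cost=1, automation_cost=3))
print(break_even_runs(manual_cost=1, automation_cost=30))
```

The calculation ignores maintenance, which the rest of this theme argues is often the larger cost, so treat the result as a lower bound on the number of runs needed.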

A classic mistake is to ignore these economics, attempting to automate all tests, even those that won't be run often enough to justify it. What tests clearly justify automation?

Stress or load tests may be impossible to implement manually. Would you have a tester execute and check a function 1000 times? Are you going to sit 100 people down at 100 terminals?
Nightly builds are becoming increasingly common. (See [McConnell96] or [Cusumano95] for descriptions of the procedure.) If you build the product nightly, you must have an automated "smoke test suite". Smoke tests are those that are run after every build to check for grievous errors.
Configuration tests may be run on dozens of configurations.

The other kinds of tests are less clear-cut. Think hard about whether you'd rather have automated tests that are run often or ten times as many manual tests, each run once. Beware of irrational, emotional reasons for automating, such as testers who find programming automated tests more fun, a perception that automated tests will lead to higher status (everything else is "monkey testing"), or a fear of not rerunning a test that would have found a bug (thus leading you to automate it, leaving you without enough time to write a test that would have found a different bug).

You will likely end up in a compromise position, where you have:

a set of automated tests that are run often.
a well-documented set of manual tests. Subsets of these can be rerun as necessary. For example, when a critical area of the system has been extensively changed, you might rerun its manual tests. You might run different samples of this suite after each major build.
a set of undocumented tests that were run once (including exploratory "bug bash" tests).

Beware of expecting to rerun all manual tests. You will become bogged down rerunning tests with low bug-finding value, leaving yourself no time to create new tests. You will waste time documenting tests that don't need to be documented.

You could automate more tests if you could lower the cost of creating them. That's the promise of using GUI capture/replay tools to reduce test creation cost. The notion is that you simply execute a manual test, and the tool records what you do. When you manually check the correctness of a value, the tool remembers that correct value. You can then later play back the recording, and the tool will check whether all checked values are the same as the remembered values.

There are two variants of such tools. What I call the first generation tools capture raw mouse movements or keystrokes and take snapshots of the pixels on the screen. The second generation tools (often called "object oriented") reach into the program and manipulate underlying data structures (widgets or controls).

First generation tools produce unmaintainable tests. Whenever the screen layout changes in the slightest way, the tests break. Mouse clicks are delivered to the wrong place, and snapshots fail in irrelevant ways that nevertheless have to be checked. Because screen layout changes are common, the constant manual updating of tests becomes insupportable.

Second generation tools are applicable only to tests where the underlying data structures are useful. For example, they rarely apply to a photograph editing tool, where you need to look at an actual image - at the actual bitmap. They also tend not to work with custom controls. Heavy users of capture/replay tools seem to spend an inordinate amount of time trying to get the tool to deal with the special features of their program - which raises the cost of test automation.

Second generation tools do not guarantee maintainability either. Suppose a radio button is changed to a pulldown list. All of the tests that use the old controls will now be broken.

GUI changes are of course common, especially between releases. Consider carefully whether an automated test that must be recaptured after GUI changes is worth having. Keep in mind that it can be hard to figure out what a captured test is attempting to accomplish unless it is separately documented.

As a rule of thumb, it's dangerous to assume that an automated test will pay for itself this release, so your test must be able to survive a reasonable level of GUI change. I believe that capture/replay tests, of either generation, are rarely robust enough.

An alternative approach to capture/replay is scripting tests. (Most GUI capture/replay tools also allow scripting.) Some member of the testing team writes a "test API" (application programmer interface) that lets other members of the team express their tests in less GUI-dependent terms. Whereas a captured test might look like this:

text $main.accountField "12"
click $main.OK
menu $operations
menu $withdraw
click $withdrawDialog.all
...

a script might look like this:

select-account 12
withdraw all
...

The script commands are subroutines that perform the appropriate mouse clicks and key presses. If the API is well-designed, most GUI changes will require changes only to the implementation of functions like withdraw, not to all the tests that use them. Please note that well-designed test APIs are as hard to write as any other good API. That is, they're hard, and you shouldn't expect to get it right the first time.
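A test API of this kind might be sketched as follows. The class names RecordingGui and BankTestApi are hypothetical, and RecordingGui is only a stand-in that records low-level actions; a real implementation would call whatever GUI automation tool the team uses. The widget names are borrowed from the captured test shown earlier.

```python
class RecordingGui:
    """Stand-in for a GUI automation tool; it only records low-level actions."""

    def __init__(self):
        self.actions = []

    def text(self, widget, value):
        self.actions.append(("text", widget, value))

    def click(self, widget):
        self.actions.append(("click", widget))

    def menu(self, name):
        self.actions.append(("menu", name))


class BankTestApi:
    """Hypothetical test API: the only layer that knows widget names."""

    def __init__(self, gui):
        self.gui = gui

    def select_account(self, number):
        self.gui.text("main.accountField", str(number))
        self.gui.click("main.OK")

    def withdraw_all(self):
        # If the Withdraw dialog moves, only this method needs to change.
        self.gui.menu("operations")
        self.gui.menu("withdraw")
        self.gui.click("withdrawDialog.all")


gui = RecordingGui()
bank = BankTestApi(gui)
bank.select_account(12)
bank.withdraw_all()
for action in gui.actions:
    print(action)
```

The two calls on bank correspond to the two-line script, and the recorded actions correspond to the captured test. A change to the GUI is absorbed inside BankTestApi rather than in every test that withdraws money.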

In a variant of this approach, the tests are data-driven. The tester provides a table describing key values. Some tool reads the table and converts it to the appropriate mouse clicks. The table is even less vulnerable to GUI changes because the sequence of operations has been abstracted away. It's also likely to be more understandable, especially to domain experts who are not programmers. See [Pettichord96] for an example of data-driven automated testing.
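As a rough sketch of the data-driven style, the table below is plain CSV text and a small runner turns each row into an operation. The column names, the rows, and the apply_operation function are all invented for illustration; apply_operation computes a balance in memory where a real tool would drive the product's GUI. This is not the approach from [Pettichord96], only a generic instance of the idea.

```python
import csv
import io

# Hypothetical test table: one row per test case.
TABLE = """account,start_balance,operation,amount,expected_balance
12,100,withdraw,all,0
12,100,withdraw,30,70
7,5,deposit,20,25
"""


def apply_operation(balance, operation, amount):
    """Stand-in for the layer that would drive the real product."""
    if operation == "withdraw":
        return 0 if amount == "all" else balance - int(amount)
    if operation == "deposit":
        return balance + int(amount)
    raise ValueError(f"unknown operation: {operation}")


def run_table(table_text):
    """Run every row and return a list of (row number, passed) pairs."""
    results = []
    rows = csv.DictReader(io.StringIO(table_text))
    for number, row in enumerate(rows, start=1):
        actual = apply_operation(
            int(row["start_balance"]), row["operation"], row["amount"]
        )
        results.append((number, actual == int(row["expected_balance"])))
    return results


print(run_table(TABLE))
```

A domain expert can add a row without reading the runner, which is the main attraction of keeping the tests in a table.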

Note that these more abstract tests (whether scripted or data-driven) do not necessarily test the user interface thoroughly. If the Withdraw dialog can be reached via several routes (toolbar, menu item, hotkey), you don't know whether each route has been tried. You need a separate (most likely manual) effort to ensure that all the GUI components are connected correctly.

Whatever approach you take, don't fall into the trap of expecting regression tests to find a high proportion of new bugs. Regression tests discover that new or changed code breaks what used to work. While that happens more often than any of us would like, most bugs are in the product's new or intentionally changed behavior. Those bugs have to be caught by new tests.

Code coverage

GUI capture/replay testing is appealing because it's a quick fix for a difficult problem. Another class of tool has the same kind of attraction.

The difficult problem is that it's so hard to know if you're doing a good job testing. You only really find out once the product has shipped. Understandably, this makes managers uncomfortable. Sometimes you find them embracing code coverage with the devotion that only simple numbers can inspire. Testers sometimes also become enamored of coverage, though their romance tends to be less fervent and ends sooner.

What is code coverage? It is any of a number of measures of how thoroughly code is exercised. One common measure counts how many statements have been executed by any test. The appeal of such coverage is twofold:

1. If you've never exercised a line of code, you surely can't have found any of its bugs. So you should design tests to exercise every line of code.
2. Test suites are often too big, so you should throw out any test that doesn't add value. A test that adds no new coverage adds no value.
Only the first sentences in (1) and (2) are true. I'll illustrate with this picture, where the irregular splotches indicate bugs:



If you write only the tests needed to satisfy coverage, you'll find bugs. You're guaranteed to find the code that always fails, no matter how it's executed. But most bugs depend on how a line of code is executed. For example, code with an off-by-one error fails only when you exercise a boundary. Code with a divide-by-zero error fails only if you divide by zero. Coverage-adequate tests will find some of these bugs, by sheer dumb luck, but not enough of them. To find enough bugs, you have to write additional tests that "redundantly" execute the code.
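A small Python illustration of the point, using a made-up function `last_n` that is not from any real product: one test executes every statement and passes, yet the bug appears only when the same line is run with a boundary value.

```python
def last_n(items, n):
    """Intended to return the last n items of a list.

    Bug: when n == 0 the slice items[-0:] is items[0:], i.e. the whole list.
    """
    return items[-n:]


# This single call gives 100% statement coverage and the result is correct.
covering_result = last_n([1, 2, 3], 2)

# "Redundantly" executing the same line at the boundary exposes the bug:
# the intended result is an empty list, but the whole list comes back.
boundary_result = last_n([1, 2, 3], 0)
```

A coverage tool would report the second call as adding nothing, even though it is the only one of the two that finds the defect.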

For the same reason, removing tests from a regression test suite just because they don't add coverage is dangerous. The point is not to cover the code; it's to have tests that can discover enough of the bugs that are likely to be caused when the code is changed. Unless the tests are ineptly designed, removing tests will just remove power. If they are ineptly designed, using coverage converts a big and lousy test suite to a small and lousy test suite. That's progress, I suppose, but it's addressing the wrong problem.

A grave danger of code coverage is that it is concrete, objective, and easy to measure. Many managers today are using coverage as a performance goal for testers. Unfortunately, a cardinal rule of management applies here: "Tell me how a person is evaluated, and I'll tell you how he behaves." If a person is evaluated by how much coverage is achieved in a given time (or in how little time it takes to reach a particular coverage goal), that person will tend to write tests to achieve high coverage in the fastest way possible. Unfortunately, that means shortchanging careful test design that targets bugs, and it certainly means avoiding in-depth, repetitive testing of "already covered" code.

Using coverage as a test design technique works only when the testers are both designing poor tests and testing redundantly. They'd be better off at least targeting their poor tests at new areas of code. In more normal situations, coverage as a guide to design only decreases the value of the tests or puts testers under unproductive pressure to meet unhelpful goals.

Coverage does play a role in testing, not as a guide to test design, but as a rough evaluation of it. After you've run your tests, ask what their coverage is. If certain areas of the code have no or low coverage, you're sure to have tested them shallowly. If that wasn't intentional, you should improve the tests by rethinking their design. Coverage has told you where your tests are weak, but it's up to you to understand how.

You might not entirely ignore coverage. You might glance at the uncovered lines of code (possibly assisted by the programmer) to discover the kinds of tests you omitted. For example, you might scan the code to determine that you undertested a dialog box's error handling. Having done that, you step back and think of all the user errors the dialog box should handle, not how to provoke the error checks on lines 343, 354, and 399. By rethinking design, you'll not only execute those lines, you might also discover that several other error checks are entirely missing. (Coverage can't tell you how well you would have exercised needed code that was left out of the program.)

There are types of coverage that point more directly to design mistakes than statement coverage does (branch coverage, for example). However, none - and not all of them put together - are so accurate that they can be used as test design techniques.

One final note: Romances with coverage don't seem to end with the former devotee wanting to be "just good friends". When, at the end of a year's use of coverage, it has not solved the testing problem, I find testing groups abandoning coverage entirely. That's a shame. When I test, I spend somewhat less than 5% of my time looking at coverage results, rethinking my test design, and writing some new tests to correct my mistakes. It's time well spent.


Some Classic Testing Mistakes

The role of testing

Thinking the testing team is responsible for assuring quality.
Thinking that the purpose of testing is to find bugs.
Not finding the important bugs.
Not reporting usability problems.
No focus on an estimate of quality (and on the quality of that estimate).
Reporting bug data without putting it into context.
Starting testing too late (bug detection, not bug reduction).

Planning the complete testing effort

A testing effort biased toward functional testing.
Underemphasizing configuration testing.
Putting stress and load testing off to the last minute.
Not testing the documentation.
Not testing installation procedures.
An overreliance on beta testing.
Finishing one testing task before moving on to the next.
Failing to correctly identify risky areas.
Sticking stubbornly to the test plan.

Personnel issues

Using testing as a transitional job for new programmers.
Recruiting testers from the ranks of failed programmers.
Testers are not domain experts.
Not seeking candidates from the customer service staff or technical writing staff.
Insisting that testers be able to program.
A testing team that lacks diversity.
A physical separation between developers and testers.
Believing that programmers can't test their own code.
Programmers are neither trained nor motivated to test.

The tester at work

Paying more attention to running tests than to designing them.
Unreviewed test designs.
Being too specific about test inputs and procedures.
Not noticing and exploring "irrelevant" oddities.
Checking that the product does what it's supposed to do, but not that it doesn't do what it isn't supposed to do.
Test suites that are understandable only by their owners.
Testing only through the user-visible interface.
Poor bug reporting.
Adding only regression tests when bugs are found.
Failing to take notes for the next testing effort.

Test automation

Attempting to automate all tests.
Expecting to rerun manual tests.
Using GUI capture/replay tools to reduce test creation cost.
Expecting regression tests to find a high proportion of new bugs.

Code coverage

Embracing code coverage with the devotion that only simple numbers can inspire.
Removing tests from a regression test suite just because they don't add coverage.
Using coverage as a performance goal for testers.
Abandoning coverage entirely.

Plan–Do–Check–Act Cycle

Deming Cycle, Shewhart Cycle

Description

The plan–do–check–act (PDCA) cycle consists of four steps to follow for improvement or for making changes. Just as a circle has no end, the PDCA cycle should be repeated again and again for continuous improvement.

When to Use

• When starting a new improvement project

• When stuck moving from one phase to another of a project

• To plan data collection and analysis in order to verify and prioritize problems or root causes

• When implementing a solution

• When reviewing your improvement process for what you learned

Procedure

Plan: Recognize an opportunity and plan the change.

Do: Test the change. Carry out a small-scale study.

Check: Review the test, analyze the results, and identify what you have learned.

Act: Take action based on what you learned in the check step. If you were successful, incorporate the lessons from the test into wider changes. If the change did not work, go through the cycle again with a different plan.

CAST

Computer Aided Software Testing.

Code Complete

Phase of development where functionality is implemented in entirety; bug fixes are all that are left. All functions found in the Functional Specifications have been implemented.

Code Coverage

An analysis method that determines which parts of the software have been executed (covered) by the test case suite and which parts have not been executed and therefore may require additional attention.

Code Inspection

A formal testing technique where the programmer reviews source code with a group who ask questions that analyze the program logic, check the code against a checklist of historically common programming errors, and assess its compliance with coding standards.

Code Walkthrough

A formal testing technique where source code is traced by a group with a small set of test cases, while the state of program variables is manually monitored, to analyze the programmer's logic and assumptions.

Sunday, March 7, 2010

Requirements Stability Index

A requirement stability index (RSI) is a metric used to organize, control, and track changes to the originally specified requirements for a new system project or product. Typically, a project begins, after consultation with customers or clients and research into their needs, with the creation of a requirements document. The document expresses what the customer or client needs and expects and, at least implicitly, what the developer will provide. The client or customer representative group reviews the document and, if in agreement with its specifications, signs it. This process (called signing off) is intended to ensure that customer representatives or clients have agreed - in writing - on the specifics involved.

Almost inevitably, however, once the design and development process is underway, customers or clients think of changes or embellishments they would like, a phenomenon known as requirement creep. An important part of project management, requirements management has become more challenging with the faster pace of technology. The RSI gives developers a means of continuing to document requirements as they change throughout the development process, and to monitor deviations from those originally specified.

Thanks for your time!

Thursday, March 4, 2010

Requirements Specification

A requirements specification, in the broadest sense, is an agreement between the customers and the organization regarding what must be built to satisfy user needs. The term is often used to encapsulate various types of requirements documents and specification documents, and that is fine as long as everyone is clear on what is and what is not contained within the term. The bottom line is that you have to determine whether there is an operational reason to make a distinction. In other words, if you will respond differently to something that is a "specification" than you would to something that is a "requirement", and if that differing response is warranted, then you should use two terms.

Thanks for your time!

Wednesday, March 3, 2010

Risk-Driven Testing

How do we find the most important bugs first?

How can the concept of Risk make this happen?

What components should be used for assigning Risk?


Whenever there's too much to do and not enough time to do it, we have to prioritize so that at least the most important things get done. In testing, there's never enough time or resources. In addition, the consequences of skipping something important are severe. So prioritization has received a lot of attention. The approach is called Risk-Driven Testing, where Risk has a very specific meaning.

Take the pieces of your system, whatever you use (modules, functions, sections of the requirements), and rate each piece on two variables: Impact and Likelihood.

Impact is what would happen if this piece somehow malfunctioned. Would it destroy the customer database? Or would it just mean that the column headings in a report didn't quite line up?

Likelihood is an estimate of how probable it is that this piece would fail.
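As a rough sketch of how the two components might be put to work, the Python below ranks some pieces of a system by risk. The text above names Impact and Likelihood but does not say how to combine them; multiplying two ratings on a 1-to-5 scale is an assumption made here for illustration, and the piece names and ratings are invented.

```python
# Each entry: (piece of the system, impact 1-5, likelihood 1-5).
# All names and ratings are made-up examples.
pieces = [
    ("customer database update", 5, 2),
    ("report column headings", 1, 4),
    ("login", 4, 4),
]


def prioritize(pieces):
    """Return (name, risk) pairs, highest risk first.

    Assumption: risk = impact * likelihood.
    """
    scored = [(name, impact * likelihood) for name, impact, likelihood in pieces]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = prioritize(pieces)
```

Testing would then start with the pieces at the top of `ranked` and work downward for as long as time allows.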

In the Days of Darkness, there are reasons to Smile

“The darkest hour is just before the dawn” means that there is hope, even in the worst of circumstances.

The following extract from Shakespeare's "The Tempest" illustrates this statement:

"O, a cherubin
Thou wast that did preserve me. Thou didst smile,
Infused with a fortitude from heaven,
When I have decked the sea with drops full salt,
Under my burden groaned; which raised in me
An undergoing stomach, to bear up
Against what should ensue."

- Prospero, THE TEMPEST, WILLIAM SHAKESPEARE.

Oh, you were an angel
That saved me! You smiled,
Instilled with strength from heaven,
When I have covered the sea with salty tears,
Groaning under my load, which raised in me
A new strength, to put up with
Whatever should follow.

This extract shows how a father who was in a very bad state of mind was able to fight all odds and succeed just because of his child's smile, which gave him the inspiration, strength, courage and determination to face all the difficult circumstances in life. Some people see a crushing workload as an exciting challenge, while others take a scary path into uncharted territory as an adventure. Maybe they know that at least a part of the solution is in how one perceives it, whether as a problem, a challenge or an adventure. It all lies in one's mindset. Without exception, every human being has the ability to transform any weakness or suffering into strength, power, peace, health, and abundance.

Every day is a new day, with a new challenge to face. Some may be good and some bad, but both are essential in life. Good days give us happiness and bad days give us experience. The soul would have no rainbow had the eyes no tears... Yes, in times of darkness there is a reason to smile, because there is always another chance in life. It's never over. Ever wondered what keeps us going in such awful times? It's HOPE.

No matter what comes your way, believe in yourself and never let the flame of Hope die!

My thoughts on "Reasons to Smile in the days of Darkness"

Fire in Building killed 9!

Terrorism resulted in massive destruction!

Earthquake destroyed the whole of the village.

Thinning of the green layer in India has increased pollution, changes in weather and the threat to natural resources. Difficult days ahead!

Reading the above headlines, for a moment, I lost interest in life.

I am tired of doom and gloom headlines...

Count your blessings, we're told, but it’s just not in our nature. We'd rather count our problems. Our species survived by reacting instantly to threats, and the ancient humans who stopped to smell the roses made easier targets for predators.

Today, the predators are mostly gone, but we’re still so primed to pay attention to bad news that we tend to ignore what’s going well. As soon as we solve one problem, we take the progress for granted and find a new cause for alarm. Every now and again it doesn’t hurt to take stock of just how good we have it. Take Darkness, for instance.

I do not know how Darkness became a synonym for the miseries of life. Perhaps authors and poets are to blame for using Darkness as a simile for difficulties and troubles. Even today, Darkness holds the same negative meaning and most of us are timid towards it.

I strongly disagree with this conventional understanding of Darkness! Any couple who has enjoyed a moonlit dinner, any mother who has fed her infant while showing the Moon, and any person who has enjoyed nature’s amazing work of art in the clear sky will surely join me in this.

Darkness, like Brightness, is an integral part of Nature. Just imagine how the world would be without Darkness! It would certainly be an ugly creation, like having a single nostril for inhaling only and no way to exhale...

Sunset welcomes Darkness, the Darkness for which every living species waits in order to rest. This relaxation rejuvenates our mind and body for the next day! The calmness of this Darkness will warm you up to smile for the next day.

The Moon, stars and other galaxies excel in this Darkness, carving a broad SMILE on all of us around the world! A cool breeze and the blooming of a few flowers like Queen of Bethlehem add value to the Darkness.

Ultimately, Brightness and Darkness are God’s bequest for all species to act and relax!

Only one night in the month may be devoid of the Moon and stars. So what? A beautiful smile of yours, with shining teeth, will brighten it up!

If Darkness still seems a negative factor to you, I shall travel with you for a while.

O.K., Darkness may stand for the miseries of life, but you can still smile to motivate yourself along the path. How to lead a life all depends on the individual's mindset. Look at the Lotus: she smiles all through her short span of life in the midst of the sludge.

To live happier, more fulfilling lives, when we encounter difficult circumstances we must keep shifting our perspective and continually ask ourselves, “Is there a wiser, more enlightened way of looking at this seemingly negative situation?”

One of the great physicists is reported to have said that we live on a minor planet of a very average star located within the outer limits of one of a hundred thousand million galaxies. How is that for a shift in perspective? Given this information, are your troubles really that big? Are the problems you have experienced or the challenges you might currently be facing really as serious as you have made them out to be?

We walk this planet for such a short time. In the overall scheme of things, our lives are mere blips on the canvas of eternity. So have the wisdom to enjoy the journey and savor the Process.