
Software Quality Assurance Fundamentals

1. The document discusses software quality assurance (SQA), which includes auditing quality management systems and standards like ISO 9000 to ensure quality across the entire software development process, including design, coding, testing, and more. SQA controls processes, while quality control controls products. 2. It describes differences between assuring quality in software versus manufactured products, as software is intangible and evolving rather than a finished physical product. The quality processes for software must be as fluid and adaptable as the defects they aim to prevent. 3. The document outlines the tasks, goals, and objectives of quality engineering to ensure necessary quality levels are met, quality is predictable and risks are minimized, and what quality means for

Uploaded by

Sheik Yousuf
Copyright
© All Rights Reserved

UNIT I FUNDAMENTALS OF SOFTWARE QUALITY ASSURANCE

The Role of SQA, SQA Plan, SQA considerations, SQA people, Quality
Management, Software Configuration Management.
Software Quality Assurance (SQA) consists of a means of monitoring the software engineering
processes and methods used to ensure quality. It does this by means of audits of the quality
management system under which the software system is created. These audits are backed by one
or more standards, usually ISO 9000.
SQA is distinct from software quality control, which includes reviewing requirements documents
and testing software. SQA encompasses the entire software development process, which includes
processes such as software design, coding, source code control, code reviews, change
management, configuration management, and release management. Whereas software quality
control is a control of products, software quality assurance is a control of processes.
Software quality assurance is related to the practice of quality assurance in product
manufacturing. There are, however, some notable differences between software and a
manufactured product. These differences stem from the fact that the manufactured product is
physical and can be seen whereas the software product is not visible. Therefore its function,
benefit and costs are not as easily measured. What's more, when a manufactured product rolls off
the assembly line, it is essentially a complete, finished product, whereas software is never
finished. Software lives, grows, evolves, and metamorphoses, unlike its tangible counterparts.
Therefore, the processes and methods to manage, monitor, and measure its ongoing quality are as
fluid and sometimes elusive as are the defects that they are meant to keep in check.
Tasks
Quality Engineering
The activity consisting of the cohesive collection of all tasks that are primarily performed to
ensure and help continually improve the quality of an endeavor's process and work products.
Goals
The typical goals of quality engineering are to:

Ensure that the necessary levels of quality are achieved.

Make the achievement of quality predictable and repeatable.

Minimize endeavor, organizational, and personal risks due to poor quality.

Objectives
The typical objectives of quality engineering are to:

Define what quality means on the endeavor in terms of a quality model defining quality
factors and quality subfactors.

Plan the quality tasks including helping the requirements team determine and specify the
quality requirements and associated quality factors (attributes) and quality metrics.

Assure the quality of the process used by the endeavor.
Thus, quality assurance is concerned with fulfilling the quality requirements and
achieving the quality factors of the endeavor's process.
Are we building the products right?

Control the quality of the work products delivered during the endeavor.
Thus, quality control is concerned with fulfilling the quality requirements and achieving
the quality factors of the endeavor's work products.
Are we building the right products?

Examples
Examples of quality engineering based on scope include:

Application Quality Engineering

Business Quality Engineering

Contact Center Quality Engineering

Data Center Quality Engineering

Preconditions
Quality engineering typically may begin when the following preconditions hold:

The endeavor is started.

The quality team is initially staffed and trained in quality engineering.

Completion Criteria
Quality engineering is typically complete when the following postconditions hold:

The endeavor is complete.

Tasks
The following diagram illustrates the relationships between the quality tasks:
[Figure: relationships between the quality tasks]
Plan
The purpose of this Software Quality Assurance Plan (SQAP) is to define the techniques,
procedures, and methodologies that will be used at the Center for Space Research (CSR) to
assure timely delivery of the software that meets specified requirements within project resources.
The use of this plan will help assure the following: (1) That software development, evaluation
and acceptance standards are developed, documented and followed. (2) That the results of
software quality review and audits will be given to appropriate management within CSR. This
provides feedback as to how well the development effort is conforming to various CSR
development standards. (3) That test results adhere to acceptance standards.
Teams
The SQA team shall check that quality is maintained during the project, that the proper
quality procedures are being followed, and that discovered problems are reported to Project
Management. The members of the project team must work according to the part(s) of the SQAP
that apply to their specific task.

The tasks of the SQA team


For the first phase of the project (UR), the SQA team must see to it that the following documents
are properly reviewed internally before they are submitted for an external review.

The URD

The SQA team must check whether the URD:


o contains a general description of the software that has to be developed;
o contains requirements on the software to be developed as stated by the client;
o contains constraints on the software to be developed;
o contains a priority list of the requirements.

The SPMP
The SQA team must check whether the goals of the project are clearly described. A life
cycle approach for the project must be defined. The SQA team must ensure that the
SPMP is realistic by checking:
o the assumptions made during the planning of the project;
o restrictions with respect to the planning (e.g. availability of members);
o external problems (e.g. delivery of PCs, interface card and drivers).

The SCMP
With respect to the SCMP, the SQA team has to check whether the document provides
procedures concerning:
o CI identification
o CI storage
o CI change control
o CI status indication
All documents must have a unique identifier and backups must be made at least once
every three days.

The SQAP
With respect to the SQAP, the SQA team must check whether the SQAP contains:
o Project standards
o Review procedures
o Problem reporting procedures
o Responsibilities of the project members with respect to quality assurance

Tasks during SR phase

For the second phase of the project (SR), the SQA team must see to it that the following
documents are properly reviewed internally before they are submitted for an external review.

The SRD
The SQA team must check whether the SRD:
o contains requirements on the software to be developed; these requirements must
be based on the requirements stated in the URD;
o contains constraints on the software to be developed; these constraints must be
based on the constraints stated in the URD;
o contains a priority list of the requirements.
o contains a traceability matrix.

The SPMP-SR
The SQA team must ensure that the SPMP is realistic by checking:
o the assumptions made during the planning;
o restrictions with respect to the planning (e.g. availability of members);
o external problems (e.g. external software/code).

The SCMP-SR
With respect to the SCMP, the SQA team must check whether the SCMP contains:
o The additional baselines.

The SQAP-SR
With respect to the SQAP, the SQA team must check whether the SQAP contains:
o The Tasks of the SQA team during the SR phase.

Tasks during AD phase

For the third phase of the project (AD), the SQA team must see to it that the following
documents are properly reviewed internally before they are submitted for an external review.

The ADD
The SQA team must check whether the ADD:
o contains an architectural design of the software to be developed; this design must
describe a logical model and the interfaces between the different classes;
o contains pre- and postconditions of the methods in the logical model;
o contains a traceability matrix in which the design is checked against the software
requirements in the SRD.
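
The traceability check on the ADD can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the requirement identifiers, the design element names, and the representation of the matrix as a dictionary (design element to the SRD requirements it covers) are hypothetical and do not come from the SQAP.

```python
def untraced_requirements(srd_requirements, matrix):
    """Return SRD requirement IDs that no design element claims to cover."""
    covered = set()
    for requirement_ids in matrix.values():
        covered.update(requirement_ids)
    return sorted(set(srd_requirements) - covered)


def unknown_references(srd_requirements, matrix):
    """Return (design element, requirement ID) pairs that cite IDs missing from the SRD."""
    known = set(srd_requirements)
    return sorted(
        (element, req_id)
        for element, req_ids in matrix.items()
        for req_id in req_ids
        if req_id not in known
    )


# Hypothetical data, for illustration only.
srd = ["SR-1", "SR-2", "SR-3"]
add_matrix = {
    "Scheduler": ["SR-1"],
    "Logger": ["SR-2", "SR-9"],
}

print(untraced_requirements(srd, add_matrix))  # ['SR-3']
print(unknown_references(srd, add_matrix))     # [('Logger', 'SR-9')]
```

A reviewer could treat a non-empty result from either function as a finding to report: the first lists requirements the design forgot, the second lists design entries that trace to nothing.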

The SPMP-AD
The SQA team must ensure that the SPMP is realistic by checking:
o the assumptions made during the planning;
o restrictions with respect to the planning (e.g. availability of members);
o external problems.

The SCMP-AD
With respect to the SCMP, the SQA team must check whether the SCMP contains:
o the additional baselines.

Documentation:
Project documentation may include many kinds of documents (e.g., plans, task reports,
development products, problem reports, phase summary reports). Project size, criticality (i.e., the
severity of the consequence of failure of the system), and complexity are some features that may
affect the amount of documentation a project should need. For example, the design
documentation may consist of a single document describing both the system architecture and the
detailed modules or it may consist of separate documents for the architecture and subsystems.
The purpose of this section is not to specify how many documents should be required. Rather,
this section identifies the information content needed for any project and the timeliness of
requirements so that the information can be used by the vendor, the utility, and the NRC
reviewers. Because the NRC reviewers cannot determine the characteristics of the software
product without substantial technical specifications, project plans, and reports, NRC should
specify the technical products of the vendor that the utility must provide NRC.
Review:
The reviewers will also need to evaluate the installation package, which consists of installation
procedures, installation medium (e.g., magnetic tape), test case data used to verify installation,
and expected output from the test cases. In some instances, the product may already be installed
in the utility. NRC should request documentation on the results of installation and acceptance
testing.
1. Software Quality
1.1. Definition
Software quality is defined as conformance to explicitly stated functional
and performance requirements, documented development standards,
and implicit characteristics.
Important points:

- software requirements are the foundation from which quality is measured ;


- specified standards define development criteria that guide the manner
in which the software is engineered ;
- if the software meets only the explicit requirements, and does not meet
the implicit requirements, the software quality is suspect.
1.1 Software Quality factors
Operational characteristics:
- correctness - does it do what I want?
- reliability - does it do it accurately?
- efficiency - will it run efficiently on my hardware?
- integrity - is it secure?
- usability - is it designed for the user?

Product revision:
- maintainability - can I fix it?
- flexibility - can I change it?
- testability - can I test it?
Product transition:
- portability - will I be able to use it on another machine?
- reusability - will I be able to reuse some of the software?
- interoperability - will I be able to interface it with another system?
1.2 Metrics for Grading the Software Quality factors
- auditability - the ease with which conformance to standards can be checked
- accuracy - the precision of computations and control
- communication commonality - the degree to which standard interfaces are used
- completeness - the degree to which the implementation has been achieved
- conciseness - the compactness of the program in terms of lines of code
- consistency - the use of uniform design and documentation techniques
- data commonality - the use of standard data structures and types
- error tolerance - the damage that occurs when the program encounters an error
- execution efficiency - the run-time performance of the program
- expandability - the degree to which the design can be extended
- generality - the breadth of potential application of program components

- hardware independence - the degree of decoupling from the hardware


- modularity - the functional independence of program components
- operability - the ease of operation with the system
- security - existence of mechanisms that protect the data and the program
- simplicity - the degree to which the program can be understood without difficulty
- traceability - the ability to trace a component back to the requirements
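
The text does not say how metric grades are combined into a grade for a quality factor. One common convention, assumed here and not taken from these notes, is a weighted sum of the metrics that affect the factor. The sketch below shows that arithmetic; the weights, the grades on a 0 to 10 scale, and the choice of metrics for maintainability are hypothetical.

```python
def quality_factor(weights, grades):
    """Combine metric grades into one score as a weighted sum (an assumed convention)."""
    missing = set(weights) - set(grades)
    if missing:
        raise ValueError(f"no grade for metrics: {sorted(missing)}")
    return sum(weights[name] * grades[name] for name in weights)


# Hypothetical weights and grades (0 = poor, 10 = excellent), for illustration only.
maintainability_weights = {"modularity": 0.4, "simplicity": 0.3, "consistency": 0.3}
grades = {"modularity": 8, "simplicity": 6, "consistency": 7, "security": 9}

score = quality_factor(maintainability_weights, grades)
print(round(score, 2))  # 7.1
```

Metrics that carry no weight for a factor (here, security) are simply ignored, so one set of grades can feed several factors.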
1.3 The Software Quality System
The quality factors are developed in a system called a quality system,
or quality management system.
The software quality system consists of the managerial structure,
responsibilities, activities, capabilities and resources to ensure that
the developed software products have the desired quality.
The quality management system encompasses the following activities:
- reviews of project quality
- career development of staff
- development of standards and procedures
The concrete details of the quality management system will be contained
in a quality manual. A quality manual will contain standards, procedures
and guidelines and will be influenced by external standards.
- a standard is an instruction on how a project document or program code
is to be presented ;
- a procedure is a step-by-step set of instructions describing how
a particular software activity is to be carried out ;
- a guideline consists of advice on best practice.
Software Configuration Management

Software configuration management (SCM) is the discipline of controlling the evolution of
complex software systems. This chapter surveys tools that support or automate aspects of SCM.
It proposes a standard terminology, describes the areas that are amenable to automation,
discusses a representative set of existing SCM tools, and identifies directions for future research
and development. A glossary of terms is included.
Introduction

Configuration management (CM) is the discipline of controlling the evolution of complex
systems; software configuration management (SCM) is its specialization for computer programs
and associated documents. General CM is beneficial for any large system that, due to its
complexity, cannot be made perfect for all the uses to which it will be put. Such a system will be
subject to numerous, sometimes conflicting changes during its lifetime, giving rise not to a single
system, but to a set of related systems, called a system family. A system family consists of a
number of components that can be configured to form individual family members. A substantial
number of the components must be shared among members to make the family economically
viable. Maintaining order in large and expanding system families is the goal of CM.
SCM differs from general CM in the following two ways. First, software is easier to change than
hardware, and it therefore changes faster. Even relatively small software systems, developed by
a single team, can experience a significant rate of change, and in large systems, such as
telecommunications systems, the update activities can totally overwhelm manual configuration
management procedures. Second, SCM is potentially more automatable, because all
components of a software system are easily stored on-line. CM for physical systems is hampered
by having to handle objects that are not within reach of programmable controls. As
CAD/CAM and robotics bring manufacturing processes more and more under computer control,
physical configuration management will undoubtedly adopt some of the approaches used for
software. VLSI already does: circuit design and circuit processing can be managed like software
design and compilation.
Effective software configuration management coordinates programmers working in teams. It
reduces some of the confusion caused by interaction among team members. The coordinating
functions of configuration management are introduced below, and illustrated with questions or
statements familiar to anyone who has worked in software development.
Identification
Identifying the individual components and configurations is a prerequisite for controlling their
evolution. Reliable identification helps avoid the following problems:

"This program worked yesterday. What happened?"
"I can't reproduce the error in this configuration."
"I fixed this problem long ago. Why did it reappear?"
"The online documentation doesn't match the program."
"Do we have the latest version?"
Change Tracking
Change tracking keeps a record of what was done to which component for what reason, at what
time, and by whom. It helps answer the following questions:

"Has this problem been fixed?"
"Which bug fixes went into this copy?"
"This seems like an obvious change. Was it tried before?"
"Who is responsible for this modification?"
"Were these independent changes merged?"


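The record that change tracking keeps (what, which component, why, when, by whom) can be sketched as a small data structure. This is a minimal illustration, not a description of any particular SCM tool; the class names, the component name, and the problem identifier "PR-17" are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    component: str       # which component was changed
    description: str     # what was done
    reason: str          # for what reason
    author: str          # by whom
    timestamp: datetime  # at what time


class ChangeLog:
    def __init__(self):
        self._records = []

    def record(self, component, description, reason, author, timestamp=None):
        """Append one change record; the time defaults to now (UTC)."""
        entry = ChangeRecord(component, description, reason, author,
                             timestamp or datetime.now(timezone.utc))
        self._records.append(entry)
        return entry

    def history(self, component):
        """All changes to one component, oldest first."""
        return sorted((r for r in self._records if r.component == component),
                      key=lambda r: r.timestamp)

    def responsible_for(self, component):
        """Authors who modified the component."""
        return sorted({r.author for r in self.history(component)})

    def fixes_for(self, problem_id):
        """Records whose stated reason mentions the given problem identifier."""
        return [r for r in self._records if problem_id in r.reason]


# Hypothetical entries, for illustration only.
log = ChangeLog()
log.record("parser.c", "guard against empty input", "fixes problem PR-17", "alice",
           datetime(2024, 1, 5, tzinfo=timezone.utc))
log.record("parser.c", "rename helper", "cleanup", "bob",
           datetime(2024, 1, 3, tzinfo=timezone.utc))

print(log.responsible_for("parser.c"))  # ['alice', 'bob']
print(len(log.fixes_for("PR-17")))      # 1
```

With such records, "Has this problem been fixed?" becomes a lookup by problem identifier and "Who is responsible for this modification?" a lookup by component.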
Version Selection and Baselining
Selecting the right versions of components and configurations for testing and baselining can be
difficult. Machine support for version selection helps with composing consistent configurations
and with answering the following questions:

"How do I configure a test system that contains my temporary fixes to the last baseline, and
the released fixes of all other components?"
"Given a list of fixes and enhancements, how do I configure a system that incorporates them?"
"This enhancement won't be ready until the next release. How do I configure it out of the
current baseline?"
"How exactly does this version differ from the baseline?"
Software Manufacture
Putting together a configuration requires numerous steps such as pre- and postprocessing,
compiling, linking, formatting, and regression testing. SCM systems must automate that process
and at the same time should be open for adding new processing programs. To reduce redundant
work, they must manage a cache of recently generated components. Automation avoids the
following problems:

"I just fixed that. Was something not recompiled?"
"How much recompilation will this change cost?"
"Did we deliver an up-to-date binary version to the customer?"
"I wonder whether we applied the processing steps in the right order."
"How exactly was this configuration produced?"
"Were all regression tests performed on this version?"
Managing Simultaneous Update
Simultaneous update of the same component by several programmers cannot always be
prevented. The configuration management system must note such situations and supply tools for
merging competing changes later. In so doing it helps prevent problems like the following:

"Why did my change to this module disappear?"


"What happened to my unfinished modules while I was out of town?"
"How do I merge these changes into my version?"
"Do our changes conflict?"

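Noting a simultaneous update can be sketched as follows. The idea, assumed here for illustration, is that every check-in names the revision it started from; if another revision was already made from the same base, the two compete and need a later merge. The class and method names are hypothetical, and merging itself is not implemented.

```python
class Component:
    def __init__(self, content):
        # Each entry is (parent revision number, content); the list index is the revision number.
        self.revisions = [(None, content)]

    def check_out(self):
        """Return (revision number, content) of the most recently stored revision."""
        rev = len(self.revisions) - 1
        return rev, self.revisions[rev][1]

    def check_in(self, base, content):
        """Store a new revision made from `base`.

        Returns (new revision number, competing revisions), where the competing
        revisions are those already made from the same base.
        """
        competing = [n for n, (parent, _) in enumerate(self.revisions) if parent == base]
        self.revisions.append((base, content))
        return len(self.revisions) - 1, competing


module = Component("original text")
alice_base, _ = module.check_out()
bob_base, _ = module.check_out()          # both start from revision 0

rev_a, competing_a = module.check_in(alice_base, "original text + Alice's change")
rev_b, competing_b = module.check_in(bob_base, "original text + Bob's change")

print(competing_a)  # []
print(competing_b)  # [1]
```

Nothing is overwritten: Alice's revision is still stored, and the non-empty list returned for Bob is the signal that the two changes must be merged.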
This chapter discusses software tools for automating the functions introduced above. The basis of
all tools is representation, so we develop a model for representing multi-version/
multi-configuration systems. Section 2 establishes basic terminology, while Sections 3 and 4
introduce versions. Later sections on version selection, software manufacture and modification
requests can be read in any order. Background material and manual CM procedures can be found
in References [3, 5, 7].

Basic SCM Concepts

This section defines the basic elements of a data base for software configuration management.
The data base stores all software objects produced during the life-cycle of a project.
A software object is any kind of identifiable, machine-readable document generated during the
course of a project. The document must be stored on-line to be fully controllable by an SCM
system. Examples of software objects are requirements documents, design documents,
specifications, interface descriptions, program code, test programs, test data, test output, binary
code, user manuals, or VLSI designs.
Every software object has a unique identifier and a body containing the actual information. A set
of attributes associated with software objects and a facility for linking objects via various
relations are also needed. For example, attributes record time of creation and last read access,
and relations link objects to their revisions and variants. The set of attributes and relations must
be extensible; later sections will introduce a basic set. We also need a facility to describe
subclasses or subtypes of the general software object. For instance, the subclass may fix the
language in which the body is written, or the structure editor used to compose the body, or
whether the object represents an interface or an implementation. The subclass also defines the
set of operations available on objects of that class, such as compiling, configuring, printing, etc.
The body of a software object is immutable, that is, once the body has been completed, it can
only be read. Any "change" of a body actually creates a new software object with the changed
body. Immutability is important for configuration management, because it prevents
misidentification: an object identifier is associated with one and only one constant body, and not
with several different versions. Most other attributes and relations of software objects remain
changeable, however, so new information can be added.
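
A software object with an immutable body and changeable attributes might be sketched like this. Deriving the identifier from a hash of the body is an assumption made for the example (it guarantees one identifier per constant body); the text does not prescribe how identifiers are formed, and the class and method names are hypothetical.

```python
import hashlib


class SoftwareObject:
    def __init__(self, body: bytes, **attributes):
        self._body = bytes(body)
        # Assumption: the identifier is a content hash, so it names exactly one body.
        self._identifier = hashlib.sha256(self._body).hexdigest()
        self.attributes = dict(attributes)   # attributes stay changeable
        self.relations = {}                  # relations stay changeable

    @property
    def identifier(self):
        return self._identifier

    @property
    def body(self):
        return self._body                    # read-only: no setter is defined

    def changed(self, new_body: bytes):
        """A "change" creates a new object and links it to the one it came from."""
        new_object = SoftwareObject(new_body, **self.attributes)
        new_object.relations["revision-of"] = self.identifier
        return new_object


original = SoftwareObject(b"int x;", language="C")
revised = original.changed(b"int y;")

print(original.body)                                         # b'int x;'
print(revised.relations["revision-of"] == original.identifier)  # True
```

Assigning to `original.body` raises `AttributeError`, while `original.attributes["reviewed"] = True` is allowed, mirroring the split between the constant body and the extensible attributes.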
Software objects have two orthogonal refinements, one according to how they were created, the
other according to the structure of their body. For creation, we distinguish source and derived
objects. For internal structure, we distinguish atomic objects and configurations.
2.1 Creation of Software Objects
A source object is a software object that is composed manually, for instance with an interactive
editor. Creating a source object requires human action; it cannot be produced automatically.
A derived object is generated fully automatically by a program, usually from other software
objects. A program that produces derived objects is called a deriver. Examples of derivers are
compilers, linkers, document formatters, pretty printers, cross-referencers, and call graph
generators. Normally, derived objects need not be stored, since they can be regenerated,
provided both the deriver and the input are available or can be received. To reduce the delay
caused by regeneration, a smart configuration management system maintains a cache of derived
objects that are likely to be reused.
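
A cache of derived objects can be sketched by keying each result on the deriver and the exact input bodies, so that the deriver reruns only when one of them differs. This is a minimal illustration under assumptions: the class name, the keying scheme, and the stand-in "compiler" are hypothetical, and deriver parameter settings are left out.

```python
import hashlib


class DerivedCache:
    def __init__(self):
        self._cache = {}
        self.runs = 0  # how many times a deriver actually ran

    @staticmethod
    def _key(deriver_name, inputs):
        """Key built from the deriver's name and a hash of every input body."""
        digest = hashlib.sha256(deriver_name.encode("utf-8"))
        for body in inputs:
            digest.update(hashlib.sha256(body).digest())
        return digest.hexdigest()

    def derive(self, deriver_name, deriver, inputs):
        """Return the derived object, regenerating it only on a cache miss."""
        key = self._key(deriver_name, inputs)
        if key not in self._cache:
            self._cache[key] = deriver(*inputs)
            self.runs += 1
        return self._cache[key]

    def clear(self):
        """Derived objects may be deleted to make room; they can be regenerated."""
        self._cache.clear()


def fake_compile(source: bytes) -> bytes:
    # Stand-in for a real deriver such as a compiler.
    return b"OBJ:" + source


cache = DerivedCache()
cache.derive("fake_compile", fake_compile, [b"int x;"])
cache.derive("fake_compile", fake_compile, [b"int x;"])   # served from the cache
cache.derive("fake_compile", fake_compile, [b"int y;"])   # different input, reruns
print(cache.runs)  # 2
```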
Unlike derived objects, which can be deleted to make room, source objects are "sacred", because
deleting them may cause irreparable damage or at least significant delay until they are
reconstructed. However, derived objects may also become "sacred", i.e., they must not be deleted
merely to make room, if it is impossible or time consuming to reproduce them. For instance,
derived objects that are imported from other sites, especially vendor supplied programs, must not
be deleted, even though they are derived in most cases. Other examples are derived objects for
which the original derivers have stopped working (if they have not been ported to new hardware,
say), or for which the corresponding input objects have been lost.

A special case is derived objects that are modified manually. Examples are automatically
generated program skeletons and templates that are fleshed out by hand, or object code that is
patched manually. In principle, these manual modifications produce new source objects.
However, the SCM system should store a traceability link that records the dependency between
the two objects. This link can be used for generating a reminder to update the source object if the
derived object changes. Traceability links should also be recorded among dependent source
objects, for example between a specification and its implementation, or a program and its
documentation. In fact, most source objects in an SCM system depend on one or more other
objects, except perhaps the initial requirements
specification. Traceability information is extremely valuable for automatically producing update
reminders, for reviewing completeness of changes, and for informing maintainers what
information they need to consider when preparing a change.
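
Producing update reminders from traceability links can be sketched as a walk over recorded dependencies. The class, its method names, and the object names below are hypothetical; the sketch assumes that a link simply records "this object depends on that one".

```python
from collections import defaultdict


class TraceabilityLinks:
    def __init__(self):
        # Maps an object to the objects that depend on it.
        self._dependents = defaultdict(set)

    def link(self, dependent, depends_on):
        """Record that `dependent` must be reconsidered when `depends_on` changes."""
        self._dependents[depends_on].add(dependent)

    def reminders(self, changed):
        """Objects to review after `changed` is modified, following links transitively."""
        found = set()
        pending = [changed]
        while pending:
            current = pending.pop()
            for dependent in self._dependents.get(current, ()):
                if dependent != changed and dependent not in found:
                    found.add(dependent)
                    pending.append(dependent)
        return sorted(found)


# Hypothetical objects, for illustration only.
links = TraceabilityLinks()
links.link("parser.c", "parser-spec")       # implementation depends on specification
links.link("parser-manual", "parser.c")     # documentation depends on program

print(links.reminders("parser-spec"))  # ['parser-manual', 'parser.c']
print(links.reminders("parser.c"))     # ['parser-manual']
```

The same walk supports the other uses the text names: the returned list is also what a reviewer would check for completeness of a change.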
2.2 Structure of Software Objects
The body of a software object is either atomic or structured. An atomic object, or atom, has a
body that is not decomposable for SCM; its body is an opaque data structure with a set of generic
operations such as copying, deletion, renaming, and editing.
An atomic object may consist of a program written in some language, a syntax tree produced by
a structure editor, a data structure generated by a WYSIWYG word processor, or an object code
module produced by a compiler.
A configuration has a body that consists of sub-objects, which may themselves have sub-objects,
and so on. Configurations have two subclasses: composites and sequences. A composite object,
or simply composite, is a record structure comprised of fields. Each field consists of a field
identifier and a field value. A field value is either an object identifier or a version group
identifier. An example of a composite object is a software package consisting of a program, a
user's manual, and an installation procedure. Another example is a regression test object,
consisting of a test program, input data, expected output data, and a comparator for comparing
expected and actual output. Thus, fields may contain data as well as operations.
A sequence is a list of object and version group identifiers. Sequences represent ordered
multisets of objects. They are used for combining sub-objects that are of the same class, or when
the number of sub-objects is indeterminate. In contrast to composites, the individual elements of
a sequence fulfill identical roles and are treated in the same way for SCM purposes, such as the
list of object code modules constituting a library.
Note that the above definitions permit version group identifiers in composites and sequences. A
version group is a set of related source or derived objects that can replace each other under
certain assumptions (see Sections 3 and 4 for details). The purpose of version groups here is to
permit compact representations of multiple software objects with the same structure. By using a
version group identifier instead of an object identifier, configurations need not be updated if new
versions are added to the groups. On the other hand, a version selection process must decide
which versions to choose when processing such configurations.
Because of the need to distinguish between "precise" and "loose" configurations, we introduce
the following terms. A generic composite is a composite with at least one field value that is either
a version group identifier or a generic configuration (i.e., a generic composite or a generic
sequence). The opposite of a generic composite is a baseline composite, which is a composite
whose field values are atomic objects, baseline composites, or baseline sequences. The
subclasses generic sequence and baseline sequence are defined analogously. Finally, a generic
configuration, also called a system model, is a generic composite or a generic sequence. A
baseline configuration, or simply baseline, is a baseline composite or baseline sequence.
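
The definitions of generic and baseline configurations are recursive, which a short sketch makes concrete. The class names are hypothetical, and for brevity the sketch nests objects directly inside composites and sequences instead of storing identifiers, which is an assumption of the example and not how the text defines field values.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Atom:
    identifier: str


@dataclass(frozen=True)
class VersionGroup:
    identifier: str


@dataclass
class Composite:
    fields: Dict[str, "Member"]     # field identifier -> field value


@dataclass
class Sequence:
    elements: List["Member"]        # ordered, all treated alike


Member = Union[Atom, VersionGroup, Composite, Sequence]


def is_baseline(obj: Member) -> bool:
    """True if no version group occurs anywhere inside `obj`."""
    if isinstance(obj, Atom):
        return True
    if isinstance(obj, VersionGroup):
        return False
    if isinstance(obj, Composite):
        return all(is_baseline(value) for value in obj.fields.values())
    if isinstance(obj, Sequence):
        return all(is_baseline(element) for element in obj.elements)
    raise TypeError(f"not a software object: {obj!r}")


# Hypothetical configurations, for illustration only.
release = Composite({
    "program": Atom("prog;3"),
    "manual": Atom("manual;2"),
    "library": Sequence([Atom("mod_a.o;1"), Atom("mod_b.o;1")]),
})
system_model = Composite({
    "program": VersionGroup("prog"),   # left open for version selection
    "manual": Atom("manual;2"),
})

print(is_baseline(release))                   # True
print(is_baseline(system_model))              # False
print(is_baseline(Sequence([system_model])))  # False
```

A configuration for which `is_baseline` returns False is generic in the sense defined above: some version selection step still has to pick concrete versions.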

We follow Clam in stipulating that derivers which produce several outputs must package them
into a single, derived configuration. For example, a compiler which produces object code, a list
of warnings, and a symbol table would store all three of these derived objects into one
composite. This convention simplifies the bookkeeping involved in managing derived objects.
In both composites and sequences, source and derived objects may be freely intermixed.
However, including derived objects presents a problem: Since the derived objects may not yet
exist, there may be no known identifiers for them. Instead, we must represent a derived object
with a descriptor that will cause the object to be generated when it is needed. This descriptor
must specify not only the derivers, but all parameter settings for the derivers as well. If some of
the parameter settings are under-specified, then the version selection process must choose and
record them (see Section 5).
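
A descriptor for a derived object that does not exist yet might look like the following sketch. It records the deriver, the input identifiers, and fully specified parameter settings, and generates the object only when asked. All names here (the class, `materialize`, the `join_lines` deriver, the identifiers) are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DerivedDescriptor:
    deriver: str                              # name of the deriver to run
    inputs: Tuple[str, ...]                   # identifiers of the input objects
    parameters: Tuple[Tuple[str, str], ...]   # every parameter setting, fully specified


def materialize(descriptor, derivers, store):
    """Generate the described object on demand from the stored input bodies."""
    run = derivers[descriptor.deriver]
    bodies = [store[identifier] for identifier in descriptor.inputs]
    return run(bodies, dict(descriptor.parameters))


def join_lines(bodies, parameters):
    # Stand-in deriver: concatenates its inputs using the configured separator.
    return parameters["separator"].join(bodies)


# Hypothetical object store and descriptor, for illustration only.
store = {"a;1": "alpha", "b;1": "beta"}
descriptor = DerivedDescriptor("join_lines", ("a;1", "b;1"), (("separator", "+"),))

print(materialize(descriptor, {"join_lines": join_lines}, store))  # alpha+beta
```

Because the descriptor is immutable and hashable, it could also serve as the key under which the generated object is cached.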

For clarity, we should point out some uses of the above definitions. Suppose a software house
delivers a single, binary program to a customer. This program is a single, derived object. In most
cases, this object was generated from a baseline configuration recorded at the software house.
The purpose of the baseline is to guarantee that the derived object can be reproduced when
needed. The software house may also deliver a configuration, perhaps a composite that consists
of one or more binaries and a manual. The delivered configuration may also contain source
programs, because the programs will be interpreted, or because the customer wishes to compile
source locally. The customer may also need to adapt the source code to local needs. Thus,
depending on how much the customer expects to do, a more or less complete SCM system must
be available at the customer site to take over portions of the software house's SCM functions.
3 Source Versions
SCM systems have to cope with constant change. Corrective, adaptive, and perfective
maintenance activities produce a steady stream of updates. Since most changes are incremental,
they are best viewed as producing related versions of objects rather than separate, unrelated
objects. This section deals with versions of source objects; versions produced by derivers will be
treated in Section 4.
3.1 Source Version Groups
An important concept for dealing with multiple versions is the source version group. A source
version group is a set of source objects that are connected via the relations revision-of,
variant-of, and their subtypes. These relations are defined below. Note, however, that source
version groups may contain atoms, composites, sequences, and even mixtures of those.
y revision-of x: This relation holds if and only if x and y are source objects and y was
produced by changing a copy of x. Thus, revision-of records the development history of
source objects. The subtypes of this relation, correction-of, adaptation-of, and
enhancement-of, capture the nature of the change. It is possible for several of these
subtypes to hold simultaneously between a pair of objects.
The relation revision-of and its subtypes are transitive, antisymmetric, and reflexive. Objects of
a version group that are transitively related by revision-of etc. are simply called revisions.
1. The term "parametric" is sometimes used as a synonym for "generic".
y variant-of x: This relation holds if and only if x and y are source objects that are
indistinguishable under a given abstraction. An abstraction defines relevant properties
while ignoring (irrelevant) details. It permits variation by not prescribing certain
properties or behaviors. The intent is to define abstractions in such a way that variants can
replace each other in a software system without requiring changes in their client programs.
Variant-of is actually a ternary relation, since it must identify a common
abstraction. Few programming environments permit the specification of such an
abstraction. One approach is to introduce subsets of interfaces, called views. Another,
more promising approach is to represent abstractions explicitly as superclasses in
object-oriented programming languages.
A commonly used abstraction is the functional specification. The functional specification
ignores space and time efficiencies, so two variants under this abstraction may differ in
internal algorithms and data structures. Similarly, one may decide to ignore the choice of
programming language, target machine, target operating system, or target user group. As
long as the functional specifications of variants are identical, client programs depending
only on the functional specification do not have to be rewritten if a different variant is
chosen.
Thus, software systems can be reconfigured by merely replacing individual objects. The
interested reader is referred to Parnas's work [30, 31] for criteria on how to design
software systems in such a way that likely changes can be hidden behind invariant
interfaces.
For some abstractions, it is possible that details even in the functional specification are
irrelevant. For example, sorting programs can be classified as stable or unstable. A stable
sort guarantees to leave elements with the same sorting key in the original order. Under
some abstraction, stableness of sorting procedures may be irrelevant. Thus, the common
property of two variants can be a subset of their functional specifications. A common
manifestation of this aspect is that only a subset of the interface made available by a
program is used by clients.
For a given abstraction, the relation variant-of is an equivalence relation because it is
transitive, symmetric, and reflexive. Objects in a version group that are transitively
related by variant-of are simply called variants. Subclasses of variant-of describe the
abstraction under which the variants are indistinguishable. Characterizing variants under
system-defined and user-defined abstractions is a topic of current research in SCM.
The relation variant-of may or may not parallel revision-of, depending on whether the variant
was produced by changing an existing object. Variants must usually be maintained in parallel,
and in practice their number should be kept small.
The relations revision-of, variant-of, and their subtypes also apply to configurations. This is in
contrast to SCM systems like SCCS [34], RCS [39], CMS [1], and DSEE [25], where versions
of configurations appear to be an afterthought. For example, with RCS, one would have to
collect descriptions of all configurations and subconfigurations into a single, atomic object
called a Makefile, and allow versions of the entire set only. Versioning of configurations at this
level is too coarse-grained for effective SCM. The Gandalf project [16, 41] was among the
first to experiment with versions of configurations (called compositions) as well as variants
(called implementations or realizations).
3.2 Structure of Source Version Groups
As defined previously, a source version group is a set of source objects that are related via
revision-of and variant-of. For simplicity, we assume that version groups are closed with respect
to these relations. In other words, no revision-of or variant-of link may cross version group
boundaries.

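Figure 1. A source version group with revisions and variants. (Diagram not reproduced. It shows
the main branch 1.1, 1.2, 1.3, 2.1, 2.2, 3.1; the side revisions par.1, par.2, par.3, con.1, and
fix.1; and links labeled revision-of and variant-of. Legend: con.1 = conflict at a change,
par.1 = parallel version 1, fix.1 = bugfix 1.)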


Revision-of forms a directed, acyclic graph reflecting the development history, whereas variant-of
identifies the starting points of parallel lines of development.
Footnote: Closure is not strictly necessary. Sometimes it is convenient to make some revision in a group the
initial revision in another.
For a young source version group without variants, the graph structure is simple: It consists of a
single list linked via revision-of that begins with the most recent object and ends with the oldest.
At least initially, this list represents the main line of development and is often called the main
branch or trunk. As the version group ages, side branches may form. Some of these side
branches may wither, others may later be merged with the main branch. Side branches are
needed for accommodating parallel development, conflicting updates, and temporary fixes.

Consider Figure 1 as an example of a source version group illustrating various types of
branches. The revisions numbered 1.1, 1.2, ..., 3.1 represent the main branch. The revisions par.1,
par.2, and par.3 constitute a parallel line of development. Note that par.1 is both a variant and a
revision of 1.3. A special case of parallel development is distributed development, in which
customers modify released software themselves. The modifications can be relayed back to the
development organization for merging into the next release, or must be merged into future
releases by the customers locally. See Reference [39] for an example of how to set up version
groups for distributed development.

Revisions 3.1 and con.1 illustrate conflicting updates. This situation arises when two
programmers wish to update the same revision (here: 2.2) simultaneously, and neither can wait
for the other to finish. This situation is undesirable, yet cannot always be prevented in practice.
SCM should warn programmers in this case, but allow work to proceed by forming a temporary
side branch for later merging. Note that such conflicts can only occur at branch tips. Reference
[39] discusses a range of strategies for dealing with these conflicts.
Revision fix.1 illustrates the handling of temporary fixes. Suppose the need to correct revision
1.3 arises after 2.1 and 2.2 have been completed. To reflect the actual development history, SCM
places the correction on a side branch starting at revision 1.3. The correction is later merged with
2.2, resulting in 3.1.
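The structure described for Figure 1 can be sketched as a small graph. This is only an illustration: the dictionary encoding and the function names are assumptions, and recording the merge of fix.1 into 3.1 as a second revision-of link is one possible modeling choice, not something the text prescribes.

```python
# Hypothetical encoding of the revision-of links described for Figure 1.
# Each key is a revision; the value lists the revisions it was derived from.
REVISION_OF = {
    "1.1": [],
    "1.2": ["1.1"],
    "1.3": ["1.2"],
    "2.1": ["1.3"],
    "2.2": ["2.1"],
    "3.1": ["2.2", "fix.1"],  # assumed: the merge is recorded as two links
    "par.1": ["1.3"],
    "par.2": ["par.1"],
    "par.3": ["par.2"],
    "con.1": ["2.2"],
    "fix.1": ["1.3"],
}


def ancestors(rev, graph=REVISION_OF):
    """Return every revision reachable from rev via revision-of links."""
    seen = set()
    stack = list(graph[rev])
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(graph[r])
    return seen


def branch_tips(graph=REVISION_OF):
    """Return the revisions that no other revision was derived from."""
    has_successor = {p for preds in graph.values() for p in preds}
    return sorted(r for r in graph if r not in has_successor)
```

Under this encoding, the branch tips are 3.1, con.1, and par.3, and the lineage of con.1 contains 2.2 but not fix.1.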
3.3 Operations on Source Version Groups
Virtually all SCM systems in use today use some form of a check-out/edit/check-in cycle for
adding revisions to source version groups. The check-out operation creates a copy of the revision
to be modified and reserves it for the user. Check-out also links the new copy to its original with
the revision-of relation. The user can then update the copy with an arbitrary editor. As long as the
copy remains checked out, it remains inaccessible to others. Any subsequent check-out of the
same original revision causes a branch to form, with a warning stating that a merge operation
will be necessary later. The check-in operation signals the completion of the changes. This
operation makes the (modified) copy visible to other users. Before a revision is checked in, it
should satisfy some quality control criterion, such as a successful test, to make sure it is usable
by other team members.
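A minimal sketch of this cycle follows. The class and method names are hypothetical, revisions are numbered with plain integers for brevity, and the link to the original is kept in the key of the working copy until check-in; a real SCM system would also persist this state.

```python
class VersionGroup:
    """Toy model of the check-out/edit/check-in cycle (illustrative names)."""

    def __init__(self, initial_text):
        self.revisions = {1: initial_text}  # checked-in revisions, visible to all
        self.revision_of = {1: None}        # revision number -> its original
        self.reserved = {}                  # original -> user holding the reservation
        self.working = {}                   # (user, original) -> private copy
        self.warnings = []

    def check_out(self, user, original):
        if original in self.reserved:
            # A second check-out of the same original forms a side branch.
            self.warnings.append(
                f"{user}: revision {original} is already checked out by "
                f"{self.reserved[original]}; a merge will be necessary later"
            )
        else:
            self.reserved[original] = user
        self.working[(user, original)] = self.revisions[original]
        return self.working[(user, original)]

    def edit(self, user, original, new_text):
        self.working[(user, original)] = new_text

    def check_in(self, user, original, passes_tests=lambda text: True):
        text = self.working[(user, original)]
        if not passes_tests(text):
            raise ValueError("quality control criterion not met")
        del self.working[(user, original)]
        new_number = max(self.revisions) + 1
        self.revisions[new_number] = text       # now visible to other users
        self.revision_of[new_number] = original
        if self.reserved.get(original) == user:
            del self.reserved[original]
        return new_number
```

If two users check out revision 1, the second receives a warning, and both check-ins end up as revisions of 1, which is exactly the branch that later needs merging.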
In the period between check-out and check-in, a revision may actually go through several
successive edit cycles, until the change is acceptable. Whenever the editor writes out an object, a
new revision is created. All of these revisions, except for the latest one, are called minor
revisions. Minor revisions are deleted upon check-in of the latest revision. They are needed for
short-term backup purposes, in case of machine crashes or inadvertent, disastrous deletes during
editing. Most programming environments limit the number of minor revisions to one or two. For
instance, EMACS [36] saves one minor revision and periodically writes a checkpoint as another.
Three-way revision merging is important for combining parallel lines of development. A
three-way merge first identifies the commonalities among a base version and two of its parallel
revisions, and then integrates the changes. The merge process also detects conflicting changes.
These must be resolved manually. In practice, the merging process works well, provided changed
segments are well separated from each other by unchanged ones. Examples of three-way text
mergers are diff3 and rcsmerge [39]. These programs are based on the algorithms that compute
deltas, i.e., the differences among revisions (see Section 3.4). Recently, Reps et al. [33] have
made some progress towards improved merge conflict resolution using data flow information.
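The idea can be illustrated with a simplified, line-based merge. This sketch is not the algorithm used by diff3 or rcsmerge; it relies on Python's difflib to find unchanged lines, and the function names and conflict markers are assumptions made for the example.

```python
from difflib import SequenceMatcher


def _unchanged(base, other):
    """Map each base index that survives unchanged to its index in other."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def three_way_merge(base, left, right):
    """Merge two revisions of base (lists of lines).

    Returns (merged_lines, conflicts).
    """
    in_left, in_right = _unchanged(base, left), _unchanged(base, right)
    # Sentinel so that text after the last common line is processed too.
    in_left[len(base)], in_right[len(base)] = len(left), len(right)
    merged, conflicts = [], []
    pb = pl = pr = 0  # current positions in base, left, right
    for i in range(len(base) + 1):
        if i not in in_left or i not in in_right:
            continue  # base line i was changed on at least one side
        seg_base = base[pb:i]
        seg_left = left[pl:in_left[i]]
        seg_right = right[pr:in_right[i]]
        if seg_left == seg_right or seg_right == seg_base:
            merged += seg_left    # identical change, or only left changed
        elif seg_left == seg_base:
            merged += seg_right   # only right changed
        else:
            conflicts.append((seg_base, seg_left, seg_right))
            merged += ["<<<<<<<"] + seg_left + ["======="] + seg_right + [">>>>>>>"]
        if i < len(base):
            merged.append(base[i])  # the line common to all three
        pb, pl, pr = i + 1, in_left[i] + 1, in_right[i] + 1
    return merged, conflicts
```

Changes separated by at least one line that is unchanged in both revisions are integrated automatically; changes to the same segment are reported as conflicts for manual resolution.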
A consistent revision numbering scheme is important for version selection. Most SCM systems
use a Dewey decimal notation, with revisions on the main branch numbered by a pair of the form
(release-number, level number). Some systems extend this notation to branches in such a way
that the structure of the revision graph is reflected by the numbering. Unfortunately, this notation
becomes clumsy as the number of branches increases. A better approach is to simply select a
unique, symbolic identifier for each branch and to number revisions on each branch with a single
number or a pair. The relation revision-of can be consulted to determine the lineage of a revision.
While revision numbers together with attributes such as check-in date, author, and state are
sufficient for selecting revisions, additional, descriptive attributes are needed for differentiating
and selecting variants. An adequate approach is to let variant attributes take on subsets of values
from enumerated types. For instance, one may wish to provide an attribute that indicates the
target operating systems on which a certain variant can run. This attribute would have as value a
subset of an enumerated type listing all relevant operating systems. All revisions of a variant
would have the same variant attributes; changing them creates a new variant. Clearly, the attributes and types for describing variants must be user-definable.
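As a sketch of such set-valued variant attributes, the following example uses a hypothetical enumerated type for operating systems. The type, its members, and the variant names are invented for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class TargetOS(Enum):  # hypothetical user-defined enumerated type
    UNIX = "unix"
    VMS = "vms"
    DOS = "dos"


@dataclass(frozen=True)
class Variant:
    name: str
    target_os: frozenset  # a subset of TargetOS values


def variants_for(variants, wanted_os):
    """Select the variants whose attribute set contains the wanted value."""
    return [v.name for v in variants if wanted_os in v.target_os]


VARIANTS = [
    Variant("io_portable", frozenset({TargetOS.UNIX, TargetOS.VMS})),
    Variant("io_dos", frozenset({TargetOS.DOS})),
]
```

Because the attribute is a frozen set, changing it means constructing a new Variant object, which mirrors the rule that changing variant attributes creates a new variant.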
To support change tracking, every object in a source version group carries a state attribute and a
log entry. The state attribute indicates the status of a revision. For example, check-out and
check-in set the attribute to in-preparation and experimental, respectively. A revision can later
be promoted to a higher state, for example stable or released. The set of states should be
extensible. To allow for effective tracing, the attribute should not just show the current value, but
actually log all state changes with date and person responsible for the change.
The log entry is extremely important for change tracking. It stores a commentary requested
during check-in, describing the changes completed. Browsing the log messages helps determine
what happened to a software object over time, and sometimes prevents attempting changes that
had earlier been abandoned as unsuccessful. Because of the usefulness of the log entry, the
Crystal SCM system [2] actually requests a log message during check-out, for recording the
programmer's intentions. A check-out log helps determine what changes are in preparation.
Check-in returns this message to the user, who can then edit it into the final, permanent log entry.
3.4 Implementation of Source Version Groups
Source version groups and the objects in them must be represented as persistent objects in an
object base. The object base has traditionally been implemented with hierarchical file systems,
by either placing the objects and relations in separate files in a special directory, or by encoding
this information in a single file. These implementations provide sufficient reliability, but
recovery, consistency control, access synchronization, and authorization are realized in an ad hoc
manner.
Building the object base on top of a full-fledged database management system seems to be an
attractive alternative, because a DBMS would provide high reliability and systematic
mechanisms for handling recovery, consistency control, access synchronization, and
authorization. However, commercial DBMSs are optimized towards business applications, i.e.,
for processing of large quantities of rather small records. SCM presents exactly the opposite
requirement, namely moderate quantities of large objects with complex internal structure. Using
a business-oriented DBMS for SCM therefore results in an "impedance mismatch",
characterized by awkward data modeling and poor performance [27]. Current developments in
engineering databases, such as DAMOKLES [11] or object-oriented databases, should lead to
more appropriate data models and adequate efficiency.
In both file and data base implementations, accessing a particular source object usually requires a
special regeneration process that reconstitutes the object from deltas. Deltas are used to conserve
space (see below). First generation SCM systems such as SCCS, CMS, and RCS provided
separate operations to rebuild a desired version in a temporary file, which could then be opened
for reading or writing. Second generation systems such as DSEE integrate the management of
source version groups into the file system. Opening a source object of a version group for reading
regenerates it from delta storage; opening for writing does the same but includes the semantics of
the check-out operation. Besides being easier to program, integrating versioning into the file
system or DBMS has the effect of better protection: Users are prevented from destroying the data
structures of a version group by accidentally or intentionally tampering with them using
inappropriate tools such as text editors.
Footnote: Delta storage could also be applied to configurations, but may not produce dramatic
savings because of the small size of those objects.
There are several important design parameters affecting the speed with which an object can be
regenerated from deltas, and how deltas are computed. First, deltas can be stored in forward or
reverse direction. The reverse direction is preferred, since this method keeps the youngest, most
recent revision on a branch in clear text, while the others have to be regenerated. Since younger
revisions are more likely to be needed than older ones, reverse deltas save overall regeneration
time.
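A minimal sketch of reverse-delta storage for a single branch follows. The class and function names are hypothetical, revisions are numbered 1, 2, 3, ... for simplicity, and the delta format (replace a line range with new lines) is an assumption built on Python's difflib rather than the format of any particular SCM system.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Edit script that turns source into target (both lists of lines)."""
    ops = SequenceMatcher(None, source, target, autojunk=False).get_opcodes()
    # Keep only changed regions: (start, end) in source plus replacement lines.
    return [(i1, i2, target[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_delta(source, delta):
    result = list(source)
    for i1, i2, lines in reversed(delta):  # back to front keeps indices valid
        result[i1:i2] = lines
    return result


class ReverseDeltaStore:
    def __init__(self, first_revision):
        self.tip = list(first_revision)  # newest revision, kept in clear text
        self.deltas = []                 # deltas[k - 1] turns revision k + 1 into k

    def check_in(self, new_revision):
        # Store how to get from the new revision back to the previous one.
        self.deltas.append(make_delta(new_revision, self.tip))
        self.tip = list(new_revision)

    def regenerate(self, number):
        text = self.tip
        newest = len(self.deltas) + 1
        for k in range(newest - 1, number - 1, -1):
            text = apply_delta(text, self.deltas[k - 1])
        return list(text)
```

Regenerating the newest revision applies no deltas at all, while each older revision costs one more delta application, which is the trade-off the paragraph above describes.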
Second, deltas can either be interleaved or separate. In interleaved deltas, the lines of all versions
are sorted into a linear data structure, such that a single pass over that data structure can collect
all lines for a desired version in the correct order. This data structure has the property that
regeneration slows down as the number of versions increases. For this reason, reverse deltas are
best stored separately.
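The following sketch illustrates interleaved storage for a linear history. The tuple layout and the sample lines are assumptions chosen for illustration; real systems use their own encodings.

```python
# Each entry is (line, inserted_in, deleted_in); deleted_in is None while the
# line is still present in the newest version. Versions are numbered 1, 2, 3.
INTERLEAVED = [
    ("begin", 1, None),
    ("x := 1", 1, 2),      # present in version 1 only
    ("x := 2", 2, None),   # replaces the previous line from version 2 on
    ("y := x", 3, None),   # added in version 3
    ("end", 1, None),
]


def regenerate(interleaved, version):
    """Collect the lines of one version in a single pass over the structure."""
    return [
        line
        for line, inserted_in, deleted_in in interleaved
        if inserted_in <= version and (deleted_in is None or version < deleted_in)
    ]
```

The single pass visits every line of every version, so its cost grows with the total history regardless of which version is requested.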
Finally, computing the deltas themselves is an important problem. Deltas can be generated by a
special program that compares pairs of objects, or they can be produced incrementally by the
editors in the programming environment. Relying exclusively on editors to produce the deltas is
risky, because that decision would require that every editor in a programming environment keep
track of differences. There exist only a few of those editors, and they are often functionally
limited. Examples are Kristal's P-edit [24] and Fraser's EH [15], which are both line-oriented.
Another drawback of relying on delta editors is that all other programs that modify source
objects, such as pretty printers, would have to record their changes, too. Updates received from
the field present another problem: The updates might not have been produced with delta editors,
or, worse yet, the deltas might be relative to old or inaccessible versions. Thus, a separate
program that computes deltas in batch mode is necessary. The right time for this process is
during check-in, when changes are complete and the user has reached a point of closure where a
short wait is tolerable.

There are two efficient algorithms for computing deltas in batch mode. One is based on isolating
a longest common subsequence [18], the other one on identifying block moves [42]. A delta
based on a longest common subsequence is not necessarily minimal, because it cannot detect
crossing block moves.
Crossing block moves arise if two or more segments (e.g., procedures) appear in a different
order in two revisions. An edit script derived from a longest common subsequence first deletes the
shorter of the two segments, and then reinserts it. Tichy's block move algorithm [42] detects
such permutations and is guaranteed to produce a minimal delta.
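To make the block-move idea concrete, here is a naive greedy sketch that covers the target with copies from the source plus literal additions. It is a quadratic-time illustration with hypothetical names and command format; it is not the algorithm of Reference [42], and no claim about minimality is made for it.

```python
def block_move_delta(source, target):
    """Cover target with ("copy", start, length) and ("add", line) commands."""
    delta, t = [], 0
    while t < len(target):
        best_start, best_len = 0, 0
        # Find the longest run in source that matches target starting at t.
        for s in range(len(source)):
            n = 0
            while (s + n < len(source) and t + n < len(target)
                   and source[s + n] == target[t + n]):
                n += 1
            if n > best_len:
                best_start, best_len = s, n
        if best_len == 0:
            delta.append(("add", target[t]))  # line does not occur in source
            t += 1
        else:
            delta.append(("copy", best_start, best_len))
            t += best_len
    return delta


def apply_block_moves(source, delta):
    out = []
    for command in delta:
        if command[0] == "copy":
            _, start, length = command
            out += source[start:start + length]
        else:
            out.append(command[1])
    return out
```

For two segments that merely swap places, the resulting delta consists of two copy commands and no added lines, whereas a delta based on a longest common subsequence would have to delete and reinsert one of the segments.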
Most deltas used in practice are line-based, i.e., the unit for comparison is the line. Two lines are
considered different if they differ by a single character. Clearly, a byte- or word-based delta
would be smaller, but computing it would require many more comparisons and therefore much
more time.
Obst [29] reports that with special heuristics, a character-based block-move algorithm runs in the
same time as a line-based one, and produces deltas that are on average 30 percent smaller. The
heuristic is specifically oriented towards block moves and does not seem applicable to longest
common subsequences.
Footnote 1: Blank compression saves space if a significant fraction of an object's size is due to indentation.
For objects that consist of a representation other than text, the existing delta algorithms are easily
adapted by choosing an appropriate unit for comparison and converting the representation into a
linear sequence. For example, the difference between two syntax trees can be computed by
comparing prefix representations of the trees at the level of individual nodes.
4 Derived Versions

Handling derived versions is much simpler than handling source versions, since they are
computed fully automatically and no human actions need be observed or supported. A derived
version group is a set of derived objects that were generated from the same set of software
objects by varying derivation parameters or derivers. For example, a compiler may be able to
produce code for different target machines, optimized code, non-optimized code, code with
runtime checks, code with debugging hooks, etc. There may also be several compiler versions
available. Conditional compilation falls in this class also. The term derived variants is used for
those objects in a derived version group that offer identical functional specifications to their
client programs.
Derivers may also be able to produce information quite different from intermediate or binary
code. There exist derivers to generate call graphs, pretty-printed listings, cross reference tables,
or indexes. These transformations are not called variants, because they do not preserve the
semantic content as compilers do. However, both these transformations and the derived variants
are collected into a derived version group, as long as they were generated from the same input.
The relations revision-of, variant-of and their subtypes are defined on source objects, but extend
naturally to derived objects. For example, if two source objects are revisions of each other, then
so are their derived objects, provided the derived objects were produced with the same deriver
and parameters. By definition, these two derived objects would be in different derived version
groups. A minor difficulty here is that derived objects are often generated from several source

objects. When stating that two derived objects are variants or revisions of each other, it is
therefore useful to qualify this statement with respect to the source object(s) involved.
Section 6 discusses the details of how to generate and keep track of derived objects.
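Figure 2. An AND/OR graph representing a system family. (Diagram not reproduced. According to
the text, it shows version groups A, B, and C of atomic objects and version groups S and R of
configurations, with version labels such as 1.0, 2.0, and 1.1 through 1.4.)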


5 Version Selection and Baselining
Generic configurations may represent a large number of baselines. For example, a medium-sized
software system could easily consist of 100 source version groups. Assuming each version
group has merely two versions, a generic configuration containing all 100 groups represents 2^100
separate baselines, an impossibly large number. In practice, few of those baselines will
actually work. The selection problem of software configuration management is finding viable
configurations without exhaustive search.
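The size of this search space is easy to check with a few lines of arithmetic. The variable names below are illustrative only.

```python
groups = 100             # source version groups in the system
versions_per_group = 2   # versions available in each group

# One independent choice per group, so the counts multiply.
baselines = versions_per_group ** groups

print(baselines)
print(len(str(baselines)), "decimal digits")
```

The result has 31 decimal digits, so enumerating and testing every candidate baseline is out of the question.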
A simple, structural model that clarifies the selection problem is the AND/OR graph model
introduced in Reference [38]. This model represents atomic objects as leaf nodes, configurations
as AND-nodes, and source or derived version groups as OR-nodes. The successors of an
OR-node are the objects in the version group. (We ignore the relations revision-of and variant-of in
case of a source version group for now.) An OR-node implies a choice among its successors,
while an AND-node implies integration. The model permits not just a tree, but a general,
acyclic graph, because objects may be shared among several configurations. The selection
problem in this model is formulated as searching the graph from a given start node, and making
choices at each OR-node such that the nodes selected form a viable configuration.

An example of an AND/OR graph appears in Figure 2. Nodes A, B, and C are version groups
of atomic objects, while S and R are version groups containing configurations. AND-nodes are
depicted graphically by arcs connecting their offspring. Labels on the out-arcs of AND-nodes
distinguish composites from sequences. For example, version 1.0 of R is a composite. Note that
by searching the graph starting with version 2.0 of node S, we reach no OR-nodes. Such a start
node identifies a baseline, because it unambiguously specifies a set of nodes making up a
configuration. Establishing a baseline is important at release time. In a large project, where
multiple changes are carried out concurrently, a baseline is an important point of reference.
Updates usually are relative to a baseline. A private baseline is created whenever an actual
system instance is generated. It may contain revisions that are not yet checked in. It is handled
like a minor revision in that only a few of them are stored per user. A public baseline must not
contain checked-out revisions, is itself checked into a version group, and should satisfy
established quality control criteria. Quality control is a subject beyond the scope of this survey.
An AND-node that leads to one or more OR-nodes represents a generic configuration, since
some selection will be necessary when constructing an actual system instance. Generic
configurations are important for compactly representing a large set of possible baselines,
without having to enumerate all combinations. Without generic configurations, SCM requires
the maintenance of bulky configuration tables. The problem with these tables is that they are
difficult to keep up to date in a large project. For instance, the addition of an upward compatible
version of a pervasively used module may cause such tables to double in size because the new
version can be used wherever the old one was permitted.
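The distinction between baselines and generic configurations can be sketched on a small AND/OR graph. The graph below is hypothetical and is not a transcription of Figure 2; node names and the tuple encoding are assumptions, and the counting function assumes that no OR-node is shared between configurations.

```python
from math import prod

# Hypothetical graph: node -> (kind, successors), kind in {"leaf", "and", "or"}.
GRAPH = {
    "S": ("or", ["S.1.0", "S.2.0"]),
    "S.1.0": ("and", ["A", "B"]),          # generic: leads to OR-nodes
    "S.2.0": ("and", ["A.1.2", "B.1.1"]),  # baseline: all choices are made
    "A": ("or", ["A.1.1", "A.1.2"]),
    "B": ("or", ["B.1.1", "B.1.2", "B.1.3"]),
    "A.1.1": ("leaf", []),
    "A.1.2": ("leaf", []),
    "B.1.1": ("leaf", []),
    "B.1.2": ("leaf", []),
    "B.1.3": ("leaf", []),
}


def is_baseline(node, graph=GRAPH):
    """True if no OR-node is reachable, i.e. nothing remains to be selected."""
    kind, successors = graph[node]
    if kind == "or":
        return False
    return all(is_baseline(s, graph) for s in successors)


def count_configurations(node, graph=GRAPH):
    """Number of distinct selections below node (viability is ignored)."""
    kind, successors = graph[node]
    if kind == "leaf":
        return 1
    counts = [count_configurations(s, graph) for s in successors]
    return sum(counts) if kind == "or" else prod(counts)
```

Here S.2.0 is a baseline, while S.1.0 is a generic configuration standing for six combinations, which a configuration table would have to list one by one.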
Version selection is currently an active research area within SCM. The general approach is to
associate constraints with generic configurations. The constraints are conditions on attributes of
software objects that select appropriate variants, revisions, and derived versions. Attributes
usable for revision selection are revision number and state, creation date, author, and the
relation revision-of with its subclasses. With these attributes it is possible to express the
following example constraint:
For all version groups where the invoker has a revision checked out, select that revision;
otherwise use the most recent revision that is checked in and has state stable.
Constraints of this sort are called "configuration threads" in DSEE [25]. By adding a cut-off
constraint for the creation date (a maximum date), a configuration can be regenerated as it
would have been produced at a certain date.
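The example constraint, including an optional cut-off date, might be sketched as follows. The Revision record, the state names, and the decision to apply the cut-off only to checked-in revisions are assumptions made for this illustration; they do not reflect the DSEE notation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Revision:
    number: int
    state: str                            # e.g. "experimental" or "stable"
    created: date
    checked_out_by: Optional[str] = None  # None means checked in


def select(group, invoker, cutoff=None):
    """Pick one revision from a version group (a list of Revision objects)."""
    # Rule 1: a revision checked out by the invoker takes precedence.
    for rev in group:
        if rev.checked_out_by == invoker:
            return rev
    # Rule 2: otherwise the most recent checked-in, stable revision,
    # optionally restricted to revisions created on or before the cut-off.
    candidates = [
        r for r in group
        if r.checked_out_by is None and r.state == "stable"
        and (cutoff is None or r.created <= cutoff)
    ]
    return max(candidates, key=lambda r: r.created, default=None)
```

Applying select to every version group reached from a generic configuration yields one candidate baseline for the invoking user.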
Variants should be selected based upon the relation variant-of and user-defined variant
attributes, as described in Section 3.3. For example, one may want to choose a variant on the
basis of the hardware processors on which it can run. Note that a variant attribute may be
single-valued or set-valued. Using the previous example, a variant may actually run on several
processors. Single-valued attributes for differentiating variants were used in INTERCOL and
RCS [41, 38]. The Adele and Nomade configuration managers [12, 13] use sophisticated
constraints on attributes, including negation and conditionals. The latter can be used to specify
preferences, that is, if a certain constraint cannot be met, then some secondary choice may be
acceptable. A similar approach to preferences, based on a relational database for describing
generic configurations, is due to Bernard et al. [4]. Winkler [44] discusses set-valued attributes
and introduces constraints expressed as functions over attribute values.

Additional selection criteria can be based on modification requests (see Section 7). For
instance, a constraint of the following sort would configure a new release:
Select the previous baseline. Let O be the set of objects in this baseline that have modification
requests to be addressed in the current release, and have a corrected revision for each request.
Replace the elements in O with the corrected revisions.
Parameters for the derivers finally select derived versions. An additional degree of freedom is
available here: If a certain parameter is left unspecified, the SCM system can make its own
choice. For instance, if the user does not care whether certain subconfigurations have been
compiled with optimization on or off, SCM can choose whatever is available and save
derivation time that way.
If constraint-based version selection is available, it is straightforward to provide an automatic
function for constructing baselines. This function simply runs the selection process and records
the outcome. Recording the outcome involves creating new revisions in the visited
configuration groups. For example, revision 2.0 of S and R in Figure 2 could have been
generated automatically. It is convenient to store the constraints used to produce a baseline
along with it, in order to document the intent behind the baseline. Saving the constraints
permits a similar selection to be repeated at the next release time.
Module interconnection languages (MILs) take a different approach to version selection. They
concentrate on the interfaces among software modules. Type checking the interfaces assures that
only type-safe configurations are constructed. DeRemer and Kron [10] originated the concept of
a MIL, as a language separate from the programming language. Prieto-Diaz [32] gives an
extensive survey of the MILs developed since then. Most MILs suffer from not treating
interfaces as first class software objects. Thus, it is difficult to represent versions of interfaces.
This is a serious limitation, even though versions of interfaces do not arise as frequently as
versions of the implementing programs. Exceptions are the programming languages Mesa and
Cedar [28, 37]. Both provide a common sub-language for describing configurations, called
C-Mesa. A key aspect is the distinction between interfaces and implementations. An interface
contains the types, variables, subprogram headers, etc., visible to clients of the interface,
whereas an implementation of an interface provides the subprogram bodies and data structures
invisible to clients. C-Mesa programs represent not only configurations, but also record the
relations has-implementation and has-client. The first relation holds between interfaces and
corresponding implementations; the second between interfaces and their clients. Both can be
viewed as subtypes of the general traceability relation, because the change of an interface must
trigger changes in affected implementations and clients. A serious limitation of C-Mesa is that
its version scheme distinguishes only two revisions, the current one and its predecessor.
Ada and Modula also separate interfaces and implementations, and the relations
has-implementation and has-client make dependencies traceable. Ada and Modula do not provide a
separate configuration language. The implicit configurations and unnecessarily strict
recompilation rules in both languages make treatment of versions difficult.

6 Software Manufacture
Software manufacture is the process of generating derived objects. Using the AND/OR graph,
software manufacture operates on a baseline and produces a mirror image of that baseline
containing only derived objects. The nodes in that mirror image are connected to the corresponding
nodes in the input baseline, showing the derivation history. In Figure 2, consider what must be
produced by compiling and linking revision 2.0 of S.
To speed up the derivation process, an SCM system must manage a cache containing derived
objects which are likely to be reused. Make [14] is a widely used program that uses a simple form
of such a cache. It is based on a time-stamp mechanism for deciding when to update the cache: If a
derived object is older than its input objects, then rederivation is necessary. Make also uses
simple rules to process objects based on their types. One such rule describes how to produce
machine code from C source code. Make can be combined with SCCS or RCS to provide a limited
versioning capability.
Despite its popularity, Make has a number of serious drawbacks for large-scale SCM. The
time-stamp mechanism is inappropriate for determining whether a derived object can be reused. When
there are multiple versions, a time stamp is insufficient for deciding from which versions of input
objects a derived object was generated. Another problem is that Make does not record the
parameter settings on derivers. For example, it is impossible to decide whether a given machine
code module was produced with optimization turned on or off. Make also handles derivation
processes with intermediate objects inefficiently, because it always rederives a target object if its
intermediate objects have been deleted, regardless of whether the target is up-to-date. Finally,
Make provides derivation rules for atomic objects only; processing of configurations must be
programmed explicitly.
DSEE's handling of derived objects is more reliable. Each derived object carries a history attribute
that describes precisely how the object was produced, including version identifiers and parameter
settings. For high speed processing, DSEE performs parallel manufacture on idle workstations. A
remaining drawback is that DSEE provides no general rule for processing configurations; the
individual steps have to be programmed explicitly for every configuration.
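The difference from a time-stamp check can be sketched with a cache that records, for each derived object, which deriver, input versions, and parameters produced it. The class, its methods, and the sample names are hypothetical and do not describe DSEE's actual data structures; treating unspecified parameters as "don't care" follows the remark on deriver parameters in Section 5.

```python
class DerivedObjectCache:
    """Cache keyed by derivation history rather than by time stamps."""

    def __init__(self):
        self.entries = []  # (deriver, input versions, parameters, derived object)

    def store(self, deriver, inputs, params, derived):
        # inputs maps each input object to the version identifier that was used.
        self.entries.append((deriver, frozenset(inputs.items()), dict(params), derived))

    def lookup(self, deriver, inputs, wanted_params):
        """Return a cached object with matching history, or None.

        Parameters absent from wanted_params are treated as "don't care".
        """
        key = frozenset(inputs.items())
        for cached_deriver, cached_inputs, params, derived in self.entries:
            if (cached_deriver == deriver and cached_inputs == key
                    and all(params.get(k) == v for k, v in wanted_params.items())):
                return derived
        return None
```

A request that leaves the optimization setting open can reuse whichever object is cached, while a request for a different input version or a different explicit setting misses and triggers a new derivation.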
In opportunistic manufacture, derivers start up automatically as soon as new source object
versions are created. By running derivations in parallel with the programmer's activities,
opportunistic manufacture attempts to have derived objects ready ahead of time. This approach
reduces programmer idle time. However,
a problem is limiting the combinatorial explosion of derivations caused by multiple versions.
Without a specific target configuration, almost all of the derivation runs after a change could be
useless.
Odin is a flexible system for managing derived objects. Similar to Make, it uses an extensible set
of rules that form a derivation graph for object types. Unlike Make, Odin's rule language covers
derived configurations as well as atoms, and distinguishes sequences and composites. (Make only
has sequences.) Users need only indicate the objects to be combined in configurations, and Odin
determines how to process them, based on their types. For instance, it is not necessary to always
redescribe how configurations are linked, or how documents consisting of several parts are
processed. Furthermore, composites handle derivation processes with more than one output
cleanly.
Odin also provides facilities for including quality control tests, such as regression tests, as part of
the derivation. In its cache of derived objects, Odin stores a full history attribute, including the
parameters used during derivation. Unfortunately, support for versioning is poor.
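A derivation graph over object types can be sketched as follows. This is only an illustration of the idea, not Odin's rule language; the rule table and the function `derivation_path` are hypothetical. Each rule names an input type, an output type, and a deriver, and a breadth-first search finds the sequence of derivers that turns one type into another.

```python
from collections import deque

# Hypothetical rules: (input type, output type, deriver name).
RULES = [
    ("y", "c", "yacc"),
    ("c", "o", "compile"),
    ("o", "exe", "link"),
]


def derivation_path(source_type, target_type, rules=RULES):
    """Return the list of derivers leading from source_type to target_type, or None."""
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, steps = queue.popleft()
        if current == target_type:
            return steps
        for src, dst, deriver in rules:
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, steps + [deriver]))
    return None
```

Because the search works on types, the user only names the objects and the desired result; adding a rule to the table extends the graph without touching existing rules.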

Automatic system manufacture guarantees that the correct derived objects are produced when
necessary. However, the cost of the processing involved may be too high. In large system families,
changing a single line in an object with shared declarations may trigger massive recompilations.
Many of these recompilations may be redundant, because the change may actually affect only a
small fraction of the compilation units. Selective recompilation mechanisms, such as smart
recompilation [40], reduce the number of redundant derivations. These mechanisms analyze
changes for their effect and prevent redundant compilations when, for example, an unused
declaration is deleted, a new declaration is added, or a comment is changed.
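The selection step can be sketched in Python under simplifying assumptions. The sketch is not the algorithm of [40]: declarations are modeled as a mapping from names to their text, each compilation unit is described by the set of names it uses, and a newly added name is assumed never to hide or conflict with a name already in use. The functions `changed_names` and `units_to_recompile` are hypothetical.

```python
def changed_names(old_decls, new_decls):
    """Names whose declaration was deleted or modified; pure additions are ignored."""
    return {name for name, decl in old_decls.items() if new_decls.get(name) != decl}


def units_to_recompile(old_decls, new_decls, uses):
    """Compilation units that reference at least one deleted or modified declaration."""
    changed = changed_names(old_decls, new_decls)
    return sorted(unit for unit, names in uses.items() if names & changed)
```

Under these assumptions, deleting an unused declaration or adding a new one selects no units at all, and a comment change never appears in the mapping in the first place.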
Hood et al. [17] generalize smart recompilation to recursive interface dependencies. Smarter
recompilation reduces the number of recompilations further by allowing harmless inconsistencies
to remain. As an example, consider a type declaration T used in a set S of source objects. Assume
we change T into T', and update a few source objects to be compatible with T'. Suppose
furthermore we can partition S into a subset S1 in which only T is used, and a subset S2 in which
only T' is used. If there are no interactions among S1 and S2 that depend in any way on T or T',
then recompilation of S1 is not necessary and smarter recompilation will suppress it. More
important than saving the recompilations is perhaps the fact that programmers can delay the work
of making the source objects in S1 compatible with T'. Thus, programmers can test their changes
without having to wait for others to bring their modules up-to-date. Without a mechanism for
managing inconsistencies in this manner, programmers have to resort to the unsafe practice of
subverting the type checking and manufacturing system to get their work done.
7 Modification Requests
A modification request (MR) is a change proposal. General configuration management is MR-driven, that is, every change is initiated by one or more MRs. Tracking of modification requests
makes it possible to answer questions about past, current, and future capabilities of a system
family, as well as providing important management data about project status. There is no reason
why SCM should not be MR-driven as well, yet few tools for managing software modification
requests exist. The author is aware of only two published tools: MRCS [23], a control system
running on Unix, and Crystal [2], an SCM system that integrates version control, MR tracking, and
project management.
Modification requests propose to correct errors, to modify existing system capabilities, or to extend
or contract capabilities. An MR may address any set of source objects in the software lifecycle:
requirements documents, design documents, interfaces, program code, test data, documentation,
etc. An MR should be machine-readable and is itself a source object.
Versions of MRs do not seem necessary, but each MR has an attribute that reflects its state.
A useful set of states is submitted, rejected, accepted, delayed, in progress, and completed. When
an MR is first entered, it has state submitted. A review decides whether to accept or reject the MR.
A rejected MR is not discarded, but filed with a note describing the reason for rejection. A third
alternative is to delay an MR, which means that it will assume the state submitted again at a later
time for reconsideration. Once the work involved in an MR is assigned to a person, then the MR
assumes state in progress. State completed indicates the modifications required by the MR have
been performed and tested. To allow for effective tracing of an MR, the state attribute should not
just have the current value, but should actually log all previous states, including the times when the
state changes occurred. That way, it is easy to determine the history of MRs and to find MRs that
have fallen behind schedule.
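The state attribute with its log can be sketched as a small Python class. The class name, the field names, and the exact transition table are assumptions read off the description above, not the design of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Allowed transitions; this particular table is an assumption based on the text.
TRANSITIONS = {
    "submitted": {"accepted", "rejected", "delayed"},
    "delayed": {"submitted"},
    "accepted": {"in progress"},
    "in progress": {"completed"},
    "rejected": set(),
    "completed": set(),
}


@dataclass
class ModificationRequest:
    title: str
    log: list = field(default_factory=list)  # entries are (state, timestamp, note)

    def __post_init__(self):
        # A newly entered MR starts in state "submitted".
        if not self.log:
            self.log.append(("submitted", datetime.now(), ""))

    @property
    def state(self):
        """The current state is the most recent log entry."""
        return self.log[-1][0]

    def move_to(self, new_state, note="", when=None):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state!r} to {new_state!r}")
        self.log.append((new_state, when or datetime.now(), note))

    def history(self):
        """All states the MR has passed through, oldest first."""
        return [state for state, _, _ in self.log]
```

Keeping every transition with its time stamp means that questions such as "which MRs have been in progress for more than a month" reduce to a scan over the log.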
Usually, each programmer is responsible for a set of related MRs. This set can be represented
naturally by a configuration. Configurations of MRs are often called tasks, and are associated with
a workspace for managing temporary objects.
Additional useful data items associated with an MR are the relations has-MR and has-change. The
first links an object with its MRs, the second an MR with the updates it caused. These relations
support MR-based selection, as illustrated in the second query in Section 5. The submitter or
reviewer of an MR establishes has-MR, while has-change is entered during check-in. To simplify
entry and prevent errors, check-in should allow selection from a menu of relevant MRs. This set
can easily be derived from the MR configurations in the workspace.
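The two relations can be sketched as a pair of mappings in Python. The class `MRRelations` and its methods are hypothetical, and the rule that a check-in must name an MR from the programmer's current task is one possible reading of the menu-based selection described above.

```python
from collections import defaultdict


class MRRelations:
    """Hypothetical store for the has-MR and has-change relations."""

    def __init__(self):
        self.has_mr = defaultdict(set)      # object name -> ids of MRs filed against it
        self.has_change = defaultdict(set)  # MR id -> (object name, version) updates

    def attach(self, obj, mr_id):
        """Entered by the submitter or reviewer of an MR."""
        self.has_mr[obj].add(mr_id)

    def menu(self, obj, task):
        """MRs of the current task (a set of MR ids) that are filed against obj."""
        return sorted(task & self.has_mr.get(obj, set()))

    def check_in(self, obj, version, mr_id, task):
        """Record that a checked-in version was caused by an MR of the current task."""
        if mr_id not in task:
            raise ValueError(f"{mr_id} is not part of the current task")
        self.has_change[mr_id].add((obj, version))

    def changes_for(self, mr_id):
        """All updates caused by an MR; this supports MR-based selection."""
        return sorted(self.has_change.get(mr_id, set()))
```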
Crystal implements the above relations. A simple experiment with bug reports on a medium-sized
system showed that software engineers can attach their MRs to the affected objects in an
unfamiliar system with high accuracy, provided the overall system architecture is explained with a
few sentences per object. Crystal therefore presents the submitter of an MR with a sophisticated
browser for locating relevant objects. This browser shows system configurations graphically and
lets the user read documentation as well as the existing MRs (to avoid duplication). As a heuristic
to speed up the search process, the browser even highlights "suspect" components, i.e., those that
changed relative to the last baseline. Once the relation has-MR has been entered, it opens several
possibilities for project management support. The history of the objects can be inspected to
identify competent programmers for carrying out the changes. The history can also yield a rough
estimate for the time required for the change, by averaging past periods between check-out and
check-in. In Crystal, this information is used to update a PERT chart of maintenance activities.
[...] is to accommodate versions of operations, for example versions of compilers. The benefits of
semantic modeling are greater conceptual clarity, direct representation of the model for machine
interpretation, more sophisticated operations and queries, and simplified implementation. Finally,
an interesting topic is building a maintainer's assistant, i.e., a program that helps with carrying
out changes in complex software systems. The maintainer initiates a change, while the assistant
provides decision support and takes over the task of bringing the system back into a consistent
state. For example, the assistant detects all places that are affected by a given change and presents
them to the programmer for update. It proposes corrections and perhaps even derives corrections
by observing the programmer. This approach is, of course, not limited to programs; it is just as
applicable to updating specifications or other formal representations consistently. Intensive
research in smart editing systems will be needed to achieve the goal of automating consistency
maintenance.