Chapter 1. Introduction to ClearCase

ClearCase is a comprehensive software configuration management system. It manages multiple variants of evolving software systems, tracks which versions were used in software builds, performs builds of individual programs or entire releases according to user-defined version specifications, and enforces site-specific development policies.
These capabilities enable ClearCase to address the critical requirements of organizations that produce and release software:
  • Effective development — ClearCase enables users to work efficiently, allowing them to fine-tune the balance between sharing each other's work and isolating themselves from destabilizing changes. ClearCase automatically manages the sharing of both source files and the files produced by software builds.
  • Effective management — ClearCase tracks the software build process, so that users can determine what was built, and how it was built. Further, ClearCase can instantly recreate the source base from which a software system was built, allowing it to be rebuilt, debugged, and updated — all without interfering with other programming work.
  • Enforcement of development policies — ClearCase enables project administrators to define development policies and procedures, and to automate their enforcement.

ClearCase Data Structures

Figure 1-1 shows a development environment managed by ClearCase. At its heart is a permanent, secure data repository. It contains data that is shared by all users: this includes current and historical versions of source files, along with derived objects built from the sources by compilers, linkers, and so on. In addition, the repository stores detailed “accounting” data on the development process itself: who created a particular version (and when, and why), what versions of sources went into a particular build, and other relevant information.

Figure 1-1. ClearCase Development Environment


Only ClearCase commands can modify the permanent data repository. This ensures orderly evolution of the repository and minimizes the likelihood of accidental damage or malicious destruction.
Conceptually, the data repository is a globally accessible, central resource. The implementation, however, is modular: each source (sub)tree can be a separate versioned object base (VOB). VOBs can be distributed throughout a local area network, accessed independently or linked into a single logical tree. To system administrators, modularity means flexibility; it facilitates load-balancing in the short term, and enables easy expansion of the data repository over the long term.

Note: The repository can even be distributed over a wide-area network, or to sites that have no live data connection at all. This capability is implemented by Atria's MultiSite product, available separately.


Views and Transparent Access

As Figure 1-1 illustrates, users access the ClearCase data repository through views. A view is a software development work environment that is similar to — but greatly improves on — a traditional “development sandbox”. Each view can easily be configured to access just the right source data from the central repository:
  • the up-to-date versions for development of the next major release
  • the versions that went into the port of Release X.Y to hardware architecture Z
  • the versions being used to fix bug #ABC in Release D.E
A view is an isolated “virtual workspace”, which provides dynamic access to the entire data repository. The changes being made to a source file in a particular view are invisible to other views; software builds performed in a view do not disturb the work taking place in other views.
Working in views, ClearCase users access version-controlled data using standard pathnames and their accustomed commands and programs. The view accesses the appropriate data automatically and transparently.
A view's isolation does not render it inaccessible; a view can be accessed from any host in the local area network. For example, a distributed build involves execution of build scripts on several hosts at once, all in the same view. Similarly, a view can be shared by several users, working on a single host or on multiple hosts. One user might “peek” into another's view, just to see what changes are being made to a particular file.

Version Control

The most basic requirement for a software configuration management system is version control — maintaining multiple versions of software development objects. Traditional version-control systems handle text files only; ClearCase manages all software development objects: any kind of file, and directories and links, as well.
Versions of text files are stored efficiently as deltas, much like SCCS or RCS versions. Versions of non-text files are also stored efficiently, using data compression. Version control of directories enables the tracking of changes to the organization of the source code base, which are just as important as changes to the contents of individual files. Such changes include creation of new files, renaming of files, and even major source tree “cleanups”.

Versioned Object Bases (VOBs)

ClearCase development data is organized into any number of versioned object bases (VOBs). Each VOB provides permanent storage for all the historical versions of all the source objects in a particular directory tree. As seen through a ClearCase view, a VOB seems to be a standard directory tree — the “right” versions of the development objects appear, and all other versions are hidden (Figure 1-2). How this works is described in section “Environment Management”.

Figure 1-2. VOBs Appear in Views as Ordinary Directory Trees


A version-controlled object in a VOB is called an element; its versions are organized into a version tree structure, with branches and subbranches (Figure 1-3). As this figure shows, branches have user-defined names, typically chosen to indicate their role in the development process. All versions have integer ID numbers; important versions can be assigned version labels to indicate development milestones, such as a product release.

Figure 1-3. Version Tree of an Individual Element
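
The version-tree structure described above can be modeled in a few lines of Python. This is only an illustrative sketch, not ClearCase's storage format: the class name Element, its method names, and the slash-separated branch paths (such as /main/fix819) are assumptions made for the example, as is the choice to model version 0 of a new branch as a copy of the version it was created from.

```python
class Element:
    """Toy model of an element: branches of numbered versions, plus labels."""

    def __init__(self, name):
        self.name = name
        # branch path -> list of contents; the list index is the version number.
        # Version 0 of the main branch is modeled as empty (an assumption).
        self.branches = {"/main": [""]}
        self.labels = {}  # version label -> (branch path, version number)

    def checkin(self, branch, contents):
        """Append a new version to a branch and return its integer ID."""
        self.branches[branch].append(contents)
        return len(self.branches[branch]) - 1

    def mkbranch(self, parent, number, name):
        """Create a subbranch at version `number` of branch `parent`."""
        path = f"{parent}/{name}"
        self.branches[path] = [self.branches[parent][number]]
        return path

    def mklabel(self, label, branch, number):
        """Attach a version label to one existing version of the element."""
        if number >= len(self.branches[branch]):
            raise ValueError("no such version")
        self.labels[label] = (branch, number)


util = Element("util.c")
util.checkin("/main", "int main(void) { return 0; }")  # version 1 on /main
util.mklabel("RLS2.6", "/main", 1)                      # mark a milestone
fix = util.mkbranch("/main", 1, "fix819")               # new subbranch
util.checkin(fix, "int main(void) { return 1; }")      # version 1 on the subbranch
print(sorted(util.branches))
```

Keeping each branch as its own numbered list mirrors the idea that every branch is an independent line of development with its own sequence of integer version IDs.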

Parallel Development

Each (sub)branch in an element's version tree represents an independent “line of development”. This enables parallel development — creating and maintaining multiple variants of a software system concurrently. Creation of a variant might be a major project (porting an application to a new platform), or a minor detour (fixing a bug; creating a “special release” for an important customer).
The overall ClearCase parallel development strategy is as follows:
  • Establish a baselevel — Development work on a new variant begins with a consistent set of source versions, identified (for example) by a common version label.
  • Use dedicated branches — All changes for a particular variant are made on newly-created branches with a common name.
  • Isolate changes in views — Development work for a particular variant takes place in one or more views that are configured to “see” the versions on the dedicated branches.
For example, changes to several source files might be required to fix bug #819, which was reported in Release 2.6. For each file element, the changes are made on a new branch (named fix819), created at the “baseline” version (labeled RLS2.6). The view in which a user works to fix the bug sees the fix819 branch versions, or else “falls back” to the baseline RLS2.6 version (“View 1” in Figure 1-4). For contrast, this figure also illustrates another view, configured to select different versions of the same file elements.

Figure 1-4. Parallel Development


This strategy enables any number of views — and thus any number of development projects — to be active concurrently. All the views access the required source versions from the shared data repository.
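
The "dedicated branch, else fall back to the baseline" selection in the fix819 example can be sketched as an ordered list of rules, where the first rule that matches an element wins. The function select_version, the rule tuples, the dictionary layout, and the version numbers below are all hypothetical, chosen only to illustrate the idea; they are not ClearCase's rule syntax.

```python
def select_version(element, rules):
    """Return the first version ID matched by an ordered list of rules.

    element: {"branches": {branch_path: latest_version_number},
              "labels":   {label: version_id}}
    rules:   list of ("branch", name) or ("label", name), tried in order.
    """
    for kind, name in rules:
        if kind == "branch":
            # Select the most recent version on a branch with this name.
            for path, latest in element["branches"].items():
                if path.split("/")[-1] == name:
                    return f"{path}/{latest}"
        elif kind == "label":
            # Select the version carrying this label, if any.
            if name in element["labels"]:
                return element["labels"][name]
    return None  # no rule matched this element


# util.c was changed for the bug fix; msg.c was not (made-up version numbers).
util_c = {"branches": {"/main": 7, "/main/fix819": 2},
          "labels": {"RLS2.6": "/main/4"}}
msg_c = {"branches": {"/main": 5}, "labels": {"RLS2.6": "/main/3"}}

bugfix_view = [("branch", "fix819"), ("label", "RLS2.6")]
mainline_view = [("branch", "main")]

print(select_version(util_c, bugfix_view))    # version on the fix819 branch
print(select_version(msg_c, bugfix_view))     # falls back to the RLS2.6 version
print(select_version(util_c, mainline_view))  # a different view, a different version
```

Because the rules are evaluated per element, a single short rule list configures an entire source tree: elements touched by the fix show their fix819 versions, and everything else shows the baseline.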

Merging Branches

There is an additional important aspect of the ClearCase parallel development strategy. Work performed on subbranches should periodically be reintegrated (merged) into the main branch, the principal line of development. ClearCase includes tools that automate this process.

Extended Namespace

Most of the time, a user needs just the one version of an element that appears in his or her view. In some situations, however, the user needs convenient access to other versions. Examples include merging the changes made on a subbranch into the main branch, and searching all the versions of an element for an old phrasing of an error message.
ClearCase makes access to historical versions easy, by extending the standard file/directory namespace. In essence, the entire version tree of every element is embedded under its standard pathname. Most of the time, the version tree remains hidden; but special version-extended pathnames allow any program to access any (or all) of an element's versions (Figure 1-5).

Figure 1-5. Version-Extended Pathnames
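
As a rough illustration of how a version-extended pathname decomposes, the following Python sketch splits one into its element pathname, branch path, and version. It assumes the "@@" extended-naming marker, which this chapter does not spell out; the function name and the sample pathnames are likewise hypothetical.

```python
def split_version_extended(pathname, marker="@@"):
    """Split 'path@@/branch/.../version' into (element, branch, version).

    Returns (pathname, None, None) when no version extension is present.
    The "@@" marker is an assumption of this sketch.
    """
    element, sep, version_id = pathname.partition(marker)
    if not sep:
        return pathname, None, None
    # The last component names the version; the rest is the branch path.
    branch, _, version = version_id.rpartition("/")
    return element, branch, version


print(split_version_extended("/vobs/proj/src/util.c@@/main/fix819/2"))
print(split_version_extended("/vobs/proj/src/util.c"))
```

The point of the sketch is that the standard pathname survives intact as a prefix, which is why ordinary programs that know nothing about versions can still be handed such a name.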

Environment Management

A software configuration management system must provide a flexible, efficient collection of “development environments”, or “workspaces”, in which users can do their work. ClearCase views fulfill this role, providing these services:
  • access to the appropriate versions of development sources
  • private data storage for use in day-to-day development tasks
  • isolation from activity taking place in other views
  • automatic and user-requested facilities for sharing data with other views, when appropriate

Views and Transparent Access

As described in “ClearCase Data Structures”, a view directly accesses the version-controlled elements in the permanent, shared data repository. There is no need to copy the versions required for a particular project to a view; instead, the correct versions are accessed dynamically. A particular version of each element is selected according to user-specified rules in the view's config spec (“configuration specification”): a file element appears to be an ordinary file; a directory element appears to be an ordinary directory.
The overall effect of automatic version selection is transparency: the version-control system becomes invisible, so that a VOB appears to be a standard directory tree (Figure 1-6). This key feature enables ClearCase to work smoothly with standard system software, third-party commercial applications, and a development team's “home-grown” software tools. Users do not have to discard their accustomed ways of working, or their existing tools. For example, such standard programs as grep, more, ls, and cc will work the same way on ClearCase data as on non-ClearCase data.

Figure 1-6. Version Selection by a View


Views are dynamic — config spec rules are continually reevaluated. This means that a view is open-ended; as new data is added to the central repository, it is immediately accessible to all views. It also means that a view's configuration can be instantly modified — for example, to “shut out” a recent destabilizing change to the repository.

The View as Isolated Workspace

In addition to providing automatic version selection, a view provides an isolated workspace in which users perform such tasks as editing source files, compiling and linking object modules, and testing executables. Any number of users can work in the same source directory, building the same programs; they will never interfere with each other, as long as they work in different views. Conversely, two or more users working together closely can share a single view.
A view's isolation is achieved, in part, by its having a private storage area. This area is principally used to store:
  • source files that are being edited by the user(s) working in the view
  • derived objects produced by software builds—object modules, executables, and so on
This area is also used for incidentals, such as text-editor backup files and cut-and-paste temporary files.

The “Virtual Workspace”

All the files in view-private storage appear to be in the appropriate VOB directory, even though they are (typically) stored on the user's workstation, rather than in the central data repository. That is, the view combines objects in view-private storage with objects in the shared repository to form an isolated “virtual workspace”.
Figure 1-7 shows a listing of a VOB directory, as it appears in the “virtual workspace” created by a view.

Figure 1-7. View Development Environment

Example: Editing Source Files in a View

A user, working in a view, enters a checkout command to make a source file editable (Figure 1-8). This seems to change a file element in the data repository from read-only to read-write. In reality, ClearCase copies the read-only repository version to a writable file in the view's private storage area. This writable file, the checked-out version, appears in the view at the same pathname as the file element; the view accesses this editable, checked-out version until the user enters a checkin command, which updates the repository and deletes the view-private file.

Figure 1-8. Checkout/Checkin and View-Private Storage
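
The checkout/checkin cycle and the way view-private files eclipse repository versions can be modeled with a small Python sketch. The Repository and View classes are hypothetical, and for simplicity every view here selects the most recent checked-in version; a real view would select versions according to its own rules.

```python
class Repository:
    """Shared store: pathname -> list of checked-in contents."""

    def __init__(self):
        self.versions = {}

    def latest(self, path):
        return self.versions[path][-1]


class View:
    """A workspace that overlays view-private files on the shared repository."""

    def __init__(self, repo):
        self.repo = repo
        self.private = {}  # pathname -> editable, view-private contents

    def read(self, path):
        # A view-private file appears in place of the repository version.
        if path in self.private:
            return self.private[path]
        return self.repo.latest(path)

    def checkout(self, path):
        # Copy the read-only repository version into view-private storage.
        self.private[path] = self.repo.latest(path)

    def write(self, path, contents):
        if path not in self.private:
            raise PermissionError(f"{path} is not checked out in this view")
        self.private[path] = contents

    def checkin(self, path):
        # Add a new version to the repository and delete the private copy.
        self.repo.versions[path].append(self.private.pop(path))


repo = Repository()
src = "/vobs/proj/src/util.c"
repo.versions[src] = ["int x;"]

view1, view2 = View(repo), View(repo)
view1.checkout(src)
view1.write(src, "int y;")
seen_before = view2.read(src)  # the edit in view1 is invisible to view2
view1.checkin(src)
seen_after = view2.read(src)   # after checkin, the new version is shared
print(seen_before, "->", seen_after)
```

The same overlay explains why the file seems to change from read-only to read-write in place: the pathname never changes, only the storage that stands behind it.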

The View as Shared Resource

Subject to access permissions, a view is a shared resource, available on all hosts in the local area network. Each view is globally accessible through a simple name, its view-tag. An individual user might use the same view on several hosts during a distributed software build. During an integration period, several users might share a single view, each using his or her own workstation.

View-Extended Naming

A user sometimes needs to compare (or otherwise manipulate) the data seen through two or more views. Access to multiple views in a single command is made possible through view-extended naming. “Extended Namespace” describes how ClearCase extends the file system “downward” by embedding an element's entire version tree under its pathname. Similarly, ClearCase extends the file system “upward” by creating a virtual super-root directory (the viewroot) which conceptually contains all active views. A view-extended pathname accesses the version of a particular element that is seen by a particular view (Figure 1-9).

Figure 1-9. View-Extended Pathname
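
A view-extended pathname can be thought of as an ordinary pathname prefixed by the viewroot and a view-tag. The Python sketch below separates the two parts; the viewroot name "/view", the function name, and the sample view-tag are assumptions for illustration only.

```python
def split_view_extended(pathname, viewroot="/view"):
    """Split '<viewroot>/<view-tag>/<path>' into (view-tag, path).

    Returns (None, pathname) for a pathname that is not view-extended.
    The default viewroot name is an assumption of this sketch.
    """
    prefix = viewroot.rstrip("/") + "/"
    if not pathname.startswith(prefix):
        return None, pathname
    view_tag, _, rest = pathname[len(prefix):].partition("/")
    return view_tag, "/" + rest


print(split_view_extended("/view/bugfix/vobs/proj/src/util.c"))
print(split_view_extended("/vobs/proj/src/util.c"))
```

Comparing what two views see then amounts to naming the same element pathname under two different view-tags in a single command.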

Build Management

ClearCase supports makefile-based building of software systems. This means users can continue to build systems using their accustomed procedures. They can even use the same tools — for example, a host's system-supplied make program or a third-party build utility. ClearCase's own build program, clearmake, provides compatibility with other make variants, along with powerful enhancements.

Build Auditing

clearmake's fundamental enhancement is build auditing: monitoring of file system activity during a software build, at the system-call level. clearmake implements this capability by working with ClearCase's virtual file system extension, the multiversion file system (MVFS).
Build auditing enables complete and automatic documentation of software builds. A build's “bill-of-materials” and “assembly instructions” are preserved in configuration records (Figure 1-10).

Figure 1-10. Build Auditing and Configuration Records


The files produced by a build (object modules, executables, libraries, and so on) are cataloged in the central repository as derived objects.
Users can compare different builds of the same program — different derived objects built at the same pathname — through their configuration records. Moreover, clearmake automatically uses configuration records during subsequent builds, to implement additional build enhancements.

Build Avoidance

Standard make programs support incremental building of software systems through build avoidance. A “make” of an entire system actually rebuilds only those components that need to be rebuilt, because they are out-of-date with respect to the corresponding source files.
clearmake's build-avoidance scheme is more sophisticated, and specifically designed for use in parallel development situations. Typically, each user modifies only a few source files at a time to produce a variant of a software system. If the same version of a particular source file is used by several programmers, it would be compiled to exactly the same object module in each of their views. clearmake uses configuration records to detect such situations; instead of performing redundant builds, it causes a single derived object to be shared among the views (Figure 1-11). This facility, termed wink-in, saves both disk storage and build time.

Figure 1-11. Derived Object Sharing
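
The idea behind wink-in can be sketched as a lookup keyed on a simplified configuration record: the target, its build script, and the exact versions of the files used. The Python below is a toy model under those assumptions. The class DerivedObjectPool and the function fake_compile are hypothetical, and the input versions are passed in explicitly, whereas clearmake gathers them by auditing the build.

```python
class DerivedObjectPool:
    """Toy catalog of derived objects, keyed by a simplified config record."""

    def __init__(self):
        self.records = {}    # config record -> derived object
        self.builds_run = 0  # number of times the build script really ran

    def build(self, target, script, inputs, compile_fn):
        """inputs maps each pathname used to the version the view selected."""
        record = (target, script, frozenset(inputs.items()))
        if record in self.records:
            # Same script, same source versions: share the existing object.
            return self.records[record], "wink-in"
        derived = compile_fn(target, inputs)
        self.builds_run += 1
        self.records[record] = derived
        return derived, "rebuilt"


def fake_compile(target, inputs):
    # Stand-in for running the real build script.
    return f"{target} built from {sorted(inputs.items())}"


pool = DerivedObjectPool()
srcs = {"util.c": "/main/4", "hello.h": "/main/2"}

_, first = pool.build("util.o", "cc -c util.c", srcs, fake_compile)
_, same = pool.build("util.o", "cc -c util.c", dict(srcs), fake_compile)
_, new_script = pool.build("util.o", "cc -c -O util.c", srcs, fake_compile)
_, new_header = pool.build("util.o", "cc -c util.c",
                           {**srcs, "hello.h": "/main/3"}, fake_compile)
print(first, same, new_script, new_header)
```

Because the key includes both the build script and every input version, the same comparison covers the cases described in the following sections: a changed header or a changed build script produces a different record and therefore forces a rebuild.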


Automatic Dependency Detection

Configuration records enable automatic checking of source dependencies as part of build avoidance. All such dependencies (for example, on C-language header files) are logged in a build's configuration record, whether or not they are explicitly declared in a makefile.

Build Script Checking

Configuration records also enable the build-avoidance algorithm to include checking of a target's build script. If the build script has changed, clearmake rebuilds the target. Many make variants ignore build-script changes, and thus fail to perform a rebuild when it is actually required.

Building on Remote Hosts

clearmake supports efficient building of large software systems through its ability to execute multiple build scripts in parallel, and to distribute build script execution to a group of hosts in the local area network. A tunable load-balancing scheme optimizes the use of network resources during a distributed build.
A build can even take place on a host where ClearCase itself has not been installed. This feature is particularly valuable for organizations that support multiple hardware/software architectures, including some not supported by ClearCase.

Process Control

ClearCase provides mechanisms for monitoring and controlling the development process itself. ClearCase does not attempt to impose its own particular policies or procedures — instead, it includes a flexible, powerful toolset, which administrators can use to implement an organization's existing policies.
Process management comprises several functional areas, which ClearCase addresses both with static mechanisms (control structures) and dynamic mechanisms (procedures). Some of the mechanisms are completely automatic; others are created and/or controlled by users and administrators.

Information Capture and Retrieval

ClearCase automatically logs each change to the data repository in the form of an event record, providing an audit trail of development activities. ClearCase includes commands for creating reports based on these records, with many selection and filtering options. Such reports can include the event records for a single version-controlled object, for any user-specified set of objects, or for entire VOBs. Event records can be retained indefinitely, or can be “scrubbed” selectively on a periodic basis, in order to conserve disk space.
When a user merges the changes made on one branch of an element into another branch, ClearCase automatically writes an event record, and also connects the merged versions with a merge arrow (see Figure 1-3). This makes it easy to track (and often, to fully automate) the process of integrating work performed on subbranches back into the main line of development.
Merge arrows are a special case of a more general mechanism for indicating a relationship between two objects. Any two objects in the central repository can be connected with a logical arrow called a hyperlink. This capability addresses such process-control needs as requirements tracing.

Meta-Data Annotations

To supplement the information automatically captured by ClearCase, users can explicitly annotate file system objects. Such annotations are termed meta-data. The hyperlinks and merge arrows discussed just above are one form of meta-data; so are the version labels first discussed in “Version Control”. Attributes provide yet another annotation facility, in the form of name/value pairs. For example, an attribute named CommentDensity might be attached to each version of a source file, to indicate how well the code is commented. Each such attribute might have the value "unacceptable", "low", "medium", or "high".

Notification Procedures

Virtually any operation that modifies the data repository can trigger the execution of a user-defined procedure. A typical use for this capability is to notify one or more users that the operation took place. For example, a trigger on the checkin operation might send mail to the QA department, explaining that a particular user modified a particular file. Special environment variables make the relevant information available to the script or program that implements the user-defined procedure.
In addition to performing notification tasks, triggers can automate a wide variety of process management functions — for example:
  • adding meta-data annotations to the objects that were just modified
  • logging information that is not included in the event records that ClearCase creates automatically
  • initiating a build procedure and/or source-code-analysis procedure whenever certain objects are modified

Policy Enforcement

Every organization has its own “rules of the road”, which provide guidance (gentle or otherwise) as to where, when, and how development activities are to take place. ClearCase's trigger mechanism, introduced in the preceding section, provides a flexible tool for implementing development policies. In particular, a trigger can impose any user-defined requirement or prerequisite on any operation that modifies the data repository.
For example, a trigger might fire whenever a user attempts to check in a new version of a critical file. The trigger procedure can subject the user and/or the file to any kind of test; if the test fails, the procedure can cancel the checkin.
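
Both uses of triggers, notification after an operation and enforcement before it, can be sketched as procedures registered against an operation name. In this hypothetical Python model, a pre-operation procedure cancels the checkin by raising an exception, and the relevant information is passed in a dictionary rather than in environment variables. The names Vob, TriggerVeto, require_comment, and notify_qa are illustrative only.

```python
class TriggerVeto(Exception):
    """Raised by a trigger procedure to cancel the operation."""


class Vob:
    """Toy repository whose checkin operation fires user-defined triggers."""

    def __init__(self):
        self.versions = {}       # pathname -> list of contents
        self.pre_triggers = {}   # operation name -> procedures run before it
        self.post_triggers = {}  # operation name -> procedures run after it

    def add_trigger(self, operation, procedure, when="pre"):
        table = self.pre_triggers if when == "pre" else self.post_triggers
        table.setdefault(operation, []).append(procedure)

    def checkin(self, user, path, contents, comment=""):
        info = {"op": "checkin", "user": user, "path": path, "comment": comment}
        for procedure in self.pre_triggers.get("checkin", []):
            procedure(info)  # may raise TriggerVeto, cancelling the checkin
        self.versions.setdefault(path, []).append(contents)
        for procedure in self.post_triggers.get("checkin", []):
            procedure(info)


notices = []


def require_comment(info):
    # Policy enforcement: refuse a checkin that has no comment.
    if not info["comment"]:
        raise TriggerVeto("a checkin comment is required")


def notify_qa(info):
    # Notification: record who modified which file.
    notices.append(f"{info['user']} modified {info['path']}")


vob = Vob()
vob.add_trigger("checkin", require_comment, when="pre")
vob.add_trigger("checkin", notify_qa, when="post")

path = "/vobs/proj/src/util.c"
vob.checkin("akp", path, "new text", comment="fix bug 819")
try:
    vob.checkin("akp", path, "more text")
except TriggerVeto as veto:
    print("checkin cancelled:", veto)
print(notices)
```

Running the enforcement procedures before the repository is modified is what allows a failed test to cancel the operation cleanly, with no new version created and no notification sent.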

Access Control

Various objects in the data repository can be locked, which prevents them from being modified or used. Locks can be fine-grained (for example, locking a particular branch of a particular element) or general (for example, locking an entire VOB). A typical application is locking just the main branch of all elements during a software integration period, with an exception for those few users who will be performing the integration work.
Access modes, or permissions, apply to all elements. Permissions control reading, writing, and executing of objects at the traditional levels of granularity: user (owner), group, and other. They also apply to the physical storage in the underlying file system. Protections effectively thwart attempts to circumvent ClearCase and tamper with the raw data storage.
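
The locking scheme described above, with coarse and fine-grained locks and a list of exempted users, can be modeled as follows. This Python sketch is hypothetical throughout: the LockTable class, the way locked objects are keyed, and the user names are assumptions made for illustration.

```python
class LockTable:
    """Toy lock table: a locked object maps to the users exempt from its lock."""

    def __init__(self):
        self.locks = {}

    def lock(self, obj, exempt_users=()):
        self.locks[obj] = set(exempt_users)

    def unlock(self, obj):
        self.locks.pop(obj, None)

    def may_modify(self, user, vob, element, branch):
        """Check locks from most general to most specific.

        Keys used in this sketch: vob for an entire VOB, (vob, branch) for
        a branch in all elements, (vob, element, branch) for one branch of
        one element.
        """
        for obj in (vob, (vob, branch), (vob, element, branch)):
            exempt = self.locks.get(obj)
            if exempt is not None and user not in exempt:
                return False
        return True


locks = LockTable()
# Integration period: lock the main branch of all elements, except for
# the user doing the integration work.
locks.lock(("/vobs/proj", "main"), exempt_users={"integrator"})

print(locks.may_modify("dev", "/vobs/proj", "util.c", "main"))
print(locks.may_modify("integrator", "/vobs/proj", "util.c", "main"))
print(locks.may_modify("dev", "/vobs/proj", "util.c", "fix819"))
```

Work on subbranches continues undisturbed while the main branch is frozen, which is the effect wanted during an integration period.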

ClearCase Client-Server Architecture

ClearCase is a “groupware” product, with a distributed client-server architecture. Both the programs that implement ClearCase functions and the development group's data can be distributed throughout a local area network. This makes ClearCase scalable — as workstations are added to the network to accommodate additional users, ClearCase's data-storage and data-processing resources increase, as well.

Note: With the MultiSite extension to ClearCase, this distribution can extend to wide-area networks as well, even to networks whose only data communications channel is magnetic tape transfer.

Figure 1-12 shows a typical distribution of ClearCase programs and development data in a network. The data storage is organized as follows:
  • The permanent, shared data repository is implemented as a collection of versioned object bases (VOBs). Several VOBs can be located on the same host; the practical limit is a function both of disk space and of processing resources.
  • Users have individual (or shared) work areas, views, each of which has a private data storage area. A view's storage area is typically located on a user's individual workstation. Central server hosts can also be used — for example, for a shared view or a view in which an entire application will be rebuilt “from scratch”.
  • For increased flexibility, the data storage for an individual VOB or view can be distributed across two or more hosts.
Users access this data with ClearCase client programs (for example, the clearmake build utility), along with standard operating system facilities (text editors, compilers, debuggers) and third-party applications. Access to the data stored in VOBs and views is mediated by ClearCase server programs. Client and server processes communicate with each other using remote procedure call (RPC) facilities. This makes ClearCase network-transparent — users need not be concerned with the physical location of data storage; ClearCase servers make the data available globally.

Figure 1-12. ClearCase Distributed Client-Server Architecture

ClearCase Interfaces

ClearCase has both a command-line interface (CLI) and a graphical user interface (GUI). The CLI is implemented as a set of executables, stored in /usr/atria/bin. (Each user should add this directory to his or her search path.)
The “first among equals” of the CLI utilities is cleartool; through a set of subcommands, it provides the functions performed most often by users: checkout, checkin, list history, display version with annotations, and so on. cleartool uses multicharacter mnemonic options:
% cleartool checkin -identical -nc util.c hello.h
“Checkin files util.c and hello.h, without any comments; do the work even if the new version is identical to its predecessor.”
% cleartool lshistory -since yesterday.17:00 -recurse /vobs/proj/src
“List all events that occurred since 5 pm yesterday, pertaining to objects within the directory tree at /vobs/proj/src.”
% cleartool merge -to msg.c -version /main/alpha_port/LATEST
“Merge the most recent version on the alpha_port branch of file msg.c into the version I'm editing.”
The ClearCase GUI includes several point-and-click programs:
  • xclearcase provides a “master control panel” that is both easy to use and thoroughly customizable. Users can examine and select both their file system data and ClearCase meta-data, with a variety of browsers.
  • xlsvtree displays the version tree of an element, making it easy both to determine how an element has evolved, and to select particular versions for comparison or merging.
  • xcleardiff is a flexible tool for comparing and/or merging the contents of multiple versions of an element, or any other files.
Figure 1-13 illustrates some of the features of the ClearCase GUI.

Figure 1-13. ClearCase Graphical User Interface

ClearCase Documentation

In addition to this manual, the CASEVision™/ClearCase Concepts Guide, the ClearCase printed documentation set includes:
  • CASEVision™/ClearCase Tutorial: important information on setting up a user's environment, along with a step-by-step tour through ClearCase's most important features.
  • CASEVision™/ClearCase User's Guide: background information and step-by-step procedures for use by individual users.
  • CASEVision™/ClearCase Administration Guide: background information and step-by-step procedures for use by ClearCase system administrators.
  • CASEVision™/ClearCase Reference Pages: all the ClearCase manual pages, for programs, data structures, and administrative utilities.

On-Line Documentation

All CLI utilities can display usage syntax summaries. The cleartool utility's help subcommand can display a usage message for individual subcommands:
% cleartool help checkout
Usage: checkout | co [-reserved | -unreserved]
[-branch branch-pname]
[[-data] [-out dest-pname] | -ndata]
[-c comment | -cq | -cqe | -nc] pname ...
The ClearCase manual pages reside within the ClearCase installation area. Users can access these manual pages with the cleartool man subcommand, or with the standard UNIX man(1) facility.
Each of the ClearCase GUI programs has its own context-sensitive help facility. Installation instructions, release notes, and supplementary technical notes are also provided in the ClearCase software.