Introduction:

This project describes how techniques and tools used in Solaris for library interface definition and binary compatibility may also prove beneficial to Free and Open Source software development projects. Indeed, a number of the practices we describe here have already been adopted by the Linux GLIBC C library project [1,2]. The underlying theme of this project is working to ensure release-to-release binary stability: end users' systems and applications keep on working even when other components of the system are upgraded.

The process we describe involves library developers (for example GLIBC, X11, GNOME, KDE, ...) defining the public interfaces of their libraries and continuing to provide those interfaces in an upward compatible manner. Any exposed internal interfaces that are not intended for application developer consumption are clearly marked as private. These private interfaces are part of the internal implementation (used, say, for communication within the package) and so do not need to evolve compatibly: by keeping developers off of these private interfaces the library system is free to evolve and modify the implementation aspect without breaking end-users. Also included in the process we describe is the practice of scoping local to each library as many symbols as possible to further reduce exposure to application breakage.

In general, the method we describe below is most useful for providing compatibility to a large, established end-user population, because the costs of binary breakage (downtime, fixing, rebuilding, and re-testing applications) are quite noticeable in that situation. We suspect that system distribution providers may find the techniques we describe to be most worthwhile, because they can be used to improve compatibility and stability for end users of their distribution. In the long term, however, basically everyone benefits from improved compatibility.

In a certain sense this project is complementary to standards projects, the most important being the Linux Standard Base [3]. One way of looking at the difference is that the Linux Standard Base works on compatibility across different Linux distributions, whereas the mechanisms discussed here are more focussed on the compatibility of a given distribution going forward in time: avoiding end-user application breakage as the user upgrades components of the system (in particular, the entire distribution). Both types of compatibility work are important. To a certain extent they overlap each other in what they accomplish, yet they also help each other out by focusing on different areas.

It should be noted that the documentation and tools provided by this project do not (and cannot) by themselves provide a complete solution to the binary instability problem. Foremost, the project plan requires the participation and commitment of library providers to be successful. The more libraries (API/ABI's) that follow this plan, the more binary stability is enhanced and the more it pays off as time goes on. In addition, this project concentrates on defining and maintaining only a certain part of the interfaces an application depends upon (namely, the library binary interface between it and the "system" shared libraries). This scheme of course cannot stop all compatibility problems (e.g. changes in file formats or file locations), but since historically a good fraction of incompatibility occurs at this interface, it is a good place to begin working.

What is the ABI:

The Application Binary Interface (ABI) is the set of supported run-time interfaces available for an application to use on the OS. The ABI is very similar to the API (Application Programming Interface), but differs in that it is the result of the source compilation process. C source code written to the OS API is transformed by the C compiler into a processor architecture specific binary for one of the ABI's (e.g. 32-bit or 64-bit address spaces) supported by the system.

The compilation process introduces several differences between the ABI and API which are important for binary compatibility:

  • Compiler directives (e.g., #define) can replace source-level constructs with different ones. The resulting binary may lack a symbol present in the source, or include one not present in the source.
  • The compiler may generate processor-specific symbols (e.g., arithmetic instructions) which invisibly augment or replace source constructs.
  • The compiler's layout of binaries may be specific to that compiler and the versions of the source language which it accepts. Thus identical code compiled with different compilers may produce incompatible binaries; this has been the case with C++.

The ABI is essentially where binaries of differing origins meet and have to work together as a single process to accomplish the task at hand for the application. There are many opportunities for failure, e.g.:

  • Missing libraries or shared objects.
  • Missing interfaces.
  • Incompatible changes in library interfaces.
  • Libraries needed by an application that, in turn, depend on a different (often incompatible) version of a third library that is also needed by the application.
  • Incompatible changes in the output and behavior of system commands, utilities, and files.

and so on. All of these need to be guarded against; otherwise applications will fail for the end-user after parts of their system or software are upgraded or replaced. There will be benefits if system integrators and ABI providers (e.g. library developers) can work successfully at further reducing exposure to application breakage. It may appear to be asking too much of the system integrators and library developers to do even more work with respect to compatibility (especially since much of this work is done on a volunteer and/or gratis basis). However, the pay-off can be huge: the tens of millions of end-users vastly outnumber the developers.

The ABI is important because it determines whether or not a binary built on one release of the OS is able to run on subsequent releases. This release-to-release binary compatibility is of increasing importance to users because it means that their investment in applications can be preserved across upgrades. Put another way, fear of application breakage due to binary incompatibility is the single biggest reason for user reluctance to adopt a new release of system software and technology; binary compatibility allays that fear.

Defining the ABI:

Today a standard installation of a Unix operating system will have roughly 20,000 public symbol interfaces exported by the libraries it provides. The number of private interfaces in the libraries (used for intra-package communication e.g. library->library or utility-command->library) is of the same order of magnitude. Library interface management is a large problem and it will continue to get even larger: API's are growing rapidly, primarily fueled by the contributions from many Free and Open Source development projects.

It is true that in the life-cycle of an API there is a rapid growth phase in which there are a great many changes, and many of these changes introduce incompatibilities. However, as time goes on certain (and eventually nearly all) parts stabilize. Because of the utility of the particular API (and of the applications that use it), dependencies upon that API grow, and hence it becomes increasingly important for it to be provided in a stable and well-defined manner.

The task of maintaining an ABI stably is non-trivial. We describe here some techniques applied in Solaris over the past five years that aid in accomplishing this task. We describe a useful framework for defining and maintaining the ABI, but, of course, a good deal of effort is required on the part of library developers to adhere to this framework and focus on and maintain compatibility.

For a given release of the system (that contains a number of "independent" library packages, e.g. GLIBC, X11, GNOME, ...) the simplest way to define the ABI is at the symbol interface level: e.g. shared library libxyz.so.1 provides the list of public interfaces:

	{sym1, sym2, ...}

and exports (presumably out of necessity to communicate with its fellow libraries and utilities in the same package, but not for consumption by external applications) the list of private symbols:

	{private_sym1, private_sym2, ...}.

Everything else is scoped local to the library and cannot be accessed, even by other libraries or utilities in the same library package.

Ideally the public symbol information is not only described by manual pages and documentation, but also resides inside the shared library binary "libxyz.so.1" itself so as to avoid possible discrepancies.

In Solaris and in the GLIBC package of Linux a further step is taken that adds a rather useful structure to the set of public symbols. It is described as follows. When a library first appears, its public symbols are put into a single named set, for example:

	PUBLIC_1: {sym, sym, ...}

and the remaining private symbols are all placed in, for example, the named set:

	PRIVATE: {sym, ...}.

(These names are made up for the sake of example and are not currently used.) Now, when the next release of the library appears, it will likely have some new functionality in the form of new interfaces [4]. Then we add a new public set that reflects this new functionality:

	PUBLIC_2: {sym', sym', ...}

as well as the original PUBLIC_1 set and the PRIVATE set (the latter may have changed in an arbitrary way, but that doesn't matter because only co-shipped libraries and/or utilities in the same library package are supposed to use the private interfaces and for a given release they all work together).

Similarly, as more releases of the library come out, additional public sets are added: PUBLIC_3, PUBLIC_4, ... etc. All of this information, the set names and set members, is recorded in the library itself in special ELF sections. We refer to this procedure as "Library Versioning" [5].

Library versioning is useful in that it can be used to avoid renaming the shared object with new minor release version numbers as it evolves. That is, instead of the sequence of new files: libxyz.so.1.1, libxyz.so.1.2, libxyz.so.1.3, ... as the library evolves, the shared object name can remain fixed at libxyz.so.1 (as long as it evolves upward compatibly). Furthermore, the traditional minor release incrementing "1.1" -> "1.2" really only indicates "something was added". With library versioning recorded in the shared library itself, the information is exactly what was added: it is the interfaces listed in the PUBLIC_2 set.

When an application is built, the Solaris and GNU link editors (ld(1)) record in an ELF section of the resulting binary executable the highest level "watermark" required by the application for each versioned library (e.g. application binary "foo" needs "libxyz.so.1" at level PUBLIC_2 and "libabc.so.1" at level PUBLIC_1, etc). At runtime, the dynamic linker reads this information and while it loads the needed shared libraries it can also quickly check whether the required version levels are supplied by the loaded shared libraries. If not, it will exit with an error since it is known at this point at least one symbol needed by the application (or shared libraries) is missing [6].
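
The load-time check can be pictured with a minimal Python sketch. It is not how either dynamic linker is implemented; the function name `missing_versions`, the dictionary layout, and the sample data are assumptions made up for illustration, reusing the example names from the paragraph above.

```python
# Illustrative model only: real dynamic linkers read this data from ELF sections.

def missing_versions(needed, provided):
    """Return (library, version) pairs the application needs but the
    installed libraries do not define.

    needed   -- {library name: version set name recorded in the application}
    provided -- {library name: set of version set names defined by the library}
    """
    missing = []
    for lib, version in sorted(needed.items()):
        if version not in provided.get(lib, set()):
            missing.append((lib, version))
    return missing


# Hypothetical application "foo" from the text above.
needed_by_foo = {"libxyz.so.1": "PUBLIC_2", "libabc.so.1": "PUBLIC_1"}

# A hypothetical older system where libxyz.so.1 has only reached PUBLIC_1.
older_system = {
    "libxyz.so.1": {"PUBLIC_1", "PRIVATE"},
    "libabc.so.1": {"PUBLIC_1", "PRIVATE"},
}

problems = missing_versions(needed_by_foo, older_system)
for lib, version in problems:
    print(f"{lib}: version {version} not found")
```

The point of the sketch is that the check compares a handful of set names per library rather than resolving every symbol, which is why it can be done quickly at load time.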

The recording of the library version symbol information and needed version levels in shared libraries and executables, and also the practice of scoping symbols local (see the following section), are useful aspects of dynamic linking. However, a more subtle benefit is having the framework in place in the shared library's source tree for maintaining the monotonically increasing and upward compatible chain of public symbols (the PUBLIC_1, PUBLIC_2, ... sets in the above example) as well as scoping local any symbols that do not need to be visible.

Mechanics of Versioning Libraries:

We describe briefly here the basic technique used to add the versioning information to shared libraries. The complete details can be found in references [5] and [7].

Both the Solaris and GNU link editors ld(1) support the notion of library symbol versioning by use of an input file (usually called a "versioning mapfile" or "version script"). When this file is passed to the link editor (via command line arguments: ld ... -M <file> and ld ... --version-script=<file> on Solaris and Linux, respectively) and the shared library is assembled, the versioning mapfile will be parsed and its information recorded in special ELF sections of the library.

Here is a made-up example of a versioning mapfile for a fictitious library libfoo.so.1:

	PUBLIC_2 {
		global:
			symbolD;
			symbolE;
	} PUBLIC_1;

	PUBLIC_1 {
		global:
			symbolA;
			symbolB;
			symbolC;
	};

	PRIVATE {
		global:
			__fooimpl;

		local:
			*;
	};

When libfoo.so.1 was first released, it exported just the three public (i.e. intended for developer's use) symbols "symbolA", "symbolB", "symbolC" and it also exported the library package private symbol "__fooimpl". For intuition on the role of "__fooimpl", imagine that a companion library "libbar.so.1" is co-shipped with "libfoo.so.1" and libbar.so.1 occasionally calls __fooimpl() in libfoo.so.1 as, say, a private communication channel for use in the (current) library package implementation.

In a later release of libfoo.so.1 additional functionality was added in the form of the two new functions "symbolD" and "symbolE". Note the "watermark" chaining that occurs in the syntax: PUBLIC_2 includes all of the symbols at level PUBLIC_1. This chaining continues with subsequent releases of the library e.g.:

	PUBLIC_4 {
		global:
			symbolH;
			symbolI;
	} PUBLIC_3;

	PUBLIC_3 {
		global:
			symbolF;
			symbolG;
	} PUBLIC_2;

	...

etc. This mechanism emphasizes the strictly monotonic increase with time of the public interface offering: removing a public symbol (e.g. symbolD) in some later release is very bad since it will break all applications that require symbolD.
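
To make the structure of such a mapfile concrete, here is a rough Python sketch that reads the simple syntax used in the examples above into a dictionary. It is an assumption-laden toy, not the parser used by either link editor: it handles only the `NAME { global: ...; local: ...; } PARENT;` form shown here and ignores comments and any other syntax a real version script may contain. The names `NODE_RE` and `parse_version_script` are invented for this sketch.

```python
import re

# Matches "NAME { body } PARENT;" where the PARENT name is optional.
NODE_RE = re.compile(r"(\w+)\s*\{(.*?)\}\s*(\w+)?\s*;", re.DOTALL)


def parse_version_script(text):
    """Return {set name: {"parent": name or None, "global": [...], "local": [...]}}."""
    nodes = {}
    for name, body, parent in NODE_RE.findall(text):
        scopes = {"global": [], "local": []}
        current = "global"  # symbols listed before any label are treated as global
        for token in body.replace(":", ": ").split():
            if token.endswith(":"):
                current = token[:-1]          # "global:" or "local:" label
            else:
                symbol = token.rstrip(";")
                if symbol:
                    scopes.setdefault(current, []).append(symbol)
        nodes[name] = {"parent": parent or None, **scopes}
    return nodes


# The made-up libfoo.so.1 mapfile from above, written compactly.
SCRIPT = """
PUBLIC_2 { global: symbolD; symbolE; } PUBLIC_1;
PUBLIC_1 { global: symbolA; symbolB; symbolC; };
PRIVATE  { global: __fooimpl; local: *; };
"""

nodes = parse_version_script(SCRIPT)
for name, info in nodes.items():
    print(name, "inherits", info["parent"], "exports", info["global"])
```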

When an application binary is built the highest level of the public chain is recorded by the link editor. For example, if the application only used "symbolB" the version dependency "PUBLIC_1" for libfoo.so.1 would be recorded in the binary. If, however, the application used "symbolA", "symbolE", and "symbolF" then the level PUBLIC_3 would be recorded instead. If that application was then distributed to an older system with a library libfoo.so.1 that was only at level PUBLIC_2, then the runtime linker would immediately know something was wrong and would indicate the error and exit [8].
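
The watermark rule in this paragraph can be sketched in a few lines of Python. This follows the simplified description given here (one highest level per library) and is not a model of the exact records either link editor writes; `CHAIN`, `SYMBOL_LEVEL`, `watermark`, and `can_run` are names invented for the sketch, and the symbol table is the made-up libfoo.so.1 example.

```python
# Version sets of the made-up libfoo.so.1, oldest first.
CHAIN = ["PUBLIC_1", "PUBLIC_2", "PUBLIC_3", "PUBLIC_4"]

# Which set each public symbol was introduced in (from the example mapfiles).
SYMBOL_LEVEL = {
    "symbolA": "PUBLIC_1", "symbolB": "PUBLIC_1", "symbolC": "PUBLIC_1",
    "symbolD": "PUBLIC_2", "symbolE": "PUBLIC_2",
    "symbolF": "PUBLIC_3", "symbolG": "PUBLIC_3",
    "symbolH": "PUBLIC_4", "symbolI": "PUBLIC_4",
}


def watermark(used_symbols):
    """Highest version level required by the symbols an application uses."""
    return max((SYMBOL_LEVEL[s] for s in used_symbols), key=CHAIN.index)


def can_run(required_level, library_level):
    """True if a library at library_level satisfies required_level."""
    return CHAIN.index(library_level) >= CHAIN.index(required_level)


print(watermark(["symbolB"]))
print(watermark(["symbolA", "symbolE", "symbolF"]))
print(can_run(watermark(["symbolA", "symbolE", "symbolF"]), "PUBLIC_2"))
```

Because the chain only ever grows, a single level name is enough to summarize everything the application needs from the library.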

An additional benefit can be achieved with the library versioning technology. Once the symbol sets (e.g. PUBLIC_1, PUBLIC_2, ..., PRIVATE in the above example) are defined it is also possible to provide a directive when building a shared library to scope local to the library all remaining symbols. This is done by the "local: *" directive in the mapfile shown above. Traditionally, scoping for libraries is done by using the "static" C keyword in defining internal-implementation functions. However, this only provides a per-file level of scoping, and will not work if the shared library is composed of a number of object files (.o files) and the internal implementation functions are called between the object files. The library versioning "local: *" directive allows the final scoping to occur when the whole shared object is assembled to remove any symbols that do not need to be exported.
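
The effect of the catch-all directive can be illustrated with a small Python model. It assumes, as a simplification, that an explicit global entry always wins over a local wildcard; the internal helper names `parse_header` and `hash_lookup` are hypothetical and stand for functions shared between the library's object files.

```python
from fnmatch import fnmatchcase


def exported_symbols(defined, global_patterns, local_patterns):
    """Symbols that remain visible after the mapfile rules are applied."""
    exported = []
    for sym in defined:
        if any(fnmatchcase(sym, p) for p in global_patterns):
            exported.append(sym)      # explicitly listed under "global:"
        elif any(fnmatchcase(sym, p) for p in local_patterns):
            continue                  # demoted to local scope, e.g. by "*"
        else:
            exported.append(sym)      # no rule matched: stays visible
    return exported


# Every non-static function defined across the library's object files.
defined = ["symbolA", "symbolB", "symbolC", "__fooimpl",
           "parse_header", "hash_lookup"]   # last two: hypothetical internals

visible = exported_symbols(
    defined,
    global_patterns=["symbolA", "symbolB", "symbolC", "__fooimpl"],
    local_patterns=["*"],
)
print(visible)
```

Without the `"*"` pattern the two internal helpers would fall through to the last branch and remain visible to applications, which is exactly the accidental exposure the directive is meant to prevent.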

This local scoping technique is a convenient and powerful way to make sure no internal implementation symbols "slip out" accidentally. If application developers, either accidentally or intentionally, started using these library-internal implementation symbols, their applications would be at a high risk of breaking in the future. One's initial reaction might be "too bad; that developer should not have used those symbols". However, if the application, customer, or installed base is "important enough" (by some measure) there could be pressure placed on the library developer to actually support these otherwise internal interfaces. It is best to use the "local: *" scoping technique to simply avoid this problem in the first place.

Tools to check for correct use of the ABI:

Once the library symbol versioning is in place in the libraries of a library package, including the "PRIVATE" labelling of the library-package internal symbols discussed above, it is a straightforward matter to construct tools that can test for applications' conformance to the public portion of the ABI.

Ideally, one would desire the entire ABI (i.e. all library packages on the system) to be versioned and classified as described above. However, even when only some of the library packages are versioned there is still much utility in checking an application's usage against the portion of the ABI provided by those packages.

In Solaris, nearly all of the 30,000 symbols in the 160 shared libraries Solaris provides have been versioned in the manner outlined above. The version set names convention used in Solaris is "SUNW_n.m" for the public chain (where "n" and "m" are integers; "m" plays the role of the minor-release number in the traditional scheme), and "SUNWprivate" for the private symbol set. The "n" may be thought of as the major-release number, but it does not really matter because a major-release, n -> n+1 indicates incompatible change for which there would have to be a separate shared library.
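
A tool that consumes these names needs to tell the public chain from the private set. The following Python sketch shows one way to do that; the regular expression and the `classify` function are assumptions written for this example, and the prefix test for private names is chosen only because names such as "SUNWprivate_1.1" carry a suffix.

```python
import re

# "SUNW_n.m" where n and m are integers.
SUNW_PUBLIC = re.compile(r"^SUNW_(\d+)\.(\d+)$")


def classify(version_name):
    """Return (kind, n, m) for a Solaris-style version set name."""
    match = SUNW_PUBLIC.match(version_name)
    if match:
        return ("public", int(match.group(1)), int(match.group(2)))
    if version_name.startswith("SUNWprivate"):
        return ("private", None, None)
    return ("unknown", None, None)


print(classify("SUNW_1.2"))
print(classify("SUNWprivate_1.1"))

# Comparing (n, m) as integers orders the chain correctly where a plain
# string comparison would not: 1.10 comes after 1.9.
print(classify("SUNW_1.10")[1:] > classify("SUNW_1.9")[1:])
```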

The tool we provide in this project is called "abicheck". It is a simple perl script that runs system utility commands [9] to extract the dynamic bindings of a built executable [10]. For each symbol binding it deduces the symbol version set name the symbol resides in. If that symbol's version set matches "private" abicheck prints out a warning. The user may specify a different matching pattern on the command line.
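
The core check described above is small enough to sketch. The real abicheck is a perl script; the Python below is only an illustration of the matching step, with the binding tuples supplied by hand rather than extracted from a binary. The function name, the tuple layout, and the `printf` binding are assumptions made for this sketch.

```python
import re


def check_bindings(bindings, private_pattern="private"):
    """Report bindings whose version set name matches private_pattern.

    bindings -- iterable of (symbol, library, version set name) tuples
    """
    pattern = re.compile(private_pattern)
    warnings = []
    for symbol, library, version_set in bindings:
        if pattern.search(version_set):
            warnings.append(f"PRIVATE: ({library}:{version_set}) {symbol}")
    return warnings or ["OK"]


# Hand-written bindings for a hypothetical binary.
reader_bindings = [
    ("printf", "libc.so.1", "SUNW_1.1"),
    ("_select", "libc.so.1", "SUNWprivate_1.1"),
]

for line in check_bindings(reader_bindings):
    print("reader:", line)
```

Passing a different `private_pattern` corresponds to the user-specified matching pattern mentioned above.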

abicheck is the core functionality of a tool used by Sun in Solaris application certification branding programs called "appcert" [11]. abicheck runs on both Linux and Solaris and is basically appcert with the certification "baggage" removed. We felt it was best to start with a simpler, straightforward tool that is easier to understand and add enhancements to, rather than port all of appcert to Linux, which would carry over a fair amount of functionality that doesn't really apply to the task at hand.

One feature that has been carried over from appcert, as an example of possible extensions to abicheck, is a useful check for static linking of system archive files (e.g. libc.a or libsocket.a). The practice of statically linking system libraries into application binaries is not good with respect to binary stability since the "old code" (from the archive) that is bolted into the application may fail to work properly when moved to newer or upgraded systems. The use of static linking of non-co-shipped libraries is strongly discouraged from the binary stability standpoint.

Here is example output from abicheck:

# uname -a
SunOS abi 5.8 Generic sun4u sparc SUNW,Ultra-1
# abicheck reader myclient gdate
reader: PRIVATE: (libc.so.1:SUNWprivate_1.1) _select
myclient: STATIC_LINK: libsocket.a
myclient: STATIC_LINK: libnsl.a
gdate: OK

This output indicates the application "reader" has latched onto a direct call to the private interface, _select(). It should be calling the published interface select(3C). The application "myclient" has statically linked in the networking libraries: libsocket.a and libnsl.a. The application binary "gdate" had no problems detected by the tool and so gets an "OK".

Here is some analogous example output from Redhat Linux 6.2:

# abicheck reader myclient /bin/date
reader: PRIVATE: (libc.so.6:GLIBC_2.1) __poll
myclient: STATIC_LINK: libc.a
/bin/date: OK

One important issue with respect to Linux is that currently no shipped libraries (i.e. libraries in a distribution) have collected the private symbols into a version set with a private label (e.g. there is no GLIBC_PRIVATE for the GLIBC library package). GLIBC is the only library package on Linux that has non-trivial library versioning, and so currently abicheck has hard-wired in the criterion used in GLIBC libraries that a leading underscore "_" indicates a private symbol. There are a number of exceptions to this rule, and so abicheck currently carries along an exception list. We hope in the future a GLIBC_PRIVATE set will be established in the GLIBC library package.
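
The leading-underscore criterion with an exception list amounts to the following Python sketch. The two entries in the exception set are chosen purely for illustration and are not taken from the list abicheck actually carries; the function name is likewise invented.

```python
# Hypothetical exception list: underscore-prefixed names treated as public.
# These entries are illustrative only, not abicheck's real list.
PUBLIC_DESPITE_UNDERSCORE = {"_exit", "__errno_location"}


def looks_private(symbol, exceptions=PUBLIC_DESPITE_UNDERSCORE):
    """Heuristic: a leading underscore marks a symbol as private,
    unless the symbol is on the exception list."""
    return symbol.startswith("_") and symbol not in exceptions


for name in ["__poll", "_exit", "select"]:
    print(name, "private" if looks_private(name) else "public")
```

A heuristic of this kind has to be maintained by hand as the library changes, which is one reason a dedicated private version set would be preferable.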

It should be emphasized that abicheck is an initial tool provided as an example of what can be done with public/private library versioning in place. In principle the checking that it does could even be moved to the dynamic linker itself, thereby making the abicheck script obsolete. The important thing, we feel, is to spread meaningful library versioning to many library packages beyond GLIBC (e.g. X11 and GNOME) and to also adopt the private symbol set classifying of library-package internal symbols. Then tools that use the library symbol versioning information become more useful in checking for binary stability, and the versioning also puts in place a useful structure for library interface definition and maintenance and local scoping in the library package source tree. As API-providing libraries become larger and more complicated (and also more numerous) it will likely pay off to have this infrastructure in place.

Discussion:

The process described here (library symbol versioning using "versioning mapfiles" including the creation of a SUNWprivate set for Solaris internal interfaces) has been in place in Solaris for a number of years. Currently, (Solaris8) nearly all of the libraries shipped in Solaris (including library sources imported into Solaris: e.g. X11, Motif, and CDE) have this interface definition practice. It is not absolutely clear how much Free and Open Source library development projects will benefit from this type of practice. However, we feel it is likely they will benefit a great deal and certainly believe practices of this sort are worth looking into.

An interesting distinction comes about in that Solaris is basically shipped as a monolithic blob, whereas Open Source operating systems tend to be much more modular with respect to their ABI's. This is true in principle at least, since it is likely most end-users install a particular distribution (that is also a monolithic blob). In any event, the interesting possibility exists that there will be more fruitful ways to create the library interface definitions and versioning than have been used for Solaris. We feel a namespace separation based on library package name (e.g. GNOME_1.4, GNOME_PRIVATE; GLIBC_n.m.l, GLIBC_PRIVATE) is a useful generalization beyond what is done in Solaris. Additional useful practices may be discovered.

Also interesting is the possibility of "softening" the public/private distinction. As an API is going through a rapid initial growth phase it may be useful to have three categories of symbols: Public, Evolving, and Private. The Public ones are set in stone and the library package is committed to maintaining their compatibility, and as before the Private ones are internal-only. The "Evolving" category would contain experimental interfaces. These may change incompatibly (e.g. by changing their arguments or their behavior, or by disappearing entirely). A developer concerned with producing stable applications should work to stay away from the Evolving interfaces until they are stabilized and have been moved to the Public set; on the other hand, a developer needing to take advantage of an experimental interface may decide the potential binary incompatibility for his distributed application is worth the risk.

Extensions:

We feel the plan outlined above is the main message of this project and we hope that the practices will be adopted, suitably modified, by open source projects.

There are, however, some interesting extensions one can apply to the library interface definition practice (in the spirit of IDL) that provide useful by-products.

Rather than maintain regular mapfiles/version-scripts as described in detail above, the library package source tree can maintain simple ASCII repository files (one per library, say) that contain additional information about the library interfaces. The repository will have a number of different fields for each interface, for example:

  • Function signature (i.e. return value type and named arguments with their associated types).
  • Data variables have the variable type and length recorded.
  • The list of needed C header files (in, say, format) associated with the interface and any required libraries (in, say, -l format).
  • The name of the library version (in the sense discussed in the previous sections) the symbol resides in (e.g. GLIBC_2.2.1 or GLIBC_PRIVATE)
  • Any architecture related information (e.g. a list of different architectures on which the symbol is present).
  • Conditions that indicate an exception has occurred when the interface is called (e.g. return value is NULL).
  • The list of the errno's associated with the interface.
  • Whether the symbol is associated with any interface aliases (i.e. weak symbols)
  • Descriptions or comments about the interface.

and other information. By having such a repository for each library a library developer has a central location to go to when he adds a new interface to the library (or if there are changes to an interface). The central location provides the definition of the interface: one does not need to go rummaging around in header files and C source to find the interface definition and other information.

Some applications using the above sort of fields come to mind:

  • Automatic generation of (some of the) documentation for the interface (e.g. manpages typically document function signatures, required libraries and header files, interface exceptions and errno's).
  • Generation of mapfiles/version-scripts used at the library's build time to record the versioning information.
  • Creation of comparison tools to detect incompatible changes to public interfaces in libraries (e.g. a tool to compare the signatures of interfaces in an earlier release or build with the current build).
  • Use of the function signatures and other information to create lint and debugging versions of the library.
  • Use of the function signatures and other information to create dynamic tracing and "pretty-printing" of calls to the library interfaces. This can be used by developers to debug their applications, and also by end-users to troubleshoot problems encountered in the field.
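
As an illustration of the comparison-tool idea in the list above, the following Python sketch compares the recorded signatures of two releases. It assumes the repository has already been reduced to a mapping from symbol name to a signature string; that layout, the function name, and the sample signatures are all hypothetical.

```python
def incompatible_changes(old, new):
    """List public interfaces that were removed or whose signature changed.

    old, new -- {symbol name: signature string} for two releases.
    Newly added symbols are upward compatible and are not reported.
    """
    problems = []
    for sym, sig in sorted(old.items()):
        if sym not in new:
            problems.append(f"{sym}: removed")
        elif new[sym] != sig:
            problems.append(f"{sym}: signature changed from '{sig}' to '{new[sym]}'")
    return problems


# Hypothetical signatures for the made-up libfoo.so.1 interfaces.
release_1 = {
    "symbolA": "int symbolA(int x)",
    "symbolB": "char *symbolB(const char *name)",
}
release_2 = {
    "symbolA": "int symbolA(long x)",            # argument type changed
    "symbolD": "void symbolD(void)",             # new interface: fine
}

report = incompatible_changes(release_1, release_2)
for line in report:
    print(line)
```

A textual comparison like this is deliberately strict; a real tool would presumably normalize types before comparing.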

Most of the libraries that are shipped with Solaris are maintained with an interface repository scheme like that described above. These are used in the Solaris build to create mapfiles that are passed to ld(1) to record library versions, monitor interface incompatibilities, and to generate code used to create tracing utilities for the apptrace(1) tool that is available on Solaris 8 and later.
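
Generating a mapfile from repository records can be sketched as follows. This is not the format or the generator used in the Solaris build; the `Interface` record, its fields, the signature strings, and `generate_mapfile` are assumptions for illustration, and the output follows the made-up libfoo.so.1 mapfile syntax shown earlier.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str        # symbol name
    signature: str   # kept for documentation and comparison tools
    version: str     # version set the symbol resides in


def generate_mapfile(interfaces, chain, private="PRIVATE"):
    """Render a version script. chain lists public set names, oldest first."""
    by_version = {}
    for itf in interfaces:
        by_version.setdefault(itf.version, []).append(itf.name)

    lines = []
    for i in range(len(chain) - 1, -1, -1):      # newest set first
        lines.append(f"{chain[i]} {{")
        lines.append("\tglobal:")
        for sym in sorted(by_version.get(chain[i], [])):
            lines.append(f"\t\t{sym};")
        parent = f" {chain[i - 1]}" if i > 0 else ""
        lines.append(f"}}{parent};")
        lines.append("")

    lines.append(f"{private} {{")
    private_syms = sorted(by_version.get(private, []))
    if private_syms:
        lines.append("\tglobal:")
        lines.extend(f"\t\t{sym};" for sym in private_syms)
    lines.append("\tlocal:")
    lines.append("\t\t*;")
    lines.append("};")
    return "\n".join(lines)


repository = [
    Interface("symbolA", "int symbolA(int x)", "PUBLIC_1"),
    Interface("symbolD", "void symbolD(void)", "PUBLIC_2"),
    Interface("__fooimpl", "void __fooimpl(void *p)", "PRIVATE"),
]
mapfile_text = generate_mapfile(repository, ["PUBLIC_1", "PUBLIC_2"])
print(mapfile_text)
```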

apptrace uses the link-auditing [5] feature of the Solaris dynamic linker that allows library bindings to be intercepted and replaced by bindings to one's own functions. This mechanism is in the spirit of LD_PRELOAD schemes but is more flexible. The apptrace interceptors for the Solaris libraries act as wrappers for each function call that pretty-print the function's arguments and return value along with calling the actual Solaris library function (so that the application proceeds as normal). It is a good deal faster than tracing tools such as truss(1) and strace(1) that require breakpoint services from the kernel.
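
The wrapper idea can be conveyed by analogy with a short Python sketch. It wraps an ordinary Python function in-process and has nothing to do with the link-auditing interface itself; `traced`, the log list, and the stand-in function are all invented for this example.

```python
import functools


def traced(func, log):
    """Wrap func so each call is recorded with its arguments and result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)   # call the real function as normal
        shown = ", ".join(
            [repr(a) for a in args] + [f"{k}={v!r}" for k, v in kwargs.items()]
        )
        log.append(f"{func.__name__}({shown}) = {result!r}")
        return result
    return wrapper


# Stand-in for a library function, reusing a made-up name from the examples.
def symbolA(x, y):
    return x + y


call_log = []
symbolA = traced(symbolA, call_log)      # interpose the wrapper

result = symbolA(2, 3)                   # the caller sees the usual result
print(call_log[0])
```

The caller's behavior is unchanged because the wrapper always forwards to the real function and returns its value; only the side channel (the log) is new.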

Interface call interception coupled with the information in the library interface repository can be generalized to a number of interesting applications, such as creating convenient library "unit tests" and also application fault injection tools (i.e. faked error conditions are passed up to the application). In principle, one can imagine recording much if not all of the repository information into the shared libraries themselves so that information will be available for, and encourage the development of, library interface (ABI) related tools.

References:

[1] http://www.gnu.org/software/libc/

[2] For convenience we will use the short term "Linux" to really mean a Linux kernel based operating system. I.e. a system composed of much free software (e.g. GNU, XFree86, ... etc.) placed around the Linux kernel to yield a complete operating system. Examples include Debian GNU/Linux, Redhat Linux, and SuSE Linux.

[3] http://www.linuxbase.org/

[4] New functionality can, of course, be added to existing interfaces. For example, by passing new values of a parameter to the interface. The behavior of the existing interface cannot change in a way that breaks existing applications.

[5] The Solaris Linker and Libraries Guide may be found in the documentation collection at: http://docs.sun.com/ab2/coll.45.13

[6] This is analogous to setting the LD_BIND_NOW environment variable (i.e. lazy binding is turned off) and looking for unresolved symbols, however the versioning scheme has no measurable impact on application performance.

[7] See the URL in [1] and the "Commands" -> "Version Script" section of the GNU linker "ld" info page (e.g. /usr/info/ld.info*)

[8] This behavior can be overridden via setting an environment variable LD_NOVERSION on Solaris.

[9] On both Solaris and Linux abicheck runs the ldd(1) command with environment LD_DEBUG="files,bindings" to retrieve the dynamic linker's information about dynamic symbol bindings. Additionally, it will run pvs(1), dump(1), and elfdump(1) on Solaris and objdump(1) on Linux to extract additional information about the binaries.

[10] Shared objects may also be checked, but this can often lead to difficulties if not enough information is recorded in the shared object.

[11] http://www.sun.com/developers/tools/appcert/

[12] http://www.usenix.org/publications/library/proceedings/als2000/full_papers/browndavid/browndavid.pdf


Last Updated 2002-01-28.