Applications POSIX.1 conformance testing

Derek Jones
derek@knosof.co.uk
Knowledge Software Ltd
Farnborough, Hants
UK

ABSTRACT

The Standards for POSIX and C were designed to enable the portability of applications across platforms. A lot of work has gone into checking compilers and environments for conformance to these Standards, but almost nothing has been done to check application conformance. The incorrect assumption is often made that the development compiler will warn about any construct that needs looking at. This paper discusses a tool that checks applications software for conformance to these Standards at compile, link and runtime, as well as at the library interface. Any application that can pass through this checker without producing any warnings is a conforming POSIX program and a strictly conforming C program.

Introduction

POSIX was designed as a standard environment to enable the portability of applications software and to some extent people. This portability of applications software is achieved through the specification of a set of services that every POSIX conforming application can expect to exist on a conforming platform.

For a Standard to be of practical benefit there has to be a method of measuring adherence to its requirements. Work on test suites to check environments for conformance to POSIX is well advanced. There are at least three such suites commercially available. However, the checking of an application's conformance to POSIX has not received nearly as much attention.

Since the basic goal of POSIX was to enable applications portability through a Portable Operating System Interface it is about time that this imbalance was redressed. This article is about a tool set that was designed to check applications software for conformance to POSIX.1 (using the C bindings). What is described here could equally well apply to the X/Open Portability Guide (XPG) or AT&T's SVID.

The tools used in the POSIX conformance checker were all derived from the Model Implementation C Checker. This Model Implementation was designed to check C programs for strict conformance to the C standard. Model Implementations have been produced for Pascal and Ada. In March 1989 the British Standards Institution signed an agreement with Knowledge Software to produce one for C. The Model Implementation was formally validated by BSI in August 1990 (it was the joint first validated C compiler in the world) and is currently being used by the validation suite producers to check their own software.

Before describing the tools themselves this paper first discusses the need to check applications software and what the relevant standards have to say about what constitutes a conforming (and therefore potentially portable) program.

Why check?

POSIX offers the software developer the opportunity for a significant reduction in cost and effort when porting applications to different platforms. Given the benefits, what stands in the way of creating POSIX conforming applications? There are two main reasons why applications fail to conform to the requirements of POSIX. The immediate problem is one of know-how and old habits. Once these are overcome, problems are caused by human oversight and error.

Because of the broad range of services offered it can take some time for developers to think POSIX. The habits and know-how of long-standing Unix programmers are easily transferred to a POSIX development environment. Programmers cannot be expected to be familiar with all the intricacies of POSIX and how it differs from what they are familiar with. Speaking Unix with a POSIX accent will not solve portability problems, particularly to proprietary platforms that support POSIX. It is necessary to speak POSIX as a native language and, if using Unix, perhaps with a Unix accent. Training can go some way towards ensuring a smoother transition to a POSIX only environment.

Experience over 40 years of software development has shown that it is impossible to produce any significant applications that do not contain bugs. The same principle holds true for writing POSIX conforming applications. Mistakes will be made.

So some means of independently and accurately checking conformance could uncover the majority of these problems and save a considerable amount of time and money later. Studies have shown that the later a problem is discovered the more expensive it is to fix. Thus the obvious time to find these nonconforming constructs is prior to the release of the software.

From the marketing perspective Open Systems are being demanded by users. Use of an independent verification tool to check conformance will add weight to any claims of conformance to Open Systems Standards by vendors. From the users' perspective demanding such verification is a useful means of ensuring vendor compliance with any Open Systems agreements that they may have.

For those developers considering a move to POSIX, information provided by a checking tool can be used to estimate porting costs for existing applications. With hard information on likely problems, time/cost estimates for porting an application are likely to be much more accurate than uninformed estimates.

Don't compilers check?

The development compiler is only likely to check for constraint and syntax errors, since it is these constructs that a conforming implementation is required to detect (and must detect in order for a compiler to validate).

One of the principles behind the drafting of the C standard was that existing code should not be broken by wording in the standard. This meant that in many cases the behaviour was left undefined or implementation defined. By not specifying what had to be done, compiler implementors were free to make their own decisions, thus preserving the correctness of existing, old code. So in general compilers are silent on those constructs whose behaviour may vary across implementations. This freedom means that C programs can behave differently with different ISO validated C compilers, even on the same machine. There are no requirements on compilers to flag occurrences of constructs that are not constraint or syntax errors.
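As a hedged illustration (it is not taken from the checker), the fragment below probes two constructs that a validated compiler may accept without comment although the C Standard leaves their results implementation defined: the signedness of plain char and the result of right shifting a negative value. It also shows one way of avoiding the second construct. All function names are hypothetical.

```c
#include <limits.h>

/* Implementation defined: plain char may be signed or unsigned.
   Returns 1 when plain char is signed on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Implementation defined: the result of right shifting a negative value.
   Returns 1 when this implementation keeps the result negative. */
int right_shift_keeps_sign(void)
{
    int minus_eight = -8;

    return (minus_eight >> 1) < 0;
}

/* A portable alternative to value >> 1: halve, rounding toward
   negative infinity. Only non-negative operands are divided, so the
   result does not depend on how negative quotients are rounded. */
int halve_floor(int value)
{
    if (value >= 0)
        return value / 2;
    /* -(value + 1) is non-negative and cannot overflow, even for INT_MIN */
    return -((-(value + 1)) / 2) - 1;
}
```

A program that relies on the answers given by the first two functions may behave differently when moved to another compiler; one that uses halve_floor does not depend on either.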

The C Standard committee also recognised that compiler vendors would have to rely on existing tools to link separately compiled units together. Since existing linkers were unlikely to check for cross module inconsistencies in external variables and functions it was felt that the C Standard should not mandate such checks.

Runtime checking is not considered to be in the spirit of C programming. Thus compilers do not generate code to check that pointers are within bounds, that the correct number of parameters is passed, or that any of the other runtime requirements are met.

What to check

Having shown the benefits of conforming to POSIX and that the best way of achieving this is to use some form of checking tool we now have to investigate what constructs ought to be flagged and why. There are two main sources of information on constructs that ought to be checked to achieve applications portability:

The text of Standards documents. Here we are interested in applications written in the C language. So the relevant standards are the C language standard (ISO 9899) and the C language bindings provided by the POSIX.1 (ISO 9945-1) standard.

Practical experience. The sources for this information tend to be first hand experiences and conversations with developers on problems that they have encountered. Books on software portability are starting to appear, but on the whole these tend to give general guidelines rather than specific cases. One problem with specific cases is that they go out of date. As compilers and operating systems evolve, problems disappear and new ones appear.

The core of the POSIX checker is driven by the requirements given in the C and POSIX.1 standards. Messages are categorised in exactly the same manner as the standards documents. Also any construct that falls into any category (except conforming code) is flagged. Provided with these core checking abilities the user can then provide configuration information (done via source and target profiles, discussed later) to switch off any messages that are not of interest.

Thus no justification, other than appearing in a standards document, is given for flagging these core constructs. Those developers familiar with the standards process will know that the contents of standards are sometimes driven by immediate political needs rather than technical merit. Attempting to weed out the political from the technical issues was not considered to be worthwhile. Matters are greatly simplified (from our point of view) by simply handling all constructs.

The necessity for checks based on practical experience occurs because we live in an imperfect world. Operating systems and compilers do not fully conform to standards and contain bugs. In some cases these bugs are actually features; they are there for compatibility with previous versions of the software. The justification for flagging these constructs goes along the lines "this construct is not supported/behaves differently on the xyz platform". From this observation we draw the conclusion that truly portable applications have to be written using a subset of the facilities and services described in standards documents.

Standards conformance

The POSIX and C standards define two types of conformance, 1) implementation conformance and 2) application (or source code) conformance. In this paper we are interested in the latter.

Application conformance is broken down into various categories. The classification of these categories varies slightly between the two standards.

POSIX specifics

POSIX itself is not specific to the C language. However, it does have a C binding (ISO 9945-1). This binding specifies an interface to the environment, but surprisingly there are no requirements in POSIX.1 for the C source code to conform to the C Standard. However, from the portability perspective any software that conforms to the C Standard should be portable across C compilers running in a POSIX environment. So here we will be considering the POSIX and C Standards as one.

A strictly conforming POSIX.1 application does not rely on any construct whose behaviour is not fully defined, thus it has the greatest portability. A conforming POSIX.1 application may only use facilities described in the standard. However, since the behaviour of some of those facilities may vary across implementations such an application may need to be modified to run on different platforms.

The POSIX.1 standard also defines <National Body> conforming applications and conforming applications using extensions. It is expected that applications conforming to these standards will have weaker portability criteria and are not considered further here.

At this moment in time there are constructs for which it is uncertain (at least to the author) what category of behaviour they cause.

C specifics

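The C standard defines terms for strictly conforming and conforming programs. It categorises the behaviour of constructs as undefined, implementation defined or unspecified; it also recognises constraint errors, syntax errors and the exceeding of minimum limits.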

To be strictly conforming a program should not contain an instance of any of these constructs. To be conforming a program should not contain any constraint or syntax errors.

The C standard is all encompassing in that all constructs can be categorised. Over the last few years there has been considerable debate concerning the status of various C constructs. This has resulted in a feeling that any remaining poorly defined constructs are likely to be obscure. There is an active program of documenting answers to interpretation questions raised by users of the C standard.

C/POSIX.1 differences

The major difference between the POSIX.1 and C standards occurs at runtime. POSIX specifies a much larger set of support functions. Basically it provides an interface to the host operating system, whereas the C standard provides library functions independent of the host OS.

POSIX.1 specifies a set of services that must be provided at runtime. The rules and regulations governing the creation of a runnable program are specified to be those given in the relevant language binding. In the case of the C language binding some of the minimum limits given in the C standard are increased, e.g. the number of characters considered significant in an external identifier.
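The effect of such a limit can be sketched as follows. The ISO C minimum for external identifiers is 6 significant characters, and an implementation is allowed to ignore case in them, so two names that look different in the source may refer to the same object once linked. The function below is a hypothetical helper, not part of the checker; the limit is passed as a parameter, and the value 31 used in the note after it is only an example of a larger limit.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 when two external identifiers cannot be told apart by a
   linker that looks at only `significant` leading characters,
   optionally ignoring case; returns 0 otherwise. Identical names
   are reported as indistinguishable. */
int external_names_collide(const char *a, const char *b,
                           size_t significant, int fold_case)
{
    size_t i;

    for (i = 0; i < significant; i++) {
        int ca = (unsigned char)a[i];
        int cb = (unsigned char)b[i];

        if (fold_case) {
            ca = tolower(ca);
            cb = tolower(cb);
        }
        if (ca != cb)
            return 0;   /* differ within the significant part */
        if (ca == '\0')
            return 1;   /* both names ended: same name */
    }
    return 1;           /* identical up to the limit */
}
```

With a limit of 6 and case folding, counter_total and counter_reset collide; with a limit of 31 they do not.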

When to check

There are various stages in the application software creation process where checks against Standards can be performed.

During development. As mentioned earlier the sooner problems are found the cheaper it is to fix them. So a checking tool is of use to developers.

During quality assurance. Developers will only usually run tests that relate to their own area of interest. The Q/A department will be looking at the application from an integrated point of view. A checking tool is thus of use in ensuring that all of the software conforms to the relevant company standards.

Prior to source code purchase. When OEMs are investigating the possibility of buying in software it can be difficult to assess vendors' claims of portability and standards conformance. Some form of conformance measuring tool would thus be of use in verifying claims of conformance.

What is checked (by this tool set)

Ideally it ought to be possible to check conformance by looking at the source code. However, there are theoretical as well as practical limitations to this approach. The solution adopted in the checking tools described here is to mirror the compile/link/execute method of creating an application, but with a different emphasis. We are primarily interested in checking as part of quality assurance testing, not running the application in a commercial environment.

The POSIX checker was designed to detect and flag all undefined, implementation defined and unspecified constructs as well as all constraint errors, syntax errors and exceeding of minimum limits, at compile, link and runtime.

Conformance checking is carried out in three phases: compile time, link time and runtime. The constructs checked vary between these phases.

The first two phases (static analysis) basically check that the source code conforms to the ISO C Standard. There are also a few extra features that POSIX mandates at these stages, such as restrictions on what can be assigned to errno and the presence of additional header files.

The compile time checks are likely to flag significantly more constructs than the development compiler. Since compile time checking is the easiest to do (from the user's point of view) and its messages are the easiest to relate to, every attempt has been made to flag constructs in this phase, if possible.

The linker in the POSIX checker was specifically written for handling C. When separately compiled modules are joined together (the link stage) checks are made to ensure that the types of external objects and functions agree. This checking is something that very few linkers perform.

The third phase (dynamic analysis) does runtime checking. This consists of, amongst other things, checks on the use of pointers and on the calls made to the POSIX library interface.

This checking is performed by actually executing the application program.
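One way in which an executing checker can detect out of bounds pointer use is to carry the bounds of the underlying object along with each pointer. The following is a minimal sketch of that idea under stated assumptions; the structure and function names are hypothetical and nothing here is claimed to be how the runtime checker itself represents pointers.

```c
#include <stddef.h>

/* A pointer that carries the bounds of the object it was derived from. */
struct checked_ptr {
    const unsigned char *base;  /* start of the underlying object */
    size_t size;                /* size of the object in bytes */
    size_t offset;              /* current position, 0 <= offset <= size */
};

/* Reads one byte through the pointer. Returns 0 and stores the byte
   in *out on success, or -1 if the access would be out of bounds. */
int checked_read(const struct checked_ptr *p, unsigned char *out)
{
    if (p->offset >= p->size)
        return -1;
    *out = p->base[p->offset];
    return 0;
}

/* Moves the pointer by delta bytes. As in C, pointing one past the
   end is allowed; any other out of bounds position is rejected with
   -1 and the pointer is left unchanged. */
int checked_advance(struct checked_ptr *p, long delta)
{
    if (delta >= 0) {
        if ((unsigned long)delta > p->size - p->offset)
            return -1;
        p->offset += (size_t)delta;
    } else {
        /* -(delta + 1) + 1 avoids negating LONG_MIN */
        unsigned long back = (unsigned long)(-(delta + 1)) + 1UL;

        if (back > p->offset)
            return -1;
        p->offset -= (size_t)back;
    }
    return 0;
}
```

A checker built along these lines would issue a warning wherever this sketch returns -1.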

Background of the tools

All of the tools for manipulating C source code produced by Knowledge Software are based on the same source code. This source started life six years ago and has been through several rewrites since then. Existing products include a C compiler front end, C to other language translators, a C quality assurance tool kit and most recently the POSIX applications checker.

General design aims

It was recognised at an early stage that most existing C programs are a long way from being strictly conforming. The user interface to the POSIX checker was designed to smooth the transition from common usage C to conforming Standard C. Not only is it possible to tailor the severity of every error message but implementation defined features are user selectable.

This tailoring enables users to convert their code in an incremental fashion. Thus the work load can be spread over a period of time. It is also possible to achieve results quickly, rather than having to wait until all of the work is complete.
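The kind of tailoring described above might be pictured as a table that maps each category of construct to a severity which the user can change. This is only a sketch: the enumeration names, the default severities and the functions are assumptions made for illustration, not the checker's actual configuration interface.

```c
#include <stddef.h>

/* Categories follow the way the standards classify constructs. */
enum category {
    CAT_UNDEFINED,
    CAT_IMPLEMENTATION_DEFINED,
    CAT_UNSPECIFIED,
    CAT_CONSTRAINT,
    CAT_SYNTAX,
    CAT_LIMIT,
    CAT_COUNT               /* number of categories, not a category */
};

enum severity { SEV_OFF, SEV_WARNING, SEV_ERROR };

struct message_config {
    enum severity level[CAT_COUNT];
};

/* Starting point: flag every category. Treating constraint and
   syntax errors as errors and everything else as a warning is an
   assumption of this sketch. */
void config_flag_everything(struct message_config *cfg)
{
    int c;

    for (c = 0; c < CAT_COUNT; c++)
        cfg->level[c] = SEV_WARNING;
    cfg->level[CAT_CONSTRAINT] = SEV_ERROR;
    cfg->level[CAT_SYNTAX] = SEV_ERROR;
}

/* Counts how many of the n diagnosed constructs would be reported
   under the given configuration. */
size_t count_reported(const struct message_config *cfg,
                      const enum category *found, size_t n)
{
    size_t i, reported = 0;

    for (i = 0; i < n; i++)
        if (cfg->level[found[i]] != SEV_OFF)
            reported++;
    return reported;
}
```

Converting code incrementally then amounts to starting with some categories set to SEV_OFF and switching them back on one at a time as the code is cleaned up.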

Although its function is to check applications this was not seen as an excuse to execute slowly. Developers do not like to use tools that are slow and cumbersome. Therefore every attempt was made to ensure that the POSIX checker ran at a reasonable rate.

Support for multiple architectures

As a provider of services POSIX does not concern itself with the underlying computer architecture. On the other hand the C Standard recognises that at their lowest level computers do vary in their implementation. Because it has to execute user programs the POSIX checker has to be able to handle different computer architectures. For this reason the overall design of the tools was not tied to any computer architecture. They can be configured to emulate various architectures. The user can configure the compiler and runtime system to match the target platform.

All that is required is information on the target platform to be fed into the platform profiles used by the checker. These platform profiles are subdivided into cpu, compiler and OS profiles. Profiles also exist for individual standards. Information on the most common processors is supplied with the package.

The source code checker

This is a `traditional' compiler front end. It differs from most front ends in that many of its settings are soft; they are read from configuration files at compile time. A significant amount of effort has gone into showing the correctness of this tool. This work has involved showing that all of the requirements of the ISO C Standard are implemented and also that the code generated is correct.

Profiling work done on this tool has been used by BSI and NIST to measure coverage of their respective validation suites to the C Standard.

The interface checker

This checker is essentially a linker that was tailor written for handling C programs. Most linkers perform very little interface checking across translation units. They are usually restricted to complaining about missing symbols. The POSIX checker linker performs full type checking across C translation units, ie it checks that the same identifier is declared with compatible type in every file in which it is used. It also merges its input into a form suitable for interpretation by the runtime checker.
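The cross translation unit check can be sketched as a comparison of the external declarations collected from each file. In the sketch below each declaration records a canonical spelling of its type and two types are taken to be compatible only when those spellings are equal. That is a deliberate simplification: the compatibility rules of the C Standard are richer than string equality. All names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One external declaration as seen in one translation unit. */
struct extern_decl {
    const char *file;   /* translation unit containing the declaration */
    const char *name;   /* external identifier */
    const char *type;   /* canonical spelling of the declared type */
};

/* Counts the pairs of declarations that share a name but disagree
   on type. A real interface checker would report each such pair. */
size_t count_type_mismatches(const struct extern_decl *decls, size_t n)
{
    size_t i, j, mismatches = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (strcmp(decls[i].name, decls[j].name) == 0 &&
                strcmp(decls[i].type, decls[j].type) != 0)
                mismatches++;
    return mismatches;
}
```

For example, if a.c declares total as int and b.c declares it as long, one mismatch is counted, whereas a linker that only resolves symbol names would join the two without complaint.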

It was recognised that developers often require the services of libraries not provided as part of POSIX, e.g. X Windows. The POSIX checker was thus designed to be user extensible. It is possible to refer to non POSIX library functions, have the interface checked and call them at runtime. There is also a method of specifying what runtime interface checking needs to be performed. It is the interface checker's job to build a runtime system capable of executing the user's program, including the required interfaces.

The Library

The POSIX checker library provides the functionality required by the POSIX.1 and C standards. The majority of the POSIX functionality is achieved through linking in the POSIX library on the host computer. For the C library functions there is the option of using the host libraries or internally written functions. Following the design aims of the other tools, the library also gives warnings on the use of any features that may not be supported in other libraries and checks that the parameters to functions are within bounds. This checking is independent of whether the library is implemented internally or through an interface to the host.
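A parameter check of this kind can be illustrated with isalpha. The C Standard requires the argument of the character handling functions to be representable as an unsigned char or to equal EOF; any other value gives undefined behaviour. The wrapper below is a hypothetical sketch of how a checking library might test the argument before handing it to the host library. The names and the violation counter are assumptions.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Number of out of range arguments seen so far. */
static unsigned long domain_violations = 0;

/* Checks the argument before calling the host isalpha. An out of
   range value is counted and never passed to the host library. */
int checked_isalpha(int c)
{
    if (c != EOF && (c < 0 || c > UCHAR_MAX)) {
        domain_violations++;    /* a real checker would issue a warning */
        return 0;
    }
    return isalpha(c);
}

/* Reports how many violations have been counted. */
unsigned long checked_domain_violations(void)
{
    return domain_violations;
}
```

The check is the same whether the function behind it is written internally or supplied by the host, which is the independence described above.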

The runtime checker

This interprets the intermediate code generated by the source code checker. This tool acts as a runtime interface between the user's application and the POSIX environment. As well as checking the runtime requirements given in the C Standard it also checks the calls to POSIX services.

This interpreter has benefited from the considerable experience gained in tuning and porting interpreters for other products. The requirement that the software should execute quickly has influenced the design of the C abstract machine that exists at runtime.

Checking the checker

A significant amount of work went into the checking and verification of the Model Implementation C Checker, on which the POSIX checker is based. This included producing tests that caused 99.6% of all basic blocks in the code to be executed, cross referencing the source code to the C standard and passing both the BSI and NIST C validation suites.

The latest version of the software is over 100,000 lines of C. It has been ported to Sparc, MC68000 and 80386 (DOS and Unix) platforms. Source code licensees have also ported to i860, MIPS, RS/6000 and VAX.

Practical experiences

Developers tend to have a narrow view of standards. It is coloured by what they know and the tools they use. Unless faced with `real life' situations developers are often loath to modify code or their work practices. Flagging constructs based on requirements in standards documents and giving references to those requirements would appear to satisfy developers' needs for justification.

A portability tool that simply flagged constructs because they occurred in standards documents would only be doing part of the job. It is necessary to flag perfectly conforming constructs simply because some implementations will process them incorrectly. Although these cases do apply to real world situations, some developers believe that these problems will be fixed eventually and need not affect them.

It is rare to find a developer willing to follow the strict letter of any standard. So the ability to switch off some messages is essential.

So what does the checker achieve?

Any program that can go through all stages of the checking process without any errors or warnings being flagged is a strictly conforming POSIX.1 application and a strictly conforming C program. Thus it should be portable with regard to these Standards. Any porting problem is likely to be as a result of problems in the host environment rather than the application.

Static checking

The C Standard was written to cater for a wide spectrum of platforms, from coffee machines to supercomputers and from 30 year old machines to the latest RISC technology. Experience has shown that constructs considered to be an area of concern to one group of users are of little interest to those working on other platforms. A tool that flags all constructs that lie outside of strictly conforming Standard C is seen as being very verbose. So verbose, in fact, that at times users ignore its output completely.

In order to reduce the number of `uninteresting' messages generated, the concept of source and target platform was introduced. By telling the tools which platform should act as a reference environment and knowing the target platform it is possible to filter out those features that are common to both platforms (the idea being that if a program containing such a construct worked on the source platform then it will work on the target). This platform profile contains information on cpu characteristics, the OS and C compiler behaviour. Profiles for the `unknown', C abstract machine and POSIX.1 platform are available for those users who want to create maximally portable applications.
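The filtering idea can be sketched as a comparison of two profiles: a message about a platform dependent feature is suppressed only when source and target are known to behave the same way. The profile fields and feature names below are invented for illustration and cover only a few cpu characteristics; the real profiles also describe the OS and the compiler.

```c
/* A much reduced platform profile. All fields are assumptions. */
struct platform_profile {
    int known;           /* 0 models the `unknown' platform */
    int char_is_signed;  /* 1 if plain char is signed */
    int int_bits;        /* width of int in bits */
    int big_endian;      /* 1 if the most significant byte comes first */
};

enum feature { FEAT_CHAR_SIGN, FEAT_INT_WIDTH, FEAT_BYTE_ORDER };

/* Returns 1 when a message about the given feature should be shown,
   0 when it can be filtered out because both platforms agree. */
int should_report(const struct platform_profile *source,
                  const struct platform_profile *target,
                  enum feature feature)
{
    if (!source->known || !target->known)
        return 1;   /* nothing can be assumed: report everything */

    switch (feature) {
    case FEAT_CHAR_SIGN:
        return source->char_is_signed != target->char_is_signed;
    case FEAT_INT_WIDTH:
        return source->int_bits != target->int_bits;
    case FEAT_BYTE_ORDER:
        return source->big_endian != target->big_endian;
    }
    return 1;       /* unrecognised feature: err on the side of reporting */
}
```

Selecting the `unknown' profile as target therefore switches the filtering off, which is what a developer aiming for maximal portability wants.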

Dynamic checking

The two main types of warnings generated at runtime relate to pointer problems and the POSIX library interface. Problems with pointers are usually seen as program bugs rather than as a portability problem. In order to fit in with this view of the world and speed up other checks it is possible to switch off the pointer checking.

To date little experience has been gained with dynamic checking. Apart from what every developer already suspects, that pointers don't always point where they are supposed to, little hard information is available. One issue that has been highlighted, however, is that although pointers may have well defined values, some of the objects that they point at might themselves be uninitialised.
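Reads of uninitialised objects can be caught at runtime by keeping, alongside each object, a record of which of its bytes have been stored to. The sketch below shows the idea for a single fixed size object; the names and the one flag per byte layout are assumptions made for illustration and are not a description of the runtime checker's internals.

```c
#include <stddef.h>

#define TRACKED_BYTES 64

/* An object together with a record of which bytes have been written. */
struct tracked_object {
    unsigned char data[TRACKED_BYTES];
    unsigned char written[TRACKED_BYTES];   /* 1 once the byte is stored to */
};

/* Marks the whole object as uninitialised. */
void tracked_reset(struct tracked_object *obj)
{
    size_t i;

    for (i = 0; i < TRACKED_BYTES; i++)
        obj->written[i] = 0;
}

/* Stores a byte. Returns 0 on success, -1 if index is out of bounds. */
int tracked_store(struct tracked_object *obj, size_t index,
                  unsigned char value)
{
    if (index >= TRACKED_BYTES)
        return -1;
    obj->data[index] = value;
    obj->written[index] = 1;
    return 0;
}

/* Loads a byte. Returns 0 on success, -1 if index is out of bounds
   and -2 if the byte is read before it has ever been written. */
int tracked_load(const struct tracked_object *obj, size_t index,
                 unsigned char *out)
{
    if (index >= TRACKED_BYTES)
        return -1;
    if (!obj->written[index])
        return -2;
    *out = obj->data[index];
    return 0;
}
```

Here a pointer to byte 5 of the object is perfectly well defined, yet a load through it is flagged until a store to that byte has taken place.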

Conclusion

Applications conforming to the C and POSIX.1 standards offer a reduction in porting costs. The only reliable method of verifying that applications software conforms to the POSIX specification is to use some form of verification tool at all stages of the development and testing of the program. The benefits of such verification include confidence that the software is conforming and will port to other environments, and marketing advantages in being able to back up claims of Open Systems conformance.



© Copyright 1992,95. Knowledge Software Ltd. All rights reserved;
Presented at the EurOpen & USENIX Spring 1992 Workshop/Conference