Reusing C code in Java

C and Java are very different languages. However, there is enough similarity that people might be tempted into reusing existing C code in their Java programs.

This discussion is not designed to help people convert entire programs. Rather it is aimed at those cases where a few functions are needed from an existing C program.

No attempt will be made here to exhort people to rewrite applications from the ground up using object oriented methods, or even to use object oriented concepts at all. We are purely concerned with getting existing C code working in a Java environment.

The ordering of content in this analysis follows the ISO C Standard, "ISO/IEC 9899:1999 Programming Languages -- C". In the interests of brevity the more esoteric differences have been omitted (in some cases a more detailed analysis is provided here (a 10M pdf)).

The reference used for the Java language was "The Java language specification" by James Gosling, Bill Joy, and Guy Steele, ISBN 0-201-63451-1.

Clause 3 Definitions and conventions

The C language does not exactly specify the behaviour of all constructs. Some decisions are left up to the implementation. So the same program can exhibit different behaviours when processed by different implementations.

Java programs are intended to give the same result under all implementations.

Any C constructs whose behaviour is undefined, implementation defined, or unspecified will need to be checked. See Annex G of ISO 9899 for a list of these constructs.

Clause 3.14 Object

C: "A region of data storage in the execution environment, the contents of which can represent values. ... When referenced, an object may be interpreted as having a particular type."

Java: "Variables have type, objects have classes"

Clause 3.16 undefined behaviour

In C use of an indeterminately valued object results in undefined behaviour.

Java goes to a lot of trouble to make sure that use of a variable or object before a value is assigned to it results in a compile time error. Chapter 16 is devoted to the topic of Definite Assignment.

Clause 5.1.1.2 Translation phases

C, phase 1: "... Trigraph sequences are replaced by corresponding single-character internal representations."

Java, phase 1: "A translation of Unicode escapes (3.3) in the raw stream of Unicode characters to the corresponding Unicode character. ..."

C, phase 3: "... Each comment is replaced by one space character. ..."

Java is silent on the status of comments.

C, phase 5: "Each source character set member and escape sequence in character constants and string literals is converted to a member of the execution character set."

Java has escape sequences, but is silent on this point.

C, phase 6: "Adjacent character string literal tokens are concatenated and adjacent wide string literal tokens are concatenated."

Java does not concatenate adjacent string literal tokens. The + operator may be used to concatenate strings at runtime.

Clause 5.1.2.2.1 Program startup

C: "It can be defined with no parameters:
int main(void) { /*...*/ }
or with two parameters ... :
int main(int argc, char *argv[]) { /*...*/ }"

Java: "The method main must be declared public, static and void. It must accept a single argument that is an array of strings."
i.e. public static void main(String[] args) { /*...*/ }

Clause 5.2.1 Character sets

C: "The values of the members of the execution character set are implementation-defined; any additional members beyond those required by this subclause are locale-specific."

Java uses 16-bit Unicode to represent characters.

Clause 5.2.1.1 Trigraph sequences

Java does not support trigraphs. Thus any sequence of characters starting ?? will not be considered for conversion to its alternative character representation. Java does support the more powerful concept of embedding Unicode directly in the source.

Clause 5.2.1.3 Character display semantics

Java does not support the escape sequences:

\a

\v

Clause 5.2.4.1 Translation limits

The lower bounds given on the number of constructs of a particular type that an implementation is required to be able to process are low, given today's memory availability.

No such minimum limits are given for Java translators.

Clause 5.2.4.2.1 Sizes of integral types <limits.h>

In C various characteristics of the integral types are available as object macros in the header <limits.h>.

In Java the classes Character, Integer and Long provide public fields that give information on the maximum and minimum values of integral quantities. Note: there is no class giving information on the short type.

Clause 5.2.4.2.2 Characteristics of floating types <float.h>

In C various characteristics of the floating types are available as object macros in the header <float.h>.

In Java the classes Float and Double provide fields giving some of this information.

C: "... parameters are used to define the model for each floating-point type."

Java: "The Java types float and double are IEEE 754 32-bit single-precision and 64-bit double-precision binary floating-point values, respectively."

Clause 6.1.1 Keywords

Java does not support the keywords: auto, enum, extern, register, signed, sizeof, struct, typedef, union, unsigned.

Java reserves the keyword goto. However, the goto statement is not supported in Java.

Clause 6.1.2.2 Linkage of identifiers

C's linkage model is intended to ensure multiple declarations resolve to one definition. Such situations occur across multiple translation units, an issue that we are not considering here.

In Java the field modifier static does not affect the visibility of its associated field.

Clause 6.1.2.3 Name space of identifiers

C: "... the syntactic context disambiguates uses that refer to different entities."

Java: "The meaning of a name in Java depends on the context in which it is used."

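C has a number of name spaces: label names; the tags of structures, unions and enumerations; the members of each structure or union; and all other (ordinary) identifiers.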

Clause 6.1.2.5 Types

C has object types, function types and incomplete types.

Java has primitive types and reference types.

C: "An object declared as type char is large enough to store any member of the basic execution character set."

Java: "... and char, whose values are 16-bit unsigned integers representing Unicode characters."

C has signed and unsigned integral types.

Java: "The integral types are byte, short, int and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers, respectively, and char, whose values are 16-bit unsigned integers representing Unicode characters."

Java has no long double type.

Java has no enumeration types.

Java has no void type.

Java has no union type.

Java does not have pointer types; it has reference types.

Qualified types

C: Arithmetic types -> Java: Numeric types

C: Arithmetic types + pointer types -> Scalar types

Java: Numeric types + boolean type -> Primitive types

Clause 6.1.2.6 Compatible type and composite type

C: "Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in Clause 6.5.2 for type specifiers, in 6.5.3 for type qualifiers, and in 6.5.4 for declarators."

Java has no notion of composite type.

Clause 6.1.3.1 Floating constants

Java does not support the long double type.

Clause 6.1.3.2 Integer constants

The type of an integer constant in C is determined by its magnitude and the range of values supported by the underlying types.

In Java all unsuffixed integer constants have type int. Integer constants ending in the suffix l, or L have type long.

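In Java any decimal integer constant greater than 2147483647 must have the suffix l or L appended to it. An octal or hexadecimal constant that fits in 32 bits needs no suffix, but if its top bit is set it denotes a negative int value (0xFFFFFFFF is -1); anything wider needs the suffix.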

Java does not support unsigned types (although char types only store values between 0 and 65535).
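The sketch below shows one way these rules tend to surface when porting. The class and method names are hypothetical and not part of the original analysis; the masking idiom is a common workaround for C's unsigned int, not something the Java specification prescribes.

```java
// Illustrative only: the class and method names are assumptions.
class IntegerConstantDemo {
    // The largest value an unsuffixed decimal constant may have.
    static final int LARGEST_INT = 2147483647;

    // One more than that only compiles with the L suffix.
    static final long ONE_MORE = 2147483648L;

    // A C unsigned int value stored in a Java int can be recovered as a
    // non-negative long by masking off the sign-extended upper 32 bits.
    static long asUnsigned(int bits) {
        return bits & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(asUnsigned(-1));   // the value C would call UINT_MAX on a 32-bit int
        System.out.println((int) 'A');        // char widens to a non-negative int
    }
}
```

Holding the widened value in a long costs storage, so code that only adds, subtracts or compares for equality can often stay in int and mask only where the magnitude matters.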

Clause 6.1.3.4 Character constants

The following escape characters are not supported in Java:

\?

\a

\v

In C character constants have type int.

In Java they have type char.

C allows implementations to support more than one character in a character constant.

Java requires that a character constant contain a single character.

Java does not implement wide characters using C's notation of L'a'.

Clause 6.1.4 String literals

The following escape characters are not supported in Java:

\?

\a

\v

In C string literals have type array of char.

In Java they have type String, a class type.

In C adjacent string literals are concatenated in translation phase 6.

Java requires the use of the + operator to concatenate strings.

In C a null terminator is added to the end of string literals.

In Java no terminator is added to string literals.
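A minimal sketch of how the two most common C string literal idioms might be rewritten. The class and method names are hypothetical.

```java
// Illustrative only: the class and method names are assumptions.
class StringLiteralDemo {
    // C:    const char *msg = "Hello, " "world";   (joined in translation phase 6)
    // Java: the + operator must be written between the two literals.
    static String greeting() {
        return "Hello, " + "world";
    }

    // C:    strlen(msg) counts up to the terminating '\0'.
    // Java: length() reports the number of chars; no terminator is stored.
    static int greetingLength() {
        return greeting().length();
    }

    public static void main(String[] args) {
        System.out.println(greeting() + " has " + greetingLength() + " chars");
    }
}
```

C code that scans for '\0' to find the end of a string needs to be rewritten to use length() as its loop bound.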

Clause 6.1.5 Operators

Java does not treat the following as operators:

[ ] -- Java regards these characters as punctuators, which C does also.

( ) -- Java regards these characters as punctuators, which C does also.

. -- Java regards this character as a punctuator.

->

sizeof

, -- Java regards this character as a punctuator, which C does also.

#

##

Clause 6.1.6 Punctuators

Java does not treat the following as punctuators:

*

=

: -- Possibly an oversight in the Java specification, case 3:

#

...

Clause 6.1.7 Headers

Java does not support the use of header files.

Clause 6.1.8 Preprocessing numbers

Java does not contain a preprocessor.

Clause 6.2.1.1 Characters and integers

In C the signedness of char is implementation defined.

In Java char is one of the integral types, "... whose values are 16-bit unsigned integers representing Unicode characters." The type byte, an 8-bit signed integral type, is nearer to the way C programmers use the char type.

Clause 6.2.1.2 Signed and unsigned integers

Java does not support unsigned integer types, or the signed keyword.

In C, if a value is demoted to a smaller integer type and the value cannot be represented the result is implementation defined.

Java defines the behaviour as discarding all but the lower bits.
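The effect of discarding the upper bits can be seen with a couple of casts. This is a small sketch with hypothetical names; the input values are chosen purely for illustration.

```java
// Illustrative only: the class and method names are assumptions.
class NarrowingDemo {
    // Keeps the low 8 bits and reinterprets them as a signed value.
    static byte toByte(int v) {
        return (byte) v;
    }

    // Keeps the low 16 bits and reinterprets them as a signed value.
    static short toShort(int v) {
        return (short) v;
    }

    public static void main(String[] args) {
        System.out.println(toByte(200));      // 200 does not fit in a signed 8-bit type
        System.out.println(toByte(0x1234));   // only the 0x34 survives
        System.out.println(toShort(70000));   // 70000 does not fit in 16 bits
    }
}
```

C code that depended on one particular implementation's demotion behaviour should be compared against these results before being reused.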

Clause 6.2.1.3 Floating and integral

C: "When a value of integral type is converted to floating type, ..., the result is either the nearest higher or lower value, chosen in an implementation defined manner".

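Java requires the nearest representable value to be chosen (IEEE 754 round-to-nearest). It is conversion in the other direction, from floating to integral type, that rounds towards zero.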

Clause 6.2.1.5 Usual arithmetic conversions

Java does not support the type long double, or any unsigned integer types. Removing the rules containing these constructs reduces the two sets of rules to being identical.

C code that relies on subtle behaviour caused by unsigned int to long conversion may cause conversion headaches.

Clause 6.2.2.1 Lvalues and function designators

In C array of type converts to pointer to type in many contexts.

In Java no such conversions happen.

Clause 6.2.2.2 void

Java does not support the void type.

Clause 6.2.2.3 Pointers

Java does not support the void type. So there are no rules for casting references to that type.

In Java the expression null (which is also a keyword) has the null type. Java says nothing about the representation of the value of the expression null. The type of this expression is not given any name, although in spoken terms it is referred to as the null type.

Clause 6.3 Expressions

In C the order of evaluation of an expression is unspecified.

Java specifies a left to right evaluation order.
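Expressions whose result depends on evaluation order are worth checking individually, since the C original may have worked only by accident. The following sketch (hypothetical names) shows two expressions that have a single defined result in Java.

```java
// Illustrative only: the class and method names are assumptions.
class EvalOrderDemo {
    static int operandsLeftToRight() {
        int i = 2;
        // Left operand first: i++ yields 2 and leaves i == 3,
        // then the right operand i * 10 is evaluated as 30.
        return i++ + i * 10;
    }

    static int[] indexBeforeRightHandSide() {
        int[] a = new int[3];
        int i = 0;
        // The index i is evaluated (as 0) before the right-hand side assigns 2 to i.
        a[i] = i = 2;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(operandsLeftToRight());
        System.out.println(indexBeforeRightHandSide()[0]);
    }
}
```

In C both expressions modify i and read it again without an intervening sequence point, so neither has a reliable meaning there.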

In C bitwise operations on signed types have implementation defined behaviour (because the representation is implementation defined).

Java defines the behaviour (because the representation is defined).

Clause 6.3.2.1 Array subscripting

In C a "pointer to object type" is required.

In Java an array type is required.

In C the index "... shall have integral type, ..."

Java: "Arrays must be indexed by int values; short, byte, or char values may also be used as index values because they are subject to unary numeric promotion (5.6.1) and become int values. An attempt to access an array component with a long index value results in a compile-time error."

In C arrays of char (strings) may be indexed using this operator.

In Java String is a class; the method charAt may be used to access elements. Objects of this class are not modifiable, and so their elements may not be assigned to. However, the StringBuffer type is modifiable (the method setCharAt may be used to set an element).
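As a sketch of how a C statement that writes through an index, such as s[0] = toupper(s[0]), might be carried over. The class and method names are hypothetical.

```java
// Illustrative only: the class and method names are assumptions.
class StringIndexDemo {
    // C: s[0] = toupper(s[0]);
    static String capitalise(String s) {
        if (s.length() == 0) {
            return s;
        }
        StringBuffer buf = new StringBuffer(s);                // modifiable copy
        buf.setCharAt(0, Character.toUpperCase(s.charAt(0)));  // write one element
        return buf.toString();                                 // back to an immutable String
    }

    public static void main(String[] args) {
        System.out.println(capitalise("java"));
    }
}
```

The caller receives a new String, so C code that relied on the modification being visible through other pointers to the same array needs restructuring.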

Clause 6.3.2.2 Function calls

This postfix-expression is syntactically a primary-expression in Java. And that is the easy difference to understand.

Easy to understand differences:

Clause 6.3.2.3 Structure and union members

Java does not support the struct or union types. The class type may be used to achieve the same effect as struct in C.

Java does not support the -> operator.

Clause 6.3.3.2 Address and indirection operators

Java does not support either of these operators.

Clause 6.3.3.4 The sizeof operator

Java does not support this operator.

Clause 6.3.4 Cast operators

C allows users to give names to types using the typedef keyword.

Java does not support typedefs, or the concept they provide. Only the primitive types, arrays of those types, or references to those types may appear as type-name (class names may also be used, but C does not have them).

Java does not permit numeric types to be cast to reference types, or vice versa.

Clause 6.3.5 Multiplicative operators

If either operand of the division, /, operator is negative:

C: "... whether the result of the / operator is the largest integer less than or equal to the algebraic quotient or the smallest integer greater than or equal to the algebraic quotient is implementation defined, ..."

Java: "Integer division rounds towards 0."

In C if either operand of the remainder, %, operator is negative the sign of the result is implementation defined (a consequence of the behaviour for /).

Both languages require (a/b)*b+a%b to equal a.
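A short sketch that makes the Java results concrete for negative operands. The class and method names are hypothetical.

```java
// Illustrative only: the class and method names are assumptions.
class DivisionDemo {
    static int quotient(int a, int b) {
        return a / b;          // always truncates towards zero in Java
    }

    static int remainder(int a, int b) {
        return a % b;          // takes the sign of the dividend a
    }

    // The identity that both languages guarantee.
    static boolean identityHolds(int a, int b) {
        return (a / b) * b + a % b == a;
    }

    public static void main(String[] args) {
        System.out.println(quotient(-7, 2) + " " + remainder(-7, 2));
        System.out.println(quotient(7, -2) + " " + remainder(7, -2));
    }
}
```

If the C code was developed on an implementation that rounded towards negative infinity, the quotient and remainder will both differ from these.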

Clause 6.3.6 Additive operators

Java does not allow integer values to be added to references.

Java does not allow references to be subtracted.

Clause 6.3.7 Bitwise shift operators

In C if the left operand of the right shift operator "... has a signed type and a negative value, the resulting value is implementation defined."

In Java the >> operator performs sign extension. The >>> operator shifts with zero extension.
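The difference between the two right shift operators only shows for negative left operands, as in this sketch (hypothetical names).

```java
// Illustrative only: the class and method names are assumptions.
class ShiftDemo {
    // Sign-extending shift: copies of the sign bit are shifted in.
    static int arithmeticShift(int v, int n) {
        return v >> n;
    }

    // Zero-extending shift: zeros are shifted in, as C does for unsigned types.
    static int logicalShift(int v, int n) {
        return v >>> n;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(-8, 1));
        System.out.println(logicalShift(-8, 1));
    }
}
```

C code that shifted an unsigned value right would normally be ported using >>>.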

Clause 6.3.8 Relational operators

Java requires that the operands have numeric type.

In C the result of this operator has int type.

In Java the result has boolean type.

Clause 6.3.9 Equality operators

In C the result of this operator has int type.

In Java the result has boolean type.

Clause 6.3.13 Logical AND operator

Java requires both operands to have type boolean.

In C the result has type int.

In Java the result has type boolean.

Clause 6.3.14 Logical OR operator

Java requires both operands to have type boolean.

In C the result has type int.

In Java the result has type boolean.

Clause 6.3.15 Conditional operator

Java requires the first operand to have type boolean.

Java does not permit this operator to appear where a void method may appear (a statement expression in C).

C permits both the second and third operands to have void type.

Java prohibits either the second or third operand being an invocation of a void method.

C: "... the usual arithmetic conversions are performed to bring them to a common type and the result is that type."

Java: "If one of the operands is of type T where T is byte, short or char, and the other operand is a constant expression of type int whose value is representable in type T, then the type of the conditional expression is T."
"Otherwise, binary numeric promotion (5.6.2) is applied to the operand types, and the type of the conditional expression is the promoted type of the second and third operands."

Clause 6.3.16 Assignment operators

C: "The order of evaluation of the operands is unspecified."

In Java the order of evaluation is left to right.

Clause 6.3.16.1 Simple assignment

In C operands that overlap, unless the overlap is exact, cause undefined behaviour. Such situations arise through the use of unions, which are not available in Java.

In C complete struct assignment copies the values from one object to another.

In Java class assignment copies the reference, not the values referred to.
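The following sketch contrasts the two behaviours. PointStruct stands in for a C struct point { int x; int y; }; it and the copy method are hypothetical, and an explicit copy method is only one of several ways to obtain value semantics.

```java
// Illustrative only: all names here are assumptions.
class PointStruct {
    int x;
    int y;

    PointStruct(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Explicit member-by-member copy, playing the role of C's struct assignment.
    PointStruct copy() {
        return new PointStruct(x, y);
    }
}

class StructAssignDemo {
    static int afterReferenceAssignment() {
        PointStruct a = new PointStruct(1, 2);
        PointStruct b = a;      // b and a now refer to the same object
        b.x = 99;
        return a.x;             // the change is visible through a
    }

    static int afterCopy() {
        PointStruct a = new PointStruct(1, 2);
        PointStruct b = a.copy();   // b refers to a separate object
        b.x = 99;
        return a.x;                 // a is unaffected
    }
}
```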

Clause 6.3.16.2 Compound assignment

See comments on the right shift operator.

Clause 6.3.17 Comma operator

Java does not support the comma operator.

Clause 6.4 Constant expressions

In C "Each constant expression shall evaluate to a constant that is in the range of representable values for its type." So the type of a constant expression is deduced from the magnitude of its final value.

In Java the type of an expression is deduced from the, possibly promoted, types of the operands. In the case of comparison operators the type is boolean.

Java does not support the sizeof operator.

Java does not support the unary & operator or address arithmetic on references. So such address constants may not be used.

Clause 6.5.1 Storage-class specifiers

Java does not support the storage-class specifiers typedef, extern, auto, or register.

Only the members of a class may be declared static, ie Java does not support static local variables.
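A C function that keeps state in a static local variable can usually be ported by moving that variable out to a static field, as sketched here. The names are hypothetical.

```java
// Illustrative only: the class and method names are assumptions.
class CallCounter {
    // C: int next_id(void) { static int id = 0; return ++id; }
    // The static local becomes a private static field of the enclosing class.
    private static int id = 0;

    static int nextId() {
        return ++id;
    }

    public static void main(String[] args) {
        System.out.println(nextId());
        System.out.println(nextId());
    }
}
```

Making the field private keeps it as inaccessible to other code as the C local was, although other methods of the same class can now see it.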

Clause 6.5.2 Type specifiers

Java does not support the type specifiers void, signed, unsigned, struct-or-union-specifier, enum-specifier, or typedef-name.

Clause 6.5.2.1 Structure and union specifiers

Java does not support the struct or union specifier. An effect similar to struct may be achieved by using the keyword class.

In Java the C concept of bit-field is not supported.
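Where a C struct used bit-fields only to save space or to group flags, masks and shifts on an int give a similar effect. The sketch below assumes a hypothetical C declaration struct { unsigned mode : 3; unsigned flag : 1; }; the bit positions chosen are an assumption of this example, since C does not fix them either.

```java
// Illustrative only: names and bit positions are assumptions.
class BitFieldDemo {
    private static final int MODE_MASK = 0x7;        // low three bits hold mode
    private static final int FLAG_BIT = 1 << 3;      // next bit holds flag

    static int setMode(int word, int mode) {
        return (word & ~MODE_MASK) | (mode & MODE_MASK);
    }

    static int getMode(int word) {
        return word & MODE_MASK;
    }

    static int setFlag(int word, boolean on) {
        return on ? (word | FLAG_BIT) : (word & ~FLAG_BIT);
    }

    static boolean getFlag(int word) {
        return (word & FLAG_BIT) != 0;
    }
}
```

Code that used bit-fields to match an externally imposed layout needs more care, since the original layout was implementation defined.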

The layout of the fields of a class is not specified in Java and such information cannot be obtained by writing a Java program.

Clause 6.5.2.2 Enumeration specifiers

Java does not support enumeration specifiers.

Clause 6.5.2.3 Tags

In Java the name of a class may not be omitted in a class declaration.

Clause 6.5.3 Type qualifiers

Java does not support the const type qualifier. The keyword final may be used to achieve a very similar effect.

The keyword volatile (and the Java keyword final) may only appear on the declaration of a field of a class. They may not appear in local declaration of an identifier.

Clause 6.5.4.1 Pointer declarators

Java does not support pointer types; it has reference types.

Java: "There are three kinds of reference types: class types (8), interface types (9), and array types (10)."
A reference to a primitive type can be created by declaring a class containing a single field of the appropriate type.
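The single-field class technique can be sketched as follows, using a C function that updates its argument through an int pointer as the starting point. IntRef and the other names are hypothetical.

```java
// Illustrative only: all names here are assumptions.
class IntRef {
    int value;      // the single field that the "pointer" refers to

    IntRef(int value) {
        this.value = value;
    }
}

class PointerParamDemo {
    // C: void increment(int *p) { (*p)++; }
    static void increment(IntRef p) {
        p.value++;
    }

    static int demo() {
        IntRef n = new IntRef(41);
        increment(n);       // C: increment(&n);
        return n.value;
    }
}
```

A one-element array (int[] with length 1) is sometimes used for the same purpose; which reads better is a matter of taste.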

Clause 6.5.4.2 Array declarators

In C an identifier in an array declaration refers to a contiguous amount of storage starting at the base address of the identifier.

In Java an identifier in an array declaration is a reference to a number of elements of storage. Information on the layout of this storage in memory is not available to the programmer.

In C a definition reserves storage for the array.

In Java a definition only reserves storage for the reference. This does not refer to any storage until a reference is assigned to it.

Clause 6.5.4.3 Function declarators (including prototypes)

C functions are called methods in Java.

Methods may not be declared with variable numbers of arguments.

In Java the types of all parameters to methods must be given (C prototypes).

In C a function that takes no parameters is denoted by using the keyword void.

In Java a method that takes no parameters is denoted by an empty parameter list.

Clause 6.5.6 Type definitions

Java does not support the typedef keyword, or the concept suggested by it.

Clause 6.5.7 Initialization

C aggregate initialisation supports a complex mechanism for optionally omitting opening/closing curly brackets and deducing the correct index from the number of initialisers seen so far and the type of the array. C also allows trailing initialisers to be omitted, by implicitly providing zero fill.

Java: "The length of the constructed array will equal the number of expressions."
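A C declaration such as int a[5] = {1, 2}; relies on the implicit zero fill, so a direct transcription of the initialiser gives an array of the wrong length in Java. One possible workaround is sketched below; the helper name zeroFilled is hypothetical.

```java
// Illustrative only: the class and method names are assumptions.
class ArrayInitDemo {
    // A direct transcription of the initialiser list: the length is 2, not 5.
    static int[] literal() {
        return new int[] {1, 2};
    }

    // C: int a[5] = {1, 2};
    // Allocate the full length (elements default to 0), then copy the
    // explicitly initialised leading elements over the front.
    static int[] zeroFilled(int length, int[] leading) {
        int[] a = new int[length];
        System.arraycopy(leading, 0, a, 0, leading.length);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(literal().length);
        System.out.println(zeroFilled(5, new int[] {1, 2}).length);
    }
}
```

For short arrays it is usually simpler to write the zeros out in the initialiser.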

In C the initial values appear outside of the aggregate or union declaration.

In Java the initial values appear inside the class definition, alongside the field being initialised.

Clause 6.6 Statements

Java does not permit methods to contain unreachable statements (section 14.19 Unreachable Statements is five pages long).

Clause 6.6.1 Labeled statements

Java has tightened up the rules about where a 'case constant-expression' or default label can appear (your C code should not make use of such weird usage anyway).

Clause 6.6.3 Expression and null statements

Java has the philosophy that statements should do something. Thus void'ed expression statements are not allowed. Also the use of the ?: operator at the statement level is not permitted.

Clause 6.6.4.1 The if statement

C: "The controlling expression of an if statement shall have scalar type."

Java: "The Expression must have type boolean ..."
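In practice this means that every C condition that tests an integer or a pointer directly needs an explicit comparison. A sketch with hypothetical names:

```java
// Illustrative only: all names here are assumptions.
class ConditionDemo {
    // C: if (n) ...   becomes an explicit comparison with zero.
    static String describe(int n) {
        if (n != 0) {
            return "non-zero";
        }
        return "zero";
    }

    // C: while (p) { count++; p = p->next; }   becomes a comparison with null.
    static int length(Node p) {
        int count = 0;
        while (p != null) {
            count++;
            p = p.next;
        }
        return count;
    }

    // Hypothetical list node standing in for a C struct with a next pointer.
    static class Node {
        Node next;

        Node(Node next) {
            this.next = next;
        }
    }
}
```

The same rewriting applies to the controlling expressions of the iteration statements and to the operands of the logical operators.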

Clause 6.6.4.2 The switch statement

Java does not support switch expressions of type long.

Java requires that the body of a switch statement be a block, ie enclosed in braces.

Clause 6.6.5 Iteration statements

Java requires that the controlling expression have boolean type.

Clause 6.6.5.1 The while statement

See Clause 6.6.5

Clause 6.6.5.2 The do statement

See Clause 6.6.5

Clause 6.6.5.3 The for statement

See Clause 6.6.5

Clause 6.6.6.1 The goto statement

Java does not support the goto statement.
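One frequent use of goto in C, jumping out of nested loops, can be expressed with Java's labeled break statement. The sketch below uses hypothetical names and covers only that one pattern; other uses of goto generally need the control flow to be restructured.

```java
// Illustrative only: the class and method names are assumptions.
class GotoReplacementDemo {
    // C: for (...) for (...) if (m[r][c] == target) goto found;
    //    ...
    //    found: ...
    static int findRow(int[][] m, int target) {
        int found = -1;
        search:
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                if (m[r][c] == target) {
                    found = r;
                    break search;   // leaves both loops at once
                }
            }
        }
        return found;
    }
}
```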

Clause 6.6.6.4 The return statement

Java does not allow a method returning a value to contain a return statement without an expression.

Clause 6.7.1 Function definitions

C functions are known as methods in Java.

Methods must appear within a class definition.

Java does not support the extern keyword.

Java does not allow methods taking variable numbers of arguments to be defined.

Clause 6.7.2 External object definitions

Java does not support external objects. It has classes that get instantiated.

Clause 6.8 Preprocessing directives

Java does not contain a preprocessor.

Clause 7.3 Character handling <ctype.h>

Java contains the class java.lang.Character

isalnum -> isLetterOrDigit

isalpha -> isLetter

iscntrl -> ???

isdigit -> isDigit

isgraph -> ???

islower -> isLowerCase

isprint -> ???

ispunct -> ???

isspace -> isSpace
Note that Java does not consider '\v', vertical tab, to be a space character.

isupper -> isUpperCase

isxdigit -> ???

tolower -> toLowerCase

toupper -> toUpperCase
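For the entries marked ??? above, one option is to write the tests by hand. The sketch below assumes the "C" locale and ASCII character values, which is an assumption of this example rather than anything either standard requires; all names are hypothetical.

```java
// Illustrative only: hand-written ASCII equivalents, not library methods.
class AsciiCtype {
    static boolean isCntrl(int c) {
        return (c >= 0 && c < 0x20) || c == 0x7F;
    }

    static boolean isPrint(int c) {
        return c >= 0x20 && c <= 0x7E;      // printable, including space
    }

    static boolean isGraph(int c) {
        return c > 0x20 && c <= 0x7E;       // printable, excluding space
    }

    static boolean isXDigit(int c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    static boolean isAlnum(int c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z');
    }

    static boolean isPunct(int c) {
        return isGraph(c) && !isAlnum(c);
    }
}
```

The methods take int, as the C functions do, so a char argument is widened automatically.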

Clause 7.6 Nonlocal jumps <setjmp.h>

It may be possible to emulate C's setjmp/longjmp functionality using try blocks. Please e-mail your ideas.
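One possible shape for such an emulation is sketched below: the try block stands where setjmp was called and a dedicated exception plays the part of longjmp. All names are hypothetical, and the sketch only covers the common case of abandoning a deep call chain and returning a status value.

```java
// Illustrative only: all names here are assumptions.
class NonLocalJumpDemo {
    // Carries the value that longjmp would have passed back to setjmp.
    static class Jump extends RuntimeException {
        final int value;

        Jump(int value) {
            super("non-local jump");
            this.value = value;
        }
    }

    static void deep(int depth) {
        if (depth == 0) {
            throw new Jump(7);          // C: longjmp(env, 7);
        }
        deep(depth - 1);
    }

    static int run() {
        try {                           // C: if ((rc = setjmp(env)) == 0) {
            deep(3);
            return 0;                   //        normal completion
        } catch (Jump j) {              // C: } else {
            return j.value;             //        rc holds the longjmp value
        }
    }
}
```

As with longjmp, the jump can only go to a try block that is still active further up the call stack.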

Clause 7.7 Signal handling <signal.h>

Java has a sophisticated exception handling mechanism built into the language; it is not a library add-on.

Clause 7.8 Variable arguments <stdarg.h>

Java does not allow methods taking a variable number of arguments to be declared.

Revision History

 1 Jul 05 Added references to c0x
15 Jan 97 Got a few useful ideas from this paper.
27 Dec 96 Updated, plus comments from Eamonn McManus, emcmanus@gr.osf.org
24 Dec 96 Created


derek at knosof dotty co.uk

Copyright (c) 1996,1997,2005,2008. Knowledge Software Ltd. All rights reserved.
1 Jul 2005