ICMAKE Part 2

by Frank B. Brokken
4. The grammar of icmake source files

Icmake source files are written according to a well-defined syntax, closely resembling the syntax of the C programming language. This is no coincidence. Since the C programming language is so central in the Unix operating system, we assumed that many people using the Unix operating system are familiar with this language. Providing a new tool which is founded on this familiar programming language relieves everybody of the burden of learning yet another dialect, thus simplifying the use of the new system and allowing its new users to concentrate on its possibilities rather than on its grammatical form.

Considering icmake's specific function, we have incorporated a lot of familiar constructs from C into icmake: most C operators were implemented in icmake, as were some of the standard C runtime functions. In this respect icmake's grammar is a subset of the C programming language. However, we have taken the liberty of defining two datatypes not normally found in C. There is a datatype `string' (yes, its variables contain strings) and a datatype `list', containing lists of strings. We believe these extensions to the C programming language are so minor that just this paragraph would probably suffice for their definition. However, they will be described in somewhat greater detail in the following sections. Also, some elements of C++ are found in icmake's grammar: some icmake-functions have been overloaded; they do different but comparable tasks depending on the types of arguments they are called with. Again, we believe this to be a minor departure from the `pure C' grammar, and think this practice is very much in line with C++'s philosophy.

4.1. Comment and preprocessor directives.

One of the tasks of the preprocessor is to strip the makefile of comment. Icmake recognizes two types of comment: standard C-like comment and end-of-line comment, which is also recognized by the Gnu C compiler and by Microsoft's C compiler.

Standard comment must be preceded by /* and must be closed by */. This type of comment may stretch over more than one line. End-of-line comment is preceded by // and ends when a new line starts.

Lines which start with #! are skipped by the preprocessor. This feature is included to allow the use of executable makefiles. Apart from the #! directive, icmake recognizes two more preprocessor directives: #include and #define. All preprocessor directives start with a `#'-character which must be located at the first column of a line in the makefile.

4.1.1 The #include directive.

The #include directive must obey the following syntax:

#include "filename"

or:

#include <filename>

When the preprocessor icm-pp encounters this directive, `filename' is read. The filename may include a path specification. When the filename is surrounded by double quotes, icm-pp attempts to access this file exactly as stated. When the filename is enclosed by < and >, icm-pp attempts to access this file relative to the directory pointed to by the environment variable IM. Using the #include directive, large icmake scripts may be modularized, or a set of standard icmake source scripts may be used to realize a particular icmake script.

4.1.2. The #define directive.

The #define directive is a means of incorporating constants in a makefile. The directive follows the following syntax:

#define identifier redefinition-of-identifier

The defined name (the name of the defined constant) must be an identifier according to the C programming language: the first character must be an underscore or a character of the alphabet; subsequent characters may be underscores or alphanumerics.

The redefinition part of the #define directive consists of spaces, numbers, or whatever is appropriate. The preprocessor simply replaces all occurrences of the defined constant following the #define directive by the redefinition part. Note that redefinition's are not further expanded; an already defined name which occurs in the redefinition part is not processed but is left as-is.

Also note that icm-pp considers the redefinition part to be all characters found on a line beyond the defined constant. This would also include comment, if found on the line. Consequently, it is normally not a good idea to use comment-to-end-of-line on lines containing #define directives.

4.2. Constants, types and variables

Constants may be used in the makefile to indicate a number or a string. Int constants are denoted by numeric characters; e.g., 13 is an int constant. A second way to denote an int constant is by enclosing a character in single quotes. The numeric value of the constant is then the ascii number of the character, e.g., the constant `A' has the value 65. The character between quotes may not be `escaped', such as `\n'. Only single characters are allowed in this notation of integer constants.

String constants are denoted by text between double quote marks, e.g., “a string” is a piece of text.

Icmake recognizes four types: int, string, list and void. The types serve the following purposes:

  • int: The type `int' is used to represent numerical 16-bit signed values.

  • string: The type `string' is used to represent strings, like the strings used in C.

  • list: The type `list' is used for variables and return values of functions consisting of lists of strings. There are no list-constants. Instead, lists always have to be built run-time.

  • void: The type `void' is used only with functions, to indicate that these functions do not return values.

The types int, string and list are also used for defining variables and arguments. Icmake allows global variables and local variables. The declaration of a variable or an argument must state the type of the variable; a counter variable would be an int, while a variable containing the names of all files having extension `.c' would be a list.

Some of the built-in functions of icmake (see the section about icmake's functions) return a value of one of the types int, string or list. The returned value may be assigned to a variable of the same type or may be passed to another function.

Similarly to built-in functions, user-defined functions are assumed to return a value which is either int, string or list. The int type is the default. Functions may be defined as not returning a value. Such functions have the `void' returntype.

The definition of variables follows a C-like syntax. Arguments are defined as in ansi-C. An illustration of the use of types is found in the following listing. Note the use of the constants 55 and “main.c” (a string constant).

string myfun (int x, string y, list z)  // a
user-defined function
{       // of type string, having 3
      int       // parameters
          counter,      // local variables: 2 ints, i;    // 1 string
          and 1 list
      string
          name;
      list
          cfiles;
      counter = 55;     // counter is set to 55 name = "main.c";  //
      name is set to string main.c return (name);    // a string is
      returned to the
  }     // caller
4.3. Strings and escape sequences

Strings in makefiles are used to represent both filenames and displayed text. Icmake allows a number of special formatting sequences in strings to facilitate the display of text. These sequences are called, in analogy to the C programming language, escape sequences. Icmake recognizes the following escape sequences: Escape sequence Action

\a      alert (bell)
\b      backspace character
\f      formfeed character
\n      newline
\r      carriage return character
                tab
\v      vertical tab
\-other-        literal -other-, e.g., \\

Escape sequences in strings are identified by a backslash character \ followed by a character which identifies the escape sequence. Like C, Icmake allows string-concatenation. Long strings, extending over several lines of text, can be built by separating string constants by white-space characters (blanks, tabs, newlines).

4.4. The code of a makefile

This section discusses the user-defined functions which may appear in a makefile and also defines other syntactical constructs.

4.4.1. Flow control statements

Icmake recognizes six control statements:

  • if statements, including if-else

  • while statements

  • for statements

  • return statements

  • break statements

  • exit statements

The exit() statement, though a function in C, is part of the icmake language. The exit statement may be given an expression yielding an int. If an int expression follows, its value is returned as an int to the operating system. Otherwise, the returned value is undefined. The other flow control statements are analogous to the corresponding ones in the C programming language.

4.4.2. User-defined functions.

Icmake allows the construction of user-defined functions in a makefile. The definition of a function must follow an ansi-C-like syntax, however, minor differences exist between an icmake function and a C function. These differences are highlighted in this section.

The definition of a function must follow the syntax:

  1. Optionally, the return type of the function is specified. The type is void, int, string or list. The default return type is int.

    When a function explicitly returns using a return statement, the returned value must match the return type. If a function does not use a returns statement, an undefined value is returned. Functions which are defined as void can also use the return statement, albeit without an expression.

  2. Following the optional return type, the function name must follow. The name must be an identifier, i.e., the first character must be an underscore or a character of the alphabet, and optional following characters may be underscores or alphanumerics.

  3. Following the function name, a ( is expected.

  4. A parameter list may follow, consisting of parameter specifications separated by , (this is referred to as an ansi-C parameter list). Parameter specifications consist of the parameter type (int, string or list) and the parameter name (an identifier).

    In contrast to C, icmake does not allow user-defined functions to have a variable number of parameters.

  5. Following the optional parameter list, a ) is expected.

  6. Next, the code of the function is expected: statements enclosed by { and }.

  7. Following the first { of the code block, local variables may be defined. The definition of local variables consists of the variable type, one or more variable names separated by commas, and a semicolon.

In contrast to C, local variables can only be defined immediately after the outer curly brace of the function code block. Variables cannot be defined within a block of statements.

In contrast to C, icmake initializes all local variables to zero.

Icmake does not allow forward references. This means that a function may be called only after it has been defined. Recursive function calls are accepted. Furthermore, the statement which calls a function must supply the exact number of required arguments and each argument type must match the parameter list of the function. The built-in functions are predefined and may therefore be used anywhere within functions.

4.4.3. The user-defined function main()

The code section of a makefile must contain at least one user-defined function, called main(). The execution of a makefile starts at this function. The run-time support system of icmake provides three arguments which the function main() may use. The arguments are used to hold the command line parameters of the icmake invocation and the environment setting.

The three arguments are most commonly referred to as argc, argv and envp. Argc is an int argument, holding the number of command line parameters. Argv is a list, holding the command line parameters themselves. Envp is a list holding the environment setting. A definition of the main() function which uses all arguments argc, argv and envp is given below:

int main(int argc, list argv, list envp)
        {
            // statement(s)
        }

Users may wish to define the main() function without arguments when the command line parameters need not be examined. In this case, the main() function can be defined as:

int main
{
//statements(s)
}

It is also possible to define the main() function to use only the first or the first two arguments (argc and argv). A sample makefile which prints its command line arguments is given below. The functions printf() and element() used in this example are discussed in the function-section below:

The arguments passed to main() functions as the list argv are:

void main (int argc, list argv)
   {
       int
           i;
       for (i = 0; i < argc; i++)
           printf ("Argument ", i, " is ", element (i, argv), "\n");
   }
  1. The name of the binary makefile which is interpreted by icm-exec. This is always the first argument.

  2. Remaining arguments are those arguments which were explicitly supplied on the command line.

For example, to supply the arguments one, two and three to a makefile called try.im, one of the following invocations can be used:

        icmake test - one two three
                or:
        icmake -i test.im one two three

In both cases, the first int argument of the function main() will have the value four. The first element of the list argv holds the name of the binary makefile (test.bim); the remaining elements of argv hold the arguments one, two and three.

The third argument of main(), envp, is a list holding the setting of the environment (the environment variables). An example of such a variable is PATH, specifying where the operating system searches for executable files. The envp list consists of pairs of elements, where each first element of the pair holds the variable name (e.g., the string PATH) and where the second element of each pair holds the value of the variable (e.g., a list of directories where executable files may be found).

An example of a makefile which prints the settings of environment variables is given below:

void main (int argc, list argv, list envp)
        {
            int
                i;
            for (i = 0; i < sizeof (envp); i += 2)
                printf ("variable ", element (i, envp), " has value ",
                       element (i + 1, envp));
        }
4.4.4. Expressions and operators

Icmake allows a large number of operators to form or combine expressions. Binary operators may be used with the following operand-types:

operand-types

  1. Each binary operator must be used with two variables or constants of the same type, e.g., the addition of an int and a string is not allowed; icmake performs no default type casting.

  2. Some operators may not be used with some types, e.g., string subtraction is not allowed, but string addition is.

  3. The operators have a certain priority; some operators are evaluated before others. The priority of operators is identical to the priority used by C.

The binary operators recognized by icmake are summarized in the following table:

binary operators

All binary operators with the exception of the assignment operators are left-associative. The assignment operators are right-associative. The operators at the top of the table have the lowest priority; those at the bottom have the highest priority. Operators with different priority are separated by lines.

The unary operators are summarized in the following table. The unary operators have higher priority than binary operators and are right-associative. The exception is the expression-nesting operator, which surrounds an expression and does not associate.

unary operators

4.4.4.1. Logical operators

Icmake recognizes three logical operators: the logical and (&&), the logical or (||) and the logical not (!). These operators can be used to combine or reverse logical expressions.

The logical not operator reverses the logical outcome of an expression. The logical and operator and the logical or operator group conditions. Icmake evaluates a combined condition using these operators until the outcome of the condition is determined, in analogy to C:

  1. In the condition (c1 && c2), c2 is not evaluated if c1 yields zero, since when c1 yields zero the combined condition can only fail. Therefore, c2 is evaluated only if c1 yields not zero.

  2. In the condition (c1 || c2), c2 is not evaluated if c1 yields not zero, since when c1 yields not zero the combined condition can only succeed. Therefore, c2 is only evaluated if c1 yields zero.

Logical operators may be used with any type of expression. An int constant or variable yields its integer representation. A string constant or variable yields not zero when the length of the string is non-zero; e.g., string “a” yields not zero.

A list or variable yields not zero when the number of strings in the list is not zero, e.g., in the following code fragment the making process is stopped when no files with extension “.c” are found:

list

to be continued...

Look for part 3 of 4 of the IC Make Article in the next issue of Linux Journal.

Load Disqus comments