C


C is one of the most influential programming languages. Created by Dennis Ritchie at Bell Labs in 1972, it is characterized by its closeness to the hardware without being tied to any one architecture. It is not as intricate as assembly, yet it lets you address memory almost freely; a program that reads or writes memory it is not permitted to access will typically crash.

The C language is often associated with UNIX, because it was developed on UNIX and most of UNIX was then rewritten in it. It is not tied to any system or architecture, however, and is used for a wide range of applications. It has come under criticism for being insecure when it is not used correctly: for example, array bounds are not checked by default, which leaves room for buffer overflow attacks on poorly written software.

Although it is still among the most widely used languages, C has gradually become an old language. Dating from 1972, it predates much modern research on programming language theory, which has brought large advances in language design, particularly in how precisely errors can be diagnosed and how fast code can be compiled. For example, C compilers have traditionally reported only a line number for an error or a warning, whereas more recent languages report a precise range of lines and columns. C has nevertheless continued to evolve: Kernighan and Ritchie's 'The C Programming Language' (1978) served as the de facto standard until the language was standardized as ANSI C in 1989, and ISO standards followed in 1990 and 1999.

Language Features

Types

Basic Types

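Type:      Meaning                  Typical size   Typical range
char:      character                1 byte         -128 to 127
int:       integer                  2 bytes        -32768 to 32767
float:     floating point           4 bytes        -3.4E38 to 3.4E38
double:    double-precision float   8 bytes        -1.8E308 to 1.8E308

short:     shorter integer          2 bytes        -32768 to 32767
long:      longer integer           4 bytes        -2147483648 to 2147483647

signed:    modifier, the type can hold negative values
unsigned:  modifier, the type holds only non-negative values

These sizes are typical of 16-bit compilers and are not fixed by the language; on most current systems an int is 4 bytes.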

Expressions

An expression is a series of operators and quantities to be operated on (operands) that returns a result. This result can be used as the operand of a larger expression or assigned into a variable. Expressions are used throughout the language - in if statements, loops, function calls - and they can even be statements by themselves.

This last use is interesting. Just 1; is a valid C statement. Of course, a statement is only useful if it has side-effects. Saying a=1; has the side-effect of assigning the value 1 to the variable a. Side-effects are especially important with the logical operations and the ternary if-then operator.

Operators

C has many operators: mathematical, bitwise, logical, comparison, and a few others. Operators have precedence, meaning that some are applied before others; for example, * binds more tightly than +. Parentheses can be used to force the order you want.

Mathematical

The most familiar operators for most people are +, -, × and ÷. In C, multiplication is written * and division /, and there is also the modulus operator % for the remainder after integer division. Dividing one integer by another discards the fractional part. The unary operators + and - (negation) can be counted as mathematical as well.

Assignment

Assignment operators have the side-effect of setting a variable's value. In addition to the typical =, there are several operators like += which means, "Add this to the variable." The assignment operators are =, +=, -=, *=, /=, %=, &=, ^=, |=, <<=, and >>=.

a += b; // equivalent to a = a + b
a -= b; // equivalent to a = a - b

Comparison

These operators compare two values, returning 1 for true or 0 for false. The test to see if two values are equal is (a == b) with a double equals sign (since a single '=' is used for assignment). The comparison operators are ==, != (not equal), <, >, <=, and >=.

A common mistake is the use of something like

if (a = b) { // WRONG!

when it should have been

if (a == b) {

The former assigns the value of b to a and then uses the new value of a as the condition of the if statement; so for any non-zero value of b the test will succeed (and a is overwritten in the meantime). Watch out for this trap! It's one = for telling, two == for asking.

Logical

Logical operators combine values that are true or false. The key concept is that anything that is not 0 is true, and 0 is false. Each of these operators returns 1 for true and 0 for false.

a && b   // test if a AND b are true
a || b   // test if a OR b is true
!a       // test if a is false (logical negation)
!a != !b // exclusive-OR; C has no logical XOR operator, and a ^ b is bitwise

It is important to realise that C stops processing an expression as soon as the answer is known for sure -- not every sub-expression in a compound expression will necessarily be evaluated. The expressions a and b can have side-effects like assigning the values of variables. Because of the difficulty in debugging such expressions, the use of function calls and operators with side-effects within logical expressions is discouraged. It is common in shell scripts to use logical operators as quick-and-dirty if statements.

As an example, if we have an expression such as

if ((a && b) || c) {

if a is false, b will not be evaluated (because a && b cannot possibly be true if a is false). But c will have to be evaluated, because we do not know the result in advance. If a is true, b will be evaluated; and if b is also true, then c will not be evaluated, since we know that the result of the OR operation will be true.

Ternary if-then operator

This operator is unique in that it takes three operands. The form is (a)?(b):(c). (Parentheses are recommended because this operator has low precedence.) The operator evaluates a, and if it is true, then it returns b. Otherwise it returns c. If there are side-effects, a regular if statement is recommended.

Being an operator, ?: does return a value. So

d = ((a) ? (b) : (c));

sets d = b (and does not evaluate c) if a is true (i.e. anything but zero), or d = c (and does not evaluate b) if a is false (zero).

Bitwise Operators

Bitwise operators, as the name suggests, operate on binary numbers on a bit-by-bit basis.

a & b = bitwise AND
a | b = bitwise OR
a ^ b = bitwise XOR (exclusive OR)

Each bit of a is ANDed, ORed or XORed with the corresponding bit of b and the answer is inserted in the corresponding bit position in the result. For example (using hexadecimal numbers to make the bit patterns easier to see; the operators work on any integer values):

0xe8 & 0x1f = 0x08
0x51 | 0x60 = 0x71
0xaa ^ 0xff = 0x55

These operators are very useful when storing several boolean values in an integer or char. Also, you can sometimes make use of the fact that, in ASCII, bit 5 (the bit worth 32) is set (1) in the lower case letters but cleared (0) in the corresponding capital letters.

Bit Shift Operators

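These operators shift the bits of a binary number to the left or the right. Bits that "fall off the end" are lost for all time, and zeros are shifted into the opposite end. (The exception is a right shift of a negative signed value, whose result is implementation-defined; most compilers copy the sign bit instead of shifting in zeros.)

x = a << b; // bits of a, shifted left by b positions
y = a >> b; // bits of a, shifted right by b positions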

These operators are meant to be combined with the bitwise operators above. Suppose we wished to store a day of the week, day (between 0 and 6), a date within the month, date (between 1 and 31), and a month, month (between 1 and 12), in a single int variable ddm. We need 3 bits to store day, 5 bits to store date and 4 bits to store month, and we could build up an expression like this:

ddm = (day << 9) | (date << 4) | month;

and then the reverse would be

day = (ddm & (7 << 9)) >> 9;
date = (ddm & (31 << 4)) >> 4;
month = ddm & 15;

We use the AND operator to select just the bits we are interested in, the two shift operators to position them where we want, and the OR operator to assemble the groups of bits together.

Increment and Decrement

The ++ and -- operators -- a shorthand notation for adding and subtracting one from a variable -- can each be used in two different ways. If the operator comes after the variable name, then the variable is incremented or decremented after its value is read and passed on. If the operator comes before the variable name, then the variable is modified before its value is read.

a = b++; // equivalent to a = b; b += 1
a = b--; // equivalent to a = b; b -= 1
a = ++b; // equivalent to b += 1; a = b
a = --b; // equivalent to b -= 1; a = b

Other

The remaining operators are the comma operator, the pointer, structure and array operators, and type casts.

Control structures

if (else) instruction

The general look of a C if-then-else statement is something like

if(expression){
   A
} else {
   B
}

If expression is different from 0 then A (whatever that might be) is executed; otherwise, that is if expression is 0, B is executed.

The real issue here is, of course, what expression, A and B are. From the compiler's point of view, expression is any value that can be interpreted as a number, which is almost anything in C. For instance, the value 2 would be syntactically correct; since 2 is different from 0, A would be executed. A and B can be any piece of code, for example printf("Hello World!");

Here's an example of an if statement:

if(customers > 0){
    printf("There are people in the shop"); // state the obvious 
    getHelp(); // call a function that gets more personnel
} else {
    drinkMoreCoffee(); // do what you're trained to do
}

The ternary operator (above) can be used as a quick form of the if statement.

while (do) instruction

The general look of a while statement:

while(expression){
    do something
}

expression is the same thing as with if statements, i.e. something that evaluates to either true (different from zero) or false (zero). The expression is tested first; if it is true, the body of the while statement is executed, and this is repeated for as long as expression remains true. For example:

int a = 0; 
while(a < 5){
    a++; 
    printf("%d, ", a); 
}
printf("\n");

This would produce the output:

1, 2, 3, 4, 5,

The key point is that the condition a < 5 is only tested at the beginning of each pass through the loop. On the last pass a is 4 when tested and is then incremented inside the body, so the number 5 is printed. The next test fails and the loop terminates.

The sibling of the while loop is the so-called do-while loop:

do {
    do something 
} while(expression); 

The key point here is that expression is tested at the end of the loop, which means that the body of the loop is executed at least once.

It's of course possible to construct a loop that runs forever (an infinite loop). In this example a is never changed, so the condition stays true:

int a = 0; 
while(a < 3){
   printf("I'm a fruitcake\n"); 
} 

for instruction

The general look of a for-loop is:

for(A;B;C){
    do something
}

Here A is executed just once when we enter the for-loop, typically to initialize some variables. B is a condition like the ones in the while loop, tested before each pass; if B is true then we continue looping. C is executed at the end of every pass through the loop. For example:

for(i = 0; i < 5; i++){
    printf("%d, ", i); 
}
printf("\n"); 

This would produce the output:

0, 1, 2, 3, 4,

To guide you through it: first i is set to 0; this happens just once. Then we test whether i is less than 5; clearly it is, since i is 0. Now we enter the for-loop and print the value of i. At the end of the loop i is incremented by 1. Then we test again whether i is less than 5, print its value, and so on.

Please note that A, B and C can all be omitted. This is valid syntax:

for(;;){
    printf("I'm looping forever and I'm happy with it.\n"); 
}

It's not too common to introduce loops like this, but it works. The idea is that you can do something like:

int i = 0; 
for(;i < 10; printf("Hello")){
    i++; 
}

Also, you can do more than one thing in A and C. For example, you may want to initialize the variables i to 1 and j to 5 at the beginning of the loop.

for(i = 1, j = 5; (i*j) > 0; i++, j--){
    printf("%d, ", i*j); 
}

The , between i = 1 and j = 5 is the separator here. The same style is used at the end, where i is incremented and j is decremented. The loop stops once j reaches 0, because the product i*j is then no longer greater than 0.

Functions

First of all, a function can be thought of as a way to store a procedure. This procedure contains some instructions that you want to carry out many times. It is also a way to structure your program, and it can be regarded as a function in a more mathematical sense. A wide variety of functions are available in the C standard library, which on GNU/Linux systems is provided by glibc (some of these are actually in the math library, accessed by linking the program with -lm).

Here is the general form of a C function:

return_type function_name(argument_1, argument_2, ..., argument_n){
    function_body
}

A simple example of a function:

int max_i(int a, int b){
    if(a > b){
        return a;
    }
    return b; 
}

Functions can of course call other functions, for example:

int find_max(int *nums, int len){
    int max = nums[0], i; // start from the first element (assumes len >= 1)
    for(i = 1; i < len; i++){
        max = max_i(nums[i], max); 
    }
    return max; 
}

Furthermore, functions can call themselves (recursion). This example calculates n!.

int fac(int n){
    if(n <= 1){ // covers 0! = 1 as well, and stops the recursion
        return 1; 
    }
    return n*fac(n - 1); 
}

Functions can return all basic types such as int, float, double, char, etc. They can also return pointers to all of these, as well as pointers to structs and pointers to void. If a function is declared to return void, it simply returns nothing. Don't confuse this with functions that return pointers to void, that is void*. Returning void* is simply returning a general pointer to something we don't know the type of.

An example of a function that 'returns' void.

void printObvious(void){
    printf("I'm a function that returns nothing.\n"); 
}

As we can see, we can omit the return statement since this function does not return anything.

Books

  • The C Programming Language, Brian W. Kernighan, Dennis Ritchie
  • C: A Reference Manual, Samuel P. Harbison, Guy L. Steele
  • The C Puzzle Book, Alan R. Feuer

For beginning users:

  • C How to Program, by Deitel et al. (Only the first two editions are recommended by my source.)
  • Practical C Programming, by Steve Oualline

See also