Pointers are one of the "hard" subjects of the C language. They are somewhat mysterious, quite difficult for beginners to grasp, and their extensive use within C makes them unavoidable.
Pointers are machine addresses, i.e. they point to data. It is important to keep this distinction clear: pointers are NOT the data they point to, they contain just a machine address where the data will be found. When you declare a pointer like this:
FILE *infile;
you are telling the compiler: reserve storage for a machine address, not for a FILE structure. This pointer will contain the location where that structure starts.
The contents of the pointer are undefined until you initialize it. Before you make an assignment the pointer can hold anything, and it is not possible to know what is in there. A pointer that has not been initialized does not point to any valid object and must not be used; such a pointer is often called a wild pointer.
A pointer can be initialized by:
Assigning it the special pointer value NULL, i.e. empty.
Assigning it the result of a function or expression that returns a pointer of the same type. In the frequencies example we initialize our infile pointer with the function fopen, which returns a pointer to a FILE.
Assigning it a specific address. This happens in programs that need to access certain machine addresses, for instance to use them as input/output ports for special devices. In those cases you can initialize a pointer to a specific address. Note that this is not possible for ordinary programs under Windows, Linux, or many other operating systems, where addresses are virtual addresses. More on this later.
You can assign a pointer to point to some object by taking the address of that object. For instance:
int integer;
int *pinteger = &integer;
Here we make the pointer "pinteger" point to the int "integer" by taking the address of that integer, using the "&" operator. This operator yields the machine address of its argument.
You can access the data the pointer is pointing to by using the "*" operator. When we want to access the integer "pinteger" is pointing to, we write:
*pinteger = 7;
This indirectly assigns the value 7 to the variable "integer".
In lcc-win32 pointers can be of two types. We have normal pointers, as we have described above, and "references", i.e. compiler maintained pointers, that are very similar to the objects themselves. References are an extension of lcc-win32; they are not part of standard C.
References are declared in a similar way as pointers are declared:
int a = 5; // declares an integer a
int * pa = &a; // declares a pointer to the integer a
int &ra = a; // declares a reference to the integer a
Here we have an integer, that within this scope will be called "a". Its machine address will be stored in a pointer to this integer, called "pa". This pointer will be able to access the data of "a", i.e. the value stored at that machine address by using the "*" operator. When we want to access that data we write:
*pa = 8944;
This means:
"store at the address contained in this pointer pa, the value 8944".
We can also write:
int m = 698 + *pa;
This means:
"add to 698 the contents of the integer whose machine address is contained in the pointer pa and store the result of the addition in the integer m".
We also have a "reference" to a, which in this scope will be called "ra". Any access to this compiler maintained pointer is written exactly as we would access the object itself; no special syntax is needed. For instance we can write:
ra = (ra+78) / 79;
Note that with references the "*" operator is not needed. The compiler does this automatically for you.
It is obvious that a question arises now: why do we need references? Why can't we just use the objects themselves? Why is all this pointer stuff necessary?
Well this is a very good question. Many languages seem to do quite well without ever using pointers the way C does.
The main reason for these constructs is efficiency. Imagine you have a huge database table, and you want to pass it to a routine that will extract some information from it. The best way to pass that data is just to pass the address where it starts, without having to move or make a copy of the data itself. Passing an address is just passing a single number, 32 bits wide on a 32-bit system and 64 bits wide on a 64-bit one, which is a very small amount of data. If we passed the table itself, we would be forced to copy a huge amount of data into the called function, which would waste machine resources.
References offer the best of both worlds. They must always point to some object: there is no such thing as an uninitialized reference. Once initialized, they can't be made to point to another object, as normal pointers can. For instance, in the above expressions, the pointer pa is initialized to point to the integer "a", but later in the program you are allowed to make the "pa" pointer point to another, completely unrelated integer. This is not possible with the reference "ra". It will always point to the integer "a".
When you pass an argument to a function that expects a reference, the compiler arranges to pass only the address of the referenced data.