Tuesday, December 30, 2008

C and Linux programming


Writing and Using libraries:

Virtually all programs are linked against one or more libraries. Any program that uses a C function (such as printf or malloc) will be linked against the C runtime library. If your program has a GUI, it will be linked against windowing libraries.

We need to decide whether to link each library statically or dynamically. If we link statically, the program becomes bigger and harder to upgrade, but probably easier to deploy. If we link dynamically, the program is smaller and easier to upgrade, but harder to deploy.

Archives:

Archive (or static library) - simply a collection of object files stored in a single file. When we provide an archive to the linker, the linker searches the archive for the object files it needs, extracts them, and links them into the program.

An archive can be created using the ar command. Archive files use the .a extension.

To combine test1.o and test2.o into a single archive, libtest.a:

=> ar cr libtest.a test1.o test2.o

The cr flags tell ar to create the archive (c) and to insert the object files, replacing any existing members of the same name (r). Now we can link with this archive using the -ltest option with gcc or g++.

When the linker encounters an archive on the command line, it searches the archive for definitions of all symbols (functions or variables) that are referenced from the object files it has already processed but that are not yet defined. The object files that define those symbols are extracted from the archive and included in the final executable. For this reason, archives usually go at the end of the command line, after the object files that use them.

Shared libraries:

Shared library (also known as a shared object, or as a dynamically linked library) - similar to an archive in that it is a grouping of object files. However, there are many important differences. The most fundamental difference is that when a shared library is linked into a program, the final executable does not actually contain the code that is present in the shared library. Instead, the executable merely contains a reference to the shared library. If several programs on the system are linked against the same shared library, they will all reference the library, but none will actually include it. Thus the library is 'shared' among all the programs that link with it.

Another important difference is that a shared library is not merely a collection of object files, out of which the linker chooses those that are needed to satisfy undefined references. Instead, the object files that compose the shared library are combined into a single object file so that a program that links against a shared library always includes all of the code in the library, rather than just those portions that are needed.

To create a shared library, we must compile the objects that will make up the library using the -fPIC option to the compiler, as shown below:

=> gcc -c -fPIC test1.c

The -fPIC option tells the compiler that you are going to use test1.o as part of a shared object.

PIC - stands for Position Independent Code. The functions in a shared library may be loaded at different addresses in different programs, so the code in the shared object must not depend on the address (or position) at which it is loaded.

To combine the object files into a shared library:

=> gcc -shared -fPIC -o libtest.so test1.o test2.o

The -shared option tells the compiler to produce a shared library rather than an ordinary executable. Shared libraries use the extension .so, which stands for shared object. Like static archives, the name always begins with lib to indicate that the file is a library.

Linking with a shared library is just like linking with a static archive. For example, the following line will link with libtest.so if it is in the current directory or one of the standard library search directories on the system:

=> gcc -o app app.o -L. -ltest

If both libtest.a and libtest.so are available, the linker picks the shared library unless the -static option is given.

The ldd command displays the shared libraries that are linked into an executable. These libraries need to be available when the executable is run.

Using LD_LIBRARY_PATH:

When we link a program with a shared library, the linker does not put the full path to the shared library in the resulting executable. Instead, it places only the name of the shared library. When the program is actually run, the system searches for the shared library and loads it. The system searches only /lib and /usr/lib by default. If a shared library that is linked into your program is installed outside of those directories, it will not be found, and the system will refuse to run the program.

One solution to this problem is to use the -Wl,-rpath option when linking the program:

=> gcc -o app app.o -L. -ltest -Wl,-rpath,/usr/local/lib

In this case, when app is run, the system will search /usr/local/lib for any required shared libraries.

Another solution is to set the LD_LIBRARY_PATH environment variable when running the program. Like the PATH variable, LD_LIBRARY_PATH is a colon-separated list of directories. For example, if LD_LIBRARY_PATH is /usr/local/lib:/opt/lib, then those directories will be searched before the standard /lib and /usr/lib directories.

Pros and Cons:

Shared libraries :

- Saves space. If the same library is referenced by many programs, we can save a lot of space by linking with shared libraries.

- Upgrading is easy. If we upgrade the library, all the programs referencing that library are upgraded.

Static Libraries:

- If we want to upgrade only one program that references the library, without affecting the others, then a static library is the better choice.

- If we are not going to install the library in /lib or /usr/lib, then a shared library is less suitable, as each user of the program has to set LD_LIBRARY_PATH properly.

Dynamic Loading and Unloading:

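Dynamic loading lets a program load some code at run time without explicitly linking in that code (for example, a plug-in).

We can load the library libtest.so by using dlopen on Linux:

dlopen ("libtest.so", RTLD_LAZY)

The second parameter is a flag that indicates how to bind symbols in the shared library; RTLD_LAZY resolves them as they are first used. dlopen returns a void * handle, or NULL on failure. Pass the handle and a symbol name to dlsym to get the address of a function in the library.

To use the dynamic loading functions, include <dlfcn.h> and link with the -ldl option to pick up the libdl library.

dlclose unloads the shared library.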

Error codes for system calls:

Most system calls return zero (or another non-negative value, such as a file descriptor) if the operation succeeds, and -1 if the operation fails.

Most system calls use a special variable named errno to store additional information in case of failure. When a call fails, the system sets errno to a value indicating what went wrong. All system calls use the same variable, so its value may be overwritten the next time you make a system call. If we need the value later, we should copy it into another variable immediately after the failed call.

Error values are integers; possible values are given by preprocessor macros, by convention named in all capitals and starting with "E" - for example, EACCES and EINVAL.

Include <errno.h> to use errno.

strerror - returns a character string description of an errno error code. Include <string.h>.

perror - prints the error description directly to the stderr stream. Pass to perror a character string prefix to print before the error description, which is usually the name of the function that failed. Include <stdio.h>.


ex:

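#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int fd = open("inputfile.txt", O_RDONLY);

    /* open returns -1 on failure and sets errno */
    if (fd == -1) {
        fprintf(stderr, "Error opening file : %s\n", strerror(errno));
        exit(1);
    }
    return 0;
}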

prints:

Error opening file : No such file or directory

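EINTR: the error code set when a blocking call (such as read, select, or sleep) is interrupted by a signal before it completes. In most cases the program should simply retry the call.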


System call failures:

System calls can fail in many ways, for example:



- The system can run out of resources (or the program can exceed the resource limits enforced by the system on a single program). For example, the program might try to allocate too much memory, to write too much to a disk, or to open too many files at the same time.



- Linux may block a certain system call when a program attempts to perform an operation for which it does not have permission. For example, a program might attempt to write to a file marked read-only, to access the memory of another process, or to kill another user's program.



- The arguments to a system call might be invalid, either because the user provided invalid input or because of a program bug. For instance, the program might pass an invalid memory address or an invalid file descriptor to a system call. Or a program might attempt to open a directory as an ordinary file, or might pass the name of an ordinary file to a system call that expects a directory.



- A system call can fail for reasons external to the program, such as accessing a faulty device or a device that does not support the requested I/O.



- A system call can sometimes be interrupted by an external event, such as the delivery of a signal.



Using assert:

The assert macro is the simplest way to check for unexpected conditions in standard C. The argument to this macro is a Boolean expression. If the expression evaluates to false, assert prints an error message containing the source file, the line number, and the text of the expression, and then terminates the program. Include <assert.h> to use it.



This is very useful for a variety of consistency checks internal to a program. assert can be used to test the validity of function arguments, to test preconditions and post conditions of function calls and to test for unexpected return values.



For performance-critical code, runtime checks such as assert can impose a significant performance penalty. In these cases, we can compile the code with the NDEBUG macro defined, by using the -DNDEBUG flag on the compiler command line. With NDEBUG set, the preprocessor removes every assert expression. So we need to be careful about what we put inside assert: the expression must not have side effects. We should not call functions that change state inside assert expressions, assign variables, or use modifying operators such as ++.



Assert can be used in places such as



- Check against a NULL pointer: assert (pointer != NULL);

The error message generated will be "Assertion 'pointer != ((void * )0)' failed". This is better for debugging than the usual "Segmentation fault (core dumped)".



- Check conditions on function parameter values. For example, if a function accepts only positive values for a parameter, use assert (i > 0);





Temp files:

mkstemp() - creates a unique temporary file from a file name template, creates the file with permissions so that only the current user can access it, and opens the file for read/write. The file name template is a character string ending with "XXXXXX" (6 capital X); mkstemp replaces the X's with characters so that the file name is unique. The return value is a file descriptor.



Temporary files created with mkstemp are not deleted automatically. It's up to us to remove the temp file when it is no longer needed.





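#include <stdio.h>
#include <stdlib.h>   /* mkstemp, malloc */
#include <unistd.h>   /* unlink, write, read, lseek, close */

/* Writes length bytes from buffer into a temporary file.
   Returns the file descriptor, or -1 if the file could not be created. */
int write_temp_file (char *buffer, size_t length)
{
    char temp_filename[] = "/tmp/temp_file.XXXXXX";
    int fd = mkstemp(temp_filename);

    if (fd == -1)
        return -1;
    /* unlink the file immediately, so that it will be removed when the file descriptor is closed */
    unlink(temp_filename);
    write(fd, &length, sizeof(length)); /* writes number of bytes */
    write(fd, buffer, length); /* writes data */
    return fd;
}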





/* Reading the data from tmp file */

char* read_temp_file(int temp_file, size_t* length)
{
    char *buffer;
    int fd = temp_file;

    lseek(fd, 0, SEEK_SET); /* Rewind to the beginning of the file */
    read(fd, length, sizeof(*length)); /* Read the size of data in temp file */
    buffer = (char*) malloc(*length);
    read (fd, buffer, *length);
    close (fd); /* this will cause the temp file to go away */
    return buffer;
}





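If we don't need to pass the temporary file to another program, we can use tmpfile instead of mkstemp. tmpfile creates and opens a temporary file and returns a FILE * for it; the file is removed automatically when it is closed or when the program exits.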



Printing the environment:

#include <stdio.h>

/* The environ variable contains the environment information. This variable, of type char**, is a NULL-terminated array of pointers to character strings. */
extern char **environ;

int main() {
    char **var;

    for (var = environ; *var != NULL; ++var)
        printf ("%s\n", *var);
    return 0;
}



Use the setenv and unsetenv functions to set and clear environment variables.





Exit Codes:

We can obtain the exit code of the most recently executed program from the shell using the $? variable:

=> echo $?



In C++,

endl - flushes the stream in addition to printing a newline character.

\n - prints a newline character but does not flush the stream.





stdout, stderr:
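stdout is buffered. Data written to stdout is not sent to the console until the buffer fills, the program exits normally, or stdout is closed. (When stdout is a terminal it is usually line buffered, so printing a newline also flushes it.) We can explicitly flush the buffer by calling the following: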

fflush (stdout);

In contrast, stderr is not buffered; data written to stderr goes directly to the console.

This loop doesn't print a # every second; instead, the #s are buffered and a bunch of them are printed when the buffer fills.

while(1) {
printf("#");
sleep(1);
}

In the loop below, a # is printed once a second:

while(1) {
fprintf(stderr, "#");
sleep(1);
}




Simple Sorting:

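#include <iostream>

using namespace std;

#define maxsize 100

int main()
{
    int temp, i, j, n, list[maxsize];

    cout<<"\nEnter your list size: ";
    // reject sizes that do not fit in the array
    if(!(cin>>n) || n < 0 || n > maxsize)
    {
        cout<<"Invalid list size"<<endl;
        return 1;
    }

    // prompting the data from user and store it in the list...
    for(i=0; i<n; i++)
    {
        cout<<"Enter element #"<<i<<": ";
        cin>>list[i];
    }

    // do the sorting...
    for(i=0; i<n-1; i++)
        for(j=i+1; j<n; j++)
            if(list[i] > list[j])
            {
                // these three lines swap the elements list[i] and list[j].
                temp = list[i];
                list[i] = list[j];
                list[j] = temp;
            }

    cout<<"\nSorted list, ascending: ";
    for(i=0; i<n; i++)
        cout<<" "<<list[i];
    cout<<endl;
    return 0;
}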