Embed Your Career: April 2012

message queue

We’ll now take a look at the third and final System V IPC facility: message queues. In many ways, message queues are like named pipes, but without the complexity associated with opening and closing the pipe. However, using messages doesn’t get us away from the problems that we have with named pipes, such as blocking on full pipes. Message queues provide a reasonably easy and efficient way of passing data between two unrelated processes. They have the advantage over named pipes that the message queue exists independently of both the sending and receiving processes, which removes some of the difficulties that occur in synchronizing the opening and closing of named pipes.

Message queues provide a way of sending a block of data from one process to another. Additionally, each block of data is considered to have a type, and a receiving process may receive blocks of data having different type values independently. The good news is that we can almost totally avoid the synchronization and blocking problems of named pipes by sending messages. Even better, we can “look ahead” for messages that are urgent in some way. The bad news is that, just like pipes, there’s a maximum size limit imposed on each block of data and also a limit on the maximum total size of all blocks on all queues throughout the system.
While stating that these limits are imposed, the X/Open specification offers no way of discovering what the limits are, except that exceeding them is a valid reason for some message queue functions to fail. Linux does have two defines, MSGMAX and MSGMNB, which define the maximum size in bytes of an individual message and the maximum size of a queue, respectively. These macros may be different or, for that matter, not even present on other systems.
The message queue function definitions are

#include <sys/msg.h>
int msgctl(int msqid, int cmd, struct msqid_ds *buf);
int msgget(key_t key, int msgflg);
int msgrcv(int msqid, void *msg_ptr, size_t msg_sz, long int msgtype, int msgflg);
int msgsnd(int msqid, const void *msg_ptr, size_t msg_sz, int msgflg);

msgget
We create and access a message queue using the msgget function:
int msgget(key_t key, int msgflg);
The program must provide a key value that, as with other IPC facilities, names a particular message queue. The special value IPC_PRIVATE creates a private queue, which in theory is accessible only by the current process. As with semaphores and messages, on some Linux systems the message queue may not actually be private. Since a private queue has very little purpose, that’s not a significant problem. As before, the second parameter, msgflg, consists of nine permission flags. Aspecial bit defined by IPC_CREAT must be bitwise ORed with the permissions to create a new message queue. It’s not an error to set the IPC_CREAT flag and give the key of an existing message queue. The IPC_CREAT flag is silently ignored if the message queue already exists. The msgget function returns a positive number, the queue identifier, on success or –1 on failure.

msgsnd
The msgsnd function allows us to add a message to a message queue:
int msgsnd(int msqid, const void *msg_ptr, size_t msg_sz, int msgflg);
The structure of the message is constrained in two ways. First, it must be smaller than the system limit, and second, it must start with a long int, which will be used as a message type in the receive function. When you’re using messages, it’s best to define your message structure something like this:
struct my_message {
long int message_type;
/* The data you wish to transfer */
}
Since the message_type is used in message reception, you can’t simply ignore it. You must declare your data structure to include it, and it’s also wise to initialize it, so that it contains a known value. The first parameter, msqid, is the message queue identifier returned from a msgget function. The second parameter, msg_ptr, is a pointer to the message to be sent, which must start with a long int type as described previously.
The third parameter, msg_sz, is the size of the message pointed to by msg_ptr. This size must not
include the long int message type. The fourth parameter, msgflg, controls what happens if either the current message queue is full or the systemwide limit on queued messages has been reached. If msgflg has the IPC_NOWAIT flag set, the function will return immediately without sending the message and the return value will be –1. If the msgflg has the IPC_NOWAIT flag clear, the sending process will be suspended, waiting for space to become available in the queue. On success, the function returns 0, on failure –1. If the call is successful, a copy of the message data has been taken and placed on the message queue.

msgrcv
The msgrcv function retrieves messages from a message queue:
int msgrcv(int msqid, void *msg_ptr, size_t msg_sz, long int msgtype, int msgflg);
The first parameter, msqid, is the message queue identifier returned from a msgget function.
The second parameter, msg_ptr, is a pointer to the message to be received, which must start with a long int type as described above in the msgsnd function.
The third parameter, msg_sz, is the size of the message pointed to by msg_ptr, not including the long int message type.
The fourth parameter, msgtype, is a long int, which allows a simple form of reception priority to be implemented. If msgtype has the value 0, the first available message in the queue is retrieved. If it’s greater than zero, the first message with the same message type is retrieved. If it’s less than zero, the first message that has a type the same as or less than the absolute value of msgtype is retrieved. This sounds more complicated than it actually is in practice. If you simply want to retrieve messages in the order in which they were sent, set msgtype to 0. If you want to retrieve only messages with a specific message type, set msgtype equal to that value. If you want to receive messages with a type of n or smaller, set msgtype to -n. The fifth parameter, msgflg, controls what happens when no message of the appropriate type is waiting to be received. If the IPC_NOWAIT flag in msgflg is set, the call will return immediately with a return value of –1. If the IPC_NOWAIT flag of msgflg is clear, the process will be suspended, waiting for an appropriate type of message to arrive.
On success, msgrcv returns the number of bytes placed in the receive buffer, the message is copied into the user-allocated buffer pointed to by msg_ptr, and the data is deleted from the message queue. It returns –1 on error.

msgctl
The final message queue function is msgctl, which is very similar to that of the control function for
shared memory.
int msgctl(int msqid, int command, struct msqid_ds *buf);
The msqid_ds structure has at least the following members:
struct msqid_ds {
uid_t msg_perm.uid;
uid_t msg_perm.gid
mode_t msg_perm.mode;
}
The first parameter, msqid, is the identifier returned from msgget.
The second parameter, command, is the action to take. It can take three values:

Command Description
IPC_STAT Sets the data in the msqid_ds structure to reflect the values associated with the message queue.
IPC_SET If the process has permission to do so, this sets the values associated with the message queue to those provided in the msqid_ds data structure.
IPC_RMID Deletes the message queue.
0 is returned on success, –1 on failure. If a message queue is deleted while a process is waiting in a msgsnd or msgrcv function, the send or receive function will fail.

Signal Handling in Linux

Signals have a very precise lifecycle. First, a signal is raised (we sometimes also say it is sent or generated). The kernel then stores the signal until it is able to deliver it.
Finally, once it is free to do so, the kernel handles the signal as appropriate. The kernel can perform one of three actions, depending on what the process asked it to do:

Ignore the signal
No action is taken. There are two signals that cannot be ignored: SIGKILL and SIGSTOP. The reason for this is that the system administrator needs to be able to kill or stop processes, and it would be a circumvention of that right if a process could elect to ignore a SIGKILL (making it unkillable), or a SIGSTOP (making it unstoppable).

Catch and handle the signal
The kernel will suspend execution of the process’ current code path, and jump to a previously registered function. The process will then execute this function. Once the process returns from this function, it will jump back to wherever it was when it caught the signal. SIGINT and SIGTERM are two commonly caught signals. Processes catch SIGINT to handle the user generating the interrupt character—for example, a terminal might catch this signal and return to the main prompt. Processes catch SIGTERM to perform necessarily cleanup, such as disconnecting from the network, or removing temporary files, before terminating. SIGKILL and SIGSTOP cannot be caught.

Perform the default action
This action depends on the signal being sent. The default action is often to terminate the process. This is the case with SIGKILL, for instance. However, many signals are provided for specific purposes that concern programmers in particular situations, and these signals are ignored by default because many programs are not interested in them. We will look at the various signals and their default actions shortly.

Traditionally, when a signal was delivered, the function that handled the signal had no information about what had happened except for the fact that a particular signal had occurred. Nowadays, the kernel can provide a lot of context to programmers who want to receive it, and signals can even pass user-defined data, like later and more advanced IPC mechanisms.

Signals Used in Linux :

1) SIGHUP    2) SIGINT    3) SIGQUIT    4) SIGILL    5) SIGTRAP
6) SIGABRT    7) SIGBUS    8) SIGFPE    9) SIGKILL    10) SIGUSR1
11) SIGSEGV    12) SIGUSR2    13) SIGPIPE    14) SIGALRM    15) SIGTERM
16) SIGSTKFLT    17) SIGCHLD    18) SIGCONT    19) SIGSTOP    20) SIGTSTP
21) SIGTTIN    22) SIGTTOU    23) SIGURG    24) SIGXCPU    25) SIGXFSZ
26) SIGVTALRM    27) SIGPROF    28) SIGWINCH    29) SIGIO    30) SIGPWR
31) SIGSYS    34) SIGRTMIN    35) SIGRTMIN+1    36) SIGRTMIN+2    37) SIGRTMIN+3
38) SIGRTMIN+4    39) SIGRTMIN+5    40) SIGRTMIN+6    41) SIGRTMIN+7    42) SIGRTMIN+8
43) SIGRTMIN+9    44) SIGRTMIN+10    45) SIGRTMIN+11    46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14    49) SIGRTMIN+15    50) SIGRTMAX-14
51) SIGRTMAX-13    52) SIGRTMAX-12 53) SIGRTMAX-11    54) SIGRTMAX-10
55) SIGRTMAX-9         56) SIGRTMAX-8    57) SIGRTMAX-7                  58) SIGRTMAX-6
59) SIGRTMAX-5         60) SIGRTMAX-4         61) SIGRTMAX-3    62) SIGRTMAX-2
63) SIGRTMAX-1       64) SIGRTMAX

The numbers in this shows the Signals number. Which we used during the programming.

Signal Handling
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void signal_handler(int sig)
{
printf(“ I got signal %d\n”, sig);
(void) signal(SIGINT, SIG_DFL);
}
int main()
{
(void) signal(SIGINT, signal_handler);
while(1)
{
printf(“Hello World!\n”);
sleep(1);
}
return 0;
}
$ ./ctrlc1
Hello World!
Hello World!
Hello World!
Hello World!
^C
-I got signal 2
Hello World!
Hello World!
Hello World!
Hello World!
^C
$
As you can see from this example, the signal handling function takes a single integer parameter, the signal number that caused the function to be called. This can be useful if the same function is used to handle more than one signal. Here we print out the value of SIGINT, which on this system happens to have the value 2. You shouldn’t rely on traditional numeric values for signals; always use signal names in new programs.

It is not safe to call all functions, such as printf, from within a signal handler. A useful technique is to use a signal handler to set a flag and then check that flag from the main program and print a message if required. Toward the end of the chapter, you will find a list of calls that can safely be made inside signal handlers.

Sending Signals
A process may send a signal to another process, including itself, by calling kill. The call will fail if the program doesn’t have permission to send the signal, often because the target process is owned by another user. This is the program equivalent of the shell command of the same name.

#include <sys/types.h>
#include <signal.h>
int kill(pid_t pid, int sig);

The kill function sends the specified signal, sig, to the process whose identifier is given by pid. It returns 0 on success. To send a signal, the sending process must have permission to do so. Normally, this means that both processes must have the same user ID (that is, you can send a signal only to one of your own processes, although the superuser may send signals to any process). kill will fail, return -1, and set errno if the signal given is not a valid one (errno set to EINVAL), if it doesn’t have permission (EPERM), or if the specified process doesn’t exist (ESRCH).

Semaphores: A synchronization mechnism

When we write programs that use threads operating in multiuser systems, multiprocessing systems, or a combination of the two, we often discover that we have critical sections of code, where we need to ensure that a single process (or a single thread of execution) has exclusive access to a resource. Semaphores have a complex programming interface. Fortunately, we can easily provide ourselves with a much-simplified interface that is sufficient for most semaphore-programming problems. a semaphore is a special variable that takes only whole positive numbers and upon which only two operations are allowed: wait and signal. Since “wait” and “signal” already have special meanings in Linux and UNIX programming, we’ll use the original notation:

❑ P(semaphore variable) for wait
❑ V(semaphore variable) for signal

These letters come from the Dutch words for wait (passeren: to pass, as in a checkpoint before the critical section) and signal (vrijgeven: to give or release, as in giving up control of the critical section). You may also come across the terms “up” and “down” used in relation to semaphores, taken from the use of signaling flags.

Semaphore Definition

The simplest semaphore is a variable that can take only the values 0 and 1, a binary semaphore. This is the most common form. Semaphores that can take many positive values are called general semaphores. For the remainder of this chapter, we’ll concentrate on binary semaphores.
The definitions of P and V are surprisingly simple. Suppose we have a semaphore variable sv. The two operations are then defined as follows:
P(sv) If sv is greater than zero, decrement sv. If sv is zero, suspend execution of this process.
V(sv) If some other process has been suspended waiting for sv, make it resume execution. If no process is suspended waiting for sv, increment sv.
Another way of thinking about semaphores is that the semaphore variable, sv, is true when the critical section is available, is decremented by P(sv) so it’s false when the critical section is busy, and is incremented by V(sv) when the critical section is again available. Be aware that simply having a normal variable that you decrement and increment is not good enough, because you can’t express in C, C++, or almost any conventional programming language the need to make a single, atomic operation of the test to see whether the variable is true, and if so change the variable to make it false. This is what makes the semaphore operations special.

Linux Semaphore Facilities

Now that we’ve seen what semaphores are and how they work in theory, we can look at how the features are implemented in Linux. The interface is rather elaborate and offers far more facilities than are generally required. All the Linux semaphore functions operate on arrays of general semaphores rather than a single binary semaphore. At first sight, this just seems to make things more complicated, but in complex cases where a process needs to lock multiple resources, the ability to operate on an array of semaphores is a big advantage. In this chapter, we will concentrate on using single semaphores, since in most cases that’s all you will need to use.
The semaphore function definitions are

#include <sys/sem.h>
int semctl(int sem_id, int sem_num, int command, ...);
int semget(key_t key, int num_sems, int sem_flags);
int semop(int sem_id, struct sembuf *sem_ops, size_t num_sem_ops);

As we work through each function in turn, remember that these functions were designed to work for arrays of semaphore values, which makes their operation significantly more complex than would have been required for a single semaphore.
Notice that key acts very much like a filename in that it represents a resource that programs may use and cooperate in using if they agree on a common name. Similarly, the identifier returned by semget and used by the other shared memory functions is very much like the FILE * file stream returned by fopen in that it’s a value used by the process to access the shared file. And, just as with files, different processes will have different semaphore identifiers, though they refer to the same semaphore. This use of a key and identifiers is common to all of the IPC facilities discussed here, although each facility uses independent keys and identifiers.

Semget

The semget function creates a new semaphore or obtains the semaphore key of an existing semaphore.

int semget(key_t key, int num_sems, int sem_flags);

The first parameter, key, is an integral value used to allow unrelated processes to access the same semaphore. All semaphores are accessed indirectly by the program supplying a key, for which the system then generates a semaphore identifier. The semaphore key is used only with semget. All other semaphore functions use the semaphore identifier returned from semget.
There is a special semaphore key value, IPC_PRIVATE (usually 0), that is intended to create a semaphore that only the creating process could access. This rarely has any useful purpose, which is fortunate, because on some Linux systems the manual pages list as a bug the fact that IPC_PRIVATE may not prevent other processes from accessing the semaphore. The num_sems parameter is the number of semaphores required. This is almost always 1. The sem_flags parameter is a set of flags, very much like the flags to the open function. The lower nine bits are the permissions for the semaphore, which behave like file permissions. In addition, these can be bitwise ORed with the value IPC_CREAT to create a new semaphore. It’s not an error to have the IPC_CREAT flag set and give the key of an existing semaphore. The IPC_CREAT flag is silently ignored if it is not required. We can use IPC_CREAT and IPC_EXCL together to ensure that we obtain a new, unique semaphore. It will return an error if the semaphore already exists.
The semget function returns a positive (nonzero) value on success; this is the semaphore identifier used in the other semaphore functions. On error, it returns –1.

semop

The function semop is used for changing the value of the semaphore:

int semop(int sem_id, struct sembuf *sem_ops, size_t num_sem_ops);

The first parameter, sem_id, is the semaphore identifier, as returned from semget. The second parameter,
sem_ops, is a pointer to an array of structures, each of which will have at least the following members:

struct sembuf {
short sem_num;
short sem_op;
short sem_flg;
}

The first member, sem_num, is the semaphore number, usually 0 unless you’re working with an array of semaphores. The sem_op member is the value by which the semaphore should be changed. (You can change a semaphore by amounts other than 1.) In general, only two values are used, –1, which is our P operation to wait for a semaphore to become available, and +1, which is our V operation to signal that a semaphore is now available. The final member, sem_flg, is usually set to SEM_UNDO. This causes the operating system to track the changes made to the semaphore by the current process and, if the process terminates without releasing the semaphore, allows the operating system to automatically release the semaphore if it was held by this process. It’s good practice to set sem_flg to SEM_UNDO, unless you specifically require different behavior. If you do decide you need a value other than SEM_UNDO, it’s important to be consistent, or you can get very confused as to whether the kernel is attempting to “tidy up” your semaphores when your process exits. All actions called for by semop are taken together to avoid a race condition implied by the use of multiple semaphores. You can find full details of the processing of semop in the manual pages.

Semctl

The semctl function allows direct control of semaphore information:
int semctl(int sem_id, int sem_num, int command, ...);
The first parameter, sem_id, is a semaphore identifier, obtained from semget. The sem_num parameter is the semaphore number. You use this when you’re working with arrays of semaphores. Usually, this is 0, the first and only semaphore. The command parameter is the action to take, and a fourth parameter, if present, is a union semun, which according to the X/OPEN specification, must have at least the following members:

union semun {
int val;
struct semid_ds *buf;
unsigned short *array;
}
Many versions of Linux have a definition of the semun union in a header file (usually sem.h), though X/Open does say that you have to declare your own. If you do find that you need to declare your own, check the manual pages for semctl to see if there is a definition given. If there is, we suggest you use exactly the definition given in your manual, even if it differs from that above. There are many different possible values of command allowed for semctl. Only the two that we describe here are commonly used. For full details of the semctl function, you should consult the manual page. The two common values of command are
❑ SETVAL: Used for initializing a semaphore to a known value. The value required is passed as
the val member of the union semun. This is required to set the semaphore up before it’s used for the first time.
❑ IPC_RMID: Used for deleting a semaphore identifier when it’s no longer required. The semctl function returns different values depending on the command parameter. For SETVAL and IPC_RMID it returns 0 for success and –1 on error.

File I/O : system calls

File I/O

Opening Files

The most basic method of accessing a file is via the read( ) and write( ) system calls. Before a file can be accessed, however, it must be opened via an open( ) or creat( ) system call. Once done using the file, it should be closed using the system call close( ).

The open( ) System Call

A file is opened, and a file descriptor is obtained with the open( ) system call:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int   open (const char *name, int flags);
int   open (const char *name, int flags, mode_t mode);

The open( ) system call maps the file given by the pathname name to a file descriptor, which it returns on success. The file position is set to zero, and the file is opened for access according to the flags given by flags.

Flags for open( )

The flags argument must be one of O_RDONLY, O_WRONLY, or O_RDWR. Respectively, these arguments request that the file be opened only for reading, only for writing, or for both reading and writing.

For example, the following code opens /home/vineet/test for reading:

Int fd;
fd = open ("/home/vineet/test", O_RDONLY);
if (fd == -1)
/* error */
A file opened only for writing cannot also be read, and vice versa. The process issuing the open( ) system call must have sufficient permissions to obtain the access requested.

The flags argument can be bitwise-ORed with one or more of the following values, modifying the behavior of the open request:

O_APPEND
The file will be opened in append mode. That is, before each write, the file position will be updated to point to the end of the file. This occurs even if another process has written to the file after the issuing process’ last write, thereby changing the file position.

O_ASYNC
A signal (SIGIO by default) will be generated when the specified file becomes readable or writable. This flag is available only for terminals and sockets, not for regular files.

O_CREAT
If the file denoted by name does not exist, the kernel will create it. If the file already exists, this flag has no effect unless O_EXCL is also given.

O_DIRECT
The file will be opened for direct I/O .

O_DIRECTORY
If name is not a directory, the call to open( ) will fail. This flag is used internally by the opendir( ) library call.

O_EXCL
When given with O_CREAT, this flag will cause the call to open( ) to fail if the file given by name already exists. This is used to prevent race conditions on file creation.

O_LARGEFILE
The given file will be opened using 64-bit offsets, allowing files larger than two gigabytes to be opened. This is implied on 64-bit architectures.

O_NOCTTY
If the given name refers to a terminal device (say, /dev/tty), it will not become the process controlling terminal, even if the process does not currently have a controlling terminal. This flag is not frequently used.

O_NOFOLLOW
If name is a symbolic link, the call to open( ) will fail. Normally, the link is resolved, and the target file is opened. If other components in the given path are links, the call will still succeed. For example, if name is /etc/ship/plank.txt, the call will fail if plank.txt is a symbolic link. It will succeed, however, if etc or ship is a symbolic link, so long as plank.txt is not.

O_NONBLOCK
If possible, the file will be opened in nonblocking mode. Neither the open( ) call, nor any other operation will cause the process to block (sleep) on the I/O. This behavior may be defined only for FIFOs.

O_SYNC
The file will be opened for synchronous I/O. No write operation will complete until the data has been physically written to disk; normal read operations are already synchronous, so this flag has no effect on reads. POSIX additionally defines O_DSYNC and O_RSYNC; on Linux, these flags are synonymous with O_SYNC.

O_TRUNC
If the file exists, it is a regular file, and the given flags allow for writing, the file will be truncated to zero length. Use of O_TRUNC on a FIFO or terminal device is ignored. Use on other file types is undefined. Specifying O_TRUNC with O_RDONLY is also undefined, as you need write access to the file in order to truncate it.

For example, the following code opens for writing the file /home/teach/pearl. If the file already exists, it will be truncated to a length of zero. Because the O_CREAT flag is not specified, if the file does not exist, the call will fail:

int   fd;
fd = open ("/home/vineet/test1", O_WRONLY | O_TRUNC);
if (fd == -1)
/* error */

Owners of New Files

S_IRWXU
Owner has read, write, and execute permission.
S_IRUSR
Owner has read permission.
S_IWUSR
Owner has write permission.
S_IXUSR
Owner has execute permission.
S_IRWXG
Group has read, write, and execute permission.
S_IRGRP
Group has read permission.
S_IWGRP
Group has write permission.
S_IXGRP
Group has execute permission.
S_IRWXO
Everyone else has read, write, and execute permission.
S_IROTH
Everyone else has read permission.

S_IWOTH
Everyone else has write permission.
S_IXOTH
Everyone else has execute permission.

The creat( ) Function

The combination of O_WRONLY | O_CREAT | O_TRUNC is so common that a system call exists to provide just that behavior:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int creat (const char *name, mode_t mode);

Return Values and Error Codes
Both open( ) and creat( ) return a file descriptor on success. On error, both return -1, and set errno to an appropriate error value (Chapter 1 discussed errno and listed the potential error values). Handling an error on file open is not complicated, as generally there will have been few or no steps performed prior to the open that need to be undone. A typical response would be prompting the user for a different filename or
simply terminating the program.

Reading via read( )

The most basic—and common—mechanism used for reading is the read( ) system call.

#include <unistd.h>
ssize_t read (int fd, void *buf, size_t len);

Each call reads up to len bytes into buf from the current file offset of the file referenced by fd. On success, the number of bytes written into buf is returned. On error, the call returns -1, and errno is set. The file position is advanced by the number of bytes read from fd. If the object represented by fd is not capable of seeking (for example, a character device file), the read always occurs from the “current” position.
Basic usage is simple. This example reads from the file descriptor fd into word . The number of bytes read is equal to the size of the unsigned long type, which is four bytes on 32-bit Linux systems, and eight bytes on 64-bit systems. On return, nr contains the number of bytes read, or -1 on error:

unsigned long word;
ssize_t nr;

/* read a couple bytes into 'word' from 'fd' */
nr = read (fd, &word, sizeof (unsigned long));
if (nr == -1)
/* error */

There are two problems with this -ve implementation: the call might return without reading all len bytes, and it could produce certain errors that this code does not check for and handle.

Writing with write( )

The most basic and common system call used for writing is write( ). write( ) is the counterpart of read( ).

#include <unistd.h>
ssize_t write (int fd, const void *buf, size_t count);

A call to write( ) writes up to count bytes starting at buf to the current file position of the file referenced by the file descriptor fd. Files backed by objects that do not support seeking (for example, character devices) always write starting at the “head.”

On success, the number of bytes written is returned, and the file position is updated in kind. On error, -1 is returned, and errno is set appropriately. A call to write( ) can return 0, but this return value does not have any special meaning; it simply implies that zero bytes were written.
As with read( ), the most basic usage is simple:

const char *buf = "My ship is solid!";
ssize_t nr;
/* write the string in 'buf' to 'fd' */
nr = write (fd, buf, strlen (buf));
if (nr == -1)
/* error */

But again, as with read( ) , this usage is not quite right. Callers also need to check for the possible occurrence of a partial write:

unsigned long word = 1720;
size_t count;
ssize_t nr;
count = sizeof (word);
nr = write (fd, &word, count);
if (nr == -1)
/* error, check errno */
else if (nr != count)
/* possible error, but 'errno' not set */


Closing Files
After a program has finished working with a file descriptor, it can unmap the file descriptor from the associated file via the close( ) system call:

#include <unistd.h>
Int close (int fd);
A call to close( ) unmaps the open file descriptor fd, and disassociates the process from the file. The given file descriptor is then no longer valid, and the kernel is free to reuse it as the return value to a subsequent open( ) or creat( ) call. A call to close( ) returns 0 on success. On error, it returns -1, and sets errno appropriately. Usage is simple:

if (close (fd) == -1)
perror ("close");

Note that closing a file has no bearing on when the file is flushed to disk. If an application wants to ensure that the file is committed to disk before closing it, it needs to make use of one of the synchronization options discussed earlier in “Synchronized I/O.”
Closing a file does have some side effects, though. When the last open file descriptor referring to a file is closed, the data structure representing the file inside the kernel is freed. When this data structure is freed, it unpins the in-memory copy of the inode associated with the file. If nothing else is pinning the inode, it too may be freed from memory (it may stick around because the kernel caches inodes for performance reasons, but it need not). If a file has been unlinked from the disk, but was kept open before it was unlinked, it is not physically removed until it is closed and its inode is removed from memory. Therefore, calling close( ) may also result in an unlinked file finally being physically removed from the disk.