UNIX Interprocess Communications (I)
Outline
IPC fundamentals
Pipe FIFO Message queue
IPC Fundamentals
What is IPC?
Mechanisms to transfer data between processes
Why is it needed?
Not all important procedures can be easily built in a single process
Why do processes communicate?
To share resources Client/server paradigms Inherently distributed applications Reusable software components Other good software engineering reasons
The Basic Concept of IPC
A sending process needs to communicate data to a receiving process
Sender wants to avoid details of receiver's condition
Receiver wants to get the data in an organized way
IPC from the OS Point of View
[Diagram: Process A and Process B each run in a private address space, alongside the OS address space]
Fundamental IPC Problem for the OS
Each process has a private address space
Normally, no process can write to another process's space
How to get important data from process A to process B?
OS Solutions to IPC Problem
Fundamentally, two options 1. Support some form of shared address space
Shared memory
2. Use OS mechanisms to transport data from one address space to another
Files, messages, pipes, RPC
Fundamental Differences in OS Treatment of IPC Solutions
Shared memory
OS has job of setting it up
And perhaps synchronizing
But not transporting data
Messages, etc.
OS involved in every IPC
OS transports data
Desirable IPC Characteristics
Fast Easy to use Well defined synchronization model Versatile Easy to implement Works remotely
IPC and Synchronization
Synchronization is a major concern for IPC
Allowing sender to indicate when data is transmitted Allowing receiver to know when data is ready Allowing both to know when more IPC is possible
IPC and Connections
IPC mechanisms can be connectionless or require connection
Connectionless IPC mechanisms require no preliminary setup
Connection-oriented IPC mechanisms require negotiation and setup before data flows
Sometimes much is concealed from user, though
Connectionless IPC
Data simply flows
Typically, no permanent data structures shared in OS by sender and receiver
+ Good for quick, short communication
+ Less long-term OS overhead
- Less efficient for large, frequent communications
- Each communication takes more OS resources per byte
Connection-Oriented IPC
Sender and receiver pre-arrange IPC delivery details OS typically saves IPC-related information for them Advantages/disadvantages pretty much the opposites of connectionless IPC
Basic IPC Mechanisms
File system IPC Message-based IPC Procedure call IPC Shared memory IPC
IPC Through the File System
Sender writes to a file Receiver reads from it But when does the receiver do the read?
Often synchronized with file locking or lock files
Special types of files can make file-based IPC easier
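A minimal sketch of file-based IPC synchronized with file locking. The helper names locked_append() and locked_read() are made up for illustration, and the sketch assumes the BSD-style flock() call (available on Linux and macOS, but not part of POSIX; fcntl() record locks are the POSIX alternative). The locks are advisory, so they only help if every cooperating process takes them.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Sender side: append one message while holding an exclusive lock,
 * so a cooperating reader never sees a half-written record.
 * Returns 0 on success, -1 on error. */
int locked_append(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {   /* blocks until the lock is free */
        close(fd);
        return -1;
    }
    size_t len = strlen(msg);
    ssize_t n = write(fd, msg, len);
    flock(fd, LOCK_UN);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Receiver side: read the file under a shared lock.
 * cap must be at least 1. Returns bytes read, or -1 on error. */
ssize_t locked_read(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_SH) == -1) {
        close(fd);
        return -1;
    }
    ssize_t n = read(fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';                /* make the data printable as a string */
    flock(fd, LOCK_UN);
    close(fd);
    return n;
}
```

The lock answers the "when does the receiver do the read?" question only partially: it prevents reading in the middle of a write, but the receiver still has to decide when to look at the file.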
File IPC Diagram
[Diagram: Process A writes Data into a file; Process B reads it from the file]
Message-Based IPC
Sender formats data into a formal message
With some form of address for receiver
OS delivers message to receiver's message input queue (might signal too)
Receiver (when ready) reads a message from the queue
Sender might or might not block
Message-Based IPC Diagram
[Diagram: data sent from Process A to Process B passes through the OS into B's message queue, where Process B reads it]
Procedure Call IPC
Interprocess communication uses same procedure call interface as intraprocess
Data passed as parameters Information returned via return values
Complicated since destination procedure is in a different address space Generally, calling procedure blocks till call returns
Procedure Call IPC Diagram
[Diagram: main() in Process A issues call(); the data travels as parameters to server() in Process B and comes back to Process A as return values]
Shared Memory IPC
Different processes share a common piece of memory
Either physically or virtually
Communications via normal reads/writes May need semaphores or locks
In or associated with the shared memory
Shared Memory IPC Diagram
[Diagram: Process A executes x = 10 (write variable x); the shared memory holds x: 10; Process B executes print(x) (read variable x)]
Synchronizing in IPC
How do sending and receiving process synchronize their communications? Many possibilities
Based on which process blocks when
Examples that follow in message context, but more generally applicable
Blocking Send, Blocking Receive
Both sender and receiver block Sender blocks till receiver receives Receiver blocks until sender sends Often called message rendezvous
Non-Blocking Send, Blocking Receive
Sender issues send, can proceed without waiting to discover fate of message Receiver waits for message arrival before proceeding
Essentially, receiver is message-driven
Non-Blocking Send, Non-Blocking Receive
Neither party blocks
Sender proceeds after sending message
Receiver works until message arrives
Either receiver periodically checks in non-blocking fashion
Or some form of interrupt delivered
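A sketch of the "receiver periodically checks" variant, using a descriptor put into non-blocking mode with fcntl(). The helper names set_nonblocking() and try_receive() and the -2 "nothing yet" return code are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/* One polling step of a non-blocking receiver.
 * Returns: >0 bytes received, 0 if the sender closed its end,
 *          -2 if nothing has arrived yet, -1 on any other error. */
ssize_t try_receive(int fd, char *buf, size_t cap)
{
    ssize_t n = read(fd, buf, cap);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;               /* no data yet: go do other work */
    return n;
}
```

The receiver would call try_receive() between units of its own work. The cost is wasted checks when no message is pending, which is why the interrupt-style alternative exists.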
Addressing in IPC
How does the sender specify where the data goes? In some cases, the mechanism makes it explicit (e.g., shared memory and RPC) In others, there are options
Direct Addressing
Sender specifies name of the receiving process Using some form of unique process name Receiver can either specify name of expected sender
Or take stuff from anyone
Indirect Addressing
Data is sent to queues, mailboxes, or some other form of shared data structure Receiver performs some form of read operations on that structure Much more flexible than direct addressing
Duality in IPC Mechanisms
Many aspects of IPC mechanisms are duals of each other Which implies that these mechanisms have the same power First recognized in context of messages vs. procedure calls At least, IPC mechanisms can be simulated by each other
So which IPC mechanism to build/choose/use?
Depends on model of computation And on philosophy of user In particular cases, hardware or existing software may make one perform better
Typical UNIX IPC Mechanisms
Different versions of UNIX introduced different IPC mechanisms
Pipes Message queues Semaphores Shared memory Sockets RPC
Pipes
Only IPC mechanism in early UNIX systems (other than files)
Uni-directional Unformatted Uninterpreted Interprocess byte streams
Accessed in file-like way
Only used for parent-child or sibling process pairs
Pipe Details
One process feeds bytes into pipe A second process reads the bytes from it Potentially blocking communication mechanism Requires close cooperation between processes to set up
Named pipes allow more flexibility
Pipes and Blocking
Writing more bytes than pipe capacity blocks the sender
Until the receiver reads some of them
Reading bytes when none are available blocks the receiver
Until the sender writes some
Single pipe can't cause deadlock
Piping in a C program: <stdio.h>
Piping is a process where the output of one process is made the input of another.
We have seen examples of this from the UNIX command line using |
We will now see how we do this from C programs
We will have two (or more) forked processes and will communicate between them
UNIX allows two ways of opening a pipe
popen() -- Formatted Piping
FILE *popen(const char *command, const char *type) -- opens a pipe for I/O, where command is the process that will be connected to the calling process, thus creating the pipe
The type is either "r" for reading or "w" for writing
popen() returns a stream pointer, or NULL on any error
A pipe opened by popen() should always be closed by pclose(FILE *stream)
We use stdio calls such as fread() and fwrite() to communicate with the pipe's stream
Reading Output From an External Program
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{
    FILE *read_fp;
    char buffer[BUFSIZ + 1];
    int chars_read;

    memset(buffer, '\0', sizeof(buffer));
    read_fp = popen("uname -a", "r");     /* run the command, read its output */
    if (read_fp != NULL) {
        chars_read = fread(buffer, sizeof(char), BUFSIZ, read_fp);
        if (chars_read > 0) {
            printf("Output was:-\n%s\n", buffer);
        }
        pclose(read_fp);
        exit(EXIT_SUCCESS);
    }
    exit(EXIT_FAILURE);
}
Sending Output to popen
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{
    FILE *write_fp;
    char buffer[BUFSIZ + 1];

    sprintf(buffer, "Once upon a time, there was \n");
    write_fp = popen("od -c", "w");       /* od -c reads what we write */
    if (write_fp != NULL) {
        fwrite(buffer, sizeof(char), strlen(buffer), write_fp);
        pclose(write_fp);
        exit(EXIT_SUCCESS);
    }
    exit(EXIT_FAILURE);
}
pipe() -- Low level Piping
int pipe(int fd[2]) -- creates a pipe and returns two file descriptors, fd[0] and fd[1]
fd[0] is opened for reading, fd[1] for writing
pipe() returns 0 on success, or -1 on failure with errno set accordingly
The standard programming model is that after the pipe has been set up, two (or more) cooperating processes are created by a fork, and data is passed using read() and write()
Pipes opened with pipe() should be closed with close(int fd)
The pipe Function
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{
    int data_processed;
    int file_pipes[2];
    const char some_data[] = "123";
    char buffer[BUFSIZ + 1];

    memset(buffer, '\0', sizeof(buffer));
    if (pipe(file_pipes) == 0) {
        /* write into one end, read back from the other, in one process */
        data_processed = write(file_pipes[1], some_data, strlen(some_data));
        printf("Wrote %d bytes\n", data_processed);
        data_processed = read(file_pipes[0], buffer, BUFSIZ);
        printf("Read %d bytes: %s\n", data_processed, buffer);
        exit(EXIT_SUCCESS);
    }
    exit(EXIT_FAILURE);
}
Example: Parent writes to a child
int pdes[2];
pipe(pdes);
if (fork() == 0) {            /* child */
    close(pdes[1]);           /* write end not required */
    read(pdes[0], ...);       /* read from parent */
    .....
} else {                      /* parent */
    close(pdes[0]);           /* read end not required */
    write(pdes[1], ...);      /* write to child */
    .....
}
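#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int pdes[2];
char string[128];     /* global, so zero-filled: received text stays terminated */
char string2[] = "This msg is sent to child process";

int main()
{
    if (pipe(pdes) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    if (fork() == 0) {                    /* child: reads from the pipe */
        close(pdes[1]);
        read(pdes[0], string, sizeof(string) - 1);
        printf("The msg received from parent process:\n");
        printf("%s\n", string);
    } else {                              /* parent: writes to the pipe */
        close(pdes[0]);
        write(pdes[1], string2, strlen(string2));
    }
    return 0;
}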
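The shell's | can be approximated by combining pipe(), fork(), dup2() and execvp(). The sketch below is an illustration under assumptions, not the way popen() is actually implemented; run_and_capture() is a made-up name, and it captures at most cap - 1 bytes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its standard output connected to a pipe and copy
 * its output into out as a string. Returns bytes captured, or -1. */
ssize_t run_and_capture(char *const argv[], char *out, size_t cap)
{
    int fd[2];
    if (cap == 0 || pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                       /* child: the writer */
        close(fd[0]);                     /* read end not required */
        dup2(fd[1], STDOUT_FILENO);       /* stdout now feeds the pipe */
        close(fd[1]);
        execvp(argv[0], argv);
        _exit(127);                       /* reached only if exec failed */
    }

    /* Parent: the reader. It must close the write end, otherwise
     * read() would never report end-of-file. */
    close(fd[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fd[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);                /* reap the child */
    return (ssize_t)total;
}
```

For example, passing the argument vector { "echo", "hello", NULL } should leave the text hello followed by a newline in out.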
Named Pipes: FIFOs
So far, we have only been able to pass data between programs that are related, i.e. programs that have been started from a common ancestor process. We would like unrelated processes to be able to exchange data We do this with FIFOs, often referred to as named pipes A named pipe is a special type of file that exists as a name in the file system, but behaves like the unnamed pipes
We can create named pipes from the command line
$ mkfifo filename
E.g.
$ mkfifo fpipe
$ grep .c < fpipe &
$ ls > fpipe
From inside a program, we can use the following call:
#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *filename, mode_t mode);
Creating a Named Pipe
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

int main()
{
    int res = mkfifo("/tmp/my_fifo", 0777);
    if (res == 0)
        printf("FIFO created\n");
    exit(EXIT_SUCCESS);
}
We can look for the pipe in terminal with:
$ ls -lF /tmp/my_fifo
prwxr-xr-x 1 macpbook wheel 0 2 Apr 16:32 /tmp/my_fifo|
1. First, let's try reading the (empty) FIFO (the command blocks until a writer appears)
$ cat < /tmp/my_fifo
2. Now try writing to the FIFO (the command blocks until a reader appears)
$ echo abcdefg > /tmp/my_fifo
3. If we do both at once, we can pass information through the pipe:
$ mkfifo /tmp/my_fifo
$ cat < /tmp/my_fifo &
[1] 1316
$ echo abcdefg > /tmp/my_fifo
abcdefg
[1]+ Done cat < /tmp/my_fifo
Opening a FIFO with open
The main restriction on opening FIFOs is that a program may not open a FIFO for reading and writing with the mode O_RDWR
If we do wish to pass data in both directions between programs, it's much better to use either a pair of FIFOs or pipes, one for each direction
Or (unusually) explicitly change the direction of the data flow by closing and re-opening the FIFO
Opening a FIFO with open
There are four legal combinations of O_RDONLY, O_WRONLY and the O_NONBLOCK flag
A read on an empty blocking FIFO (i.e. one not opened with O_NONBLOCK) will wait until some data can be read
A write on a full FIFO will wait until the data can be written
open(const char *path, O_RDONLY);
In this case, the open call will block, i.e. not return until a process opens the same FIFO for writing
open(const char *path, O_RDONLY | O_NONBLOCK);
The open call will now succeed and return immediately, even if the FIFO has not been opened for writing by any process
open(const char *path, O_WRONLY);
In this case, the open call will block until a process opens the same FIFO for reading
open(const char *path, O_WRONLY | O_NONBLOCK);
This will always return immediately, but if no process has the FIFO open for reading, open will return an error, -1, and the FIFO won't be opened. If a process does have the FIFO open for reading, the file descriptor returned can be used for writing to the FIFO
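The O_WRONLY | O_NONBLOCK case can be used to probe whether a reader is present. A small sketch follows; fifo_has_reader() is a hypothetical helper, and it relies on open() failing with errno set to ENXIO when no process has the FIFO open for reading.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if some process has the FIFO open for reading,
 * 0 if no reader is present, -1 on any other error. */
int fifo_has_reader(const char *path)
{
    int fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd == -1)
        return (errno == ENXIO) ? 0 : -1;   /* ENXIO: nobody is reading */
    close(fd);                              /* a reader exists; we only probed */
    return 1;
}
```

A writer that must not hang can call this first, or simply attempt the non-blocking open itself and retry later on ENXIO.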
Inter-process Communication with FIFOs
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>

#define FIFO_NAME "/tmp/my_fifo"
#define BUFFER_SIZE PIPE_BUF
#define TEN_MEG (1024*1024*10)

/* Producer: writes ten megabytes into the FIFO. */
int main()
{
    int pipe_fd;
    int res;
    int open_mode = O_WRONLY;
    int bytes_sent = 0;
    char buffer[BUFFER_SIZE + 1];

    if (access(FIFO_NAME, F_OK) == -1) {
        res = mkfifo(FIFO_NAME, 0777);
        if (res != 0) {
            fprintf(stderr, "Could not create fifo %s\n", FIFO_NAME);
            exit(EXIT_FAILURE);
        }
    }
    printf("Process %d opening FIFO O_WRONLY\n", getpid());
    pipe_fd = open(FIFO_NAME, open_mode);     /* blocks until a reader opens */
    printf("Process %d result %d\n", getpid(), pipe_fd);

    if (pipe_fd != -1) {
        while (bytes_sent < TEN_MEG) {
            res = write(pipe_fd, buffer, BUFFER_SIZE);
            if (res == -1) {
                fprintf(stderr, "Write error on pipe\n");
                exit(EXIT_FAILURE);
            }
            bytes_sent += res;
        }
        (void) close(pipe_fd);
    } else {
        exit(EXIT_FAILURE);
    }
    printf("Process %d finished\n", getpid());
    exit(EXIT_SUCCESS);
}
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>

#define FIFO_NAME "/tmp/my_fifo"
#define BUFFER_SIZE PIPE_BUF

/* Consumer: reads from the FIFO until the writer closes it. */
int main()
{
    int pipe_fd;
    int res;
    int open_mode = O_RDONLY;
    char buffer[BUFFER_SIZE + 1];
    int bytes_read = 0;

    memset(buffer, '\0', sizeof(buffer));
    printf("Process %d opening FIFO O_RDONLY\n", getpid());
    pipe_fd = open(FIFO_NAME, open_mode);     /* blocks until a writer opens */
    printf("Process %d result %d\n", getpid(), pipe_fd);

    if (pipe_fd != -1) {
        do {
            res = read(pipe_fd, buffer, BUFFER_SIZE);
            if (res > 0)                      /* do not count errors or EOF */
                bytes_read += res;
        } while (res > 0);
        (void) close(pipe_fd);
    } else {
        exit(EXIT_FAILURE);
    }
    printf("Process %d finished, %d bytes read\n", getpid(), bytes_read);
    exit(EXIT_SUCCESS);
}
UNIX Message Queues
Introduced in System V Release 3 UNIX
Like pipes, but data organized into messages
Message components include:
Type identifier Length Data
What Are Message Queues
A message queue is a queue onto which messages can be placed A message is composed of a message type (which is a number), and message data A message queue can be either private, or public
If it is private, it can be accessed only by its creating process or child processes of that creator
Creating A Message Queue - msgget()
This system call accepts two parameters - a queue key, and flags
The key may be one of:
IPC_PRIVATE - used to create a private message queue.
a positive integer - used to create (or access) a publicly accessible message queue
The second parameter contains flags that control how the system call is to be processed
It may contain flags like IPC_CREAT or IPC_EXCL
The lowest 9 bits of the flags are used to define access permission for the queue, much like similar 9 bits are used to control access to files
The bits are separated into 3 groups - user, group and others. In each set, the first bit refers to read permission, the second bit to write permission, and the third bit is ignored
#include <stdio.h>      /* standard I/O routines. */
#include <stdlib.h>     /* exit(). */
#include <sys/types.h>  /* standard system data types. */
#include <sys/ipc.h>    /* common system V IPC structures. */
#include <sys/msg.h>    /* message-queue specific functions. */

/* create a private message queue, with access only to the owner. */
int queue_id = msgget(IPC_PRIVATE, 0600); /* <-- this is an octal number. */
if (queue_id == -1) {
    perror("msgget");
    exit(1);
}
Since the permission bits are '0600', only processes run on behalf of this user will have access to the queue.
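For a publicly accessible queue, unrelated processes have to agree on the key. One common way to get such a key is ftok(), which derives it from an existing path plus a small project id. The slides do not mention ftok(), so treat this as an optional sketch; open_public_queue() is a made-up name.

```c
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Open the queue identified by (path, proj_id), creating it if it
 * does not exist yet. Every process that passes the same existing
 * path and the same proj_id gets the same queue id.
 * Returns the queue id, or -1 on error. */
int open_public_queue(const char *path, int proj_id)
{
    key_t key = ftok(path, proj_id);      /* path must name an existing file */
    if (key == (key_t)-1)
        return -1;
    /* IPC_CREAT without IPC_EXCL: create the queue or attach to it. */
    return msgget(key, IPC_CREAT | 0600);
}
```

Adding IPC_EXCL to the flags would instead make msgget() fail when the queue already exists, which is useful when exactly one process is supposed to create it.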
The Message Structure - struct msgbuf
struct msgbuf {
    long mtype;     /* message type, a positive number (cannot be zero). */
    char mtext[1];  /* message body array. usually larger than one byte. */
};
/* first, define the message string */
char* msg_text = "hello world";
/* allocate a message with enough space for length of string and */
/* one extra byte for the terminating null character. */
struct msgbuf* msg =
    (struct msgbuf*)malloc(sizeof(struct msgbuf) + strlen(msg_text));
/* set the message type. for example - set it to '1'. */
msg->mtype = 1;
/* finally, place the "hello world" string inside the message. */
strcpy(msg->mtext, msg_text);
When allocating space for a string, one always needs to allocate one extra byte for the null character terminating the string. In our case, we allocated strlen(msg_text) more than the size of "struct msgbuf", and didn't need to allocate an extra place for the null character, because that's already contained in the msgbuf structure (the 1 byte of mtext there).
We don't need to place only text messages in a message. We may also place binary data. In that case, we could allocate space as large as the msgbuf struct plus the size of our binary data, minus one byte. Of course, to copy the data to the message we would then use a function such as memcpy(), and not strcpy().
Writing Messages Onto A Queue - msgsnd()
Once we have created the message queue and a message structure, we can place the message on the queue using the msgsnd() system call
It takes the following parameters:
int msqid - id of message queue, as returned from the msgget() call.
struct msgbuf* msg - a pointer to a properly initialized message structure.
int msgsz - the size of the data part (mtext) of the message, in bytes.
int msgflg - flags specifying how to send the message.
So in order to send our message on the queue, we'll use msgsnd() like this:
int rc = msgsnd(queue_id, msg, strlen(msg_text)+1, 0);
if (rc == -1) {
    perror("msgsnd");
    exit(1);
}
Note that we used a message size one larger than the length of the string, since we're also sending the null character.
Reading A Message From The Queue - msgrcv()
This system call accepts the following list of parameters:
int msqid - id of the queue, as returned from msgget().
struct msgbuf* msg - a pointer to a pre-allocated msgbuf structure. It should generally be large enough to contain a message with some arbitrary data (see more below).
int msgsz - size of largest message text we wish to receive. Must NOT be larger than the amount of space we allocated for the message text in 'msg'.
int msgtyp - type of message we wish to read. May be one of:
0 - the first message on the queue will be returned.
a positive integer - the first message on the queue whose type (mtype) equals this integer (unless a certain flag is set in msgflg, see below).
a negative integer - the first message on the queue with the lowest type that is less than or equal to the absolute value of this integer.
int msgflg - a logical 'or' combination of any of the following flags:
IPC_NOWAIT - if there is no message on the queue matching what we want to read, return '-1' immediately instead of blocking (errno is set to ENOMSG).
MSG_EXCEPT - if the message type parameter is a positive integer, then return the first message whose type is NOT equal to the given integer.
MSG_NOERROR - if a message with a text part larger than 'msgsz' matches what we want to read, then truncate the text when copying the message to our msgbuf structure. If this flag is not set and the message text is too large, the system call returns '-1' (errno is set to E2BIG) and the message stays on the queue.
/* prepare a message structure large enough to read our "hello world". */
struct msgbuf* recv_msg =
    (struct msgbuf*)malloc(sizeof(struct msgbuf)+strlen("hello world"));
/* use msgrcv() to read the message. We agree to get any type, and thus */
/* use '0' in the message type parameter, and use no flags (0). */
int rc = msgrcv(queue_id, recv_msg, strlen("hello world")+1, 0, 0);
if (rc == -1) {
    perror("msgrcv");
    exit(1);
}
If the message on the queue was larger than the size of "hello world" (plus one), we would get an error, and thus exit. If there was no message on the queue, the msgrcv() call would have blocked our process until one of the following happens:
a suitable message was placed on the queue.
the queue was removed (and then errno would be set to EIDRM).
our process received a signal (and then errno would be set to EINTR).
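The pieces above can be combined into a small sketch that sends two messages of different types and then picks one out by type. Instead of malloc'ing a struct msgbuf, it uses the common idiom of declaring an application struct with a fixed-size text field. The names demo_msg, DEMO_TEXT_MAX, send_text() and recv_text() are assumptions made for this example.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define DEMO_TEXT_MAX 64

/* Same layout idea as struct msgbuf: a long type, then the data. */
struct demo_msg {
    long mtype;
    char mtext[DEMO_TEXT_MAX];
};

/* Send a null-terminated string with the given (positive) type.
 * Returns 0 on success, -1 on error. */
int send_text(int qid, long type, const char *text)
{
    struct demo_msg m;
    m.mtype = type;
    strncpy(m.mtext, text, DEMO_TEXT_MAX - 1);
    m.mtext[DEMO_TEXT_MAX - 1] = '\0';
    /* size counts only mtext, including the terminating null */
    return msgsnd(qid, &m, strlen(m.mtext) + 1, 0);
}

/* Receive without blocking. type follows the msgtyp rules:
 * 0 = any message, >0 = first message of exactly that type.
 * out must have room for DEMO_TEXT_MAX bytes.
 * Returns bytes received, or -1 (for example when nothing matches). */
ssize_t recv_text(int qid, long type, char *out)
{
    struct demo_msg m;
    ssize_t n = msgrcv(qid, &m, sizeof m.mtext, type, IPC_NOWAIT);
    if (n > 0)
        memcpy(out, m.mtext, (size_t)n);
    return n;
}
```

If "first" is sent with type 1 and "second" with type 2, then recv_text(qid, 2, out) skips over the older message and returns "second", which is what distinguishes a message queue from a plain byte-stream pipe. A queue that is no longer needed should be removed with msgctl(qid, IPC_RMID, NULL), since it otherwise outlives the process.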
Process Synchronization With Semaphores
What Is A Semaphore?
A semaphore is a resource that contains an integer value, and allows processes to synchronize by testing and setting this value in a single atomic operation This means that the process that tests the value of a semaphore and sets it to a different value (based on the test), is guaranteed no other process will interfere with the operation in the middle.
Two types of operations can be carried out on a semaphore:
wait and signal.
A wait operation first checks whether the semaphore's value is large enough (at least 1, for a simple wait).
If it is, it decreases the value and returns. If it is not, the operation blocks the calling process until the semaphore's value reaches the desired value.
A signal operation increments the value of the semaphore, possibly awakening one or more processes that are waiting on the semaphore
A semaphore set is a structure that stores a group of semaphores together, and possibly allows the process to commit a transaction on part or all of the semaphores in the set together.
Creating A Semaphore Set - semget()
Similarly to the creation of message queues, we supply some ID for the set, and some flags (used to define access permission mode and a few options). We also supply the number of semaphores we want to have in the given set
/* ID of the semaphore set. */
int sem_set_id_1;
int sem_set_id_2;

/* create a private semaphore set with one semaphore in it, */
/* with access only to the owner. */
sem_set_id_1 = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
if (sem_set_id_1 == -1) {
    perror("main: semget");
    exit(1);
}

/* create a semaphore set with ID 250, three semaphores */
/* in the set, with access only to the owner. */
sem_set_id_2 = semget(250, 3, IPC_CREAT | 0600);
if (sem_set_id_2 == -1) {
    perror("main: semget");
    exit(1);
}
Setting And Getting Semaphore Values With semctl()
/* use this to store return values of system calls. */
int rc;
/* semaphore value, for semctl(). on Linux the program has to */
/* define union semun itself. */
union semun sem_val;

/* initialize the first semaphore in our set to '3'. */
sem_val.val = 3;
rc = semctl(sem_set_id_2, 0, SETVAL, sem_val);
if (rc == -1) {
    perror("main: semctl");
    exit(1);
}

/* initialize the second semaphore in our set to '6'. */
sem_val.val = 6;
rc = semctl(sem_set_id_2, 1, SETVAL, sem_val);
if (rc == -1) {
    perror("main: semctl");
    exit(1);
}

/* initialize the third semaphore in our set to '0'. */
sem_val.val = 0;
rc = semctl(sem_set_id_2, 2, SETVAL, sem_val);
if (rc == -1) {
    perror("main: semctl");
    exit(1);
}
Using Semaphores For Mutual Exclusion With semop()
Sometimes we have a resource that we want to allow only one process at a time to manipulate
Example: File operations
/* sem_set_id: id of a semaphore set whose single semaphore was */
/* initialized to '1' (resource free). */

/* this function updates the contents of the file with the given path name. */
void update_file(char* file_path, int number)
{
    /* structure for semaphore operations. */
    struct sembuf sem_op;
    FILE* file;

    /* wait on the semaphore, unless its value is positive. */
    sem_op.sem_num = 0;
    sem_op.sem_op = -1;   /* <-- Comment 1 */
    sem_op.sem_flg = 0;
    semop(sem_set_id, &sem_op, 1);

    /* Comment 2 */
    /* we "locked" the semaphore, and are assured exclusive access to file. */
    /* manipulate the file in some way. for example, write a number into it. */
    file = fopen(file_path, "w");
    if (file) {
        fprintf(file, "%d\n", number);
        fclose(file);
    }

    /* finally, signal the semaphore - increase its value by one. */
    sem_op.sem_num = 0;
    sem_op.sem_op = 1;    /* <-- Comment 3 */
    sem_op.sem_flg = 0;
    semop(sem_set_id, &sem_op, 1);
}
Comment 1 - before we access the file, we use semop() to wait on the semaphore. Supplying '-1' in sem_op.sem_op means: if the value of the semaphore is greater than or equal to '1', decrease this value by one and return to the caller. Otherwise (the value is 0), block the calling process until the value of the semaphore becomes '1', at which point we decrease it and return to the caller.
Comment 2 - the semantics of semop() assure us that when we return from this function, the value of the semaphore is 0. Why? It couldn't be less, because a semaphore's value never drops below zero (semop() blocks instead). It couldn't be more, because the semaphore starts at '1' and every process increases it by one only after having decreased it by one.
Comment 3 - after we are done manipulating the file, we increase the value of the semaphore by 1, possibly waking up a process waiting on the semaphore. If several processes are waiting on the semaphore, one of them is awakened and continues its execution.
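The wait/signal pair can be wrapped into lock-style helpers. The sketch below also shows a non-blocking "try" variant using IPC_NOWAIT, which makes the "value is 0 while locked" reasoning observable. All names (demo_semun, mutex_create() and so on) are hypothetical; a private union is declared so the code does not depend on whether the platform headers provide union semun.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* Argument type for semctl(); same layout as the usual union semun. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set with one semaphore initialized to 1 (unlocked).
 * Returns the set id, or -1 on error. */
int mutex_create(void)
{
    union demo_semun arg;
    int id = semget(IPC_PRIVATE, 1, 0600);
    if (id == -1)
        return -1;
    arg.val = 1;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* Apply one operation to semaphore 0 of the set. */
static int mutex_op(int id, short delta, short flags)
{
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = flags;
    return semop(id, &op, 1);
}

int mutex_lock(int id)    { return mutex_op(id, -1, 0); }          /* wait   */
int mutex_trylock(int id) { return mutex_op(id, -1, IPC_NOWAIT); } /* no block */
int mutex_unlock(int id)  { return mutex_op(id, 1, 0); }           /* signal */
int mutex_destroy(int id) { return semctl(id, 0, IPC_RMID); }
```

After one successful mutex_trylock(), a second mutex_trylock() on the same set fails with -1 rather than blocking, because the value is already 0. Real code might also pass SEM_UNDO so the kernel releases the lock if the holder dies.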
Using Semaphores For Producer-Consumer Operations With semop()
/* this variable will contain the semaphore set. */
int sem_set_id;
/* semaphore value, for semctl(). */
union semun sem_val;
/* structure for semaphore operations. */
struct sembuf sem_op;

/* first we create a semaphore set with a single semaphore, */
/* whose counter is initialized to '0'. */
sem_set_id = semget(IPC_PRIVATE, 1, 0600);
if (sem_set_id == -1) {
    perror("semget");
    exit(1);
}
sem_val.val = 0;
semctl(sem_set_id, 0, SETVAL, sem_val);

/* we now do some producing function, and then signal the */
/* semaphore, increasing its counter by one. */
. .
sem_op.sem_num = 0;
sem_op.sem_op = 1;
sem_op.sem_flg = 0;
semop(sem_set_id, &sem_op, 1);
. . .
/* meanwhile, in a different process, we try to consume the */
/* resource protected (and counted) by the semaphore. */
/* we block on the semaphore, unless its value is positive. */
sem_op.sem_num = 0;
sem_op.sem_op = -1;
sem_op.sem_flg = 0;
semop(sem_set_id, &sem_op, 1);
/* when we get here, it means that the semaphore's value was '1' */
/* or more before our wait, so there's something to consume. */
. .
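To see the semaphore acting as an item counter, the current value can be read back with semctl() and GETVAL. The following single-process sketch uses made-up helper names and a non-blocking consume step (IPC_NOWAIT) so that it can be tried without a second process; a real consumer would normally block.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* Argument type for semctl(); same layout as the usual union semun. */
union count_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private one-semaphore set whose counter starts at 0
 * (nothing produced yet). Returns the set id, or -1 on error. */
int counter_create(void)
{
    union count_semun arg;
    int id = semget(IPC_PRIVATE, 1, 0600);
    if (id == -1)
        return -1;
    arg.val = 0;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* Producer side: signal, i.e. announce one more item. */
int produce(int id)
{
    struct sembuf op = { 0, 1, 0 };
    return semop(id, &op, 1);
}

/* Consumer side: take one item, or fail with -1 if none is available. */
int consume_nowait(int id)
{
    struct sembuf op = { 0, -1, IPC_NOWAIT };
    return semop(id, &op, 1);
}

/* Number of items produced but not yet consumed. */
int items_available(int id)
{
    return semctl(id, 0, GETVAL);
}
```

Two calls to produce() followed by one consume_nowait() leave items_available() at 1. The set should be removed with semctl(id, 0, IPC_RMID) when it is no longer needed.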