fork() and exec() versus Windows' CreateProcess()
while (1) {
    write (1, "$ ", 2);                 // 1 = STDOUT_FILENO
    readcommand (0, command, args);     // parse user input, 0 = STDIN_FILENO
    if ((pid = fork ()) == 0) {         // child?
        exec (command, args, 0);
    } else if (pid > 0) {               // parent?
        wait (0);                       // wait for child to terminate
    } else {
        perror ("Failed to fork");      // perror appends the errno text and a newline
    }
}
System calls: read, write, fork, exec, wait.
Conventions: a return value of -1 signals an error, the error code is stored in errno, and perror prints a descriptive error message based on errno.
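A minimal sketch of these conventions, using open() as the failing call. The helper name try_open is hypothetical, chosen only for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to open a file read-only.
 * Returns 0 on success, or the errno value describing the failure. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {          /* -1 signals an error */
        int saved = errno;   /* save errno: later calls may overwrite it */
        perror(path);        /* prints "<path>: <message for errno>" on stderr */
        return saved;
    }
    close(fd);
    return 0;
}
```

Calling try_open on a path that does not exist should return ENOENT and print a "No such file or directory" message.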
The split of process creation into fork and exec turns out to have been an inspired choice, though that might not have been clear at the time; see today's assigned paper.
$ ls
read (0, buf, bufsize)
write (1, "hello\n", strlen("hello\n"))
fcntl(fd, F_SETFD, FD_CLOEXEC)
$ ls > tmp1
Just before exec, insert:
close(1); creat("tmp1", 0666); // fd will be 1
The kernel always uses the first free file descriptor, 1 in this case.
Could also use dup2() to clone a file descriptor to a chosen number.
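The redirection can be sketched with dup2() instead of relying on the lowest-free-descriptor rule. This is an illustration only: the function name run_echo_redirected is hypothetical, and echo stands in for whatever command the shell would run.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `echo hello` with stdout redirected to `path`, like `echo hello > path`.
 * Returns the child's exit status, or -1 on failure. */
int run_echo_redirected(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child: rearrange fds, then exec */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
            _exit(127);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)   /* fd 1 now refers to the file */
                _exit(127);
            close(fd);                    /* the original number is no longer needed */
        }
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                       /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the descriptors are rearranged in the child between fork and exec, the parent's own standard output is untouched and echo needs no knowledge of the redirection.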
$ sh < script > tmp1
If, for example, the file script contains
echo one
echo two
FD inheritance makes this work well.
$ ls f1 f2 nonexistant-f3 > tmp1 2> tmp1
After creat, insert:
close(2); creat("tmp1", 0666); // fd will be 2
Why is this bad? Illustrate what's going on with file descriptors. Better:
close(2); dup(1); // fd will be 2
Or, in Bourne shell syntax,
$ ls f1 f2 nonexistant-f3 > tmp1 2>&1
Read Chapter 3 of Advanced Programming in the UNIX Environment by W. Richard Stevens for a detailed understanding of how file descriptors are implemented. In particular, read Section 3.10 to understand how file sharing works.
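One way to see the difference between the two approaches is to write through two descriptors and inspect the file afterwards. The sketch below is only an illustration; the function name two_writers and the strings written are made up for the example. Two separate creat() calls give two independent file offsets, so the second writer overwrites the first. A dup()'d descriptor shares one offset, so the writes are appended in order.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write "AAAA" through one descriptor and "BB" through a second one,
 * then read the file back into `out` (NUL-terminated).
 * share_offset != 0: the second descriptor is dup() of the first.
 * share_offset == 0: the second descriptor comes from a second creat().
 * Returns the number of bytes in the file, or -1 on error. */
ssize_t two_writers(const char *path, int share_offset, char *out, size_t outsz)
{
    int fd1 = creat(path, 0666);
    if (fd1 < 0)
        return -1;
    int fd2 = share_offset ? dup(fd1) : creat(path, 0666);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    /* Both descriptors start at offset 0; only dup() makes them share it. */
    int ok = write(fd1, "AAAA", 4) == 4 && write(fd2, "BB", 2) == 2;
    close(fd1);
    close(fd2);
    if (!ok)
        return -1;

    int rfd = open(path, O_RDONLY);
    if (rfd < 0)
        return -1;
    ssize_t n = read(rfd, out, outsz - 1);
    close(rfd);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

With two creat() calls the file ends up as "BBAA"; with dup() it ends up as "AAAABB". The same effect is what garbles tmp1 when ls writes its listing and its error messages through independently opened descriptors.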
$ sort < file.txt > tmp1
$ uniq tmp1 > tmp2
$ wc tmp2
$ rm tmp1 tmp2
can be more concisely done as:
$ sort < file.txt | uniq | wc
int fdarray[2];
char buf[512];
int n;

pipe(fdarray);
write(fdarray[1], "hello", 5);
n = read(fdarray[0], buf, sizeof(buf));
// buf[] now contains 'h', 'e', 'l', 'l', 'o'
File descriptors are inherited across fork(), so this also works:
int fdarray[2];
char buf[512];
int n, pid;

pipe(fdarray);
pid = fork();
if(pid > 0){
    write(fdarray[1], "hello", 5);
} else {
    n = read(fdarray[0], buf, sizeof(buf));
}
Using the fork() we already have, to set up a pipe:
int fdarray[2];
int pid, tmp;

if (pipe(fdarray) < 0) panic ("error");
if ((pid = fork ()) == 0) {         // child (left end of pipe)
    close (1);
    tmp = dup (fdarray[1]);         // fdarray[1] is the write end, tmp will be 1
    close (fdarray[0]);             // close read end
    close (fdarray[1]);             // close fdarray[1]
    exec (command1, args1, 0);
} else if (pid > 0) {               // parent (right end of pipe)
    close (0);
    tmp = dup (fdarray[0]);         // fdarray[0] is the read end, tmp will be 0
    close (fdarray[0]);
    close (fdarray[1]);             // close write end
    exec (command2, args2, 0);
} else {
    printf ("Unable to fork\n");
}
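A runnable variant of the same plumbing, as a sketch: the child becomes the left end of the pipe and execs echo, while the parent reads the right end directly instead of exec'ing a second command. The function name capture_echo and the choice of echo are assumptions made for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `echo hello` with its stdout connected to a pipe and collect the
 * output in buf (NUL-terminated). Returns bytes read, or -1 on error. */
ssize_t capture_echo(char *buf, size_t bufsz)
{
    int fdarray[2];
    if (pipe(fdarray) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fdarray[0]);
        close(fdarray[1]);
        return -1;
    }
    if (pid == 0) {                         /* child: left end of the pipe */
        close(fdarray[0]);                  /* child does not read */
        if (fdarray[1] != STDOUT_FILENO) {
            dup2(fdarray[1], STDOUT_FILENO);    /* stdout now feeds the pipe */
            close(fdarray[1]);
        }
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                         /* reached only if exec failed */
    }

    /* parent: right end of the pipe */
    close(fdarray[1]);      /* must close the write end, or read never sees EOF */
    size_t total = 0;
    ssize_t n;
    while (total < bufsz - 1 &&
           (n = read(fdarray[0], buf + total, bufsz - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fdarray[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Closing the unused pipe ends matters: a reader only sees end-of-file once every copy of the write end, in every process, has been closed.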
$ compute &
Figure on process address space: code, static data, stack, heap. On fork, the whole address space gets replicated. On thread create, the created thread has a different program counter, registers, and stack (through stack pointer). Everything else is shared between threads.
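The difference between replicating and sharing an address space can be observed with a single global variable. This is a sketch using POSIX threads as the thread facility; the names counter, counter_after_fork, and counter_after_thread are hypothetical.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;            /* lives in the static data region */

static void *bump(void *arg)
{
    (void)arg;
    counter = counter + 1;         /* same memory as the creating thread */
    return NULL;
}

/* A forked child increments its own copy; returns what the parent sees. */
int counter_after_fork(void)
{
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = counter + 1;     /* modifies the child's replica only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return counter;
}

/* A thread increments the shared variable; returns what the creator sees. */
int counter_after_thread(void)
{
    pthread_t t;
    counter = 0;
    if (pthread_create(&t, NULL, bump, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return counter;
}
```

After the fork the parent still sees 0, while after the thread finishes the creator sees 1. Older toolchains may need the -pthread flag when compiling.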
Kernel-level threads are just processes minus separate address spaces. Discuss the kernel scheduler which is invoked at every timer interrupt. Each thread is an independent entity for the kernel.
Write the cswitch function for processes and threads. Notice that switching among threads requires no privileged operations. Switching the stack can be done by switching the sp register. A cswitch needs to be fast (typically a few hundred microseconds).
Advantages of threads over processes
User-level threads can be implemented inside a process by writing scheduler() and cswitch() functions. The scheduler can be called periodically using the SIGALRM signal.
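A minimal sketch of user-level threads, assuming the ucontext facility (getcontext, makecontext, swapcontext) is available and using swapcontext in the role of cswitch(). It is cooperative: each thread yields explicitly, and the SIGALRM-driven preemption mentioned above is left out. All names (run_threads, worker, trace, and so on) are hypothetical.

```c
#include <ucontext.h>

#define NTHREADS 2
#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;                 /* context of the scheduler loop */
static ucontext_t ctx[NTHREADS];            /* saved registers of each thread */
static char stacks[NTHREADS][STACK_SIZE];   /* one private stack per thread */
static int done[NTHREADS];
static int current;
static char trace[64];                      /* records which thread ran when */
static int trace_len;

/* Switch from the running thread back to the scheduler. */
static void yield_to_scheduler(void)
{
    swapcontext(&ctx[current], &main_ctx);
}

static void worker(int id)
{
    for (int step = 0; step < 3; step++) {
        trace[trace_len++] = (char)('A' + id);
        yield_to_scheduler();
    }
    done[id] = 1;       /* returning resumes uc_link, i.e. the scheduler */
}

/* Round-robin scheduler: runs every thread to completion, returns the trace. */
const char *run_threads(void)
{
    trace_len = 0;
    for (int i = 0; i < NTHREADS; i++) {
        done[i] = 0;
        getcontext(&ctx[i]);
        ctx[i].uc_stack.ss_sp = stacks[i];
        ctx[i].uc_stack.ss_size = STACK_SIZE;
        ctx[i].uc_link = &main_ctx;
        makecontext(&ctx[i], (void (*)(void))worker, 1, i);
    }
    int remaining = NTHREADS;
    while (remaining > 0) {
        for (int i = 0; i < NTHREADS; i++) {
            if (done[i])
                continue;
            current = i;
            swapcontext(&main_ctx, &ctx[i]);    /* cswitch into thread i */
            if (done[i])
                remaining--;
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

With two threads that each run three steps, the trace is "ABABAB": the scheduler alternates between them at every yield.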
Pros of user-level threads:
Cons of user-level threads:
Threading models (slide)
Thread pools (slide)
hits
shared code: hits = hits + 1
Assume the shared code gets compiled into the following assembly code:
ld [hits], R
add R, R, 1
st R, [hits]
hits represents a memory location, and [hits] represents the contents of that memory location. ld [hits], R loads the contents of hits into register R.
If two threads try to execute this code simultaneously, what can happen? Assume hits=10 before starting these threads. What interleavings cause hits=12? What interleavings cause hits=11?
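The interleavings can be replayed deterministically with a small simulation. This is a sketch, not real concurrency: each simulated thread has its own register R, and a schedule string of '0' and '1' characters (a notation invented for this example) names which thread executes its next instruction.

```c
/* One simulated thread: its private register R and how many of the
 * three instructions it has executed so far. */
struct thread {
    int R;
    int pc;
};

/* Execute the next instruction of thread t against the shared location. */
static void step(struct thread *t, int *hits)
{
    switch (t->pc) {
    case 0: t->R = *hits;    break;   /* ld  [hits], R */
    case 1: t->R = t->R + 1; break;   /* add R, R, 1   */
    case 2: *hits = t->R;    break;   /* st  R, [hits] */
    }
    t->pc++;
}

/* Run a schedule made only of '0' and '1' characters, three of each.
 * Returns the final value of hits, starting from 10. */
int run_schedule(const char *schedule)
{
    int hits = 10;
    struct thread th[2] = { {0, 0}, {0, 0} };
    for (const char *p = schedule; *p; p++)
        step(&th[*p - '0'], &hits);
    return hits;
}
```

The schedule "000111" runs the threads one after the other and yields 12. The schedule "010101" lets both threads load 10 before either stores, so both store 11 and one increment is lost.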
This is a problem associated with concurrent access to a shared data region, in short a concurrency problem. It is also called a race condition (perhaps because it is a race between two threads and the result depends on who wins).
Concurrency could be "logical concurrency" on a single CPU due to possibly arbitrary interruptions (or preemptions) caused by the timer interrupt, or could be "physical concurrency" due to threads executing simultaneously on different CPUs. The nature of the problem in both cases is identical.
Another example:
Thread A:
i = 0;
while (i < 10)
    i = i + 1;
print "A won!";
Thread B:
i = 0;
while (i > -10)
    i = i - 1;
print "B won!";
Who wins? Guaranteed that someone wins? What if both threads run on identical speed CPU executing in parallel? (guaranteed to go on forever?)
Another example
push(v):
    if (n == stack_size)
        return full;
    stack[n] = v;
    n = n + 1;
Some bad schedules? Some that will work?
One way to deal with the problem is to enforce atomicity of certain code regions. In the hits example, we wanted the three instructions to execute atomically with respect to other executions of the same three instructions. The code region that needs to be made atomic for the program to be correct is also called the critical section.
Locks are one way of implementing atomic regions. A lock L is a shared variable that provides two methods, acquire(L) and release(L). Any lock L can be held by at most one thread at a time. Another thread can acquire it only after its holder has released it. Any thread trying to acquire a lock that is already held by another thread must wait for it to be released.
Locks allow implementation of atomic regions by bracketing the critical section with calls to acquire() and release(). For our hits example, the new code will be:
lock hit_lock; // shared lock
...
acquire(hit_lock);
hits = hits + 1;
release(hit_lock);
What happens when one thread is inside the critical section and another thread tries to enter it at the same time? What are the possible interleavings now? Are all possible interleavings correct?
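The same bracketing can be sketched with a POSIX mutex standing in for the abstract lock, where pthread_mutex_lock plays the role of acquire() and pthread_mutex_unlock the role of release(). The iteration count and the function names are assumptions made for the example.

```c
#include <pthread.h>

#define INCREMENTS 100000

static long hits = 0;
static pthread_mutex_t hit_lock = PTHREAD_MUTEX_INITIALIZER;  /* shared lock */

static void *count_hits(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&hit_lock);      /* acquire(hit_lock) */
        hits = hits + 1;                    /* critical section */
        pthread_mutex_unlock(&hit_lock);    /* release(hit_lock) */
    }
    return NULL;
}

/* Run two threads that each increment hits INCREMENTS times.
 * Returns the final value of hits, or -1 if a thread could not be created. */
long run_two_counters(void)
{
    pthread_t a, b;
    hits = 0;
    if (pthread_create(&a, NULL, count_hits, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, count_hits, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return hits;
}
```

With the lock in place, the load, add, and store of one thread can no longer interleave with those of the other, so the result is always 2 * INCREMENTS. Removing the lock and unlock calls typically produces a smaller total on a multiprocessor.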