Unix I/O, Sockets
Demo programs:
TCP/IP Socket programming
Sockets — the Unix standard API for network I/O
Windows has something almost identical, called “winsock”
File descriptors
A file descriptor is an integer value naming an underlying open file resource.
File descriptors are used in all Unix system calls which do I/O, or other operations on files/streams.
read and write
read
— read data from a file/stream named by a file descriptor into a buffer
write
— write data from a buffer to a file/stream named by a file descriptor
See the write_to_file.c
demo program.
fdopen
read
and write
are a pain. For example, no formatted input/output.
Solution: convert a file descriptor into a FILE * using the fdopen() function. Then, use standard I/O functions such as fprintf
, fgets
, and fscanf
to do I/O
Note that attempting to use a single FILE* to read from and write to a socket will not necessarily work. (TODO: figure out why!) A work-around is to use the dup
system call to duplicate the socket file descriptor, and then make two calls to fdopen
to create file handles for reading an writing:
Unix man pages
The Unix man
command shows Unix manual pages. These are extremely useful for getting quick reference documentation on system calls and C library functions.
Unix system calls (such as read
and write
) are documented in section 2 of the manual, and standard C library functions (such as fdopen
) are documented in section 3 of the manual.
Some example commands to try:
man 2 open
man 2 read
man 2 write
man 3 fdopen
Sockets
A socket is a communications endpoint.
Datagram socket: Can send/receive datagrams — discrete chunks of data. Analogy: mailbox.
Stream socket: is one end of a connection. A connection is a private communication channel between two communicating processes. Analogy: telephone.
Important concepts:
Address — uniquely identifies a host on a network
Port — distinguishes sockets on the same host from each other.
At the time the socket API was created (at UC Berkeley in the early 1980s), a variety of network protocol stacks were in active use, such as
- TCP/IP
- DECNET
- others?
For this reason, the socket API treats socket addresses as an opaque data type, with multiple address types as “subclasses”.
sockaddr_un
— a union of all available socket types
For TCP/IP, you will use the sockaddr_in
address type.
Socket API
The socket API is implemented in a number of system calls.
socket
— create a socket
- domain — what protocol family the socket will use. PF_INET for TCP/IP
- type — SOCK_STREAM for stream, SOCK_DGRAM for datagram
- protocol — identifies a particular protocol to use, if not uniquely determined by type. Just use 0 for any TCP/IP socket.
The socket
system call is typically used to create a server socket, which allows a process to accept incoming connections.
listen
— wait for incoming connections
- sockfd — the server socket on which to listen for incoming connections
- backlog — maximum number of incoming connections which will be queued
The return value is 0 if a connection has been received, or -1 if an error occurred.
accept
— accept an incoming connection
- sockfd — the server socket file descriptor
- addr — a pointer to a
sockaddr
struct (really, a “subclass” ofsockaddr
) where the client’s network address will be stored - addrlen – pointer to a variable containing the size in bytes of the struct that addr points to; will be modified to store the actual address size
The return value is a file descriptor naming a socket connected to a two-way communications channel to the client.
connect
— connect to a server
Connects to a remote process. This is the system call that a client process uses to connect to establish a connection to a server process.
Concurrency issues
Once a process is listening on a server socket, any number of clients may request connections. If the server handles the connections in a strictly one at a time manner, then undesirable behavior results: only one client will be able to connect at a time.
Threads are one possible solution: the server can create a thread to handle each new connection, allowing multiple connections to be handled simultaneously. This is an application of threads where concurrency is the motivation, rather than parallelism (although parellelism is often desirable in server applications as well.)
One somewhat difficult issue that arises in server applications is how to handle blocking operations, mainly I/O operations such as accept
. For example, we might want a mechanism to allow the server to shut itself down cleanly. However, if the server is “stuck” in a blocking call to accept
(or other blocking system call), then it may be difficult to gracefully “unstick” the thread that is suspended in the blocking system call. One possible solution is to use nonblocking I/O. A file descriptor, including a server socket, can be marked as nonblocking using the fcntl
system call. The select
system call allows a calling thread to do a timed wait until either a file descriptor becomes “ready” (e.g., an incoming connection is available on a server socket), or a timeout expires. Because nonblocking I/O avoids any operations that would block indefinitely, it allows the program to check for situations such as a shutdown request.
Example programs
- write_to_file.c — open a file and write to it using Unix system calls
- server.c — a simple server program implemented using Unix system calls for I/O
- server2.c — similar to server.c, but using
fdopen
and standard I/O functions - server3.c — a server that does two-way communication with clients, by reading lines and sending them back: use the
telnet
program as a client - server4.c — like server3.c, but uses threads to support multiple concurrent connections
- client.c — a simple client program (which connects to the server implemented by server.c and server2.c), using Unit system calls for I/O
- client2.c — similar to client.c, but using
fdopen
and standard I/O functions