NP - V Unit
Introduction:
There are some fundamental differences between applications written using TCP versus those that use UDP. These are because of the differences in the two transport layers: UDP is a connectionless, unreliable, datagram protocol, quite unlike the connection-oriented, reliable byte stream provided by TCP. The client does not establish a connection with the server. Instead, the client just sends a datagram to the server using the sendto function, which requires the address of the destination (the server) as a parameter. Similarly, the server does not accept a connection from a client. Instead, the server just calls the recvfrom function, which waits until data arrives from some client. recvfrom returns the protocol address of the client, along with the datagram, so the server can send a response to the correct client.
#include <sys/socket.h>
ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags, struct sockaddr *from, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags, const struct sockaddr *to, socklen_t addrlen);
Both return: number of bytes read or written if OK, -1 on error
The first three arguments, sockfd, buff, and nbytes, are identical to the first three arguments for the read and write functions. The to argument for sendto is a socket address structure containing the protocol address (e.g., IP address and port number) of where the data is to be sent. The size of this socket address structure is specified by addrlen. The recvfrom function fills in the socket address structure pointed to by from with the protocol address of who sent the datagram. The number of bytes stored in this socket address structure is also returned to the caller in the integer pointed to by addrlen. The final argument to sendto is an integer value, while the final argument to recvfrom is a pointer to an integer value (a value-result argument).
The final two arguments to recvfrom are similar to the final two arguments to accept: The contents of the socket address structure upon return tell us who sent the datagram (in the case of UDP) or who initiated the connection (in the case of TCP). The final two arguments to sendto are similar to the final two arguments to connect: We fill in the socket address structure with the protocol address of where to send the datagram (in the case of UDP) or with whom to establish a connection (in the case of TCP).
Both functions return the length of the data that was read or written as the value of the function. In the typical use of recvfrom, with a datagram protocol, the return value is the amount of user data in the datagram received. Writing a datagram of length 0 is acceptable. In the case of UDP, this results in an IP datagram containing an IP header (normally 20 bytes for IPv4 and 40 bytes for IPv6), an 8-byte UDP header, and no data. This also means that a return value of 0 from recvfrom is acceptable for a datagram protocol: It does not mean that the peer has closed the connection, as does a return value of 0 from read on a TCP socket. Since UDP is connectionless, there is no such thing as closing a UDP connection.
If the from argument to recvfrom is a null pointer, then the corresponding length argument (addrlen) must also be a null pointer, and this indicates that we are not interested in knowing the protocol address of who sent us data. Both recvfrom and sendto can be used with TCP, although there is normally no reason to do so.
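The point about zero-length datagrams can be checked with a small experiment. The following is a minimal sketch, assuming a POSIX system with a loopback interface; the helper name udp_loopback_roundtrip is hypothetical and is not part of the example programs in these notes. It sends one datagram between two sockets in the same process and returns whatever recvfrom reports.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send one datagram of `len` bytes to a receiver
 * socket on the loopback interface and return what recvfrom reports.
 * Returns the received length (0 is a valid result), or -1 on error. */
ssize_t udp_loopback_roundtrip(const void *msg, size_t len,
                               void *out, size_t outcap)
{
    struct sockaddr_in addr, from;
    socklen_t alen = sizeof(addr), flen = sizeof(from);
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* let the kernel pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    /* learn which port the kernel chose for the receiver */
    if (getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto done;

    /* no connection is set up: the destination travels with each call */
    if (sendto(tx, msg, len, 0, (struct sockaddr *)&addr, alen) < 0)
        goto done;
    /* `from` is filled in with the sender's address and port */
    n = recvfrom(rx, out, outcap, 0, (struct sockaddr *)&from, &flen);

done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Calling udp_loopback_roundtrip("", 0, buf, sizeof(buf)) should return 0, which here means an empty datagram arrived, not that a peer closed anything.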
Our UDP client and server programs follow the function call flow that we diagrammed in Figure 8.1. Figure 8.2 depicts the functions that are used. Figure 8.3 shows the server main function.
/* Socket is assumed to be an error-checking wrapper around socket;
   SERV_PORT and dg_echo are assumed to be declared elsewhere. */
#include <sys/types.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* bzero */
#include <stdio.h>
#include <netinet/in.h>
#include <netdb.h>

int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr, cliaddr;

    sockfd = Socket(AF_INET, SOCK_DGRAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    servaddr.sin_port = htons(SERV_PORT);

    bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr));

    dg_echo(sockfd, (struct sockaddr *) &cliaddr, sizeof(cliaddr));
}
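/* Recvfrom and Sendto are assumed to be error-checking wrappers around
   recvfrom and sendto; MAXLINE is assumed to be defined elsewhere. */
#include <sys/types.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <netinet/in.h>
#include <netdb.h>

void dg_echo(int sockfd, struct sockaddr *pcliaddr, socklen_t clilen)
{
    int n;
    socklen_t len;
    char mesg[MAXLINE];

    for ( ; ; ) {
        len = clilen;   /* value-result argument: reset before each call */
        n = Recvfrom(sockfd, mesg, MAXLINE, 0, pcliaddr, &len);
        Sendto(sockfd, mesg, n, 0, pcliaddr, len);
    }
}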
This function is a simple loop that reads the next datagram arriving at the server's port using recvfrom and sends it back using sendto.
First, this function never terminates. Since UDP is a connectionless protocol, there is nothing like an EOF as we have with TCP. Next, this function provides an iterative server, not a concurrent server as we had with TCP. There is no call to fork, so a single server process handles any and all clients. In general, most TCP servers are concurrent and most UDP servers are iterative. There is implied queuing taking place in the UDP layer for this socket. Indeed, each UDP socket has a receive buffer and each datagram that arrives for this socket is placed in that socket receive buffer. When the process calls recvfrom, the next datagram from the buffer is returned to the process in a first-in, first-out (FIFO) order. This way, if multiple datagrams arrive for the socket before the process can read what's already queued for the socket, the arriving datagrams are just added to the socket receive buffer. But, this buffer has a limited size.
Compare this with the TCP client/server when two clients establish connections with the server: there are two connected sockets, and each of the two connected sockets on the server host has its own socket receive buffer.
In the UDP case, there is only one server process and it has a single socket on which it receives all arriving datagrams and sends all responses. That socket has a receive buffer into which all arriving datagrams are placed.
The main function in Figure 8.3 is protocol-dependent (it creates a socket of protocol AF_INET and allocates and initializes an IPv4 socket address structure), but the dg_echo function is protocol-independent. The reason dg_echo is protocol-independent is that the caller (the main function in our case) must allocate a socket address structure of the correct size, and a pointer to this structure, along with its size, is passed as an argument to dg_echo. The function dg_echo never looks inside this protocol-dependent structure: It simply passes a pointer to the structure to recvfrom and sendto. recvfrom fills this structure with the IP address and port number of the client, and since the same pointer (pcliaddr) is then passed to sendto as the destination address, this is how the datagram is echoed back to the client that sent the datagram.
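/* Socket and Inet_pton are assumed to be error-checking wrappers;
   err_quit, SERV_PORT and dg_cli are assumed to be declared elsewhere. */
#include <sys/types.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <strings.h>   /* bzero */
#include <stdio.h>
#include <netinet/in.h>
#include <netdb.h>

int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr;

    if (argc != 2)
        err_quit("usage: udpcli <IPaddress>");

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    Inet_pton(AF_INET, argv[1], &servaddr.sin_addr);

    sockfd = Socket(AF_INET, SOCK_DGRAM, 0);

    dg_cli(stdin, sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr));

    exit(0);
}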
Figure 8.8: dg_cli function
/* Fgets, Fputs, Sendto and Recvfrom are assumed to be error-checking
   wrappers; MAXLINE is assumed to be defined elsewhere. */
#include <sys/types.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <netinet/in.h>
#include <netdb.h>

void dg_cli(FILE *fp, int sockfd, const struct sockaddr *pservaddr, socklen_t servlen)
{
    int n;
    char sendline[MAXLINE], recvline[MAXLINE + 1];

    while (Fgets(sendline, MAXLINE, fp) != NULL) {
        Sendto(sockfd, sendline, strlen(sendline), 0, pservaddr, servlen);

        /* NULL, NULL: we do not ask who sent the reply */
        n = Recvfrom(sockfd, recvline, MAXLINE, 0, NULL, NULL);

        recvline[n] = 0;   /* null terminate */
        Fputs(recvline, stdout);
    }
}
There are four steps in the client processing loop: read a line from standard input using fgets, send the line to the server using sendto, read back the server's echo using recvfrom, and print the echoed line to standard output using fputs (refer to Figure 8.2).
With a UDP socket, the first time the process calls sendto, if the socket has not yet had a local port bound to it, that is when an ephemeral port is chosen by the kernel for the socket. As with TCP, the client can call bind explicitly, but this is rarely done.
As with the server function dg_echo, the client function dg_cli is protocol-independent, but the client main function is protocol-dependent. The main function allocates and initializes a socket address structure of some protocol type and then passes a pointer to this structure, along with its size, to dg_cli.
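If a client datagram is lost, or if the server's reply is lost, the client blocks forever in its call to recvfrom. The only way to prevent this is to place a timeout on the recvfrom. But a timeout is not the entire solution: when the timer expires, the client cannot tell whether its datagram never made it to the server or the server's reply never made it back. A timeout therefore only keeps the client from blocking forever.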
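One common way to place a timeout on recvfrom is the SO_RCVTIMEO socket option. The following is a minimal sketch under that assumption; the helper name recvfrom_timeout is hypothetical, and other techniques (such as calling select before recvfrom) would work as well.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: like recvfrom, but gives up after `seconds`.
 * Returns the datagram length, or -1 with errno set to EAGAIN or
 * EWOULDBLOCK when the timer expires before a datagram arrives.
 * Note that a value of 0 seconds disables the timeout. */
ssize_t recvfrom_timeout(int sockfd, void *buf, size_t len, int seconds,
                         struct sockaddr *from, socklen_t *fromlen)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    /* the timeout stays in effect for later receives on this socket */
    if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        return -1;

    return recvfrom(sockfd, buf, len, 0, from, fromlen);
}
```

When the call times out, the caller still has to decide whether to resend the request, since it cannot know which direction lost the datagram.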
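To use the standard echo server, we replace the assignment servaddr.sin_port = htons(SERV_PORT); in the client main function with
servaddr.sin_port = htons(7);
We do this so we can use any host running the standard echo server with our client.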
This version of the client, which ignores any reply whose source address differs from the address it sent to, works fine if the server is on a host with just a single IP address. But this program can fail if the server is multihomed. We run this program against our host freebsd4, which has two interfaces and two IP addresses.
macosx % host freebsd4
freebsd4.unpbook.com has address 172.24.37.94
freebsd4.unpbook.com has address 135.197.17.100
macosx % udpcli02 135.197.17.100
hello
reply from 172.24.37.94:7 (ignored)
goodbye
reply from 172.24.37.94:7 (ignored)
We specified the IP address that does not share the same subnet as the client. The client ignores every reply because the IP address returned by recvfrom (the source IP address of the UDP datagram) is not the IP address to which we sent the datagram: the multihomed server replied from its other address.
Solutions:
1) Verify the responding host's name instead of its IP address, by looking up the name in the DNS from the IP address returned by recvfrom.
2) Have the server create a socket for every IP address configured on the host, bind that IP address to the socket, wait for any of these sockets to become readable, and reply from the socket that is readable.
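The check that produces the "(ignored)" lines is a comparison between the address the client sent to and the address recvfrom returned. The following is a minimal sketch for IPv4 only; the helper name same_ipv4_endpoint is hypothetical and is not the comparison used in the udpcli02 program, whose listing is not reproduced in these notes.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 when two IPv4 socket addresses have
 * the same IP address and port, else 0. Comparing field by field
 * avoids being misled by padding bytes such as sin_zero. */
int same_ipv4_endpoint(const struct sockaddr_in *a,
                       const struct sockaddr_in *b)
{
    return a->sin_family == AF_INET && b->sin_family == AF_INET &&
           a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}
```

With the addresses from the run above, a request sent to 135.197.17.100 port 7 and a reply from 172.24.37.94 port 7 compare as different, so the reply is ignored even though it came from the right host.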
5) Asynchronous errors are not returned for UDP sockets unless the socket has been connected.
We mentioned at the end of the discussion of the server not running that an asynchronous error is not returned on a UDP socket unless the socket has been connected. Indeed, we are able to call connect for a UDP socket. But this does not result in anything like a TCP connection: There is no three-way handshake. Instead, the kernel just checks for any immediate errors, records the IP address and port number of the peer (from the socket address structure passed to connect), and returns immediately to the calling process.
We must therefore distinguish between two kinds of UDP socket:
1. An unconnected UDP socket, the default when we create a UDP socket
2. A connected UDP socket, the result of calling connect on a UDP socket
With a connected UDP socket, three things change, compared to the default unconnected UDP socket:
1. We can no longer specify the destination IP address and port for an output operation. That is, we do not use sendto, but write or send instead. Anything written to a connected UDP socket is automatically sent to the protocol address (e.g., IP address and port) specified by connect.
2. We do not need to use recvfrom to learn the sender of a datagram, but read, recv, or recvmsg instead. The only datagrams returned by the kernel for an input operation on a connected UDP socket are those arriving from the protocol address specified in connect. Datagrams destined to the connected UDP socket's local protocol address (e.g., IP address and port) but arriving from a protocol address other than the one to which the socket was connected are not passed to the connected socket. This limits a connected UDP socket to exchanging datagrams with one and only one peer.
3. Asynchronous errors are returned to the process for connected UDP sockets.
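The first two changes can be seen in a short experiment. The following is a minimal sketch, assuming a POSIX system with a loopback interface; the names loopback_udp_socket and connected_udp_demo are hypothetical. A client socket is connected to a small echo "server" socket in the same process, while a third socket, which is not the peer, also sends a datagram to the client.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Bind a UDP socket to 127.0.0.1 on a kernel-chosen port and report the
 * resulting address in *addr. Returns the descriptor, or -1 on error. */
static int loopback_udp_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical demo: returns the number of bytes the connected client
 * read back from its peer (copied into out), or -1 on error. */
ssize_t connected_udp_demo(char *out, size_t cap)
{
    struct sockaddr_in srv, cli, other, from;
    socklen_t flen = sizeof(from);
    char req[16];
    ssize_t n = -1;
    int s = loopback_udp_socket(&srv);    /* echo server */
    int c = loopback_udp_socket(&cli);    /* client */
    int x = loopback_udp_socket(&other);  /* third party, not the peer */

    if (s < 0 || c < 0 || x < 0)
        goto done;
    /* connect only records the peer's address; nothing is sent */
    if (connect(c, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;
    /* change 1: no destination argument, send goes to the peer */
    if (send(c, "ping", 4, 0) != 4)
        goto done;
    /* a datagram from a non-peer address is not passed to the client */
    if (sendto(x, "spam", 4, 0, (struct sockaddr *)&cli, sizeof(cli)) != 4)
        goto done;
    /* the server is unconnected, so it uses recvfrom and sendto */
    n = recvfrom(s, req, sizeof(req), 0, (struct sockaddr *)&from, &flen);
    if (n < 0 || sendto(s, req, n, 0, (struct sockaddr *)&from, flen) != n) {
        n = -1;
        goto done;
    }
    /* change 2: recv suffices, the sender can only be the peer */
    n = recv(c, out, cap, 0);

done:
    if (s >= 0) close(s);
    if (c >= 0) close(c);
    if (x >= 0) close(x);
    return n;
}
```

The client should read back its own 4-byte "ping" from the server; the "spam" datagram from the third socket is not delivered to the connected socket.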
The application calls connect, specifying the IP address and port number of its peer. It then uses read and write to exchange data with the peer. Datagrams arriving from any other IP address or port (which we show as "???" in Figure 8.15) are not passed to the connected socket because either the source IP address or source UDP port does not match the protocol address to which the socket is connected. These datagrams could be delivered to some other UDP socket on the host. If there is no other matching socket for the arriving datagram, UDP will discard it and generate an ICMP "port unreachable" error. In summary, we can say that a UDP client or server can call connect only if that process uses the UDP socket to communicate with exactly one peer. Normally, it is a UDP client that calls connect.
Calling connect Multiple Times for a UDP Socket
A process with a connected UDP socket can call connect again for that socket for one of two reasons:
1. To specify a new IP address and port
2. To unconnect the socket
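Calling connect again with a new address replaces the recorded peer, and calling it with an address family of AF_UNSPEC unconnects the socket. The following is a minimal sketch for IPv4; the helper names udp_set_peer, udp_unconnect, and udp_peer_port are hypothetical.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: connect (or reconnect) a UDP socket to an IPv4
 * peer given in host byte order. Returns 0 if OK, -1 on error. */
int udp_set_peer(int fd, uint32_t ip_host_order, uint16_t port_host_order)
{
    struct sockaddr_in peer;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(ip_host_order);
    peer.sin_port = htons(port_host_order);
    return connect(fd, (struct sockaddr *)&peer, sizeof(peer));
}

/* Hypothetical helper: unconnect a UDP socket by connecting it to an
 * address whose family is AF_UNSPEC. Returns 0 if OK, -1 on error. */
int udp_unconnect(int fd)
{
    struct sockaddr sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_family = AF_UNSPEC;
    /* some systems report EAFNOSUPPORT here even though the socket
     * has been unconnected, so that error is treated as success */
    if (connect(fd, &sa, sizeof(sa)) < 0 && errno != EAFNOSUPPORT)
        return -1;
    return 0;
}

/* Hypothetical helper: the connected peer's port in host byte order,
 * or 0 when the socket has no peer. */
int udp_peer_port(int fd)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);

    if (getpeername(fd, (struct sockaddr *)&peer, &len) < 0)
        return 0;
    return ntohs(peer.sin_port);
}
```

Unlike TCP, where connect can be called only once for a socket, none of these calls sends anything on the network; each one only changes what the kernel has recorded for the socket.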
A DNS client can be configured to use one or more servers, normally by listing the IP addresses of the servers in the file /etc/resolv.conf. If a single server is listed (the leftmost box in the figure), the client can call connect, but if multiple servers are listed (the second box from the right in the figure), the client cannot call connect. Also, a DNS server normally handles any client request, so the servers cannot call connect.
UDP has no flow control, so a fast client may overrun the server (e.g., a 96% loss rate, where most datagrams are lost to overflow of the socket receive buffer and some to network congestion). Use netstat -s to check the loss.
Solution: use the SO_RCVBUF socket option to enlarge the receive buffer, and use a request-reply model instead of bulk transfer. Flow control is important to avoid flooding a system with more data than it can handle due to limited bandwidth. One flow control technique is to limit the number of unacknowledged packets, for example by slowing the sender down when the number of acknowledgments received is much less than the number of packets sent.
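Enlarging the receive buffer takes one setsockopt call. The following is a minimal sketch; the helper name udp_set_rcvbuf is hypothetical, and the size passed in is only a request, since the kernel may round it, scale it, or cap it at a system limit.

```c
#include <sys/socket.h>

/* Hypothetical helper: ask for a socket receive buffer of `bytes` and
 * return the size the kernel reports afterwards, or -1 on error. */
int udp_set_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    /* read the value back: it may differ from what was requested */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

A server would typically call this once, right after creating the socket, for example udp_set_rcvbuf(sockfd, 64 * 1024), where 64 KB is an arbitrary illustrative value. A larger buffer only delays overflow; it does not replace flow control.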
A UDP program can use connect to determine the outgoing interface that will be used for a particular destination:
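/* SA is assumed to be shorthand for struct sockaddr; Socket, Inet_pton,
   Connect, Getsockname and Sock_ntop are assumed to be helper and
   wrapper functions defined elsewhere, as are err_quit and SERV_PORT. */
int main(int argc, char **argv)
{
    int sockfd;
    socklen_t len;
    struct sockaddr_in cliaddr, servaddr;

    if (argc != 2)
        err_quit("usage: udpcli <IPaddress>");

    sockfd = Socket(AF_INET, SOCK_DGRAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    Inet_pton(AF_INET, argv[1], &servaddr.sin_addr);

    Connect(sockfd, (SA *) &servaddr, sizeof(servaddr));

    /* after connect, the kernel has chosen the local address and port */
    len = sizeof(cliaddr);
    Getsockname(sockfd, (SA *) &cliaddr, &len);
    printf("local address %s\n", Sock_ntop((SA *) &cliaddr, len));

    exit(0);
}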
If we run the program on the multihomed host freebsd, we have the following output:
freebsd % udpcli09 206.168.112.96
local address 12.106.32.254:52329
freebsd % udpcli09 192.168.42.2
local address 192.168.42.1:52330
freebsd % udpcli09 127.0.0.1
local address 127.0.0.1:52331
The first time we run the program, the command-line argument is an IP address that follows the default route. The kernel assigns the local IP address to the primary address of the interface to which the default route points. The second time, the argument is the IP address of a system connected to a second Ethernet interface, so the kernel assigns the local IP address to the primary address of this second interface. Calling connect on a UDP socket does not send anything to that host; it is entirely a local operation that saves the peer's IP address and port. We also see that calling connect on an unbound UDP socket also assigns an ephemeral port to the socket.
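The same technique can be written with plain POSIX calls, without the wrapper functions. The following is a minimal sketch for IPv4; the helper name udp_local_address_for is hypothetical, and port 9 is an arbitrary choice because nothing is actually sent.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: store in `out` (dotted decimal) the local IPv4
 * address the kernel would use to reach dest_ip. Returns the ephemeral
 * local port in host byte order, or -1 on error. */
int udp_local_address_for(const char *dest_ip, char *out, socklen_t outlen)
{
    struct sockaddr_in dst, local;
    socklen_t len = sizeof(local);
    int port = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);   /* arbitrary: connect sends no datagram */

    /* connect makes the kernel pick the outgoing interface and an
     * ephemeral port; getsockname then reports both */
    if (inet_pton(AF_INET, dest_ip, &dst.sin_addr) == 1 &&
        connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, out, outlen) != NULL)
        port = ntohs(local.sin_port);

    close(fd);
    return port;
}
```

For the loopback destination 127.0.0.1, the local address reported is 127.0.0.1, matching the third run shown above; for other destinations the result depends on the host's routing table.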
The client must specify the server's IP address and port number for the call to sendto. Normally, the client's IP address and port are chosen automatically by the kernel, although we mentioned that the client can call bind if it so chooses. If these two values for the client are chosen by the kernel, we also mentioned that the client's ephemeral port is chosen once, on the first sendto, and then it never changes. The client's IP address, however, can change for every UDP datagram that the client sends, assuming the client does not bind a specific IP address to the socket.
The reason is that if the client host is multihomed, the client could alternate between two destinations, one going out the datalink on the left, and the other going out the datalink on the right. In this worst-case scenario, the client's IP address, as chosen by the kernel based on the outgoing datalink, would change for every datagram. What happens if the client binds an IP address to its socket, but the kernel decides that an outgoing datagram must be sent out some other datalink? In this case the IP datagram will contain a source IP address that is different from the IP address of the outgoing datalink.
There are at least four pieces of information that a server might want to know from an arriving IP datagram: the source IP address, destination IP address, source port number, and destination port number. Figure 8.13 shows the function calls that return this information for a TCP server and a UDP server.