Copyright © 2005 Gene Michael Stover. All rights reserved. Permission to copy, store, & view this document unmodified & in its entirety is granted.
Since 1997, I've had an idea for a way to implement reliable datagrams efficiently & simply, in terms of a stream protocol such as TCP/IP. I'll discuss that idea here.
If I ever implement the idea, I'll do so in Lisp & C, & I'll discuss the source code here & provide links to it. (If I implement it, the source code will be licensed under the GNU General Public License or the GNU Lesser General Public License.)
A datagram is a chunk of data sent over a network & containing the information necessary to identify the destination. Contrast this with a chunk of data sent over a connection-oriented protocol, in which the destination is identified by the connection, not within the chunk of data.
A reliable datagram is a datagram that is guaranteed to be delivered exactly once except in the face of catastrophic network problems. Contrast this with most datagram protocols, which guarantee that a given datagram will be delivered at most once.
There are some experimental protocols that offer reliable datagram services, but they are not widely used. I want to create a reliable datagram protocol at the user level so that it can be installed wherever required, maybe even self-contained within an application.
Reliable datagrams are convenient for application programmers.
A datagram service is more convenient than a stream service for many types of interprocess communication because it can place each message in a unit of data which can be treated atomically by the communicating programs. That unit of data is the datagram.
An unreliable datagram service, such as UDP, is inadequate as a messaging protocol because it requires the communicating programs to handle protocol errors themselves.
So many situations call for a reliable datagram service & an Application Programmer's Interface (API) for accessing the service conveniently.
Here are the requirements I place on the reliable datagram service I'd like to implement.
Is a datagram a sequence of octets, with the destination or source address being closely related but not part of it, or is a datagram an object which contains a destination &/or source address & a sequence of octets?
A datagram that a program is processing is probably used with its sender's address much of the time. In Lisp, it would be simple enough to make a datagram be a sequence of octets & to associate it with an address in a list when necessary. C isn't as flexible; to associate the octets with the address, it will be most convenient to put them in a structure. I don't need to do that in Lisp, but I'd like to have similar APIs in both languages, so I'll make a datagram structure in both languages.
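As a rough sketch of the C side of that decision, a datagram structure might pair the octets with an address. The names `struct relia_datagram`, `relia_datagram_new`, & `relia_datagram_free` are hypothetical, & the address is assumed, purely for illustration, to be an Internet host address plus a queue name, as in the C functions described later.

```c
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical layout: the octets of one message plus the peer's address.
   The field names are illustrative, not part of any existing library. */
struct relia_datagram {
    unsigned char *octets;  /* message body, owned by the structure */
    size_t length;          /* number of octets */
    struct in_addr host;    /* sender (on receive) or destination (on send) */
    char *queue;            /* queue name on that host, owned by the structure */
};

/* Copy the caller's octets & queue name into a new datagram.
   Returns NULL if memory cannot be allocated. */
struct relia_datagram *
relia_datagram_new(const void *octets, size_t length,
                   struct in_addr host, const char *queue)
{
    struct relia_datagram *d = malloc(sizeof *d);

    if (d == NULL)
        return NULL;
    d->octets = malloc(length > 0 ? length : 1);
    d->queue = malloc(strlen(queue) + 1);
    if (d->octets == NULL || d->queue == NULL) {
        free(d->octets);
        free(d->queue);
        free(d);
        return NULL;
    }
    if (length > 0)
        memcpy(d->octets, octets, length);
    strcpy(d->queue, queue);
    d->length = length;
    d->host = host;
    return d;
}

/* Release a datagram made by relia_datagram_new.  NULL is allowed. */
void
relia_datagram_free(struct relia_datagram *d)
{
    if (d != NULL) {
        free(d->octets);
        free(d->queue);
        free(d);
    }
}
```

Keeping the address inside the structure means a receiver can reply by reusing the same object, which is the convenience argued for above.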
If a reliable datagram service multiplexes datagrams on a particular TCP port, a full address in reliable datagram space might be the host's Internet address, a TCP port number, & a reliable datagram port number.
Wait a minute. Back up. Here's an idea:
What if an address is a hostname & a queue name? Reliable datagrams could be implemented in terms of a service accessible from TCPMUX. The reliable datagram service could have a default TCPMUX name, but applications might have a custom reliable datagram service running, labeled with a custom name in TCPMUX.
When receiving datagrams, an application program gives the reliable datagram library the name of the queue(s) from which it wants to receive messages. Yes, multiple processes on a host could receive from the same queue if they all knew the name of the queue.
There would be no need to create a reliable datagram service object within the application program.
I presume that all of these functions live in a package named CYBERTIGGYR-RELIA or something like that.
(defstruct datagram octets src-addy dst-addy)
A datagram contains the octets of a message, plus the address of the sender or receiver.
The octets are in a list, vector, or possibly some other sequence.
When sending a datagram, you must fill in the destination address before calling SEND.
When receiving a datagram, the RECV function will fill in the source address.
Other functions in this reliable datagram library bind error conditions to the global variable *STATUS*. I'll define those error values later. They will probably be symbols. If *STATUS* is bound to NIL, there has been no error since *STATUS* was last bound to NIL.
create-queue name
=>
name-or-nil
Create an input queue. Until you create the queue, no one can send datagrams to it. There is no privacy for the receiver, though. Any process on this host may receive from the queue if they know the queue's name.
Name is a string. It identifies the queue. I need to define the characters that are permitted in the queue's name. It should probably allow the characters that may be in a URL. How about a maximum length?
Should the queue be private? If so, how to implement that?
Returns the normalized queue name (in case characters must be encoded) on success. If the queue can't be created, returns NIL. It is not an error if the queue already exists.
send octets addy
=>
good-or-bad
Sends the sequence of octets to the address addy (in reliable datagram space).
Octets may be a list of octets or a vector of octets. Maybe other types of sequences of octets are permissible.
Addy identifies a destination. It may be any of these:
Returns NIL on error (which might refer to a previously sent datagram). On success, returns true.
recv queue-name
=>
list-or-nil
Attempts to receive the next available datagram from the service. If there is a datagram available in the queue, you get it as a list. Otherwise, you get NIL.
Queue-name is a string. It identifies the input queue on this host.
List-or-nil will be NIL if there are no datagrams ready for input. It'll also be NIL if there was an error. To distinguish between a no-datagrams situation & an error situation, examine *STATUS*.
If there is a datagram in the queue, you'll get a two-element list. The list's FIRST is a vector of the octets in the datagram. The list's SECOND identifies the sender. It will have one of the forms that may be used for the SEND function. (For the sanity of the application programmer, I should declare that it will have exactly one of those formats always, or there should be an optional argument which specifies which of the formats the SECOND should have.)
close-queue name
I presume all these symbols exist in a module called RELIA or something like that. The way I program modules, that means these symbols in C code might be prefixed with ``RELIA_''.
Holds the last error status or 0 if there hasn't been an error.
char *
CreateQueue (name)
char name[];
Creates a datagram input queue. Normalizes the queue's name, if necessary, & returns a dynamic copy of the normalized name. (Caller must release the name's memory with xfree.)
On error, returns NULL.
It is not an error if the queue already exists.
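The normalization rule is not decided in this document. As one possible sketch, the function below percent-encodes every octet outside a small URL-safe set & returns a dynamically allocated copy. The name `relia_normalize_name` & the choice of safe characters are assumptions, & the sketch uses plain `malloc` & `free` rather than the xfree mentioned above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of name in which every octet
   other than an ASCII letter, digit, '-', '.', '_' or '~' is replaced by
   %XX (two upper-case hex digits).  The set of safe characters is an
   assumption made for this sketch.  Returns NULL if memory runs out. */
char *
relia_normalize_name(const char *name)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = strlen(name);
    char *out = malloc(3 * n + 1);  /* worst case: every octet is encoded */
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i < n; ++i) {
        unsigned char c = (unsigned char) name[i];

        if ((c < 128 && isalnum(c))
            || c == '-' || c == '.' || c == '_' || c == '~') {
            *p++ = (char) c;
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 15];
        }
    }
    *p = '\0';
    return out;
}
```

With this rule, a name made only of safe characters comes back unchanged, so normalizing an already normalized name that contains no '%' is harmless.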
int
Send (octets, length, addy, queue)
char octets[/* length */];
int length;
struct in_addr *addy;
char queue[];
Octets is an array of the octets in your datagram. Length is the number of octets in the array.
Addy is the Internet address of the destination host.
Queue is the name of the reliable datagram queue on the destination host.
Returns 0 if there have been no errors. Returns non-zero if this datagram or some predecessor couldn't be sent.
int
Recv (queue, octets, length, addy)
char queue[];
char octets[/* length */];
int length;
struct in_addr *addy;
If there is a datagram waiting in the queue, copies its octets into octets, but never writes more than length octets, so it will not overflow the buffer. Stuffs the sender's host address into *addy. Returns the number of octets that were copied into octets.
If there is no datagram, returns a negative number but does not put an error into errno because this is not an error.
If there is an error, returns a negative number & puts an error code into errno.
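Since both cases return a negative number, the caller has to look at errno, & because the no-datagram case leaves errno untouched, a stale value from some earlier call could be mistaken for an error. A minimal sketch of the calling pattern, assuming the caller clears errno first: the names `poll_queue`, `recv_fn`, & the `RELIA_*` results are hypothetical, & the two `fake_recv_*` functions are stand-ins for Recv so the sketch can be exercised without a real service.

```c
#include <errno.h>
#include <netinet/in.h>

enum relia_poll { RELIA_GOT, RELIA_EMPTY, RELIA_FAILED };

/* Signature of a Recv-like function as described above, written as an
   ANSI prototype (hypothetical typedef). */
typedef int (*recv_fn)(const char *queue, char *octets, int length,
                       struct in_addr *addy);

/* Call recv_once exactly once & classify the outcome.
   On RELIA_GOT, *count holds the number of octets copied. */
enum relia_poll
poll_queue(recv_fn recv_once, const char *queue, char *octets, int length,
           struct in_addr *addy, int *count)
{
    int n;

    errno = 0;  /* so that a stale error is not mistaken for a new one */
    n = recv_once(queue, octets, length, addy);
    if (n >= 0) {
        *count = n;
        return RELIA_GOT;
    }
    return errno == 0 ? RELIA_EMPTY : RELIA_FAILED;
}

/* Stand-in: behaves like Recv on an empty queue. */
int
fake_recv_empty(const char *queue, char *octets, int length,
                struct in_addr *addy)
{
    (void) queue; (void) octets; (void) length; (void) addy;
    return -1;
}

/* Stand-in: behaves like Recv after a failure. */
int
fake_recv_error(const char *queue, char *octets, int length,
                struct in_addr *addy)
{
    (void) queue; (void) octets; (void) length; (void) addy;
    errno = EIO;
    return -1;
}
```

A zero-length datagram returns 0 & is therefore reported as received, not as an empty queue.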
void
CloseQueue (name)
char name[];
One way to implement reliable datagrams is to create a TCP connection to the destination process, send the entire datagram, & close the TCP socket.
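A minimal sketch of that approach with POSIX sockets might look like the following. The function names `write_all` & `send_one_datagram` are hypothetical, & the sketch assumes the receiver treats end-of-stream as the end of the datagram.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

/* Write all len octets to fd, retrying after short writes & EINTR.
   Returns 0 on success, -1 on error. */
int
write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Open a TCP connection to host:port, send one datagram, & close.
   Closing the connection marks the end of the datagram (an assumption
   of this sketch).  Returns 0 on success, -1 on error. */
int
send_one_datagram(const char *host, const char *port,
                  const void *octets, size_t len)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;
    rc = write_all(fd, octets, len);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Every datagram pays for a full TCP connection setup & teardown here, which is the cost the later techniques try to avoid.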
Advantages of this approach are:
Disadvantages of this approach are:
Another way to implement a reliable datagram service is to build an error-detection & retransmission feature in terms of UDP. The reliable datagram service would handle the retransmits automatically; the applications which used the reliable datagram service would not need to worry about it.
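To make the retransmission idea concrete, here is a minimal stop-and-wait sketch. It is only one of many possible schemes, & everything in it is an assumption: the name `send_with_retransmit`, the 4-octet sequence number in front of the payload, the 4-octet acknowledgement, & the fixed payload limit. It expects a connected datagram socket.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <arpa/inet.h>

#define MAX_PAYLOAD 1024  /* arbitrary limit chosen for this sketch */

/* Send payload with a 4-octet sequence number in front, then wait up to
   timeout_ms for a 4-octet acknowledgement carrying the same number.
   Retransmits until `tries` attempts have been made.
   Returns 0 once acknowledged, -1 otherwise. */
int
send_with_retransmit(int fd, uint32_t seq, const void *payload, size_t len,
                     int tries, int timeout_ms)
{
    unsigned char packet[4 + MAX_PAYLOAD];
    uint32_t wire_seq = htonl(seq);
    int attempt;

    if (len > MAX_PAYLOAD)
        return -1;
    memcpy(packet, &wire_seq, 4);
    if (len > 0)
        memcpy(packet + 4, payload, len);

    for (attempt = 0; attempt < tries; ++attempt) {
        fd_set readable;
        struct timeval tv;
        uint32_t ack;

        if (send(fd, packet, 4 + len, 0) < 0)
            return -1;
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(fd + 1, &readable, NULL, NULL, &tv) > 0
            && recv(fd, &ack, sizeof ack, 0) == (ssize_t) sizeof ack
            && ntohl(ack) == seq)
            return 0;
        /* Timed out or received something else: retransmit. */
    }
    return -1;
}
```

Because a retransmitted datagram may arrive more than once, the receiver in such a scheme would also need to remember recent sequence numbers & discard duplicates to keep the exactly-once promise.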
Advantages of this technique include:
Disadvantages of this technique include:
The reliable datagram service creates a single TCP connection per pair of processes that want to communicate. It multiplexes on that single TCP connection all the datagrams sent between the two processes; messages passing in either direction go over the same TCP connection.
The reliable datagram service creates & destroys the TCP connections internally, without worrying the application program.
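TCP delivers a stream of octets with no message boundaries, so sending many datagrams over one connection needs some kind of framing. The document does not specify one; a common choice, sketched here as an assumption, is to put a 4-octet big-endian length in front of each datagram. The names `frame_encode` & `frame_decode` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Write one frame (4-octet big-endian length, then the payload) into out.
   Returns the size of the frame, or 0 if it does not fit. */
size_t
frame_encode(unsigned char *out, size_t out_size,
             const void *payload, size_t len)
{
    if (len > 0xFFFFFFFFUL || out_size < 4 || out_size - 4 < len)
        return 0;
    out[0] = (unsigned char) ((len >> 24) & 0xFF);
    out[1] = (unsigned char) ((len >> 16) & 0xFF);
    out[2] = (unsigned char) ((len >> 8) & 0xFF);
    out[3] = (unsigned char) (len & 0xFF);
    if (len > 0)
        memcpy(out + 4, payload, len);
    return 4 + len;
}

/* Look for one complete frame at the start of the in_len octets at in.
   If one is there, point *payload & *len at its contents & return the
   number of octets the frame occupies.  If more octets must be read from
   the connection first, return 0. */
size_t
frame_decode(const unsigned char *in, size_t in_len,
             const unsigned char **payload, size_t *len)
{
    size_t n;

    if (in_len < 4)
        return 0;
    n = ((size_t) in[0] << 24) | ((size_t) in[1] << 16)
        | ((size_t) in[2] << 8) | (size_t) in[3];
    if (in_len - 4 < n)
        return 0;
    *payload = in + 4;
    *len = n;
    return 4 + n;
}
```

A receiver would append whatever it reads from the socket to a buffer & call `frame_decode` repeatedly, removing each decoded frame, until the function returns 0.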
If a given process communicates with multiple processes, the reliable datagram protocol will (probably) create a TCP connection for each of them. If the application program tries to send to a new process, & the reliable datagram service tries to create a new TCP connection but is alerted that it has reached its limit, it can silently close the least recently used connection & create the new one. This connection-swapping is hidden from the application program.
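The connection-swapping can be sketched as a small table with least-recently-used eviction. This is only an illustration: the names `struct conn`, `struct conn_table`, & `conn_lookup` are hypothetical, the limit is a compile-time constant rather than one reported by the operating system, & opening or closing the sockets is left to the caller.

```c
#include <string.h>

#define MAX_CONNS 2  /* tiny limit so that eviction is easy to see */

struct conn {
    char peer[64];            /* identifies the remote process */
    int fd;                   /* TCP socket, or -1 if not yet opened */
    unsigned long last_used;  /* value of the table clock at last use */
    int in_use;
};

struct conn_table {
    struct conn slot[MAX_CONNS];  /* zero-initialize before first use */
    unsigned long clock;
};

/* Return the slot for peer.  If peer has no slot, claim a free one or
   evict the least recently used one.  *evicted_fd receives the socket of
   an evicted connection, which the caller should close, or -1.  A slot
   returned with fd == -1 still needs a connection to be opened. */
struct conn *
conn_lookup(struct conn_table *t, const char *peer, int *evicted_fd)
{
    struct conn *victim = &t->slot[0];
    int i;

    *evicted_fd = -1;
    ++t->clock;
    for (i = 0; i < MAX_CONNS; ++i) {
        struct conn *c = &t->slot[i];

        if (c->in_use && strcmp(c->peer, peer) == 0) {
            c->last_used = t->clock;
            return c;
        }
    }
    for (i = 0; i < MAX_CONNS; ++i) {
        struct conn *c = &t->slot[i];

        if (!c->in_use) {
            victim = c;  /* a free slot beats any eviction */
            break;
        }
        if (c->last_used < victim->last_used)
            victim = c;
    }
    if (victim->in_use)
        *evicted_fd = victim->fd;
    strncpy(victim->peer, peer, sizeof victim->peer - 1);
    victim->peer[sizeof victim->peer - 1] = '\0';
    victim->fd = -1;
    victim->in_use = 1;
    victim->last_used = t->clock;
    return victim;
}
```

The application never sees this table; it only calls the send function, which would consult `conn_lookup` before writing.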
Advantages of this technique are:
Disadvantages of this technique are:
A bunch of processes that want to communicate with each other (maybe they are the same application) could be programmed to use a particular TCP port on which to multiplex all their reliable datagrams.
This is like the ``TCP connection per process'' technique except that it creates a TCP connection between each pair of hosts that want to exchange reliable datagrams.
Just like the ``TCP connection per process'' technique, it multiplexes datagrams on a single TCP connection, but here the one connection between host A & host B carries the datagrams of every process on host A that exchanges messages with any process on host B.
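When one connection carries the traffic of many processes, each datagram on the wire has to say which queue it is for, so that the service on the receiving host can hand it to the right place. The layout below is an assumption made for illustration (1 octet of queue-name length, 2 octets of payload length, then the name, then the payload), & the names `struct mux_frame`, `mux_encode`, & `mux_decode` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* A decoded frame.  The pointers refer into the caller's buffer, & the
   queue name is not NUL-terminated, so use queue_len. */
struct mux_frame {
    const char *queue;
    size_t queue_len;
    const unsigned char *octets;
    size_t length;
};

/* Write a frame addressed to queue into out.
   Returns the frame size, or 0 if the frame does not fit the format. */
size_t
mux_encode(unsigned char *out, size_t out_size, const char *queue,
           const void *octets, size_t length)
{
    size_t qlen = strlen(queue);

    if (qlen > 255 || length > 0xFFFF || out_size < 3 + qlen + length)
        return 0;
    out[0] = (unsigned char) qlen;
    out[1] = (unsigned char) (length >> 8);
    out[2] = (unsigned char) (length & 0xFF);
    memcpy(out + 3, queue, qlen);
    if (length > 0)
        memcpy(out + 3 + qlen, octets, length);
    return 3 + qlen + length;
}

/* Decode one frame from the start of in.  Returns the number of octets
   the frame occupies, or 0 if the frame is not yet complete. */
size_t
mux_decode(const unsigned char *in, size_t in_len, struct mux_frame *f)
{
    size_t qlen, length;

    if (in_len < 3)
        return 0;
    qlen = in[0];
    length = ((size_t) in[1] << 8) | (size_t) in[2];
    if (in_len - 3 < qlen + length)
        return 0;
    f->queue = (const char *) (in + 3);
    f->queue_len = qlen;
    f->octets = in + 3 + qlen;
    f->length = length;
    return 3 + qlen + length;
}
```

The receiving service would look up `queue` among the queues created on its host & append the octets to that queue.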
Advantages of this technique include:
Disadvantages of this technique include:
On the other hand, a bunch of related processes might start their own reliable datagram service on a TCP port that was compiled into them. They'd all use that instance of the service, & it could multiplex all the messages they sent or received. So it's only half a disadvantage, I'd say.
This is the technique I'll use if/when I implement a reliable datagram protocol.
It's a library that can be linked into C programs.
The application program should not need to worry about the implementation of the reliable datagrams. In particular, it should not need to configure communication with the local reliable datagram service.
The library could communicate with the local reliable datagram service through Unix-domain datagram sockets on Unix, or through UDP on Windows or Macintosh.
It's a library that can be loaded into a Lisp program & used from it.
The application program will not need to do special configuration that relates to the implementation of reliable datagrams.
The library communicates with the local reliable datagram service through files because that is portable among Common Lisp implementations.
A future version of the library could use UDP or TCP sockets on Lisp implementations that offer them, even if UDP & TCP are offered in non-portable ways.
I'll probably get around to implementing it some day.
``TCP/IP'' usually means the suite of network protocols that are built on the Internet Protocol. The two most notable players in that suite are TCP & UDP.
TCP is the Transmission Control Protocol. It is a reliable stream protocol. TCP is defined in RFC 793.
UDP is the User Datagram Protocol. It is an unreliable datagram protocol. UDP is defined in RFC 768.
Gene Michael Stover 2008-04-20