This document outlines the design and internals of pydecnet.

Rev 0.0, 4/23/2019

General structure

The overall structure (modules, layers, threads) of pydecnet closely
resembles the component layering used as a descriptive technique in
the DECnet Architecture (DNA) documents, particularly the Phase IV
documents. The design aims for ease of understanding and correctness
rather than for performance optimization.

Each node (system) is implemented mostly in a single thread, whose
name is the system name, created at pydecnet startup. Helper threads
are used for communication tasks -- HTTP including the JSON API, and
the datalink receive functions -- so these can use blocking
operations for simplicity.

Function calls "downward" roughly match those shown in the DNA
specifications. However, pydecnet does not use the polling model for
handling inbound data as the DNA model does. Instead, data flows
"upward" as "work items" queued to the system thread and delivered
when the thread looks for work. That work item dispatching is in
node.py. It ensures that handling of external input is synchronous
with the rest of the thread, so the single-threaded model of the
spec carries over to the implementation.

Timers are implemented by a helper thread for each system, using a
"Timer Wheel" implementation (see the paper by Varghese and Lauck).
Timeouts are delivered as work items.
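As a much simplified illustration of the work item scheme -- the
names here (Work, addwork, dispatch) are hypothetical, not the
actual node.py code -- the dispatching amounts to the system thread
draining a queue and handing each item to the layer that owns it:

    import queue
    import threading

    class Work:
        # Base class for work items.  "owner" is the layer object
        # that should process this item (illustrative structure
        # only; see node.py for the real dispatching).
        def __init__ (self, owner):
            self.owner = owner

    class Node (threading.Thread):
        def __init__ (self, name):
            super ().__init__ (name = name)
            self.workqueue = queue.Queue ()

        def addwork (self, work):
            # Called from helper threads (datalink receive, timers,
            # HTTP); the thread-safe queue is the only cross-thread
            # interaction.
            self.workqueue.put (work)

        def run (self):
            # All protocol processing happens here, one item at a
            # time, so layer code never needs locks.
            while True:
                work = self.workqueue.get ()
                if work is None:
                    break           # shutdown request
                work.owner.dispatch (work)

Timeouts fit the same mold: the timer thread simply queues a timeout
work item, so the owning layer sees it in the same single-threaded
context as received data.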
Packet parsing and generation

The DNA specs use a fairly consistent way of describing packet
layouts, as a sequence of fields of various types. For example, a
field might be a byte string of a fixed length, an image field (a
string preceded by a one-byte length), a 2 or 4 byte little-endian
integer value, or various other things. One common encoding is the
"TLV" encoding, seen for example in the MOP System ID message. In
that format there is a variable number of items, each consisting of
a type code identifying the item and its data encoding, a length
field giving the length of the value, and the value itself.

All these encodings are handled by subclassing the packet.Packet
class. Each subclass defines a particular packet layout. The fields
for that packet are given by the _layout class attribute, which
lists the fields and their encoding. For details of how this is
done, refer to the comments on function process_layout in packet.py.
Good examples can be found in nsp.py, mop.py, and routing_packets.py.

Subclasses inherit the attributes and layout of their base class,
with any additional slots or layout items added. So a common header
can be defined as a subclass of Packet, and particular packet types
that begin with that common header can be subclasses of that header
class, with the fields beyond the common header defined in each
_layout.

The use of classes to describe packet formats is convenient; for
example, it allows parsed packets to be passed around and code to
check "is this an X packet" with "isinstance (pkt, X)". But
subclassing needs to be done with caution. If Y is a subclass of X,
the check "isinstance (pkt, X)" will accept Y. If that is not
wanted -- if X and Y are distinct packet types that have to be
handled separately -- the solution is to make X and Y both
subclasses of a common base that is not itself used for packets.
Examples of this technique can be found in nsp.py, classes AckData
and AckOther.

Instances of packet subclasses are Python objects with attributes
corresponding to each of the field names given in the layout table.
In addition, if an _addslots class attribute is defined, it names
additional attributes to be created in the packet instances. All
packet instances have fields "src" (the source of the data, if
applicable) and "decoded_from" (a copy of the byte string parsed to
build this instance, if applicable).

Packet parsing is done by constructing an instance of the packet
class with the data to be parsed as the argument. If the packet is
invalid, an exception results. If the data is longer than the
defined layout and there is a "payload" field listed in the
_addslots class attribute, any extra bytes are assigned to the
"payload" attribute of the new packet object; otherwise, the packet
is invalid and rejected. Alternatively, an instance of the class can
be created with no arguments (which constructs a packet with null
field values), then filled in by calling the packet.decode method,
passing the byte string to be parsed. In this case any extra data is
returned as the function result, to be handled by the caller as
needed.

A packet object can be built, or a previously constructed one
modified, by assigning values to the packet object attributes. For
example, to forward a data packet in the routing layer, the packet
is parsed, the "visits" field is updated, and the resulting packet
is then sent if it can be forwarded. A packet is converted to a byte
string for transmission either by feeding it to the bytes ()
function or by invoking the "encode" method of the object.
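The following self-contained toy mirrors the technique described
above. It is NOT pydecnet's packet.py (whose layout entries select
one of many field encodings; see process_layout for the real
machinery). Here every layout entry is just (fieldname, bytecount)
for a little-endian unsigned integer, and the field names are
invented for illustration. It shows the declarative _layout, layout
inheritance through a common header, the common-base isinstance
trick, and the parse / modify / encode round trip:

    # Toy model of the _layout technique; invented fields, not the
    # real packet.py implementation.

    class Packet:
        _layout = ()

        @classmethod
        def _fields (cls):
            # Walk base classes first, so a subclass's own fields
            # follow the common header it inherits.
            fields = []
            for c in reversed (cls.__mro__):
                fields.extend (c.__dict__.get ("_layout", ()))
            return fields

        def __init__ (self, buf = None):
            if buf is not None:
                self.decode (buf)

        def decode (self, buf):
            for name, size in self._fields ():
                if len (buf) < size:
                    raise ValueError ("packet too short")
                setattr (self, name,
                         int.from_bytes (buf[:size], "little"))
                buf = buf[size:]
            return buf          # extra data returned to the caller

        def encode (self):
            return b"".join (getattr (self, name).to_bytes (size, "little")
                             for name, size in self._fields ())

        __bytes__ = encode

    class CommonHeader (Packet):
        _layout = (("msgflag", 1), ("srcnode", 2))

    class DataMsg (CommonHeader):
        # Inherits msgflag and srcnode; adds a field of its own.
        _layout = (("visits", 1),)

    # The common-base technique (compare AckData and AckOther in
    # nsp.py): both are subclasses of a base that is never itself
    # used as a packet, so isinstance (pkt, AckData) rejects an
    # AckOther while shared code can still test for AckBase.
    class AckBase (CommonHeader):
        _layout = (("acknum", 2),)

    class AckData (AckBase): pass
    class AckOther (AckBase): pass

    # Parse / modify / re-encode, as in routing layer forwarding:
    pkt = DataMsg (b"\x02\x03\x04\x07")
    pkt.visits += 1              # update a field in place
    assert bytes (pkt) == b"\x02\x03\x04\x08"

    # Build empty, then decode; extra data comes back to the caller:
    extra = DataMsg ().decode (b"\x02\x03\x04\x07" + b"rest")
    assert extra == b"rest"

    assert isinstance (AckData (bytes (5)), AckBase)
    assert not isinstance (AckOther (bytes (5)), AckData)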
Session layer API

TBD: how applications request DECnet data services.

HTML generation

TBD

HTTP POST JSON API

TBD: monitoring, control (in the future) and data service access via
HTTP POST of JSON requests.

Design notes for the point to point datalink related state machines

In the DNA architecture, the routing point to point sublayer runs
the routing layer initialization handshake state machine, starting
with a "routing init" message, possibly followed by a verification
message, and ending in the running state where hello messages are
sent as needed and a listen timeout detects loss of communication.
Below that is some point to point datalink which has its own
initialization state machine. DECnet treats these two as separate,
and in particular is designed so they can fail separately. The
routing point to point sublayer has a "datalink start" (DS) state
where the datalink layer is doing its initialization, but it has a
timeout and will give up and reinitialize the datalink if its
startup takes too long.

This design makes sense where the two are separate components that
can fail separately, such as a DMC-11 communications controller. But
in PyDECnet all this is in a single process and the "separate
failure" case does not apply. A further consequence of the routing
layer timeout is that the routing layer may at times reinitialize
the datalink layer just as the datalink is getting ready to report
successful startup. For these reasons, the two layers and their
interaction in PyDECnet are slightly different from the spec, for
efficiency and in particular to avoid the issue of clashing timeouts
at the two layers.

The design relies on the fact that the data link layer does not fail
separately, and will always (a) reliably report to the routing layer
when it completes startup, and (b) keep trying to initialize until
it succeeds. So the routing layer initialization state machine does
not have a timeout in the DS state. Instead, it remains in that
state until the data link layer reports Datalink UP.

Once started, the data link state machine keeps trying until
initialization completes. If it loses a TCP connection during that
process, the connection is simply re-established silently (the
routing layer does not hear about it). Reconnecting and resending
data link initialization messages are driven by a timer with a
bounded backoff (sketched at the end of this section): it retries
promptly on the first few tries after startup or if the datalink was
previously up, but slows down so it doesn't keep hammering on a peer
that is down or inoperative.

Once the data link is up (and DlStatus UP has been sent to routing),
a loss of the TCP connection will trigger a datalink restart, and
any data link restart (from a TCP disconnect or from a datalink
protocol event that is defined to cause a restart) will produce a
DlStatus DOWN to routing. That is sent after the data link has
completed shutdown, including stopping the receive thread, and has
entered the Halted state. Routing will restart the data link after a
delay.

There is one exception: Multinet UDP. Multinet is not actually a
datalink at all; it is merely a trivial encapsulation and address
mapping. In the Multinet TCP case this is fairly well hidden by the
fact that TCP provides the equivalent of the DNA data link layer
services; that is, we use the TCP connection machinery as the data
link layer initialization and report datalink up to routing when the
TCP connection has been made. But UDP has no connections. So for the
Multinet UDP case there is a timer (fairly fast at first) in the DS
state that causes the routing init messages to be resent until a
proper response has been received. This is controlled by the "start
works" flag, which is already in place to do the protocol
workarounds needed to compensate (as best we can) for the lack of a
"restart notification" in the Multinet UDP case.
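To make the bounded backoff mentioned above concrete, here is a
minimal sketch of one plausible shape for such a retry timer. The
class name, the constants, the doubling schedule, and the jitter are
all illustrative assumptions, not the actual datalink code; the
property the design depends on is only that retries start fast and
are capped:

    import random

    class ReconnectTimer:
        # Bounded exponential backoff: quick retries at first,
        # capped so a dead or inoperative peer is not hammered.
        # Constants are illustrative.
        MINDELAY = 1             # seconds, first retry
        MAXDELAY = 120           # ceiling on the backoff

        def __init__ (self):
            self.delay = self.MINDELAY

        def nextdelay (self):
            # Return the delay before the next attempt, doubling
            # up to the cap.  The jitter (an assumption, not from
            # the design text) avoids lock-step retries between
            # two peers restarting at the same moment.
            d = self.delay
            self.delay = min (self.delay * 2, self.MAXDELAY)
            return d * random.uniform (0.8, 1.2)

        def reset (self):
            # Called when initialization succeeds (DlStatus UP
            # sent to routing), so a later restart again begins
            # with fast retries.
            self.delay = self.MINDELAY

With the datalink layer retrying on this kind of schedule until it
succeeds, the routing side needs no timer of its own in the DS
state: it simply waits for the DlStatus UP work item.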