The packet switching endeavour #143
The goal of this endeavour is a source-routed, packet-switched overlay network in libp2p. The top motivations for this are:
- Fixing problems that we already have. It will tremendously help with many kinds of connectivity problems (NAT, censorship, hybrid and "weird" networks).
- Making non-stream use cases possible. Right now everything in IPFS and libp2p, from the CLI down to multistream and secio, assumes a reliable transport.
- Unlocking future possibilities. A multiformats-based overlay network will be a very big deal. It'll give us much more control over networking "policy" in the widest sense. This ranges from transit incentivization to onion routing to unicorns and rainbows.
Note: take all the following as preliminary notes to finally kickstart the endeavour and discussion. Parts of it are inspired by or directly taken from cjdns (switch and cryptoauth).
What's source routing / packet switching / overlay network?
An overlay network is a computer network that is built on top of another network. Nodes in the overlay network can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network. ... The Internet was originally built as an overlay upon the telephone network, while today (through the advent of VoIP), the telephone network is increasingly turning into an overlay network built on top of the Internet.
-- Wikipedia: Overlay network
Packet switching is a digital networking communications method that groups all transmitted data into suitably sized blocks, called packets, which are transmitted via a medium that may be shared by multiple simultaneous communication sessions. Packet switching increases network efficiency, robustness and enables technological convergence of many applications operating on the same network.
-- Wikipedia: Packet switching
Source routing, also called path addressing, allows a sender of a packet to partially or completely specify the route the packet takes through the network. In contrast, in non-source routing protocols, routers in the network determine the path based on the packet's destination.
-- Wikipedia: Source routing
For example, Tor is an overlay network (addressed by public keys), but it only exposes a SOCKS proxy. IPFS forms an overlay network too. Its existing protocol for line switching opens a connection to another node via a relay node.
A telephone network analogy to the three modes of networking:
- Current: your telephone is directly wired to the telephones of everybody else you regularly call.
- Line switching: your telephone is wired to a relay station in the middle of the country. Every other telephone in the country is wired to the relay station. You call the relay station, and ask them to hook you up to one of the other wires.
- Packet switching: all telephones in your neighbourhood are wired to a local switch. The local switch is wired to other local switches elsewhere. When you dial someone, a route through the network of switches is looked up, either by your phone, or step-by-step while your voice walks through the network.
How does it work?
In the case of packet-switching itself, without a reliability protocol on top, we can't really speak of "connections". For the sake of this document's simplicity, we'll do it regardless.
Just imagine a "connection" as UDP sockets on two respective hosts, with every datagram ("packet") transparently wrapped in encryption and authentication ("crypto session"). Packet switching can use other packet-based transports like Ethernet or iodine, and, with framing, also stream-based ones like TCP or WebSockets.
Every connection is registered with the switch, which assigns it a slot number. Routes ("switch labels") are made up of these numbers, and represent street-routing-like instructions ("turn right, then left after 500 m") through this network of interconnected switches. Because of this numbering, the switch can operate without any memory lookups. Most other existing switching protocols use names or addressing which require processing at every hop. The downside is that we need to be careful with re-numbering when connections come and go.
To make the exposed interface of the switch as simple as possible, the local endpoint is registered with the switch too, just like connections are. That means there are UDP, Ethernet, etc. transports backed by a socket, and self/loopback connections backed by code in the same process. There could even be e.g. a unix socket transport, which allows processes on the same machine to be endpoints.
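The slot-number idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Switch` class, the `link` helper, and the representation of a switch label as a plain list of slot numbers are hypothetical and not part of any libp2p or cjdns API. A real switch would more likely pack the label into a compact bit field. The sketch also shows the local endpoint being registered like any other connection, and how recording the incoming slot at each hop yields the return route.

```python
class Switch:
    """Toy source-routed switch: forwards packets by slot number only."""

    def __init__(self):
        # slot number -> handler; forwarding is a plain index, no name lookup
        self.slots = []

    def register(self, handler):
        """Register a connection (or the local endpoint); return its slot number."""
        # Reuse a freed slot if one exists, so live slot numbers never change.
        for slot, existing in enumerate(self.slots):
            if existing is None:
                self.slots[slot] = handler
                return slot
        self.slots.append(handler)
        return len(self.slots) - 1

    def unregister(self, slot):
        # Leave a hole instead of renumbering the remaining connections.
        self.slots[slot] = None

    def handle(self, in_slot, label, return_label, payload):
        """Consume the first hop of `label` and forward the packet on that slot."""
        if not label:
            raise ValueError("empty label")
        out_slot, rest = label[0], label[1:]
        if not (0 <= out_slot < len(self.slots)) or self.slots[out_slot] is None:
            return False  # unknown slot: drop the packet
        # Prepend the incoming slot so the receiver learns the way back.
        self.slots[out_slot](rest, [in_slot] + return_label, payload)
        return True


def link(a, b):
    """Connect two switches with a virtual link; return (slot on a, slot on b)."""
    slots = {}
    slots["a"] = a.register(lambda label, ret, data: b.handle(slots["b"], label, ret, data))
    slots["b"] = b.register(lambda label, ret, data: a.handle(slots["a"], label, ret, data))
    return slots["a"], slots["b"]


# Three switches in a line: a - b - c. Slot 0 is each switch's local endpoint.
a, b, c = Switch(), Switch(), Switch()
inbox_a, inbox_c = [], []
a.register(lambda label, ret, data: inbox_a.append((ret, data)))
b.register(lambda label, ret, data: None)
c.register(lambda label, ret, data: inbox_c.append((ret, data)))
link(a, b)  # slot 1 on a, slot 1 on b
link(b, c)  # slot 2 on b, slot 1 on c

# a's endpoint sends to c's endpoint: out via slot 1, then slot 2, then slot 0.
a.handle(0, [1, 2, 0], [], b"ping")
ret, data = inbox_c[0]
# c replies along the return label it received, without any routing lookup.
c.handle(0, ret, [], b"pong")
```

Reusing freed slots rather than shifting the others is one possible answer to the re-numbering concern above: labels that pass through unaffected connections stay valid when another connection goes away.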
Routing
Two kinds of routing are involved here.
- The already existing peer-routing no longer looks up physical addresses in the underlying network, but instead finds routes (switch labels) through the network of switches, so that communication between endpoints can happen.
- The new switch-routing looks up physical addresses like peer-routing used to. It doesn't do so for the immediate sake of exchanging communications though, but for placing you in the network topology strategically well, based on certain policies.
We can use a null switch-routing mechanism for now, which opens a connection to every endpoint, so that every switch label returned by peer-routing is only one hop. This allows us to implement packet switching without having to settle for a routing algorithm yet.
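As a rough sketch of what the null mechanism implies, assuming hypothetical names throughout (`NullSwitchRouting`, `OneHopPeerRouting`, and the `dial` callable are invented for illustration and do not exist in libp2p): switch-routing dials every endpoint directly, and peer-routing then only ever returns single-hop labels.

```python
class NullSwitchRouting:
    """Opens a direct connection to every endpoint it is asked about."""

    def __init__(self, dial):
        self.dial = dial        # hypothetical callable: peer_id -> connection
        self.connections = []   # slot number -> connection
        self.slot_of = {}       # peer_id -> slot number

    def ensure_connected(self, peer_id):
        """Dial the peer if needed and return the slot of its connection."""
        if peer_id not in self.slot_of:
            self.connections.append(self.dial(peer_id))
            self.slot_of[peer_id] = len(self.connections) - 1
        return self.slot_of[peer_id]


class OneHopPeerRouting:
    """Peer-routing on top of null switch-routing: every route is one hop."""

    def __init__(self, switch_routing):
        self.switch_routing = switch_routing

    def find_route(self, peer_id):
        # The switch label is just the slot of the direct connection.
        return [self.switch_routing.ensure_connected(peer_id)]


# Stand-in dialer for the example; a real one would open a transport connection.
routing = OneHopPeerRouting(NullSwitchRouting(dial=lambda peer_id: "conn-to-" + peer_id))
route_a = routing.find_route("QmPeerA")
route_b = routing.find_route("QmPeerB")
```

Swapping in a real switch-routing policy later would change which connections exist, while callers of `find_route` would keep receiving labels in the same shape.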
Using source routing (instead of distance-vector routing) yields a few nice benefits:
- The local endpoint is in control over the route. Even if it asks other nodes to come up with and maintain routes (e.g. in a supernode routing scheme), it's always in control.
- The switches are completely dumb and simple, and can be implemented in hardware. There can be nodes which are pure switches, and don't actually speak any protocols on top. ISP-grade switching equipment is freaking expensive, which is a huge barrier to entry. If we want people to exchange data within their neighbourhood, we need to lower that barrier.
New protocols
- multigram
- For multiplexing different protocols on one datagram/packet connection.
- This will likely piggyback a number => protocol name table onto the crypto handshake, and all following data packets merely include this (e.g. 1-byte) number in their header.
- Spec draft available: Add multigram draft 1 specs#123
- switch
- Read packets from connections, write them to the respective "next" connection.
- Interface for registering/unregistering connections established by transports.
- cryptoauth
- Transparently encrypts and authenticates all packets, between switches (hop-to-hop, between switch and transport) and between endpoints (end-to-end, between switch and application protocol). Other options include DTLS 1.3 once it's available, and possibly more. CryptoAuth has not yet been formally audited.
- new peer-routing
- Emits switch labels representing routes through the switch network.
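To make the multigram idea more concrete, here is a minimal sketch of multiplexing with a 1-byte protocol number. It does not follow the spec draft: the `ProtocolTable` class, the protocol names, and the assumption that both sides hold the same table once the handshake is done are all made up for illustration.

```python
class ProtocolTable:
    """number => protocol name table, assumed to be agreed during the handshake."""

    def __init__(self, names):
        if len(names) > 256:
            raise ValueError("a 1-byte header can address at most 256 protocols")
        self.by_number = dict(enumerate(names))
        self.by_name = {name: number for number, name in self.by_number.items()}

    def wrap(self, name, payload):
        """Prefix the payload with the protocol's 1-byte number."""
        return bytes([self.by_name[name]]) + payload

    def unwrap(self, packet):
        """Split a packet into (protocol name, payload)."""
        if not packet:
            raise ValueError("packet is missing the protocol number")
        # KeyError here means the number was never announced in the table.
        return self.by_number[packet[0]], packet[1:]


# Hypothetical protocol names; the real ones are not specified here.
table = ProtocolTable(["/ping/1.0.0", "/identify/1.0.0"])
packet = table.wrap("/identify/1.0.0", b"hello")
name, payload = table.unwrap(packet)
```

The per-packet overhead is a single byte, since the full protocol names travel only once, with the table.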
Changes to existing stack
- libp2p ReadMsg()/WriteMsg() interface
- In line with the existing Read()/Write() interface.
- multiaddr schemes
- Figure out the addressing scheme for /ipX/udp/cryptoauth/switch/some-protocol.
- Figure out the addressing scheme for /switch/0000.4321.1234.4321/ipfs/Qmfoobar (address for overlay network).
- peer-routing becomes switch-routing
- Emits physical addresses for opening connections using the available transports.
How reliability fits in
It'll fit in nicely. We'll figure it out. More later. :)
How to continue
tl;dr bottom-up, then top-down, then spread out.
- Introduce switch and cryptoauth, but leave the current routing in place
- This basically means that we'll still make direct connections to every endpoint we want to communicate with.
- This would need to already include the peer-routing/switch-routing change.
- Implement ReadMsg()/WriteMsg() interface
- Implement e.g. identify and ping as packet-based protocols.
- Make reliability over switch work. This would likely be as multigram protocols over the switch.
- Make switching over a reliable transport work (framing).
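The framing step can be sketched as follows, assuming a 2-byte big-endian length prefix. That prefix, the `FramedConn` class, and the Python-style `read_msg`/`write_msg` names (mirroring the proposed ReadMsg()/WriteMsg()) are illustrative choices, not a decided wire format.

```python
import io
import struct

MAX_FRAME = 0xFFFF  # largest length a 2-byte prefix can express


class FramedConn:
    """Message interface over a byte stream, using a length prefix per message."""

    def __init__(self, stream):
        self.stream = stream  # any object with read(n) and write(data)

    def write_msg(self, msg):
        if len(msg) > MAX_FRAME:
            raise ValueError("message too large for one frame")
        self.stream.write(struct.pack(">H", len(msg)) + msg)

    def _read_exactly(self, n):
        # Streams may return short reads, so loop until n bytes have arrived.
        buf = b""
        while len(buf) < n:
            chunk = self.stream.read(n - len(buf))
            if not chunk:
                raise EOFError("stream closed mid-frame")
            buf += chunk
        return buf

    def read_msg(self):
        (length,) = struct.unpack(">H", self._read_exactly(2))
        return self._read_exactly(length)


# An in-memory stream stands in for TCP or WebSockets in this example.
stream = io.BytesIO()
conn = FramedConn(stream)
conn.write_msg(b"ping")
conn.write_msg(b"pong!")
stream.seek(0)  # rewind so the reads below see what was written
first = conn.read_msg()
second = conn.read_msg()
```

With this in place, a stream-based transport can be registered with the switch in the same way as a datagram one, because both hand over whole packets.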
One way or the other, the new peer-routing should be deferred until we have a good idea of what to do there.
