A Source Routing Solution To Non-Transitive Connectivity Problems in Distributed Hash Tables
Acknowledgement: This project was partly funded by the German Research Foundation (Deutsche Forschungsgemeinschaft - DFG), contract number ME 1703/4-1 and by EPSRC, contract number GR/S69009/01.

Abstract

Distributed hash tables are popular third generation P2P protocols which are well understood in theory. These protocols usually assume that every node in the overlay is able to exchange messages with any other overlay node. However, this assumption is not always true for real-world networks, including PlanetLab or the entire Internet. In these networks, the non-transitive connectivity phenomenon is experienced, in which some overlay nodes are able to exchange messages with a certain node and others are not. This turned out to be a serious problem, particularly for structured P2P overlays. Non-transitive connectivity issues were mainly ignored by P2P research for a long time, but have been intensively discussed recently. This paper suggests a new measure for the degree of non-transitive connectivity and presents a comprehensive, source routing based solution to overcome non-transitive connectivity problems in distributed hash tables.

1 Introduction

Distributed hash tables (DHTs) are third generation P2P protocols forming structured overlays which have several advantages compared to other P2P paradigms. Their theory is well understood and most of their properties are provable. DHTs provide a lookup service and are able to find a route from a source node to a target node within a limited number of overlay hops. In particular, DHTs implement a tradeoff between overlay management load and information location load. Therefore, every peer maintains a fixed, comparatively small number of connections to other overlay nodes, predefined by the DHT structure. These connections, especially the ones to direct logical overlay neighbors, are essential to the stability and efficiency of a DHT. To establish an overlay connection, two nodes have to be able to exchange messages using the underlying network. If they are able to exchange messages they are said to have connectivity.

However, recent research [3, 5] has revealed that real-world networks, including PlanetLab [8] and also the global Internet, tend not to provide connectivity among all pairs of overlay nodes. Instead, in these networks the nodes experience non-transitive connectivity (NTC), which is defined as follows: Let A, B, and C be three overlay nodes. Let A have connectivity to B and let B have connectivity to C. If A has no connectivity to C, the three nodes exhibit NTC in this situation.

The NTC phenomenon is a serious problem for DHTs. Their provable features rely mostly on the assumption that all pairs of nodes have connectivity, thus problems arise in practice. Freedman et al. [3] have named several concrete DHT problems in networks with NTC, including invisible nodes, routing loops, broken return paths, and inconsistent roots. The problems have been explained in detail and a set of specialized algorithms has been suggested to overcome each single problem. A comprehensive solution, however, is still missing.

As a measure for the degree of NTC, Gerding and Stribling [5] suggested analyzing all possible triples of overlay nodes in a network and dividing the number of triples that experience NTC by the total number of triples. They found that approximately 9.9% of all considered triples in PlanetLab experienced NTC. In [3], transient NTC has additionally been analyzed, in which node triples temporarily experienced NTC. These time periods, affecting about 2.3% of all considered triples in PlanetLab, occur for many reasons, e.g. link failures or BGP routing updates. Unfortunately, this node triple based measure does not precisely predict the impact of NTC effects on the DHT, as explained in Section 4.

In this paper a comprehensive solution is presented to overcome the above mentioned problems caused by NTC for DHTs, instead of fixing each of the problems separately. The remainder of the paper is structured as follows: In Section 2 source routing is applied to DHTs, solving the NTC problem in static situations. Additional heuristic strategies, needed to overcome NTC in more realistic dynamic networks, are presented in Section 3. Section 4 introduces a new measure for NTC and argues why this measure better predicts the impact of NTC. A statistical and a measurement evaluation of the suggested solutions based on the new measure is given there. In Section 5, differences between the presented approach and other solutions are outlined. The paper is concluded in Section 6.

2 Applying Source Routing to DHTs

Source routing is a well-known routing mechanism used in mobile ad-hoc networks (MANETs), for instance. MANETs can be considered as networks with a high degree of NTC, because only physically neighboring hosts within radio transmission range have connectivity. Fixed networks require different route discovery mechanisms since the degree of NTC is usually lower than in MANETs.

A source route (SR) contains the addresses of the source overlay node, the destination overlay node, and all intermediate nodes (INs) in the overlay on the route from the source to the destination. The SR is carried in the header of a packet. Thus, packet forwarding is done by looking up the next hop towards the destination in the header. An SR is represented in this paper by [A:BC...:Z], where A is the address of the source node, Z is the address of the destination node, and B, C, ... are the addresses of the INs. If an SR involves one or more INs it is called an indirect route. An SR without INs is called a direct route and represented by [A::Z] (A and Z have connectivity in this case). An SR consists exclusively of segments which are pairs of nodes having connectivity. For simplicity, connectivity has been defined to be bidirectional in this paper. With this precondition, an SR consisting of bidirectional segments also enables a bidirectional message exchange. In practice, communication paths may exist which are not bidirectional. Although these unidirectional paths are not considered in this paper, they can potentially be used by the suggested mechanisms and will be researched in future work.

The NTC problems in DHTs are caused by the exchange of addresses, which are possibly not reachable from other overlay nodes. When receiving such addresses, nodes get an inconsistent view of the overlay. Thus, avoiding the exchange of addresses which are not reachable from everywhere in the overlay is the key to eliminating problems resulting from NTC in DHTs.

The suggested solution in this paper is to exchange SRs in DHTs, instead of exchanging potentially unreachable addresses. A node A which has connectivity to node B, for instance, constructs the SR [A::B] and exchanges it with other nodes. At first glance this seems rather similar to an address exchange, because source and destination addresses are usually carried in the header of a packet. However, the SR only expresses that node B is reachable from node A, not that B is reachable from anywhere else. Using this mechanism, every node exclusively exchanges information that can be used by other nodes.

To successfully apply source routing mechanisms to DHTs, the SRs have to be assembled in a consistent way. An SR containing a source node A is at first usable by A only. When other nodes learn this SR from A, they have to assemble a new SR with this information. If, e.g., node A has connectivity to B, it constructs the SR [A::B]. When node C receives this SR from A, it has to assemble a new SR to B. Since C obviously has an SR to A, e.g. [C::A], it assembles the new SR [C:AA:B]. Applying a loop elimination mechanism to this SR, it is simplified to [C:A:B]. A is now the IN for packets routed from C to B. A more complex scenario is shown in Figure 1. Node A receives an SR [C:DE:F] from node C. The SR from node A to node C is [A:B:C]. Node A can now assemble a new SR [A:BCDE:F] from A to F. In [4] similar mechanisms are suggested to run DHTs in MANETs.

Figure 1: Node A assembles the SR [C:DE:F] and the SR [A:B:C] to a new SR [A:BCDE:F].

Note that exchanging SRs instead of addresses does not affect the advanced operation of a DHT, since SRs fulfill a task similar to that of addresses. Thus, existing DHTs can be modified to use SRs without imposing major changes to their core algorithms, preserving their stability, scalability, and convergence properties.

With the assumption of a static network scenario, not considering leaving nodes, failing nodes, the load of nodes, or transient NTC, the mechanisms suggested so far (SR exchange and SR assembly) are sufficient. Modified DHTs operate in this static scenario as if all addresses were reachable from all overlay nodes.
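The SR assembly and loop elimination steps of Section 2 can be illustrated with a minimal Python sketch. It assumes that an SR is stored as a list of node addresses (source, INs, destination); the function names `eliminate_loops`, `assemble`, and `format_sr` are hypothetical and not taken from the paper's implementation.

```python
def eliminate_loops(route):
    """Remove every loop: when a node recurs, drop the nodes visited since its first occurrence."""
    result = []
    position = {}  # node -> index in result
    for node in route:
        if node in position:
            cut = position[node]
            for removed in result[cut + 1:]:
                del position[removed]
            del result[cut + 1:]
        else:
            position[node] = len(result)
            result.append(node)
    return result


def assemble(route_to_sender, learned_route):
    """Join the receiver's own SR to the sender with an SR learned from that sender."""
    if route_to_sender[-1] != learned_route[0]:
        raise ValueError("learned SR must start at the node it was received from")
    return eliminate_loops(route_to_sender + learned_route)


def format_sr(route):
    """Render an SR in the paper's notation, e.g. [A:BC:Z] or [A::Z]."""
    if len(route) < 2:
        raise ValueError("an SR needs at least a source and a destination")
    return f"[{route[0]}:{''.join(route[1:-1])}:{route[-1]}]"


# Example from the text: C holds [C::A] and learns [A::B] from A.
print(format_sr(assemble(["C", "A"], ["A", "B"])))    # [C:A:B]
# Figure 1: A holds [A:B:C] and learns [C:DE:F] from C.
print(format_sr(assemble(list("ABC"), list("CDEF"))))  # [A:BCDE:F]
```

In this sketch the concatenation first yields the duplicated node (as in [C:AA:B]) and the loop elimination then removes it, mirroring the two steps described in the text.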
In practice, however, networks tend to have dynamic properties, in which the above mentioned assumptions are not valid. Further problems have to be considered. SRs might contain several INs, since they have not been shortened yet. Nodes may become overloaded if they are used as IN in too many SRs. Furthermore, SRs become invalid if INs leave the overlay. DHTs will find alternative routes in this case, but only if the overlay is not already partitioned. The only source for direct routes so far is the bootstrap process, which itself is usually not very flexible (e.g. the user configures a bootstrap node for the application). As long as no additional direct routes are explored, only direct routes specified at bootstrap time are available. In the worst case this leads to unacceptable overlay topologies, e.g., all SRs using a single bootstrap node as IN. Thus, the first problem to be solved is to shorten the assembled SRs. The second problem is to apply load balancing among the INs. And the third problem is to increase the direct route diversity. Section 3 suggests solutions to these problems.

3 Source Routing under Dynamic Network Conditions

Two known heuristic approaches are applied to DHT environments. They solve the problems mentioned in Section 2 that arise in dynamic environments. Route shortening and load balancing aim at creating mainly short or direct SRs, in which the additionally created source routing load is balanced evenly among all possible INs. During their operation the two mechanisms also actively discover new direct routes.

The route-shortening mechanism reduces the number of INs in a newly assembled SR by detecting connectivity (shortcuts) between pairs of INs in the SR. In the DHT context route shortening can be applied in a straightforward way. Let a node A have a newly assembled, loop-eliminated route [A:B:C] to node C. The SR [A:B:C] might still not be the optimal route, since nodes A and C possibly have connectivity. Thus, node A actively probes the route [A::C] to shorten the SR. If no connectivity is available, the longer route [A:B:C] is used. Figure 2 shows a more complex scenario with three INs. A assembles the new SR [A:BCD:E]. Before using it, it probes (in this order) [A::E], [A::D], and [A::C]. In this example node A has connectivity to node C, thus the resulting shortened SR is [A:CD:E].

Figure 2: Shortening of [A:BCD:E] by node A: a) Connectivity of the nodes. b) Node A fails to reach node E directly. c) Node A fails to reach node D directly. d) Node A succeeds in reaching node C directly; the source route [A:CD:E] is usable.

An SR is not exchanged until the shortening procedure is completed, in order to exclusively exchange short SRs. This is important, because the overhead for probing routes gets higher with increasing SR length. Note that an SR shortened with this heuristic is not necessarily the globally shortest SR to a destination but is a good approximation to it in fixed networks having a relatively low degree of NTC (cf. Section 4).

Similar route-shortening mechanisms have been applied to other source routing systems before [6]. In networks with a high degree of NTC (e.g. MANETs) their effectiveness is limited, because the probability of finding a shortcut within an SR is low. In networks with a low degree of NTC (e.g. the fixed Internet), however, the suggested route-shortening approach is highly effective, as shown in Section 4.

The load-balancing mechanism (to balance the load on INs) suggested in this paper relies on locally available information. In more detail, a load-balanced SR of any node X has three important properties: First, the SR contains at most one IN. Second, if the SR contains an IN, the IN is an overlay neighbor of X. Third, if the SR contains an IN, X has connectivity to the IN. If a node has an SR in its routing table which does not fulfill these conditions, it attempts to load-balance it. To this end, it first chooses as candidates the overlay neighbors to which it has a direct route. Then, it defines a random ordering of the candidates. It probes whether the first candidate can be used as the new IN in the SR (it has to have connectivity to the destination). This procedure is repeated after a certain time period (e.g., the keep-alive interval) until all available candidates are probed or a suitable candidate has been found. Meanwhile the not yet balanced SR is used. If there is a low degree of NTC in a network, the probability of finding a suitable candidate in a short time is high. Figure 3 shows an example of applying the suggested load-balancing heuristic.
Node A has routing-table entries [A::B], [A::C], [A::D], [A:B:E], and [A:J:F]. The SRs [A:B:E] and [A:J:F] are considered for load balancing. [A:B:E] is already load balanced, since it fulfills all preconditions. The SR [A:J:F], on the other hand, is not load balanced, because the IN J is not an overlay neighbor of A. A will attempt to substitute J with one of the directly reachable overlay neighbors B, C or D, in random order.

ponents have been added, Router, RouteAssembler, and RouteManager. The router component is responsible for
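The two heuristics of Section 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: an SR is a list of node addresses, `probe(a, b)` is a hypothetical stand-in for an active connectivity probe between two nodes, and the link sets in the usage lines are invented to match the examples in the text. The sketch tries all load-balancing candidates in one loop, whereas the text spaces the probes over a time period such as the keep-alive interval.

```python
import random


def make_probe(links):
    """Build a bidirectional connectivity check from two-letter link names (hypothetical helper)."""
    pairs = {frozenset(link) for link in links}
    return lambda a, b: frozenset((a, b)) in pairs


def shorten(route, probe):
    """Probe direct routes from the source to the destination, then to the INs from last to second."""
    source = route[0]
    for index in range(len(route) - 1, 1, -1):
        if probe(source, route[index]):
            return [source] + route[index:]
    return list(route)  # no shortcut found, keep the longer route


def is_load_balanced(route, neighbors, probe):
    """At most one IN; if present, it is an overlay neighbor the source can reach directly."""
    ins = route[1:-1]
    if not ins:
        return True
    return len(ins) == 1 and ins[0] in neighbors and probe(route[0], ins[0])


def load_balance(route, neighbors, probe, rng=random):
    """Try directly reachable overlay neighbors in random order as the new single IN."""
    if is_load_balanced(route, neighbors, probe):
        return list(route)
    source, destination = route[0], route[-1]
    candidates = [n for n in neighbors if n != destination and probe(source, n)]
    rng.shuffle(candidates)
    for candidate in candidates:
        if probe(candidate, destination):
            return [source, candidate, destination]
    return list(route)  # keep using the unbalanced SR


# Figure 2 scenario; the link set is an assumption consistent with the text.
probe = make_probe(["AB", "BC", "CD", "DE", "AC"])
print(shorten(list("ABCDE"), probe))  # ['A', 'C', 'D', 'E']

# Load-balancing scenario; the link C-F is assumed so that a substitute exists.
probe = make_probe(["AB", "AC", "AD", "AJ", "BE", "JF", "CF"])
print(load_balance(list("AJF"), ["B", "C", "D"], probe))  # ['A', 'C', 'F']
```

Returning the original route when no candidate fits reflects the statement that the not yet balanced SR remains in use until a suitable IN is found.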
[Plot: "# non-balancable routes" and "# load-balancing overhead messages" versus node transitivity q (0.7 to 1); y-axis: number of routes/messages.]