A Source Routing Solution To Non-Transitive Connectivity Problems in Distributed Hash Tables

This document proposes a source routing solution to address non-transitive connectivity problems in distributed hash tables (DHTs). It begins by describing how real-world networks exhibit non-transitive connectivity, which violates assumptions of DHT protocols. It then introduces source routing, where nodes exchange source routes instead of addresses. This allows nodes to determine reachable paths through intermediaries. The paper presents methods for constructing consistent source routes in DHTs and evaluates the solution statistically and through measurements on PlanetLab.

Copyright
© Attribution Non-Commercial (BY-NC)
A Source Routing Solution to Non-Transitive Connectivity Problems in Distributed Hash Tables

Ivan Dedinski, Andreas Berl, Alexander Hofmann, Sebastian Heglmeier, Bernhard Sick and Hermann de Meer
University of Passau, Faculty of Computer Science and Mathematics
94030 Passau, Germany
{dedinski, berl, hofmann, heglmeie, sick, demeer}@fmi.uni-passau.de

Acknowledgement: This project was partly funded by the German Research Foundation (Deutsche Forschungsgemeinschaft - DFG), contract number ME 1703/4-1 and by EPSRC, contract number GR/S69009/01.

Abstract

Distributed hash tables are popular third-generation P2P protocols which are well understood in theory. These protocols usually assume that every node in the overlay is able to exchange messages with any other overlay node. However, this assumption is not always true for real-world networks, including PlanetLab or the entire Internet. In these networks, the non-transitive connectivity phenomenon is experienced, in which some overlay nodes are able to exchange messages with a certain node and others are not. This turned out to be a serious problem, particularly for structured P2P overlays. Non-transitive connectivity issues were mainly ignored by P2P research for a long time, but have been intensively discussed recently. This paper suggests a new measure for the degree of non-transitive connectivity and presents a comprehensive, source-routing-based solution to overcome non-transitive connectivity problems in distributed hash tables.

1 Introduction

Distributed hash tables (DHTs) are third-generation P2P protocols forming structured overlays which have several advantages compared to other P2P paradigms. Their theory is well understood and most of their properties are provable. DHTs provide a lookup service and are able to find a route from a source node to a target node within a limited number of overlay hops. In particular, DHTs implement a tradeoff between overlay management load and information location load. Therefore, every peer maintains a fixed, comparatively small number of connections to other overlay nodes, predefined by the DHT structure. These connections, especially the ones to direct logical overlay neighbors, are essential to the stability and efficiency of a DHT. To establish an overlay connection, two nodes have to be able to exchange messages using the underlying network. If they are able to exchange messages they are said to have connectivity.

However, recent research [3, 5] has revealed that real-world networks, including PlanetLab [8] and also the global Internet, tend not to provide connectivity among all pairs of overlay nodes. Instead, in these networks the nodes experience non-transitive connectivity (NTC), which is defined as follows: Let A, B, and C be three overlay nodes. Let A have connectivity to B and let B have connectivity to C. If A has no connectivity to C, the three nodes exhibit NTC in this situation.

The NTC phenomenon is a serious problem for DHTs. Their provable features rely mostly on the assumption that all pairs of nodes have connectivity, thus problems arise in practice. Freedman et al. [3] have named several concrete DHT problems in networks with NTC, including invisible nodes, routing loops, broken return paths, and inconsistent roots. The problems have been explained in detail and a set of specialized algorithms has been suggested to overcome each single problem. A comprehensive solution, however, is still missing.

As a measure for the degree of NTC, Gerding and Stribling [5] suggested analyzing all possible triples of overlay nodes in a network and dividing the number of triples that experience NTC by the total number of triples. They found that approximately 9.9% of all considered triples in PlanetLab experienced NTC. In [3], transient NTC has additionally been analyzed, in which node triples temporarily experienced NTC. These time periods, affecting about 2.3% of all considered triples in PlanetLab, occur for many reasons, e.g. link failures or BGP routing updates. Unfortunately, this node-triple-based measure does not precisely predict the impact of NTC effects on
the DHT, as explained in Section 4.

In this paper a comprehensive solution is presented to overcome the above-mentioned problems caused by NTC for DHTs, instead of fixing each of the problems separately. The remainder of the paper is structured as follows: In Section 2 source routing is applied to DHTs, solving the NTC problem in static situations. Additional heuristic strategies, needed to overcome NTC in more realistic dynamic networks, are presented in Section 3. Section 4 introduces a new measure for NTC and argues why this measure better predicts the impact of NTC. A statistical and a measurement evaluation of the suggested solutions based on the new measure is given there. In Section 5, differences of the presented approach to other solutions are outlined. The paper is concluded in Section 6.

2 Applying Source Routing to DHTs

Source routing is a well-known routing mechanism used in mobile ad-hoc networks (MANETs), for instance. MANETs can be considered as networks with a high degree of NTC, because only physically neighboring hosts within radio transmission range have connectivity. Fixed networks require different route discovery mechanisms since the degree of NTC is usually lower than in MANETs.

A source route (SR) contains the addresses of the source overlay node, the destination overlay node, and all intermediate nodes (INs) in the overlay, on the route from the source to the destination. The SR is carried in the header of a packet. Thus, packet forwarding is done by looking up the next hop towards the destination in the header. An SR is represented in this paper by [A:BC···:Z], where A is the address of the source node, Z is the address of the destination node, and B, C, ... are the addresses of the INs. If an SR involves one or more INs it is called an indirect route. An SR without INs is called a direct route and represented by [A::Z] (A and Z have connectivity in this case). An SR consists exclusively of segments which are pairs of nodes having connectivity. For simplicity, connectivity has been defined to be bidirectional in this paper. With this precondition, an SR consisting of bidirectional segments also enables a bidirectional message exchange. In practice, communication paths may exist which are not bidirectional. Although these unidirectional paths are not considered in this paper, they can potentially be used by the suggested mechanisms and will be researched in future work.

The NTC problems in DHTs are caused by the exchange of addresses which are possibly not reachable from other overlay nodes. When receiving such addresses, nodes get an inconsistent view of the overlay. Thus, avoiding the exchange of addresses which are not reachable from everywhere in the overlay is the key to eliminating problems resulting from NTC in DHTs.

The suggested solution in this paper is to exchange SRs in DHTs, instead of exchanging potentially unreachable addresses. A node A which has connectivity to node B, for instance, constructs the SR [A::B] and exchanges it with other nodes. At first glance this seems rather similar to an address exchange, because source and destination addresses are usually carried in the header of a packet. However, only the information of node B being reachable from node A is expressed by the SR, not B being reachable from anywhere else. Using this mechanism, every node exclusively exchanges information that can be used by other nodes.

To successfully apply source routing mechanisms to DHTs, the SRs have to be assembled in a consistent way. An SR containing a source node A is at first usable by A only. When other nodes learn this SR from A, they have to assemble a new SR with this information. If, e.g., node A has connectivity to B, it constructs the SR [A::B]. When node C receives this SR from A, it has to assemble a new SR to B. Since C obviously has an SR to A, e.g. [C::A], it assembles the new SR [C:AA:B]. Applying a loop elimination mechanism to this SR, it is simplified to [C:A:B]. A is now the IN for packets routed from C to B. A more complex scenario is shown in Figure 1. Node A receives an SR [C:DE:F] from node C. The SR from node A to node C is [A:B:C]. Node A can now assemble a new SR [A:BCDE:F] from A to F. In [4] similar mechanisms are suggested to run DHTs in MANETs.

Figure 1: Node A assembles the SR [C:DE:F] and the SR [A:B:C] to a new SR [A:BCDE:F].

Note that exchanging SRs instead of addresses does not affect the advanced operation of a DHT; SRs fulfill a similar task as addresses do. Thus existing DHTs can be modified to use SRs without imposing major changes to their core algorithms, preserving their stability, scalability, and convergence properties.

With the assumption of a static network scenario, not considering leaving nodes, failing nodes, the load of nodes, or transient NTC, the mechanisms suggested so far (SR exchange and SR assembly) are sufficient. Modified DHTs operate in this static scenario as if all addresses were reachable from all overlay nodes.
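
The SR assembly and loop elimination steps of Section 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: representing an SR as an ordered list of node names, and the function names `eliminate_loops`, `assemble`, and `format_sr`, are assumptions made here for illustration only.

```python
def eliminate_loops(route):
    """Cut a route back to the first occurrence of any repeated node."""
    result = []
    for node in route:
        if node in result:
            # Node seen before: drop everything after its first occurrence.
            del result[result.index(node) + 1:]
        else:
            result.append(node)
    return result


def assemble(route_to_sender, learned_route):
    """Join the local SR to the sender with an SR learned from that sender.

    Assumption: the learned SR starts at the node it was received from.
    """
    if route_to_sender[-1] != learned_route[0]:
        raise ValueError("learned route must start at the sending node")
    return eliminate_loops(route_to_sender + learned_route)


def format_sr(route):
    """Render a route in the paper's notation, e.g. [A:BC:Z] or [A::Z]."""
    return "[{}:{}:{}]".format(route[0], "".join(route[1:-1]), route[-1])


# C knows [C::A] and learns [A::B] from A.
print(format_sr(assemble(["C", "A"], ["A", "B"])))  # [C:A:B]
# A knows [A:B:C] and learns [C:DE:F] from C (the Figure 1 scenario).
print(format_sr(assemble(["A", "B", "C"], ["C", "D", "E", "F"])))  # [A:BCDE:F]
```

With an ordered list, assembly is a plain concatenation; the duplicated junction node (the "AA" in [C:AA:B]) is removed by the same loop elimination pass that would also remove longer cycles.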
In practice however, networks tend to have dynamic properties, in which the above-mentioned assumptions are not valid. Further problems have to be considered. SRs might contain several INs, since they have not been shortened yet. Nodes may become overloaded if they are used as IN in too many SRs. Furthermore, SRs become invalid if INs leave the overlay. DHTs will find alternative routes in this case, but only if the overlay is not already partitioned. The only source for direct routes so far is the bootstrap process, which itself is usually not very flexible (e.g. the user configures a bootstrap node for his application). As long as no additional direct routes are explored, only direct routes specified at bootstrap time are available. In the worst case this leads to unacceptable overlay topologies, e.g., all SRs using a single bootstrap node as IN. Thus, the first problem to be solved is to shorten the assembled SRs. The second problem is to apply load balancing among the INs. And the third problem is to increase the direct route diversity. Section 3 suggests solutions to these problems.

3 Source Routing under Dynamic Network Conditions

Two known heuristic approaches are applied to DHT environments. They solve the problems mentioned in Section 2 that arise in dynamic environments. Route shortening and load balancing aim at creating mainly short or direct SRs, in which the additionally created source routing load is balanced evenly among all possible INs. During their operation the two mechanisms also actively discover new direct routes.

The route-shortening mechanism reduces the number of INs in a newly assembled SR by detecting connectivity (shortcuts) between pairs of INs in the SR. In the DHT context route shortening can be applied in a straightforward way. Let a node A have a newly assembled, loop-eliminated route [A:B:C] to node C. The SR [A:B:C] might still not be the optimal route, since node A and C possibly have connectivity. Thus, node A actively probes the route [A::C] to shorten the SR. If no connectivity is available, the longer route [A:B:C] is used. Figure 2 shows a more complex scenario with three INs. A assembles the new SR [A:BCD:E]. Before using it, it probes (in this order) [A::E], [A::D], and [A::C]. In this example node A has connectivity to node C, thus the resulting shortened SR is [A:CD:E].

Figure 2: Shortening of [A:BCD:E] by node A: a) Connectivity of the nodes. b) Node A fails to reach node E directly. c) Node A fails to reach node D directly. d) Node A succeeds in reaching node C directly, the source route [A:CD:E] is usable.

An SR is not exchanged until the shortening procedure is completed, in order to exclusively exchange short SRs. This is important, because the overhead for probing routes gets higher with increasing SR length. Note that an SR shortened with this heuristic is not necessarily the globally shortest SR to a destination but is a good approximation to it in fixed networks having a relatively low degree of NTC (cf. Section 4).

Similar route-shortening mechanisms have been applied to other source routing systems before [6]. In networks with a high degree of NTC (e.g. MANETs) their effectiveness is limited, because the probability of finding a shortcut within an SR is low. In networks with a low degree of NTC (e.g. fixed Internet) however, the suggested route-shortening approach is highly effective, as shown in Section 4.

The load-balancing mechanism (to balance the load on INs) suggested in this paper relies on locally available information. In more detail, a load-balanced SR of any node X has three important properties: First, the SR contains at most one IN. Second, if the SR contains an IN, the IN is an overlay neighbor of X. Third, if the SR contains an IN, X has connectivity to the IN. If a node has an SR in its routing table which does not fulfill these conditions, it attempts to load-balance it. To do so, it first chooses as candidates those overlay neighbors to which it has a direct route. Then, it defines a random ordering of the candidates. It probes whether the first candidate can be used as new IN in the SR (it has to have connectivity to the destination). This procedure is repeated after a certain time period (e.g., the keep-alive interval) until all available candidates are probed or a suitable candidate has been found. Meanwhile the not yet balanced SR is used. If there is a low degree of NTC in a network, the probability of finding a suitable candidate in a short time is high. Figure 3 shows an example of applying the suggested load-balancing heuristic.
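
The route-shortening probe order and the three load-balancing conditions described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `has_connectivity(x, y)` is a hypothetical probe callback, `make_probe` is a hypothetical helper that fakes probing from a fixed set of links, and the periodic probing of one candidate per keep-alive interval is collapsed into a single loop.

```python
import random


def make_probe(links):
    """Hypothetical helper: bidirectional connectivity from a list of pairs."""
    pairs = {frozenset(p) for p in links}
    return lambda x, y: frozenset((x, y)) in pairs


def shorten(route, has_connectivity):
    """Probe direct shortcuts from the source, farthest node first."""
    source = route[0]
    # Destination first, then INs from last to first; the current next hop
    # (index 1) is already known to be reachable and is not probed.
    for idx in range(len(route) - 1, 1, -1):
        if has_connectivity(source, route[idx]):
            return [source] + route[idx:]
    return route


def is_load_balanced(route, neighbors, has_connectivity):
    """Check the three properties of a load-balanced SR."""
    ins = route[1:-1]
    if not ins:
        return True          # direct route, nothing to balance
    if len(ins) > 1:
        return False         # more than one IN
    return ins[0] in neighbors and has_connectivity(route[0], ins[0])


def balance(route, neighbors, has_connectivity, rng=random):
    """Try to replace the INs by one randomly chosen reachable neighbor."""
    if is_load_balanced(route, neighbors, has_connectivity):
        return route
    source, dest = route[0], route[-1]
    candidates = [n for n in neighbors
                  if n != dest and has_connectivity(source, n)]
    rng.shuffle(candidates)  # random ordering of the candidates
    for candidate in candidates:
        if has_connectivity(candidate, dest):
            return [source, candidate, dest]
    return route             # keep using the unbalanced SR


# Connectivity assumed for the shortening example [A:BCD:E].
probe = make_probe([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("A", "C")])
print(shorten(["A", "B", "C", "D", "E"], probe))  # ['A', 'C', 'D', 'E']
```

Because only neighbors with a direct route are considered as candidates, a successful substitution always yields an SR with exactly one IN, which satisfies all three conditions at once.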
Node A has routing-table entries [A::B], [A::C], [A::D], [A:B:E], and [A:J:F]. The SRs [A:B:E] and [A:J:F] are considered for load balancing. [A:B:E] is already load balanced, since it fulfills all preconditions. The SR [A:J:F], on the other hand, is not load balanced, because the IN J is not an overlay neighbor of A. A will attempt to substitute J with one of the directly reachable overlay neighbors B, C or D, in random order.

Figure 3: Node A attempts to substitute intermediate node J with nodes B, C or D.

DHTs usually distribute the overlay neighbors of a node fairly among all overlay nodes, leading to (ideally) distinct routing tables at every node. Thus, the suggested IN load-balancing technique fits nicely into existing DHT load-distribution mechanisms.

Available load-balancing properties of DHT hash functions have been used before to balance the load on INs. FreePastry [1] uses a similar mechanism, for instance, but in a limited way (cf. Section 5). In this work this approach is generalized and in Section 4 a comprehensive performance evaluation is provided.

Note that without the presence of NTC, both route-shortening and load-balancing approaches do not affect the DHT. An enhanced DHT operates as if it had not been modified at all in this case. No overhead is produced in terms of additional packets or increased packet size. The SRs then only consist of the source node and the destination node address, which are usually contained in a packet anyway. Solely direct routes are used without presence of NTC.

The lower the degree of NTC in a network, the better the two suggested heuristics operate. As already mentioned, the degree of NTC in fixed networks is at a relatively low level. But even parts of the network experiencing higher degrees of NTC are still supported by source routing, as shown in Section 4. However, the SRs might get longer for these network parts, since the suggested route-shortening and load-balancing heuristics might perform less efficiently in this case.

As a proof of concept, source routing has been integrated into CHORD [13], which is a popular DHT implementation. The resulting architecture is shown in Figure 4 and evaluated in Section 4. In the CHORD routing table the use of network addresses has been replaced by the use of SRs. In addition, three new components have been added: Router, RouteAssembler, and RouteManager. The Router component is responsible for packet forwarding according to the SR. Every bypassing packet is processed to decide whether it must be delivered to the local node or forwarded to the next node in the SR. The RouteAssembler component is in charge of the SR assembly as described in Section 2. Every new route passes the RouteAssembler component before being inserted into the CHORD fingertable. The RouteManager component is responsible for the route-shortening and load-balancing tasks. While route shortening is done before the first usage of the route (as described before), the more time-consuming load balancing is done asynchronously.

Figure 4: An enhanced CHORD architecture.

4 Evaluation

This section evaluates the effectiveness and overhead of the suggested heuristics and provides results of a statistical and measurement analysis under different NTC conditions. For this study, the CHORD-based architecture presented in Section 3 has been implemented and evaluated in the PlanetLab environment.

For the statistical analysis a new measure for the degree of NTC is necessary. The triple-based measure, suggested by [5], was derived from the definition of NTC. However, this measure has some disadvantages. On one hand, the missing connectivity of a node pair is counted more than once if the node pair is involved in several NTC triples, e.g. (ABC), (ADC), (AEC), ... On the other hand, a missing connectivity between two nodes is not counted if it is not involved in a node triple with NTC. This means that the triple-based measure considers the connectivity of some overlay node pairs to be more important than the connectivity of others. But this does not always reflect the impact of NTC on DHTs. An example for this is the broken return path problem [3], which may be caused by any overlay node pair without connectivity, independent of being part of a triple with NTC. Therefore, this paper suggests an NTC measure based on the connectivity of overlay node pairs in the network. The coherence of a node pair based measure
and the triple based measure is as follows: If a non-partitioned overlay network contains a triple of nodes with NTC it also contains a node pair without connectivity (proof is trivial). If a non-partitioned overlay with more than two nodes contains a node pair without connectivity, it also contains a node triple with NTC (proof skipped, due to lack of space). This allows a node-pair based definition of NTC: Two nodes cause NTC in a non-partitioned overlay network containing more than two nodes, if they don't have connectivity. The degree of NTC is then defined as $NP_{nc} / NP_{total}$, where $NP_{nc}$ is the number of all overlay node pairs without connectivity and $NP_{total}$ is the total number of overlay node pairs.

Further properties concerning single nodes can be derived from this measure: Let X be a node of a DHT overlay network, establishing connections to its predefined overlay neighbors (e.g., fingers in CHORD). In a physical network with a certain degree of NTC, the node transitivity $q$ of a node X is defined as $L_c / (L_{total} - 1)$. $L_c$ is the number of all nodes in the network having connectivity to X and $L_{total}$ is the total number of nodes. This means $q$ is the probability of a node X having connectivity to a node Y which is randomly chosen from the $L_{total} - 1$ other nodes. In any DHT every node establishes connections to a prescribed set of overlay neighbors with a certain ID. The ID is determined by the DHT's hash function, e.g., SHA or MD5, to ensure load balancing among the nodes. In CHORD, e.g., a node with ID $id$ attempts to establish connections to the nodes with IDs $id + 2^0, id + 2^1, \dots, id + 2^{\log(H)}$, where $H$ is the highest value in the ID space. Determining the IDs of the nodes using a (good) hash function leads to a set of neighbors of a node X which are randomly chosen from all $L_{total} - 1$ nodes. The analysis in this paper assumes that there is no stochastic dependency between the connectivity of any two node pairs. In this case a DHT node X has connectivity to any of its overlay neighbors with probability $q$. A high degree of stochastic independence is given in networks which provide many alternative overlay routes, e.g. PlanetLab or the Internet. This assumption has been confirmed by measurements in PlanetLab.

Commonly, $q$ is not equal for all nodes X. A node hidden behind a firewall, for instance, might have much less connectivity than other nodes in the network have. Thus, for the analysis in this paper, the value of $q$ is randomly distributed among all nodes with an unknown density function $P_x(q)$.

To evaluate the effectiveness of the route-shortening heuristic, a probability upper bound for the emerging of an SR with a length greater or equal to $l$ is given ($l \ge 2$). Let $q_{min}$ be the lowest node transitivity of a node in an SR $[X_1 : X_2 X_3 \cdots X_{l-1} : X_l]$. If and only if every node $X_i$ with $0 < i < l - 1$ does not have connectivity to the nodes $X_{i+1}, X_{i+2}, \cdots, X_l$, the SR cannot be shortened. Thus, the probability upper bound for the emerging of an SR of a length greater or equal to $l$ is

$$R_{S \ge l}(q_{min}) = (1 - q_{min})^{\sum_{i=1}^{l-2} i}. \tag{1}$$

Consequently an upper bound for the probability of the emerging of an SR of length $l$ is

$$R_{S = l}(q_{min}) = R_{S \ge l}(q_{min}) - R_{S \ge l+1}(q_{min}). \tag{2}$$

Equations 1 and 2 show that the probability for a long SR to emerge decreases faster than exponentially, with $(1 - q_{min})$ as basis. In reality, the probability of the emerging of an SR of length $l$ is much lower, e.g., because not all nodes $X_i$ have node transitivity $q_{min}$. This is confirmed by measurements in PlanetLab, for instance.

To evaluate the overhead of the route-shortening heuristic, an upper bound for the expected value of the additionally produced messages is given. For an SR of length $l$ the exact number of sent probing messages is $\sum_{j=1}^{l-2} j$. Let $R$ be the total number of SRs in the overlay. $R$ can be calculated by multiplying $L_{total}$ by the average number of neighbors per node. This, combined with Equation 2, results in an upper bound of

$$R \cdot \sum_{l=2}^{L_{total}} R_{S = l}(q_{min}) \cdot \sum_{j=0}^{l-2} j \tag{3}$$

for the overall route-shortening overhead.

To evaluate the effectiveness of the load-balancing heuristic, the total number of non-balanceable SRs is taken as a measure. For a non-balanceable SR, the load-balancing heuristic fails due to a lack of alternative nodes. Let node X have $n$ overlay neighbors. The probability of $k$ neighbors ($k \le n$) having connectivity to X is binomially distributed with a density function $B_q(n, k)$. The probability of the $n - k$ remaining neighbors having no connectivity to any of the other $k$ neighbors is $(1 - q)^k$. Thus, the expected number of non-balanced neighbors is

$$\int_0^1 P_x(q) \cdot \sum_{k=0}^{n} B_q(n, k) \cdot \sum_{j=0}^{n-k} B_{(1-q)^k}(j) \cdot j \; dq \tag{4}$$

at node X.

To evaluate the overhead of the load-balancing heuristic, upper bounds for the number of sent messages are given. With increasing node transitivity the number of sent messages also increases. If a node X has $n - k$
indirect SRs, which are in the worst case all non-balanced,

$$O_q(n, k) = (n - k) \cdot \Big[ (1 - q)^k \cdot k + \sum_{j=0}^{k-1} (1 - q)^j \cdot q \cdot j \Big] \tag{5}$$

is the expected value for the overhead produced at X for a certain $q$. Consequently, with $P_x(q)$ and Equation 4,

$$\int_0^1 P_x(q) \cdot \sum_{k=0}^{n} B_q(n, k) \cdot O_q(n, k) \; dq \tag{6}$$

is the overall load-balancing overhead.

To illustrate the effect of Equations 4 and 6, an example calculation of the load-balancing effectiveness and overhead for a CHORD DHT with $L_{total} = 10^6$ nodes is shown in Figure 5 under various node transitivity conditions. The node transitivity $q$ is set identical for all nodes, because $P_x(q)$ is unknown. Every CHORD node has $n = 20$ fingers (neighbors). The graph shows that the load-balancing algorithm balances all SRs for a $q$ of 0.5, which is far below the minimum node transitivity measured in PlanetLab (see below). Also the load-balancing overhead is relatively small for a $q$ of more than 0.5. The figure shows that for $q = 0.5$ about $10^7$ load-balancing overhead messages are sent. $q = 0.5$ means that half of all $2 \cdot 10^7$ SRs are indirect. Thus, only a single load-balancing overhead message per indirect SR is produced in this case.

Figure 5: Load-balancing effectiveness and overhead, depending on the node transitivity. (Plot of the number of non-balanceable routes and the number of load-balancing overhead messages over the node transitivity q.)

Note that route-shortening and load-balancing strategies have been analyzed separately. In practice however the two strategies interfere with each other and increase their effects. The load-balancing procedure applied after the route shortening, for instance, might further shorten an SR. It is not trivial to estimate these interference effects, thus the real behavior of the system has been measured in PlanetLab.

There are several measurement studies proving the existence of NTC in PlanetLab [3, 5]. The experienced degree of NTC in these measurements has been about 9.9% (triple based measure) on average. The studies are based on the data provided by [14], where the connectivity (node pairs which are able to ping each other) of currently about 300 nodes is collected daily. In this contribution, further node connectivity experiments have been done. On one hand, a more current view of the PlanetLab situation has been gathered. This is important, because the PlanetLab network grows constantly. In our experiments a total number of 390 usable nodes were identified. On the other hand, the node transitivity $q$ has been calculated from the gathered data, because $q$ and especially the distribution $P_x(q)$ cannot directly be derived from the NTC percentage. The set of 390 PlanetLab nodes used in the experiment had connectivity to one node at the University of Passau (UP). This node was also used as a bootstrap node in the later CHORD experiments.

Every PlanetLab node attempted to establish a connection to all other nodes via the secure shell (ssh) application. According to PlanetLab conventions, ssh ports are always open at every node. Thus, if a connection to the ssh port times out, it is either due to NTC or to congestion. To minimize the probability of a failure due to congestion, a high timeout value of about 60 seconds has been chosen.

Figure 6: Node transitivity in PlanetLab. (Plot of the number of PlanetLab nodes over the node transitivity q, with the average marked.)

The measured node transitivity $q$ for every PlanetLab node is shown in Figure 6. No PlanetLab node had a node transitivity lower than 0.7, whereas the average node transitivity was about 0.96. According to this, a 4% degree of NTC (node pair based measure) has been calculated from the collected data. Applying the lowest node transitivity of 0.7 to the graph from Figure 5 yields an upper bound of 0 non-balanceable SRs and about 0.41 load-balancing overhead messages per indirect SR.

In Figure 6 four different clusters can be seen, containing nodes with nearly similar node transitivity. Together with results of previous measurements this indicates that there is no trivial (closed formula) probability distribution approximating $P_x(q)$ in general. As a consequence NTC modeling of a particular network has to be supported by measurements.

In a second experiment the CHORD implementation described in Section 3 was tested in PlanetLab. After the bootstrapping and stabilizing of the DHT, a snapshot of all SRs in the system has been made. First, the effectiveness of the proposed solution has been confirmed. All nodes successfully established connections to their fingers. Approximately 3% of the direct logical neighbors (successors and predecessors in CHORD) had indirect routes. These connections would not have been operable without the suggested modifications and would have caused severe problems. Generally, about 97% of all routes in the system were direct, complying with the 96% measured average node transitivity. Only two of all indirect SRs had more than one IN, proving the effectiveness of the suggested route-shortening and load-balancing mechanisms. The effectiveness of the suggested load-balancing mechanism in PlanetLab is shown in Figure 7.

Figure 7: Load distribution in PlanetLab.

It can be observed that only 3 nodes carry the load of three bypassing SRs. 14 nodes carry two SRs, 62 nodes are involved in a single SR and the rest is never used as IN. Only about 4.3% of the nodes carry load that could be further balanced, but there is no node involved in more than three SRs.

5 Related Work

Structured P2P overlays such as DHTs [7, 10, 13] implicitly rely on the assumption of connectivity between all pairs of overlay nodes. However, evaluations among the PlanetLab hosts, for instance, revealed about 9.9% of node triples with NTC [5]. As a reason for that level of NTC in PlanetLab, a division into three classes of nodes was identified: Nodes of the first two classes have only connectivity to nodes within their class and to nodes in the third class, whereas the nodes of the third class have connectivity to nodes of all classes [5]. Based on the data provided in [14], transient routing problems were also found to be a main source for NTC in PlanetLab [3].

and Coral [2] (based on Kademlia [7]) were analyzed. Isolated solutions for each of the mentioned problems were discussed. The effects of invisible nodes, for instance, can be decreased by the use of virtual coordinate systems, sending several lookup threads in parallel, or caching unreachable nodes. The inconsistent root problem can be addressed by consensus algorithms, which however are not explored well in the DHT context, and so on. The disadvantage of such separate solutions is that they do not eliminate the source of these problems, the NTC. No comprehensive solution was given to generally solve the impact of NTC on DHTs.

A technique related to source routing is applied in Gnutella [11]. Reachability of nodes is ensured there by flooding the overlay network, thus finding an indirect route to a target. However, flooding does not scale well, so this approach cannot be utilized for DHTs.

The degree of NTC in mobile environments is usually much larger than in fixed networks. The feasibility of source routing for DHTs in MANETs is demonstrated in [4]. However, the presented approach is not applicable to fixed networks as, in order to find SRs, all nodes within transmission range of a node are probed. Only in MANETs are two nodes having connectivity also physical neighbors. The number of neighbors of a node is typically quite small. In a fixed network the number of nodes a node has connectivity to is usually larger by orders of magnitude.

Another DHT implementation for fixed networks that addresses NTC issues is FreePastry [1]. It performs NTC detection by exchanging link state information among leaf set nodes. In case of detected NTC, a node randomly picks an IN from its own leaf set. Besides the disadvantage of being probabilistic, this approach is restricted to the local leaf set; all other nodes in the fingertable are not considered. The mechanism is implemented in such a way that indirect routes with more than one IN cannot be constructed.

6 Conclusion

Running DHTs in fixed networks with NTC combined with dynamic properties like churn is a difficult task; several problems have to be solved. In current DHT implementations including OpenDHT, i3, or Coral, each of the evolving problems is fixed with particular isolated solutions. In contrast to these separated algorithms, this paper has presented a general comprehensive solution to run DHTs in networks with NTC. To achieve this, an integrated set of algorithms and heuristics have been sug-
Problems arising from NTC for DHTs are extensively gested to combine the well known source routing mech-
discussed in [3]. DHTs suffer from invisible nodes, rout- anism with DHTs in fixed networks. The proposed solu-
ing loops, broken return paths, inconsistent roots, etc. In tion can easily be applied to existing and future DHT
[3], the problems for the DHT systems OpenDHT [10] protocols without loosing or changing their advanced
(based on Bamboo [9]), i3 [12] (based on CHORD [13]), properties. The suggested modifications of the DHTs
impose no additional overhead in the absence of NTC, technologies, architectures, and protocols for computer
and operate effectively and efficiently in networks with communications, pages 73–84. ACM Press, 2005.
NTC. [11] M. Ripeanu. Peer-to-peer architecture case study:
As a proof of concept, the standard CHORD DHT Gnutella network. In Proceedings of the First Inter-
national Conference on Peer-to-Peer Computing, pages
protocol has been modified to support source routing and
99–100, 2001.
has been evaluated in PlanetLab. Additionally, a new [12] I. Stoica, D. Adkins, S. Zhuang, S. Shenker, and
NTC measure and a statistical NTC model have been de- S. Surana. Internet indirection infrastructure. In SIG-
veloped to analyze the effectiveness and overhead of the COMM ’02: Proceedings of the 2002 conference on Ap-
suggested solution. Both, statistical analysis and mea- plications, technologies, architectures, and protocols for
surement have proved a very good performance and lim- computer communications, pages 73–86. ACM Press,
ited overhead. The statistical analysis has indicated that 2002.
much higher degrees of NTC than existing in PlanetLab [13] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and
are manageable with the suggested mechanisms. H. Balakrishnan. Chord: A scalable peer-to-peer lookup
Further heuristics will be evaluated in future work service for internet applications. In SIGCOMM ’01:
Proceedings of the 2001 conference on Applications,
to improve the operation of DHTs in networks under
technologies, architectures, and protocols for computer
harder NTC conditions. Apart of that, the integration of
communications, pages 149–160. ACM Press, 2001.
unidirectional connectivity into the suggested solution [14] C. Yoshikawa. PlanetLab All-Pairs Pings. http://ping
will be researched. .ececs.uc.edu/ping. (last visited: 11/29/2006).
References
[1] FreePastry release notes. http://freepastry.org/Free
Pastry/README-2.0b.html. (last visited: 11/23/2006).
[2] M. J. Freedman, E. Freudenthal, and D. Mazières. De-
mocratizing content publication with Coral. In Pro-
ceedings of the 1st USENIX Symposium on Networked
Systems Design and Implementation (NSDI ’04), pages
239–252, 2004.
[3] M. J. Freedman, K. Lakshminarayanan, S. Rhea, and
I. Stoica. Non-transitive connectivity and dhts. In
Proceedings of USENIX WORLDS 2005, pages 55–60,
2005.
[4] T. Fuhrmann. Combining virtual and physical structures
for self-organized routing. In International Workshop on
Self-Organizing Systems, pages 49–61, 2006.
[5] S. Gerding and J. Stribling. Examining the tradeoffs of
structured overlays in a dynamic non-transitive network.
6.829 fall 2003 class project. http://pdos.csail.mit.edu/
˜strib/docs/projects/networking fall2003.pdf, MIT, 2003.
[6] D. B. Johnson and D. A. Maltz. Dynamic source routing
in ad hoc wireless networks. In Imielinski and Korth,
editors, Mobile Computing, volume 353. Kluwer Aca-
demic Publishers, 1996.
[7] P. Maymounkov and D. Mazieres. Kademlia: A peer-
to-peer information system based on the xor metric.
In Peer-to-Peer Systems: First InternationalWorkshop,
IPTPS 2002. Revised Papers, volume 2429/2002, pages
53–65, 2002.
[8] L. Peterson, T. Anderson, D. Culler, and T. Roscoe. A
blueprint for introducing disruptive technology into the
Internet. SIGCOMM Comput. Commun. Rev., 33(1):59–
64, 2003.
[9] S. Rhea and D. Geels. Handling churn in a DHT. In
USENIX 2004 Annual Technical Conference, pages 127–
140, 2004.
[10] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Rat-
nasamy, S. Shenker, I. Stoica, and H. Yu. OpenDHT: a
public DHT service and its uses. In SIGCOMM ’05:
Proceedings of the 2005 conference on Applications,
