
Ch05 Naming

The document discusses naming in distributed systems, focusing on the importance of names, identifiers, and addresses for entity access. It covers various naming strategies, including flat naming, hierarchical approaches, and structured naming, along with their mechanisms and challenges. Key concepts include identifier properties, forwarding pointers, and name resolution techniques.

Uploaded by

mennatalah777

Naming

Distributed Systems
(3rd Edition)
Maarten van Steen, Andrew S. Tanenbaum

Chapter 05: Naming
Edited by: Hicham G. Elmongui

Naming: Names, identifiers, and addresses

Essence
Names are used to denote entities in a distributed system. To operate on an
entity, we need to access it at an access point. Access points are entities that
are named by means of an address.

Note
A location-independent name for an entity E is independent from the
addresses of the access points offered by E.

Identifiers

Pure name
A name that has no meaning at all; it is just a random string. Pure names can
be used for comparison only.

Identifier: A name having some specific properties
1 An identifier refers to at most one entity.
2 Each entity is referred to by at most one identifier.
3 An identifier always refers to the same entity (i.e., it is never reused).

Observation
An identifier need not necessarily be a pure name, i.e., it may have content.

Naming: Flat naming Simple solutions

Broadcasting

Broadcast the ID, requesting the entity to return its current address
Can never scale beyond local-area networks
Requires all processes to listen to incoming location requests

Address Resolution Protocol (ARP)
To find out which MAC address is associated with an IP address, broadcast the
query “who has this IP address?”
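
A minimal sketch of broadcast-based location in the spirit of ARP. The `Host` class, the `broadcast_lookup` function, and the sample addresses are illustrative assumptions and do not model a real protocol stack. The point it shows is that every host is interrupted by every request, which is why the approach does not scale beyond a local-area network.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    identifier: str  # flat identifier being searched for (e.g., an IP address)
    address: str     # current address to return (e.g., a MAC address)

    def on_request(self, wanted: str) -> Optional[str]:
        # Every host must inspect every request; only the owner answers.
        return self.address if wanted == self.identifier else None


def broadcast_lookup(network: list, wanted: str):
    """Deliver the request to all hosts; return (address, hosts interrupted)."""
    answer = None
    for host in network:
        reply = host.on_request(wanted)
        if reply is not None:
            answer = reply
    return answer, len(network)


# Hypothetical three-host LAN.
lan = [Host("10.0.0.1", "aa:aa"), Host("10.0.0.2", "bb:bb"), Host("10.0.0.3", "cc:cc")]
address, interrupted = broadcast_lookup(lan, "10.0.0.2")
```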

Forwarding pointers

When an entity moves, it leaves behind a pointer to its next location
Dereferencing can be made entirely transparent to clients by simply
following the chain of pointers
Update a client’s reference when present location is found
Geographical scalability problems (for which separate chain reduction
mechanisms are needed):
  Long chains are not fault tolerant
  Increased network latency at dereferencing

Naming: Flat naming Home-based approaches

Home-based approaches

Single-tiered scheme: Let a home keep track of where the entity is
Entity’s home address registered at a naming service
The home registers the foreign address of the entity
Client contacts the home first, and then continues with foreign location
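
The forwarding-pointer scheme can be sketched as follows. This is a simplified illustration under stated assumptions: `Location`, `ClientRef`, and `move` are hypothetical names, locations are plain in-memory objects, and chain reduction is modeled only as the client updating its own reference once the present location is found.

```python
class Location:
    """A place an entity has been: holds the entity or a forwarding pointer."""

    def __init__(self, name):
        self.name = name
        self.has_entity = False
        self.forward = None  # next location in the chain, if the entity moved on


def move(src, dst):
    # The entity leaves a pointer behind at src and settles at dst.
    src.has_entity, src.forward = False, dst
    dst.has_entity, dst.forward = True, None


class ClientRef:
    def __init__(self, location):
        self.location = location

    def dereference(self):
        """Follow the chain; return (current location, number of hops)."""
        hops, loc = 0, self.location
        while not loc.has_entity:
            if loc.forward is None:
                raise LookupError("broken chain at " + loc.name)
            loc, hops = loc.forward, hops + 1
        self.location = loc  # update the client's reference (shortcut)
        return loc, hops


a, b, c, d = (Location(n) for n in "abcd")
a.has_entity = True
ref = ClientRef(a)
move(a, b)
move(b, c)
move(c, d)
first = ref.dereference()   # walks a -> b -> c -> d
second = ref.dereference()  # reference was updated, so no hops are needed
```

A single lost intermediate location breaks the whole chain in this model, which mirrors the fault-tolerance concern listed above.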
The principle of mobile IP

[Figure: the host's home location, the client's location, and the host's current location, connected by the following steps]
1. Send packet to host at its home
2. Return address of current location
3. Tunnel packet to current location
4. Send successive packets to current location

Home-based approaches

Problems with home-based approaches
Home address has to be supported for entity’s lifetime
Home address is fixed ⇒ unnecessary burden when the entity
permanently moves
Poor geographical scalability (entity may be next to client)

Note
Permanent moves may be tackled with another level of naming (DNS)
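
A rough sketch of the home-based scheme, assuming hypothetical `HomeAgent` and `Client` classes and string addresses. It simplifies the mobile IP steps: the tunneling of the first packet is not modeled, and the client simply caches the returned address for successive packets.

```python
class HomeAgent:
    def __init__(self, home_address):
        self.home_address = home_address
        self.care_of = None  # foreign address, set while the entity is away

    def register(self, foreign_address):
        self.care_of = foreign_address

    def locate(self):
        return self.care_of or self.home_address


class Client:
    def __init__(self, home):
        self.home = home
        self.cached = None
        self.home_contacts = 0

    def send(self, payload):
        if self.cached is None:
            # The first packet always involves the home location.
            self.home_contacts += 1
            self.cached = self.home.locate()
        return (self.cached, payload)


home = HomeAgent("home-addr")
home.register("foreign-addr")
client = Client(home)
first = client.send("p1")
second = client.send("p2")
```

The home is contacted even when the entity happens to be next to the client, which is the geographical scalability problem noted above.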

Naming: Flat naming Distributed hash tables

Illustrative: Chord

Consider the organization of many nodes into a logical ring
Each node is assigned a random m-bit identifier.
Every entity is assigned a unique m-bit key.
Entity with key k falls under jurisdiction of node with smallest id ≥ k
(called its successor succ(k)).

Nonsolution
Let each node keep track of its neighbor and start linear search along the ring.

Main Issue in DHT-based Systems
To efficiently resolve a key k to the address of succ(k).

Chord lookup example

Resolving key 26 from node 1 and key 12 from node 28

[Figure: Chord ring with identifiers 0 to 31 and actual nodes 1, 4, 9, 11, 14, 18, 20, 21, and 28. Entry i of the finger table of node p is succ(p + 2^(i-1)).]

Finger tables
Node   i=1  i=2  i=3  i=4  i=5
1      4    4    9    9    18
4      9    9    9    14   20
9      11   11   14   18   28
11     14   14   18   20   28
14     18   18   18   28   1
18     20   20   28   28   4
20     21   28   28   28   4
21     28   28   28   1    9
28     1    1    1    4    14
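
The lookup example can be reproduced with a short sketch. It assumes m = 5 and the node set shown in the figure, computes each finger table as succ(p + 2^(i-1)), and forwards a request to the farthest finger that still precedes the key. The function names (`succ`, `finger_table`, `between`, `lookup`) are illustrative, and the whole ring lives in one process, so this is a model of the routing rule rather than a distributed implementation.

```python
M = 5
RING = 2 ** M
NODES = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])


def succ(k):
    """Node with the smallest id >= k, wrapping around the ring."""
    k %= RING
    for n in NODES:
        if n >= k:
            return n
    return NODES[0]


def finger_table(p):
    return [succ(p + 2 ** (i - 1)) for i in range(1, M + 1)]


def between(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    x, a, b = x % RING, a % RING, b % RING
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(start, key):
    """Return the list of nodes visited while resolving key from start."""
    key %= RING
    path, p = [start], start
    while p != key:
        fingers = finger_table(p)
        nxt = fingers[0]
        if key == nxt or between(key, p, nxt):
            path.append(nxt)  # key in (p, successor]: the successor is responsible
            break
        # Otherwise forward to the farthest finger that still precedes the key.
        for f in reversed(fingers):
            if between(f, p, key):
                nxt = f
                break
        path.append(nxt)
        p = nxt
    return path


path_26 = lookup(1, 26)   # [1, 18, 20, 21, 28]
path_12 = lookup(28, 12)  # [28, 4, 9, 11, 14]
```

Each hop roughly halves the remaining distance on the ring, in contrast to the linear neighbor-by-neighbor search named as a nonsolution.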

Naming: Flat naming Hierarchical approaches

Hierarchical Location Services (HLS)

Basic idea
Build a large-scale search tree for which the underlying network is divided into
hierarchical domains. Each domain is represented by a separate directory
node.

Principle
[Figure: the root directory node dir(T) of top-level domain T; the directory node dir(S) of domain S, a subdomain of T (S is contained in T); and a leaf domain contained in S.]

HLS: Tree organization

Invariants
Address of entity E is stored in a leaf or intermediate node
Intermediate nodes contain a pointer to a child if and only if the subtree
rooted at the child stores an address of the entity
The root knows about all entities

Storing information of an entity having two addresses in different leaf domains
[Figure: the location record for E at node M has one field per child domain. A field for domain dom(N) holds a pointer to N; other fields hold no data. The location records in leaf domains D1 and D2 have only one field, containing an address.]
HLS: Lookup operation

Basic principles
Start lookup at local leaf node
Node knows about E ⇒ follow downward pointer, else go up
Upward lookup always stops at root

Looking up a location
[Figure: a look-up request issued in domain D. A node that has no record for E forwards the request to its parent. Node M knows about E, so the request is forwarded to a child.]

HLS: Insert operation

(a) An insert request is forwarded to the first node that knows about entity E.
(b) A chain of forwarding pointers to the leaf node is created

[Figure: (a) an insert request issued in domain D. A node that has no record for E forwards the request to its parent. Node M knows about E, so the request is no longer forwarded. (b) Each node on the way down creates a record and stores a pointer; the leaf node creates a record and stores the address.]
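
The HLS lookup and insert operations can be sketched on a small in-memory tree. The `DirNode` class, the domain names, and the address strings are assumptions made for illustration. Insertion is implemented bottom-up here, which yields the same records as forwarding the request upward first and then creating the pointer chain downward.

```python
class DirNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.address = {}   # entity -> address (stored at a leaf in this sketch)
        self.pointers = {}  # entity -> children whose subtree stores an address

    def knows(self, entity):
        return entity in self.address or entity in self.pointers


def insert(leaf, entity, addr):
    leaf.address[entity] = addr
    node = leaf
    while node.parent is not None:
        parent = node.parent
        already_known = parent.knows(entity)
        parent.pointers.setdefault(entity, set()).add(node)
        if already_known:
            break  # higher nodes already have a pointer toward this subtree
        node = parent


def lookup(leaf, entity):
    """Return (address or None, names of the nodes visited)."""
    path, node = [leaf.name], leaf
    while not node.knows(entity):      # go up until some node knows E
        if node.parent is None:
            return None, path          # the upward search stops at the root
        node = node.parent
        path.append(node.name)
    while entity not in node.address:  # then follow pointers downward
        node = min(node.pointers[entity], key=lambda n: n.name)
        path.append(node.name)
    return node.address[entity], path


root = DirNode("root")
s1, s2 = DirNode("S1", root), DirNode("S2", root)
d1, d2, d3 = DirNode("D1", s1), DirNode("D2", s1), DirNode("D3", s2)
insert(d1, "E", "addr@D1")
near = lookup(d2, "E")  # resolved inside S1 without reaching the root
far = lookup(d3, "E")   # must climb to the root first
```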

Naming: Structured naming Name spaces

Name space

Naming graph
A graph in which a leaf node represents a (named) entity. A directory node is
an entity that refers to other nodes.

A general naming graph with a single root node
[Figure: root node n0 has edges home (to n1) and keys (to n5). Data stored in n1: n2: "elke", n3: "max", n4: "steen". Node n4 has edges .procmail, mbox, and keys. Path names shown: "/keys", "/home/steen/keys", and "/home/steen/mbox". The legend distinguishes leaf nodes from directory nodes.]

Note
A directory node contains a table of (node identifier, edge label) pairs.

Name space

We can easily store all kinds of attributes in a node
Type of the entity
An identifier for that entity
Address of the entity’s location
Nicknames
...

Note
Directory nodes can also have attributes, besides just storing a directory table
with (identifier, label) pairs.
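
Path-name resolution over a naming graph can be sketched with one dictionary per directory node. The graph below follows the figure; the identifiers `procmail_leaf` and `mbox_leaf` are hypothetical, since the figure does not name those leaf nodes, and the edge from n4 to n5 labeled keys is read off the path name "/home/steen/keys".

```python
# Directory tables: node identifier -> {edge label: node identifier}.
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"elke": "n2", "max": "n3", "steen": "n4"},
    "n4": {".procmail": "procmail_leaf", "mbox": "mbox_leaf", "keys": "n5"},
}


def resolve(path, root="n0"):
    """Follow the labels of an absolute path name, starting at the root node."""
    node = root
    for label in [part for part in path.split("/") if part]:
        table = GRAPH.get(node)
        if table is None or label not in table:
            raise KeyError("cannot resolve %r at node %s" % (label, node))
        node = table[label]
    return node


keys_short = resolve("/keys")
keys_long = resolve("/home/steen/keys")
```

Both path names lead to the same node, so in this graph n5 has two names.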

Naming: Structured naming Name resolution

Name resolution

Problem
To resolve a name we need a directory node. How do we actually find that
(initial) node?

Closure mechanism: The mechanism to select the implicit context from which
to start name resolution
[Link]: start at a DNS name server
/home/maarten/mbox: start at the local NFS file server (possible recursive
search)
0031 20 598 7784: dial a phone number
[Link]: route message to a specific IP address

Note
You cannot have an explicit closure mechanism – how would you start?

Name linking

Hard link
What we have described so far as a path name: a name that is resolved by
following a specific path in a naming graph from one node to another.

Soft link: Allow a node N to contain a name of another node
First resolve N’s name (leading to N)
Read the content of N, yielding name
Name resolution continues with name

Observations
The name resolution process determines that we read the content of a
node, in particular, the name in the other node that we need to go to.
One way or the other, we know where and how to start name resolution
given name


Name linking

The concept of a symbolic link explained in a naming graph
[Figure: the naming graph with root n0 (edges home to n1 and keys to n5) and data stored in n1: n2: "elke", n3: "max", n4: "steen". The path name "/home/steen/keys" now leads to a node n6. Data stored in n6: "/keys".]

Observation
Node n5 has only one name

Mounting

Issue
Name resolution can also be used to merge different name spaces in a
transparent way through mounting: associating a node identifier of another
name space with a node in a current name space.

Terminology
Foreign name space: the name space that needs to be accessed
Mount point: the node in the current name space containing the node
identifier of the foreign name space
Mounting point: the node in the foreign name space where to continue
name resolution

Mounting across a network
1 The name of an access protocol.
2 The name of the server.
3 The name of the mounting point in the foreign name space.
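
The three soft-link steps (resolve N's name, read the content of N, continue with that name) can be sketched as below. The graph is a trimmed version of the symbolic-link figure, `SYMLINKS` and `max_links` are illustrative assumptions, and only absolute link targets are handled.

```python
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"elke": "n2", "max": "n3", "steen": "n4"},
    "n4": {"keys": "n6"},
}
SYMLINKS = {"n6": "/keys"}  # node n6 stores a name instead of an entity


def split(path):
    return [part for part in path.split("/") if part]


def resolve(path, max_links=8):
    node, labels, followed = "n0", split(path), 0
    while labels:
        node = GRAPH[node][labels.pop(0)]
        if node in SYMLINKS:
            followed += 1
            if followed > max_links:
                raise RuntimeError("too many levels of symbolic links")
            # Read the stored name and continue resolution with it,
            # restarting at the root because the stored name is absolute.
            labels = split(SYMLINKS[node]) + labels
            node = "n0"
    return node


target = resolve("/home/steen/keys")
```

The limit on followed links is a common safeguard against cyclic links; it is not something the slides prescribe.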

Mounting in distributed systems

Mounting remote name spaces through a specific access protocol
[Figure: a name server on Machine A and a name server for the foreign name space on Machine B, connected by a network. On Machine A, the node reached via remote and vu stores the reference to the foreign name space "nfs://[Link]/home/steen". On Machine B, the foreign name space contains home, steen, mbox, and keys.]

Naming: Structured naming The implementation of a name space

Name-space implementation

Basic issue
Distribute the name resolution process as well as name space management
across multiple machines, by distributing nodes of the naming graph.

Distinguish three levels
Global level: Consists of the high-level directory nodes. Main aspect is
that these directory nodes have to be jointly managed by different
administrations
Administrational level: Contains mid-level directory nodes that can be
grouped in such a way that each group can be assigned to a separate
administration.
Managerial level: Consists of low-level directory nodes within a single
administration. Main issue is effectively mapping directory nodes to local
name servers.

Name-space implementation

An example partitioning of the DNS name space, including network files
[Figure: part of the DNS tree divided into a global layer (top-level domains such as com, edu, gov, mil, org, net, jp, us, nl), an administrational layer, and a managerial layer, with one zone marked.]

Name-space implementation

A comparison between name servers for implementing nodes in a name space

Item                     Global     Administrational  Managerial
1: Geographical scale    Worldwide  Organization      Department
2: # Nodes               Few        Many              Vast numbers
3: Responsiveness        Seconds    Milliseconds      Immediate
4: Update propagation    Lazy       Immediate         Immediate
5: # Replicas            Many       None or few       None
6: Client-side caching?  Yes        Yes               Sometimes


Iterative name resolution

Principle
1 resolve(dir, [name1, ..., nameK]) sent to Server0 responsible for dir
2 Server0 resolves resolve(dir, name1) → dir1, returning the identification
(address) of Server1, which stores dir1.
3 Client sends resolve(dir1, [name2, ..., nameK]) to Server1, etc.

[Figure: the client's name resolver receives [nl,vu,cs,ftp] and returns #[nl,vu,cs,ftp]. Messages exchanged:]
1. [nl,vu,cs,ftp]        resolver to root name server
2. #[nl], [vu,cs,ftp]    root name server to resolver
3. [vu,cs,ftp]           resolver to name server nl node
4. #[vu], [cs,ftp]       name server nl node to resolver
5. [cs,ftp]              resolver to name server vu node
6. #[cs], [ftp]          name server vu node to resolver
7. [ftp]                 resolver to name server cs node
8. #[ftp]                name server cs node to resolver
Nodes cs and ftp are managed by the same server

Recursive name resolution

Principle
1 resolve(dir, [name1, ..., nameK]) sent to Server0 responsible for dir
2 Server0 resolves resolve(dir, name1) → dir1, and sends
resolve(dir1, [name2, ..., nameK]) to Server1, which stores dir1.
3 Server0 waits for result from Server1, and returns it to client.

[Figure: the client's name resolver receives [nl,vu,cs,ftp] and returns #[nl,vu,cs,ftp]. Messages exchanged:]
1. [nl,vu,cs,ftp]        resolver to root name server
2. [vu,cs,ftp]           root name server to name server nl node
3. [cs,ftp]              name server nl node to name server vu node
4. [ftp]                 name server vu node to name server cs node
5. #[ftp]                name server cs node to name server vu node
6. #[cs,ftp]             name server vu node to name server nl node
7. #[vu,cs,ftp]          name server nl node to root name server
8. #[nl,vu,cs,ftp]       root name server to resolver
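
The difference between the two styles can be made concrete by counting who sends the requests. This sketch assumes a hypothetical `SERVERS` table with one directory node per name server and uses strings such as "#[nl]" as stand-ins for addresses. It does not model caching or network delay.

```python
# Each server stores one directory node: label -> (address, server of that child).
SERVERS = {
    "root": {"nl": ("#[nl]", "nl")},
    "nl": {"vu": ("#[vu]", "vu")},
    "vu": {"cs": ("#[cs]", "cs")},
    "cs": {"ftp": ("#[ftp]", None)},
}


def iterative(names):
    """The client contacts each server in turn and drives the resolution."""
    stats = {"client_requests": 0, "server_requests": 0}
    server, address = "root", None
    for name in names:
        stats["client_requests"] += 1
        address, server = SERVERS[server][name]
    return address, stats


def recursive(names):
    """The client asks the root once; servers pass the remainder along."""
    stats = {"client_requests": 1, "server_requests": 0}

    def serve(server, remaining):
        address, child = SERVERS[server][remaining[0]]
        if len(remaining) == 1:
            return address
        stats["server_requests"] += 1
        return serve(child, remaining[1:])

    return serve("root", list(names)), stats


path = ["nl", "vu", "cs", "ftp"]
it_result = iterative(path)
rec_result = recursive(path)
```

Both styles produce the same answer; recursion moves the work from the client to the name servers.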

Caching in recursive name resolution

Recursive name resolution of [nl, vu, cs, ftp]

Server    Should             Looks up  Passes to      Receives        Returns
for node  resolve                      child          and caches      to requester
cs        [ftp]              #[ftp]    -              -               #[ftp]
vu        [cs, ftp]          #[cs]     [ftp]          #[ftp]          #[cs]
                                                                      #[cs, ftp]
nl        [vu, cs, ftp]      #[vu]     [cs, ftp]      #[cs]           #[vu]
                                                      #[cs, ftp]      #[vu, cs]
                                                                      #[vu, cs, ftp]
root      [nl, vu, cs, ftp]  #[nl]     [vu, cs, ftp]  #[vu]           #[nl]
                                                      #[vu, cs]       #[nl, vu]
                                                      #[vu, cs, ftp]  #[nl, vu, cs]
                                                                      #[nl, vu, cs, ftp]

Naming: Attribute-based naming Directory services

Attribute-based naming

Observation
In many cases, it is much more convenient to name and look up entities by
means of their attributes ⇒ traditional directory services (aka yellow pages).

Problem
Lookup operations can be extremely expensive, as they require matching
requested attribute values against actual attribute values ⇒ inspect all entities
(in principle).

Naming: Attribute-based naming Hierarchical implementations: LDAP

Implementing directory services

Solution for scalable searching
Implement basic directory service as database, and combine with traditional
structured naming system.

Lightweight Directory Access Protocol (LDAP)
Each directory entry consists of (attribute, value) pairs, and is uniquely named
to ease lookups.

Attribute           Abbr.  Value
Country             C      NL
Locality            L      Amsterdam
Organization        O      VU University
OrganizationalUnit  OU     Computer Science
CommonName          CN     Main server
Mail Servers        –      [Link], [Link], [Link]
FTP Server          –      [Link]
WWW Server          –      [Link]

LDAP

Essence
Directory Information Base: collection of all directory entries in an LDAP
service.
Each record is uniquely named as a sequence of naming attributes
(called Relative Distinguished Name), so that it can be looked up.
Directory Information Tree: the naming graph of an LDAP directory
service; each node represents a directory entry.

Part of a directory information tree
[Figure: a path C = NL, O = VU University, OU = Computer Science, CN = Main server leading to node N, which has two children: HostName = star and HostName = zephyr.]
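
To illustrate why attribute-based lookup is expensive in principle, the sketch below matches a request against every directory entry. It does not use a real LDAP client library: `ENTRIES` and `search` are hypothetical, the wildcard matching relies on Python's `fnmatch`, and the third entry is invented only to show a non-matching case.

```python
from fnmatch import fnmatchcase

BASE = {"C": "NL", "O": "VU University", "OU": "Computer Science"}
ENTRIES = [
    dict(BASE, CN="Main server", HostName="star"),
    dict(BASE, CN="Main server", HostName="zephyr"),
    dict(BASE, CN="Backup server", HostName="hypothetical-host"),  # made up
]


def search(conditions):
    """Return all entries whose attributes match every (attribute, pattern)."""
    hits = []
    for entry in ENTRIES:  # inspects all entries, as noted under 'Problem'
        if all(attr in entry and fnmatchcase(entry[attr], pattern)
               for attr, pattern in conditions.items()):
            hits.append(entry)
    return hits


result = search({"C": "NL", "O": "VU University", "OU": "*", "CN": "Main server"})
hosts = [entry["HostName"] for entry in result]
```

Organizing the entries in a tree keyed by naming attributes lets a search descend along C, O, OU, and CN instead of scanning every entry.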
LDAP

Two directory entries having HostName as RDN

Attribute           Value (first entry)  Value (second entry)
Locality            Amsterdam            Amsterdam
Organization        VU University        VU University
OrganizationalUnit  Computer Science     Computer Science
CommonName          Main server          Main server
HostName            star                 zephyr
HostAddress         [Link]               137.37.20.10

Result of search("(C=NL)(O=VU University)(OU=*)(CN=Main server)")

Naming: Attribute-based naming Decentralized implementations

Distributed index

Basic idea
Assume a set of attributes {a^1, ..., a^N}
Each attribute a^k takes values from a set R^k
For each attribute a^k associate a set S^k = {S^k_1, ..., S^k_{n_k}} of n_k servers
Global mapping F: F(a^k, v) = S^k_j with S^k_j ∈ S^k and v ∈ R^k

Observation
If L(a^k, v) is set of keys returned by F(a^k, v), then a query can be formulated
as a logical expression, e.g.,

(F(a^1, v^1) ∧ F(a^2, v^2)) ∨ F(a^3, v^3)

which can be processed by the client by constructing the set

(L(a^1, v^1) ∩ L(a^2, v^2)) ∪ L(a^3, v^3)
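
A small sketch of the distributed index, under several assumptions: index servers are in-memory objects, the mapping F picks a server by hashing the value with `zlib.crc32`, and the attribute names, values, and entity keys are made up for illustration. The client evaluates the logical expression with ordinary set operations.

```python
import zlib


class IndexServer:
    def __init__(self):
        self.table = {}  # (attribute, value) -> set of entity keys


# One group of servers per attribute (S^k in the notation above).
SERVERS = {
    "lastName": [IndexServer(), IndexServer()],
    "firstName": [IndexServer()],
    "city": [IndexServer()],
}


def F(attribute, value):
    """Global mapping: pick the server responsible for (attribute, value)."""
    group = SERVERS[attribute]
    return group[zlib.crc32(value.encode()) % len(group)]


def publish(key, attribute, value):
    F(attribute, value).table.setdefault((attribute, value), set()).add(key)


def L(attribute, value):
    """Set of keys returned by the server F(attribute, value)."""
    return set(F(attribute, value).table.get((attribute, value), set()))


for key in ("e1", "e2", "e3"):
    publish(key, "lastName", "Smith")
publish("e2", "firstName", "Pheriby")
publish("e4", "city", "Amsterdam")

# (lastName = Smith AND firstName = Pheriby) OR city = Amsterdam
result = (L("lastName", "Smith") & L("firstName", "Pheriby")) | L("city", "Amsterdam")
```

The client receives all three keys for the frequent value before the intersection discards most of them, which is one cost of evaluating the query on the client side.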

Drawbacks of distributed index

Quite a few
A query involving k attributes requires contacting k servers
Imagine looking up “lastName = Smith ∧ firstName = Pheriby”: the client
may need to process many files as there are so many people named
“Smith.”
No (easy) support for range queries, such as “price = [1000 − 2500].”

Alternative: map all attributes to 1 dimension and then index

Space-filling curves: principle
1 Map the N-dimensional space covered by the N attributes {a^1, ..., a^N}
into a single dimension
2 Hashing values in order to distribute the 1-dimensional space among
index servers.

Hilbert space-filling curve of (a) order 1, and (b) order 4
[Figure: both panels plot values of attribute #1 (horizontal axis) against values of attribute #2 (vertical axis). Panel (a) divides the unit square into four subsquares: Index 0 (bottom left), Index 1 (top left), Index 2 (top right), and Index 3 (bottom right). Panel (b) uses axis ticks in steps of 2/16.]

Space-filling curve

Once the curve has been drawn
Consider the two-dimensional case
A Hilbert curve of order k connects 2^(2k) subsquares ⇒ has 2^(2k) indices.
A range query corresponds to a rectangle R in the 2-dimensional case
R intersects with a number of subsquares, each one corresponding to an
index ⇒ we now have a series of indices associated with R.

Getting to the entities
Each index is to be mapped to a server, which keeps a reference to the
associated entity. One possible solution: use a DHT.
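
The mapping from subsquares to indices can be sketched with a commonly used iterative coordinate-to-index conversion for the Hilbert curve. The function names and the choice of curve orientation are assumptions; for order 1 this orientation happens to agree with panel (a) of the figure. Subsquares are addressed by integer grid coordinates, with (0, 0) at the bottom left.

```python
def hilbert_index(order, x, y):
    """Index of subsquare (x, y) on a Hilbert curve over a 2^order grid."""
    n = 2 ** order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has a standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def range_query(order, x_lo, x_hi, y_lo, y_hi):
    """Indices of all subsquares intersected by the rectangle R (inclusive)."""
    return sorted(
        hilbert_index(order, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )


order1 = [hilbert_index(1, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]]
indices = range_query(2, 1, 2, 1, 2)  # the central 2 x 2 block of a 4 x 4 grid
```

The indices for a compact rectangle are not contiguous in general, so a range query may need to contact several index servers.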
