Next-gen Network Telemetry is
Within Your Packets: In-band OAM
Frank Brockners, Shwetha Bhandari
Continuous & Always-On On Demand
Checking Health and Compliance
© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public
In-Band OAM Solution Ecosystem
A Peek at the Implementation
In-Band OAM – The Technology
Why In-Band OAM? – Use-Case Examples
Prologue: Does your traffic comply?
Consider TE, Service Chaining, Policy Based Routing, etc...:
“How do you prove that traffic follows the suggested path?”
“… Willis says the first issue is connecting VNFs to the infrastructure.
OpenStack does this in a sequential manner, with the sequence serially
numbered in the VNF, but the difficulty comes when trying to verify that
the LAN has been connected to the correct LAN port, the WAN has been
connected to the correct WAN port and so on. "If we get this wrong for a
firewall function it could be the end of a CIO's career," says Willis.”
http://www.lightreading.com/nfv/nfv-specs-open-source/bt-threatens-to-ditch-openstack/d/d-id/718735
October 14, 2015
Light Reading, citing Peter Willis, Chief Researcher for Data Networks, British Telecom
Ensuring Path and/or Service Chain Integrity
Approach
• Meta-data added to all user traffic
• Based on “Share of a secret”
• Provisioned by controller over
secure channel
• Updated at every service hop
• Verifier checks whether
collected meta-data allows
retrieval of secret
• Path verified
[Figure: Controller holding the secret; nodes A, B, C, and X; Verifier]
Initial Idea: Combine multiple secrets
“Compose the Onion”
• Approach
• A service is described by a set of secrets, where
each secret is associated with a service function.
Service functions encrypt portions of the meta-data
as part of their packet processing.
• Only the verifying node has access to all secrets. The
verifying node re-encrypts the meta-data to validate
whether the packet correctly traversed the service
chain.
• Notes
• To be used only when hardware-assisted encryption
is available, i.e. AES-NI instructions or equivalent.
Otherwise verification at line speed could be a very
costly operation.
“S1”
“S2”
“S3”
Service-Secrets are nested
like layers of an onion
Solution Approach: Leveraging Shamir’s Secret Sharing
Polynomials 101
- Line: Min 2 points
- Parabola: Min 3 points
- Cubic function: Min 4 points
General: It takes k+1 points to define a polynomial of degree k.
Adi Shamir
Solution Approach: Leverage Shamir’s Secret Sharing
“A polynomial as secret”
• Each service is given a point
on the curve
• When the packet travels through
each service it collects these
points
• A verifier can reconstruct the curve
using the collected points
• Operations done over a finite field
(mod prime) to protect against
differential analysis
[Figure: nodes A, B, C, and X with the points (1,16), (2,28), and (3,46) on the curve; the Verifier reconstructs the polynomial]
“Secret”: $3x^2 + 3x + 10$
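A minimal sketch of the reconstruction step, using the three example points from the slide. Lagrange interpolation recovers the constant term of the curve, here 10. The function name is illustrative, and exact rational arithmetic stands in for the finite-field (mod prime) arithmetic the slide calls for.

```python
from fractions import Fraction


def lagrange_constant_term(points):
    """Return f(0) for the unique lowest-degree polynomial through the points."""
    result = Fraction(0)
    for i, (x_i, y_i) in enumerate(points):
        term = Fraction(y_i)
        for j, (x_j, _) in enumerate(points):
            if i != j:
                # Lagrange basis factor evaluated at x = 0.
                term *= Fraction(0 - x_j, x_i - x_j)
        result += term
    return result


# Points collected along the path, as shown on the slide.
points = [(1, 16), (2, 28), (3, 46)]
print(lagrange_constant_term(points))      # constant term of 3x^2 + 3x + 10
print(lagrange_constant_term(points[:2]))  # two points are not enough
```

With only two of the three points the interpolation yields a different value, which is why a packet that skipped a service cannot reproduce the curve.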
Operationalizing the Solution
• Leverage two polynomials:
• POLY-1 secret, constant: Each hop gets a point on POLY-1.
Only the verifier knows POLY-1.
• POLY-2 public, random and per packet.
Each hop generates a point on POLY-2 each time a packet
crosses it.
• Each service function calculates (Point on POLY-1 + Point on
POLY-2) to get (Point on POLY-3) and passes it to verifier by
adding it to each packet.
• The verifier constructs POLY-3 from the points given by all the
services and cross checks whether POLY-3 = POLY-1 + POLY-2
• Computationally efficient:
2 additions, 1 multiplication, mod prime per hop
POLY-1 (Secret – Constant) + POLY-2 (Public – Per Packet) = POLY-3 (Secret – Per Packet)
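The two-polynomial idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the prime 53 and the public POLY-2 coefficients are hypothetical, POLY-1 is the example curve from the earlier slide, and every hop hands its POLY-3 point to the verifier directly.

```python
import secrets

PRIME = 53            # hypothetical small prime, for illustration only
POLY1 = [10, 3, 3]    # secret 3x^2 + 3x + 10, constant term first
POLY2_HIGHER = [7, 5]  # hypothetical public x and x^2 coefficients of POLY-2


def evaluate(coeffs, x):
    """Evaluate a polynomial (constant term first) at x modulo PRIME."""
    result = 0
    for coefficient in reversed(coeffs):
        result = (result * x + coefficient) % PRIME
    return result


def interpolate_at_zero(points):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    total = 0
    for i, (x_i, y_i) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -x_j) % PRIME
                denominator = (denominator * (x_i - x_j)) % PRIME
        total = (total + y_i * numerator * pow(denominator, -1, PRIME)) % PRIME
    return total


def hop_point(x, share_y, rnd):
    """A hop adds its POLY-1 share to its per-packet POLY-2 point."""
    poly2 = [rnd] + POLY2_HIGHER
    return x, (share_y + evaluate(poly2, x)) % PRIME


# Controller: hand each hop its point on POLY-1.
shares = {x: evaluate(POLY1, x) for x in (1, 2, 3)}
# Per packet: a fresh random constant term for POLY-2.
rnd = secrets.randbelow(PRIME)
points = [hop_point(x, y, rnd) for x, y in shares.items()]
# Verifier: POLY-3(0) must equal POLY-1(0) + POLY-2(0).
expected = (POLY1[0] + rnd) % PRIME
print(interpolate_at_zero(points) == expected)
```

Because POLY-2 changes with every packet, the points a hop emits differ from packet to packet even though its POLY-1 share stays constant.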
Meta Data for Service/Path Verification
• Verification secret is the independent
coefficient of POLY-1
• Computation/retrieval through a cumulative
computation at every hop (“cumulative”)
• For POLY-2 the independent coefficient is
carried within the packet (typically a
combination of timestamp and random
number)
• n bits can service a maximum of 2^n packets
• Verification secret and POLY-2 coefficient
(“random”) are of the same size
• Secret size is bounded by the prime number
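A sketch of how the cumulative value could be updated hop by hop so that the verifier only compares one number. It assumes each hop is provisioned with a precomputed Lagrange constant for x = 0; the prime, the public POLY-2 coefficients, the hop indices, and all names are hypothetical.

```python
PRIME = 53                # hypothetical prime; a real one would be much larger
POLY1 = [10, 3, 3]        # secret polynomial, constant term (the secret) first
POLY2_PUBLIC = [0, 7, 5]  # hypothetical public part; RND supplies the constant
HOP_XS = [1, 2, 3]        # x-coordinate (service index) assigned to each hop


def evaluate(coeffs, x):
    """Evaluate a polynomial (constant term first) at x modulo PRIME."""
    result = 0
    for coefficient in reversed(coeffs):
        result = (result * x + coefficient) % PRIME
    return result


def lagrange_constant(x_i):
    """Lagrange basis polynomial of hop x_i evaluated at 0, modulo PRIME."""
    numerator, denominator = 1, 1
    for x_j in HOP_XS:
        if x_j != x_i:
            numerator = (numerator * -x_j) % PRIME
            denominator = (denominator * (x_i - x_j)) % PRIME
    return numerator * pow(denominator, -1, PRIME) % PRIME


class Hop:
    def __init__(self, x):
        # Both values are provisioned once, before any packet arrives.
        self.base = (evaluate(POLY1, x) + evaluate(POLY2_PUBLIC, x)) % PRIME
        self.lpc = lagrange_constant(x)

    def update(self, rnd, cumulative):
        # Per packet: two additions, one multiplication, one reduction.
        return (cumulative + (self.base + rnd) * self.lpc) % PRIME


def verify(rnd, cumulative):
    """The verifier knows the secret and checks cumulative == secret + rnd."""
    return cumulative == (POLY1[0] + rnd) % PRIME


hops = [Hop(x) for x in HOP_XS]
rnd, cumulative = 45, 0   # example per-packet random value
for hop in hops:
    cumulative = hop.update(rnd, cumulative)
print(verify(rnd, cumulative))

# A packet that bypasses the second hop fails verification for this rnd.
skipped = hops[2].update(rnd, hops[0].update(rnd, 0))
print(verify(rnd, skipped))
```

The per-hop work matches the cost stated on the previous slide: everything that depends only on provisioning is computed ahead of time.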
Transfer Rate   RND/Secret Size (bits)   Max # of packets (assuming 64 byte packets)   Time that “random” lasts at maximum
1 Gbps          64                       2^64
10 Gbps         64                       2^64
100 Gbps        64                       2^64
10 Gbps         56                       2^56
10 Gbps         48                       2^48
10 Gbps         40                       2^40
1 Gbps          32                       2^32                                          2200 seconds, 36 minutes
10 Gbps         32                       2^32                                          220 seconds, 3.5 minutes
100 Gbps        32                       2^32                                          22 seconds
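The lifetime figures in the table follow from simple arithmetic: 2^n values divided by the packet rate of back-to-back 64-byte packets. A short sketch (the function name is illustrative, and framing overhead is ignored):

```python
def random_lifetime_seconds(rate_bps, rnd_bits, packet_bytes=64):
    """Seconds until 2**rnd_bits per-packet values are used up at line rate."""
    packets_per_second = rate_bps / (packet_bytes * 8)
    return 2 ** rnd_bits / packets_per_second


for rate_bps in (1e9, 10e9, 100e9):
    seconds = random_lifetime_seconds(rate_bps, 32)
    print(f"{rate_bps / 1e9:.0f} Gbps, 32 bits: {seconds:.0f} s")
```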
Meta-Data Provisioning
• Meta-Data for Service Chain Verification
(POT) provisioned through a controller
(OpenDaylight App)
• Netconf/YANG based protocol
• Provisioned information from Controller to
Service Function / Verifier
• Service-Chain-Identifier
(to be mapped to service chaining technology
specific identifier by network element)
• Service count (number of services in the chain)
• 2 x POT-key-set (even and odd set)
• Secret (in case of communication to the verifier)
• Share of a secret, service index
• 2nd polynomial coefficients
• Prime number
[Figure: the Service Chain Verification App provisions POT-key-sets to S1, S2, and S3 (prime, secret share, poly-2) and to the Verifier (secret, prime, secret share, poly-2); labelled “Verification request for a particular service chain”]
• 16* Bytes of Meta-Data for POT
• Random – Unique random number
(e.g. Timestamp or combination of
Timestamp and Sequence number)
• Cumulative (algorithm dependent)
• Transport options for different protocols
• Segment Routing: New TLV in SRH header
• Network Service Header: Type-2 Meta-Data
• In-band OAM for IPv6:
Proof-of-transit extension header
• VXLAN-GPE
Proof-of-transit embedded-telemetry header
• ... more to be added (incl. IPv4)
Proof of Transit: Meta-Data Transport Options
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Random                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Random (contd)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Cumulative                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Cumulative (contd)                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*Note: Smaller numbers are feasible, but require a more
frequent renewal of the polynomials/secrets.
Random: 2nd polynomial
Cumulative: iterative computation of secret
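A minimal sketch of how the 16 bytes of POT meta-data could be serialized, assuming two unsigned 64-bit fields in network byte order as the layout suggests. The function names are illustrative.

```python
import struct

# Two unsigned 64-bit integers, big-endian: Random, then Cumulative.
POT_FORMAT = "!QQ"


def pack_pot(random_value, cumulative):
    """Serialize the POT meta-data into 16 bytes."""
    return struct.pack(POT_FORMAT, random_value, cumulative)


def unpack_pot(data):
    """Return (random_value, cumulative) from 16 bytes of POT meta-data."""
    return struct.unpack(POT_FORMAT, data)


wire = pack_pot(45, 17)
print(len(wire), unpack_pot(wire))
```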
…enter: In-Band OAM
In-band OAM
• OAM traffic embedded in the data traffic
but not part of the payload of the packet
• OAM “affected by data traffic”
• Example: IPv4 route recording
Out-of-band OAM
• OAM traffic is sent as dedicated traffic,
independent from the data traffic (“probe
traffic”)
• OAM “not affected by data traffic”
• Examples: Ethernet CFM (802.1ag), Ping,
Traceroute
How to send OAM information in packet networks?
“On-Board Unit” Speed control by police car
Remember RFC 791?
More recently: In-band Network Telemetry for P4
http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf
In-Band/Passive OAM - Motivation
• Multipath Forwarding –
debug ECMP networks
• Service/Path Verification –
prove that traffic follows a pre-defined path
• Service/Quality Assurance –
Prove traffic SLAs, as opposed to probe-
traffic SLAs; Overlay/Underlay
• Derive Traffic Matrix
• Custom/Service Level Telemetry
“Most large ISP's prioritize
Speedtest traffic and I would
even go as far to say they
probably route it faster as well
to keep ping times low.”
Source: https://www.reddit.com/r/AskTechnology/comments/2i1nxc/can_i_trust_my_speedtestnet_results_when_my_isp/
Example use-cases and the meta-data required:
• Path Tracing for ECMP networks: Node-ID, ingress i/f, egress i/f
• Service/Path Verification: Proof of Transit (random, cumulative)
• Derive Traffic Matrix: Node-ID
• SLA proof (Delay, Jitter, Loss): Sequence numbers, Timestamps
• Custom data (Geo-Location, ..): Custom meta-data
What if you could collect operational meta-data
within your traffic?
Example: Flow Tracing in ECMP Networks
Probe packet
(“ping”) tests
the “wrong” path
Trace all paths
and detect the one
with issues
Example: Derive the Network Traffic Matrix
Further Examples…
Loss/Re-ordering Detection
[Figure: one node adds a sequence number, another checks the sequence number; labelled “Multicast”]
Delay measurements and trend analysis
[Figure: topology with Link 1, Link 2, Link 3, and Link 4]
        Delay Link1   Delay Link2   Delay Link3   Delay Link4
t = 1   1.2ms         14.8ms        3.8ms         24.8ms
t = 2   1.2ms         14.8ms        3.8ms         24.8ms
t = 3   1.2ms         14.7ms        3.8ms         24.7ms
t = 4   1.2ms         14.8ms        3.8ms         24.8ms
t = 5   1.2ms         14.8ms        3.8ms         24.8ms
t = 6   1.2ms         17.8ms        3.8ms         24.7ms
t = 7   1.2ms         17.9ms        3.8ms         24.7ms
t = 8   1.2ms         18.1ms        3.8ms         24.8ms
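A small sketch of the trend analysis the delay table invites: flag the first sample that deviates from the running average of earlier samples by more than a tolerance. The 10% threshold and the function name are assumptions; the delay values are taken from the table.

```python
def first_shift(samples, tolerance=0.10):
    """Return the 1-based time index of the first sample that deviates from
    the mean of all earlier samples by more than `tolerance`, else None."""
    for index in range(1, len(samples)):
        baseline = sum(samples[:index]) / index
        if abs(samples[index] - baseline) > tolerance * baseline:
            return index + 1  # the table counts time from t = 1
    return None


link2_ms = [14.8, 14.8, 14.7, 14.8, 14.8, 17.8, 17.9, 18.1]
link4_ms = [24.8, 24.8, 24.7, 24.8, 24.8, 24.7, 24.7, 24.8]
print(first_shift(link2_ms))  # Link 2 changes behaviour partway through
print(first_shift(link4_ms))  # Link 4 stays within tolerance
```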
Example – Generic customer meta-data:
Geo-Location within your packets
Amending the OAM capabilities of IP
Capability: Continuity Check
• Existing tools: BFD
• Additions: Light-weight continuity check (check without sending extra traffic)
Capability: Connectivity Verification
• Existing tools: Ping / ICMP echo
• Additions: ECMP support; acknowledge different packet forwarding paths in routers
Capability: Path Discovery and Verification
• Existing tools: Traceroute
• Additions: ECMP support; acknowledge different packet forwarding paths in routers; prove correctness/integrity of forwarding path
Capability: Defect Indications
• Additions: Indicate if a forwarding policy (service chain) has not been met
Capability: Performance Monitoring
• Existing tools: IPPM (delay/packet loss)
• Additions: Metrics for live data traffic (delay, packet loss)
In-Band OAM (iOAM)
• Gather telemetry and OAM information along the path within the data
packet, as part of an existing/additional header
• No extra probe-traffic (as with ping, trace, ipsla)
• Transport options
• IPv6: Native v6 HbyH extension header or double-encap
• VXLAN-GPE: Embedded telemetry protocol header
• SRv6: Policy-Element (proof-of-transit only)
• NSH: Type-2 Meta-Data (proof-of-transit only)
... additional encapsulations being considered (incl. IPv4, MPLS)
• Deployment
• Domain-ingress, domain-egress, and select devices within
a domain insert/remove/update the extension header
• Information export via IPFIX/Flexible-Netflow/publish into Kafka
• Fast-path implementation
Hdr iOAM Payload
iOAM domain
[Figure: a packet (Hdr, Payload) crossing nodes A, B, C, and D over interfaces 1 to 6. POT meta-data is inserted (r=45/c=0) and then updated at later nodes (r=45/c=17, r=45/c=39); path-tracing data accumulates node/interface entries (A 1, C 4, B 6). The POT Verifier checks the result, and data is exported via IPFIX.]
• Per node scope
• Hop-by-Hop information
processing
• Device_Hop_L
• Node_ID
• Ingress Interface ID
• Egress Interface ID
• Time-Stamp
• Application Meta Data
• Set of nodes scope
• Hop-by-Hop information
processing
• Service Chain Validation
(Random, Cumulative)
• Edge to Edge scope
• Edge-to-Edge information
processing
• Sequence Number
iOAM: Information carried
Tracing Option
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Option Type  |  Opt Data Len |IOAM-trace-type| Elements-left |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+
|                                                               |  |
|                      Node data List [0]                       |  |
|                                                               |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  D
|                                                               |  a
|                      Node data List [1]                       |  t
|                                                               |  a
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                              ...                              .  S
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  p
|                                                               |  a
|                     Node data List [n-1]                      |  c
|                                                               |  e
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
|                                                               |  |
|                      Node data List [n]                       |  |
|                                                               |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+
Node data element:
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Hop_Lim    |                    node_id                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         ingress_if_id         |         egress_if_id          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           app_data                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
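A sketch of how a collector might decode one node data element. It assumes all four 32-bit words shown in the diagram are present and in network byte order (in practice the trace type would select which fields are carried); the class and function names are illustrative.

```python
import struct
from typing import NamedTuple


class NodeData(NamedTuple):
    hop_lim: int
    node_id: int
    ingress_if_id: int
    egress_if_id: int
    timestamp: int
    app_data: int


# Word 1: Hop_Lim (8 bits) + node_id (24 bits); word 2: two 16-bit
# interface ids; words 3 and 4: timestamp and app_data (32 bits each).
NODE_DATA = struct.Struct("!IHHII")


def parse_node_data(buf):
    """Decode one 16-byte node data element."""
    first_word, ingress, egress, timestamp, app_data = NODE_DATA.unpack(buf)
    return NodeData(
        hop_lim=first_word >> 24,
        node_id=first_word & 0xFFFFFF,
        ingress_if_id=ingress,
        egress_if_id=egress,
        timestamp=timestamp,
        app_data=app_data,
    )


sample = NODE_DATA.pack((63 << 24) | 7, 3, 4, 1000, 0)
print(parse_node_data(sample))
```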
Proof-of-Transit Option
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Option Type  |  Opt Data Len |  POT type = 0 |   reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+
|                            Random                             |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  P
|                        Random (contd)                         |  O
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  T
|                          Cumulative                           |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
|                      Cumulative (contd)                       |  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+
Edge-to-Edge Option
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Option Type  |  Opt Data Len | IOAM-E2E-Type |   reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      E2E Option data format determined by IOAM-E2E-Type       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Option Type: 000xxxxx. 8-bit identifier of the type of option.
Opt Data Len: 8-bit unsigned integer. Length of the Option Data field of this option, in octets.
IOAM-E2E-Type: 8-bit identifier of a particular iOAM E2E variant.
0: E2E option data is a 64-bit Per Packet Counter (PPC) used to identify packet loss and reordering.
Reserved: 8-bit. Reserved octet for future use.
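A sketch of how the egress node could use the Per Packet Counter to spot loss and reordering. The counting rules here (a packet is reordered if it arrives after a higher counter; loss is the number of gaps in the observed range) are a simplifying assumption, and the function name is illustrative.

```python
def analyse_counters(counters):
    """Return (lost, reordered) for per-packet counters in arrival order."""
    seen = set()
    highest = None
    reordered = 0
    for counter in counters:
        if highest is not None and counter < highest:
            # Arrived after a packet that was sent later.
            reordered += 1
        highest = counter if highest is None else max(highest, counter)
        seen.add(counter)
    if not seen:
        return 0, 0
    expected = highest - min(seen) + 1
    return expected - len(seen), reordered


# Counter 5 never arrives; counter 3 arrives after counter 4.
print(analyse_counters([1, 2, 4, 3, 6]))
```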
Transport Options – IPv6, VXLAN-GPE, SRv6, NSH...
• IPv6
• Native v6 HbyH extension
header
• Double-encapsulation:
src=encap-node, dst=original
• VXLAN-GPE: Embedded
telemetry protocol header;
Combines with NSH etc.
• SRv6: iOAM TLV in v6 SR-
header SRH (proof-of-transit
only)
• NSH: Type-2 Meta-Data
(proof-of-transit only)
Implementation Approach
• In-band OAM mechanisms for IP (route recording) have been proposed in the
past, but were always received with skepticism
• Performance concerns
• Footprint for a new technology (incl. new packet header) vs. installed base:
“Is there a sufficiently large greenfield network to allow for the change?”
• Deployment starting point for “in-band OAM for IPv6”
• IoT networks, Data-Center Networks, “greenfield” network deployments
• Let’s stop arguing and go try…
iOAM6 Test Drive
• Extended Ping
R5#ping
Protocol [ip]: ipv6
Target IPv6 address: ::b:1:0:0:8
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands? [no]: y
Source address or interface:
UDP protocol? [no]:
Verbose? [no]:
Precedence [0]:
DSCP [0]:
Include hop by hop Path Record option? [no]: y
Sweep range of sizes? [no]:
% Using size of 296 to accomodate extension headers
Type escape sequence to abort.
Sending 5, 296-byte ICMP Echos to ::B:1:0:0:8, timeout is 2 seconds:
5(5)----------(3)7(4)----------(3)8(5)----------(5)8(3)----------
(4)6(3)----------(5)5!
Success rate is 100 percent (1/1), round-trip min/avg/max = 4/6/10 ms
Legend: each hop in the path record is printed as (ingress i/f)node-id(egress i/f)
Topology:
R5: ::b:1:0:0:5, node-id 5
R6: ::b:1:0:0:6, node-id 6
R7: ::b:1:0:0:7, node-id 7
R8: ::b:1:0:0:8, node-id 8
Configuration:
ipv6 ioam path-record
ipv6 ioam node-id <node id>
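A sketch of how the path record printed by the extended ping could be turned into structured data. It assumes the hop notation is (ingress i/f)node-id(egress i/f) with hops separated by runs of dashes, as the slide labels suggest; the function name is illustrative and this is not part of any Cisco tooling.

```python
import re

# Optional "(ingress)", node id, optional "(egress)".
HOP = re.compile(r"(?:\((\d+)\))?(\d+)(?:\((\d+)\))?")


def parse_path_record(text):
    """Return a list of (ingress_if, node_id, egress_if) tuples."""
    compact = "".join(text.split()).rstrip("!")
    hops = []
    for segment in re.split(r"-+", compact):
        match = HOP.fullmatch(segment)
        if match is None:
            raise ValueError(f"unrecognized hop: {segment!r}")
        ingress, node, egress = match.groups()
        hops.append((
            int(ingress) if ingress else None,
            int(node),
            int(egress) if egress else None,
        ))
    return hops


record = """5(5)----------(3)7(4)----------(3)8(5)----------(5)8(3)----------
(4)6(3)----------(5)5!"""
for hop in parse_path_record(record):
    print(hop)
```

Read this way, the record shows the echo request travelling R5, R7, R8 and the reply returning via R6.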
Performance Test Results (native IPv6 encap)
In-band OAM with no performance degradation
Test setup: IXIA (TX), R1 ISR 3945 (Encap), R2 ISR 3945 (Transit), R3 ISR 3945 (Decap), IXIA (RX)
CPU Utilization in % at 600Mbps/no drops/IMIX traffic – 100 flows
Traffic    Frame size  Load  Loss  RX throughput  Latency avg  Latency min  Latency max  CPU % encap  CPU % inter  CPU % decap
(bi-dir)   (bytes)     (%)   (%)   (Mbps)         (ns)         (ns)         (ns)         node (R1)    node (R2)    node (R3)
CEF IPv6   imix        60    0     1131           187401       54260        921680       23           23           23
GRE+IPv6   imix        60    0.04  1138           186753       56060        923120       24           24           25
iOAM6      imix        60    0     1138           186671       40640        917780       27           24           26
Introducing Vector Packet Processor - VPP
• VPP is a rapid packet processing development platform for
highly performing network applications.
• It runs on commodity CPUs and leverages DPDK
• It creates a vector of packet indices and processes them
using a directed graph of nodes – resulting in a highly
performant solution.
• Runs as a Linux user-space application
• Ships as part of both embedded & server products, in
volume; Active development since 2002
• Base iOAM code already available in fd.io open
repositories (more coming soon)
• See also: FD.IO (The Fast Data Project) Network IO
Packet Processing: VPP
Management Agent
NC/Y REST ...
Some initial iOAM already in FD.IO:
https://gerrit.fd.io/r/gitweb?p=vpp.git;a=blob_plain;f=vnet/vnet/ip/ip6_hop_by_hop.c;hb=HEAD
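The vector-processing idea can be illustrated with a toy model: each graph node receives a whole batch of packets and sorts them into batches for its successor nodes. This is a conceptual sketch only and does not use VPP's API; the node names echo the slides, while the per-node behaviour and packet fields are hypothetical.

```python
from collections import defaultdict


def ip6_input(vector):
    out = defaultdict(list)
    for packet in vector:
        target = "ip6-lookup" if packet["version"] == 6 else "error-drop"
        out[target].append(packet)
    return out


def ip6_lookup(vector):
    out = defaultdict(list)
    for packet in vector:
        target = "ip6-hop-by-hop" if packet.get("hbh") else "ip6-rewrite"
        out[target].append(packet)
    return out


def ip6_hop_by_hop(vector):
    for packet in vector:
        # Hypothetical iOAM step: record this node in the packet's trace.
        packet["trace"] = packet.get("trace", []) + ["node-1"]
    return {"ip6-rewrite": vector}


GRAPH = {
    "ip6-input": ip6_input,
    "ip6-lookup": ip6_lookup,
    "ip6-hop-by-hop": ip6_hop_by_hop,
}


def run(graph, start, vector):
    """Push a vector of packets through the graph, one node batch at a time."""
    pending = {start: list(vector)}
    finished = defaultdict(list)
    while pending:
        name, batch = pending.popitem()
        if name not in graph:  # terminal node in this toy model
            finished[name].extend(batch)
            continue
        for next_name, next_batch in graph[name](batch).items():
            pending.setdefault(next_name, []).extend(next_batch)
    return finished


packets = [{"version": 6, "hbh": True}, {"version": 6}, {"version": 4}]
result = run(GRAPH, "ip6-input", packets)
print({name: len(batch) for name, batch in result.items()})
```

Processing a batch per node, rather than one packet through the whole graph at a time, is what lets the per-node code and data stay hot in the CPU caches.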
iOAM in VPP (native IPv6 encap)
[Figure: VPP packet-processing graph on top of DPDK with TenGigabitEthernet interfaces. Graph nodes shown: dpdk-input, eth-input, ip6-input, ip6-lookup, ip6-hop-by-hop, ip6-add-hop-by-hop, ip6-pop-hop-by-hop, ip6-rewrite, int-output, error-drop. A packet vector moves through the graph; a Management Agent and an iOAM data Collector attach via an API over a shared memory bus.]
A glimpse at an
in-band OAM
Solution Ecosystem
Network Operations and SLA diagnostics
In-Band OAM as part of a larger ecosystem
Examples of enhanced use-cases
• Identifying the reason for path
verification failure (triggered
OAM)
• Identifying network and service
chain bottlenecks (OAM
analytics)
• Analyze route changes:
Correlate route changes and
path information
In-Band OAM Ecosystem: Needs, Principles, Tools
High-Speed Dataplane Implementation
• High-Speed data-plane implementation (IOS and VPP)
• Efficient export of all data required for analytics
Data Aggregation and Distribution: Make data available for further processing
• Aggregate; Decouple data production from data consumption/processing
• Loosely coupled system: Interpret/transform data on read, not on write: Message broker
• Tools: Apache Kafka, pmacctd, ..
Data Processing/Analytics/Visualization – and corresponding Network Control
• Interpret, Analyze, Reason about Data
• Automatically implement conclusions in the network
• Tools: OpenDaylight, ELK stack, ..
[Figure: iOAM ecosystem. Hosts 1 to 3 attach to VPP-based nodes A, B, C, and D and a Verifier, which perform iOAM6 encap, transit, and decap & verify. Packets carry POT meta-data (r=45/c=0, r=45/c=17) and path-tracing data (A 1, D 3). Telemetry/OAM data is exported via IPFIX and Kafka to pmacct, Apache Kafka, Logstash, and Elasticsearch for visualization and analytics; the OpenDaylight iOAM6/POT app provides iOAM event control. Result: Path Verified.]
Ecosystem at work: Example
Configure “Proof of Transit”
Service A
Service B
Service C
Verifier
Node Z
Host 1
Host 3
Host 2
iOAM Domain
Service chain: Service A – Service B – Service C
Routing configured so that:
• Traffic from Host1 is forwarded to Service
B as next hop
• Traffic from Host2 is forwarded to Node Z
as next hop
VPP
VPP
VPP
VPP
Ecosystem components involved
Service A
Service B
Service C
Verifier
Node Z
Host 1
Host 3
Host 2
iOAM Domain
VPP
VPP
VPP
VPP
Netflow Collector
(pmacctd)
Service Chain Verification App
Netconf/YANG IPFIX
“OSS”
(App or simulated by Postman)
Database
(mysql)
Restconf/YANG
RESTconf call to OpenDaylight
“OSS” simulated by Postman
Rendered instructions for VPP nodes
POT profiles configured on VPP
Example Network: Operation
Node Z
Host 1
Host 3
Host 2
Netflow Collector
(pmacctd)
Database
(mysql)
iOAM6 GUI
(NeXT based)
Service A
Service B
Service C
Verifier
POT passes:
packet is in policy
POT fails:
packet is out of policy
VPP
VPP
VPP
VPP
Verification Counters
IPFIX Data Export
Example: Triggered OAM
Identifying the reason for path verification failure
Host 1
Host 3
Host 2
Service A
Service B
Service C
When POT succeeds, we know all Services were traversed correctly
Example: Triggered OAM
Identifying the reason for path verification failure
Host 1
Host 3
Host 2
Service A
Service B
Service C
In case of failure, one only knows that something failed, not
what or where things failed
! fail
Triggered OAM: Turn on path tracing on demand
Achieve full visibility into the path of the failed flow
Host 1
Host 3
Host 2
Service A
Service B
Service C
Path tracing lifts the cover and gives full insight into
the forwarding topology: Forwarding path revealed as incorrect.
! fail
Node Z
Path Verification Failure Triggers Tracing
Path Verification Failure
Network Controller Message Broker
App / Control
Center
Enable path tracing
for flows which fail
verification
Configure iOAM
tracing
Message Broker
Network Controller
App / Control
Center
Path Verification Failure Triggers Tracing
Path Verification Failure
OSS /
Visualization
Path details of the flow which
failed path verification
Example: OAM Analytics / SLA check
Identifying network & service chain bottlenecks
• Overlay Networks, Service chains / path have associated SLA
• For Overlay Networks and NFV service chaining SLA-check is still very nascent
• Typically no end-to-end service chain SLA measurement
• Common interpretation: service-chain health = VM health
• KPIs for overlay networks and service chains are end-to-end delay, jitter and
packet drops
• Approach: Leverage end-to-end, per hop delay, packet drop rates
• Continuously monitor SLA in terms of delay and packet drops
• Identify hot spots/bottlenecks
• Suggest/signal solution approaches to Orchestration/OSS
(e.g. Scale-out a service function: launch another instance;
Reconfigure service function: change buffer configuration)
Example: OAM Analytics / SLA check
Identifying network & service chain bottlenecks
Host 1
Host 3
Host 2
Observation: Suddenly connectivity issues
between Host 1 and Host 2
Example: OAM Analytics / SLA check
Identifying network & service chain bottlenecks
Unusual delay/loss
Network Controller Message Broker
OSS
• Reconfigure routing
• Schedule new VNF
Detailed path tracing to identify the reason
[Figure: delay plots for A-B, B-C, A-Z, and Z-C around time t0, with large delay and loss flagged; topology with Hosts 1 to 3, Services A, B, and C, and Node Z]
Observation
• Original path was A-B-C and fast
• At time t0 a path change happened: Traffic rerouted to A-Z-C
• Either Link A-Z is a low bandwidth link causing packet drops and/or Node Z is overloaded
Overlay Networks:
Checking SLA Compliance on the real traffic
Summary
In-band OAM – Status and more information
• Dataplane Implementation:
• FD.io/VPP (see fd.io)
• IOS (ISR-G2) – PI31 (CCO: End of July/16) – In-band OAM Config Guide for IOS
• IOSv/VIRL
• Supporting information for in-band OAM: https://github.com/CiscoDevNet/iOAM
• Internet Drafts:
In-band OAM requirements: https://tools.ietf.org/html/draft-brockners-inband-oam-requirements-01.txt
In-band OAM data types: https://tools.ietf.org/html/draft-brockners-inband-oam-data-01.txt
In-band OAM transport: https://tools.ietf.org/html/draft-brockners-inband-oam-transport-01.txt
Proof-of-transit: https://tools.ietf.org/html/draft-brockners-proof-of-transit-01.txt
• Videos:
Google+ In-Band OAM group: https://plus.google.com/u/0/b/112958873072003542518/112958873072003542518/videos?hl=en
Youtube In-Band OAM channel: https://www.youtube.com/channel/UC0WJOAKBTrftyosP590RrXw
• Blogs:
blogs.cisco.com/getyourbuildon/a-trip-recorder-for-all-your-traffic
blogs.cisco.com/getyourbuildon/verify-my-service-chain
• Check out the demo on dcloud.cisco.com
• In-band OAM is found in the “Service Provider” category
In-Band OAM: Summary
• Enhanced visibility for all your traffic:
New sources of data for SDN applications
• Network provided telemetry data gathered and added to live data
• Complement out-of-band OAM tools like ping and traceroute
• Path / Service chain verification
• Record the packet’s trip as meta-data within the packet
• Record path and node (i/f, time, app-data) specific data hop-by-hop and end to end
• Export telemetry data via Netflow/IPFIX/Kafka to Controller/Apps
• In-band OAM can be implemented without forwarding performance degradation
• IOS and OpenSource (FD.io/VPP, OpenDaylight) Implementations
ENCOR_Chapter_10 - OSPFv3 Attribution.pptx
EthicalHack{aksdladlsfsamnookfmnakoasjd}.pptx
Slides PDF The Workd Game (s) Eco Economic Epochs.pdf
Elements Of Poetry PowerPoint With Sources
SEO Trends in 2025 | B3AITS - Bow & 3 Arrows IT Solutions
QR Codes Qr codecodecodecodecocodedecodecode
Triggering QUIC, presented by Geoff Huston at IETF 123
Transformaciones de las funciones elementales.ppt
“Google Algorithm Updates in 2025 Guide”
Project English Paja Jara Alejandro.jpdf
Generics jehfkhkshfhskjghkshhhhlshluhueheuhuhhlhkhk.pptx
BGP Security Best Practices that Matter, presented at PHNOG 2025
Reliable Data Cabling Services for Seamless Connectivity
The Internet -By the Numbers, Sri Lanka Edition
KIPER4D situs Exclusive Game dari server Star Gaming Asia
LABUAN4D EXCLUSIVE SERVER STAR GAMING ASIA NO.1
PPT_M4.3_WORKING WITH SLIDES APPLIED.pptx
Slides, PPTX World Game (s) Eco Economic Epochs.pptx
LABUAN4D EXCLUSIVE SERVER STAR GAMING ASIA NO.1
Crypto Recovery California Services.pptx

Next-gen Network Telemetry is Within Your Packets: In-band OAM

  • 1. Next-gen Network Telemetry is Within Your Packets: In-band OAM Frank Brockners, Shwetha Bhandari
  • 2. Continuous & Always-On; On Demand; Checking Health and Compliance
  • 3. Continuous & Always-On; On Demand; Checking Health and Compliance
  • 4. © 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public. In-Band OAM Solution Ecosystem; A Peek at the Implementation; In-Band OAM – The Technology; Why In-Band OAM? – Use-Case Examples; Prologue: Does your traffic comply?
  • 5. Prologue: Consider TE, Service Chaining, Policy Based Routing, etc.: “How do you prove that traffic follows the suggested path?”
  • 6. “… Willis says the first issue is connecting VNFs to the infrastructure. OpenStack does this in a sequential manner, with the sequence serially numbered in the VNF, but the difficulty comes when trying to verify that the LAN has been connected to the correct LAN port, the WAN has been connected to the correct WAN port and so on. "If we get this wrong for a firewall function it could be the end of a CIO's career," says Willis.” Source: Light Reading, October 14, 2015, citing Peter Willis, Chief Researcher for Data Networks, British Telecom. http://www.lightreading.com/nfv/nfv-specs-open-source/bt-threatens-to-ditch-openstack/d/d-id/718735
  • 7. Ensuring Path and/or Service Chain Integrity. Approach: • Meta-data added to all user traffic • Based on “Share of a secret” • Provisioned by controller over secure channel • Updated at every service hop • Verifier checks whether collected meta-data allows retrieval of secret • Path verified. Diagram labels: Controller, Secret, A, B, C, X, Verifier.
  • 8. Initial Idea: Combine multiple secrets, “Compose the Onion”. Approach: • A service is described by a set of secrets, where each secret is associated with a service function. Service functions encrypt portions of the meta-data as part of their packet processing. • Only the verifying node has access to all secrets. The verifying node re-encrypts the meta-data to validate whether the packet correctly traversed the service chain. Notes: • To be used only when hardware-assisted encryption is available, i.e. AES-NI instructions or equivalent. Otherwise verification at line speed could be a very costly operation. Diagram: the service secrets “S1”, “S2”, “S3” are nested like layers of an onion.
  • 9. Solution Approach: Leveraging Shamir’s Secret Sharing. Polynomials 101: - Line: min 2 points - Parabola: min 3 points - Cubic function: min 4 points. General: it takes k+1 points to define a polynomial of degree k. (Photo: Adi Shamir)
  • 10. Solution Approach: Leverage Shamir’s Secret Sharing, “A polynomial as secret”. • Each service is given a point on the curve • When the packet travels through each service it collects these points • A verifier can reconstruct the curve using the collected points • Operations are done over a finite field (mod prime) to protect against differential analysis. Example: the “secret” is $3x^2 + 3x + 10$; the services hold the points (1,16), (2,28) and (3,46), and the verifier reconstructs $3x^2 + 3x + 10$ from them.
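
The reconstruction step on slide 10 can be sketched with Lagrange interpolation. This is a minimal illustration, not the authors' implementation: the prime 53 and the helper name `recover_secret` are assumptions chosen for the example, and the "secret" is taken to be the constant term of the polynomial.

```python
# Illustrative sketch only: PRIME and recover_secret are hypothetical names/values,
# not taken from the slides.
PRIME = 53  # small prime chosen for the example; all arithmetic is done mod PRIME


def recover_secret(points, prime=PRIME):
    """Lagrange-interpolate the polynomial through `points` and return its value at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


# Points from the slide, all on 3x^2 + 3x + 10
points = [(1, 16), (2, 28), (3, 46)]
print(recover_secret(points))      # all three points: constant term recovered
print(recover_secret(points[:2]))  # one point missing: a different value comes out
```

With only two of the three points, the interpolation yields a line rather than the parabola, so the recovered constant no longer matches; that mismatch is what signals a skipped hop.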
  • 11. Operationalizing the Solution. • Leverage two polynomials: POLY-1 is secret and constant: each hop gets a point on POLY-1, and only the verifier knows POLY-1. POLY-2 is public, random and per packet: each hop generates a point on POLY-2 each time a packet crosses it. • Each service function calculates (Point on POLY-1 + Point on POLY-2) to get (Point on POLY-3) and passes it to the verifier by adding it to each packet. • The verifier constructs POLY-3 from the points given by all the services and cross-checks whether POLY-3 = POLY-1 + POLY-2. • Computationally efficient: 2 additions, 1 multiplication, mod prime per hop. Diagram: POLY-1 (secret, constant) + POLY-2 (public, per packet) = POLY-3 (secret, per packet).
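
The two-polynomial scheme on slide 11 might be sketched as follows. This is an assumption-laden illustration: the prime, the coefficient values, and all function and field names (`provision`, `hop_update`, `verify`, `base`, `lpc`) are hypothetical, and the use of a precomputed per-hop Lagrange constant is one plausible way to reach the "2 additions, 1 multiplication" cost the slide mentions.

```python
PRIME = 2_147_483_647  # illustrative prime (2**31 - 1); an assumption, not from the slides


def poly_eval(coeffs, x, p):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ... modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def lagrange_constants(xs, p):
    """Weights that turn the points at xs into the polynomial's value at x = 0."""
    consts = []
    for i, xi in enumerate(xs):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if i != j:
                num = (num * -xj) % p
                den = (den * (xi - xj)) % p
        consts.append(num * pow(den, -1, p) % p)  # modular inverse, Python 3.8+
    return consts


def provision(secret, poly1_rest, poly2_rest, xs, p=PRIME):
    """Controller side: precompute one profile per hop (hypothetical structure)."""
    poly1 = [secret] + poly1_rest  # POLY-1: constant term is the verification secret
    poly2 = [0] + poly2_rest       # POLY-2 without its per-packet constant term
    lpcs = lagrange_constants(xs, p)
    return [
        {"base": (poly_eval(poly1, x, p) + poly_eval(poly2, x, p)) % p, "lpc": lpc}
        for x, lpc in zip(xs, lpcs)
    ]


def hop_update(profile, rnd, cml, p=PRIME):
    """Per packet and hop: two additions and one multiplication, modulo the prime."""
    return (cml + (profile["base"] + rnd) * profile["lpc"]) % p


def verify(secret, rnd, cml, p=PRIME):
    """POLY-3 at x = 0 must equal POLY-1(0) + POLY-2(0), i.e. secret + rnd."""
    return cml == (secret + rnd) % p


secret, rnd = 1234567, 45
hops = provision(secret, [77, 42], [1001, 9], xs=[1, 2, 3])

cml = 0
for hop in hops:                    # packet visits all three hops
    cml = hop_update(hop, rnd, cml)

cml_skipped = 0
for hop in (hops[0], hops[2]):      # second hop bypassed
    cml_skipped = hop_update(hop, rnd, cml_skipped)

print(verify(secret, rnd, cml), verify(secret, rnd, cml_skipped))
```

Here `rnd` plays the role of the per-packet constant term of POLY-2 and `cml` the cumulative value carried in the packet; a packet that misses a hop arrives with a cumulative value that does not match secret + rnd.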
  • 12. Meta Data for Service/Path Verification. • Verification secret is the independent coefficient of POLY-1 • Computation/retrieval through a cumulative computation at every hop (“cumulative”) • For POLY-2 the independent coefficient is carried within the packet (typically a combination of timestamp and random number) • n bits can service a maximum of 2^n packets • Verification secret and POLY-2 coefficient (“random”) are of the same size • Secret size is bounded by the prime number. Table columns: Transfer Rate | RND/Secret Size (bits) | Max # of packets (assuming 64 byte packets) | Time that “random” lasts at maximum. Rows with values: 1 Gbps | 32 | 2200 seconds, 36 minutes; 10 Gbps | 32 | 220 seconds, 3.5 minutes; 100 Gbps | 32 | 22 seconds. Rows listed without values: 1 Gbps, 10 Gbps and 100 Gbps at 64 bits; 10 Gbps at 56, 48 and 40 bits.
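
The "time that random lasts" column follows from simple arithmetic: 2^n distinct values divided by the packet rate. The sketch below reproduces the 32-bit rows under the slide's 64-byte-packet assumption; the function name is hypothetical, and Ethernet framing overhead (preamble, inter-frame gap) is ignored, which is an additional simplifying assumption.

```python
def rnd_lifetime_seconds(rate_bps, rnd_bits, packet_bytes=64):
    """Seconds until 2**rnd_bits per-packet values are used up at line rate."""
    packets_per_second = rate_bps / (packet_bytes * 8)
    return (2 ** rnd_bits) / packets_per_second


for rate_bps, bits in [(1e9, 32), (10e9, 32), (100e9, 32), (10e9, 40), (10e9, 64)]:
    seconds = rnd_lifetime_seconds(rate_bps, bits)
    print(f"{rate_bps / 1e9:>5.0f} Gbps, {bits} bits: {seconds:,.0f} s")
```

At 1 Gbps this gives roughly 1.95 million packets per second, so a 32-bit value space is exhausted after about 2200 seconds, in line with the table; each additional bit doubles the lifetime.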
  • 13. Meta-Data Provisioning. • Meta-Data for Service Chain Verification (POT) is provisioned through a controller (OpenDaylight App) • Netconf/YANG based protocol • Information provisioned from Controller to Service Function / Verifier: Service-Chain-Identifier (to be mapped to a service chaining technology specific identifier by the network element); Service count (number of services in the chain); 2 x POT-key-set (even and odd set): Secret (in case of communication to the verifier), Share of a secret and service index, 2nd polynomial coefficients, Prime number. Diagram: the Service Chain Verification App receives a verification request for a particular service chain and sends POT-key-sets (prime, secret share, poly-2) to S1, S2 and S3, and POT-key-sets (secret, prime, secret share, poly-2) to the Verifier.
  • 14. Proof of Transit: Meta-Data Transport Options. • 16* Bytes of Meta-Data for POT: Random (unique random number, e.g. Timestamp or combination of Timestamp and Sequence number) and Cumulative (algorithm dependent) • Transport options for different protocols: Segment Routing: New TLV in SRH header; Network Service Header: Type-2 Meta-Data; In-band OAM for IPv6: Proof-of-transit extension header; VXLAN-GPE: Proof-of-transit embedded-telemetry header; ... more to be added (incl. IPv4). Header layout (32-bit words): Random | Random (contd) | Cumulative | Cumulative (contd). Diagram annotations: “2nd Polynomial” (Random) and “Iterative computation of secret” (Cumulative). *Note: Smaller numbers are feasible, but require a more frequent renewal of the polynomials/secrets.
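
The 16-byte POT meta-data on slide 14 (two 32-bit words of Random followed by two of Cumulative) can be packed and parsed with Python's `struct` module. A minimal sketch: the function names are hypothetical, and network byte order is assumed, as is usual for protocol headers.

```python
import struct

POT_FORMAT = "!QQ"  # network byte order: 64-bit Random, then 64-bit Cumulative


def pack_pot(random_value, cumulative):
    """Serialize the POT meta-data into its 16-byte wire form."""
    return struct.pack(POT_FORMAT, random_value, cumulative)


def unpack_pot(data):
    """Return (random_value, cumulative) from 16 bytes of POT meta-data."""
    return struct.unpack(POT_FORMAT, data)


wire = pack_pot(45, 17)
print(len(wire), unpack_pot(wire))
```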
  • 16. How to send OAM information in packet networks? In-band OAM: • OAM traffic embedded in the data traffic but not part of the payload of the packet • OAM “affected by data traffic” • Example: IPv4 route recording. Out-of-band OAM: • OAM traffic is sent as dedicated traffic, independent from the data traffic (“probe traffic”) • OAM “not affected by data traffic” • Examples: Ethernet CFM (802.1ag), Ping, Traceroute. Pictured analogy: “On-Board Unit”; speed control by police car.
  • 17. Remember RFC 791?
  • 18. More recently: In-band Network Telemetry for P4 http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf
  • 19. In-Band/Passive OAM - Motivation • Multipath Forwarding – debug ECMP networks • Service/Path Verification – prove that traffic follows a pre-defined path • Service/Quality Assurance – prove traffic SLAs, as opposed to probe-traffic SLAs; Overlay/Underlay • Derive Traffic Matrix • Custom/Service Level Telemetry. “Most large ISP's prioritize Speedtest traffic and I would even go as far to say they probably route it faster as well to keep ping times low.” Source: https://www.reddit.com/r/AskTechnology/comments/2i1nxc/can_i_trust_my_speedtestnet_results_when_my_isp/
  • 20. What if you could collect operational meta-data within your traffic? Example use-cases and the meta-data they require: • Path Tracing for ECMP networks: Node-ID, ingress i/f, egress i/f • Service/Path Verification: Proof of Transit (random, cumulative) • Derive Traffic Matrix: Node-ID • SLA proof (Delay, Jitter, Loss): Sequence numbers, Timestamps • Custom data (Geo-Location, ..): Custom meta-data
  • 21. Example: Flow Tracing in ECMP Networks. Probe packet (“ping”) tests the “wrong” path. Trace all paths and detect the one with issues.
  • 22. Example: Derive the Network Traffic Matrix
  • 23. Further Examples… Loss/Re-ordering Detection: add sequence number, check sequence number. Multicast. Delay measurements and trend analysis (Link 1 to Link 4):
      Time  | Delay Link1 | Delay Link2 | Delay Link3 | Delay Link4
      t = 1 | 1.2ms | 14.8ms | 3.8ms | 24.8ms
      t = 2 | 1.2ms | 14.8ms | 3.8ms | 24.8ms
      t = 3 | 1.2ms | 14.7ms | 3.8ms | 24.7ms
      t = 4 | 1.2ms | 14.8ms | 3.8ms | 24.8ms
      t = 5 | 1.2ms | 14.8ms | 3.8ms | 24.8ms
      t = 6 | 1.2ms | 17.8ms | 3.8ms | 24.7ms
      t = 7 | 1.2ms | 17.9ms | 3.8ms | 24.7ms
      t = 8 | 1.2ms | 18.1ms | 3.8ms | 24.8ms
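
The trend analysis hinted at on slide 23 can be illustrated with the table's own numbers. This is a deliberately simple sketch, not a method described in the slides: the baseline window of five samples and the 10% tolerance are assumptions, as is the function name.

```python
from statistics import median

# Per-link delay samples in ms for t = 1 .. 8, copied from the slide's table.
delays = {
    "Link1": [1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2],
    "Link2": [14.8, 14.8, 14.7, 14.8, 14.8, 17.8, 17.9, 18.1],
    "Link3": [3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8],
    "Link4": [24.8, 24.8, 24.7, 24.8, 24.8, 24.7, 24.7, 24.8],
}


def flag_delay_shifts(samples, baseline_len=5, tolerance=0.10):
    """Return the sample times (1-based) whose delay exceeds the baseline by more than `tolerance`."""
    baseline = median(samples[:baseline_len])
    limit = baseline * (1 + tolerance)
    return [t for t, delay in enumerate(samples, start=1) if delay > limit]


flagged = {}
for link, samples in delays.items():
    hits = flag_delay_shifts(samples)
    if hits:
        flagged[link] = hits

print(flagged)
```

On this data only Link 2 stands out, from t = 6 onward, which is the step from roughly 14.8 ms to roughly 18 ms visible in the table.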
  • 24. Example – Generic customer meta-data: Geo-Location within your packets
  • 25. Amending the OAM capabilities of IP
      Capability | Existing Tools | Additions
      Continuity Check | BFD | Light-weight continuity check (check without sending extra traffic)
      Connectivity Verification | Ping / ICMP echo | ECMP support; acknowledge different packet forwarding paths in routers
      Path Discovery and Verification | Traceroute | ECMP support; acknowledge different packet forwarding paths in routers; prove correctness/integrity of forwarding path
      Defect Indications | - | Indicate if a forwarding policy (service chain) has not been met
      Performance Monitoring | IPPM (delay/packet loss) | Metrics for live data traffic (delay, packet loss)
  • 26. In-Band OAM (iOAM) • Gather telemetry and OAM information along the path within the data packet, as part of an existing/additional header • No extra probe-traffic (as with ping, trace, ipsla) • Transport options: IPv6: Native v6 HbyH extension header or double-encap; VXLAN-GPE: Embedded telemetry protocol header; SRv6: Policy-Element (proof-of-transit only); NSH: Type-2 Meta-Data (proof-of-transit only); ... additional encapsulations being considered (incl. IPv4, MPLS) • Deployment: Domain-ingress, domain-egress, and select devices within a domain insert/remove/update the extension header; Information export via IPFIX/Flexible-Netflow/publish into Kafka; Fast-path implementation. Diagram: packet (Hdr, iOAM, Payload) within the iOAM domain.
  • 27. Figure: POT and path tracing in action across nodes A, B, C and D (interfaces 1 to 6). The ingress node inserts POT meta-data (r=45/c=0) and path-tracing data (A 1). Following nodes update the POT meta-data (r=45/c=17, then r=45/c=39) and extend the path-tracing data (C 4, then B 6). The POT Verifier checks the result; data is exported via IPFIX.
  • 28. iOAM: Information carried • Per node scope, hop-by-hop information processing: Device_Hop_L, Node_ID, Ingress Interface ID, Egress Interface ID, Time-Stamp, Application Meta Data • Set of nodes scope, hop-by-hop information processing: Service Chain Validation (Random, Cumulative) • Edge to Edge scope, edge-to-edge information processing: Sequence Number
  • 29. Tracing Option. Option header (one 32-bit word): Option Type | Opt Data Len | IOAM-trace-type | Elements-left. It is followed by the data space: Node data List [0], Node data List [1], ..., Node data List [n-1], Node data List [n]. Each node data element consists of: Hop_Lim | node_id; ingress_if_id | egress_if_id; timestamp; app_data.
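
A node data element from the tracing option could be encoded as shown below. The field widths are an assumption, since the flattened diagram no longer shows them: 8 bits for Hop_Lim, 24 bits for node_id, 16 bits each for the interface IDs, and 32 bits each for timestamp and app_data (16 bytes in total). Function names are hypothetical.

```python
import struct

NODE_DATA_FORMAT = "!IHHII"  # assumed layout: 4 + 2 + 2 + 4 + 4 = 16 bytes, network byte order


def pack_node_data(hop_lim, node_id, ingress_if_id, egress_if_id, timestamp, app_data):
    """Encode one node data element; Hop_Lim and node_id share the first 32-bit word."""
    first_word = ((hop_lim & 0xFF) << 24) | (node_id & 0xFFFFFF)
    return struct.pack(NODE_DATA_FORMAT, first_word, ingress_if_id, egress_if_id,
                       timestamp, app_data)


def unpack_node_data(data):
    """Decode one 16-byte node data element into a dict of its fields."""
    first_word, ingress_if_id, egress_if_id, timestamp, app_data = struct.unpack(
        NODE_DATA_FORMAT, data)
    return {
        "hop_lim": first_word >> 24,
        "node_id": first_word & 0xFFFFFF,
        "ingress_if_id": ingress_if_id,
        "egress_if_id": egress_if_id,
        "timestamp": timestamp,
        "app_data": app_data,
    }


element = pack_node_data(hop_lim=63, node_id=7, ingress_if_id=3, egress_if_id=4,
                         timestamp=1_469_000_000, app_data=0)
print(len(element), unpack_node_data(element))
```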
  • 30. Proof-of-Transit Option. Option header (one 32-bit word): Option Type | Opt Data Len | POT type = 0 | reserved. It is followed by the POT data: Random | Random (contd) | Cumulative | Cumulative (contd), one 32-bit word each.
  • 31. Edge-to-Edge Option. Option header (one 32-bit word): Option Type | Opt Data Len | IOAM-E2E-Type | reserved, followed by the E2E option data, whose format is determined by IOAM-E2E-Type. Option Type: 000xxxxxx, 8-bit identifier of the type of option. Opt Data Len: 8-bit unsigned integer; length of the Option Data field of this option, in octets. iOAM-E2E-Type: 8-bit identifier of a particular iOAM E2E variant; 0: E2E option data is a 64-bit Per Packet Counter (PPC) used to identify packet loss and reordering. Reserved: 8-bit reserved octet for future use.
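
How an egress node might use the Per Packet Counter to spot loss and reordering can be sketched as follows. This is an illustrative approach, not the algorithm from the slides or drafts; the function name is hypothetical, duplicates are simply ignored, and counter wrap-around is not handled.

```python
def classify_sequence(counters):
    """Given per-packet counters in arrival order, return (missing, reordered).

    missing:   counters never seen although a higher counter arrived
    reordered: counters that arrived after a higher counter had been seen
    """
    highest = None
    missing = set()
    reordered = []
    for value in counters:
        if highest is None:
            highest = value
        elif value > highest:
            # Everything skipped between the previous maximum and this packet is pending
            missing.update(range(highest + 1, value))
            highest = value
        elif value in missing:
            # A pending counter showed up late: reordered, not lost
            missing.discard(value)
            reordered.append(value)
    return sorted(missing), reordered


print(classify_sequence([1, 2, 4, 3, 6]))
```

In this example packet 3 arrives after packet 4 and is reported as reordered, while packet 5 never arrives and remains in the missing list.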
  • 32. Transport Options – IPv6, VXLAN-GPE, SRv6, NSH... • IPv6: Native v6 HbyH extension header, or double-encapsulation (src=encap-node, dst=original) • VXLAN-GPE: Embedded telemetry protocol header; combines with NSH etc. • SRv6: iOAM TLV in the IPv6 SR header (SRH) (proof-of-transit only) • NSH: Type-2 Meta-Data (proof-of-transit only)
  • 33. Implementation Approach • In-band OAM mechanisms for IP (route recording) have been proposed in the past, but were always received with skepticism: performance concerns; footprint for a new technology (incl. new packet header) vs. installed base: “Is there a sufficiently large greenfield network to allow for the change?” • Deployment starting point for “in-band OAM for IPv6”: IoT networks, Data-Center Networks, “greenfield” network deployments • Let’s stop arguing and go try…
  • 34. iOAM6 Test Drive • Extended Ping
      R5#ping
      Protocol [ip]: ipv6
      Target IPv6 address: ::b:1:0:0:8
      Repeat count [5]: 1
      Datagram size [100]:
      Timeout in seconds [2]:
      Extended commands? [no]: y
      Source address or interface:
      UDP protocol? [no]:
      Verbose? [no]:
      Precedence [0]:
      DSCP [0]:
      Include hop by hop Path Record option? [no]: y
      Sweep range of sizes? [no]:
      % Using size of 296 to accomodate extension headers
      Type escape sequence to abort.
      Sending 5, 296-byte ICMP Echos to ::B:1:0:0:8, timeout is 2 seconds:
      5(5)----------(3)7(4)----------(3)8(5)----------(5)8(3)----------(4)6(3)----------(5)5!
      Success rate is 100 percent (1/1), round-trip min/avg/max = 4/6/10 ms
    Legend for the path record: (ingress i/f) node-id (egress i/f). Topology: R5 (::b:1:0:0:5, node-id 5), R6 (::b:1:0:0:6, node-id 6), R7 (::b:1:0:0:7, node-id 7), R8 (::b:1:0:0:8, node-id 8). Configuration: ipv6 ioam path-record; ipv6 ioam node-id <node id>
  • 35. Performance Test Results (native IPv6 encap): In-band OAM with no performance degradation. Setup: IXIA (TX), R1 ISR 3945 (Encap), R2 ISR 3945 (Transit), R3 ISR 3945 (Decap), IXIA (RX). CPU utilization in % at 600 Mbps / no drops / IMIX traffic – 100 flows:
      Traffic (Bi Dir) | Frame size in Bytes | Traffic Load in % | Loss in % | RX Throughput in Mbps | Latency AVG (ns) | Latency MIN (ns) | Latency MAX (ns) | CPU util % Encap node (R1) | CPU util % Inter node (R2) | CPU util % Decap node (R3)
      CEF IPv6 | imix | 60 | 0 | 1131 | 187401 | 54260 | 921680 | 23 | 23 | 23
      GRE+IPv6 | imix | 60 | 0.04 | 1138 | 186753 | 56060 | 923120 | 24 | 24 | 25
      iOAM6 | imix | 60 | 0 | 1138 | 186671 | 40640 | 917780 | 27 | 24 | 26
  • 36. Introducing Vector Packet Processor - VPP • VPP is a rapid packet processing development platform for high-performance network applications. • It runs on commodity CPUs and leverages DPDK • It creates a vector of packet indices and processes them using a directed graph of nodes, resulting in a highly performant solution. • Runs as a Linux user-space application • Ships as part of both embedded & server products, in volume; active development since 2002 • Base iOAM code already available in fd.io open repositories (more coming soon) • See also: FD.IO (The Fast Data Project). Diagram: Network IO; Packet Processing: VPP; Management Agent (NC/Y, REST, ...). Some initial iOAM already in FD.IO: https://gerrit.fd.io/r/gitweb?p=vpp.git;a=blob_plain;f=vnet/vnet/ip/ip6_hop_by_hop.c;hb=HEAD
  • 37. iOAM in VPP (native IPv6 encap). Diagram labels: DPDK; TenGigabitEthernet interfaces; VPP packet-vector graph nodes dpdk-input, eth-input, ip6-input, ip6-lookup, ip6-hop-by-hop, ip6-add-hop-by-hop, ip6-pop-hop-by-hop, ip6-rewrite, error-drop, int-output; Management Agent; API; iOAM data Collector; shared memory bus.
  • 38. A glimpse at an in-band OAM Solution Ecosystem
  • 39. In-Band OAM as part of a larger ecosystem: Network Operations and SLA diagnostics. Examples of enhanced use-cases • Identifying the reason for path verification failure (triggered OAM) • Identifying network and service chain bottlenecks (OAM analytics) • Analyze route changes: Correlate route changes and path information
  • 40. In-Band OAM Ecosystem: Needs, Principles, Tools. High-Speed Dataplane Implementation: • High-speed data-plane implementation (IOS and VPP) • Efficient export of all data required for analytics. Data Aggregation and Distribution (make data available for further processing): • Aggregate; decouple data production from data consumption/processing • Loosely coupled system: interpret/transform data on read, not on write: Message broker • Tools: Apache Kafka, pmacctd, .. Data Processing/Analytics/Visualization and corresponding Network Control: • Interpret, Analyze, Reason about Data • Automatically implement conclusions in the network • Tools: OpenDaylight, ELK stack, ..
  • 41. Diagram: ecosystem overview. Dataplane: Node A, Node B, Node C, Verifier and Node D (VPP) connect Host 1, Host 2 and Host 3 and perform iOAM6 encap, transit, and decap & verify; packets carry POT and tracing meta-data (r=45/c=0 with A 1; r=45/c=17 with A 1, D 3). Data export: IPFIX, Kafka (pmacct, Apache Kafka). Control: OpenDaylight – iOAM6/POT app (iOAM event control). Telemetry/OAM data, visualization & analytics: Elasticsearch, Logstash, Kafka (iOAM visualization). Result: Path Verified.
  • 42. Ecosystem at work: Example. Configure “Proof of Transit”. Diagram: Host 1, Host 2 and Host 3; iOAM Domain with Service A, Service B, Service C, Verifier and Node Z (VPP). Service chain: Service A – Service B – Service C. Routing configured so that: • Traffic from Host1 is forwarded to Service B as next hop • Traffic from Host2 is forwarded to Node Z as next hop
  • 43. Ecosystem components involved. Diagram: Host 1, Host 2 and Host 3; iOAM Domain with Service A, Service B, Service C, Verifier and Node Z (VPP); Service Chain Verification App (Netconf/YANG); Netflow Collector (pmacctd) receiving IPFIX; Database (mysql); “OSS” (App or simulated by Postman) via Restconf/YANG.
  • 44. RESTconf call to OpenDaylight: “OSS” simulated by Postman
  • 45. Rendered instructions for VPP nodes
  • 46. POT profiles configured on VPP
  • 47. POT profiles configured on VPP
  • 48. Example Network: Operation. Diagram: Host 1, Host 2 and Host 3; Service A, Service B, Service C, Verifier and Node Z (VPP); Netflow Collector (pmacctd); Database (mysql); iOAM6 GUI (NeXT based). POT passes: packet is in policy. POT fails: packet is out of policy.
  • 49. Verification Counters
  • 50. IPFIX Data Export
  • 51. Example: Triggered OAM. Identifying the reason for path verification failure. When POT succeeds, we know all Services were traversed correctly. (Diagram: Host 1, Host 2, Host 3; Service A, Service B, Service C.)
  • 52. Example: Triggered OAM. Identifying the reason for path verification failure. In case of failure, one only knows that something failed, not what or where things failed. (Diagram: Host 1, Host 2, Host 3; Service A, Service B, Service C; verification marked “fail”.)
  • 53. Triggered OAM: Turn on path tracing on demand. Achieve full visibility into the path of the failed flow. Path tracing lifts the cover and gives full insight into the forwarding topology: forwarding path revealed as incorrect. (Diagram: Host 1, Host 2, Host 3; Service A, Service B, Service C, Node Z; verification marked “fail”.)
  • 54. Path Verification Failure Triggers Tracing. Components: Network Controller, Message Broker, App / Control Center. On a path verification failure: enable path tracing for flows which fail verification; configure iOAM tracing.
  • 55. Path Verification Failure Triggers Tracing. Components: Message Broker, Network Controller, App / Control Center, OSS / Visualization. On a path verification failure: path details of the flow which failed path verification.
  • 56. Example: OAM Analytics / SLA check. Identifying network & service chain bottlenecks • Overlay networks and service chains / paths have associated SLAs • For overlay networks and NFV service chaining, SLA checking is still very nascent: typically no end-to-end service chain SLA measurement; common interpretation is service-chain health = VM health • KPIs for overlay networks and service chains are end-to-end delay, jitter and packet drops • Approach: leverage end-to-end and per-hop delay and packet drop rates: continuously monitor SLA in terms of delay and packet drops; identify hot spots/bottlenecks; suggest/signal solution approaches to Orchestration/OSS (e.g. scale out a service function: launch another instance; reconfigure a service function: change buffer configuration)
  • 57. Example: OAM Analytics / SLA check. Identifying network & service chain bottlenecks. Observation: Suddenly connectivity issues between Host 1 and Host 2. (Diagram: Host 1, Host 2, Host 3.)
  • 58. Example: OAM Analytics / SLA check. Identifying network & service chain bottlenecks. Components: Network Controller, Message Broker, OSS. Trigger: unusual delay/loss. Actions: • Reconfigure routing • Schedule new VNF
  • 59. Detailed path tracing to identify the reason. Diagram: Host 1, Host 2, Host 3; Service A, Service B, Service C, Node Z; delay plots for A-B, B-C, A-Z and Z-C, each marked at time t0, with “large delay” and “loss” flagged. Observation: • Original path was A-B-C and fast • At time t0 a path change happened: traffic rerouted to A-Z-C • Either link A-Z is a low bandwidth link causing packet drops and/or Node Z is overloaded
  • 62. Overlay Networks: Checking SLA Compliance on the real traffic
  • 64. In-band OAM – Status and more information • Dataplane implementation: FD.io/VPP (see fd.io); IOS (ISR-G2) – PI31 (CCO: End of July/16) – In-band OAM Config Guide for IOS; IOSv/VIRL • Supporting information for in-band OAM: https://github.com/CiscoDevNet/iOAM • Internet Drafts: In-band OAM requirements: https://tools.ietf.org/html/draft-brockners-inband-oam-requirements-01.txt; In-band OAM data types: https://tools.ietf.org/html/draft-brockners-inband-oam-data-01.txt; In-band OAM transport: https://tools.ietf.org/html/draft-brockners-inband-oam-transport-01.txt; Proof-of-transit: https://tools.ietf.org/html/draft-brockners-proof-of-transit-01.txt • Videos: Google+ In-Band OAM group: https://plus.google.com/u/0/b/112958873072003542518/112958873072003542518/videos?hl=en; Youtube In-Band OAM channel: https://www.youtube.com/channel/UC0WJOAKBTrftyosP590RrXw • Blogs: blogs.cisco.com/getyourbuildon/a-trip-recorder-for-all-your-traffic; blogs.cisco.com/getyourbuildon/verify-my-service-chain • Check out the demo on dcloud.cisco.com: In-band OAM is found in the “Service Provider” category
  • 65. In-Band OAM: Summary • Enhanced visibility for all your traffic: New sources of data for SDN applications • Network provided telemetry data gathered and added to live data • Complement out-of-band OAM tools like ping and traceroute • Path / Service chain verification • Record the packet’s trip as meta-data within the packet • Record path and node (i/f, time, app-data) specific data hop-by-hop and end to end • Export telemetry data via Netflow/IPFIX/Kafka to Controller/Apps • In-band OAM can be implemented without forwarding performance degradation • IOS and OpenSource (FD.io/VPP, OpenDaylight) Implementations

Editor's Notes

  • #9:  Advanced Encryption Standard (AES) – NI (New Instructions)
  • #17: Note the difference to transport OAM (SDH/SONET): In SONET/SDH there is a constant flow of frames. OAM is a bit-stream/data between the data encoding sublayer and the physical media and is present in every frame, but is not part of the data payload. This mode of OAM transport is often referred to as “out of band”, because OAM has its own “channel”, but it still resides side by side with the payload data (if present).
  • #24: IoT examples: industrial automation, where networks are delay sensitive and can employ bicasting of traffic when delay/errors are detected in a given path. Timestamp: 64-bit NTP timestamp; seconds and picoseconds since epoch (1 January 1970, 0:00 UTC).
  • #41: Alternatives to Kafka include Scribe (created by Facebook).