VxLAN Deployment Model
A practical perspective
BRKDCT-2404 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• Why VXLAN?
• VXLAN Fundamentals
• Overlay Deployment Considerations
• Underlay Deployment Considerations
• Summary and Conclusion
Trend: Flexible Data Center Fabrics
Create Virtual Networks on top of an
efficient IP network
Mobility
Segmentation + Policy
Scale
Automated & Programmable
Full Cross Sectional BW
L2 + L3 Connectivity
Physical + Virtual
[Figure: physical hosts and virtual machines attached to the fabric]
VXLAN Fundamentals
Why Overlays?
Seek well integrated best in class Overlays and Underlays
• Flexibility/Programmability
• Reduced number of touch points
Overlay Taxonomy
VXLAN is an Overlay Encapsulation
Data Plane Learning: Flood and Learn over a multidestination distribution tree joined by all edge devices
Protocol Learning: Advertise hosts in a protocol amongst edge devices
Encapsulation: VXLAN
VXLAN Packet Structure
Ethernet in IP with a shim for scalable segmentation
Frame layout (outer to inner):
  Outer MAC Header         14 Bytes (4 Bytes optional for the 802.1Q tag)
  Outer IP Header          20 Bytes
  Outer UDP Header         8 Bytes
  VXLAN Header             8 Bytes
  Original Layer 2 Frame   Ethernet Payload
  FCS
50 (54) Bytes of overhead

Outer MAC Header fields (bits):
  Dest. MAC Address (48)       Next-Hop MAC Address
  Src. MAC Address (48)        Src VTEP MAC Address
  VLAN Type 0x8100 (16)
  VLAN ID Tag (16)
  Ether Type 0x0800 (16)
Outer IP Header fields (bits):
  IP Header Misc. Data (72)
  Protocol 0x11 (UDP) (8)
  Header Checksum (16)
  Source IP (32)               Src and Dst addresses of the VTEPs
  Dest. IP (32)
Outer UDP Header fields (bits):
  Source Port (16)             Hash of the inner L2/L3/L4 headers of the original frame.
                               Enables entropy for ECMP load balancing in the network (tunnel entropy).
  VXLAN Port (16)              UDP 4789
  UDP Length (16)
  Checksum 0x0000 (16)
VXLAN Header fields (bits):
  VXLAN Flags RRRRIRRR (8)
  Reserved (24)
  VNI (24)                     Large scale segmentation: allows for 16M possible segments
  Reserved (8)
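The header layout above can be sketched in a few lines of Python. This is only an illustration of the field widths listed on the slide (8 flag bits with the I bit set, 24 reserved bits, a 24-bit VNI, 8 reserved bits); the function names are made up for this example.

```python
import struct

VXLAN_UDP_PORT = 4789      # destination UDP port shown on the slide
FLAG_VNI_VALID = 0x08      # the "I" bit in the RRRRIRRR flags byte

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags(8) reserved(24) VNI(24) reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", FLAG_VNI_VALID << 24, vni << 8)

def encap_overhead(outer_vlan_tag: bool = False) -> int:
    """Outer MAC (14, plus 4 with 802.1Q) + outer IP (20) + UDP (8) + VXLAN (8)."""
    return 14 + (4 if outer_vlan_tag else 0) + 20 + 8 + 8

if __name__ == "__main__":
    print(vxlan_header(5000).hex())                   # header for VNI 5000
    print(encap_overhead(), encap_overhead(True))     # overhead without/with tag
    print(2 ** 24)                                    # number of possible segments
```

The 24-bit VNI is what gives the "16M possible segments" figure, and the overhead function reproduces the 50 (54) bytes quoted above.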
Data Plane Learning
Dedicated Multicast Distribution Tree per VNI
[Figure: Web and DB VMs behind five VTEPs (V); one multicast distribution tree per VNI]
Data Plane Learning
Learning on Broadcast Source - ARP Request Example
[Figure: ARP request from the VTEP with IP A, encapsulated toward multicast group G and delivered to all VTEPs (V) in the VNI]
Data Plane Learning
Learning on Unicast Source - ARP Response Example
[Figure: ARP response sent as unicast from VTEP 2 back to VTEP 1]
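The two learning slides can be summarised as a small Python sketch: a VTEP learns the inner source MAC against the outer source IP when it decapsulates a frame, and it floods to the VNI's multicast group until the destination is known. The class and method names are hypothetical, and the MAC and VTEP addresses in the example are made up; only VNI 6000 and group 235.1.1.1 come from the deck's configuration slides.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class FloodAndLearnVtep:
    """Toy model of data plane learning on one VTEP (illustrative only)."""

    def __init__(self, vni_to_group):
        self.vni_to_group = dict(vni_to_group)   # VNI -> multicast group
        self.mac_table = {}                      # (VNI, MAC) -> remote VTEP IP

    def on_decap(self, vni, inner_src_mac, outer_src_ip):
        # Learn where the sender lives from the outer source address.
        self.mac_table[(vni, inner_src_mac)] = outer_src_ip

    def outer_destination(self, vni, inner_dst_mac):
        # Broadcast and unknown unicast go to the VNI's multicast group.
        group = self.vni_to_group[vni]
        if inner_dst_mac == BROADCAST:
            return group
        return self.mac_table.get((vni, inner_dst_mac), group)

if __name__ == "__main__":
    vtep2 = FloodAndLearnVtep({6000: "235.1.1.1"})
    # ARP request from host A arrives via the multicast tree from VTEP 1.
    vtep2.on_decap(6000, "00:00:00:00:00:0a", "10.0.0.1")
    # The ARP response toward host A can now be sent as unicast to VTEP 1.
    print(vtep2.outer_destination(6000, "00:00:00:00:00:0a"))
```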
Overlay Network Evolution: Edge Devices
Network Overlays Host Overlays Hybrid Overlays
Protocols Flooding Network DB
BGP Route Reflector
1. VTEPs advertise their VNI membership in BGP
3. VTEP obtains list of VTEP neighbors for each VNI
4. VTEP can perform Head-End Replication
Overlay Neighbors at TOR 1 (IP A): TOR 3, IP C; TOR 2, IP B
VXLAN Unicast Mode
Head-end replication
1. A host sends a L2 BUM* frame
2. VTEP retrieves the list of Overlay Neighbors**
3. VTEP performs Head-End Replication
4. VXLAN Encap: one copy of the BUM frame from IP A to IP B, one copy from IP A to IP C
Overlay Neighbors at TOR 1 (IP A): POD3, IP C; POD2, IP B
*Broadcast, Unknown Unicast or Multicast
**Information statically configured or dynamically retrieved via control plane (VTEP discovery)
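A minimal Python sketch of head-end replication as described in the steps above: the ingress VTEP makes one unicast copy of the BUM frame per overlay neighbor. The names `EncapFrame` and `head_end_replicate` are made up for this example, and the neighbor list stands in for whatever static configuration or control plane supplies it.

```python
from typing import List, NamedTuple

class EncapFrame(NamedTuple):
    outer_src: str     # local VTEP IP
    outer_dst: str     # remote VTEP IP
    payload: bytes     # original BUM frame

def head_end_replicate(frame: bytes, local_vtep: str,
                       overlay_neighbors: List[str]) -> List[EncapFrame]:
    """Return one unicast-encapsulated copy of the frame per remote neighbor."""
    return [EncapFrame(local_vtep, neighbor, frame)
            for neighbor in overlay_neighbors
            if neighbor != local_vtep]

if __name__ == "__main__":
    for copy in head_end_replicate(b"BUM frame", "IP A", ["IP B", "IP C"]):
        print(copy.outer_src, "->", copy.outer_dst)
```

The replication cost grows with the number of neighbors per VNI, which is the trade-off against a multicast-enabled underlay.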
BGP EVPN Control Plane for VXLAN
Host and Subnet Route Distribution
Route-Reflectors deployed for scaling purposes
[Figure: VTEPs (V) with iBGP adjacencies to two route reflectors (RR); Host 1 attaches in VLAN 10, VNI 5000]
NLRI:
  Host MAC1, IP1
  NVE IP L1/MAC L1
  VNI 5000
Ext.Community:
  Encapsulation: VXLAN, NVGRE
  Sequence 0
1. Host Attaches
2. Attachment NVE advertises host’s MAC (+IP) through BGP RR
3. Choice of encapsulation is also advertised
BGP EVPN Control Plane
Host Moves
MAC   IP   VNI    Next-Hop   Encap   Seq
1     1    5000   IP L1      VXLAN   0     (before the move)
1     1    5000   IP L3      VXLAN   1     (after the move)
NLRI:
  Host MAC1, IP1
  NVE IP L3/MAC L3
  VNI 5000
Ext.Community:
  Encapsulation: VXLAN, NVGRE
  Sequence 1
[Figure: Host 1 (VLAN 10, VNI 5000) now attached behind L3; VTEPs (V) peer with two route reflectors (RR)]
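The host-move behaviour can be illustrated with a short Python sketch in which the route carrying the highest sequence number wins. This is a deliberate simplification: real BGP best-path selection has more steps, and the `HostRoute` type and `current_location` function are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class HostRoute:
    mac: str
    ip: str
    vni: int
    next_hop: str
    encap: str
    seq: int

def current_location(routes: Iterable[HostRoute]) -> HostRoute:
    """Pick the advertisement with the highest mobility sequence number."""
    return max(routes, key=lambda r: r.seq)

if __name__ == "__main__":
    before = HostRoute("MAC1", "IP1", 5000, "IP L1", "VXLAN", 0)
    after = HostRoute("MAC1", "IP1", 5000, "IP L3", "VXLAN", 1)
    print(current_location([before, after]).next_hop)
```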
VXLAN L2 Gateway
Ingress VXLAN packet on Orange segment
Egress interface chosen (bridge may .1Q tag the packet)
[Figure: a VXLAN L2 gateway bridges between VXLAN and VLAN for the VMs; a router with SVIs connects VXLAN Orange (VLAN100) and VXLAN Blue (VLAN200)]
VXLAN Configuration – Enable the Feature set
feature nv overlay                  ! VxLAN encapsulation
feature vn-segment-vlan-based       ! VXLAN Mode
feature bgp                         ! EVPN Control Plane
nv overlay evpn                     ! EVPN Control Plane
VXLAN Configuration – Flood and Learn Overlay
interface nve1                      ! Point to Multi-point tunnel with VxLAN encapsulation
  no shutdown
  source-interface loopback1        ! Used to derive local VTEP IP address
  member vni 6000                   ! VxLAN Identifier
    mcast-group 235.1.1.1
VXLAN Configuration – EVPN L2 Overlay
interface nve1                      ! Point to Multi-point tunnel with VxLAN encapsulation
  no shutdown
  source-interface loopback1        ! Used to derive local VTEP IP address
  host-reachability protocol bgp    ! Use BGP-EVPN control plane
  member vni 6000                   ! VxLAN Identifier
    suppress-arp
    mcast-group 235.1.1.1
VXLAN Configuration – Mapping VLANs/BDs to VNIs
Layer 2 Gateway Map VNI to VLAN/BD
L2 VXLAN Configuration – Edge port configuration
VLAN CLI Model (map VLANs to VNIs):
  interface <phy if>
    switchport mode access
    switchport access vlan 3002
VSI CLI Model (apply encapsulation profile to interface):
  encapsulation profile vni <name>
    dot1q 101,202 vni 5000,6000
  interface <phy if>
    service instance 1 vni
      encapsulation profile <name> default
[Figure: VLAN model maps VLAN3002 to VXLAN6000; VSI model maps VLAN202 to BD200 and VXLAN6000, with BD999 mapped to VXLAN8989]
EVPN Configuration – L2 VNIs
vlan 3002
  vn-segment 6000
evpn
  vni 6000 l2
    rd auto
    route-target import auto
    route-target export auto
Centralized Routing in a L2 VXLAN Fabric
Inter-V(X)LAN and core routing
[Figure: two VXLAN L3 gateways, each with an SVI and running HSRP, connect the fabric to the IP core; a VXLAN L2 gateway attaches the VMs]
VXLAN Configuration
L2 Gateway redundancy based on vPC (anycast VTEP address): vMAC, emulated VTEP
[Figure: redundant VXLAN L2 gateways attached to the L3 fabric, with VMs connected below]
VXLAN Configuration
Redundant VTEP: Anycast Source-interface
interface loopback1                            ! Routed interface
  ip address <x.x.x.x>
  ip address <anycast-VTEP> secondary
interface nve1                                 ! VXLAN Tunnel Interface
  no shutdown
  source-interface loopback1
  member vni 6000 mcast-group 235.1.1.1        ! Map VNI to VLAN/BD
[Figure: VXLAN L2 gateway pairs IP A1/IP A2 and IP B1/IP B2 connected toward WAN/DCI]
Graceful Insertion and Removal (GIR)
Platform            Release
Nexus 5x00/6000     NX-OS 7.1
Nexus 7x00          NX-OS 7.2
One command! Pre-change System Snapshot
no system mode maintenance: One command! Pre-change System Snapshot
vPC VTEP Configuration Best Practices
Must enable peer-gateway: ensures VXLAN routed traffic can be forwarded to the local hosts by both vPC VTEPs.
Enable peer-switch, ip arp sync and ipv6 nd sync for improved convergence in vPC topologies.
Use separate loopback interfaces for the VTEP source address and for the router ID of other routing protocols, such as BGP.
vpc domain 100
  peer-switch
  peer-keepalive destination 172.32.1.13 source 172.32.1.14
  delay restore 200
  peer-gateway
  ip arp synchronize
  ipv6 nd synchronize
interface nve1
  source-interface hold-down-time 180
[Figure: vPC VTEP-1 and VTEP-2 share an anycast VTEP address between the underlay IP network and the VXLAN overlay; Layer 2 and Layer 3 links shown. An EVPN MP-BGP route update for host IP addr 100.1.1.2 is sent toward the vPC pair VTEP-B and VTEP-C.]
Configure vPC VTEPs with peer-gateway
The encapsulated VXLAN traffic is ECMP load-shared to both VTEP-B and VTEP-C based on the outer destination IP address, which is the anycast VTEP address.
VXLAN Packet: Outer Dst IP: Anycast VTEP; Inner Dst MAC: RMAC-B
After VXLAN decapsulation, VTEP-2 needs to be able to route traffic for RMAC-B. This needs the vPC peer-gateway function.
[Figure: VTEP-B and VTEP-C in a vPC port-channel behind one anycast VTEP address]
vPC Restore and Source Hold-down Timer
[Figure: a recovering vPC leaf (Leaf 1) next to Leaf 2 below the spines. Callouts: control plane adjacencies not fully established; vPC peer-link connection not recovered yet; host-to-leaf connection not recovered yet; anycast VTEP advertisement into the underlay network with IP ECMP load sharing.]
Layer-3 Backup link between vPC VTEP (2)
Loop Prevention and Protection(1)
The current Characteristics of VXLAN
VXLAN VTEP does not participate in VLAN Spanning-Tree
VXLAN does not tunnel STP BPDU
VXLAN does not have a built-in Layer-2 loop detection mechanism
VXLAN does not have storm control in the overlay
Loop Prevention and Protection (2)
Loop Prevention and Protection (3)
HW VTEP Redundancy – L3 Gateway
L2 VXLAN Overlay
  Overlay: HSRP VIP
  Underlay: VTEP Anycast IP B
L3 VXLAN Overlay
  Overlay: No HSRP
  Underlay: Independent VTEPs
[Figure: VXLAN L3 gateways IP A1, IP A2, IP B1 and IP B2 connected toward WAN/DCI in both models]
Distributed Gateway Function in L3 Overlays
[Figure: the L3 boundary sits at each leaf of the L2/L3 fabric, directly above the attached application hosts]
Distributed IP Anycast Gateway
Detailed View
[Figure: L3 gateways below the underlay / IP core, each hosting both SVI A and SVI B]
VXLAN EVPN Configuration – Distributed Gateway SVI
fabric forwarding anycast-gateway-mac 0002.0002.0002    ! Anycast Gateway MAC address
vlan 555
  vn-segment 39000
interface nve1
  no shutdown
  source-interface loopback1          ! Used to derive local VTEP IP address
  host-reachability protocol bgp
  member vni 6000                     ! VxLAN Identifier
    suppress-arp
    mcast-group 235.1.1.1             ! IP Multicast Group for Multi-destination Traffic
  member vni 39000 associate-vrf
EVPN CP Configuration – Distributed Gateway
Prefix Routes (type 5)
router bgp 65000
  address-family ipv4 unicast
  address-family l2vpn evpn             ! Activate EVPN in iBGP
  neighbor x.x.x.x remote-as 65000
    update-source loopback1
    address-family ipv4 unicast
    address-family l2vpn evpn
      send-community extended           ! Activate EVPN route-attribute extensions
  vrf customer1
    address-family ipv4 unicast
      advertise l2vpn evpn              ! Advertise EVPN prefixes in ipv4 AF
VXLAN Bridging
802.1Q Tagged Traffic to VNI Mapping
Distributed IP Anycast Gateway
Packet-Walk – IP Forwarding within the Same Subnet aka Bridging (ARP)
1. PM1 sends an ARP request for Default Gateway – 10.10.10.1
2. The ARP request is suppressed at TOR and punted to the Supervisor, where MAC and IP are learned and distributed
PM1 ARP Cache: 10.10.10.1 -> GW_MAC
Standard behavior of End-Host (virtual or physical) to ARP for the Default Gateway
[Figure: VM1 (10.10.10.10) and PM1 (10.10.10.20) attached to TORs of the L3 fabric, steps 1 to 3 marked; the RIB on the TOR CPU has columns MAC, IP, L2 VNI, L3 VNI]
Distributed IP Anycast Gateway
Packet-Walk – IP Forwarding within the Same Subnet aka Bridging (ARP)
4. VM1 sends an ARP request for PM1 – 10.10.10.20
5. The ARP request is suppressed at TOR and punted to the Supervisor, where MAC and IP are learned and distributed
6. Assuming PM1 is known and a valid route …
RIB entry (MAC, IP, L2 VNI, L3 VNI): VM1_MAC, 10.10.10.10, 10000, 50000
VM1 ARP Cache: 10.10.10.20 -> PM1_MAC
[Figure: VM1 (10.10.10.10) and PM1 (10.10.10.20) behind two VXLAN L3 gateways of the L3 fabric]
If there is a Unicast RIB miss on the TOR, the ARP request is forwarded to all ports except the original sending port (ARP snooping). The ARP response is punted to the Supervisor of the destination TOR for Unicast RIB population (learn) and subsequently forwarded to the source TOR.
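The suppression behaviour in this packet walk can be sketched as a lookup table keyed by L2 VNI and IP: a hit is answered locally, a miss falls back to flooding. The class below is a toy model with made-up names, not the switch implementation; the addresses and VNI are the ones used on the slide.

```python
class ArpSuppressionCache:
    """Toy ARP suppression table on a TOR (illustrative only)."""

    def __init__(self):
        self._entries = {}                      # (L2 VNI, IP) -> MAC

    def learn(self, l2_vni, ip, mac):
        # Populated from local ARP snooping or from received host routes.
        self._entries[(l2_vni, ip)] = mac

    def handle_request(self, l2_vni, target_ip):
        mac = self._entries.get((l2_vni, target_ip))
        if mac is None:
            return ("flood", None)              # RIB miss: forward the request
        return ("reply", mac)                   # hit: answer on the host's behalf

if __name__ == "__main__":
    cache = ArpSuppressionCache()
    print(cache.handle_request(10000, "10.10.10.20"))
    cache.learn(10000, "10.10.10.20", "PM1_MAC")
    print(cache.handle_request(10000, "10.10.10.20"))
```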
Distributed IP Anycast Gateway
Packet-Walk – IP Forwarding within the Same Subnet aka Bridging (Data Packet)
7. VM1 generates a data packet with PM1_MAC as destination MAC
8. TOR receives the packet and performs Layer-2 lookup for the destination
9. TOR adds VXLAN-Header information (Destination VTEP, VNI, etc) and forwards the packet across the Layer-3 fabric, picking one of the equal cost paths available via the multiple Spines
10. The destination TOR receives the packet, …
Inner packet: SMAC: VM1_MAC, DMAC: PM1_MAC, SIP: 10.10.10.10, DIP: 10.10.10.20
Outer header: SVTEP: STOR_L0, DVTEP: DTOR_L0, VNI 10000
Source TOR tables: VLAN 123 <-> VNI 10000; PM1_MAC -> DTOR_L0, 10000
Destination TOR tables: VLAN 123 <-> VNI 10000; PM1_MAC -> eth1/23
If VM1 is not known to PM1, PM1 would ARP for VM1. The destination TOR would proxy for VM1. No Silent-Host discovery problem.
VXLAN Routing
Routed Traffic to VNI Mapping
• A common VNI (VNI X) is provisioned amongst the different VTEPs to carry routed traffic
• Routed traffic between VTEPs will be encapsulated in VNI X
• Standard longest prefix match routing takes place:
• Host routes for all known remote hosts are installed at every VTEP (forward over VNI X)
• Local hosts are covered by directly connected prefix, a host route will not be present
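As an illustration of the longest prefix match behaviour listed above, the Python sketch below holds one /32 host route for a known remote host and the directly connected /24. The next-hop strings and the second host address (20.20.20.30) are hypothetical labels chosen for this example.

```python
import ipaddress

def lookup(rib, destination):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in rib.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

if __name__ == "__main__":
    rib = {
        # Connected prefix of the local SVI (covers local and unknown hosts).
        ipaddress.ip_network("20.20.20.0/24"): "connected, bridge locally",
        # Host route learned for a known remote host.
        ipaddress.ip_network("20.20.20.20/32"): "remote VTEP over VNI X",
    }
    print(lookup(rib, "20.20.20.20"))   # known remote host: /32 wins
    print(lookup(rib, "20.20.20.30"))   # unknown host: falls back to the /24
```

The second lookup shows why an undiscovered host is first reached through the locally connected subnet rather than over VNI X.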
Distributed IP Anycast Gateway
Packet Flows – unknown Host (H4) in remote Subnet
• H1 → H4
– Routed via SVI B (VTEP1) to VLAN A (VTEP1) → Bridged over VNI A (as unknown unicast flood)
• H4 → H1
– Routed via SVI A (VTEP2) → VNI X → SVI B (VTEP1) → VLAN B’ (as H1 is known based on response)
• Standard longest prefix match routing: As long as H4 is not learnt by VTEP1, the only path to H4 is the locally connected subnet (VLAN A’)
Distributed IP Anycast Gateway
Packet Flows – unknown Host (H1) in remote Subnet
• H4 → H1
– Routed via SVI A (VTEP2) to VLAN B (VTEP1) → Routed over VNI X (as destination Subnet known)
• H1 → H4
– Routed via SVI B (VTEP1) → VNI X → SVI A (VTEP2) → VLAN A’ (as H1 is known based on response)
EVPN CP Configuration – Advertisement of prefix routes
Inject host IP addresses for host prefixes in Use a route-map to restrict the redistribution
BGP-EVPN to host prefixes
BGP-EVPN Configuration Summary
Enable features:
feature nv overlay
feature vn-segment-vlan-based
feature bgp
nv overlay evpn

VXLAN tunnel interface:
interface nve1
  no shutdown
  source-interface loopback1
  host-reachability protocol bgp
  member vni 6000
    suppress-arp
    mcast-group 235.1.1.1
  member vni 39000 associate-vrf

Distributed Gateway:
fabric forwarding anycast-gateway-mac 0002.0002.0002
interface vlan 3002
  no shutdown
  vrf member customer1
  ip address 10.10.10.1/24
  fabric forwarding mode anycast-gateway

Tenant VRF with EVPN:
vrf context customer1
  vni 39000
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

[Figure: VM1 (10.10.10.10) and PM2 (20.20.20.20) behind two VXLAN L3 gateways; VM1 ARP Cache: 20.20.20.20 -> GW_MAC]
Distributed IP Anycast Gateway
Packet-Walk – IP Forwarding within the Different Subnet aka Routing (Data Packet)
4. VM1 generates a data packet destined to PM2 IP (20.20.20.20) with GW_MAC as destination MAC
5. TOR receives the packet and performs Layer-… packet across the Layer-3 fabric, picking one of the equal cost paths available via the multiple Spines
7. The destination TOR receives the packet, strips off the VXLAN header and performs lookup and forwarding toward PM2
Packet from VM1 (VLAN 123): SMAC: VM1_MAC, DMAC: GW_MAC, SIP: 10.10.10.10, DIP: 20.20.20.20
Outer header across the fabric: SVTEP: STOR_L0, DVTEP: DTOR_L0, VNI 50000, inner DMAC: DTOR_MAC
Destination TOR tables: 20.20.20.20 -> PM2_MAC; PM2_MAC -> eth1/32
Packet to PM2 (VLAN 321): SMAC: GW_MAC, DMAC: PM2_MAC, SIP: 10.10.10.10, DIP: 20.20.20.20
SVI Resiliency with anycast MAC
An active GWY at every (redundant) Leaf
Filter: GWY MAC learning on the overlay
[Figure: hosts H2 and H4 behind redundant leaf pairs VTEP1/3 and VTEP2/4; every leaf of the L3 fabric is an active L3 GWY]
Integration with Orchestrators and Host Attachment
• Compute Controller (e.g. vCloud Director) integrated overlay provisioning
• Integrates physical and virtual end-points
• Topology discovery with triangulation
[Figure: compute orchestration (vCloud Director) and network orchestration (Network Manager / Controller) linked by an API; triggers arrive from the virtual access layer and from hosts attached to the leaves of the spine/leaf overlay]
Automating VNIs/SVIs as hosts attach
1. Virtual or Physical Machine comes online
2. New Trigger Event on TOR
   • New MAC Learn with VLAN
   • VDP* (N1kv and OVS) with VNI
   • VMTracker with VLAN ID + Port-Group
[Figure: steps 3 and 4 show the TORs consulting LDAP and instantiating VLAN, SVI, VRF, BGP]
Multi-DC Connectivity
Multi-Data Center Connectivity
LAN Extensions and IP mobility
Ethernet extensions between independent fabrics
IP traffic is forwarded via the optimal path (no hair-pinning)
[Figure: Data Center 1 and Data Center 2, each an L3 fabric of VXLAN L3 gateways, joined by an L3 domain; handoff shown as untagged, VLAN or VxLAN]
• Multi-POD
  • End-to-End VXLAN EVPN
• Multi-Fabric (AS#65501, AS#65502)
  • Connecting VXLAN EVPN via Ethernet to DCI (two box)
• Multi-Site (AS#65501, AS#65502)
  • Integrating VXLAN EVPN into DCI (single box)
Multi-Data Center Scale & Failure Containment
Keeping capacity planning prescriptive
[Figure: domain boundary between the DC fabrics and the DCI; DC 1 (AS65001) and DC 2 (AS65002) joined by L3 DCI and L2 DCI]
DCI Tunnel Stitching
L2 & L3 Services
[Figure: DC 1 (AS65001) and DC 2 (AS65002) stitched through L3 DCI and L2 DCI tunnels, shown in two variants]
L2-DCI on dedicated devices
Bridging of L2 tunnels is not possible initially
VXLAN L2 tunnels are stitched with the L2 DCI tunnels via VLAN normalization
[Figure: VXLAN in DC 1 (AS65001) and DC 2 (AS65002) handed off as VLANs to an L2 DCI running OTV/VXLAN/MPLS, alongside an L3 DCI]
EVPN Fabric – OTV Handoff detail (current)
• Two boxes
• L2 tunnel to L2 tunnel bridging comes with M3
• Border Leaves in L2 Gateway mode
• VNIs map to VLANs
• VLANs stretched in OTV
• Data Plane learning on handoff
[Figure: fabric VXLAN VNI terminates on a vPC pair of border leaf VTEPs, which hand off VLANs to the CE]
EVPN Fabric – EVPN Handoff detail (current)
• Two boxes
• L2 tunnel to L2 tunnel bridging in future ASICs
• Border Leaves in L2 Gateway mode
• VNIs map to VLANs
• VLANs map back to VNIs
• Data Plane learning on handoff
[Figure: fabric VXLAN VNI terminates on a vPC pair of border leaf VTEPs, which hand off VLANs to the CE]
No Anycast Gateway @ Border Leaf
The anycast GWY would be seen as a duplicate host address over the L2 DCI
[Figure: an anycast gateway (AGW) in DC 1 (AS65001) and in DC 2 (AS65002), connected by L2 DCI and L3 DCI]
EVPN – DCI interoperability
EVPN routes are split into L2 and L3 at the DCI edge
Routes for remote hosts will be independent L2 and L3 routes
Routes for local hosts will be combined L2+L3 routes
Mobility semantics are L2(MAC) centered, withdrawal trigger is relayed via L2-DCI
[Figure: host H2 in DC 2 (AS65002) is advertised inside its fabric as H2 IP+MAC and across the L3 DCI to DC 1 (AS65001) as H2 IP]
DCI for EVPN fabrics - L3 Services
Routed L3 Services: VRF-lite, MPLS-VPN
Directory L3 Services: LISP
DCI for EVPN fabrics - L3 Services
Routed – VRF-lite/ MPLS-VPN (option A)
• L3 DCI stretches the L3 VNIs providing an e2e path with hierarchy
• Mobility semantics are L2(MAC) centered, withdrawal trigger is relayed via L2-DCI
• Wake-up of dormant hosts (ARP) leverages the L2VNI stretched over the L2-DCI
• All other routed traffic goes over the L3VNI-L3DCI-L3VNI path
DCI for EVPN fabrics - L3 Services
Directory – LISP
• Mobility withdrawal trigger relayed over L2-DCI, semantics regenerated at L2-DCI edge
• Routes for remote hosts are only populated on demand
• A default route is used to send traffic to the DCI edge and trigger population of the routes
• Mobility semantics are re-generated at the DCI boundary when information requested on-demand is
obtained from LISP
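The on-demand behaviour in these bullets can be sketched as a cache that is filled only when traffic for a host first arrives at the DCI edge. This is a toy model with made-up class and method names; the plain dictionary stands in for the LISP Mapping System and the RLOC letters mirror the labels used in the deck's diagrams.

```python
class DciEdgeMapCache:
    """Toy pull-based host route cache at the DCI edge (illustrative only)."""

    def __init__(self, mapping_system):
        self.mapping_system = mapping_system   # EID -> list of RLOCs
        self.map_cache = {}                    # populated on demand only

    def forward(self, eid):
        # Traffic reaches the edge via the default route; resolve on first use.
        if eid not in self.map_cache:
            rlocs = self.mapping_system.get(eid)
            if rlocs is None:
                return None                    # host not registered anywhere
            self.map_cache[eid] = list(rlocs)
        return self.map_cache[eid]

if __name__ == "__main__":
    edge = DciEdgeMapCache({"10.2.0.2/32": ["E", "F", "G", "H"]})
    print(len(edge.map_cache))                 # nothing installed up front
    print(edge.forward("10.2.0.2/32"))         # first packet triggers the lookup
    print(len(edge.map_cache))                 # state now exists for this host
```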
Multi-site: Consolidated DCI for EVPN Fabrics
Combined L2 + L3 DCI service
• Single box
• L2 & L3 tunnel stitching
• Handoffs are not normalized to VLANs
• Single protocol for L2 & L3 DCI
• No splitting of MAC/IP routes
[Figure: host H2 is advertised as H2 IP+MAC inside DC 1 (AS65001), across the combined L3+L2 DCI, and inside DC 2 (AS65002)]
WAN Connectivity
Inbound and Outbound Traffic Optimization
Maintain Traffic Symmetry over Optimal Paths
External Path Optimization with Host Routing
• Host route advertisement
  • X/32 @ DC1
  • Y/32 @ DC2
  • Z/32 @ DC1
  • …
• Cross domain mobility semantics
[Figure: branch/campus reaching the data centers across the WAN/campus network]
Nexus 7x00 with F3 – 7.3 (shipping)
ASR9k TH/TYP – 5.3.2 (shipping)
ASR1k – (roadmap)
[Figure: VPNv4/VPNv6 to EVPN handoff; hosts X and Y in DC 1 (AS65502) and DC 2 (AS65501) joined by L2 DCI]
Inbound Path Optimization
Design Considerations
WAN Edge devices must be deployed in a different AS than the one used
for the VXLAN EVPN fabric
Inbound traffic destined to endpoints connected in a given Fabric should
always come in via the local connection to the WAN Edge devices
o Leverage route-map with AS-Prepend toward the local WAN Edge devices to
steer traffic for local endpoints
Outbound traffic should always exit via the local connection to the WAN
Edge devices
o Configure higher local-pref (200) for all prefixes toward the local WAN Edge
devices. Use the same local-pref 200 on the L3-DCI connection for all endpoints
connected to remote fabrics (default 100 for all the other prefixes learned via the
L3-DCI connection)
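The two levers named above, local preference for outbound traffic and AS-path prepending for inbound traffic, can be illustrated with a heavily simplified best-path comparison. Real BGP best-path selection has many more steps; the `Path` type, the `via` labels and the specific AS paths are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

@dataclass(frozen=True)
class Path:
    via: str
    local_pref: int = 100
    as_path: Tuple[int, ...] = ()

def best_path(paths: Iterable[Path]) -> Path:
    """Prefer the highest local-pref, then the shortest AS path (simplified)."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))

if __name__ == "__main__":
    # Outbound: the fabric prefers its local WAN edge (local-pref 200).
    outbound = [Path("local WAN edge", local_pref=200),
                Path("L3-DCI", local_pref=100)]
    print(best_path(outbound).via)
    # Inbound: the WAN sees a prepended path through the non-local fabric.
    inbound = [Path("fabric hosting the endpoint", as_path=(65001,)),
               Path("other fabric", as_path=(65002, 65002, 65002))]
    print(best_path(inbound).via)
```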
Handling host state at large scale with LISP
• Similar problem scale to DNS
• Leverage demand based protocols
• A directory of hosts
  • Location as well as policy
  • Location != Routing
[Figure: branch/closet with a LISP XTR reaching the LISP host directory across the WAN/campus]
NXOS 7.2
• Host detection done by the fabric
• LISP mobility triggered by reception of host routes
• MP-BGP to LISP handoff
  • VXLAN EVPN (Standalone)
  • VPNv4
Site Gateway (SG): LISP encap/decap, LISP signaling toward the LISP WAN
Fabric Protocol (e.g. BGP)
Fabric based Mobility: Move Detection, Local Routing Fix-up, BGP host advertisement
LISP to EVPN Multi-pod Integration
EVPN Multi-pod scenario
[Figure: the "roamer" 10.2.0.2/32 lands in a foreign network, moving from BGP AS 65001 (leaves L1 to L4, LISP encap/decap routers A to D) to BGP AS 65002 (leaves L5 to L8, LISP encap/decap routers E to H). iBGP host routes inside each AS and eBGP host routes with sequence community between them trigger a BGP withdraw in the old location (2). Routers E to H send a Map-Register for 10.2.0.2/32 (3), changing the Map-System entry from RLOC A,B,C,D to RLOC E,F,G,H, and a Map-Notify for 10.2.0.2/32 <E-H> reaches A to D (4). Routing table entries shown: 10.2.0.2/32 – L3, 65001; 10.2.0.2/32 – L6, 65002; 10.2.0.2/32 – Local; 10.2.0.2/32 – Null0-LISP.]
External Path Optimization for Multi-fabric
Regular routed L3 DCI, LISP based WAN
Ingress:
• Host information for each DC registered in LISP
• LISP mobility tracks hosts:
  • Updates LISP Mapping System
  • Triggers refresh of map-cache at branches
Egress:
• Cross DC traffic follows routes via L3-DCI
• Branch traffic follows the defaults via the local Gateway to the LISP WAN
[Figure: branch/closet and WAN/campus above DC 1 (AS65001) and DC 2 (AS65002), joined by L2 DCI and L3 DCI (VRF-lite)]
• No special provisioning considerations required
[Figure: the same topology with L3 DCI (LISP)]
DCI for EVPN Fabrics – LISP L3 Services
SVIs @ L3 Border provide reachability to all subnets (L2 VNIs to reach unknowns)
Use separate BLs for L2 and L3 DCI service connection
[Figure: DC 1 (AS65001) and DC 2 (AS65002) with separate border leaves. L2 DCI border leaf (VLAN handoff): no hosts in an extended subnet on this BL. L3 DCI border leaf (VXLAN): all L2 VNIs (and SVIs) on this BL. The default route points to the L3 border.]
LISP L3 DCI and dormant/unknown hosts
Traffic from X to Y when Y has not been discovered
Use different BLs for L3 and L2 DCI to circumvent the SVI on L2 BL limitation
The default path steers traffic to the LISP router (L3 DCI) to trigger LISP lookups
[Figure: L2VNI handed off as a VLAN to the L2 DCI, with a separate L3 DCI. The "roamer" 10.2.0.2/32 lands in a foreign network. Routing table entries shown: 10.2.0.2/32 – E-H; 10.2.0.2/32 – Local; Default – A-D.]
LISP Fabric Integration
Multi-fabric scenario
[Figure: a Map-Request to the LISP Map-System resolves 10.2.0.2/32 to RLOC E,F,G,H. The "roamer" lands in a foreign network. Routing table entries shown: 10.2.0.2/32 – Local; 10.2.0.2/32 – A-D; Default – A-D.]
LISP Ingress Path Optimization for Multi-site
L3-DCI and WAN based on LISP
• LISP handles all communication
  • DC to DC
  • Branch-DC
• No special routing considerations
• Scale benefits of demand based routing
[Figure: branch/closet and WAN/campus reach DC 1 (AS65001) and DC 2 (AS65002) over the LISP WAN; the DCs are also joined by L2 DCI]
Multi-site DCI and dormant/unknown hosts
Traffic from X to Y when Y has not been discovered
Consolidated BL for L3 and L2 DCI. IRB and L2 Stitching
The default path steers traffic to the Default router
[Figure: DC 1 (AS65001) and DC 2 (AS65002) joined by L2 DCI and L3 DCI over VXLAN. The default route at X points to the L3 border. The ARP request for Y is flooded down the L2VNI, L2 DCI path; Y receives the ARP request, replies and is now discovered. Legend: VXLAN encap/decap, L2 VNI, L3 VNI, SVI.]
LISP Fabric Integration
Multi-site scenario
[Figure: the "roamer" 10.2.0.2/32 lands in a foreign network, moving from BGP AS 65001 (leaves L1 to L4, LISP encap/decap routers A to D) to BGP AS 65002 (leaves L5 to L8, LISP encap/decap routers E to H), with an L2 DCI between the sites. BGP EVPN plus RARP/GARP signals the move and a BGP withdraw is sent in the old location (2). Routers E to H send a Map-Register for 10.2.0.2/32 (3), changing the Map-System entry from RLOC A,B,C,D to RLOC E,F,G,H, and a Map-Notify for 10.2.0.2/32 <E-H> reaches A to D (4). Routing table entries shown: 10.2.0.2/32 – Local/DCI; 10.2.0.2/32 – L3, 65001; 10.2.0.2/32 – L6, 65002; 10.2.0.2/32 – Null0-LISP; 10.2.0.2/32 – E-H; 10.2.0.2/32 – Local; Default – A-D.]
LISP Fabric Integration
Multi-site scenario
[Figure: the LISP Map-System holds 10.2.0.2/32 – RLOC E,F,G,H. An ARP-Request for 10.2.0.2/32 crosses the L2 DCI and the ARP-Reply returns 10.2.0.2/32 MAC-X, followed by an EVPN advertisement (steps 1 to 4). Sites: BGP AS 65001 (leaves L1 to L4, LISP encap/decap routers A to D) and BGP AS 65002 (leaves L5 to L8, routers E to H). Routing table entries shown: 10.2.0.2/32 – L6, 65002; 10.2.0.2/32 – Local/DCI; 10.2.0.2/32 – Null0-LISP; 10.2.0.2/32 – Local; 10.2.0.2/32 – A-D; Default – A-D.]
LISP Fabric Integration
Fabric Host Route Distribution @ SG
Scale Benefits of LISP for the Multi-Fabric L3-DCI
Extend the reduction of state into the DC fabric
Current:
• Most remote host state removed from the TORs.
• Only install state on the TORs for established cross-DC connections
Direction:
• All remote host state removed from the TORs
• Sending traffic across DCs is a matter of default routing to the edge of the DC (LISP forwards the traffic at the edge)
[Figure: branch and DCI/WAN/campus resolve hosts on demand (pull); host state is pushed within DC 1 and DC 2]
Multicast Enabled Underlay
Network Overlay
• May use PIM-ASM or PIM-BiDir (Different hardware has different capabilities)
• Polarization: Encapsulated flows appear as a single flow which hashes to a single path
• Entropy in the encapsulation header to depolarize tunnels
• Variable UDP source port in VXLAN outer header
• Underlay must support ECMP hashing on L4 port numbers
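A minimal sketch of how the variable UDP source port adds entropy. RFC 7348 recommends deriving the outer source port from a hash of inner-frame fields and keeping it in the dynamic/private range 49152-65535. The choice of CRC32, the exact fields hashed, and the function names `vxlan_source_port` and `ecmp_path` are assumptions for illustration; real VTEPs and switches use their own hardware hash functions.

```python
import zlib

# Dynamic/private port range recommended by RFC 7348 for the VXLAN source port.
PORT_MIN, PORT_MAX = 49152, 65535
VXLAN_DST_PORT = 4789  # IANA-assigned VXLAN destination port


def vxlan_source_port(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str,
                      proto: int, sport: int, dport: int) -> int:
    """Map the inner flow's header fields to an outer UDP source port."""
    key = f"{src_mac}|{dst_mac}|{src_ip}|{dst_ip}|{proto}|{sport}|{dport}"
    digest = zlib.crc32(key.encode())
    return PORT_MIN + digest % (PORT_MAX - PORT_MIN + 1)


def ecmp_path(outer_src_ip: str, outer_dst_ip: str,
              outer_sport: int, outer_dport: int, n_paths: int) -> int:
    """Pick an underlay path index the way an L4-aware ECMP hash might."""
    key = f"{outer_src_ip}|{outer_dst_ip}|{outer_sport}|{outer_dport}"
    return zlib.crc32(key.encode()) % n_paths


# Two inner flows between the same pair of VTEPs (addresses are hypothetical).
flows = [
    ("00:00:00:00:00:01", "00:00:00:00:00:02", "10.1.1.1", "10.1.1.2", 6, 12345, 80),
    ("00:00:00:00:00:01", "00:00:00:00:00:02", "10.1.1.1", "10.1.1.2", 6, 12346, 80),
]
for flow in flows:
    sport = vxlan_source_port(*flow)
    path = ecmp_path("192.0.2.1", "192.0.2.2", sport, VXLAN_DST_PORT, 4)
    print(sport, path)
```

Without the L4 port in the underlay hash, both flows share the same outer source and destination IP and would always take the same path.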
Unicast in the Underlay – Interfaces (1)
What should my Underlay look like?
• Know your IP addressing and IP scale requirements
• Use 1 prefix for all Underlay Links and Loopbacks
• Routed ports/interfaces
  • Interfaces between Spine and Leaf are in routed mode (no switchport)
  • For each Leaf / Spine connection, at least a /31 is required
  • Local to Remote VTEP (Loopback) adjacency requires routed interface in-between
  • Exception: connection from SW VTEP
Routed Interfaces/Ports (example):
• 2 Spine * 3 Leaf = 6 Links
• 6 Links * 2 (/31) + 3 VTEP = 15 IP Addresses required (/27)
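The addressing arithmetic for the 2-spine, 3-leaf example can be checked with Python's standard `ipaddress` module. This is a sketch under stated assumptions: the block `10.0.0.0/27` is a hypothetical prefix, and, following the slide, only the three VTEP loopbacks are counted.

```python
import ipaddress

SPINES, LEAVES = 2, 3
links = SPINES * LEAVES                 # 6 routed links
addresses_needed = links * 2 + LEAVES   # two per /31 link + one loopback per VTEP

# Hypothetical underlay block; any /27 works the same way.
underlay = ipaddress.ip_network("10.0.0.0/27")

# Carve one /31 per Spine/Leaf link from the start of the block.
link_subnets = list(underlay.subnets(new_prefix=31))[:links]

# Take the VTEP loopbacks as /32s from the space right after the link subnets.
loopbacks = [
    ipaddress.ip_network(f"{underlay.network_address + links * 2 + i}/32")
    for i in range(LEAVES)
]

print("addresses needed:", addresses_needed)
for net in link_subnets + loopbacks:
    print(net)
```

All link subnets and loopbacks come out of the single prefix, in line with the "1 prefix for all Underlay Links and Loopbacks" guidance.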
Unicast in the Underlay – Routing Protocol (1)
What should my Underlay look like?
• Routing-Protocol of choice (many flavors available)
• OSPF – watch your network type
  • p2p preferred (only LSA type-1)
    • well suited to routed interfaces/ports (optimal from an LSA database perspective)
    • Full SPF calculation on link-change
  • broadcast (LSA type-1 & 2 + DR/BDR election)
    • additional election and database overhead
• IS-IS
  • independent of IP (CLNS) and well suited for routed interfaces/ports
  • not everyone is familiar with it
[Diagram: Routed Interface + OSPF/ISIS, with p2p links]
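The database difference between the two OSPF network types can be counted with a small model. This is a sketch assuming a single area, one router LSA (type-1) per router, and one network LSA (type-2) per broadcast link once a DR is elected; the function name `ospf_lsa_count` is hypothetical.

```python
def ospf_lsa_count(spines: int, leaves: int, network_type: str) -> dict:
    """Count type-1 and type-2 LSAs in a single-area spine/leaf underlay."""
    routers = spines + leaves
    links = spines * leaves
    if network_type == "point-to-point":
        type2 = 0      # no DR on a p2p link, so no network LSA
    elif network_type == "broadcast":
        type2 = links  # the DR on each link originates one network LSA
    else:
        raise ValueError(f"unknown network type: {network_type}")
    return {"type1": routers, "type2": type2}


for network_type in ("point-to-point", "broadcast"):
    print(network_type, ospf_lsa_count(spines=2, leaves=3, network_type=network_type))
```

In this model the type-2 count grows with the number of links, which is the database overhead the broadcast bullet refers to.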
Unicast in the Underlay – Routing Protocol (2)
What should my Underlay look like?
• eBGP
  • neighbor is interface IP when using routed interfaces/ports approach
  • Use of loopbacks would require additional routing
• The Routing-Protocol Combo
  • IGP for underlay topology & reachability (e.g. IS-IS, OSPF)
  • iBGP for VTEP (loopback) reachability
  • iBGP route-reflector for simplification and scale
[Diagram 1: Routed Interface + eBGP, with bgp peer sessions]
[Diagram 2: Routed Interface, IS-IS + iBGP, with p2p links and bgp peer sessions]
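The scale argument for the iBGP route-reflector can be made concrete by counting sessions. This is a sketch with hypothetical function names; it assumes every client peers with every reflector and that the reflectors peer with each other.

```python
def ibgp_full_mesh_sessions(speakers: int) -> int:
    """Sessions needed when every iBGP speaker peers with every other one."""
    return speakers * (speakers - 1) // 2


def ibgp_route_reflector_sessions(clients: int, reflectors: int) -> int:
    """Sessions needed when clients peer only with the reflectors."""
    return clients * reflectors + reflectors * (reflectors - 1) // 2


# Example: 100 leaf VTEPs, with or without 2 route-reflectors.
print(ibgp_full_mesh_sessions(100))
print(ibgp_route_reflector_sessions(clients=100, reflectors=2))
```

With 100 leaves the full mesh needs 4950 sessions, while two reflectors need 201, and adding a leaf touches only the reflectors.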
Folded Clos Topology
Providing Topology Symmetry
[Diagram: Spine layer over an L3 Fabric, six VXLAN L2/3 Gateways, three Aggregation blocks, and a WAN/DCI connection]
vPC VTEPs as Border Leaves
L3 link between vPC peers to route around failures
• One Layer-3 connection per tenant VRF between the two vPC VTEPs to avoid black-holing traffic
[Diagram: Underlay Network with IP ECMP Load Sharing; two RRs; four VTEPs and two BL VTEPs; Tenant VRF A, Tenant VRF B, Tenant VRF C; IP Routing]
vPC VTEPs as Border Leaves
Advertise PIP for better routing around failures
[Diagram: Underlay Network with IP ECMP Load Sharing; two RRs]
black-hole.
External Routing with Independent Borders
Border Spines and Distributed Anycast GWY
Instrumentation and Overlay Awareness
[Diagram: two NV-edge devices]
Over-speed, Encapsulation & Effective Throughput
[Diagram: encap between 10GE and 40GE links]
• 1500 bytes/packet (10 Gbps), after encap: 1542 bytes/packet (10.1 Gbps)
• 64 bytes/packet (10 Gbps), after encap: 106 bytes/packet (10.3 Gbps)
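One rough way to reason about over-speed is to hold the packet rate constant and scale the bit rate by the growth in packet size. The sketch below does that with the 42-byte growth implied by the packet sizes on the slide. It ignores preamble and inter-frame gap, and the function name `encapsulated_rate_gbps` is hypothetical, so its results need not match the figures printed on the slide.

```python
def encapsulated_rate_gbps(payload_bytes: int, overhead_bytes: int,
                           offered_gbps: float) -> float:
    """Bit rate needed after encapsulation to keep the same packets per second."""
    return offered_gbps * (payload_bytes + overhead_bytes) / payload_bytes


OVERHEAD = 1542 - 1500  # 42 bytes of growth per packet, taken from the slide

for size in (1500, 64):
    rate = encapsulated_rate_gbps(size, OVERHEAD, 10.0)
    print(f"{size} bytes/packet at 10 Gbps needs {rate:.2f} Gbps after encap")
```

Under this accounting the fixed overhead costs about 3% at 1500 bytes and more than 60% at 64 bytes, which is why small-packet workloads drive the need for faster uplinks.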
Summary and Conclusion
Summary recommendations & takeaways
• Optimize the location of L2 and L3 GWYs to improve routing and minimize failure
exposure
• Leverage L3 VXLAN services enabled by control protocols as the main service and L2
extensions as the exception
• Design the underlay with the VXLAN overlay in mind
• A combination of pull protocols and push protocols may render optimal scale and
resiliency
• Design the network hierarchically: both the underlay and the overlay
• L3 Gateways are key to a sound overlay design
• Link the provisioning of the overlay and scoping of VNIs to the host orchestration system
for optimal scale
Thank you