SR-IOV and KVM virtual machines
under GNU/Linux Debian
Emulex OneConnect (OCm14102) 10Gbps cards
Yoann Juet @ University of Nantes, France
Information Technology Services
Version 1.0 (26 Dec 2014)
2/20
Our goal
• Virtualize high-performance servers and firewalls
requiring:
- Low network latency and jitter
- Low processor impact (I/O)
- High throughput (10Gbps)
• Solution: Single Root I/O Virtualization (SR-IOV)
- A single PCI card shows up as multiple virtual PCI cards
- Exposes n virtual interfaces from a single physical interface
> Shared bandwidth
3/20
Prerequisites
• Virtualization Technology for Directed I/O: Intel VT-d
or AMD-Vi
- Must be supported by both the CPU and the chipset
- Guest machines gain direct memory access (DMA) to PCI(e)
devices, such as Ethernet cards
• PCI-SIG Single Root I/O Virtualization: SR-IOV
- Must be supported by both the Ethernet cards and the BIOS
- Guest machines can achieve near bare-metal performance
4/20
Technical environment
• Dell PowerEdge M620
- Intel Xeon CPU E5-2660
- Dual QLogic ISP8214 10Gbps interfaces
> Logical names eth0, eth1
- Dual Emulex (OCm14102) SFP+ 10Gbps interfaces
> SR-IOV compatible card
> Logical names eth2, eth3
- Operating System Debian 7 (code name "Wheezy")
> Installed on both host and guest machines
5/20
BIOS
Host machine
• Ensure SR-IOV BIOS option is globally enabled
- System BIOS > Integrated Devices > SR-IOV Global Enable
6/20
BIOS
Host machine
• Ensure SR-IOV mode is set on both Ethernet cards
- Device Settings > … Emulex 10G ... > Virtualization Mode: SR-IOV
7/20
Debian: Starting with SR-IOV
Host machine
• Some Kernel Requirements:
CONFIG_PCI_IOV=y
CONFIG_PCI_STUB=y
CONFIG_VFIO_IOMMU_TYPE1=y
CONFIG_VFIO=y
CONFIG_VFIO_PCI=y
CONFIG_INTEL_IOMMU_DEFAULT_ON=y
→ The default Debian 7 kernel is not recommended for use with SR-IOV.
Prefer a recent kernel that fixes important SR-IOV bugs and
provides performance improvements.
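The kernel requirements above can be checked against a kernel config file. The following is a minimal sketch, not part of the original slides: the function name `check_kernel_opts` and its output format are illustrative, and accepting `=m` (module) in addition to `=y` is an assumption made here because the deck later loads vfio with modprobe.

```shell
#!/bin/sh
# Sketch: report which of the kernel options listed on this slide are
# enabled in a given kernel config file.
# Assumption: "=m" (built as a module) is accepted as well as "=y".
check_kernel_opts() {
    config_file=$1
    missing=0
    for opt in CONFIG_PCI_IOV CONFIG_PCI_STUB CONFIG_VFIO_IOMMU_TYPE1 \
               CONFIG_VFIO CONFIG_VFIO_PCI CONFIG_INTEL_IOMMU_DEFAULT_ON; do
        if grep -q "^${opt}=[ym]\$" "$config_file"; then
            echo "ok      $opt"
        else
            echo "missing $opt"
            missing=1
        fi
    done
    # Exit status 0 only when every option was found
    return "$missing"
}
```

On Debian the config of the running kernel is normally available as /boot/config-$(uname -r), so a typical call would be: check_kernel_opts "/boot/config-$(uname -r)"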
8/20
Debian: Starting with SR-IOV
Host machine
• Check for SR-IOV hardware support:
# lspci -v
…
04:00.0 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
...
Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
...
Initial VFs: 64, Total VFs: 64, Number of VFs: 64, Function Dependency Link: 00
Kernel driver in use: be2net
04:00.1 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
...
Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: be2net
(04:00.0 is eth2, 04:00.1 is eth3)
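On a host with many PCI devices, the lspci -v output is long. As a sketch (the function name `list_sriov_devices` is hypothetical), a small awk filter can print only the addresses of devices that advertise the SR-IOV capability. It assumes the usual lspci -v layout: device header lines start in column 0 and capability lines are indented.

```shell
#!/bin/sh
# Sketch: read `lspci -v` output on stdin and print the PCI address of
# every device whose capability list mentions SR-IOV.
list_sriov_devices() {
    awk '
        # A device header line starts in column 0 with the PCI address
        /^[0-9a-fA-F]/ { dev = $1 }
        # Capability lines are indented below the header
        /Single Root I\/O Virtualization/ { print dev }
    '
}
```

Typical use would be: lspci -v | list_sriov_devices. Running lspci as root is usually needed for the capability list to be readable.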
9/20
Debian: Starting with SR-IOV
Host machine
• Check for Intel's VT-d IOMMU support:
# dmesg | egrep -i "DMA|IOMMU"
...
dmar: IOMMU 0: reg_base_addr d3000000 ver 1:0 cap d2078c106f0462 ecap f020fe
dmar: IOMMU 1: reg_base_addr df100000 ver 1:0 cap d2078c106f0462 ecap f020fe
...
IOMMU: Setting identity map for device xxxxxxxx
...
PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
…
https://www.kernel.org/doc/Documentation/vfio.txt
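As a complement to reading the kernel log, kernels with IOMMU group support (which VFIO relies on) expose the groups under /sys/kernel/iommu_groups. The sketch below is not from the original slides; the function name `count_iommu_groups` is hypothetical. It counts the groups and fails when none exist, which usually means the IOMMU is disabled.

```shell
#!/bin/sh
# Sketch: count IOMMU groups; a count of 0 suggests the IOMMU is not
# active. The directory argument is optional and defaults to the real
# sysfs location.
count_iommu_groups() {
    groups_dir=${1:-/sys/kernel/iommu_groups}
    if [ ! -d "$groups_dir" ]; then
        echo 0
        return 1
    fi
    # Each group is one sub-directory named after its group number
    n=$(find "$groups_dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
    echo "$n"
    [ "$n" -gt 0 ]
}
```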
10/20
Debian: Starting with SR-IOV
Host machine
• Activate SR-IOV on both 10Gbps interfaces with 8 VFs (63 max. allowed) per PF
# echo "options be2net num_vfs=8" >> /etc/modprobe.d/be2net.conf
Then reboot or reload the be2net driver as shown below:
# rmmod be2net && modprobe be2net num_vfs=8
be2net 0000:04:00.0: Max: txqs 7, rxqs 1, rss 0, eqs 8, vfs 63
be2net 0000:04:04.0: enabling device (0000 -> 0002)
...
be2net 0000:04:04.7: enabling device (0000 -> 0002)
• Warning: SR-IOV activation is currently unavailable through the sysfs interface. As a
consequence, the num_vfs module parameter sets exactly the same number of VFs on both PFs (OCm14102),
and the per-PF sysfs commands below are not usable here:
# echo 8 > /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs
# echo 8 > /sys/bus/pci/devices/0000:04:00.1/sriov_numvfs
(0000:04:00.0 and 0000:04:00.1 are the device IDs for eth2 and eth3)
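Appending to be2net.conf with >> adds a new options line each time the command is run. A minimal sketch of an idempotent alternative follows; the function name `set_be2net_num_vfs` is hypothetical, and it assumes the file only needs a single "options be2net num_vfs=N" line.

```shell
#!/bin/sh
# Sketch: write the num_vfs option so that the file ends up with exactly
# one "options be2net num_vfs=..." line, however often this is run.
set_be2net_num_vfs() {
    num_vfs=$1
    conf=${2:-/etc/modprobe.d/be2net.conf}
    touch "$conf"
    tmp=$(mktemp)
    # Keep every line except an earlier num_vfs setting
    # (grep -v exits non-zero when nothing is left, hence "|| true")
    grep -v '^options be2net num_vfs=' "$conf" > "$tmp" || true
    echo "options be2net num_vfs=$num_vfs" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

The driver still has to be reloaded (or the host rebooted) for the new value to take effect, as described on this slide.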
11/20
Debian: Starting with SR-IOV
Host machine
• Ensure that new virtual PCIe devices (Virtual Functions) are visible:
# lspci
...
04:00.0 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:00.1 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.0 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.1 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.2 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.3 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.4 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.5 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.6 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:04.7 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.0 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.1 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.2 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.3 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.4 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.5 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.6 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:0c.7 Ethernet controller: Emulex Corporation OneConnect NIC (Skyhawk) (rev 10)
04:00.0 and 04:00.1: the two PFs (eth2 and eth3)
04:04.0 to 04:04.7: 8 VFs on the first PF (eth2)
04:0c.0 to 04:0c.7: 8 VFs on the second PF (eth3)
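Instead of counting lspci lines by eye, the VFs of a PF can be counted through sysfs: once VFs are enabled, the PF's device directory contains one virtfnN symlink per VF. This is a sketch with a hypothetical function name, `count_vfs`.

```shell
#!/bin/sh
# Sketch: count the virtfnN symlinks in a PF's sysfs device directory.
count_vfs() {
    pf_dir=$1
    n=0
    for link in "$pf_dir"/virtfn*; do
        # When nothing matches, the pattern is left as-is; skip it
        [ -e "$link" ] || [ -L "$link" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}
```

With the addresses used in this deck, count_vfs /sys/bus/pci/devices/0000:04:00.0 would be expected to report 8 for eth2 after activation.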
12/20
Debian: Starting with SR-IOV
Host machine
• Each VF behaves like a traditional network interface - below, logical
names eth4 → eth19
# ip link show | grep mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP...
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP...
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
7: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
8: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
9: eth6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
...
17: eth15: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
18: eth16: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
19: eth17: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
20: eth18: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
21: eth19: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
eth0 to eth3: 4 NICs on-board
eth4 to eth19: 16 unused VFs are visible
13/20
Debian: PCI passthrough with libvirt
Host machine
• First method: Assignment with <hostdev> block
<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='<dom_id>' bus='<bus_id>' slot='<slot_id>' function='<func_id>'/>
</source>
</hostdev>
Where <dom_id>, <bus_id>, <slot_id> and <func_id> are given by:
# lspci -D
0000:01:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
...
0000:01:11.6 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
Address format: <dom_id>:<bus_id>:<slot_id>.<func_id>
- First virtual PCIe device (VF0): <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
- Last virtual PCIe device (VF7): <address domain='0x0000' bus='0x01' slot='0x11' function='0x6'/>
Excerpt from guest XML file
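The mapping from an lspci -D address to the libvirt <address> element is mechanical, so it can be scripted. The sketch below uses a hypothetical function name, `pci_to_libvirt_address`, and assumes the input is always in the full dom:bus:slot.func form printed by lspci -D.

```shell
#!/bin/sh
# Sketch: turn "dddd:bb:ss.f" (as printed by lspci -D) into the
# <address .../> element expected inside a libvirt <source> block.
pci_to_libvirt_address() {
    addr=$1
    dom=${addr%%:*};   rest=${addr#*:}   # domain, then "bb:ss.f"
    bus=${rest%%:*};   rest=${rest#*:}   # bus, then "ss.f"
    slot=${rest%%.*}                     # slot
    func=${rest#*.}                      # function
    printf "<address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n" \
        "$dom" "$bus" "$slot" "$func"
}
```

For example, pci_to_libvirt_address 0000:01:10.0 prints the VF0 element shown on this slide.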
14/20
Debian: PCI passthrough with libvirt
Host machine
• Second method: Assignment with <interface type='hostdev'> block
<interface type='hostdev' managed='yes'>
<mac address='<virtual_mac_address>'/>
<source>
<address domain='<dom_id>' bus='<bus_id>' slot='<slot_id>' function='<func_id>'/>
</source>
</interface>
Where <virtual_mac_address> is the guest interface's virtual MAC address. <dom_id>, <bus_id>, <slot_id> and <func_id> are defined in the previous slide.
Unfortunately, this assignment method doesn't work on a standard Debian 7 distro (qemu-kvm 1.1.2, libvirt
0.9.12) → need to upgrade qemu-kvm to version 1.3 or later
# virsh define 01-test.xml
Domain 01-test defined from 01-test.xml
# virsh start 01-test
error: Failed to start domain 01-test
error: An error occurred, but the cause is unknown
Excerpt from guest XML file
15/20
Debian: PCI passthrough with libvirt
Host machine
• Third method: Assignment from a pool of VFs
<network>
<name>sriov</name>
<forward mode='hostdev' managed='yes'>
<driver name='vfio'/>
<pf dev='<iface>'/>
</forward>
</network>
<interface type='network'>
<source network='sriov'/>
<vlan>
<tag id='<vlan_id>'/>
</vlan>
</interface>
Again, this assignment method is currently unsupported on Debian 7 → need to upgrade libvirt to version 0.10.0
or later as well as qemu-kvm for VFIO PCI device assignment
The <network> block above is the network XML file (directory /etc/libvirt/qemu/networks/)
The <interface> block above is an excerpt from the guest XML file
16/20
Debian: PCI passthrough with libvirt
Host machine
• The third method is preferred for its simplicity, but it requires newer
versions of libvirt and qemu-kvm → Debian/Backports
# echo "deb ftp://<mirror-debian>/debian wheezy-backports main" >> /etc/apt/sources.list
# apt-get update
# apt-get -t wheezy-backports install libvirt-bin
# apt-get -t wheezy-backports install qemu-kvm
• Check for correct installation:
# virsh version
Compiled against library: libvirt 1.2.4
Using library: libvirt 1.2.4
Using API: QEMU 1.2.4
Running hypervisor: QEMU 2.0.0
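The version requirement stated earlier in the deck (libvirt 0.10.0 or later for VF pools) can be tested in a script. This is a sketch with hypothetical function names (`version_ge`, `libvirt_library_version`); it relies on the -V (version sort) option of GNU sort.

```shell
#!/bin/sh
# Sketch: succeed when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sketch: extract the libvirt library version from `virsh version`
# output read on stdin (the "Using library:" line).
libvirt_library_version() {
    awk '/^Using library:/ { print $NF }'
}
```

A possible use: version_ge "$(virsh version | libvirt_library_version)" 0.10.0 && echo "libvirt is recent enough for VF pools"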
17/20
Debian: PCI passthrough with libvirt
Host machine
• Define two pools of PCIe devices for passthrough; no need to worry about VF PCI
IDs...
# vi /etc/libvirt/qemu/networks/pf-eth2.xml
<network>
<name>pf-eth2</name>
<forward mode='hostdev' managed='yes'>
<driver name='vfio'/>
<pf dev='eth2'/>
</forward>
</network>
# virsh net-define /etc/libvirt/qemu/networks/pf-eth2.xml
# virsh net-start pf-eth2
# virsh net-autostart pf-eth2
# modprobe vfio
# vi /etc/libvirt/qemu/networks/pf-eth3.xml
<network>
<name>pf-eth3</name>
<forward mode='hostdev' managed='yes'>
<driver name='vfio'/>
<pf dev='eth3'/>
</forward>
</network>
# virsh net-define /etc/libvirt/qemu/networks/pf-eth3.xml
# virsh net-start pf-eth3
# virsh net-autostart pf-eth3
# virsh net-list
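The two network files differ only in the interface name, so they can be generated from one template. This is a sketch: the function name `gen_pf_pool_xml` is hypothetical, and the XML body is the one shown on this slide with the PF name substituted.

```shell
#!/bin/sh
# Sketch: print the libvirt network XML for a VF pool backed by one PF.
gen_pf_pool_xml() {
    iface=$1
    cat <<EOF
<network>
  <name>pf-$iface</name>
  <forward mode='hostdev' managed='yes'>
    <driver name='vfio'/>
    <pf dev='$iface'/>
  </forward>
</network>
EOF
}

# Possible use, left commented out because it changes the host's
# libvirt configuration:
#   for iface in eth2 eth3; do
#       gen_pf_pool_xml "$iface" > "/etc/libvirt/qemu/networks/pf-$iface.xml"
#       virsh net-define "/etc/libvirt/qemu/networks/pf-$iface.xml"
#       virsh net-start "pf-$iface"
#       virsh net-autostart "pf-$iface"
#   done
```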
18/20
Debian: PCI passthrough with libvirt
Host machine
• In each guest XML file, specify the source pool, the VLAN ID and (if required) the
interface MAC address
# vi /etc/libvirt/qemu/myguest.xml
...
<interface type='network'>
<source network='pf-eth<2|3>'/>
<vlan>
<tag id='<vlan_id>'/>
</vlan>
</interface>
...
# virsh define myguest.xml
# virsh autostart myguest
# virsh start myguest
OR
# vi /etc/libvirt/qemu/myguest.xml
...
<interface type='network'>
<mac address='<mac-address>'/>
<source network='pf-eth<2|3>'/>
<vlan>
<tag id='<vlan_id>'/>
</vlan>
</interface>
...
# virsh define myguest.xml
# virsh autostart myguest
# virsh start myguest
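Both variants of the guest <interface> block can be produced by one helper in which the MAC address is optional. This is a sketch; the function name `gen_guest_interface_xml` and its argument order are illustrative choices, not something defined by libvirt.

```shell
#!/bin/sh
# Sketch: print the guest <interface> block for a VF taken from a pool.
# Arguments: pool name, VLAN ID, optional MAC address.
gen_guest_interface_xml() {
    pool=$1
    vlan_id=$2
    mac=${3:-}
    echo "<interface type='network'>"
    # The <mac> element is only emitted when a MAC address was given
    if [ -n "$mac" ]; then
        echo "  <mac address='$mac'/>"
    fi
    echo "  <source network='$pool'/>"
    echo "  <vlan>"
    echo "    <tag id='$vlan_id'/>"
    echo "  </vlan>"
    echo "</interface>"
}
```

For instance, gen_guest_interface_xml pf-eth2 100 prints the first form (VLAN ID 100 is an arbitrary example value), and adding a third argument such as 52:54:00:12:34:56 prints the second form.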
19/20
Debian: Starting
Guest machine
• A stock Debian 7 (kernel 3.2.x) works
perfectly on guest machines
• Virtual interfaces use the be2net
driver
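Inside the guest, the driver bound to an interface can be read from sysfs, where /sys/class/net/<iface>/device/driver is a symlink to the driver's directory. This is a sketch with a hypothetical function name, `iface_driver`; the optional second argument exists only so that the sysfs root can be swapped out for testing.

```shell
#!/bin/sh
# Sketch: print the name of the kernel driver bound to a network
# interface, by resolving its sysfs "driver" symlink.
iface_driver() {
    iface=$1
    sysfs_root=${2:-/sys}
    link=$(readlink "$sysfs_root/class/net/$iface/device/driver") || return 1
    # The last path component of the link target is the driver name
    basename "$link"
}
```

In a guest that received a VF from the Emulex card, iface_driver eth0 would be expected to print be2net (the interface name eth0 is an assumption about the guest).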
20/20
University of Nantes – IT Services
Questions
Yoann (dot) Juet (at) univ-nantes.fr
