Game Changer: Software-Defined Storage
and Container Schedulers
David vonThenen
{code} by Dell EMC
@dvonthenen
dvonthenen.com
github.com/dvonthenen
Agenda
• Container Schedulers
• Containers In Production
• Software-Defined Storage (SDS)
• Schedulers + SDS = Game Changing
• Demo
Schedulers
What is a Scheduler?
• Fair and efficient workload
placement
• Adhering to a set of constraints
• Quickly (and deterministically)
dispatching jobs
• Robust and tolerates errors
Let’s take a look: Apache Mesos
• Is a Container Scheduler
– Docker
– Unified Containerizer
• Cluster Manager
• Task placement based on CPU, Memory, and Disk
• User defined constraints
• 2 Layer Scheduler – Offer/Accept Model
Mesos Frameworks
• Ability to sub-schedule tasks based on Application needs
• Framework implements a Scheduler and Executor
– Scheduler – Accepts/Denies resources
– Executor – Application
• Multiple Frameworks
run within the cluster
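To make the offer/accept model concrete, here is a toy Python sketch of the two-level scheduling idea: the master (layer 1) cycles resource offers out to frameworks, and each framework's scheduler (layer 2) accepts or declines based on its own needs. All class and function names are invented for illustration — this is not the real Mesos API.

```python
# Toy sketch of Mesos's two-level offer/accept model (illustrative only;
# real frameworks implement the Mesos scheduler API, not this loop).

class Offer:
    def __init__(self, agent, cpus, mem):
        self.agent, self.cpus, self.mem = agent, cpus, mem

class Framework:
    """Layer 2: the framework's scheduler decides per offer."""
    def __init__(self, name, need_cpus, need_mem):
        self.name, self.need_cpus, self.need_mem = name, need_cpus, need_mem
        self.launched = []

    def consider(self, offer):
        # Accept only offers that satisfy this framework's constraints.
        if offer.cpus >= self.need_cpus and offer.mem >= self.need_mem:
            self.launched.append(offer.agent)  # "launch" a task (executor)
            return True                        # accept the offer
        return False                           # decline; master re-offers

def master_offer_loop(offers, frameworks):
    """Layer 1: the master fairly cycles resource offers to frameworks."""
    for i, offer in enumerate(offers):
        fw = frameworks[i % len(frameworks)]   # naive round-robin fairness
        if not fw.consider(offer):
            # Declined offers go back to the pool for other frameworks.
            for other in frameworks:
                if other is not fw and other.consider(offer):
                    break

offers = [Offer("agent1", cpus=4, mem=8192), Offer("agent2", cpus=1, mem=512)]
big = Framework("analytics", need_cpus=2, need_mem=4096)
small = Framework("web", need_cpus=1, need_mem=256)
master_offer_loop(offers, [big, small])
```

The key property the sketch preserves: the master never decides *what* runs where — it only makes offers, and each framework keeps its application-specific placement logic to itself.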
Framework / Offer Mechanism
Containers
Containers Today
• Many container workloads
are long running
• Many have state: user data,
configuration, etc.
• 7 of the top 12 apps on
Docker Hub are persistent
applications
Death of a Container
• Where does my data go?
• The usual answer: the
compute node’s local disk
• What happens on a node
failure?
• Production applications
require high availability
• External Storage!
(diagram: container filesystem — /etc, /var, /bin, /opt, and the /data mount)
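A toy Python model of the failure scenario above shows why node-local storage is not enough: data on the node's local disk dies with the node, while data on external storage can be re-attached when the container is rescheduled. The classes are invented for illustration.

```python
# Illustrative sketch: node-local data dies with its node; externally stored
# data survives and can be re-attached after the container is rescheduled.

class Node:
    def __init__(self, name):
        self.name = name
        self.local_disk = {}       # lost when the node fails

class ExternalStorage:
    def __init__(self):
        self.volumes = {}          # survives any single node failure

cluster = [Node("node1"), Node("node2")]
external = ExternalStorage()

# A stateful container on node1 writes to both locations.
cluster[0].local_disk["/data"] = "user records"
external.volumes["app-data"] = "user records"

# node1 fails; the scheduler restarts the container on the surviving node.
cluster.pop(0)
survivor = cluster[0]

local_copy = survivor.local_disk.get("/data")      # gone with node1
external_copy = external.volumes.get("app-data")   # still re-attachable
```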
How do we achieve this?
• REX-Ray
– Vendor-agnostic storage orchestration engine
– AWS, GCE, ScaleIO, VirtualBox, many more
– GitHub: https://github.com/emccode/rexray
• mesos-module-dvdi
– Provides hooks to Mesos agent nodes to
manage external storage
– GitHub: https://github.com/emccode/mesos-module-dvdi
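As a sketch of how an application would consume this, a Marathon app definition can request an external volume through environment variables that mesos-module-dvdi reads on the agent. The `DVDI_*` variable names below follow the mesos-module-dvdi README as I recall it, and the volume options are assumptions — verify both against the repo before use.

```python
import json

# Sketch of a Marathon app definition asking mesos-module-dvdi to mount an
# external volume via REX-Ray. Treat the DVDI_* names and the option string
# as assumptions to check against the mesos-module-dvdi documentation.
app = {
    "id": "redis-with-external-storage",
    "cpus": 0.5,
    "mem": 256,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "redis:3", "network": "BRIDGE"},
    },
    "env": {
        "DVDI_VOLUME_NAME": "redisdata",   # volume created/attached on demand
        "DVDI_VOLUME_DRIVER": "rexray",    # delegate storage ops to REX-Ray
        "DVDI_VOLUME_OPTS": "size=5",      # assumed option syntax
    },
}
payload = json.dumps(app, indent=2)  # POST this to Marathon's /v2/apps API
```

If the container is rescheduled to another agent, the same volume name resolves to the same external volume, so the data follows the app.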
Enablement is Out-Of-Band
• The “glue” that binds compute to external storage is
an add-on to the resource manager
• Obvious but easily dismissed answer: DevOps
– Software upgrades? On all nodes…
– Maintenance? Infrastructure, Storage Platform, etc
– Changes to Container Scheduler? Behaviors, APIs, etc
• Just make it happen!
• Almost 100% of the way there...
There’s got to be an easier way…
Software-Defined
Storage
What are they?
• Software-Defined Storage (SDS) serves as an abstraction
layer above the underlying storage
• Provides a (programmatic) mechanism to provision
storage
• Varying degrees of SDS: NFS, VMware Virtual Volumes
What makes them unique?
• Manage provisioning and data independent of underlying
hardware (operational)
• Abstract consumed logical storage from underlying
physical storage (physical)
• Automation of policy-driven SLAs, both external (users)
and internal (platform)
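The three bullets above can be boiled down to one idea: a single programmatic provisioning interface in front of interchangeable physical backends. Here is a minimal Python sketch of that idea — all class and method names are invented, and the "policy" is deliberately trivial.

```python
# Minimal sketch of the SDS idea: logical volumes are provisioned through one
# API by policy, without the caller picking physical hardware.

class Backend:
    """A pool of raw physical capacity (local disks, cloud, etc.)."""
    def __init__(self, name, capacity_gb):
        self.name, self.free_gb = name, capacity_gb

class SDSLayer:
    """Abstraction layer: maps logical volumes onto physical backends."""
    def __init__(self, backends):
        self.backends = backends
        self.volumes = {}

    def provision(self, name, size_gb):
        # Policy here is simply "first backend with room"; a real SDS would
        # apply SLAs (tiering, IOPS, replication) at exactly this point.
        for b in self.backends:
            if b.free_gb >= size_gb:
                b.free_gb -= size_gb
                self.volumes[name] = (b.name, size_gb)
                return name
        raise RuntimeError("insufficient capacity")

sds = SDSLayer([Backend("hdd-pool", 100), Backend("ssd-pool", 50)])
sds.provision("vol1", 80)
sds.provision("vol2", 40)  # hdd-pool is nearly full, so this lands on ssd-pool
```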
Let’s take a look: ScaleIO
• Scale-out block storage
• Linear performance
• Elastic architecture
• Infrastructure agnostic
• Try ScaleIO as a free download:
https://www.emc.com/products-solutions/trial-software-download/scaleio.htm
Scale-out Block Storage
• Scale from 3 nodes to 1000s
of nodes
• Add storage services and
servers on the fly to increase
capacity and performance
• Storage growth always
automatically aligned with
application needs
Elastic Architecture
Add, remove, and re-allocate on the fly, without stopping I/O
AUTO-REBALANCE
when resources are added
NO CAPACITY PLANNING OR MIGRATION!
AUTO-REBUILD
when resources fail or are removed
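A toy Python sketch captures the behavior described above — when a node joins or leaves, data placement is respread so every node carries roughly the same share. Real ScaleIO rebalancing and rebuild are far more sophisticated; this only illustrates the effect.

```python
# Toy sketch of auto-rebalance/auto-rebuild: data chunks are respread evenly
# whenever the node set changes (illustrative only, not ScaleIO's algorithm).

def rebalance(nodes, chunks):
    """Assign chunk ids round-robin so load stays even as nodes come and go."""
    placement = {n: [] for n in nodes}
    for i, chunk in enumerate(chunks):
        placement[nodes[i % len(nodes)]].append(chunk)
    return placement

chunks = list(range(12))
before = rebalance(["n1", "n2", "n3"], chunks)           # 4 chunks per node
after_add = rebalance(["n1", "n2", "n3", "n4"], chunks)  # auto-rebalance: 3 each
after_fail = rebalance(["n1", "n2", "n4"], chunks)       # auto-rebuild: 4 each
```

The point for the operator: no capacity planning or manual migration step appears anywhere — placement is recomputed by the system itself.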
Infrastructure Agnostic
• Mix and match OS, hypervisors, platforms, and media in the same ScaleIO system
• Operating systems, hypervisors, cloud platforms, and media: SSDs, HDDs, PCIe flash
Game Changer
Let’s Review…
• Container Schedulers:
– Great platform for container management
– Needs persistent storage for production Apps
– Adding persistent storage out-of-band presents challenges
• Software-Defined Storage:
– Scale-out storage
– Elastic architecture
– Infrastructure agnostic
Schedulers + SDS = ????
Better than the Sum of Our Parts
• Let’s create a Software-
Defined Storage Framework
• ScaleIO + Mesos Framework =
Awesome Sauce!
• https://github.com/codedellemc
/scaleio-framework
SDS Framework = Mind Blown
• External persistent storage native to scheduling platform
• Globally accessible storage
• Reduced storage array complexity
• Reduced maintenance
• Deploy Anywhere!
What this Means for your Apps
• No data loss on infrastructure failure
• Insulates apps from cluster
manager changes (APIs, etc.)
• Highly Available containers and Apps!
• Production ready!
• Tolerates failures
Surprising Combination
Demo
Configuration
• 3 Node Mesos Cluster (Management)
• 2 Mesos Agent nodes (Compute)
– Initially the first node online
– Second node will be onboarded or introduced later
• ScaleIO Cluster (Scale-out storage)
– 3 management nodes
– A 180 GB local disk on each management node comprises
the storage pool
Configuration (Cont.)
• ScaleIO Framework
– GitHub: https://github.com/codedellemc/scaleio-framework
• Persistent External Storage
– Using REX-Ray
› GitHub: https://github.com/emccode/rexray
– Using mesos-module-dvdi
› GitHub: https://github.com/emccode/mesos-module-dvdi
The Moving Parts
(diagram: the Scheduler drives the Mesos Cluster, which runs the App across two Mesos Agents)
#CodeOpen
Demo
Thank you
ContainerCon EU 2016 - Software-Defined Storage and Container Schedulers