See tips to improve your Cognos (v10 and v11) environment. Topics include the new Interactive Performance Assistant (v11), hardware and server specifics, failover and high availability, high- and low-affinity requests, an overview of services, Java heap settings, IIS configuration, and non-Cognos-related tuning. View the video recording and download this deck at: http://www.senturus.com/resources/cognos-analytics-performance-tuning/
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
Among the most common problems a database administrator or user encounters is auditing and monitoring the health of their databases: observing database activity to stay aware of user actions, finding slow-running statements, determining why performance drops during certain intervals, and diagnosing corruption issues.
In this webinar, you'll learn:
- More about the auditing feature in PostgreSQL/EPAS
- The utility of the built-in monitoring tools in PostgreSQL/EPAS for solving some commonly reported scenarios
- How to use pg_stat_statements and pg_stat_activity to find execution-time statistics for long-running queries, the number of buffer reads, and wait events
- How to use pgstattuple in debugging scenarios where relation-level information is required
- How to use pageinspect for page-level details of a relation
This webinar focuses on the built-in monitoring tools available in PostgreSQL/EPAS and does not discuss external tools available for monitoring PostgreSQL.
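The pg_stat_statements analysis mentioned above can be sketched as follows. The SQL is what you would run against a live server (column names assume PostgreSQL 13+, where the column is `total_exec_time`); the sample rows and the ranking helper are invented for illustration:

```python
# Sketch of finding long-running statements with pg_stat_statements.
# The SQL below targets a live server; the rows are invented samples.
SLOW_QUERY_SQL = """
SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
"""

# Invented sample rows: (query text, calls, total execution time in ms).
sample_rows = [
    ("SELECT * FROM orders WHERE ...", 120, 84000.0),
    ("UPDATE stock SET qty = ...", 3000, 45000.0),
    ("SELECT id FROM users WHERE ...", 50000, 2500.0),
]

def mean_time_per_call(rows):
    """Rank statements by average execution time per call (ms)."""
    return sorted(
        ((query, total / calls) for query, calls, total in rows),
        key=lambda item: item[1],
        reverse=True,
    )

for query, avg_ms in mean_time_per_call(sample_rows):
    print(f"{avg_ms:10.2f} ms  {query}")
```

Ranking by mean time per call rather than total time surfaces statements that are individually slow even when they run rarely.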
Making your PostgreSQL Database Highly Available (EDB)
High Availability is one of the most important requirements for mission-critical database systems. It is important for business continuity.
Enterprises cannot afford an outage of mission-critical applications, as mere minutes of downtime can cost millions of dollars in lost revenue.
Therefore, making a database environment highly available is typically among the highest priorities, and it poses significant challenges and questions for enterprises and database administrators.
What you will learn at this webinar:
- Database high availability basics in PostgreSQL
- How to design your environment for high availability
- High availability options available for PostgreSQL
- What EDB can offer to help enterprises meet their high availability requirements
Development of concurrent services using In-Memory Data Grids (jlorenzocima)
Presented as part of OTN Tour 2014, this session covers the basics of an In-Memory Data Grid (IMDG) solution: it explains how an IMDG works, how it can be used within an architecture, and shows some use cases. Enjoy!
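To make the basic IMDG idea concrete, here is a minimal, illustrative sketch (not any vendor's API): keys are hash-partitioned across in-memory nodes, which is what lets a data grid scale reads and writes horizontally. Real products add replication, eviction, and querying on top.

```python
# Toy sketch of the core idea behind an In-Memory Data Grid:
# keys are hash-partitioned across several in-memory "nodes"
# (plain dicts here), so the same key always lands on the same node.
import zlib

class MiniGrid:
    def __init__(self, nodes=4):
        self.partitions = [{} for _ in range(nodes)]

    def _node_for(self, key: str) -> dict:
        # Stable hash so routing is deterministic across calls.
        return self.partitions[zlib.crc32(key.encode()) % len(self.partitions)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

grid = MiniGrid()
grid.put("session:42", {"user": "alice"})
print(grid.get("session:42"))
```

In a real grid each partition would live in a separate process or host, and the client would route requests over the network using the same hashing idea.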
In this webinar you will learn about the challenges encountered when migrating an Oracle database to PostgreSQL. We will present the lessons from the large, complex Oracle compatibility assessments of the last two years, including the more than 2,200,000 Oracle DDL constructs assessed through EDB's Migration Portal this year.
During the talk we will cover:
- Storage definitions
- Packages
- Stored procedures
- PL/SQL code
- Proprietary database APIs
- Complex database migrations
We will close the talk with a demonstration of migration tools that significantly simplify Oracle-to-PostgreSQL migration and reduce its risks.
Overcoming write availability challenges of PostgreSQL (EDB)
There's no shortage of physical replication solutions for PostgreSQL; they scale horizontally and provide high read availability. Where they fall short is write availability, which leads many users to consider PostgreSQL logical replication. Existing solutions have a single point of failure or depend on a forked, vendor-provided PostgreSQL extension, making reliable, enterprise-class logical replication hard to come by. Furthermore, these solutions put limits on scaling PostgreSQL.
By combining Kafka, an open source event streaming system, with PostgreSQL, customers can get a fault-tolerant, scalable logical replication service. Learn how EDB Replicate leverages Kafka for the high write availability needed by today's demanding consumers, who expect their applications to be always available and won't tolerate latency.
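The abstract does not describe EDB Replicate's internals, but the general pattern of logical replication through a durable, ordered log can be sketched with a toy example, where a plain Python list stands in for a Kafka topic:

```python
# Toy sketch: logical replication through a durable, ordered log.
# A plain list stands in for a Kafka topic; a real deployment would
# use Kafka producers/consumers. This does not reflect EDB
# Replicate's actual implementation.

topic = []  # ordered change log, as a Kafka topic would provide

def publish(change):
    topic.append(change)  # producer side: append-only writes

def apply_from(offset, state):
    """Consumer side: replay changes from a known offset."""
    for change in topic[offset:]:
        op, key, value = change
        if op == "upsert":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return len(topic)  # next offset to resume from after a crash

publish(("upsert", "user:1", {"name": "Ada"}))
publish(("upsert", "user:2", {"name": "Grace"}))
publish(("delete", "user:1", None))

replica = {}
next_offset = apply_from(0, replica)
```

Because consumers track their own offsets, a crashed replica can rejoin and replay from where it left off, which is the property that removes the single point of failure the abstract mentions.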
An overview of reference architectures for Postgres (EDB)
EDB Reference Architectures are designed to help new and existing users alike to quickly design a deployment architecture that suits their needs. They can be used as either the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
In this webinar, we will review the following architectures:
- Single Node
- Multi Node with Asynchronous Replication
- Multi Node with Synchronous Replication
- Add-on Architectures
Introducing Data Redaction - an enabler to data security in EDB Postgres Adva... (EDB)
With the rapid growth in digitalization, coupled with the current pandemic situation globally, many organizations and businesses are forced to operate remotely and online, more than they would prefer. At such times, how do corporations and businesses ensure data security, especially the secure management of personal information?
There are many techniques used to secure information, such as authentication, authorization, access control, virtual database, and encryption. In this webinar, we focus on Data Redaction - a technique that limits sensitive data exposure in EDB Postgres Advanced Server (EPAS).
This webinar covers:
- What is EDB Data Redaction
- How to limit sensitive data exposure in EPAS
- Provision for Oracle compatibility in EPAS
- Demo
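In EPAS, redaction is configured declaratively in SQL. The sketch below pairs an illustrative policy statement (the table, column, and function names are hypothetical) with a Python version of the kind of masking function such a policy applies server-side:

```python
# Illustrative EPAS-style redaction policy; table, column, and
# function names are hypothetical, not from the webinar.
REDACTION_POLICY_SQL = """
CREATE REDACTION POLICY mask_ssn ON employees
    ADD COLUMN ssn USING redact_ssn(ssn);
"""

def redact_ssn(ssn: str) -> str:
    """Keep the last four digits, mask the rest -- the same idea a
    redaction function applies to query results for non-privileged
    users."""
    digits = ssn.replace("-", "")
    return "***-**-" + digits[-4:]

print(redact_ssn("123-45-6789"))
```

The key point of redaction is that the stored value is untouched; only what non-privileged sessions see is transformed at query time.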
Using PEM to understand and improve performance in Postgres: Postgres Tuning ... (EDB)
The Postgres Enterprise Manager (PEM) Tuning Wizard reviews your installation, and recommends a set of configuration options that will help tune a Postgres installation to best suit the anticipated workload. PEM's Performance Diagnostics uses Postgres' wait state information to analyze queries in context of the current workload and help identify further performance improvement opportunities in terms of locks, IO, and CPU bottlenecks.
This webinar covers:
- How to intelligently manage all your database servers with a single console
- How to identify the useful features and functionality needed for visual database administration
- How to manage the performance and design of your database servers
An Expert's Guide to Migrating Legacy Databases to PostgreSQL (EDB)
This webinar gives an overview of the difficulties teams face when migrating Oracle databases to PostgreSQL. We will share insights gained from running large-scale Oracle compatibility assessments over the last two years, including the more than 2,200,000 Oracle DDL constructs assessed through EDB's Migration Portal in 2020.
In this session we will cover:
Storage definitions
Packages
Stored procedures
PL/SQL code
Proprietary database APIs
Large-scale data migrations
We will close the webinar by demonstrating migration tools that significantly simplify the migration of Oracle databases to PostgreSQL and help reduce the risk of migrating to PostgreSQL.
Context: A fairly technical webinar focused on how to move off Oracle, what the common pitfalls are, and how to handle them.
Audience: Business owners and architects who want to analyze the feasibility of moving off one of the leading, and most disliked, legacy databases.
The webinar will review a multi-layered framework for PostgreSQL security, with a deeper focus on limiting access to the database and data, as well as securing the data.
Using the popular AAA (Authentication, Authorization, Auditing) framework we will cover:
- Best practices for authentication (trust, certificate, MD5, SCRAM, etc.)
- Advanced approaches, such as password profiles
- A deep dive into authorization and data access control for roles, database objects (tables, etc.), view usage, row-level security, and data redaction
- Auditing, encryption, and SQL injection attack prevention
Note: this session is delivered in French
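As one concrete instance of the authentication best practices listed above, a pg_hba.conf entry can require SCRAM password hashing rather than MD5, or TLS client certificates; the database names, users, and addresses below are illustrative:

```
# TYPE   DATABASE  USER      ADDRESS        METHOD
host     appdb     appuser   10.0.0.0/24    scram-sha-256   # password auth, SCRAM hashed
hostssl  appdb     reporter  10.0.1.0/24    cert            # TLS client certificates
local    all       postgres                 peer            # local admin via OS identity
```

PostgreSQL evaluates these rules top to bottom and applies the first match, so more specific rules should precede broad ones.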
This talk is about a project to find a replacement for all IBM products in the company, using databases as the example: the goal of the project, the lessons learned, and a short overview of the options.
We migrated about 500 Db2 databases to EnterpriseDB, ranging from small up to 4 TB, and implemented a completely new, fully automated deployment of VMs and databases. The databases have now been in production for 11 months. The talk gives an overview of the project, the lessons learned, and a few technical parameters that were found to matter for stability and performance.
A Complete Guide to Migrating Legacy Databases to PostgreSQL (EDB)
This webinar will review the challenges teams face when migrating an Oracle database to PostgreSQL. We will share insights from large-scale Oracle compatibility assessments carried out over the last two years, including the more than 2,200,000 Oracle DDL constructs assessed through EDB's Migration Portal in 2020.
During this session we will address:
Storage definitions
Packages
Stored procedures
PL/SQL code
Proprietary database APIs
Large-scale data migrations
We will end the session with a demonstration of migration tools that considerably simplify, and help reduce the risk of, migrating an Oracle database to PostgreSQL.
Context:
A moderately technical webinar focused on how to move off Oracle, what the pitfalls are, and how to approach them.
Audience:
Business leaders and architects who want to assess the feasibility of moving off one of the most widely used, and most disliked, legacy databases.
An overview of reference architectures for Postgres (EDB)
EDB Reference Architectures are designed to help new and existing users alike to quickly design a deployment architecture that suits their needs. They can be used as either the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
In this webinar, we will review the following architectures:
- Single Node
- Multi Node with Asynchronous Replication
- Multi Node with Synchronous Replication
- Add-on Architectures
Speaker:
Michael Willer
Sales Engineer, EDB
This webinar will review the challenges teams face when migrating from Oracle databases to PostgreSQL. We will share insights gained from running large scale Oracle compatibility assessments over the last two years, including the over 2,200,000 Oracle DDL constructs that were assessed through EDB’s Migration Portal in 2020.
During this session we will address:
Storage definitions
Packages
Stored procedures
PL/SQL code
Proprietary database APIs
Large scale data migrations
We will end the session demonstrating migration tools that significantly simplify and aid in reducing the risk of migrating Oracle databases to PostgreSQL.
Expert Guide to Migrating Legacy Databases to Postgres (EDB)
This document provides an overview of migrating legacy databases from Oracle to PostgreSQL. It discusses the challenges with Oracle databases, reasons to leave Oracle for PostgreSQL, and considerations for the migration process. The presentation recommends choosing EnterpriseDB for the Oracle migration due to their compatibility tools and support services. It highlights EnterpriseDB's migration portal and toolkit, as well as their 24/7 support, as benefits of working with EnterpriseDB for the migration.
EDB & ELOS Technologies - Break Free from Oracle (EDB)
This document provides an overview of EnterpriseDB's (EDB) solution for migrating from Oracle databases to PostgreSQL. It discusses the high licensing costs and restrictive terms of Oracle databases. EDB offers deep compatibility with Oracle through their EDB Postgres Advanced Server, comprehensive migration tools, and 24/7 support. Migrating to EDB PostgreSQL can provide significant cost savings compared to Oracle and help customers advance their open source database strategies. The presentation recommends interested customers schedule a lunch and learn or migration assessment to explore EDB's migration options further.
This document discusses IBM's Integrated Analytics System (IIAS), which is a next generation hybrid data warehouse appliance. Some key points:
- IIAS provides high performance analytics capabilities along with data warehousing and management functions.
- It utilizes a common SQL engine to allow workloads and skills to be portable across public/private clouds and on-premises.
- The system is designed for flexibility with the ability to independently scale compute and storage capacity. It also supports a variety of workloads including reporting, analytics, and operational analytics.
- IBM is positioning IIAS to address top customer requirements around broader workloads, higher concurrency, in-place expansion, and availability solutions.
Covers the problems of achieving scalability in server farm environments and how distributed data grids provide in-memory storage and boost performance. Includes summary of ScaleOut Software product offerings including ScaleOut State Server and Grid Computing Edition.
Virtualizing Latency Sensitive Workloads and vFabric GemFire (Carter Shanklin)
This presentation was made by Emad Benjamin of VMware Technical Marketing. Normally I wouldn't upload someone else's preso but I really insisted this get posted and he asked me to help him out.
This deck covers tips and best practices for virtualizing latency sensitive apps on vSphere in general, and takes a deep dive into virtualizing vFabric GemFire, which is a high-performance distributed and memory-optimized key/value store.
Best practices include how to configure the virtual machines and how to tune them appropriately to the hardware the application runs on.
Edge 2016 Session 1886 Building your own docker container cloud on ibm power... (Yong Feng)
Material for the IBM Edge 2016 session on a client use case of Spectrum Conductor for Containers:
https://www-01.ibm.com/events/global/edge/sessions/.
Please refer to http://ibm.biz/ConductorForContainers for more details about Spectrum Conductor for Containers.
Please refer to https://www.youtube.com/watch?v=7YMjP6EypqA and https://www.youtube.com/watch?v=d9oVPU3rwhE for the demo of Spectrum Conductor for Containers.
Migrate Today: Proactive Steps to Unhook from Oracle (EDB)
Ed Boyajian, CEO of EDB, and Craig Guarente, CEO of Palisade Compliance, discuss steps organizations can take to migrate away from Oracle databases. They outline common problems with Oracle like high costs and vendor lock-in. EDB offers PostgreSQL as an alternative with migration tools and services. Palisade helps clients reduce Oracle costs through contract renegotiations and compliance assessments. The presentation includes a customer case study of a bank that saved over $800k by migrating applications to EDB PostgreSQL.
Cognos Analytics November 2017 Enhancements: 11.0.8 Demos and Q&A with the IB... (Senturus)
Cognos Analytics Release 8 includes new features in response to user requests for enhancements (RFE) along with items to make it easier to administer and manage your BI environment. View the video recording and download this deck at: http://www.senturus.com/resources/cognos-analytics-november-2017-enhancements/.
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
Software Defined Storage - Open Framework and Intel® Architecture Technologies (Odinot Stanislas)
This presentation covers in detail the notion of an "SDS Controller": in summary, the software layer intended to control all storage technologies (SAN, NAS, distributed storage on disk, flash...) and to expose them to cloud orchestrators, and thus to applications. Lots of good content.
Better performance and cost effectiveness empower better results in the cognitive era. For more information, visit: http://www.ibm.com/systems/power/hardware/linux-lc.html
Enabling a hardware accelerated deep learning data science experience for Apa... (DataWorks Summit)
Deep learning techniques are finding significant commercial success in a wide variety of industries. Large unstructured data sets such as images, videos, speech, and text are great for deep learning, but impose heavy demands on computing resources. New hardware architectures such as GPUs, faster interconnects (e.g. NVLink), and RDMA-capable network interfaces from Mellanox, available on OpenPOWER and IBM POWER systems, are enabling practical speedups for deep learning. Data scientists can intuitively incorporate deep learning capabilities on accelerated hardware using open source components such as Jupyter and Zeppelin notebooks, RStudio, Spark, Python, Docker, and Kubernetes with IBM PowerAI. Jupyter and Apache Zeppelin integrate well with Apache Spark and Hadoop using the Apache Livy project. This session will show some deep learning build and deploy steps using TensorFlow and Caffe in Docker containers running in a hardware-accelerated private cloud container service, along with system architectures and best practices for deployments on accelerated hardware. Indrajit Poddar, Senior Technical Staff Member, IBM
If you're like most of the world, you're on an aggressive race to implement machine learning applications and on a path to get to deep learning. If you can give better service at a lower cost, you will be the winners in 2030. But infrastructure is a key challenge to getting there. What does the technology infrastructure look like over the next decade as you move from Petabytes to Exabytes? How are you budgeting for more colossal data growth over the next decade? How do your data scientists share data today and will it scale for 5-10 years? Do you have the appropriate security, governance, back-up and archiving processes in place? This session will address these issues and discuss strategies for customers as they ramp up their AI journey with a long term view.
Overcoming write availability challenges of PostgreSQLEDB
There's no shortage of physical replication solutions for PostgreSQL, they scale horizontally and provide high read availability. But where they fall short is write availability, which leads many users to consider PostgreSQL logical replication. Existing solutions have a single point of failure or are dependent on a forked, vendor-provided PostgreSQL extension making reliable, enterprise-class logical replication hard to come by. Furthermore, these solutions put limits on scaling PostgreSQL.
By combining Kafka, an open source event streaming system with PostgreSQL, customers can get a fault tolerant, scalable logical replication service. Learn how EDB Replicate leverages Kafka for high write availability needed for today's demanding consumers who expect their applications to be always available and won't tolerate latency.
An overview of reference architectures for PostgresEDB
EDB Reference Architectures are designed to help new and existing users alike to quickly design a deployment architecture that suits their needs. They can be used as either the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
In this webinar, we will review the following architectures:
- Single Node
- Multi Node with Asynchronous Replication
- Multi Node with Synchronous Replication
- Add-on Architectures
Introducing Data Redaction - an enabler to data security in EDB Postgres Adva...EDB
With the rapid growth in digitalization, coupled with the current pandemic situation globally, many organizations and businesses are forced to operate remotely and online, more than they would prefer. At such times, how do corporations and businesses ensure data security, especially the secure management of personal information?
There are many techniques used to secure information, such as authentication, authorization, access control, virtual database, and encryption. In this webinar, we focus on Data Redaction - a technique that limits sensitive data exposure in EDB Postgres Advanced Server (EPAS).
This webinar covers:
- What is EDB Data Redaction
- How to limit sensitive data exposure in EPAS
- Provision for Oracle compatibility in EPAS
- Demo
Using PEM to understand and improve performance in Postgres: Postgres Tuning ...EDB
The Postgres Enterprise Manager (PEM) Tuning Wizard reviews your installation, and recommends a set of configuration options that will help tune a Postgres installation to best suit the anticipated workload. PEM's Performance Diagnostics uses Postgres' wait state information to analyze queries in context of the current workload and help identify further performance improvement opportunities in terms of locks, IO, and CPU bottlenecks.
This webinar covers:
- How to intelligently manage all your database servers with a single console
- Identify the useful features and functionality needed for visual database administration
- How to manage the performance and design of your database servers
Ein Expertenleitfaden für die Migration von Legacy-Datenbanken zu PostgreSQLEDB
Dieses Webinar gibt einen Überblick über die Schwierigkeiten, denen Teams bei der Migration von Oracle-Datenbanken zu PostgreSQL gegenüberstehen. Wir werden Einblicke in das geben, was wir bei der Durchführung groß angelegter Oracle-Kompatibilitätsbewertungen in den letzten zwei Jahren gelernt haben, einschließlich der über 2.200.000 Oracle-DDL-Konstrukte, die über das Migrationsportal von EDB im Jahr 2020 bewertet wurden.
Wir werden in diesem Kurs Folgendes behandeln:
Speicherdefinitionen
Pakete
Gespeicherte Verfahren
PL/SQL-Code
Proprietäre Datenbank-APIs
Groß angelegte Datenmigrationen
Zum Abschluss des Webinars werden wir Migrationswerkzeuge demonstrieren, die die Migration von Oracle-Datenbanken zu PostgreSQL erheblich vereinfachen und dabei helfen, das Risiko einer Migration zu PostgreSQL zu verringern.
Kontext: Ein relativ technisches Webinar, das darauf ausgerichtet ist, wie man aus Oracle aussteigt, was die häufigen Fehler dabei sind und wie man sie bewältigt.
Teilnehmerfeld: Geschäftsinhaber und Architekten, die die Machbarkeit des Ausstiegs aus einer der führenden und unbeliebtesten Legacy-Datenbanken analysieren möchten
The webinar will review a multi-layered framework for PostgreSQL security, with a deeper focus on limiting access to the database and data, as well as securing the data.
Using the popular AAA (Authentication, Authorization, Auditing) framework we will cover:
- Best practices for authentication (trust, certificate, MD5, Scram, etc).
- Advanced approaches, such as password profiles.
- Deep dive of authorization and data access control for roles, database objects (tables, etc), view usage, row-level security, and data redaction.
- Auditing, encryption, and SQL injection attack prevention.
Note: this session is delivered in French
The talk will be about the project to find a replacement for all IBM products in the company with the example for the databases. What was the goal of the project, the learning, a short overview about the options
we migrated about 500 db2 databases to EnterpriseDB. The database size was from a small size up to 4 TB and we implemented a completely new fully automated deployment of VM and database. Databases are now 11 month in production. The talk will have an overview of the project, the learnings, a few parameters and technical parameters that were found for stability and performance.
Un guide complet pour la migration de bases de données héritées vers PostgreSQLEDB
Ce webinaire passera en revue les défis auxquels les équipes sont confrontées lors de la migration d’une base de données Oracle vers PostgreSQL Nous partagerons les informations tirées d’évaluations de compatibilité Oracle de grande ampleur, effectuées sur les deux dernières années, inclues plus de 2 200 000 constructions Oracle DDL qui ont été évaluées au travers du portail de migration EDB en 2020.
Lors de cette session, nous aborderons:
Définition du stockage
Outils
Procédures stockées
Code PL/SQL
API de la base de données propriétaire
Migration de données à grande échelle
Nous terminerons cette session par une démonstration d’outils de migration qui simplifient considérablement et aident à réduire les risques de la migration d’une base de données Oracle vers PostgreSQL.
Contexte:
Un webinaire moyennement technique concentré sur la façon de quitter Oracle, quels sont les pièges à éviter et comment les aborder.
Public:
Les chefs d’entreprise et architectes qui souhaitent évaluer la faisabilité de la sortie d’une des bases de données héritée les plus utilisées mais aussi les plus détestées.
An overview of reference architectures for PostgresEDB
EDB Reference Architectures are designed to help new and existing users alike to quickly design a deployment architecture that suits their needs. They can be used as either the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
In this webinar, we will review the following architectures:
- Single Node
- Multi Node with Asynchronous Replication
- Multi Node with Synchronous Replication
- Add-on Architectures
Speaker:
Michael Willer
Sales Engineer, EDB
This webinar will review the challenges teams face when migrating from Oracle databases to PostgreSQL. We will share insights gained from running large scale Oracle compatibility assessments over the last two years, including the over 2,200,000 Oracle DDL constructs that were assessed through EDB’s Migration Portal in 2020.
During this session we will address:
Storage definitions
Packages
Stored procedures
PL/SQL code
Proprietary database APIs
Large scale data migrations
We will end the session demonstrating migration tools that significantly simplify and aid in reducing the risk of migrating Oracle databases to PostgreSQL.
Expert Guide to Migrating Legacy Databases to PostgresEDB
This document provides an overview of migrating legacy databases from Oracle to PostgreSQL. It discusses the challenges with Oracle databases, reasons to leave Oracle for PostgreSQL, and considerations for the migration process. The presentation recommends choosing EnterpriseDB for the Oracle migration due to their compatibility tools and support services. It highlights EnterpriseDB's migration portal and toolkit, as well as their 24/7 support, as benefits of working with EnterpriseDB for the migration.
EDB & ELOS Technologies - Break Free from OracleEDB
This document provides an overview of EnterpriseDB's (EDB) solution for migrating from Oracle databases to PostgreSQL. It discusses the high licensing costs and restrictive terms of Oracle databases. EDB offers deep compatibility with Oracle through their EDB Postgres Advanced Server, comprehensive migration tools, and 24/7 support. Migrating to EDB PostgreSQL can provide significant cost savings compared to Oracle and help customers advance their open source database strategies. The presentation recommends interested customers schedule a lunch and learn or migration assessment to explore EDB's migration options further.
This document discusses IBM's Integrated Analytics System (IIAS), which is a next generation hybrid data warehouse appliance. Some key points:
- IIAS provides high performance analytics capabilities along with data warehousing and management functions.
- It utilizes a common SQL engine to allow workloads and skills to be portable across public/private clouds and on-premises.
- The system is designed for flexibility with the ability to independently scale compute and storage capacity. It also supports a variety of workloads including reporting, analytics, and operational analytics.
- IBM is positioning IIAS to address top customer requirements around broader workloads, higher concurrency, in-place expansion, and availability solutions.
Covers the problems of achieving scalability in server farm environments and how distributed data grids provide in-memory storage and boost performance. Includes summary of ScaleOut Software product offerings including ScaleOut State Server and Grid Computing Edition.
Virtualizing Latency Sensitive Workloads and vFabric GemFireCarter Shanklin
This presentation was made by Emad Benjamin of VMware Technical Marketing. Normally I wouldn't upload someone else's preso but I really insisted this get posted and he asked me to help him out.
This deck covers tips and best practices for virtualizing latency sensitive apps on vSphere in general, and takes a deep dive into virtualizing vFabric GemFire, which is a high-performance distributed and memory-optimized key/value store.
Best practices include how to configure the virtual machines and how to tune them appropriately to the hardware the application runs on.
Edge 2016 Session 1886 Building your own docker container cloud on ibm power...Yong Feng
The material for IBM Edge 2016 session for a client use case of Spectrum Conductor for Containers
https://www-01.ibm.com/events/global/edge/sessions/.
Please refer to http://ibm.biz/ConductorForContainers for more details about Spectrum Conductor for Containers.
Please refer to https://www.youtube.com/watch?v=7YMjP6EypqA and https://www.youtube.com/watch?v=d9oVPU3rwhE for the demo of Spectrum Conductor for Containers.
Migrate Today: Proactive Steps to Unhook from OracleEDB
Ed Boyajian, CEO of EDB, and Craig Guarente, CEO of Palisade Compliance, discuss steps organizations can take to migrate away from Oracle databases. They outline common problems with Oracle like high costs and vendor lock-in. EDB offers PostgreSQL as an alternative with migration tools and services. Palisade helps clients reduce Oracle costs through contract renegotiations and compliance assessments. The presentation includes a customer case study of a bank that saved over $800k by migrating applications to EDB PostgreSQL.
Cognos Analytics November 2017 Enhancements: 11.0.8 Demos and Q&A with the IB...Senturus
Cognos Analytics Release 8 includes new features in response to user requests for enhancements (RFE) along with items to make it easier to administer and manage your BI environment. View the video recording and download this deck at: http://www.senturus.com/resources/cognos-analytics-november-2017-enhancements/.
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
Software Defined Storage - Open Framework and Intel® Architecture TechnologiesOdinot Stanislas
(FR)
Dans cette présentation vous aurez le plaisir d'y trouver une introduction plutôt détaillées sur la notion de "SDS Controller" qui est en résumé la couche applicative destinée à contrôler à terme toutes les technologies de stockage (SAN, NAS, stockage distribué sur disque, flash...) et chargée de les exposer aux orchestrateurs de Cloud et donc aux applications.
(ENG)
This presentation covers in detail the notion of an "SDS Controller," which is in summary a software stack able to handle all storage technologies (SAN, NAS, distributed file systems on disk, flash...) and expose them to cloud orchestrators and applications. Lots of good content.
Better performance and cost effectiveness empower better results in the cognitive era. For more information, visit: http://www.ibm.com/systems/power/hardware/linux-lc.html
Enabling a hardware accelerated deep learning data science experience for Apa...DataWorks Summit
Deep learning techniques are finding significant commercial success in a wide variety of industries. Large unstructured data sets such as images, videos, speech and text are great for deep learning, but impose a lot of demands on computing resources. New types of hardware architectures such as GPUs and faster interconnects (e.g. NVLink), RDMA capable networking interface from Mellanox available on OpenPOWER and IBM POWER systems are enabling practical speedups for deep learning. Data Scientists can intuitively incorporate deep learning capabilities on accelerated hardware using open source components such as Jupyter and Zeppelin notebooks, RStudio, Spark, Python, Docker, and Kubernetes with IBM PowerAI. Jupyter and Apache Zeppelin integrate well with Apache Spark and Hadoop using the Apache Livy project. This session will show some deep learning build and deploy steps using Tensorflow and Caffe in Docker containers running in a hardware accelerated private cloud container service. This session will also show system architectures and best practices for deployments on accelerated hardware. INDRAJIT PODDAR, Senior Technical Staff Member, IBM
If you're like most of the world, you're on an aggressive race to implement machine learning applications and on a path to get to deep learning. If you can give better service at a lower cost, you will be the winners in 2030. But infrastructure is a key challenge to getting there. What does the technology infrastructure look like over the next decade as you move from Petabytes to Exabytes? How are you budgeting for more colossal data growth over the next decade? How do your data scientists share data today and will it scale for 5-10 years? Do you have the appropriate security, governance, back-up and archiving processes in place? This session will address these issues and discuss strategies for customers as they ramp up their AI journey with a long term view.
Impact2014: Introduction to the IBM Java ToolsChris Bailey
IBM provides a number of free tools to assist in monitoring and diagnosing issues when running any Java application - from Hello World to IBM or third-party, middleware-based applications. This session introduces attendees to those tools, highlights how they have been extended with IBM middleware product knowledge, how they have been integrated into IBMs development tools, and how to use them to investigate and resolve real-world problem scenarios.
Big Data: InterConnect 2016 Session on Getting Started with Big Data AnalyticsCynthia Saracco
Learn how to get started with Big Data using a platform based on Apache Hadoop, Apache Spark, and IBM BigInsights technologies. The emphasis here is on free or low-cost options that require modest technical skills.
Everything you need to know about creating, managing and debugging Java applications on IBM Bluemix. This presentation covers the features the IBM WebSphere Application Server Liberty Buildpack provides to make Java development on the cloud easier. It also covers the Eclipse tooling support including remote debugging, incremental update, etc.
Elevate Your Continuous Delivery Strategy Above the Rolling Clouds (Interconn...Michael Elder
This presentation describes how we see client architectures evolving from traditional IT, to cloud-enabled, to cloud native, with bridges in between. It explains how IBM UrbanCode Deploy enables clients to capture full-stack blueprints for their workloads in a way that is cloud-portable. It will highlight new capabilities in VMWare vCenter, IBM SoftLayer, Amazon Web Services and Microsoft Azure. Attendees will also see a live demonstration of end-to-end deployment during the talk.
Benchmarking Hadoop - Which hadoop sql engine leads the herdGord Sissons
Stewart Tate ([email protected]), a key architect behind the industry's first ever Hadoop-DS benchmark at 30TB scale, describes the benchmark and comparative testing between IBM, Cloudera Impala and Hortonworks Hive
Accelerate Your Apache Spark with Intel Optane DC Persistent MemoryDatabricks
Data volumes grow rapidly in the big data space, and more and more memory is consumed either by computation or by holding intermediate data for analytic jobs. For memory-intensive workloads, end users have to scale out the compute cluster or extend memory with storage like HDD or SSD to meet the requirements of their computing tasks. When scaling out the cluster, the extra cost of cluster management, operation and maintenance increases the total cost if the extra CPU resources are not fully utilized. To address this shortcoming, Intel Optane DC persistent memory (Optane DCPM) breaks the traditional memory/storage hierarchy and scales up the computing server with higher-capacity persistent memory, while also delivering higher bandwidth and lower latency than storage like SSD or HDD. Apache Spark is widely used for analytics such as SQL and machine learning in cloud environments, where the low performance of remote data access is often a bottleneck, especially for I/O-intensive queries; ML workloads are iterative, so I/O bandwidth is key to end-to-end performance. In this talk, we will introduce how to accelerate Spark SQL with OAP (https://github.com/Intel-bigdata/OAP) to achieve an 8X performance gain on the cloud, and how to use the RDD cache to improve K-means performance by 2.5X, both leveraging Intel Optane DCPM. We will also take a deep dive into how Optane DCPM delivers these performance gains.
Speakers: Cheng Xu, Piotr Balcer
This document discusses optimizing Apache Spark machine learning workloads on OpenPOWER platforms. It provides an overview of Spark, machine learning, and deep learning. It then discusses how OpenPOWER systems are well-suited for these workloads due to features like high memory bandwidth, large caches, and GPU support. The document outlines various techniques for tuning Spark performance on OpenPOWER, such as configuration of executors, cores, memory, and storage levels. It also presents examples analyzing the performance of a matrix factorization machine learning application under different Spark configurations.
The Changing Role of a DBA in an Autonomous WorldMaria Colgan
The advent of the cloud and the introduction of Oracle Autonomous Database Cloud presents opportunities for every organization, but what's the future role for the DBA? This presentation explores how the role of the DBA will continue to evolve, and provides advice on key skills required to be a successful DBA in the world of the cloud.
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...Indrajit Poddar
This document provides a summary of lessons learned from deploying Apache Spark as a cloud service on IBM Power Systems. Key points include:
1. Using an open source stack enabled agile development. Continuous integration automated deployment.
2. Platform Symphony efficiently allocated resources for multi-tenancy.
3. Performance tests found Spark ran 1.7x faster on Power than x86, improving economics.
4. Potential for acceleration using Power features like CAPI flash was identified.
Putting these lessons together, Power Systems could differentiate cloud data services through improved cost performance, agile development, and advanced acceleration.
1457 - Reviewing Experiences from the PureExperience ProgramHendrik van Run
IBM IMPACT 2013 presentation
This session will present customer experiences with the PureApplication System. It will cover setting up a PureApplication System from the ground up, and will also explain the application onboarding process and the operation of the environment. The content of the session is from different customers that have completed the PureExperience program, and will include discussion of best practices and lessons learned.
Cloud Computing for Small & Medium BusinessesAl Sabawi
I presented this topic at the Greater Binghamton Business Expo in Upstate New York. It is meant to shed light on utilizing Cloud Computing for Small and Medium size businesses. It should help decision makers consider Software-as-a-Service offerings for their business as a way to save on IT cost and to deliver on better efficiency for their organizations.
Revolutionizing GPU-as-a-Service for Maximum EfficiencyAI Infra Forum
In this session, we'll explore our cutting-edge GPU-as-a-Service solution designed to transform enterprise AI operations. Learn how our MemVerge.ai platform maximizes GPU utilization, streamlines workload management, and ensures uninterrupted operations through innovative features like Dynamic GPU Surfing. We'll dive into key use cases, from training large language models to enterprise-scale AI deployment. We'll demonstrate how our solution benefits various stakeholders – from platform engineers to data scientists and decision-makers. Discover how our platform optimizes costs while maintaining data security and sovereignty.
This document discusses three key artificial intelligence capabilities of IBM's Power9 architecture:
1) Large Memory Support enables processing of high-definition images and large models that exceed GPU memory limits.
2) Distributed Deep Learning allows scaling to multiple servers for faster and more accurate training on large datasets.
3) PowerAI Vision provides tools for labeling data, training models for computer vision tasks, and deploying models for production use.
InfoSphere BigInsights - Analytics power for Hadoop - field experienceWilfried Hoge
This document provides an overview and summary of InfoSphere BigInsights, an analytics platform for Hadoop. It discusses key features such as real-time analytics, storage integration, search, data exploration, predictive modeling, and application tooling. Case studies are presented on analyzing binary data and developing applications for transformation and analysis. Partnerships and certifications with other vendors are also mentioned. The document aims to demonstrate how BigInsights brings enterprise-grade features to Apache Hadoop and provides analytics capabilities for business users.
Transparent Hardware Acceleration for Deep LearningIndrajit Poddar
This document provides an overview of transparent hardware acceleration for deep learning using IBM's PowerAI platform. It discusses how PowerAI leverages POWER CPUs and NVIDIA GPUs connected via NVLink to dramatically accelerate deep learning model training and inference. Using this approach, IBM has achieved significant performance improvements over x86 platforms, including faster training times, support for larger models, and more efficient distributed training across multiple servers.
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs Indrajit Poddar
GPU and NVLink accelerated training and inference with tensorflow and caffe on OpenPOWER systems. Presented at a meetup prior to DataWorks Summit Munich 2017.
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and DockerIndrajit Poddar
Transparently accelerated Deep Learning workloads on OpenPOWER systems and GPUs using easy to use open source frameworks such as Caffe, Torch, Tensorflow, Theano.
Build FAST Learning Apps with Docker and OpenPOWERIndrajit Poddar
This document discusses using Docker containers on OpenPOWER systems for machine learning applications. Some key points:
- OpenPOWER systems have advantages for machine learning like more CPU cores, threads, memory and I/O bandwidth which allow scaling out training across systems.
- Docker allows distributing machine learning models easily across OpenPOWER systems for faster training times.
- The same Docker containers can run on x86 and OpenPOWER systems, providing a consistent developer experience for machine learning applications.
- OpenPOWER supports technologies like GPUs and FPGAs which provide huge speed-ups for machine learning and deep learning analytics.
Enabling Cognitive Workloads on the Cloud: GPUs with Mesos, Docker and Marath...Indrajit Poddar
This document provides an overview of enabling cognitive workloads on the cloud using GPUs with Mesos, Docker, and Marathon on IBM's POWER systems. It discusses requirements for GPUs in the cloud like exposing GPUs to containers and supporting multiple GPUs per node. It also summarizes Mesos and Kubernetes support for GPUs, and demonstrates running a deep learning workload on OpenPOWER hardware to identify dog breeds using Docker containers and GPUs.
Scalable TensorFlow Deep Learning as a Service with Docker, OpenPOWER, and GPUsIndrajit Poddar
This document discusses scaling TensorFlow deep learning using Docker, OpenPOWER systems, and GPUs. It provides an overview of distributing TensorFlow in a cluster with parameter servers and worker nodes. Example Dockerfiles are shown for creating deep learning images. The discussion also covers infrastructure components like Docker, OpenStack, and Mesos for managing compute resources in a deep learning cluster as a service.
Continuous Integration with Cloud Foundry Concourse and Docker on OpenPOWERIndrajit Poddar
This document discusses continuous integration (CI) for open source software on OpenPOWER systems. It provides background on CI, OpenPOWER systems, and the Cloud Foundry platform. It then describes using the Concourse CI tool to continuously build a Concourse project from a GitHub repository. Key steps involve deploying OpenStack, setting up a Docker registry, installing BOSH and Concourse, defining a Concourse pipeline, and updating the pipeline to demonstrate the CI process in action. The document emphasizes the importance of CI for open source projects and how it benefits development on OpenPOWER systems.
How Data Annotation Services Drive Innovation in Autonomous Vehicles.docxsofiawilliams5966
Autonomous vehicles represent the cutting edge of modern technology, promising to revolutionize transportation by improving safety, efficiency, and accessibility.
Ever wondered how to inject your dashboards with the power of Python? This presentation will show how combining Tableau with Python can unlock advanced analytics, predictive modeling, and automation that’ll make your dashboards not just smarter—but practically psychic
"Machine Learning in Agriculture: 12 Production-Grade Models", Danil PolyakovFwdays
Kernel is currently the leading producer of sunflower oil and one of the largest agroholdings in Ukraine. What business challenges are they addressing, and why is ML a must-have? This talk explores the development of the data science team at Kernel—from early experiments in Google Colab to building minimal in-house infrastructure and eventually scaling up through an infrastructure partnership with De Novo. The session will highlight their work on crop yield forecasting, the positive results from testing on H100, and how the speed gains enabled the team to solve more business tasks.
Enabling a hardware accelerated deep learning data science experience for Apache Spark and Hadoop
1. Enabling a hardware accelerated deep learning data science experience for Apache Spark and Hadoop
Indrajit (I.P) Poddar
Senior Technical Staff Member
IBM Cloud and Cognitive Systems
June 2018
2. Safe Harbor Statement and Disclaimer
• Copyright IBM Corporation 2018. All rights reserved. U.S. Government Users Restricted Rights - use, duplication, or disclosure restricted
by GSA ADP Schedule Contract with IBM Corporation.
• IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United
States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a
trademark symbol (® or TM), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks
is available on the Web at "Copyright and trademark information" at: ibm.com/legal/copytrade.shtml.
• The information contained in this presentation is provided for informational purpose only. While efforts were made to verify the
completeness and accuracy of the information contained in this presentation, it is provided “as is” without warranty of any kind, expressed
or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other
documentation.
• The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material,
code or functionality. Information about potential future products may not be incorporated into any contract. Nothing contained in this
presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM (or its suppliers or licensors),
or altering the terms and conditions of any agreement or license governing the use of IBM products and/or software.
• Any statements of performance are based on measurements and projections using standard IBM benchmarks in a controlled environment.
The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such
as the amount of multi-programming in the user’s job stream, the I/O configuration, the storage configuration, and the workload
processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated.
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making
a purchasing decision.”
3. Agenda
01 AI, Deep Learning, Machine Learning
02 Data Science Experience
03 Hardware Acceleration
04 Demo
5. Deep Learning Has Revolutionized Machine Learning
[Chart: accuracy vs. data — deep learning keeps improving as data grows, while traditional machine learning plateaus]
[Chart: deep learning popularity growing exponentially, 2011-2017. Source: Google Trends, search term "Deep Learning"]
6. Machine Learning vs. Deep Learning
Machine learning: Input → Feature Extraction → Features → Machine Learning Algorithms (Classification) → Output
Deep learning: Input → Deep Neural Network (Feature Extraction & Classification) → Output
7. AI Infrastructure Stack
• Applications — segment specific: Finance, Retail, Healthcare
• Cognitive APIs (e.g. Watson): Speech, Vision, NLP, Sentiment
• In-house APIs
• Machine & Deep Learning Libraries & Frameworks: TensorFlow, Caffe, SparkML
• Distributed Computing: Kubernetes, Spark, MPI
• Data Lake & Data Stores: Hadoop HDFS, NoSQL DBs
• Transform & Prep Data (ETL)
• Accelerated Infrastructure: accelerated servers and storage
8. Integrated software and hardware for AI
• Open source frameworks: supported distribution
• Developer ease-of-use tools
• Faster training times via HW & SW performance optimizations
• Integrated & supported AI platform
• Higher productivity for data scientists
• Enable non-data scientists to use AI
9. Agenda
01 AI, Deep Learning, Machine Learning
02 Data Science Experience
03 Hardware Acceleration
04 Demo
10. Data Science Teams
(Table: for each phase, the team's tasks & pain points and the leader's concerns.)
Getting started
• Tasks & pain points: defining projects; finding corporate data; connecting to data sources; understanding the data
• Leader concerns: hiring, getting skills; data security (breaches)
Modeling experimentation
• Tasks & pain points: cleaning data; building models; measuring accuracy; finding more data
• Leader concerns: data security; productivity of a very expensive & rare skill; skill inconsistency
Developing apps, developing dashboards
• Tasks & pain points: building repeatable data pipelines; integration with engineering; machinery management; QA
• Leader concerns: data security; productivity of a very expensive & rare skill; knowledge loss due to high employee turnover
Deployment, monitoring, support
• Tasks & pain points: accuracy monitoring; scalability; model robustness with new data; integration with infrastructure; (reuse of old models)
• Leader concerns: meeting customer expectations with timely support; productivity of a very expensive & rare skill; knowledge loss due to high employee turnover
Across all phases: data scientist happiness
11. Teams getting started
• Learn
• Connect to enterprise data sources easily
• Collaborate
• Working on a cluster is safer than desktops for the leader
• Safe behind the firewall
Supported data sources: Big SQL, Db2 (Warehouse/z/LUW), Hive for HDP, HDFS for HDP, Hive for Cloudera (CDH), HDFS for Cloudera (CDH), Informix, Netezza, Oracle
12. Teams in modeling experimentation phase
• DSX Local simplifies distribution of team work based on skills
• DSX Local increases knowledge sharing and knowledge retention
• Currently based on open source notebooks, productivity tools in the future
• DSX Local simplifies cluster management for teams
13. Teams in applications building phase
• Facilitate creation of machine learning models
• Facilitate deployment of models as API end-points
• Automation of batch scoring, training and evaluation scripts as schedulable jobs
• Git integration to collaborate with engineers in their favorite environment
• Publish content to others as PDF / HTML / R-Shiny app
14. Teams in model deployment, monitoring and support phase
• Monitor models through a dashboard
• Model versioning, evaluation history
• Publish versions of models, supporting the dev/stage/production paradigm
• Monitor scalability through the cluster dashboard
• Adapt scalability by redistributing compute/memory/disk resources
15. Software Architecture Best Practices
Run as a collection of “dockerized” services which are managed by Kubernetes
Kubernetes handles the service orchestration by providing
• Service monitoring and administration
• High availability / service failure detection and automatic restart
• Dynamically adds or removes nodes
• Online upgrades
Services running in Kubernetes include:
• UI services built with Node.js frameworks for browsers to connect to
• User authentication services
• Project services for user collaboration and data sharing
• Notebook services with enhanced access to Jupyter notebooks
• A Spark service with access to sophisticated analytics libraries
• Pipeline and model building services
• Data connection building service for access to external data
• Various internal management services
16. Specialized Runtime environments for containers with GPUs
• Create microservices using nvidia-docker images
• Add AI frameworks which transparently exploit GPUs, such as TensorFlow, to the Docker image
• Deploy the image and allocate GPUs in a cluster using Kubernetes
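The GPU allocation step above is typically expressed as a pod spec that requests GPUs through Kubernetes' `nvidia.com/gpu` extended resource. A minimal sketch — the pod name and image tag are illustrative, not taken from the deck:

```yaml
# Illustrative pod spec: one TensorFlow container requesting a single GPU.
apiVersion: v1
kind: Pod
metadata:
  name: tf-gpu-example                        # hypothetical name
spec:
  containers:
  - name: tensorflow
    image: tensorflow/tensorflow:latest-gpu   # hypothetical image tag
    resources:
      limits:
        nvidia.com/gpu: 1                     # GPUs are requested via resource limits
```

Kubernetes then schedules the pod only onto nodes whose device plugin advertises free GPUs.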
17. Connect to a Spark and Hadoop cluster (or YARN) for larger data sets and access to shared resources
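A common way to make that connection from a notebook environment is the Apache Livy REST API (mentioned in the session abstract). A sketch in Python that only builds the session-creation request; the endpoint URL and YARN queue name are hypothetical:

```python
import json

# Sketch: create a PySpark session on a remote Spark/Hadoop cluster through
# the Apache Livy REST API. Endpoint and queue are illustrative placeholders.
LIVY_URL = "http://livy.example.com:8998/sessions"  # hypothetical endpoint

def session_request(executors, executor_memory):
    """Build the JSON body for POST /sessions ('pyspark' is a real Livy session kind)."""
    return json.dumps({
        "kind": "pyspark",
        "numExecutors": executors,
        "executorMemory": executor_memory,
        "conf": {"spark.yarn.queue": "datascience"},  # hypothetical YARN queue
    })

body = session_request(4, "4g")
print(body)
# An actual submission would be, e.g.:
#   requests.post(LIVY_URL, data=body, headers={"Content-Type": "application/json"})
```

Once the session is up, code snippets are POSTed to it as statements and execute against the cluster's shared data and resources.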
18. AI, Deep Learning, Machine Learning
02
Data Science Experience
03
Hardware Acceleration
04
Demo
Agenda 01
19. Faster Data Communication with Unique CPU-GPU NVLink High-Speed Connection
[Diagram: deep learning server, 4-GPU config — each POWER CPU with 1 TB system memory at 170 GB/s, connected to two GPUs over NVLink at 150 GB/s]
Store large models in system memory; operate on one layer at a time; fast transfer via NVLink.
25. Train Large AI Models Faster — Servers with NVLink to GPUs
Caffe with LMS (Large Model Support), runtime of 1000 iterations, GoogleNet model on enlarged ImageNet dataset (2240x2240):
• Xeon x86 2640v4 w/ 4x V100 GPUs: 3.1 hours
• Power AC922 w/ 4x V100 GPUs: 49 minutes (3.8x faster)
More details: https://developer.ibm.com/linuxonpower/perfcol/perfcol-mldl/
26. Distributed Deep Learning (DDL)
Deep learning training takes days to weeks; distributed learning enables scaling to 100s of servers connected with Mellanox IB.
• 16 days down to 7 hours, 58x faster: 1 system vs. 64 systems (Caffe with PowerAI DDL, running on Minsky (S822Lc) Power Systems; ResNet-50, ImageNet-1K)
• Near-ideal scaling to 256 GPUs: 95% scaling efficiency with 256 GPUs (ResNet-101, ImageNet-22K)
[Chart: DDL actual scaling vs. ideal scaling — speedup over number of GPUs, 4 to 256]
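A quick sanity check of the scaling numbers quoted above — all figures are the slide's own, and the 58x headline presumably reflects unrounded runtimes:

```python
# Check the DDL numbers quoted on the slide.
one_system_hours = 16 * 24          # "16 days" on 1 system
many_systems_hours = 7              # "7 hours" on 64 systems
speedup = one_system_hours / many_systems_hours
print(f"measured speedup: {speedup:.1f}x")  # close to the quoted 58x

# 95% scaling efficiency at 256 GPUs means the actual speedup is
# roughly 0.95 * 256 of the ideal linear speedup.
ideal_gpus, efficiency = 256, 0.95
effective_speedup = ideal_gpus * efficiency
print(f"effective speedup at 256 GPUs: ~{effective_speedup:.0f}x")
```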
27. Communication Paths
[Diagram: multiple servers, each with two POWER CPUs (DDR4 system memory) connected over NVLink to four GPUs with their own GPU memory, plus PCIe-attached storage and IB/Ethernet network interfaces, joined by a Mellanox IB network switch]
DDL fully utilizes bandwidth for links within each node and across all nodes, so that learners communicate as efficiently as possible.
28. Auto Hyper-Parameter Tuning
Hyper-parameters:
– Learning rate
– Decay rate
– Batch size
– Optimizer (GradientDescent, Adadelta, Momentum, RMSProp, ...)
– Momentum (for some optimizers)
– LSTM hidden unit size
Search strategies: random, tree-based Parzen estimator (TPE), Bayesian.
Spark search jobs are generated dynamically and executed in parallel on a multi-tenant Spark cluster (IBM Spectrum Conductor with Spark).
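Of the strategies listed (random, TPE, Bayesian), random search is the simplest to sketch. The objective function and parameter ranges below are invented for illustration; in Spectrum Conductor each trial would run as a parallel Spark job rather than a sequential loop:

```python
import random

# Hypothetical objective: stand-in for a validation loss measured after training.
def validation_loss(lr, batch_size):
    return (lr - 0.01) ** 2 + abs(batch_size - 64) / 1000.0

# Each entry draws one hyper-parameter value per trial.
search_space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),           # log-uniform learning rate
    "batch_size": lambda: random.choice([16, 32, 64, 128]),
}

random.seed(0)
best = None
for _ in range(50):                                        # 50 independent trials
    trial = {name: draw() for name, draw in search_space.items()}
    loss = validation_loss(**trial)
    if best is None or loss < best[0]:
        best = (loss, trial)

print("best loss:", best[0], "with", best[1])
```

TPE and Bayesian optimization differ only in how the next trial is chosen: they model past results instead of drawing independently.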
29. Snap Machine Learning (ML) Library
A distributed GPU-accelerated machine learning library, built on libGLM (a C++ / CUDA optimized primitive library).
• Distributed training
• Distributed hyper-parameter optimization
• Logistic regression, linear regression, support vector machines (SVM)
• APIs for popular ML frameworks (coming soon)
• More coming soon
30. Snap ML: Training Time Goes From An Hour to Minutes
46x faster than the previous record set by Google. Workload: click-through rate prediction for advertising — logistic regression classifier in Snap ML using GPUs vs. TensorFlow using CPU only.
• Google, CPU-only: 1.1 hours
• Snap ML, Power + GPU: 1.53 minutes (46x faster)
Dataset: Criteo Terabyte Click Logs (http://labs.criteo.com/2013/12/download-terabyte-click-logs/), 4 billion training examples, 1 million features.
Model: logistic regression, TensorFlow vs. Snap ML. Test LogLoss: 0.1293 (Google using TensorFlow), 0.1292 (Snap ML).
Platform: 89 CPU-only machines in Google using TensorFlow versus 4 AC922 servers (each 2 Power9 CPUs + 4 V100 GPUs) for Snap ML. Google data from this Google blog.
32. Semi-Automatic Labeling using PowerAI Vision
1. Define labels
2. Manually label some images / video frames
3. Train DL model
4. Run trained DL model on entire input data to generate labels
5. Manually correct labels on some data
6. Repeat (retraining with the corrected labels) until labels achieve desired accuracy
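The loop above can be sketched schematically. The `train`, `predict` and `accuracy` functions below are trivial stand-ins for illustration only, not the PowerAI Vision API:

```python
# Schematic of the semi-automatic labeling loop (stand-in functions, not PowerAI Vision).
def train(labeled):                        # stand-in: "training" memorizes the labeled set
    return dict(labeled)

def predict(model, item):                  # stand-in: recall if known, else default label 0
    return model.get(item, 0)

def accuracy(labels, truth):
    return sum(labels[i] == truth[i] for i in truth) / len(truth)

truth = {i: i % 2 for i in range(100)}       # hypothetical ground-truth labels
labeled = {i: truth[i] for i in range(10)}   # step 2: manually label some items

while True:
    model = train(labeled)                               # step 3: train model
    labels = {i: predict(model, i) for i in truth}       # step 4: label all input data
    if accuracy(labels, truth) >= 0.95:                  # step 6: stop at desired accuracy
        break
    wrong = [i for i in truth if labels[i] != truth[i]]
    for i in wrong[:10]:                                 # step 5: manually correct some labels
        labeled[i] = truth[i]

print(f"labeled manually: {len(labeled)} of {len(truth)}")
```

The point of the workflow is exactly what the toy run shows: only a fraction of the data ever needs manual labeling, because the model generates the rest.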
34. DSX Local on Power LC922 Server: Improved Price-Performance for Clients
Faster model completion and lower cost running K-means clustering than tested Intel Xeon SP servers:
• 41% faster insights(1): Power LC922, 340 seconds vs. Intel Xeon SP Gold 6140 server, 578 seconds
• 22% lower price(2,3,4): Power LC922, $35,618 vs. Intel Xeon SP Gold 6140 server, $45,390
1. Based on IBM internal testing of the core computational step for 8 users to form 5 clusters using a 350694 x 301 float64 data set (1 GB) running the K-means algorithm using Apache Python® and TensorFlow. Results valid as of 4/21/18 and conducted under laboratory conditions with speculative execution controls to mitigate user-to-kernel and user-to-user side-channel attacks on both systems; individual results can vary based on workload size, use of storage subsystems and other conditions.
2. IBM Power LC922 (2x22-core/2.6 GHz/512 GB memory) using 10 x 4TB HDD, 10 GbE two-port, RHEL 7.5 LE for Power9.
3. Competitive stack: 2-socket Intel Xeon SP (Skylake) Gold 6140 (2x18-core/2.4 GHz/512 GB memory) using 10 x 4TB HDD, 10 GbE two-port and RHEL 7.5.
4. Pricing is based on Power LC922 http://www-03.ibm.com/systems/power/hardware/linux-lc.html and publicly available x86 pricing.
5. Apache®, Apache Python®, and associated logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.
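For context on what is being benchmarked: K-means (Lloyd's algorithm) alternates an assignment step and a centroid-update step until convergence. A toy pure-Python sketch on tiny synthetic data — nothing like the 1 GB, 350694 x 301 test set above:

```python
import random

def kmeans(points, k, iters=20, seed=42):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assignment step
            j = min(range(k), key=lambda c: sum((a - b) ** 2
                    for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):      # update step (skip empty clusters)
            if cl:
                centroids[j] = tuple(sum(xs) / len(cl) for xs in zip(*cl))
    return centroids

# Two well-separated synthetic blobs around (0, 0) and (10, 10).
rng = random.Random(0)
pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)] + \
      [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(100)]
centers = kmeans(pts, 2)
print(sorted(round(c[0]) for c in centers))
```

At benchmark scale the assignment step dominates, which is why the comparison above is sensitive to memory bandwidth and core count.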
35. Agenda
01 AI, Deep Learning, Machine Learning
02 Data Science Experience
03 Hardware Acceleration
04 Demo
36. Thank You
36
Recorded demo: https://ibm.box.com/s/pj0hs07x1fqlp1z576rsnb6odwtf57k5
38. Notice and disclaimers cont.
38
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly
available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to
interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied
warranties of merchantability and fitness for a particular purpose.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
IBM, the IBM logo, ibm.com, AIX, BigInsights, Bluemix, CICS, Easy Tier, FlashCopy, FlashSystem, GDPS, GPFS, Guardium, HyperSwap, IBM
Cloud Managed Services, IBM Elastic Storage, IBM FlashCore, IBM FlashSystem, IBM MobileFirst, IBM Power Systems, IBM PureSystems, IBM
Spectrum, IBM Spectrum Accelerate, IBM Spectrum Archive, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Scale, IBM Spectrum
Storage, IBM Spectrum Virtualize, IBM Watson, IBM Z, IBM z Systems, IBM z13, IMS, InfoSphere, Linear Tape File System, OMEGAMON,
OpenPower, Parallel Sysplex, Power, POWER, POWER4, POWER7, POWER8, Power Series, Power Systems, Power Systems Software, PowerHA,
PowerLinux, PowerVM, PureApplication, RACF, Real-time Compression, Redbooks, RMF, SPSS, Storwize, Symphony, SystemMirror, System
Storage, Tivoli, WebSphere, XIV, z Systems, z/OS, z/VM, z/VSE, zEnterprise and zSecure are trademarks of International Business Machines
Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A
current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Java and all Java-based trademarks and logos are
trademarks or registered trademarks of Oracle and/or its affiliates.