Facebook Presto
Interactive and Distributed SQL Query Engine for Big Data
liangguorong@baidu.com, 2014.11.20
Presto’s Brief History
• Fall 2012: started at Facebook (6 developers)
✦ Designed for interactive SQL queries on PB-scale data
✦ Hive is for reliable, large-scale batch processing
• Spring 2013: rolled out to the entire company
• Nov. 2013: open sourced (https://github.com/facebook/presto)
• Nov. 2014: 88 releases, 41 contributors, 3,943 commits
• Current version: 0.85 (http://prestodb.io/)
• Written in Java: fast development, Java ecosystem, easy integration
Advantages
• High performance: 10x faster than Hive
✦ Nov. 2013 at Facebook: 1,000 nodes, 1,000 employees run 30,000 queries on 1PB per day
• Extensibility
✦ Pluggable backends: Cassandra, Hive, JMX, Kafka, MySQL, PostgreSQL, SystemSchema, TPCH
✦ JDBC, and ODBC in the future, for commercial BI tools and dashboards such as data visualization
✦ Client protocol: HTTP+JSON, usable from various languages (Python, Ruby, PHP, Node.js, Java (JDBC), …)
• ANSI SQL
• Complex queries, joins, aggregations, and various functions (including window functions)
• http://blog.cloudera.com/blog/2014/09/new-benchmarks-for-sql-on-hadoop-impala-1-4-widens-the-performance-gap/
Architecture
Why Is Presto Fast?
1. In-memory parallel computing
2. Pipelined task execution
3. Data-local computation with multiple threads
4. Caching of hot queries and data
5. JIT compilation of operators to bytecode
6. SQL optimization
7. Other optimizations
1. In-memory parallel computing
• Custom query engine, not MapReduce
SQL compile process 
antlr3
• select name, count(*) as count from orders as t1 join customer as t2 on 
t1.custkey = t2.custkey group by name order by count desc limit 100;
[Figure: distributed plan for the query above, built up over three slides. Plan fragments, from the output down:
  Sink <- TopN <- Exchange
  Sink <- TopN <- Final Aggregation <- Exchange
  Sink <- Partial Aggregation <- Project <- Join <- Table Scan (orders) + Exchange
  Sink <- Table Scan (customers)
Annotations on the slides: 1 thread; many splits, many threads; 2 workers (Worker1, Worker2); all tasks in parallel.]
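The partial/final aggregation and TopN steps in the plan can be mimicked in a few lines. This is a minimal sketch, not Presto code: each worker counts rows per name over its own data, a final step merges the partial counts, and TopN keeps the largest groups. The function names and the toy rows are assumptions made for the example.

```python
from collections import Counter
import heapq

def partial_aggregation(rows):
    """Count rows per name on one worker (over that worker's splits)."""
    return Counter(rows)

def final_aggregation(partials):
    """Merge the partial counts received through the exchange."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def top_n(counts, n):
    """Keep the n groups with the highest count, largest first."""
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])

# Toy data: names already produced by the join, spread over two workers.
worker1_rows = ["alice", "bob", "alice"]
worker2_rows = ["alice", "carol", "bob"]

partials = [partial_aggregation(worker1_rows), partial_aggregation(worker2_rows)]
result = top_n(final_aggregation(partials), 2)
```

Only the small partial counts cross the exchange, not the raw rows, which is why the plan splits the aggregation into two phases.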
Prioritized SplitRunner
• SQL -> stages, tasks, splits
• If one task fails, the whole query must be rerun
• Aggregations are subject to a memory limit
2. Pipelined task execution
• In each worker, the TaskExecutor runs split pipelines in time slices (1s by default)
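A rough sketch of time-sliced split execution follows. It is an illustration under stated assumptions, not the TaskExecutor itself: splits are modeled as Python generators, the quantum is counted in work units rather than seconds so the run is deterministic, and splits that have used the least time so far are scheduled first. The names `run_splits` and `work` are hypothetical.

```python
import heapq
from itertools import count

def run_splits(splits, quantum):
    """Run generator-based splits in slices of at most `quantum` work units.

    splits maps a split name to a generator that yields once per unit of work.
    Returns the executed slices as (name, units_run) in execution order.
    """
    tie = count()  # tie-breaker so the heap never compares generators
    queue = [(0, next(tie), name, gen) for name, gen in splits.items()]
    heapq.heapify(queue)
    trace = []
    while queue:
        used, _, name, gen = heapq.heappop(queue)
        ran, finished = 0, False
        while ran < quantum:
            try:
                next(gen)
                ran += 1
            except StopIteration:
                finished = True
                break
        if ran:
            trace.append((name, ran))
        if not finished:
            # Requeue with the accumulated run time as the priority.
            heapq.heappush(queue, (used + ran, next(tie), name, gen))
    return trace

def work(units):
    """A hypothetical split that needs `units` steps of work."""
    for _ in range(units):
        yield

trace = run_splits({"short": work(2), "long": work(5)}, quantum=2)
```

The long split cannot starve the short one: after each slice it goes back into the queue behind splits that have consumed less time.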
• Operator pipeline
• Page: the smallest unit of data processing (like a RowBatch)
• Max page size: 1MB; max rows: 16*1024
• Exchange operator: one client for each split
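The two page limits can be illustrated with a small batching helper. This is a sketch under simplifying assumptions: rows are plain Python values and the caller supplies a `row_size` function, whereas real Presto pages are columnar. The name `build_pages` is hypothetical; only the two limits come from the slide.

```python
MAX_PAGE_ROWS = 16 * 1024      # row limit quoted on the slide
MAX_PAGE_BYTES = 1024 * 1024   # 1MB size limit quoted on the slide

def build_pages(rows, row_size, max_rows=MAX_PAGE_ROWS, max_bytes=MAX_PAGE_BYTES):
    """Group rows into pages, closing a page when either limit would be exceeded."""
    page, page_bytes = [], 0
    for row in rows:
        size = row_size(row)
        if page and (len(page) >= max_rows or page_bytes + size > max_bytes):
            yield page
            page, page_bytes = [], 0
        page.append(row)
        page_bytes += size
    if page:
        yield page

# 40,000 rows of 16 bytes each: the row limit is reached before the byte limit.
small_rows = [len(p) for p in build_pages(range(40_000), row_size=lambda _row: 16)]

# 5 rows of 512KB each: the byte limit is reached first, two rows per page.
big_rows = [len(p) for p in build_pages(range(5), row_size=lambda _row: 512 * 1024)]
```

Operators then pass whole pages down the pipeline instead of single rows, which amortizes per-call overhead.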
3. Data-local computation with multiple threads
• NodeSelector selects the available nodes (10 nodes by default)
• First, nodes that have the same address as the split's data
• If not enough, add nodes in the same rack
• If still not enough, randomly select nodes in other racks
• Select the node with the smallest number of assignments (pending tasks)
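The selection steps above can be sketched as follows. This is an illustration of the described policy, not the NodeSelector source: nodes are plain dictionaries with `host` and `rack` keys, the rack of a split is inferred from the nodes that share its host, and both function names are hypothetical.

```python
import random

def candidate_nodes(split_hosts, nodes, min_candidates=10, rng=random):
    """Pick candidates for a split: same host first, then same rack, then random."""
    chosen = [n for n in nodes if n["host"] in split_hosts]
    if len(chosen) < min_candidates:
        racks = {n["rack"] for n in chosen}
        same_rack = [n for n in nodes if n["rack"] in racks and n not in chosen]
        chosen += same_rack[: min_candidates - len(chosen)]
    if len(chosen) < min_candidates:
        others = [n for n in nodes if n not in chosen]
        rng.shuffle(others)
        chosen += others[: min_candidates - len(chosen)]
    return chosen

def assign_split(split_hosts, nodes, pending, min_candidates=10, rng=random):
    """Assign the split to the candidate with the fewest pending assignments."""
    candidates = candidate_nodes(split_hosts, nodes, min_candidates, rng)
    best = min(candidates, key=lambda n: pending.get(n["host"], 0))
    pending[best["host"]] = pending.get(best["host"], 0) + 1
    return best["host"]

nodes = [
    {"host": "a", "rack": "r1"},
    {"host": "b", "rack": "r1"},
    {"host": "c", "rack": "r2"},
]
pending = {"a": 3, "b": 0, "c": 1}
# The data lives on host "a", but "a" is busy, so the same-rack node "b" wins.
chosen_host = assign_split({"a"}, nodes, pending, min_candidates=2)
```

The example shows the trade-off in the policy: locality narrows the candidate set, and load decides among the candidates.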
4. Cache hot queries and data
• Google Guava loading cache for byte code
• Cached objects: Hive database/table/partition, JIT byte code classes, functions
5. JIT compile operators to byte code
• Compiles ScanFilterAndProjectOperator and FilterAndProjectOperator
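The idea of compiling a filter-and-project step once and caching the result can be shown with a Python analogue. This is only a sketch: Presto generates JVM byte code and caches it in a Guava loading cache, while here `functools.lru_cache` stands in for the cache and Python source generation stands in for byte code generation. The function name and the expression strings are assumptions for the example.

```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def compile_filter_and_project(filter_expr, project_expr):
    """Compile a filter and a projection into one function, once per pair.

    The first call for a given pair generates and compiles the code; later
    calls return the cached function. The expressions are trusted input in
    this sketch: exec must never be used on untrusted strings.
    """
    source = (
        "def process(rows):\n"
        f"    return [{project_expr} for row in rows if {filter_expr}]\n"
    )
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["process"]

rows = [{"name": "alice", "total": 120}, {"name": "bob", "total": 80}]
process = compile_filter_and_project("row['total'] > 100", "row['name']")
names = process(rows)

# A second request for the same pair is served from the cache.
same = compile_filter_and_project("row['total'] > 100", "row['name']") is process
```

Fusing the filter and the projection into one generated function avoids interpreting the expression tree for every row, which is the point of items 4 and 5 taken together.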
6. SQL Optimization 
• PredicatePushDown 
• PruneRedundantProjections 
• PruneUnreferencedOutputs 
• MergeProjections 
• LimitPushDown 
• CanonicalizeExpressions 
• CountConstantOptimizer 
• ImplementSampleAsFilter 
• MetadataQueryOptimizer 
• SetFlatteningOptimizer 
• SimplifyExpressions 
• UnaliasSymbolReferences 
• WindowFilterPushDown
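As an illustration of the first rule in the list, the sketch below pushes a single-column filter below a join, toward the scan that produces the column. It is a toy model, not Presto's optimizer: the plan classes are hypothetical, the join is assumed to be an inner join, and the filtered column is assumed to belong to exactly one side.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scan:
    table: str
    columns: tuple

@dataclass(frozen=True)
class Join:
    left: object
    right: object

@dataclass(frozen=True)
class Filter:
    column: str      # column the predicate reads
    predicate: str   # predicate text, kept opaque in this sketch
    source: object

def output_columns(node):
    """Columns produced by a plan node."""
    if isinstance(node, Scan):
        return set(node.columns)
    if isinstance(node, Join):
        return output_columns(node.left) | output_columns(node.right)
    return output_columns(node.source)

def push_down(node):
    """Move single-column filters below inner joins, toward the owning scan."""
    if isinstance(node, Join):
        return Join(push_down(node.left), push_down(node.right))
    if isinstance(node, Filter):
        source = push_down(node.source)
        if isinstance(source, Join):
            if node.column in output_columns(source.left):
                moved = Filter(node.column, node.predicate, source.left)
                return Join(push_down(moved), source.right)
            if node.column in output_columns(source.right):
                moved = Filter(node.column, node.predicate, source.right)
                return Join(source.left, push_down(moved))
        return Filter(node.column, node.predicate, source)
    return node

orders = Scan("orders", ("custkey", "totalprice"))
customer = Scan("customer", ("custkey", "name"))
plan = Filter("name", "name LIKE 'a%'", Join(orders, customer))
optimized = push_down(plan)
```

After the rewrite the filter runs directly above the customer scan, so fewer rows reach the join.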
7. Other Optimizations
• BlinkDB-like approximate queries
• JVM GC control
✦ JDK 1.7
✦ Forcing the code cache evictor to make room before the cache fills up
• Careful use of memory and data structures
✦ Airlift Slice for efficient heap and off-heap memory (https://github.com/airlift/slice)
✦ Java futures with asynchronous callbacks
Presto Extensibility
• Connectors (catalogs): Hive, Cassandra, JMX, Kafka, MySQL, PostgreSQL, System, TPCH
• Custom connectors (http://prestodb.io/docs/current/spi/overview.html):
• Service Provider Interface (SPI):
✦ ConnectorMetadata
✦ ConnectorSplitManager
✦ ConnectorRecordSetProvider
Presto’s Limitations
• No fault tolerance; unstable
• Memory limits for aggregations and huge joins
• SQL feature gaps, such as:
✦ Only CTAS (CREATE TABLE AS SELECT)
✦ No UDF support
Presto’s Future
From "Presto, Past, Present, and Future" by Dain Sundstrom at Facebook, May 2014
• Basic task recovery
• Huge joins and GROUP BY
• Spill to disk (implemented), INSERT
• CREATE VIEW (implemented), not compatible with Hive
• Native store, caching of hot data (implemented)
• Security: authentication, authorization, permissions
• ODBC driver
• Improved DDL and DML
References
• http://prestodb.io/
• https://github.com/facebook/presto
• https://www.facebook.com/notes/facebook-engineering/presto-interacting-with-petabytes-of-data-at-facebook/10151786197628920
