by the #s
Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s!
1 minute ago via Twitter for iPhone
Retweeted by 1 person
What’s a Tweet?

It’s a short message that's sent through

140 characters
How many are there?
[Chart: axis ticks at 60M and 70M; "Today!" is annotated "off the chart"]
47M served per day
70M tweets per day
Photo used under Creative Commons from loop_oh
70M tweets per day ≈ 800 tweets per second
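The conversion on this slide is a division by the number of seconds in a day. A minimal sketch of that arithmetic, where `per_second` is a hypothetical helper name and not something from the talk:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def per_second(per_day):
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY

tweets_per_second = per_second(70_000_000)
print(round(tweets_per_second))  # 810, which the slide rounds to 800
```

This is an average over the whole day; it says nothing about peaks.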
Twitter by the Numbers
How big are they?
1 tweet text = 140 characters ≈ 200 bytes
800 tweets per second ≈ 160 KB/sec ≈ 9 MB/min ≈ 12 GB/day
Just tweet text!
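The bandwidth figures follow from multiplying the tweet rate by the size of one tweet. The sketch below redoes that arithmetic with decimal units (1 KB = 1,000 bytes, which is an assumption; the slide does not say which convention it uses). The exact products come out slightly higher than the slide's rounded per-minute and per-day figures.

```python
TWEETS_PER_SECOND = 800  # from the slide
BYTES_PER_TWEET = 200    # from the slide: 140 characters is roughly 200 bytes

bytes_per_second = TWEETS_PER_SECOND * BYTES_PER_TWEET
bytes_per_minute = bytes_per_second * 60
bytes_per_day = bytes_per_second * 24 * 60 * 60

print(f"{bytes_per_second / 1e3:.0f} KB/sec")  # 160 KB/sec
print(f"{bytes_per_minute / 1e6:.1f} MB/min")  # 9.6 MB/min
print(f"{bytes_per_day / 1e9:.1f} GB/day")     # 13.8 GB/day
```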
MySQL
Can’t generate IDs fast enough
Centralized and a single point of failure

snowflake
Highly available and uncoordinated (10kqps)
Compatible with the ecosystem
http://github.com/twitter/snowflake
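One way to generate IDs without a central coordinator is to pack a timestamp, a per-machine worker number, and a local counter into one integer, so each machine can mint IDs on its own. The sketch below illustrates that idea only. The bit widths, the epoch value, and the names `IdGenerator` and `next_id` are assumptions for illustration; none of them are stated on the slide, and this is not the snowflake code itself.

```python
import threading
import time

# Assumed layout (not from the slide): millisecond timestamp in the high bits,
# then a 10-bit worker id, then a 12-bit per-millisecond sequence.
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1288834974657  # assumed custom epoch, in milliseconds


class IdGenerator:
    """Mints roughly time-ordered 64-bit IDs using only local state."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the counter.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Counter wrapped: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


generator = IdGenerator(worker_id=3)
ids = [generator.next_id() for _ in range(1000)]
```

Because the worker id is part of every ID, two machines with different worker ids cannot collide even if they never talk to each other. The cost is that ordering across machines is only as good as their clocks.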
1 TB generated per day
8 TB generated per day
Photo used under Creative Commons from champura
8 TB per day in total ≈ 100 MB per sec
[Photo] = 80 MB per sec
Photo used under Creative Commons from Mac Users Guide
Where do they go?
Following
Followed by
Asymmetric Digraph
Tweets multiply

Need to represent this
[Figure: a digraph on nodes 1, 2, 3, 4, and its matrix representation with rows and columns 1 to 4]
Naïve implementation is not scalable
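To see why a dense matrix does not scale, compare its cell count with the number of edges actually present. The sketch below stores an asymmetric follow graph as per-user sets instead. The four edges are made up for illustration (the slide's actual edges are not recoverable), and `FollowGraph` and `matrix_cells` are hypothetical names.

```python
from collections import defaultdict


def matrix_cells(num_users):
    """Cells in a dense users-by-users follow matrix."""
    return num_users * num_users


class FollowGraph:
    """Asymmetric digraph stored as per-user sets of edges."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> users they follow
        self.followers = defaultdict(set)  # user -> users who follow them

    def follow(self, source, target):
        # One directed edge, indexed from both ends.
        self.following[source].add(target)
        self.followers[target].add(source)

    def edge_count(self):
        return sum(len(targets) for targets in self.following.values())


graph = FollowGraph()
graph.follow(1, 2)
graph.follow(2, 1)
graph.follow(3, 1)
graph.follow(4, 3)

print(matrix_cells(4))            # 16 cells for four users
print(graph.edge_count())         # 4 stored edges
print(matrix_cells(150_000_000))  # 22500000000000000 cells at 150M users
```

Storage for the set-based form grows with the number of follow relationships, not with the square of the user count.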
150M registered users
[Chart: x-axis 2006, 2008, 2010]
flockdb
Distributed graph database
High rate of CRUD operations
Complex set arithmetic queries
http://github.com/twitter/flockdb
Photo used under Creative Commons from jurvetson
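"Set arithmetic queries" on a follow graph are questions such as "who follows both of these accounts?". The sketch below shows the shape of such queries with plain in-memory Python sets. It is not FlockDB's API, and the account names are made up.

```python
# Hypothetical follower lists for two accounts.
followers = {
    "alice": {"bob", "carol", "dave"},
    "erin": {"carol", "dave", "frank"},
}

# Intersection: accounts that follow both alice and erin.
common = followers["alice"] & followers["erin"]

# Difference: accounts that follow alice but not erin.
only_alice = followers["alice"] - followers["erin"]

# Union: everyone who would see a tweet from either account.
reach = followers["alice"] | followers["erin"]

print(sorted(common))      # ['carol', 'dave']
print(sorted(only_alice))  # ['bob']
print(len(reach))          # 4
```

At the follower counts shown in this deck the sets no longer fit comfortably on one machine, which is where a distributed store comes in.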
@ladygaga
mother mons†er
6.1 million followers

@BarackObama
44th President of the United States
5.3 million followers

@justinbieber
Justin Bieber
5.1 million followers

@raffi
me!
4.1 thousand followers
How do they get out?
6B API calls per day ≈ 70,000 calls per second
REST API
XML/JSON API over HTTP
Poll-based system / pseudo real-time

hosebird
Streaming API
Long poll HTTP
Near real-time delivery of Tweets
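The difference between "pseudo real-time" polling and near real-time streaming can be shown with a simple timing model: a polling client sees a tweet at its next poll, while a streaming client sees it as soon as it is pushed. The sketch below is a simplified model under assumed numbers (a 5-second poll interval and made-up arrival times); it is not the REST or hosebird protocol, and the function names are hypothetical.

```python
def poll_delivery_times(arrivals_ms, interval_ms):
    """Each tweet is seen at the first poll at or after its arrival."""
    # -(-t // i) is integer ceiling division.
    return [-(-t // interval_ms) * interval_ms for t in arrivals_ms]


def stream_delivery_times(arrivals_ms, network_delay_ms=0):
    """Each tweet is pushed as it arrives, plus a fixed network delay."""
    return [t + network_delay_ms for t in arrivals_ms]


arrivals_ms = [500, 1200, 7900, 8000, 14300]  # made-up arrival times

polled = poll_delivery_times(arrivals_ms, interval_ms=5000)
streamed = stream_delivery_times(arrivals_ms, network_delay_ms=50)

poll_latency = [seen - sent for seen, sent in zip(polled, arrivals_ms)]
stream_latency = [seen - sent for seen, sent in zip(streamed, arrivals_ms)]

print(polled)               # [5000, 5000, 10000, 10000, 15000]
print(max(poll_latency))    # 4500
print(max(stream_latency))  # 50
```

In this model the worst-case polling latency approaches the poll interval, and shortening the interval to compensate multiplies the number of API calls.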
752% in 2008
1358% in 2009
Where do we want to be?
Today - 150M people generate ~1000 TPS
Tomorrow - we want to support half the world and all its devices (5B phones and 6B people)
Real challenges in front of us
Real time
Indexing, search, and analytics
Relevance systems
Graph databases
Storage
Scalability and efficiency
Questions? Follow me at twitter.com/raffi
