Tweets are Not Created Equal 
Intersecting Devices in the 1% Sample 
Carolin Gerlitz & Bernhard Rieder 
IR15 - Boundaries & Intersections 
October 23, 2014
Digging deeper into 
Twitter devices 
The Twitter API's 1% random sample can 
be used to explore, baseline, contextualize, 
verify, etc. (Gerlitz & Rieder 2013, 
Morstatter et al. 2014). 
How can we qualify individual elements in 
relation to a larger platform ecology? 
The presentation inquires more deeply into 
the role devices play on Twitter. 
We used a week-long random sample of 
tweets to further explore this aspect. 
(14 June 2014 - 20 June 2014, n = 31,707,162)
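For a rough sense of scale, the sample size can be turned into a daily average. The sketch below uses only the figures from this slide. The extrapolation to overall Twitter volume assumes the sample is exactly 1% of all tweets, which is a simplifying assumption and not something the slides establish.

```python
SAMPLE_TWEETS = 31_707_162  # n reported for the sample week
DAYS = 7                    # 14 June to 20 June 2014, inclusive
SAMPLE_RATE = 0.01          # assumption: the sample is exactly 1% of all tweets


def daily_average(total, days):
    """Average number of sampled tweets per day."""
    return total / days


def extrapolate(sample_count, rate):
    """Scale a sample count up to an estimated full volume."""
    return sample_count / rate


per_day = daily_average(SAMPLE_TWEETS, DAYS)
print(f"sampled tweets per day: {per_day:,.0f}")
print(f"estimated tweets per day overall: {extrapolate(per_day, SAMPLE_RATE):,.0f}")
```

The estimate is only as good as the 1% assumption; see Morstatter et al. 2014 on representativeness.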
Devices intersect use 
practices 
There has been a proliferation of very 
different devices (mobile, desktop, web, 
buttons, bots, etc.) from which people send 
their tweets. It's full of devices! 
Thinking of Twitter as an ecology of connected 
devices, we ask (1) how we can qualify 
devices and (2) how devices can enable us 
to unpack metrics for studying use cultures. 
Frequency-based metrics suggest that the 
units they count are equivalent (e.g. tweets 
per time for a certain hashtag). 
Do we need to conceptualize devices as 
intervening variables?
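One way to treat devices as an intervening variable is to break a hashtag's tweet count down by sending client. The Python sketch below is a minimal illustration, not the DMI-TCAT implementation. It assumes tweets in the v1.1 JSON layout, where `source` is an HTML anchor (or a plain string such as "web") and hashtags sit under `entities.hashtags[].text`. The helper `_tweet` and the `SAMPLE` data are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: `source` is an HTML anchor wrapping the client name.
_TAG = re.compile(r"<[^>]+>")


def device_name(source):
    """Strip the anchor markup and return the bare client name."""
    return _TAG.sub("", source).strip()


def devices_per_hashtag(tweets):
    """Map each lowercased hashtag to a Counter of device names."""
    table = defaultdict(Counter)
    for tweet in tweets:
        device = device_name(tweet["source"])
        # A set counts a tweet once per hashtag, even if the tag is repeated.
        tags = {h["text"].lower() for h in tweet["entities"]["hashtags"]}
        for tag in tags:
            table[tag][device] += 1
    return table


def _tweet(source, *tags):
    """Build a minimal, made-up tweet for illustration."""
    return {"source": source, "entities": {"hashtags": [{"text": t} for t in tags]}}


IPHONE = '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
SAMPLE = [
    _tweet(IPHONE, "CallMeCam", "callmecam"),  # same tag twice in one tweet
    _tweet(IPHONE, "CallMeCam"),
    _tweet("web", "iraq"),
]

for tag, devices in devices_per_hashtag(SAMPLE).items():
    print(tag, devices.most_common())
```

Counting each tweet once per hashtag is a design choice: it keeps tweets that repeat a tag from inflating the device counts.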
[Figure: devices shown: iPhone, Tweetdeck, Web client, Tweetadder, Instagram, Tribez]
Hashtag qualification 
#iraq
Hashtag qualification 
#CallMeCam
Hashtag qualification 
#gameinsight
Hashtag qualification 
#love
Devices & use practices 
Desktop clients (Web, Tweetdeck, etc.) are 
overrepresented in news conversations; 
Tweetdeck also points towards 
professional social media practices. 
The iPhone is the preferred microphone of 
the American teenager. 
Custom autopost clients (platforms, games, 
etc.) are engaged in activity loops. 
Automation clients (dlvr.it, IFTTT, or 
Tweetadder) empower promotion, spam, 
hijacking, and syndication practices. 
Different devices have different 
capacities and enable different ways of 
engaging with the Twitter platform 
(posting, observing, responding, etc.).
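The grouping of devices into types of practice can be sketched as a simple lookup. The device labels below follow the slides, but the category assignments (for example, treating Instagram and Tribez as autopost clients) are one reading of them and are hypothetical. Real `source` values would need to be checked and mapped by hand.

```python
from collections import Counter

# Hypothetical mapping from device label to device category.
DEVICE_CATEGORIES = {
    "Web client": "desktop",
    "Tweetdeck": "desktop",
    "iPhone": "mobile",
    "Instagram": "autopost",
    "Tribez": "autopost",
    "dlvr.it": "automation",
    "IFTTT": "automation",
    "Tweetadder": "automation",
}


def categorize(device):
    """Return the category for a device, or 'unclassified' if unknown."""
    return DEVICE_CATEGORIES.get(device, "unclassified")


def counts_by_category(device_counts):
    """Aggregate per-device tweet counts into per-category counts."""
    totals = Counter()
    for device, n in device_counts.items():
        totals[categorize(device)] += n
    return totals


# Made-up counts for illustration only.
example = {"iPhone": 120, "Tweetdeck": 30, "Web client": 25,
           "Tweetadder": 40, "SomeNewBot": 5}
print(counts_by_category(example))
```

Keeping an explicit "unclassified" bucket makes visible how much of the traffic comes from the long tail of devices the mapping does not cover.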
Domain qualification 
nytimes.com
Domain qualification 
youtube.com
Domain qualification 
etsy.com
Devices intersect 
practices 
Tweets are not created equal. Devices imply different regimes of "being on Twitter" 
that are caught up in different perspectives, purposes, and politics. 
Twitter takes part in complex platform ecologies that mediate tweeting in different 
ways and are thus co-constitutive of practices. Devices intersect practices. 
For Internet researchers, this creates problems and opportunities. Devices as 
intervening variables can both skew and explain. 
Frequency counts that do not take into account devices are problematic: do 100K 
tweets from Tweetadder "mean" or "indicate" the same thing as 100K sent from the 
iPhone? They refer to different populations, practices, purposes, and politics.
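The point about equal counts hiding different compositions can be made concrete with a small comparison. The sketch below uses made-up counts for two hypothetical hashtags with identical totals and compares their device shares. The total variation distance is a choice made for this illustration, not a measure used by the authors.

```python
def device_shares(counts):
    """Convert per-device counts into proportions of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {device: n / total for device, n in counts.items()}


def total_variation(counts_a, counts_b):
    """Half the summed absolute difference between two share profiles."""
    pa, pb = device_shares(counts_a), device_shares(counts_b)
    devices = set(pa) | set(pb)
    return 0.5 * sum(abs(pa.get(d, 0.0) - pb.get(d, 0.0)) for d in devices)


# Made-up counts: both hashtags total 100K tweets.
tag_a = {"Tweetadder": 90_000, "Web client": 10_000}
tag_b = {"iPhone": 80_000, "Web client": 20_000}

print("same total:", sum(tag_a.values()) == sum(tag_b.values()))
print("device shares A:", device_shares(tag_a))
print("device shares B:", device_shares(tag_b))
print("distance:", round(total_variation(tag_a, tag_b), 3))
```

A distance near 0 means the two hashtags are sent from a similar mix of devices; a distance near 1 means the mixes barely overlap, even when the raw frequencies match.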
Conclusion 
Frequency counts are not comparable from the outset, but need to be made 
comparable by including devices in the interpretation. 
Devices need to be taken into account when sampling, cleaning, analyzing, and 
interpreting Twitter data. 
This kind of unpacking and repacking of components in the platform ecologies can 
be performed for various other elements. (cf. Bruns & Stieglitz 2013)
Thank you. 
Carolin Gerlitz, c.gerlitz@uva.nl, @cgrltz 
Bernhard Rieder, rieder@uva.nl, @riederb 
DMI-TCAT (Borra & Rieder 2014), open source, available at: 
https://github.com/digitalmethodsinitiative/dmi-tcat

Editor's Notes

  • #2: This is work in progress. We changed our title. This is one aspect of a larger project on thinking about metrics in Twitter data analysis.
  • #3: 68,747 different devices, specified by a field that API programmers have to fill out. Android + iPhone account for a little more than 50% of all tweets. The 1% sample is 1 out of 100, thus what we look at are high-volume spaces, no fringe practices here. For representativeness see: Morstatter, F., Pfeffer, J. & Liu, H., 2014. When is it biased? Assessing the representativeness of Twitter's streaming API. In Proceedings of the 23rd International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee.
  • #4: It’s full of devices! Previous approaches towards hashtag qualification focused on user composition: Bruns, A. & Stieglitz, S., 2013. Towards more systematic Twitter analysis: metrics for tweeting activities. International Journal of Social Research Methodology, 16(2), pp.91–108.
  • #6: RT Japanese stuff: I follow everyone. Kudunews: celebrity.
  • #7: Explain the Sankey diagram as an element of TCAT. Issue-based hashtag, based on English-language and Arabic tweets. Points to YouTube and news sources. Diversity of non-automated devices with a predominance of web clients and Tweetdeck: points towards professional and news-based practices.
  • #8: #callmecam is an eventive mass-activation hashtag and the most frequently used hashtag on iPhone in the sampled week. It was designed by the YouTube celebrity Cameron Dallas to drive his fans' engagement. Whoever tweets the hashtag gets the chance to win a call from Dallas. The hashtags it is connected to are more expressive variations of #callmecam. The high volume of hashtags is due to the fact that many tweets use the hashtag several times. The hashtag is mainly driven by users tweeting from iPhone, which can be considered a stand-in for teen practices. Shoutout hashtags drive up the frequency of hashtags; seeing them from a device perspective allows us to approach what kind of users might be driving this frequency. Teenage girls rule the trending topics.
  • #9: #gameinsight has been a high-frequency hashtag since we started studying the one percent sample in January 2013. Engaging with it from a device perspective allows us to understand the specific automated practices behind it. Tweets with #gameinsight are produced through autoposts from games, which are issued by default when users connect the game to their Twitter account. The tweets feature in-game achievements and contain links back to the Facebook app. Due to its volume, the hashtag is constantly hijacked by spammers who send users to different websites they seek to promote.
  • #10: The practice of hashtag hijacking becomes more apparent in the example of the hashtag #love. It is served by a multiplicity of devices, most notably Instagram, the happy space, and dlvr.it, which uses the hashtag #love to tap into its audience and promote two websites at high volume. This points to systematic hashtag hijacking for promoting retail websites.
  • #12: Looking at devices offers insights into how web content is being shared and contextualised. A diverse set of devices links to the NYT. Among them: NYT websites appear in spammy environments.
  • #13: YouTube has been one of the most shared domains on Twitter, and is slowly being overtaken by Vine. Diversity of devices, high degree of hashtags, even from mainstream clients. iPhone is used to promote the work of YouTube celebrity Nash Grier, who is friends with Cameron Dallas, through the hashtag #nashnewvideo. These tweets make up ¼ of all tweets sent from iPhone mentioning YouTube, pointing again to organic use practices driven by teenage fans. The YouTube autoposter appears as "Google". Arabic tweets indicate alternative news syndicates.
  • #14: Diversity of devices, many automators. Hootsuite: autopost listings, etc. Hashtag-intense environments. Extensive support infrastructure in Etsy communities on how to use Twitter for Etsy promotion. Trailblazing: community-specific promotional devices, fostered by support infrastructures and specific symposis. Language-specific use of devices.