The Rise of Crowd Computing (December 2015)
Matt Lease
School of Information @mattlease
University of Texas at Austin ml@utexas.edu
Slides:
slideshare.net/mattlease
“The place where people & technology meet”
~ Wobbrock et al., 2009
“iSchools” now exist at 65 universities around the world
www.ischools.org
What’s an Information School?
2
Roadmap
• Motivation from Artificial Intelligence (AI)
– Need for Plentiful Labeled Data
– Need for Capabilities Beyond What AI Can Deliver
• The Rise of Crowd Computing
– 1st Wave: Crowd-based data labeling
• Mechanical Turk & Beyond
– 2nd Wave: Crowd-based Human Computation
• Delivering beyond state-of-the-art AI applications today
• Open Problems
3
Motivation 1:
AI effectiveness is often limited by training data size
Problem: creating labeled data is expensive!
Banko and Brill (2001)
Motivation 2: What do we do when
state-of-the-art AI isn’t good enough?
Crowdsourcing
@mattlease
Crowdsourcing
• Jeff Howe. Wired, June 2006.
• Take a job traditionally
performed by a known agent
(often an employee)
• Outsource it to an undefined,
generally large group of
people via an open call
7
• Marketplace for paid crowd work (“micro-tasks”)
– Created in 2005 (remains in “beta” today)
• On-demand, scalable, 24/7 global workforce
• API lets human labor be integrated into software
– “You’ve heard of software-as-a-service. Now this is human-as-a-service.”
Amazon Mechanical Turk (MTurk)
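The "API lets human labor be integrated into software" bullet can be sketched with boto3's real MTurk client. The task itself (title, reward, image URL, form) is invented for illustration, and the `create_hit` call is wrapped in a function so nothing here actually contacts AWS:

```python
# Sketch: posting a labeling micro-task ("HIT") via the MTurk API.
# The HTMLQuestion XML envelope below is MTurk's standard schema; all
# task details (prompt, reward, image URL) are illustrative assumptions.

def build_html_question(prompt: str, image_url: str) -> str:
    """Wrap a minimal labeling form in MTurk's HTMLQuestion XML envelope."""
    form = f"<p>{prompt}</p><img src='{image_url}'/>"
    return (
        "<HTMLQuestion xmlns='http://mechanicalturk.amazonaws.com/"
        "AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd'>"
        f"<HTMLContent><![CDATA[<html><body>{form}</body></html>]]></HTMLContent>"
        "<FrameHeight>450</FrameHeight></HTMLQuestion>"
    )

def post_labeling_hit(client, question_xml: str):
    """client = boto3.client('mturk') -- requires AWS credentials to run."""
    return client.create_hit(
        Title="Label one image",
        Description="Pick the category that best fits the image.",
        Reward="0.05",                     # USD per assignment
        MaxAssignments=5,                  # redundant labels for consensus
        AssignmentDurationInSeconds=300,
        LifetimeInSeconds=86400,
        Question=question_xml,
    )

question = build_html_question("Which category fits?",
                               "https://example.com/img.jpg")
```

This is the "human-as-a-service" idea in miniature: the response to `post_labeling_hit` is just another API result, even though people produce it.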
Find Jim Gray (February 2007)
The Early Days
• Artificial Intelligence, With Help From the Humans.
– J. Pontin. NY Times, March 25, 2007
• Is Amazon's Mechanical Turk a Failure? April 9, 2007
– “As of this writing, there are [only] 128 Human Intelligence
Tasks available via the Mechanical Turk task page.”
• Su et al., WWW 2007: “a web-based human data
collection system that we [call] ‘System M’ ”
11
The 1st Wave of Crowd Computing:
Data Collection via Crowdsourcing
@mattlease
2008
MTurk “Discovery” sparks rush for “gold” labels across areas
• Alonso et al., SIGIR Forum (Information Retrieval)
• Kittur et al., CHI (Human-Computer Interaction)
• Sorokin and Forsyth, CVPR (Computer Vision)
• Snow et al., EMNLP (NLP)
• Annotating human language
• 22,000 labels for only US $26
• Crowd’s consensus labels can
replace traditional expert labels
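The consensus idea behind the Snow et al. result can be sketched as simple majority voting over redundant crowd labels; the items and votes below are made up for illustration:

```python
# Minimal sketch of crowd consensus: collect several noisy labels per
# item, then take the majority vote. Data is invented for illustration.
from collections import Counter

def majority_vote(labels):
    """Return the most common label; ties break in first-seen order."""
    return Counter(labels).most_common(1)[0][0]

crowd = {
    "item1": ["pos", "pos", "neg", "pos", "pos"],
    "item2": ["neg", "neg", "neg", "pos", "neg"],
}
consensus = {item: majority_vote(votes) for item, votes in crowd.items()}
# consensus == {"item1": "pos", "item2": "neg"}
```

With enough redundant votes per item, aggregate labels of this kind approached expert quality in their experiments.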
NLP Example – Dialect Identification
14
See work by Chris Callison-Burch. Interface:
http://ugrad.cs.jhu.edu/~sjeblee/arabic-classification-plus.shtml
August 12, 2012 15
Social & Behavioral Sciences
• A Guide to Behavioral Experiments
on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of
Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
16
Beyond MTurk
@mattlease
Crowdsourcing ≠ Mechanical Turk!
Many Platforms for Paid Crowd Work
And More!
JobBoy, microWorkers, MiniFreelance,
MiniJobz, MinuteWorkers, MyEasyTask,
OpTask, ShortTask, SimpleWorkers
Why Eytan Adar hates MTurk Research
(CHI 2011 CHC Workshop)
• Overly-narrow research focus on MTurk
– Distinguish general vs. platform-specific problems
– Distinguish research vs. industry concerns
• Should researchers really focus on…
– “...writing the user’s manual for MTurk ...”?
– “…struggl[ing] against the limits of the platform...”?
“…by rewarding quick demonstrations of the tool’s
use, we fail to attain a deeper understanding of the
problems to which it is applied…”
Beyond Mechanical Turk: An Analysis of
Paid Crowd Work Platforms
Vakharia and Lease, iConference 2015
Qualitative assessment of 7 platforms for paid crowd work
22
Crowdsourcing Transcription Beyond
Mechanical Turk
With Haofeng Zhou & Denys Baskov
HCOMP 2013 Speech Workshop
A lot of volunteer crowdsourcing & Citizen Science too!
Citizen science in the early internet (2000-2001)
24
www.nasaclickworkers.com
Zooniverse
25
Crowd4U (Another Open Platform)
26
27
ESP Game (Gamification)
L. von Ahn and L. Dabbish (2004)
28
reCAPTCHA (Repurpose Existing Activity)
von Ahn et al. (2008). In Science.
29
DuoLingo (Education)
30
Tracking Sentiment (Access Resource)
Brew et al., PAIS 2010
• Volunteer-crowd
– Work in exchange for
access to rich content
• Never-ending Learning
– Continual model
updates as what is
relevant vs. not
changes over time
31
Beat the Machine (Earn Money)
32
The 2nd Wave of Crowd Computing:
Human Computation
@mattlease
What is a Computer?
34
Princeton University Press, 2005
• What was old is new
• Crowdsourcing: A New
Branch of Computer Science
– D.A. Grier, IEEE President
• Tabulating the heavens:
computing the Nautical
Almanac in 18th-century
England
– M. Croarken (2003)
35
The Mechanical Turk
The original, constructed and
unveiled in 1770 by Wolfgang
von Kempelen (1734–1804)
36
J. Pontin. Artificial Intelligence, With Help From
the Humans. New York Times (March 25, 2007)
The Human Processing Unit (HPU)
Davis et al. (2010)
HPU
37
ACM Queue, May 2006
38
“Software developers with innovative ideas for
businesses and technologies are constrained by the
limits of artificial intelligence… If software developers
could programmatically access and incorporate human
intelligence into their applications, a whole new class
of innovative businesses and applications would be
possible. This is the goal of Amazon Mechanical Turk…
people are freer to innovate because they can now
imbue software with real human intelligence.”
Creating A New Class of
Intelligent Applications
39 @mattlease
“Amazon Remembers”
40
PlateMate (Noronha et al., UIST’11)
41
Ethics Checking: The Next Frontier?
• Mark Johnson’s address at ACL 2003
– Transcript in Conduit 12(2) 2003
• Think how useful a little “ethics checker and
corrector” program integrated into a word
processor could be!
42
Soylent: A Word Processor with a Crowd Inside
• Bernstein et al., UIST 2010
43
fold.it
S. Cooper et al. (2010)
Alice G. Walton. Online Gamers Help Solve Mystery of
Critical AIDS Virus Enzyme. The Atlantic, October 8, 2011.
44
MonoTrans:
Translation by monolingual speakers
45
• Bederson et al.,
2010
• See also: Morita & Ishida, ACM IUI 2009
VizWiz
Bigham et al. (UIST 2010)
46 Matt Lease - ml@ischool.utexas.edu
Zensors
Laput et al., CSCW 2015
47
Flock: Hybrid Crowd-Machine Learning
Classifiers (2015)
48
@mattlease
HCOMP 2013 Panel
Anand Kulkarni: “How do we
dramatically reduce the complexity of
getting work done with the crowd?”
Greg Little: How can we post a task and
with 98% confidence know we’ll get a
quality result?
50
How to ensure data quality?
• Research on statistical quality control methods
– Online vs. offline, feature-based vs. content-agnostic
– Worker calibration, noise vs. bias, weighted voting
• Human factors matter too!
– Instructions, design, interface, interaction
– Names, relationship, reputation
– Fair pay, hourly vs. per-task, recognition, advancement
51
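One of the statistical ideas listed above (worker calibration + weighted voting) can be sketched as follows; the standard log-odds weighting rule is shown, with workers and accuracies invented for illustration:

```python
# Sketch of content-agnostic quality control: estimate each worker's
# accuracy on gold items, then weight votes by the log-odds of that
# accuracy. Workers, accuracies, and labels are illustrative assumptions.
import math
from collections import defaultdict

def worker_weight(accuracy, eps=1e-6):
    """Log-odds weight: accurate workers count more; ~50% workers count ~0."""
    a = min(max(accuracy, eps), 1 - eps)
    return math.log(a / (1 - a))

def weighted_vote(votes, accuracies):
    """votes: {worker: label}; accuracies: {worker: gold accuracy in [0,1]}."""
    score = defaultdict(float)
    for worker, label in votes.items():
        score[label] += worker_weight(accuracies[worker])
    return max(score, key=score.get)

acc = {"w1": 0.90, "w2": 0.60, "w3": 0.55}
# Two near-chance workers disagree with one well-calibrated worker:
winner = weighted_vote({"w1": "yes", "w2": "no", "w3": "no"}, acc)
# winner == "yes" -- w1's weight (~2.20) outvotes w2 + w3 (~0.61)
```

Unlike plain majority voting, this rule lets one reliable worker outvote several unreliable ones, which is the essence of calibration-based aggregation.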
52
SQUARE:
A Benchmark
for Research on
Computing
Crowd
Consensus
@HCOMP’13
ir.ischool.utexas.edu/square
(open source)
Is everyone just lazy, stupid, or deceitful?!?
Many published papers seem to suggest this
• “Cheaters”
• “Fraudsters”
• “Lazy Turkers”
• “Scammers”
• “Spammers”
But why can’t the workers just get it
right to begin with?
53
What is our responsibility?
• Ill-defined/incomplete/ambiguous/subjective task?
• Confusing, difficult, or unusable interface?
• Incomplete or unclear instructions?
• Insufficient or unhelpful examples given?
• Gold standard with low or unknown inter-assessor
agreement (i.e. measurement error in assessing
response quality)?
• Task design matters! (garbage in = garbage out)
– Report it for review, completeness, & reproducibility
54
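The gold-standard bullet above can be made concrete with Cohen's kappa, a standard measure of inter-assessor agreement beyond chance: if kappa is low on the gold labels themselves, apparent "worker error" is partly measurement error. The two assessors' labels below are invented for illustration:

```python
# Sketch: Cohen's kappa between two assessors who built the gold standard.
# Labels are illustrative assumptions, not data from the talk.
from collections import Counter

def cohens_kappa(a, b):
    """Agreement beyond chance for two equal-length label sequences."""
    assert len(a) == len(b) and a
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = Counter(a), Counter(b)
    expected = sum(pa[c] * pb[c] for c in pa) / (n * n)
    return (observed - expected) / (1 - expected)

gold_1 = ["rel", "rel", "non", "rel", "non", "non"]
gold_2 = ["rel", "non", "non", "rel", "non", "rel"]
kappa = cohens_kappa(gold_1, gold_2)  # = 1/3 here: only modest agreement
```

Judging workers against a gold standard with kappa this low mostly measures the noise in the gold standard, not worker quality.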
Task Decomposition & Workflow Design
55
What about context?
“Best practices” for crowdsourcing design often
minimize context to maximize task efficiency
– e.g. “Are these pictures of the same person?”
56
Importance of Informed Consent +
Potential for Oppression, Crime, & War
Jonathan Zittrain, Minds for Sale
57
Who are
the workers?
• A. Baio, November 2008. The Faces of Mechanical Turk.
• P. Ipeirotis. March 2010. The New Demographics of
Mechanical Turk
• J. Ross, et al. Who are the Crowdworkers? CHI 2010.
58
What about ethics?
• Silberman, Irani, and Ross (2010)
– “How should we… conceptualize the role of these people
who we ask to power our computing?”
• Irani and Silberman (2013)
– “…by hiding workers behind web forms and APIs…
employers see themselves as builders of innovative
technologies, rather than… unconcerned with working
conditions… redirecting focus to the innovation of human
computation as a field of technological achievement.”
• Fort, Adda, and Cohen (2011)
– “…opportunities for our community to deliberately
value ethics above cost savings.” 59
Digital Dirty Jobs
• The Googler who Looked at the Worst of the Internet
• Policing the Web’s Lurid Precincts
• Facebook content moderation
• The dirty job of keeping Facebook clean
• Even linguistic annotators report stress &
nightmares from reading news articles!
60
What about freedom?
• Crowdsourcing vision: empowering freedom
– work whenever you want for whomever you want
• Risk: people being compelled to perform work
– Digital sweat shops? Digital slaves?
– Chinese prisoners used for online gold farming
– We really don’t know (and need to learn more…)
– Traction? Human Trafficking at MSR Summit’12
61
The Future of Crowd Work
Paper @ CSCW 2013 by
Kittur, Nickerson, Bernstein, Gerber,
Shaw, Zimmerman, Lease, and Horton 62
Conclusion
• Crowdsourcing is quickly transforming practice
in industry and academia via greater efficiency
• Human Computation enables a new design
space for applications, augmenting state-of-the-
art AI with human computation to offer
new capabilities and user experiences
• With people at the center of this new computing
paradigm, important research questions span
both technological and social/societal challenges
Matt Lease - ml@utexas.edu - @mattlease
Thank You!
ir.ischool.utexas.edu/crowd
Slides: slideshare.net/mattlease
