Data Management for Citizen Science

Challenges & Opportunities for USGS Leadership


Andrea Wiggins
Postdoctoral Fellow
DataONE & Cornell Lab of Ornithology

12 September, 2012
USGS CDI Citizen Science workshop
DataONE PPSR Working Group
Purpose:
 • Improve quality, quantity, and accessibility of PPSR data
 • Advance integration of PPSR data in conventional science


Products:
 • Data Management Guide for PPSR - coming soon!
 • Articles in August FREE special issue
 • Data quality & validation paper




Data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze

Questions about data management:
 • What is a data management plan?
 • How can I assure quality of volunteers’ data?
 • What data about volunteers should I keep or share?
 • Should I share raw data with known errors?
 • What if the data are used for commercial profit?
 • Who can help me?
 • What tools do I use?
 • How long will it take to get enough data?

Citizen science data challenges
Data policies

Cyberinfrastructure

Data quality




Policy? What policy?
Data policies = boring




Image: http://www.flickr.com/photos/escapist/107455718/

Data policies = hard
 • Ownership, sharing, use, access, challenge, etc.
 • Lots of decisions, vague consequences




Need examples of carefully crafted policies
 • Story of the data + policy that resulted
 • USGS is way ahead of the game!

Cyberinfrastructure
Technology is a major pain point




Platforms needed
  • Transcription, observation, processing
  • Ongoing support & development required




Who is going to pay?
 • <insert sound of crickets here>



Image: http://www.flickr.com/photos/gravitywave/1303504847/

Data quality perceptions
No more reinvention
 • The data are as good as your project design
 • Reuse protocols & technologies
 • Replicability -> reliability




No more excuses
 • All scientific data have errors
 • Our data are just like yours...except we have more friends
 • Document data collection & QA/QC in excruciating detail


Survey says...




Least satisfied with current:
  • Process for sharing project data with colleagues,
    researchers, and/or participants
  • Ways of presenting project data/results to participants




Better data management planning than average
 • 1/3 had NO data management plan at all!
 • Government-funded projects: yes, for some data




Tools & resources strongly desired across categories,
especially:
 • Analyzing & visualizing data
 • Documenting & describing data
 • Training




Top priorities for improvement (high agreement)
 1. Analyzing & visualizing data
 2. Documenting & describing data
 3. Long-term storage
 4. Establishing & updating data policies

Leading the way




Be an exemplar in data sharing & community building




Make your data policies easy to find & emulate




Share your platforms with everyone, not just New Zealand!




Make data quality obvious




USGS brings more credibility to citizen science


Thanks!
andrea.wiggins@cornell.edu
@AndreaWiggins

dataone.org
birds.cornell.edu
citizenscience.org
andreawiggins.com







Editor's Notes

  • #4: When it comes to the data life cycle that Bill mentioned yesterday, many scientists are grappling with questions about data management. Questions like... [READ OFF] These are just a few questions out of many that PPSR project leaders have discussed with me, but as you might have noticed, most of them are questions that are equally applicable to conventional scientific research.
  • #5: In fact, the only thing I can see that is truly unique about PPSR data is the involvement of volunteers. At the end of the day, data is data. So I hope it comes as some comfort for everyone here to know that there’s nothing unusual in these challenges, with the exception of needing to manage aspects of the data that are directly related to volunteers.