Research Data
Management @Harvard
Mercè Crosas, Ph.D. @mercecrosas
Chief Data Science and Technology Officer
Institute for Quantitative Social Science
Harvard University
“Good data management is
not a goal in itself”
Wilkinson et al., The FAIR Guiding Principles for Scientific Data Management and Stewardship, Nature Scientific Data, 2016
“Good data management is
not a goal in itself”
• Enables continuity of research projects
• Facilitates data sharing and re-use
• Reduces research and data storage costs
• Helps with data reproducibility
Connecting Computing Resources
and Data Management is critical
Harvard Data
Group
(established in 2016 by Office of
Vice-Provost of Research;
Tahmassian and Crosas)
Research
Computing Council
(established in 2016 by CIO; Margulies, Cuff)
Data (Crosas)
Access (Yockel)
Talent (Adair)
HMS Data
Management
Working Group
(established in 2014, grassroots)
• Office of Vice-Provost of Research
• IQSS
• HUIT
• HMS IT
• HU Library
• Countway Library
• HU Office of Sponsored Programs
• HMS Basic Sciences
• HMS Sponsored Programs Admin
2014 2017
• Countway Library
• HMS IT
• HMS Basic Sciences
• HMS Sponsored Programs Admin
• Harvard Chan Bioinformatics
• HMS Research Computing
• HMS Academic and Research Integrity
• HUIT
• FAS Research Computing
• HMS Research Computing
• HBS Research Computing
• IQSS
• HU Library
Harvard Data Group
has concrete Tasks
• Build a website for research data management @Harvard,
coordinating with all existing resources (Spring-Summer 2017)
• Create a research data management training module, with
custom modules for various research domains (2017-2018)
• Data Use Agreements (DUA) sub-group to coordinate DUA tracking
and workflows, as part of data management support (2017)
• More in the future
A Single Entry to Data Management
Avoids Confusion
• For researchers, not for
librarians, archivists, or trainers
• Cite scholarly work, evidence-based studies
• Concise; point to other
resources as needed
• “good enough data
management”:
• what you need to know
• what Harvard can offer
• other resources you can use
RDM @Harvard will link to HMS Data
Management and Library sites
Data Management Training offered for
Medical School & School of Public Health
• Organized by the HMS
Data Management
Working Group
• Based on the data lifecycle
for biomedical research
• Has been offered a few
times in 2016/2017
• Will be combined with
Harvard-wide training
Extension of Harvard
Dataverse Curation Services
• Led by Sonia Barbosa (IQSS)
• 6-month pilot program with
Harvard librarians
• Offers extended curation
services to Harvard affiliates
(and all users, when possible)
• Evaluating cost-based model
• Plus, office hours once a week
Data Management Support is
not Sufficient
Research Computing & Security Support
Data Science Support
Data Management Support
Layers of Support
DataFest 2017 Brings Data Science
Basic Training to researchers and staff
More technology integration
and ease-of-use,
less training and support
Data Repositories can help
Integrate the Data Lifecycle
Dataverse is an open-source platform for building any
type of data repository, including institutional repositories.
A growing community of developers and users
http://dataverse.org
Agriculture data
Repository in
Fudan, China
Data from 20 Universities
Public data repository
Science Consortium
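As one illustration of how other tools might reach a Dataverse repository, the minimal Python sketch below builds a query URL for the Dataverse Search API. It assumes the `/api/search` endpoint with the `q`, `type`, and `per_page` parameters described in the Dataverse API guide, and it uses the Harvard Dataverse address as the base URL. The function name `build_search_url` is hypothetical, and no request is sent.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, item_type="dataset", per_page=10):
    """Return a Search API URL for a Dataverse installation (nothing is fetched)."""
    # Assumed parameters: q (search terms), type (dataverse, dataset, or file),
    # per_page (number of results per page).
    params = {"q": query, "type": item_type, "per_page": per_page}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


url = build_search_url("https://dataverse.harvard.edu", "data management")
print(url)
```

The resulting URL could be opened in a browser or passed to any HTTP client; the response format is not shown here.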
An Integrated Data Management and
Computing Solution
E Lab
Notebooks,
Instruments,
Surveys, …
Storage and
Computing
Journals
Data
Repository
Assign
Security Level,
DUA
DOI, Metadata,
DUA, Restrictions
Link
citations
DOI, metadata, and DUA are assigned after data collection;
Data repository enables data-centric computing
Track Provenance
Metadata
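The flow on this slide can be sketched in Python as a record that moves from collection to storage, to the repository, and then to a journal, with provenance recorded at each step. This is only an illustrative model, not how Dataverse or any Harvard system is implemented: the class, the function names, the security-level label, the DUA label, and the DOI are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DatasetRecord:
    source: str                           # e.g. lab notebook, instrument, survey
    security_level: Optional[str] = None  # assigned when data reach storage
    dua: Optional[str] = None             # data use agreement, if any
    doi: Optional[str] = None             # assigned by the data repository
    metadata: dict = field(default_factory=dict)
    citations: list = field(default_factory=list)
    provenance: list = field(default_factory=list)


def store(record, security_level, dua):
    """Storage and computing step: assign a security level and a DUA."""
    record.security_level = security_level
    record.dua = dua
    record.provenance.append("stored")
    return record


def deposit(record, doi, metadata):
    """Repository step: attach a DOI and descriptive metadata."""
    record.doi = doi
    record.metadata = dict(metadata)
    record.provenance.append("deposited")
    return record


def link_citation(record, article):
    """Journal step: link an article to the deposited dataset."""
    if record.doi is None:
        raise ValueError("deposit the dataset before linking citations")
    record.citations.append(article)
    record.provenance.append("cited")
    return record


record = DatasetRecord(source="survey")
store(record, security_level="level-3", dua="DUA-placeholder")       # placeholder values
deposit(record, doi="doi:10.5072/FK2/EXAMPLE", metadata={"title": "Example survey data"})
link_citation(record, "Example journal article")
print(record.provenance)
```

Keeping the provenance list on the record itself is one simple way to make each stage traceable; a real system would more likely log these events in the repository.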
Machine-readable, FAIR Data Management
Plans can help track data management
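To illustrate what "machine-readable" could mean here, the sketch below stores a data management plan as JSON and checks which planned datasets do not yet have a DOI. The field names, the project, and the DOI are hypothetical and do not follow any particular DMP standard.

```python
import json

# Hypothetical plan; every field name and value is a placeholder.
dmp = {
    "project": "Example project",
    "datasets": [
        {"title": "Survey responses", "repository": "Harvard Dataverse", "doi": None},
        {"title": "Interview transcripts", "repository": "Harvard Dataverse",
         "doi": "doi:10.5072/FK2/EXAMPLE"},
    ],
}


def datasets_without_doi(plan):
    """Return the titles of planned datasets that have no DOI yet."""
    return [d["title"] for d in plan["datasets"] if not d.get("doi")]


# Round-trip through JSON to show the plan can be exchanged as plain text.
plan = json.loads(json.dumps(dmp))
print(datasets_without_doi(plan))
```

Because the plan is structured data rather than free text, a check like this could run automatically during a project.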
THANKS!
Mercè Crosas @mercecrosas mercecrosas.com
With contributions from Caroline Shamu, Radhika Khetani, and Sonia Barbosa
In summary:
• Coordinate, coordinate, coordinate
(across groups)
• Integrate, integrate, integrate
(across technologies)
