INTRODUCTION TO
      DATA MIGRATION

                    Trish Rose-Sandler
                    VRA Conference Kansas City, MO
                    March 28 2007




Definition

  “Data migration is the transferring of data
  between storage types, formats, or computer
  systems. Data migration is usually performed
  programmatically to achieve an automated
  migration, freeing up human resources from
  tedious tasks. It is required when organizations
  or individuals change computer systems or
  upgrade to new systems.”
                            Wikipedia 2/21/07
3 Stages of Data Migration

 Pre Migration – Analyzing, Mapping,
 Normalizing/Transforming, Testing, Backup

 Migration

 Post Migration – Quality Control, Cleanup,
 Update Cataloging Guidelines
Pre-Migration: Data Analysis

Evaluate data for:
 Consistency
 Unnecessary redundancy across
 records
 Errors
 Relationships & structures that
 need re-evaluation
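
A minimal sketch of how these checks might be scripted before migrating. The sample records and field names are hypothetical stand-ins for an export of your own source data, not part of the original presentation.

```python
from collections import Counter

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "Creator": "Johnson, Ben", "Date": "c. 1988"},
    {"id": 2, "Creator": "Johnson, Ben", "Date": "c1947"},
    {"id": 3, "Creator": "johnson, ben", "Date": ""},
]

def value_counts(records, field):
    """Count each distinct value of a field so variants stand out."""
    return Counter(rec.get(field, "") for rec in records)

def case_variants(records, field):
    """Group values that differ only by case or surrounding whitespace."""
    groups = {}
    for rec in records:
        raw = rec.get(field, "")
        groups.setdefault(raw.strip().lower(), set()).add(raw)
    return {key: vals for key, vals in groups.items() if len(vals) > 1}

def empty_fields(records, field):
    """Return ids of records where the field is missing or blank."""
    return [rec["id"] for rec in records if not rec.get(field, "").strip()]

print(value_counts(records, "Creator"))
print(case_variants(records, "Creator"))
print(empty_fields(records, "Date"))
```

Reports like these only surface candidates; a cataloger still decides which variant is correct.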




Pre-Migration: Data Analysis
    Screen shot of Oxygen XML editor
Pre-Migration: Mapping

Source                              Target

CreatorName                         vra:agentName
                                    dc:creator

Date                                vra:date
                                    dc:date

Site                                vra:locationName
                                    dc:coverage

General Notes                       vra:description
                                    dc:description
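
One way to make a map like this executable is to hold it in a lookup table and report any source field that has no target. This is only a sketch: the function name `map_record` and the sample record are assumptions, and a real migration would also have to handle repeating fields and nesting.

```python
# Source-to-target map taken from the table above.
FIELD_MAP = {
    "CreatorName": ["vra:agentName", "dc:creator"],
    "Date": ["vra:date", "dc:date"],
    "Site": ["vra:locationName", "dc:coverage"],
    "General Notes": ["vra:description", "dc:description"],
}

def map_record(source_record):
    """Copy each source value to every target element it maps to.

    Returns (target_record, unmapped_fields) so nothing is dropped silently.
    """
    target, unmapped = {}, []
    for field, value in source_record.items():
        targets = FIELD_MAP.get(field)
        if targets is None:
            unmapped.append(field)
            continue
        for element in targets:
            target[element] = value
    return target, unmapped

# Hypothetical source record with one field missing from the map.
mapped, leftover = map_record({"Site": "Kansas City", "Accession": "A-1"})
print(mapped)
print(leftover)
```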




Pre-Migration: Mapping
       Example of Source to Target map
Pre-Migration: Mapping

Source Set

  Examples of parsing
    Source data stores all agent names in a Creator field without a
     qualifying role
            Creator
    Target data stores agent names separately from, but
     linked to, roles
            Agent.name
            Agent.role
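
The parsing step could look roughly like the sketch below. It assumes, hypothetically, that multiple names in the source Creator field are separated by semicolons. Because the source has no role information, the role is left empty for later review rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Agent:
    name: str
    role: Optional[str] = None  # None marks a role that still needs review

def parse_creator(creator: str, default_role: Optional[str] = None) -> List[Agent]:
    """Split a source Creator value into separate Agent records.

    The semicolon delimiter is an assumption; check how your own
    source data separates multiple names.
    """
    names = [part.strip() for part in creator.split(";")]
    return [Agent(name=n, role=default_role) for n in names if n]

# Sample values are invented for illustration.
print(parse_creator("Johnson, Ben; Smith, Ann"))
```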




Pre-Migration: Mapping

Target Set

Mix of community standards and local elements
  Descriptive – e.g. Core 4.0
  Technical – e.g. Core 4.0, MIX
  Rights – e.g. Core 4.0, METS Rights
  Local – e.g. dateImagePurchased
E.g. VRA Core has its own documentation and is meant to be used with the Cataloging Cultural Objects (CCO) content standard
MODS schema has both the MODS documentation guidelines <> and a best practice guideline written by DLF members called DLF/Aquifer Implementation Guide



          Pre-Migration: Mapping
          Familiarize yourself with standards’
           documentation
               Examples
               VRA Core 4.0 standard
                     Core 4.0 documentation
                     Cataloging Cultural Objects (CCO) content standard
               MODS standard
                     MODS documentation guidelines
                         <http://www.loc.gov/standards/mods/>
                     DLF/Aquifer Implementation Guidelines for Shareable
                      MODS Records
                         <http://www.diglib.org/aquifer/dlfmodsimplementationguidelines_finalnov2006.pdf>






          Pre-Migration: Mapping
        Crosswalks may be useful, e.g. Getty crosswalk



          Pre-Migration: Normalization

        The goal of normalization is to transform or
        clean up your data values so they conform to
        accepted standards, are more consistent, and
        can be understood by any user of your images


        Can be done during
                Pre Migration
                Migration
                Post Migration







          Pre-Migration: Normalization

       Data to Normalize
                      Abbreviations
                      e.g. Material= ol on cv.; source = DOA; attribution=sch of


                      Inconsistencies
                      e.g. expressions of circa: Date= c1947; c. 1988; c 500 AD;
                               ca. 15th Century


                      Formatting conventions
                      e.g. title=[Grapevines at Mission de San Ignacio]
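
A rough sketch of how the circa variants and abbreviations listed above might be cleaned up with a script. The target form "ca." and the abbreviation expansions are assumptions for illustration; your cataloging guidelines decide the real values.

```python
import re

# Matches c, c., ca, ca. or circa at the start of a value, but only
# when a digit follows, so ordinary words starting with "c" are untouched.
CIRCA = re.compile(r"^\s*(?:circa|ca|c)\.?\s*(?=\d)", re.IGNORECASE)

# Assumed expansions; confirm each one against local practice.
ABBREVIATIONS = {
    "ol on cv.": "oil on canvas",
    "sch of": "school of",
}

def normalize_circa(value):
    """Rewrite any leading circa variant as 'ca. '."""
    return CIRCA.sub("ca. ", value.strip())

def expand_abbreviation(value):
    """Expand a known abbreviation; return unknown values unchanged."""
    cleaned = value.strip()
    return ABBREVIATIONS.get(cleaned.lower(), cleaned)

for raw in ["c1947", "c. 1988", "c 500 AD", "ca. 15th Century"]:
    print(raw, "->", normalize_circa(raw))
```

Values the script does not recognize, such as DOA, pass through unchanged so they can be reviewed by hand.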



          Pre-Migration: Normalization


       Other types of normalization
       Source system uses codes for values
                      e.g. Name=Johnson, Ben                                  type=1 (1=personal)

       Create indexed versions of dates
                      e.g. 2/9/01, mid 15th century, October 1945
                      YYYY-MM-DD, YYYY-MM, YYYY (ISO 8601 standard)
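
As an illustration, indexed dates could be generated with a small parser that tries a list of known input patterns and leaves anything else for manual review. The pattern list is an assumption, and so is the month-first reading of slash dates: 2/9/01 is treated as February 9, 2001.

```python
from datetime import datetime

# (input pattern, precision) pairs, tried in order. Month-first order for
# slash dates is an assumption; day-first data would need other patterns.
PATTERNS = [
    ("%m/%d/%y", "day"),
    ("%m/%d/%Y", "day"),
    ("%B %Y", "month"),
    ("%Y", "year"),
]

def to_iso(value):
    """Return an ISO 8601 string (YYYY-MM-DD, YYYY-MM or YYYY), or None."""
    text = value.strip()
    for pattern, precision in PATTERNS:
        try:
            parsed = datetime.strptime(text, pattern)
        except ValueError:
            continue
        if precision == "day":
            return f"{parsed.year:04d}-{parsed.month:02d}-{parsed.day:02d}"
        if precision == "month":
            return f"{parsed.year:04d}-{parsed.month:02d}"
        return f"{parsed.year:04d}"
    return None  # e.g. "mid 15th century" needs a cataloger's decision

for raw in ["2/9/01", "October 1945", "mid 15th century"]:
    print(raw, "->", to_iso(raw))
```

Free-text dates such as "mid 15th century" return None here on purpose; turning them into an indexed value or range is a cataloging decision, not a parsing one.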







          Pre-Migration: Mapping

       Assess metadata granularity

       Examples of narrow and broad worktypes

                      Narrow: City planning, urbanism, landscape design,
                      garden design, environmental design

                      Broad: Architecture

                      Narrow: costume design, fashion design, clothing,
                      jewelry, ornament, body decoration

                      Broad: Costume and Jewelry
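
If the target system uses the broader worktypes, the narrow source terms can be rolled up with a lookup table. The sketch below uses only the groupings listed above; the function name `broaden` is hypothetical.

```python
# Narrow-to-broad worktype groupings taken from the examples above.
BROAD_TO_NARROW = {
    "Architecture": [
        "city planning", "urbanism", "landscape design",
        "garden design", "environmental design",
    ],
    "Costume and Jewelry": [
        "costume design", "fashion design", "clothing",
        "jewelry", "ornament", "body decoration",
    ],
}

# Invert the grouping so each narrow term points at its broad term.
NARROW_TO_BROAD = {
    narrow: broad
    for broad, narrows in BROAD_TO_NARROW.items()
    for narrow in narrows
}

def broaden(worktype):
    """Return the broad worktype for a narrow one, or None if unknown."""
    return NARROW_TO_BROAD.get(worktype.strip().lower())

print(broaden("Garden design"))
print(broaden("jewelry"))
```

Going the other way, from broad to narrow, cannot be automated like this because the detail is not in the source data.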



          Pre-Migration: Testing
        Hand pick records to evaluate mapping decisions and to
        test for normalization and diacritics problems

        Diacritics: best to encode as UTF-8 or use
        Unicode decimal or hexadecimal character references

        Display

                       Karlštejn (Středočeský kraj, Czech Republic)--Castle

        Exported using Unicode decimal character references

                       Karls&#780;tejn (Str&#780;edoc&#780;esky&#769; kraj, Czech
                       Republic)--Castle
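
A small sketch for test records like this one, using only the Python standard library: it decodes the numeric character references, composes each base letter with its combining mark, and shows the reverse direction. The function names are hypothetical.

```python
import html
import unicodedata

# The exported value from the example above, joined onto one line.
exported = (
    "Karls&#780;tejn (Str&#780;edoc&#780;esky&#769; kraj, "
    "Czech Republic)--Castle"
)

def decode_references(text):
    """Turn numeric character references into characters, then compose
    base letters and combining marks into single characters (NFC)."""
    return unicodedata.normalize("NFC", html.unescape(text))

def encode_references(text):
    """Replace every non-ASCII character with a decimal reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

decoded = decode_references(exported)
print(decoded)
print(encode_references(decoded))
```

Re-encoding the composed text produces references to the precomposed letters rather than the original combining marks; both forms display the same.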







          Pre-Migration: Final Word



            BACK-UP, BACK-UP, BACK-UP



          Migration: A few words

        Enlist the help of a programmer or database
        administrator

        How much help you need depends on your source and
        target systems (what tools they provide for migrating
        data), how much normalization the data needs, and
        how much the data must be restructured.

        A database administrator can help with target system setup
        (forms, reports, security, backup, etc.)







          Post Migration


                       Quality Control (QC)

                       Data Cleanup

                       Update Cataloging Guidelines
