Content Migration for Sitecore
by SurendrA SharmA
Friday, July 15, 2016
Agenda
• Introduction
• Questions for Client
• Data Mapping
• Data Representation and Mapping
• Sources of Data
• Data Migration Path
• Why SQL Server
• Guide for content migration resources and team
• Images and media
• Sitecore Items
• Sitecore Fields
• Migration Code
• Log Files
• Code Techniques
• Type of resources
• Testing
• Takeaway
Source of Inspiration
Client, Manager, Testing Team, Developer
Questions for Client
• Is it a one-time activity?
• In how many batches will the data be provided?
• Insist on the same data format in every batch
• If a batch deviates from the format, the client should correct the data
• Frequent data format changes require code changes
• Is any manual migration involved?
• State these instructions to the client clearly in email, on calls, in the SOW, etc.
Data Mapping
• Three things are required: source, destination, and mapping
Data Representation and Mapping
• Same data, different representation.
• WWW + Customization = Where to Where and What + Data Representation

Source | Destination | Level | Example
Single field | Single field | Easy | Database Lastname field to Sitecore Lastname field (Kalam => Kalam)
Multiple fields | Single field | Medium | Database Prefix, Firstname, and Lastname fields to Sitecore Fullname field (Dr., Abdul, Kalam => Dr. Abdul Kalam)
Single field | Multiple fields | Difficult | Database Fullname field to Sitecore Prefix, Firstname, and Lastname fields (Dr. Abdul Kalam => Dr., Abdul, Kalam; Avul Pakir Jainulabdeen Abdul Kalam => ?)
[Diagram: Full Name mapped to Prefix, First Name, and Last Name; Prefix, First Name, and Last Name mapped to Full Name; First Name mapped to First Name]
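The gap between the "Medium" and "Difficult" rows can be sketched in C#. This is only an illustration, not the author's code: the `NameMapper` class, its method names, and the prefix list are assumptions. Combining fields is a simple join, while splitting a full name is a guess that has to admit defeat on names such as "Avul Pakir Jainulabdeen Abdul Kalam".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Sitecore): many-to-one is easy, one-to-many is a guess.
public static class NameMapper
{
    // Assumed prefix list; real data needs a longer list agreed with the client.
    private static readonly HashSet<string> KnownPrefixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Dr.", "Mr.", "Mrs.", "Ms.", "Prof." };

    // Multiple fields -> single field: join the non-empty parts with a space.
    public static string Combine(string prefix, string firstName, string lastName)
    {
        var parts = new[] { prefix, firstName, lastName }
            .Where(p => !string.IsNullOrWhiteSpace(p))
            .Select(p => p.Trim());
        return string.Join(" ", parts);
    }

    // Single field -> multiple fields: split on spaces and peel off a known prefix.
    // Returns false when the split is ambiguous so the record can be reviewed by hand.
    public static bool TrySplit(string fullName, out string prefix, out string firstName, out string lastName)
    {
        prefix = firstName = lastName = string.Empty;
        if (string.IsNullOrWhiteSpace(fullName)) return false;

        var words = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
        if (KnownPrefixes.Contains(words[0]))
        {
            prefix = words[0];
            words.RemoveAt(0);
        }

        // Exactly two remaining words is the only case treated as unambiguous here.
        if (words.Count != 2) return false;

        firstName = words[0];
        lastName = words[1];
        return true;
    }
}
```

Records for which `TrySplit` returns false are the ones worth raising with the manager or client before the migration runs.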
Source of Data
• Content may come from multiple sources and in multiple languages.
• Write code that handles multiple versions at the destination.
• Write code that handles multi-version and multi-language data.
• Different sources of data:
• Text files
• Excel files
• CSV files
• XML files
• JSON
• Databases (SQL Server, MySQL, etc.)
• Others
• Don’t write migration code that goes directly from text, Excel, or XML files to Sitecore.
Data Migration Map
• Load: load the data from the source
• Parse: parse the loaded data
• Dump: insert the parsed data into a SQL Server database
• Get data: read the data back from SQL Server
• Create/Update: create or update the item in Sitecore
This process migrates data and loads it into Sitecore. The data can come from various sources, such as files or SQL Server. Regardless of where the data comes from, the general process is to load the data from the source, parse it, dump it into SQL Server, read it back from SQL Server, and finally create or update items in Sitecore.
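The five steps can be sketched as a small C# pipeline. Everything here is an assumption made for illustration: the `StagedRecord` shape, the "key,title" line format, and above all the in-memory dictionary, which merely stands in for the SQL Server staging table. The Sitecore write is passed in as a delegate so the sketch does not depend on the Sitecore API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape; real columns come from the agreed data mapping.
public sealed class StagedRecord
{
    public string Key { get; set; }
    public string Title { get; set; }
}

public static class MigrationPipeline
{
    // Load + Parse: turn raw "key,title" lines into records, skipping malformed lines.
    public static List<StagedRecord> Parse(IEnumerable<string> lines)
    {
        var records = new List<StagedRecord>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            var cells = line.Split(new[] { ',' }, 2); // split on the first comma only
            if (cells.Length < 2) continue;
            records.Add(new StagedRecord { Key = cells[0].Trim(), Title = cells[1].Trim() });
        }
        return records;
    }

    // Dump: the dictionary stands in for a SQL Server staging table with a unique key.
    public static void Dump(IDictionary<string, StagedRecord> staging, IEnumerable<StagedRecord> records)
    {
        foreach (var record in records) staging[record.Key] = record;
    }

    // Get data + Create/Update: read staged rows in key order and hand each to the writer.
    public static int Run(IDictionary<string, StagedRecord> staging, Action<StagedRecord> createOrUpdateItem)
    {
        var processed = 0;
        foreach (var record in staging.Values.OrderBy(r => r.Key, StringComparer.Ordinal))
        {
            createOrUpdateItem(record);
            processed++;
        }
        return processed;
    }
}
```

Keeping the staging step separate means the last step can be re-run on QA, staging, or production without touching the original files again.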
Why SQL Server
• The SQL database can be reused across environments such as QA, staging, and production.
• Data is easy to filter with a SQL SELECT statement.
• Records can be processed in SQL Server based on unique keys.
• Software drivers, such as those for Excel, may not be available on staging or production servers.
• Migrate once, use multiple times.
• If the database is publicly accessible over the Internet, you can access the data from any server or environment.
• These databases are easy to restore in any environment.
Guide for content migration resources and team
• Able to work with minimum supervision
• Passionate
• Self-starter
• Agile team
Images and Media
• Use a regex to identify image references in the content
• Download each identified image
• Better still, have the client provide the images over FTP
• Upload each image and create an image item in Sitecore
• Replace the links in the content to match the new website's links
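A minimal C# sketch of the regex and link-replacement steps follows. The `ImageLinkHelper` class and the old-to-new URL map are assumptions; downloading the files and creating the Sitecore media items are left out. A regex like this is fragile against unusual HTML, so treat it as a starting point rather than a complete parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: finds <img src="..."> values and rewrites them from a map.
public static class ImageLinkHelper
{
    // Matches the quoted src attribute of an <img> tag (single or double quotes).
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"'](?<src>[^\"']+)[\"']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns each distinct image URL in order of first appearance.
    public static List<string> FindImageUrls(string html)
    {
        var urls = new List<string>();
        if (string.IsNullOrEmpty(html)) return urls;
        foreach (Match m in ImgSrc.Matches(html))
        {
            var src = m.Groups["src"].Value;
            if (!urls.Contains(src)) urls.Add(src);
        }
        return urls;
    }

    // Rewrites only the URLs present in urlMap (old URL -> new URL); others stay untouched.
    public static string ReplaceImageUrls(string html, IDictionary<string, string> urlMap)
    {
        if (string.IsNullOrEmpty(html)) return html;
        return ImgSrc.Replace(html, m =>
        {
            var src = m.Groups["src"];
            string replacement;
            if (!urlMap.TryGetValue(src.Value, out replacement)) return m.Value;
            var offset = src.Index - m.Index; // position of the URL inside the matched text
            return m.Value.Substring(0, offset) + replacement + m.Value.Substring(offset + src.Length);
        });
    }
}
```

The map would be filled after each image is uploaded, pairing the legacy URL with the URL of the new media item.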
Sitecore Items
• Check whether the item should be created from a template or from a branch template. A developer may write code that creates the item from a specific template when a branch template was intended.
• Item naming regex: use the same naming rules that Sitecore applies to items.
• Custom naming: special characters can’t be part of an item name. Remove all special characters and keep only characters such as A-Z, 0-9, underscore (_), and hyphen (-).
• An item name should not be truncated in the middle of a word.
• Instead of “Welcome To New”, create the item with the name “Welcome To New York”.
• Limit the item name to 100 characters, for example:
    string newItem = itemName.Trim();
    int cut = newItem.Length > 100 ? newItem.LastIndexOf(' ', 100) : -1; // last space within the limit
    if (cut > 0) newItem = newItem.Substring(0, cut);                    // cut at a word boundary
    else if (newItem.Length > 100) newItem = newItem.Substring(0, 100);  // no space to cut at
• The item name plays an important role in the URL and in SEO.
• For example, how should the item name of a lawyer be built?
• Lastname + firstname, or firstname + lastname?
• Legacy URL http://www.Oldsite.com/lara-craft, new URL http://www.newsite.com/craft-lara
• With multiple languages, migrate all related details in the same language in which the main item is created.
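The naming rules above can be combined into one helper. This is a sketch under assumptions: `ItemNameHelper` is a hypothetical class, the allowed character set is the one listed on the slide plus spaces, and the 100-character limit is the slide's figure. Sitecore's own `ItemUtil.ProposeValidItemName` may already cover part of this and is worth checking first.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: strips special characters and truncates at a word boundary.
public static class ItemNameHelper
{
    private const int MaxLength = 100; // limit taken from the slide

    public static string ToItemName(string rawName)
    {
        if (string.IsNullOrWhiteSpace(rawName)) return string.Empty;

        // Remove everything except letters, digits, underscore, hyphen, and space.
        var name = Regex.Replace(rawName, @"[^A-Za-z0-9_\- ]", string.Empty);
        // Collapse the runs of spaces that removal can leave behind.
        name = Regex.Replace(name, @"\s+", " ").Trim();

        if (name.Length <= MaxLength) return name;

        // Cut at the last space at or before the limit so that no word is split.
        var cut = name.LastIndexOf(' ', MaxLength);
        return cut > 0 ? name.Substring(0, cut) : name.Substring(0, MaxLength);
    }
}
```

Calling the same helper from every migration script keeps item names, and therefore URLs, consistent across batches.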
Sitecore Fields
• All droplink, dropdown, and image fields in Sitecore should be shared.
• Store the source GUID/ID and last-updated date in Sitecore so that the next batch migrates only the data that changed.
• Skip the Sitecore item when the GUID and the last-updated date are both unchanged.
• Update the Sitecore item when the GUID is the same but the date has changed.
• Create a Sitecore item for a new GUID.
• This strategy saves processing time and leaves data that is already up to date untouched.
• Keep an “Is Auto Migrated” checkbox in Sitecore.
• Always tick this checkbox during the automated migration process.
• Keep an “Is Valid” checkbox.
• Tick this checkbox in every automated and manual migration.
• Items left unticked are likely to be deleted.
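The skip/update/create rule can be written as one small decision function. The `SyncPlanner` class and the dictionary of already-migrated records are assumptions for illustration; in a real run the dictionary would be built from the GUID and last-updated fields stored on the Sitecore items.

```csharp
using System;
using System.Collections.Generic;

public enum SyncAction { Skip, Update, Create }

// Hypothetical helper: decides what to do with one source record in the next batch.
public static class SyncPlanner
{
    // existing maps source GUID -> last-updated date stored on the migrated item.
    public static SyncAction Decide(IDictionary<Guid, DateTime> existing, Guid sourceId, DateTime sourceUpdated)
    {
        DateTime storedUpdated;
        if (!existing.TryGetValue(sourceId, out storedUpdated)) return SyncAction.Create; // new GUID
        return storedUpdated == sourceUpdated
            ? SyncAction.Skip     // same GUID, same date
            : SyncAction.Update;  // same GUID, date changed
    }
}
```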
Migration Code
• Create a separate class library project for content migration.
• A separate class library makes the code easier to maintain and easier to access.
• Create one web page from which you can run the scripts.
• This page should be password protected.
• Use radio buttons on this page so that only one option can be selected, plus one button to run the chosen migration script.
• Keep all migration-related settings, including connection strings, in a separate config file located at Website\App_Config.
Log Files
• Maintain a separate log file for each migration script.
• Logs are handy for checking migration progress, especially in the QA and staging environments.
• Developers and testers refer to these log files for data verification.
• Always keep a counter in the logs to count records.
• Maintain logs in XML format for better results.
• Use Notepad++ to analyze the log files.
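One possible shape for such a log, sketched in C# with `System.Xml.Linq`. The `MigrationLog` class, the element names, and the attribute names are all assumptions; the point is only that each record gets a running counter and the file stays well-formed XML.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical per-script log: one <record> element per processed record, with a counter.
public sealed class MigrationLog
{
    private readonly XElement _root;

    public int RecordCount { get; private set; }

    public MigrationLog(string scriptName)
    {
        _root = new XElement("migrationLog", new XAttribute("script", scriptName));
    }

    // Adds one entry; n is the running record counter.
    public void Add(string sourceKey, string status, string message)
    {
        RecordCount++;
        _root.Add(new XElement("record",
            new XAttribute("n", RecordCount),
            new XAttribute("key", sourceKey),
            new XAttribute("status", status),
            message ?? string.Empty));
    }

    // Returns the log as XML text, with the total written on the root element.
    public string ToXml()
    {
        _root.SetAttributeValue("total", RecordCount);
        return _root.ToString();
    }

    // Writes the log to a file, one file per migration script.
    public void Save(string path)
    {
        _root.SetAttributeValue("total", RecordCount);
        _root.Save(path);
    }
}
```

Testers can then compare the `total` attribute against the number of rows in the source batch.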
Code Technique
• Instead of a fast query to get children, use the Item.Axes.GetDescendants() method, because fast queries can give unpredictable results.
• Use LINQ as much as possible to reduce the number of lines of code.
• Instead of Sitecore.Context.Database.GetItem(), use Sitecore.Data.Database.GetDatabase("master").GetItem(). The context database is typically the web database, while the migration has to work on items in the master database.
• Always check for null, blank, or empty values before reading or converting any value in code.
• Use break in a loop once the item you are looking for has been found and processed.
• If you are going to create more than 100 items under one parent, create them in folders instead. Folders can follow a date hierarchy or alphabetical order.
• Put the Sitecore item-fetching code in a common method and call it outside the loop. Calling it inside the loop slows the process down and delays it unnecessarily.
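For the folder advice, the folder path can be computed before any Sitecore call is made. The sketch below is an assumption-laden illustration: `FolderPlanner`, the "yyyy/MM/dd" hierarchy, and the "0-9" and "Other" bucket names are all hypothetical choices.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: picks the folder an item should be created under.
public static class FolderPlanner
{
    // Date hierarchy, e.g. year/month/day folders.
    public static string ByDate(DateTime date)
    {
        return date.ToString("yyyy'/'MM'/'dd", CultureInfo.InvariantCulture);
    }

    // Alphabetical: the upper-cased first letter, "0-9" for digits, "Other" for anything else.
    public static string ByFirstLetter(string itemName)
    {
        if (string.IsNullOrWhiteSpace(itemName)) return "Other";
        var first = char.ToUpperInvariant(itemName.Trim()[0]);
        if (first >= 'A' && first <= 'Z') return first.ToString();
        if (first >= '0' && first <= '9') return "0-9";
        return "Other";
    }
}
```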
Code Technique (Continued)
• Don’t write Sitecore fetch code inside the loop. The exception: if you are creating items in the same collection that the loop iterates over, fetch all items again inside the loop.
• For example: fetch all bios.
• Write a common function that validates item names and image names, use it everywhere in the project, and upload each image under the name this function returns.
• Write your script in an .aspx page inside server tags so that you don’t need to deploy a DLL anywhere and the running .NET/Sitecore process is not affected.
• Sitecore itself uses this technique.
• Take a backup of the Sitecore database before running any migration script.
Code Technique (Continued)
• Migration data may contain \n (newline). Replace it with a <br> tag or with <p></p> tags.
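Both replacements can be sketched in a few lines of C#. `NewlineConverter` is a hypothetical helper, and it assumes the text is safe to place in a rich text field as-is; it does no HTML encoding.

```csharp
using System;
using System.Linq;

// Hypothetical helper: converts plain-text line breaks to HTML.
public static class NewlineConverter
{
    // Every line break (\r\n, \r, or \n) becomes a <br> tag.
    public static string ToBr(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        return text.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "<br>");
    }

    // Every non-blank line becomes its own <p>...</p> paragraph.
    public static string ToParagraphs(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        var lines = text.Replace("\r\n", "\n").Replace("\r", "\n")
            .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(l => l.Trim().Length > 0);
        return string.Concat(lines.Select(l => "<p>" + l.Trim() + "</p>"));
    }
}
```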
Testing
• Always write an automated script to test and validate data between the live site and the new site, because the testing team can’t test every record manually. Automatically migrated data must be tested automatically.
• Run your automated test scripts on an isolated server such as a CA server or QA server. Don’t disturb staging or servers the client is using. On servers the client accesses, run your scripts during off hours.
• The auto-migration script must be code reviewed.
• Ask probing questions and validate all client and vendor documents before starting the data migration. Try to understand how the data is classified and categorized.
• Take a backup of the database before running the scripts.
• Run your migration script on a few records or items first. Only after the testing team has reviewed these items should you run the script for all records.
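The core of such an automated check is a comparison of source values against migrated values. The sketch below assumes both sides have already been read into dictionaries keyed by record ID. `MigrationVerifier` and the message texts are hypothetical, and fetching the data from the live site and from Sitecore is out of scope here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical verifier: reports records that are missing, different, or unexpected.
public static class MigrationVerifier
{
    public static List<string> Compare(IDictionary<string, string> source, IDictionary<string, string> migrated)
    {
        var problems = new List<string>();

        foreach (var pair in source)
        {
            string actual;
            if (!migrated.TryGetValue(pair.Key, out actual))
                problems.Add("Missing in new site: " + pair.Key);
            else if (!string.Equals(pair.Value, actual, StringComparison.Ordinal))
                problems.Add("Value differs: " + pair.Key);
        }

        // Records that exist only on the new site are reported as well.
        foreach (var key in migrated.Keys)
        {
            if (!source.ContainsKey(key)) problems.Add("Unexpected in new site: " + key);
        }

        return problems;
    }
}
```

An empty result on the first small batch is a reasonable gate before running the migration for all records.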
Takeaway
• WWW + Customization = Where to Where and What + Data Representation
• Data is everything for any organization.
• Avoid update operations on data.
• Content migration is for developers who like to play with data. DO YOU?
Thank you!

Editor's Notes

  • #2: Content migration of over 3 lakh (300,000) records.
  • #4: Training can only provide knowledge, while real-life scenarios give you experience. The client gives you experience because his requirements are weird and unpredictable. The client doesn’t know Sitecore; the PM knows Sitecore and instructs you to do the work. Testers can never be good friends with developers, because developers never want to break their code and testers do exactly the opposite. Testers find the issues that developers have to fix, and developers don’t like changing code. It is rare for testers and developers on the same project to be good friends. For a tester, no bugs means no glory. This presentation and session is dedicated to clients, PMs, and testers.
  • #5: The most important thing is content: data, information. Ask the client these questions. Is it a one-time activity? If the answer is yes, believe me, you are in the safe zone. In how many batches will the data be provided? Give a strict instruction: we want data in the same format in all batches. Educate your client. If the data arrives in a different format, the client has to correct it; otherwise developers have to change the code, which they don’t like, and that creates conflict between the team and the manager. Is there any manual migration? Manual migration is hard work. Spell out these instructions clearly to the client in email and in the SOW.
  • #7: Developers’ questions for the manager in the single-source-to-multiple-destination case.
  • #11: Able to work with minimum supervision: you will be the only one we blame when something goes wrong. Passionate: perseveres through regular death marches in front of management. Self-starter: we have no process. Agile team: we have daily stand-ups.
  • #18, #19: Give an example of the difference between (1) XML data and an RT field, and (2) XML data and an RT HTML field.
  • #20: The Imitation Game: it stars Benedict Cumberbatch as real-life British cryptanalyst Alan Turing, who decrypted German intelligence codes for the British government during World War II. The team tries to break the ciphers created by the Enigma machine, which the Nazis used to secure their radio messages.