Sander Vermolen
    Eelco Visser



                                        Data
                                        Model 
                                        Evolution


            This research is supported by NWO/JACQUARD project
            638.001.610, MoDSE: Model-Driven Software Evolution.
Data 
    Models



              
     
     
     
User   1
       name      bob 
       real name Bob Johnson
       email     b.johnson@mail.com

    Page   1
       title      "The first page"
       isRedirect false
       text       "Hello world"




        
Count page views




               Version history



                  
     
No page count
        No revisions


        User   1
           name      bob 
           real name Bob Johnson
           email     b.johnson@mail.com

        Page   1
           title      "The first page"
           isRedirect false
           text       "Hello world"



     
     
Coupled Data Evolution




               
...

    $dbh->bz_add_column('attachments', 'submitter_id', {TYPE => 'INT3', NOTNULL => 1}, 0);
    $dbh->bz_rename_column('bugs_activity', 'when', 'bug_when');
    _add_bug_vote_cache(); _update_product_name_definition(); _add_bug_keyword_cache();
    $dbh->bz_add_column('profiles', 'disabledtext', {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, '');
    _populate_longdescs(); _update_bugs_activity_field_to_fieldid();

    if (!$dbh->bz_column_info('bugs', 'lastdiffed')) {
           $dbh->bz_add_column('bugs', 'lastdiffed', {TYPE =>'DATETIME'});
           $dbh->do('UPDATE bugs SET lastdiffed = NOW()');
    }

    _add_unique_login_name_index_to_profiles();
    $dbh->bz_add_column('profiles', 'mybugslink', {TYPE => 'BOOLEAN', NOTNULL => 1, DEFAULT => 'TRUE'});
    _update_component_user_fields_to_ids();
    $dbh->bz_add_column('bugs', 'everconfirmed', {TYPE => 'BOOLEAN', NOTNULL => 1}, 1);
    $dbh->bz_add_column('products', 'maxvotesperbug', {TYPE => 'INT2', NOTNULL => 1, DEFAULT => '10000'});
    $dbh->bz_add_column('products', 'votestoconfirm', {TYPE => 'INT2', NOTNULL => 1}, 0);
    _populate_milestones_table();
    $dbh->bz_alter_column('bugs', 'target_milestone', {TYPE => 'varchar(20)', NOTNULL => 1, DEFAULT => "'---'"});
    $dbh->bz_alter_column('milestones', 'value', {TYPE => 'varchar(20)', NOTNULL => 1});
    _add_products_defaultmilestone();

    if (!$dbh->bz_index_info('cc', 'cc_bug_id_idx') || !$dbh->bz_index_info('cc', 'cc_bug_id_idx')->{TYPE}) {
           $dbh->bz_drop_index('cc', 'cc_bug_id_idx');
           $dbh->bz_add_index('cc', 'cc_bug_id_idx', {TYPE => 'UNIQUE', FIELDS => [qw(bug_id who)]});
    }
    if (!$dbh->bz_index_info('keywords', 'keywords_bug_id_idx') || !$dbh->bz_index_info('keywords', 'keywords_bug_id_idx')->{TYPE}) {
           $dbh->bz_drop_index('keywords', 'keywords_bug_id_idx');
           $dbh->bz_add_index('keywords', 'keywords_bug_id_idx', {TYPE => 'UNIQUE', FIELDS => [qw(bug_id keywordid)]});
    }

    ...
                                                             
Costly

                High risk

    Holds back the development process

    Large infrequent development steps




                      
User
           id           :   integer
           name         :   varchar
           realName     :   varchar
           email        :   tinytext

        Page
           id           :   integer
           title        :   varchar
           author       -   User        *
           isRedirect   :   boolean
           content      :   text




                                       set of
User
       id          :   integer    Unique
       name        :   varchar
       realName    :   varchar    ?
       email       :   tinytext

    Page : Medium
       author     -    User       min(1) max(8)
       content    :    text
       refs       :    url        * Indexed

    abstract Medium
        id         :   integer    Unique
        title      :   ANY_NAME




                          
Evolving
    Data Models



      
     
     
     
     
Specifying
    Data Model Evolution



         
User                                User
       id           ::   integer           id           ::   integer
       name         ::   varchar           name         ::   varchar
       realName     ::   varchar           realName     ::   varchar
       email        ::   tinytext          email        ::   tinytext

    Page                                Page
       id           ::   integer           id           ::   integer
       title        ::   varchar           title        ::   varchar
       author           User              counter      ::   biginteger
       isRedirect   ::   boolean           isRedirect   ::   boolean
       content      ::   text              revisions        set of Revision

                                        Revision
                                           id           ::   integer
                                           page             Page
                                           comment      ::   tinyblob
                                           timestamp    ::   time
                                           revisionText ::   text
                                           author           User
What happened?

    Added type Revision                  Revision
                                            id           ::   integer
                                            page             Page
                                            comment      ::   tinyblob
                                            timestamp    ::   time
                                            author           User

    Added attribute revisions            Page
                                            revisions     set of Revision

    Moved content to revision text       Revision
                                            revisionText ::   text

    Added attribute counter              Page
                                            counter      ::   biginteger
                                      
8 Basic Transformations



    add or remove entity         add or remove property




    change name of entity       change name of property




    change type of set           change type of property




                             
1 Advanced Transformation




            move property




                
add
          Revision
             id             :   integer
             page           -   Page
             comment        :   tinyblob
             timestamp      :   time
             author         -   User




    add
          counter   :       biginteger




                         
     
At    / Entity Page  /  Property Title
    add   counter : biginteger




                      
At    / Entity[Id="Page"] / Property[Id="Title"]
    add   counter : biginteger




                       
at    // Property [Id = ../Id]
    add   counter :: biginteger




                       
Revision
              revisionText ::   text




    At     / Entity Revision  /  Property timeStamp
    move   page.content 
    to     revisionText :: text




                       
at     Entity Page  /  Property Title
    add    counter :: biginteger


       ;

    at     Entity Revision  /  Property timeStamp
    add    revisionText :: text
    from   page.content
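
The two steps above can be mimicked on a toy in-memory model. This is only a sketch under stated assumptions: the dictionary encoding and the helper names `add_property` and `move_property` are hypothetical, and the source entity of the move is passed explicitly instead of being resolved through the `page` reference as in `page.content`.

```python
# Hypothetical encoding of a data model: entity name -> ordered mapping of
# property name -> type. None of these names come from the slides.
model = {
    "Page": {"id": "integer", "title": "varchar", "content": "text"},
    "Revision": {"id": "integer", "page": "Page", "timestamp": "time"},
}


def add_property(model, entity, after, name, type_):
    """Insert property `name` right after property `after` in `entity`."""
    props = model[entity]
    if name in props:
        raise ValueError(f"{entity}.{name} already exists")
    if after not in props:
        raise KeyError(f"{entity}.{after} not found")
    rebuilt = {}
    for key, value in props.items():
        rebuilt[key] = value
        if key == after:
            rebuilt[name] = type_
    model[entity] = rebuilt


def move_property(model, src_entity, src_name, dst_entity, after, name):
    """Add `name` to `dst_entity` with the source's type, then drop the source."""
    type_ = model[src_entity][src_name]
    add_property(model, dst_entity, after, name, type_)
    del model[src_entity][src_name]


# at Entity Page / Property title: add counter :: biginteger
add_property(model, "Page", "title", "counter", "biginteger")
# at Entity Revision / Property timestamp: add revisionText :: text from page.content
move_property(model, "Page", "content", "Revision", "timestamp", "revisionText")
```

The position argument (`after`) plays the role of the positioning sub-language: it only says where in the entity the new property lands.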




                          
Evolving
            Data Models




    8 basic transformations

    1 advanced transformation

    Language to specify transformations

    Positioning sub language




    Specify data model evolutions

                                   
Data
    Migration



      
     
Technical Domain


     WebDSL



    Stratego/XT




                       Stratego/XT
                      Generic Aterm
                  SQL Databases (MySQL)


                             
Program transformation for data migration



       Because 


          we really like program transformations


          generally richer than regular data accessing languages (SQL)


          data migration = model transformation



                                      
User(                                User
       id(1),                               id       ::   integer
       name(“John”),                        name     ::   varchar
       email(“johnnyboy@mail.com”)          email    ::   tinytext
    )

    Page(                                Page
       id(2),                               id       :: integer
       title(“Hello World”),                title    :: varchar
       [author(1)]                          author    User
    )




                                      
User(                      <user>
       id(1),                      <id>1</id>
       name(“John”),               <name>John</name>
       email(“jb@m.com”)           <email>jb@m.com</email>
    )                          </user>

    Page(                      <page>
       id(2),                      <id>2</id>
       title(“Hello World”),       <title>Hello World</title>
       [author(1)]                 <authors>
    )                                   <author>1</author>
                                   </authors>
                               </page>



                                
Remove Attribute (1)

    User(                                          User
       id(1),                                         id       ::   integer
       name(“John”),                                  name     ::   varchar
       .....                                          email    ::   tinytext
    )

    Page(                                          Page
       id(2),                                         id       :: integer
       title(“Hello World”),                          title    :: varchar
       [author(1)]                                    author    User
    )




                               Signature:
                               User := Id * Name * Email
                                            
Remove Attribute (2)

    User(                                User
       id(1),                               id       ::   integer
       name(“John”),                        name     ::   varchar
       email(“johnnyboy@mail.com”)          email    ::   tinytext
    )

    Page(                                Page
       id(2),                               id       :: integer
       title(“Hello World”),                title    :: varchar
       [author(1)]                          author    User
    )




                                      
Generic Aterm (GTerm)

    User(   0,                            User
       [                                     id       ::   integer
            id(1),                           name     ::   varchar
            name(“John”),                    email    ::   tinytext
            email(“johnnyboy@mail.com”)
        ]
    )

    Page( 1,                              Page
       [                                     id       :: integer
          id(2),                             title    :: varchar
          title(“Hello World”),              author    User
          author(0)
       ]
    )
                                     
XMI

    User(   0,                              <user id='0'>
       [                                        <id>1</id>
            id(1),                              <name>John</name>
            name(“John”),                       <email>jb@m.com</email>
            email(“jb@m.com”)               </user>
        ]
    )

    Page( 1,                                <page id='1'>
       [                                        <id>2</id>
          id(2),                                <title>Hello World</title>
          title(“Hello World”),                 <author>0</author>
          author(0)                         </page>
       ]
    )
                                         
GTerm Characteristics



        Explicit references

        No nesting

        Implicit sets/lists

        Storage...
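
To make the first three characteristics concrete, here is a small sketch that flattens a nested object into GTerm-like triples. The nested input encoding and the function name `to_gterms` are assumptions for illustration, not the actual GTerm library. Ids are handed out root-first, so the numbering differs from the slides.

```python
def to_gterms(root):
    """Flatten a nested object into (id, type, [(attribute, value), ...]).

    `root` is a hypothetical (type, {attribute: value}) pair whose values are
    primitives, nested objects, or lists of either.
    """
    objects = []

    def visit(node):
        type_name, attrs = node
        obj_id = len(objects)
        flat = []
        objects.append((obj_id, type_name, flat))  # reserve the id first
        for name, value in attrs.items():
            values = value if isinstance(value, list) else [value]
            for item in values:                    # implicit sets/lists:
                if isinstance(item, tuple):        # one entry per element
                    flat.append((name, visit(item)))  # explicit reference
                else:
                    flat.append((name, item))      # primitive value
        return obj_id

    visit(root)
    return objects


page = ("Page", {
    "id": 2,
    "title": "Hello World",
    "author": [("User", {"id": 1, "name": "John"})],
})
gterms = to_gterms(page)
```

After flattening, no object contains another object, and a reference looks like any other integer value; the data model is what tells the two apart.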




                      
GTerm Transformation


    Gterm library
        Object creation
        Modifying attributes 
            (add, remove, change, rename, ...)
        Object equivalence
        Object traversals
        (Object graph traversals)

    Data model library
       Type examination
       Super/Sub type handling
       Abstract type handling


                        
GTerm Storage



    Large quantities of data...



    Storage engine:
        In memory - list based         ~10K
        In memory - hash table based   ~500K
        In database                    ~25M - ...




                         
GTerm Storage – In database (1)

    User( 0,                           0   User   id       1
       [                               0   User   name     John
          id(1),                       0   User   email    jb@m.com
          name(“John”),                1   Page   id       2
          email(“jb@m.com”)            1   Page   title    Hello World
       ]                               1   Page   author   0
    )
                                       0   User
    Page( 1,                           1   Page
       [
          id(2),
          title(“Hello World”),
          author(0)
       ]
    )
                                    
            GTerm Storage – In database (2)




    CREATE TABLE  Attributes (          CREATE TABLE  Objects (
       id    varchar(16),                  id    varchar(16),
       type  varchar(30),                  type  varchar(30)
       name  varchar(30),               )
       value text,
       INDEX USING HASH (id (5)),
       INDEX USING BTREE (value (10))
    )
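
A rough way to try this layout out is with Python's built-in `sqlite3` module. This is a sketch, not the presented implementation: SQLite stands in for MySQL, so the prefix lengths and the HASH/BTREE index types are dropped, and the `store` and `load` helpers are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Objects (id TEXT, type TEXT);
    CREATE TABLE Attributes (id TEXT, type TEXT, name TEXT, value TEXT);
    CREATE INDEX attributes_id ON Attributes (id);
    CREATE INDEX attributes_value ON Attributes (value);
""")


def store(conn, gterm):
    """Write one (id, type, [(attribute, value), ...]) object as rows."""
    obj_id, type_name, attrs = gterm
    conn.execute("INSERT INTO Objects VALUES (?, ?)", (str(obj_id), type_name))
    conn.executemany(
        "INSERT INTO Attributes VALUES (?, ?, ?, ?)",
        [(str(obj_id), type_name, name, str(value)) for name, value in attrs],
    )


def load(conn, obj_id):
    """Read an object back; every value comes back as a string."""
    row = conn.execute(
        "SELECT type FROM Objects WHERE id = ?", (str(obj_id),)
    ).fetchone()
    if row is None:
        return None
    attrs = conn.execute(
        "SELECT name, value FROM Attributes WHERE id = ? ORDER BY rowid",
        (str(obj_id),),
    ).fetchall()
    return (str(obj_id), row[0], attrs)


store(conn, (0, "User", [("id", 1), ("name", "John"), ("email", "jb@m.com")]))
store(conn, (1, "Page", [("id", 2), ("title", "Hello World"), ("author", 0)]))
page = load(conn, 1)
```

Because every attribute is a row of its own, adding or renaming an attribute never requires a schema change in the generic database.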




                                     
GTerm Performance




      Database indexes

      Stratego memory usage

      Parallel execution




                  
GTerm Storage – Regular database




    0   User   id       1                 User:
    0   User   name     John              1       John          jb@m.com
    0   User   email    jb@m.com
    1   Page   id       2                 Page:
    1   Page   title    Hello World       2       Hello World
    1   Page   author   0
                                          Page-User:
                                          2     1
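
The mapping from generic rows to regular tables can be sketched as a pivot. The `references` dictionary is an assumption standing in for what the data model would provide, and `to_regular` is a hypothetical helper, not the GTerm 2 SQL tool.

```python
from collections import defaultdict

# Rows of the generic Attributes table: (object id, type, name, value).
attributes = [
    ("0", "User", "id", "1"),
    ("0", "User", "name", "John"),
    ("0", "User", "email", "jb@m.com"),
    ("1", "Page", "id", "2"),
    ("1", "Page", "title", "Hello World"),
    ("1", "Page", "author", "0"),
]
# Assumed data model knowledge: which attributes hold references, and to what.
references = {("Page", "author"): "User"}


def to_regular(attributes, references):
    objects = defaultdict(dict)   # generic id -> {attribute: primitive value}
    types = {}                    # generic id -> type name
    links = []                    # (owner id, owner type, attribute, target id)
    for obj_id, type_name, name, value in attributes:
        types[obj_id] = type_name
        row = objects[obj_id]     # make sure every object gets a row
        if (type_name, name) in references:
            links.append((obj_id, type_name, name, value))
        else:
            row[name] = value
    tables = defaultdict(list)
    for obj_id, row in objects.items():
        tables[types[obj_id]].append(row)
    for obj_id, type_name, name, target in links:
        join_table = f"{type_name}-{references[(type_name, name)]}"
        # Join rows use the application-level ids, not the generic ones.
        tables[join_table].append((objects[obj_id]["id"], objects[target]["id"]))
    return dict(tables)


tables = to_regular(attributes, references)
```

Note how the generic reference `author 0` becomes the join row `2 1`: generic object ids disappear and the application-level `id` attributes take over.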




                                       
Data model




               SQL Script                          GTerm 2 SQL


Old Database                Generic Database                     SQL Script




                            Migration (Stratego)             New Database




                                     
DBLP       Researchr




            
<dblp>
    <incollection mdate="2002-01-03" key="books/acm/kim95/AnnevelinkACFHK95">
         <author>Jurgen Annevelink</author>
         <author>Rafiul Ahad</author>
         <author>Amelia Carlson</author>
         <author>Daniel H. Fishman</author>
         <author>Michael L. Heytens</author>
         <author>William Kent</author>
         <title>Object SQL - A Language for the Design and
                  Implementation of Object Databases.</title>
         <pages>42-68</pages>
         <year>1995</year>
         <booktitle>Modern Database Systems</booktitle>
         <url>db/books/collections/kim95.html#AnnevelinkACFHK95</url>
         <crossref>books/crc/KIM95</crossref>
    </incollection>

    ....
</dblp>


                                        
article {                                   book {
    key         :   string                      key         :   string
    title       :   string   *                  title       :   string   *
    author      :   string   *                  author      :   string   *
    editor      :   string   *                  editor      :   string   *
    booktitle   :   string   *                  booktitle   :   string   *
    pages       :   string   *                  pages       :   string   *
    address     :   string   *                  address     :   string   *
    journal     :   string   *                  journal     :   string   *
    volume      :   string   *                  volume      :   string   *
    number      :   string   *                  number      :   string   *
    publisher   :   string   *                  publisher   :   string   *
    crossref    :   string   *                  crossref    :   string   *
    series      :   string   *                  series      :   string   *
    school      :   string   *                  school      :   string   *
    chapter     :   string   *                  chapter     :   string   *
    month       :   string   *                  month       :   string   *
    year        :   string   *                  year        :   string   *
    url         :   string   *                  url         :   string   *
    note        :   string   *                  note        :   string   *
    mdate       :   string   *                  mdate       :   string   *
    cite        :   string   *                  cite        :   string   *
    ee          :   string   *                  ee          :   string   *
    cdrom       :   string   *                  cdrom       :   string   *
    isbn        :   string   *                  isbn        :   string   *
}                                           }


                                         
                                 ....
DBLP Data



    ~ 800,000           Authors

    ~ 1,200,000         Publications

    ~ 14,000,000        Lines of XML

    ~ 16,000,000        Database records




                     
Publication {                                  abstract PrintPublication - Publication
    key         :   string        Unique       {
    title       :   string                         pages           : string
    authors     -   Author        + Ind            publisher       : string    ?
    month       :   string                         firstpage       : int
    year        :   string                         lastpage        : int       ?
    dblpUrl     :   string        ?            }
    doi         :   string
    links       -   Link          *            Article - PrintPublication {
    abstract    :   string                         journalname : string
    note        :   string                         volumenumber: string
    annote      :   string                         issuenumber : string
    modified    :   date                       }
    modifiers   -   User          *
    cites       -   Publication   *            Alias {
    ee          :   string        ?                name           : string    Unique
    isbn        :   string                     }
    issn        :   string
    conflicts   :   bool                       AbstractAuthor {
}                                                  alias          - Alias
                                               }




                                            
Migration Approach




     Load objects into database


     9 Stages of migration


     Generate SQL




                   
Bridging
    Meta levels




       
     
article - PrintPublication {              Article - PrintPublication {
    reviewid   : string   ?                   reviewid   : string   ?
    rating     : string   ?                   rating     : string   ?
    journal    : string   ?                   journal    : string   ?
    volume     : string   ?                   volume     : string   ?
    number     : string   ?                   number     : string   ?
}                                         }




              ?Transformation(path, Substitution(newName), _)



                  renameType(|oldType, newName)



                        UPDATE At 
                        SET       t=newName 
                        WHERE     t = ...;
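
As a rough illustration of where a type rename ends up, the sketch below runs the corresponding updates against the generic tables using `sqlite3`. The full column names, the sample row, and the `rename_type` helper are assumptions; the slide abbreviates table and column names and elides the WHERE value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Objects (id TEXT, type TEXT)")
conn.execute("CREATE TABLE Attributes (id TEXT, type TEXT, name TEXT, value TEXT)")
conn.execute("INSERT INTO Objects VALUES ('7', 'article')")
conn.execute("INSERT INTO Attributes VALUES ('7', 'article', 'volume', '12')")


def rename_type(conn, old_type, new_name):
    """Rename a type in both generic tables inside one transaction."""
    with conn:
        for table in ("Attributes", "Objects"):
            conn.execute(
                f"UPDATE {table} SET type = ? WHERE type = ?",
                (new_name, old_type),
            )


rename_type(conn, "article", "Article")
```

The data itself is untouched: only the type column changes, which is why this kind of transformation suits a single database statement.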

                                      
Proceedings - Collection {                Proceedings - Collection {
    booktitle : string     ?                  conference : string    ?
}                                         }




              ?Transformation(path, Substitution(newName), _)



              renameAtt(|oldName, newName, type, dmodel)



             UPDATE At 
             SET       n = newName 
             WHERE     n = oldName
             AND       t = <getSubTypeQuery(|dmodel)> type;
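
The subtype condition matters because objects of a subtype carry the inherited attribute under their own type name. A minimal sketch of that idea follows; the `supertypes` table, the `WorkshopProceedings` subtype, and both helper functions are hypothetical, and `sqlite3` stands in for MySQL.

```python
import sqlite3

# Hypothetical data model: type -> supertype (None for roots).
supertypes = {
    "Collection": None,
    "Proceedings": "Collection",
    "WorkshopProceedings": "Proceedings",
    "Article": None,
}


def subtypes_of(type_name, supertypes):
    """Return `type_name` plus every type that transitively extends it."""
    result = [type_name]
    for candidate in supertypes:
        parent = supertypes[candidate]
        while parent is not None:
            if parent == type_name:
                result.append(candidate)
                break
            parent = supertypes.get(parent)
    return result


def rename_att(conn, old_name, new_name, type_name, supertypes):
    """Rename an attribute on a type and on all of its subtypes."""
    types = subtypes_of(type_name, supertypes)
    placeholders = ", ".join("?" for _ in types)
    with conn:
        conn.execute(
            f"UPDATE Attributes SET name = ? "
            f"WHERE name = ? AND type IN ({placeholders})",
            [new_name, old_name, *types],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Attributes (id TEXT, type TEXT, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO Attributes VALUES (?, ?, ?, ?)",
    [
        ("1", "Proceedings", "booktitle", "A"),
        ("2", "WorkshopProceedings", "booktitle", "B"),
        ("3", "Article", "booktitle", "C"),
    ],
)
rename_att(conn, "booktitle", "conference", "Proceedings", supertypes)
renamed = conn.execute("SELECT id, name FROM Attributes ORDER BY id").fetchall()
```

Rows of unrelated types that happen to use the same attribute name are left alone.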
                                      
abstract Thesis - Publication {                 abstract Thesis - Publication {
    school      : string                            school     : string
}                                                   type       : string   ?
                                                }




    ?Transformation(path, Addition(Att(Name(attName), PrimType(_), annotations)),_);
                       <getAttAnn(|"MinCard")> annotations; ?0




                                          ...



                                         ...




                                           
abstract Thesis - Publication {               abstract Thesis - Publication {
    school      : string                          school     : string
}                                                 type       : string
                                              }



       ?Transformation(
           path, 
           Addition(Att(Name(attName), PrimType(_), annotations)),
           _
       );
       <getAttAnn(|"MinCard")> annotations; ?low


       <addDefaultAttributesToType(|
           type, <make-int> low, attName, attType, mmodel
       )> model;


       onType(
            addDefaultAttributes(|type, nr, attName, attType)
       | type, mmodel)
                                           
onType



    1. find objects of type

    2. divide into chunks

    3. per chunk                     in parallel
        Load objects in chunk
        per object
            apply s
            save object if changed




                         
Publication {                              Publication {
    key     : string     Unique                key     : string    Unique
    title : string       ?                     title : string      ?
    authors : string     + Indexed             authors - Author    + Indexed
    year    : string                           year    : string
    ...                                        ...
}                                          }

                                           Author {
                                               alias   : string
                                           }


       ?Transformation(path, Substitution(DeclType(Name(newTypeName))), _)
       ...
       Author alias mandatory
       Author alias not unique


       onType(
           for each author
                create author object
                set author attribute
       )                                
Author {                                          Author {
    alias   : string                                  alias   : Alias
}                                                 }

                                                  Alias {
                                                      name    : string   Unique
                                                  }



       ?Transformation(path, Substitution(DeclType(Name(newTypeName))), _)
       ...
       Alias name mandatory
       Alias name unique


       onType(
           if alias exists then
                 set alias attribute to existing id
           else
                 create alias object
                 set alias attribute to new id
       )
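
Because alias names are unique, the per-object step has to reuse an existing Alias object when there is one. A small sketch of that logic on an in-memory store follows; the store layout and the function name `wrap_aliases` are assumptions, and the sample names are made up.

```python
def wrap_aliases(objects):
    """Replace each Author's alias string by a reference to an Alias object.

    `objects` maps object id -> (type, attribute dict) and is changed in place.
    """
    next_id = max(objects, default=-1) + 1
    # Index the Alias objects that already exist by their unique name.
    alias_ids = {
        attrs["name"]: obj_id
        for obj_id, (type_name, attrs) in objects.items()
        if type_name == "Alias"
    }
    for obj_id, (type_name, attrs) in list(objects.items()):
        if type_name != "Author":
            continue
        name = attrs["alias"]
        if name not in alias_ids:              # create alias object
            objects[next_id] = ("Alias", {"name": name})
            alias_ids[name] = next_id
            next_id += 1
        attrs["alias"] = alias_ids[name]       # point at existing or new id
    return objects


objects = {
    0: ("Author", {"alias": "B. Johnson"}),
    1: ("Author", {"alias": "Bob Johnson"}),
    2: ("Author", {"alias": "B. Johnson"}),
}
wrap_aliases(objects)
```

Authors 0 and 2 end up sharing one Alias object, which is exactly what the uniqueness annotation demands.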
                                               
Supported transformations

       Identity
       Primitive attribute addition (3)
       Complex attribute addition
       Attribute removal
       Attribute name change
       Attribute move (2)
       Primitive type change
       Implicit reference resolution
       Attribute wrapping
       Type addition
       Type removal
       Type name change
       Abstract type handling
       Inverse annotation handling (2)
       Cardinality changes (2)

                       
In memory vs. Database transformations



       Easy to define         Hard to define

       Easy to optimize       No need to optimize

       Expressive             Limited expressiveness

       Easy to abstract       Abstraction near impossible

       Performance OK         Performance great




                           
Detecting
    Evolution



                 
     
article - PrintPublication {           Article - PrintPublication {
    reviewid   : string   ?                reviewid   : string   ?
    rating     : string   ?                rating     : string   ?
    journal    : string   ?    diff?       journal    : string   ?
    volume     : string   ?                volume     : string   ?
    number     : string   ?                number     : string   ?
}                                      }




                                   
     
article - PrintPublication {            Article - PrintPublication {
    reviewid   : string   ?                 reviewid   : string   ?
    rating     : string   ?                 rating     : string   ?
    journal    : string   ?                 journal    : string   ?
    volume     : string   ?                 volume     : string   ?
    number     : string   ?                 number     : string   ?
}                                       }




 Entity(                                Entity(
     Name(“article”),                       Name(“Article”),
     Name(“PrintPublication”),              Name(“PrintPublication”),
     [                                      [
         Att(                                   Att(
             Name(“reviewid”),                      Name(“reviewid”),
             PrimType(                              PrimType(
                 Name(“string”)),                       Name(“string”)),
             [                                      [
                 MinCard(0),                            MinCard(0),
                 MaxCard(1)                             MaxCard(1)
             ]                                      ]
         ),                                     ),
         ...                                    ...
     ]                                      ]
 )                                      )
article - PrintPublication {                Article - PrintPublication {
    reviewid   : string   ?                     reviewid   : string   ?
    rating     : string   ?                     rating     : string   ?
    journal    : string   ?                     journal    : string   ?
    volume     : string   ?                     volume     : string   ?
    number     : string   ?                     number     : string   ?
}                                           }




 Entity(                                    Entity(
     Name(“article”),                           Name(“Article”),
     Name(“PrintPublication”),                  Name(“PrintPublication”),
     [                                          [
         Att(                                       Att(
             Name(“reviewid”),                          Name(“reviewid”),
             PrimType(                                  PrimType(
                 Name(“string”)),   diff?                   Name(“string”)),
             [                                          [
                 MinCard(0),                                MinCard(0),
                 MaxCard(1)                                 MaxCard(1)
             ]                                          ]
         ),                                         ),
         ...                                        ...
     ]                                          ]
 )                                          )
article - PrintPublication {                 Article - PrintPublication {
    reviewid   : string   ?                      reviewid   : string   ?
    rating     : string   ?                      rating     : string   ?
    journal    : string   ?                      journal    : string   ?
    volume     : string   ?                      volume     : string   ?
    number     : string   ?                      number     : string   ?
}                                            }




                     What happened?

             Removed type article;         Added type Article

             Added type article;           Removed type Article

             Substituted article name with Article


                                        
Article - PrintPublication {                 Article - PrintPublication {
    reviewid   : string   ?                      reviewid   : string   ?
    rating     : string   ?                      rate       : string
    journal    : string   ?                      journal    : string   ?
    volume     : string   ?                      volume     : string   ?
    number     : string                          nr         : string   ?
}                                            }




                     What happened?

           renamed rating;                   renamed number;
           changed rating cardinality;       changed number cardinality




                                          
Article - PrintPublication {                 Article - PrintPublication {
    reviewid   : string   ?                      reviewid   : string   ?
    rating     : string   ?                      rate       : string
    journal    : string   ?                      journal    : string   ?
    volume     : string   ?                      volume     : string   ?
    number     : string                          nr         : string   ?
}                                            }




                     What happened?

           renamed rating;                   renamed number;
           changed rating cardinality;       changed number cardinality


           swapped rating and number; 
           renamed rating;                   renamed number

                                          
Article - PrintPublication {                 Article - PrintPublication {
    reviewid   : string   ?                      reviewid   : string   ?
    rating     : string   ?                      rate       : string
    journal    : string   ?                      journal    : string   ?
    volume     : string   ?                      volume     : string   ?
    number     : string                          nr         : string   ?
}                                            }




                      What happened?

           renamed rating;                   renamed number;
           changed rating cardinality;       changed number cardinality


           swapped rating and number; 
           renamed rating;                   renamed number

           deleted Article;                  added Article
Weighing transformations




      Addition:       0.8 * relativeSize^4
      Removal:        0.5 * relativeSize^4
      Substitution:   0.4 * relativeSize^6




      Custom weights:
         Type removal
         Type substitution         (0)
         Attribute substitution    (0)
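
A possible reading of these weights as code, with loud caveats: the 4 and 6 after relativeSize are taken to be exponents, relativeSize is assumed to be a fraction between 0 and 1 (the size of the changed part relative to the whole model), and the function and table names are hypothetical. The custom weight for type removal is not given, so it is left out.

```python
# (factor, exponent) per kind of transformation, as read from the slide.
BASE = {
    "addition": (0.8, 4),
    "removal": (0.5, 4),
    "substitution": (0.4, 6),
}
# Custom weights that override the formula: renames are free.
CUSTOM = {
    ("substitution", "type"): 0.0,
    ("substitution", "attribute"): 0.0,
}


def weight(kind, relative_size, target=None):
    """Weight of one transformation; `target` selects a custom weight."""
    if (kind, target) in CUSTOM:
        return CUSTOM[(kind, target)]
    factor, exponent = BASE[kind]
    return factor * relative_size ** exponent


small_addition = weight("addition", 0.5)
whole_model_removal = weight("removal", 1.0)
type_rename = weight("substitution", 1.0, target="type")
```

Under this reading the high exponents make small local changes almost free, while replacing large parts of the model is heavily penalised.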


                        
Try them all!




    On both versions at the same time


    Bound on weight
       Increasing bound


    Weight computation caching
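
The bounded search can be illustrated on a deliberately tiny problem: one entity, viewed as a set of (attribute, type) pairs. Everything here is an assumption made for the sketch: the integer weights (in tenths, with a rename costing 1), the move set, and the function names. It rewrites only the old version toward the new one, and it omits weight caching.

```python
ADD, REMOVE, RENAME = 8, 5, 1   # assumed weights, in tenths


def moves(state, target):
    """Yield (weight, description, next_state); each move shrinks the diff."""
    extra, missing = state - target, target - state
    for name, type_ in sorted(extra):
        for new_name, new_type in sorted(missing):
            if new_type == type_:   # a rename keeps the type
                yield (RENAME, f"rename {name} to {new_name}",
                       (state - {(name, type_)}) | {(new_name, type_)})
        yield REMOVE, f"remove {name}", state - {(name, type_)}
    for name, type_ in sorted(missing):
        yield ADD, f"add {name}", state | {(name, type_)}


def search(state, target, bound):
    """Depth-first search for a sequence with total weight <= bound."""
    if state == target:
        return []
    for cost, step, next_state in moves(state, target):
        if cost <= bound:
            rest = search(next_state, target, bound - cost)
            if rest is not None:
                return [step] + rest
    return None


def detect(old, new, limit=100):
    """Try increasing bounds until some sequence explains the difference."""
    old, new = frozenset(old.items()), frozenset(new.items())
    for bound in range(limit + 1):
        found = search(old, new, bound)
        if found is not None:
            return found
    return None


renamed = detect({"booktitle": "string", "year": "string"},
                 {"conference": "string", "year": "string"})
added = detect({"a": "int"}, {"a": "int", "b": "int"})
```

Raising the bound one unit at a time means the first sequence found has the lowest total weight, so the cheap rename wins over a removal followed by an addition.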




                    
     
Heterogeneous
    Coupled Evolution



       
     
     
     
Horizontal Generalization




                 
User
       name       ::   varchar
       realName   ::   varchar
       email      ::   tinytext           Entities

                                        Properties
    Page
       title      :: varchar                  Types
       author      User
       isRedirect :: boolean




                       Meta model / Grammar

                                   
     
Lists

        More types

    Inverse associations

      Abstract types




             
Vertical Generalization
               
Heterogeneous 
         Coupled Evolution 
                 of
        Software Languages




     
The ingredients


         Software Language 
         Definition Formalism




         Evolving
         Software Language 




         Software 
     
Diverse
    Evolution




         What did we generalize?



         Why did we generalize?




                     
A Generic
    Architecture



      
     
     
     
Input



    Coupled evolution scenario




    Mapping from M_i to M_{i+1}

                                  
Output

                 Domain Specific 
             Transformation Language 
                      (DSTL)



             Transformation Interpreter




                Software Migration


        
SDF              SDF




          Stratego




              
Entity*            -> DataM    Model
    Id "{" Prop* "}"   -> Entity   Entity
    Id "::" Type       -> Prop     Prop
    "int"              -> Type     Int
    "bool"             -> Type     Bool
    Id                 -> Type  
    "set of" Type      -> Type     Set
    NAME               -> Id       Id




                         
Lists                               [...]

    Entity*          -> DataM       Model




    "add" Entity    -> LocalTransformation
    "remove"        -> LocalTransformation



                       
Lexicals                             ''...''

    NAME             -> Id          Id




    "substitute" NAME -> LocalTransformation




                       
Multiple productions                  ->*

    "int"            -> Type        Int
    "bool"           -> Type        Bool
    Id               -> Type  
    "set of" Type    -> Type        Set



    "substitute" Type -> LocalTransformation
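
    The three derivation rules shown on the last slides (lists,
    lexicals, multiple productions) can be sketched as one pass over
    the grammar. This is a reading of the slides, not the actual
    generator: the pair encoding of productions, the LEXICALS set and
    the function name derive_local_transformations are assumptions.

```python
from collections import Counter

# (symbols, sort) pairs transcribed from the data model grammar on the slides.
GRAMMAR = [
    (["Entity*"], "DataM"),
    (["Id", '"{"', "Prop*", '"}"'], "Entity"),
    (["Id", '"::"', "Type"], "Prop"),
    (['"int"'], "Type"),
    (['"bool"'], "Type"),
    (["Id"], "Type"),
    (['"set of"', "Type"], "Type"),
    (["NAME"], "Id"),
]
LEXICALS = {"NAME"}  # assumed: lexical sorts are known up front


def derive_local_transformations(grammar, lexicals):
    """Derive DSTL productions from a grammar, one rule per slide."""
    derived = []

    def emit(rule):
        if rule not in derived:  # keep first occurrence, drop duplicates
            derived.append(rule)

    productions_per_sort = Counter(sort for _, sort in grammar)
    for symbols, sort in grammar:
        for symbol in symbols:
            if symbol.endswith("*"):
                # Lists: elements can be added and removed.
                emit(f'"add" {symbol[:-1]} -> LocalTransformation')
                emit('"remove" -> LocalTransformation')
            elif symbol in lexicals:
                # Lexicals: a name can be substituted by another name.
                emit(f'"substitute" {symbol} -> LocalTransformation')
        if productions_per_sort[sort] > 1:
            # Multiple productions: one alternative can replace another.
            emit(f'"substitute" {sort} -> LocalTransformation')
    return derived


dstl = derive_local_transformations(GRAMMAR, LEXICALS)
for rule in dstl:
    print(rule)
```

    Besides the productions shown on the slides, the sketch also
    yields "add" Prop, since Prop* is a list as well.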




                       
Type checking                                 .../...


    "at" APath LocalTransformation -> Transformation




                 Generation of local transformation domains
                                        APath type derivation
                                              Type checking
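
    A rough sketch of the check this slide describes: derive the sort
    an APath points at, look up which local transformations are
    allowed there, and reject anything else. Paths are simplified to
    lists of sort names, and the CHILDREN table, the domain rules and
    all function names are assumptions made for the illustration.

```python
# Sort -> sorts that occur directly below it in the data model grammar.
CHILDREN = {
    "DataM": {"Entity"},
    "Entity": {"Id", "Prop"},
    "Prop": {"Id", "Type"},
    "Type": {"Id", "Type"},
    "Id": set(),
}
LIST_ELEMENTS = {"Entity", "Prop"}  # sorts that occur as X* in the grammar


def derive_type(path, root="DataM"):
    """APath type derivation: the sort reached by following the path."""
    current = root
    for step in path:
        if step not in CHILDREN[current]:
            raise TypeError(f"{current} has no child of sort {step}")
        current = step
    return current


def domain(sort):
    """Local transformations assumed to be valid at a node of this sort."""
    allowed = set()
    if sort in LIST_ELEMENTS:
        allowed |= {("add", sort), ("remove", None)}
    if sort == "Type":
        allowed.add(("substitute", "Type"))
    if sort == "Id":
        allowed.add(("substitute", "NAME"))
    return allowed


def check(path, local):
    """Type check 'at <path> <local>'."""
    return local in domain(derive_type(path))


ok = check(["Entity", "Prop"], ("add", "Prop"))             # add a property
bad = check(["Entity", "Prop", "Type"], ("add", "Prop"))    # wrong position
```

    Here ok is True and bad is False: a property can be added at a
    property position, but not where a type is expected.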

                               
Larger grammars




               Transform


               Constant




                    
     
Interpreter generation




                             Transformations library

                             Generic DSTL constructs

                             APath evaluation
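
    What a generated interpreter has to do can be sketched in a few
    lines: evaluate the path to a position in the model, then apply
    the local transformation there. The dictionary encoding of the
    model, the (sort, name) path steps and the functions evaluate and
    apply are hypothetical stand-ins, not the Stratego library.

```python
import copy

# A small data model: entities with an id and a list of properties.
MODEL = [
    {"id": "User", "props": [{"id": "name", "type": "varchar"}]},
    {"id": "Page", "props": [{"id": "title", "type": "varchar"},
                             {"id": "isRedirect", "type": "boolean"}]},
]


def evaluate(model, path):
    """Resolve a non-empty [(sort, name), ...] path to (list, index)."""
    container = model
    index = None
    for depth, (sort, name) in enumerate(path):
        index = next((i for i, node in enumerate(container)
                      if node["id"] == name), None)
        if index is None:
            raise LookupError(f"no {sort} named {name}")
        if depth < len(path) - 1:
            container = container[index]["props"]  # descend one level
    return container, index


def apply(model, path, local):
    """Apply one local transformation and return the changed copy."""
    result = copy.deepcopy(model)  # leave the input model untouched
    container, index = evaluate(result, path)
    kind = local[0]
    if kind == "add":
        container.insert(index + 1, local[1])  # add after the position
    elif kind == "remove":
        del container[index]
    elif kind == "substitute":
        container[index]["id"] = local[1]      # substitute the name
    else:
        raise ValueError(f"unknown transformation {kind}")
    return result


# at Entity Page / Property title add counter :: biginteger
path = [("Entity", "Page"), ("Prop", "title")]
evolved = apply(MODEL, path, ("add", {"id": "counter", "type": "biginteger"}))
page_props = [p["id"] for p in evolved[1]["props"]]
```

    The path fixes where the change happens and the local
    transformation fixes what happens, which mirrors the split between
    APath evaluation and the transformations library.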
                      
A Generic
    Architecture




                    
Software 
                                Language 
                                Evolution


                                  
    This research was supported by NWO/JACQUARD project
     638.001.610, MoDSE: Model-Driven Software Evolution.

Model Driven Software Development - Data Model Evolution
