27th of October 2016
Piotr Zakrzewski – The Hyve
TranSMART Pro 17.1 project
Technical Overview
What does 17.1 mean for future development?
Improved ease of development
● Cleanup of repositories (single repo)
● One-step build
● Dependency updates
● REST API improvements
● Consolidation and extension of the star schema to better fit tranSMART and new data types
● Documentation
Repository Structure
[Diagram: "Before you can deploy it here ... you need all of these ... and these ..." Labels shown: core-api, core-db, rest-api, R modules, transmart data, legacy db]
Repository Structure
16.2:
- tranSMART 16.2 spans 10 core repositories
- Building and testing tranSMART requires a special setup (which resides in yet another repository)
17.1:
- Single repository with all core components necessary for building a working tranSMART WAR file
Versioning of Artifacts
16.2:
- Most components are versioned as SNAPSHOTs
- core-api, core-db, rest-api, transmartApp and all other core components must match strictly in revision in order to work
17.1:
- Single repository: all changes to different components come in a single PR
Build Process
16.2:
- tranSMART 16.2 (Grails 2) uses Gant scripts for building
- git-repo is used for fetching all repositories
- A custom Groovy script (dependency manager) is needed for dev setup
17.1:
- Gradle build system (comes with Grails 3)
- One-step build (also with database setup)
- Just git clone && ./gradlew build
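As a rough illustration of the one-step build, the two commands from the slide can be scripted as below. This is only a sketch: the repository URL and checkout directory are hypothetical placeholders (the slide does not name them), and the helper names are invented for this example.

```python
import subprocess
from typing import List


def build_commands(repo_url: str, checkout_dir: str) -> List[List[str]]:
    """Return the two commands of the one-step build as argument lists."""
    return [
        ["git", "clone", repo_url, checkout_dir],  # fetch the single repository
        ["./gradlew", "build"],                    # Gradle wrapper build, run inside the checkout
    ]


def run_build(repo_url: str, checkout_dir: str) -> None:
    """Clone the repository and build it; raises CalledProcessError on failure."""
    clone, build = build_commands(repo_url, checkout_dir)
    subprocess.run(clone, check=True)
    subprocess.run(build, cwd=checkout_dir, check=True)


if __name__ == "__main__":
    # Placeholder URL (an assumption): replace with the real repository location.
    for command in build_commands("https://example.org/transmart.git", "transmart"):
        print(" ".join(command))
```

The demo only prints the commands; calling run_build would actually execute them.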
Test Setup
16.2:
- Custom script matching branches during the Travis run
- Different ways to run tests locally and on Travis
- No reliable way to run tests for all components
- Tested on an H2 in-memory database
17.1:
- ./gradlew test both locally and on Travis
- Tested against Oracle and Postgres
Gradle
- Default option for Grails 3.x
- Very versatile build system
- Also very popular (gained momentum due to adoption by Android)
- Especially suitable for multi-project, multi-language builds like tranSMART
Java 7 to Java 8
tranSMART is still running on Java 7, which has reached its end of life and has not been supported, even with security updates, since April 2015.
Groovy 2.4 and Grails 3
- Java 8 supports invokedynamic, which should increase the performance of many Groovy dynamic calls
- Many workarounds for bugs in old Grails and Hibernate versions are no longer necessary
- The upgrade allowed us to adopt a better build system: Gradle
REST API versioning
● The tranSMART REST API is used in production
● Several clients and third-party apps depend on it
● But development needs to continue …
REST API versioning
- REST API versioning is introduced in 17.1
- Versioning is done at the URL level
- GET /studies becomes GET /v1/studies
- Only a minor impact on existing clients (the base URL configuration changes to include the version)
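The client-side effect of URL-level versioning can be sketched as follows. The host name and the helper function are assumptions made for illustration; only the /studies and /v1/studies paths come from the slide. The point is that a client which builds its URLs from a configured base URL only needs that one setting changed.

```python
from urllib.parse import urljoin

# Hypothetical client configuration; the host name is a placeholder.
UNVERSIONED_BASE_URL = "https://transmart.example.org/"
VERSIONED_BASE_URL = "https://transmart.example.org/v1/"


def endpoint(base_url: str, resource: str) -> str:
    """Join a base URL and a resource path such as 'studies'."""
    # Normalise slashes so that a version segment in the base URL is preserved.
    return urljoin(base_url.rstrip("/") + "/", resource.lstrip("/"))


if __name__ == "__main__":
    print(endpoint(UNVERSIONED_BASE_URL, "/studies"))  # pre-17.1 style URL
    print(endpoint(VERSIONED_BASE_URL, "/studies"))    # 17.1 style URL with version prefix
```

Keeping the version in the base URL rather than in each resource path means the rest of the client code stays unchanged.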
Current REST API documentation
OpenAPI (previously Swagger)
DB schema as of now (16.2)
Some facts about the current schema:
- A study exists only as string IDs sprinkled around the star schema (there is no table for study)
- Concepts and patients belong to a study (they cannot be shared)
- A patient-concept combination yields a single observation
DB schema of 17.1
Most important consequences of the 17.1 changes:
- Concepts and patients can be shared between studies
- More straightforward cross-trial comparison (trial-visit dimension) and longitudinal data (start date) support
- Much redundancy and many inconsistencies removed
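To make the consequences concrete, here is a toy in-memory model. It is not the actual 17.1 schema: every class and field name is hypothetical. It only shows the idea that a study becomes an entity of its own, that an observation is tied to a study through a trial visit, and that the same patient and concept can therefore appear in more than one study.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Study:
    study_id: str  # a study is its own entity, not a string repeated across tables


@dataclass(frozen=True)
class TrialVisit:
    study: Study
    label: str  # e.g. "Baseline"; the visit links an observation to a study


@dataclass(frozen=True)
class Observation:
    patient_id: int
    concept_code: str  # concepts are not owned by a single study
    trial_visit: TrialVisit
    start_date: Optional[date]  # supports longitudinal data
    value: float


def studies_with_concept(observations: Iterable[Observation], concept_code: str) -> List[str]:
    """Return the sorted ids of all studies that have observations for a concept."""
    return sorted({o.trial_visit.study.study_id
                   for o in observations if o.concept_code == concept_code})


if __name__ == "__main__":
    visit_a = TrialVisit(Study("STUDY_A"), "Baseline")
    visit_b = TrialVisit(Study("STUDY_B"), "Baseline")
    data = [
        Observation(1, "HEART_RATE", visit_a, date(2016, 1, 10), 72.0),
        Observation(1, "HEART_RATE", visit_b, date(2016, 6, 2), 75.0),
    ]
    print(studies_with_concept(data, "HEART_RATE"))
```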
Hypercube
- The introduction of longitudinal data requires a completely different approach
- Modifiers are used to store the time point; both relative and absolute time points are allowed
- Each observation effectively has an additional dimension (hence the Hypercube)
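The extra dimension can be pictured with a small sketch in which each value is addressed by patient, concept and time point. This is a toy model, not the 17.1 implementation; the class and method names are invented, and a relative time point is represented here as a plain label while an absolute one is a date.

```python
from datetime import date
from typing import Dict, List, Tuple, Union

TimePoint = Union[str, date]  # relative label (e.g. "Week 4") or absolute date
Key = Tuple[int, str, TimePoint]


class Hypercube:
    """Toy store of values addressed by (patient, concept, time point)."""

    def __init__(self) -> None:
        self._cells: Dict[Key, float] = {}

    def add(self, patient_id: int, concept: str, time_point: TimePoint, value: float) -> None:
        # Without the time point, a second value for the same patient and
        # concept would overwrite the first one.
        self._cells[(patient_id, concept, time_point)] = value

    def series(self, patient_id: int, concept: str) -> List[Tuple[TimePoint, float]]:
        """Return (time point, value) pairs for one patient and concept, in insertion order."""
        return [(key[2], value) for key, value in self._cells.items()
                if key[0] == patient_id and key[1] == concept]


if __name__ == "__main__":
    cube = Hypercube()
    cube.add(1, "WEIGHT", "Baseline", 80.0)
    cube.add(1, "WEIGHT", "Week 4", 78.5)
    cube.add(1, "WEIGHT", date(2016, 10, 27), 78.0)
    for time_point, value in cube.series(1, "WEIGHT"):
        print(time_point, value)
```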
How to query a Hypercube?
Impact on backwards compatibility
- The old UI will work only with old data; new data (especially longitudinal) will not be supported
- The old UI will not make use of the new cross-trial functionality
- A migration path will be provided between 16.2 and 17.1
The new UI, however, will support longitudinal data and other features
Documentation
- One of the project deliverables is documentation of the database schema
- REST API documented with OpenAPI
- Documentation is part of the git repository
Conclusion
Aside from many new features, 17.1 is also a major cleanup that will make future development easier
tranSMART 17.1 technical overview
Backup slides
Arvados Keep
Performance Benchmarks
- Goal: safeguarding the performance of the REST API
- Implemented as a Gradle task (single command)
- Should help developers spot drops in performance after new changes
- A reference setup on Amazon will be available to make benchmarks comparable
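The idea behind such a benchmark can be sketched in a few lines. This is not the project's Gradle task, which the slides do not show. The function names and the 20% tolerance are assumptions chosen for illustration, and the timed call is a stand-in for a real REST request.

```python
import time
from statistics import median
from typing import Callable, List


def benchmark(call: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock duration of `call` in seconds."""
    durations: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return median(durations)


def is_regression(current: float, baseline: float, tolerance: float = 0.2) -> bool:
    """True when `current` is more than `tolerance` (as a fraction) slower than `baseline`."""
    return current > baseline * (1.0 + tolerance)


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would issue a REST call here.
    duration = benchmark(lambda: sum(range(100_000)))
    print(f"median duration: {duration:.6f} s")
```

Comparing against a stored baseline is only meaningful when both runs use the same hardware, which is why the slide mentions a reference setup.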
Other changes
- Support for multiple observations per concept-patient combination
- Categorical variables are no longer loaded per value (e.g. the variable Treated being loaded as two variables: yes and no)
- Several new tables to accommodate a new HDD data type (RNAseq measurement per transcript) and a table to store generic links to external resources (files)
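The change for categorical variables can be illustrated with a small conversion between the two representations. The data layout and function name are hypothetical and do not mirror the actual loader; the sketch only contrasts one variable per value with a single variable holding the value.

```python
from typing import Dict, Set


def merge_per_value_variables(per_value: Dict[str, Set[int]]) -> Dict[int, str]:
    """Collapse {value: patient ids} into {patient id: value}."""
    merged: Dict[int, str] = {}
    for value, patients in per_value.items():
        for patient_id in patients:
            if patient_id in merged:
                # A single-valued categorical variable cannot hold two values.
                raise ValueError(f"patient {patient_id} has more than one value")
            merged[patient_id] = value
    return merged


if __name__ == "__main__":
    # Old style: "Treated" loaded as two variables, one per value.
    treated_per_value = {"yes": {1, 3}, "no": {2}}
    # New style: one variable "Treated" with a value per patient.
    print(merge_per_value_variables(treated_per_value))
```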
