CLINICAL DATA MANAGEMENT
FAVEENNA SUKUMARAN
Introduction to Database Design
Study Setup Steps in Database Design
Database creation
Design Issues in Database Design
Database Normalization
• Normal Forms
• Relational Database Management System (RDBMS)
Building the Database from the Design
Documentation and Testing
Data Entry Applications
Electronic Transfer Programs
WHAT IS A DATABASE?
A database is a system for storing and organizing data.
INITIAL DATA CAPTURE
Manual: Data collected by hand (e.g., paper forms, written notes).
Electronic: Data input via electronic means like software applications.
DATA COLLECTION METHODS
Paper forms (e.g., CRFs or Case Report Forms).
Central systems (e.g., laboratory instruments).
Direct Entry (e.g., a computer entry screen at a study site).
DATABASE CREATION
A database can be created using different tools, such as Excel,
Microsoft Access, or SAS tables.
The goal is to create an organized structure for storing data, and
the database must be tested before it goes live.
DESIGN ISSUES IN DATABASE DESIGN
Main Purpose of Database Design: The primary goal of a database is to
store data accurately and ensure it is easy to retrieve and analyze.
Data Capture Instruments: Data entry tools (e.g., CRF forms, electronic
screens) should be designed for:
Clarity: Data entry should be straightforward.
Efficiency: Data entry should be quick and reliable.
Analysis: The data should be easy to analyze for study purposes.
COMMON FIELDS IN CLINICAL DATA
There are various types of fields that need special consideration during database design, including:
Text Fields
Date Fields (handling both complete and incomplete dates).
Special Integers (e.g., patient numbers, document IDs).
Calculated Values (e.g., age, weight).
Single Checkboxes (for yes/no responses).
TEXT FIELDS AND ANNOTATIONS
In clinical databases, text fields can store various types of information. These can range from short yes/no
answers to longer free-text comments and even detailed medical annotations. These might include:
TEXT FIELDS
• Categorical Values
• Free Text
• Long Comments
• Header Information
DATES IN CLINICAL DATA
Complete Dates
Incomplete Dates
Historical Dates
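One possible way to hold both complete and incomplete dates is to store the year, month, and day as separate components, with the unknown parts left empty. The sketch below is only an illustration: the `PartialDate` class and the `UNK` placeholder are assumptions and are not taken from these slides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """A date whose month and/or day may be unknown (hypothetical helper)."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        # A day without a month cannot be interpreted.
        if self.day is not None and self.month is None:
            raise ValueError("day given without month")
        # Validate the known parts by building a real date,
        # substituting 1 for any missing component.
        month = 1 if self.month is None else self.month
        day = 1 if self.day is None else self.day
        date(self.year, month, day)

    @property
    def is_complete(self) -> bool:
        return self.month is not None and self.day is not None

    def as_text(self) -> str:
        """Render as YYYY-MM-DD, writing UNK for missing parts."""
        mm = f"{self.month:02d}" if self.month is not None else "UNK"
        dd = f"{self.day:02d}" if self.day is not None else "UNK"
        return f"{self.year:04d}-{mm}-{dd}"


visit = PartialDate(2023, 5, 14)   # complete date
onset = PartialDate(2019, 11)      # day unknown
history = PartialDate(1987)        # only the year is known
```

Keeping the components separate means an incomplete historical date can still be stored and sorted by year without inventing a day or month.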
HIDDEN TEXT IN NUMERIC FIELDS
A common challenge is when text appears in fields that should only have numbers.
Here are several ways to handle it:
Allowing Both Text and Numbers in a Single Field
Using Numeric Fields but Flagging Discrepancies if Text Appears
Storing Both Text and Numeric Versions Separately
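Flagging text in a numeric field and storing the text and numeric versions separately can be combined. The following is a minimal sketch under that assumption; `StoredValue` and `store_numeric_field` are hypothetical names used only for illustration.

```python
import math
from typing import NamedTuple, Optional


class StoredValue(NamedTuple):
    raw_text: str             # the value as entered, with outer spaces removed
    numeric: Optional[float]  # parsed number, or None if not numeric
    discrepancy: bool         # True when the entry needs review


def store_numeric_field(raw: str) -> StoredValue:
    """Keep the original text and, where possible, a numeric version."""
    text = raw.strip()
    try:
        number = float(text)
    except ValueError:
        # Entries such as "<0.5" or "trace" are kept as text and flagged.
        return StoredValue(text, None, True)
    if not math.isfinite(number):
        # float() accepts "nan" and "inf"; treat these as text as well.
        return StoredValue(text, None, True)
    return StoredValue(text, number, False)


ok = store_numeric_field(" 7.2 ")
flagged = store_numeric_field("<0.5")
```

Here `ok.numeric` can be used for analysis, while `flagged` keeps the original entry so that a reviewer can resolve it later.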
DATABASE NORMALIZATION
Normalization is the process of organizing data in the database to reduce redundancy and improve data
retrieval.
The goal is to:
• Eliminate redundancy (e.g., storing the same data in multiple places).
• Ensure data is stored properly in different tables.
• Simplify the database structure.
Normal Forms
There are several levels of normalization to ensure the database is efficient and accurate:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
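As a rough illustration of removing redundancy, the sketch below splits a flat table, in which the site is repeated on every visit row, into a patient table and a visit table. The site depends only on the patient and not on the visit, which is the kind of partial dependency that second normal form is meant to remove. The data and the `normalize` function are invented for this example.

```python
# Flat, redundant layout: the site is repeated for every visit of a patient.
flat_rows = [
    {"patient_id": 101, "site": "Site A", "visit": 1, "weight_kg": 70.5},
    {"patient_id": 101, "site": "Site A", "visit": 2, "weight_kg": 71.0},
    {"patient_id": 102, "site": "Site B", "visit": 1, "weight_kg": 64.2},
]


def normalize(rows):
    """Split flat rows into a patient table and a visit table."""
    patients = {}  # patient_id -> patient-level attributes, stored once
    visits = []    # one row per (patient_id, visit)
    for row in rows:
        patients[row["patient_id"]] = {"site": row["site"]}
        visits.append(
            {
                "patient_id": row["patient_id"],
                "visit": row["visit"],
                "weight_kg": row["weight_kg"],
            }
        )
    return patients, visits


patients, visits = normalize(flat_rows)
```

After the split, a change of site for a patient is made in one place instead of on every visit row.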
RELATIONAL DATABASE MANAGEMENT SYSTEM (RDBMS)
• In relational databases, data is stored in tables.
• The relationships between these tables are established using primary and foreign keys.
Primary Key: A field that uniquely identifies each record in a table.
Foreign Key: A field in one table that refers to the primary key in another table.
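The sketch below shows a primary key and a foreign key using Python's built-in `sqlite3` module. The table and column names are assumptions chosen for the example, and SQLite stands in for whatever RDBMS a study actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE patient ("
    " patient_id INTEGER PRIMARY KEY,"  # uniquely identifies each patient
    " initials TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE visit ("
    " visit_id INTEGER PRIMARY KEY,"
    " patient_id INTEGER NOT NULL REFERENCES patient(patient_id),"  # foreign key
    " visit_date TEXT NOT NULL)"
)

conn.execute("INSERT INTO patient VALUES (101, 'AB')")
conn.execute("INSERT INTO visit VALUES (1, 101, '2023-05-14')")

# A visit that points at a patient who does not exist should be refused.
try:
    conn.execute("INSERT INTO visit VALUES (2, 999, '2023-05-15')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

With the foreign key in place, the database itself refuses a visit record that has no matching patient, so this check does not depend on the data entry application.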
BUILDING THE DATABASE FROM THE DESIGN
Once the design is finalized, the next step is to implement it by:
Implementing the Design Document: This includes creating the tables, defining the
relationships between them, and setting up keys.
Ensuring Requirements are Met: Confirm that the database meets all functional and
non-functional requirements.
Testing the Database: Before the database is fully operational, test it to ensure it
behaves as expected, using test cases that validate each part of the design.
DOCUMENTATION AND TESTING
Database Design Document
Testing the Database
Technical Documentation
DATA ENTRY METHODS
Single pass entry
Double pass entry
OCR (optical character recognition)
Remote data entry
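Double pass entry is usually understood as two independent entries of the same form, followed by a comparison of the two. Assuming that reading, a comparison step might look like the sketch below; `compare_passes` and the field names are hypothetical.

```python
def compare_passes(first, second):
    """Return the fields whose values differ between two entry passes."""
    mismatches = {}
    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a != b:
            # Keep both values so a reviewer can check them against the form.
            mismatches[field] = (a, b)
    return mismatches


first_pass = {"patient_id": "101", "weight_kg": "70.5", "visit_date": "2023-05-14"}
second_pass = {"patient_id": "101", "weight_kg": "75.0", "visit_date": "2023-05-14"}

differences = compare_passes(first_pass, second_pass)
```

Only the fields that disagree are returned, so the follow-up work is limited to the values that need to be checked against the source form.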
DATA ENTRY SCREENS
Heads-down entry staff
Heads-up entry staff
Data coordinators
ELECTRONIC TRANSFER PROGRAMS
In clinical studies, sometimes data is collected electronically (for example, via electronic health records or
digital forms) and transferred directly to a central database.
This requires careful planning and management to ensure the data is accurate, complete, and secure.
Data Tracking
Cleaning Checks
Handling Edits and Updates
Handling Data That Cannot Be Loaded
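One plausible way to deal with transferred records that cannot be loaded is to set them aside with a reason instead of dropping them. The sketch below assumes a CSV transfer file with one record per line; the column names in `REQUIRED` and the `load_transfer` function are assumptions made for illustration.

```python
import csv
import io

# Hypothetical columns that every transferred record must contain.
REQUIRED = ("patient_id", "test_code", "result")


def load_transfer(csv_text):
    """Split transferred records into loadable rows and rejected rows."""
    loaded, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so the first data record is on line 2.
    for line_no, row in enumerate(reader, start=2):
        missing = [name for name in REQUIRED if not (row.get(name) or "").strip()]
        if missing:
            rejected.append((line_no, "missing " + ", ".join(missing)))
        else:
            loaded.append(row)
    return loaded, rejected


sample = "patient_id,test_code,result\n101,HGB,13.4\n,HGB,12.9\n102,GLU,\n"
loaded, rejected = load_transfer(sample)
```

The rejected list records the line number and the reason, which gives data management something concrete to track and send back to the data provider.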
Standards
Solution
SOPs (Standard Operating Procedures)
Quality Assurance (QA)
Database design ensures accurate, efficient data storage and retrieval.
It involves planning data capture methods (manual, electronic, paper).
Normalization reduces redundancy and improves data structure.
Data entry tools should be clear, efficient, and suitable for analysis.
Testing and documentation verify the database's functionality.
Data entry methods include single-pass, double-pass, OCR, and remote entry.
Electronic transfer programs ensure smooth integration of data from various sources.
Proper quality assurance and SOPs maintain data integrity and consistency.