Introduction To Data Quality

The document provides an introduction to Data Quality within Health Management Information Systems (HMIS), outlining key concepts, dimensions of data quality, and common problems affecting data quality. It emphasizes the importance of accurate, complete, and timely data for effective health system management and decision-making. Additionally, it discusses strategies for resolving data quality issues and the requirements for maintaining good data quality across various levels of health facilities.

Uploaded by

Anthony Afachung

May 2019

INTRODUCTION TO DATA QUALITY

Morka Mercy
Outline
• Health Management Information Systems (HMIS)
• Basic Concepts/Terms
• Data Quality
• Ten Problems affecting Data Quality
• Resolving Data Quality errors
• Group Exercise
• Summary
Health Management Information Systems (HMIS)
• Definition: A Health Management Information System (HMIS) can be
described as a tool that helps in gathering, aggregating, analyzing
and using information to take actions that improve the performance of
health systems.

• The Mandate of HMIS: To ensure that there is a continuous flow of
quality disaggregated data on the health of populations and on health care
services to assist in local planning, programme implementation,
management, monitoring and evaluation.
Flow of Data
• Data Quality must be guaranteed at every stage of data collection,
collation and transmission.

FACILITY → LGA → STATE → NATIONAL


Basic Concepts/Terms
Data Element and Data
A data element is a recorded event. Data is an aggregation of data elements, in the
form of numbers, characters or images, that gives information after being analyzed.

Information
Information is data organized with reference to a context, which gives the data a meaning.

Knowledge
When information is analyzed, communicated and acted upon, it becomes knowledge.
Basic Concepts/Terms: Examples

Data: No. of pregnant women in an area who received skilled birth assistance

Information: % of pregnant women who received skilled birth assistance & % of
pregnant women who were left out

Knowledge: Why are some pregnant women able to receive skilled birth assistance?
Why are some pregnant women left out? Who was left out? What are
the issues related to access to service?
Data element
• A data element is a record of a health event or health-related event.
• Data elements are recorded in a primary register (recording format)
by the service provider.
• Similar events for the month are aggregated and reported in
specified reporting formats.
• Examples:
– Record of a service delivered: Number of HIV+ pregnant women who
were delivered in the health facility by a midwife in the reporting month.
– Health event: Number of exposed infants (children
below 1 year) who were infected with HIV.
– Health-related event: Number of live births in the reporting month.
Types Of Data Element
• Simple data element
• Disaggregated data elements
• Calculated data elements
Quiz: Please identify the
• Simple data elements?
• Disaggregated data elements?
• Calculated data elements?
Data Quality
• Data quality refers to the extent to which data measure what they
are intended to measure.

Dimensions of Data Quality
- Completeness
- Timeliness
- Reliability/ Accuracy/ Correctness/ Validity
COMPLETENESS
• Reports are a reflection of service provision and utilization; thus an
incomplete report will indicate partial service delivery/utilization.

Data completeness is assessed for the following:
1. Number of facilities that reported against the total number of
facilities
2. Number of data elements reported against the
total number of data elements in a reporting form.

Reporting from “Private Health Facilities”?
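
The two completeness measures on this slide can be sketched in Python as simple ratios. This is a minimal illustration only: the function names, the use of None for a blank data element, and the sample figures are assumptions made for the example and are not taken from any HMIS tool.

```python
def facility_completeness(facilities_reported: int, total_facilities: int) -> float:
    """Percent of expected facilities that submitted a report."""
    if total_facilities <= 0:
        raise ValueError("total_facilities must be positive")
    return 100.0 * facilities_reported / total_facilities


def element_completeness(report: dict) -> float:
    """Percent of data elements in one reporting form that are filled in.

    A value of None marks a data element left blank on the form
    (an assumption of this sketch); a reported 0 counts as filled in.
    """
    if not report:
        raise ValueError("report has no data elements")
    filled = sum(1 for value in report.values() if value is not None)
    return 100.0 * filled / len(report)


# Hypothetical LGA: 8 of 10 facilities reported this month.
print(facility_completeness(8, 10))  # 80.0

# Hypothetical form with 4 data elements, one left blank.
form = {"ANC 1 attendance": 230, "Preg HIV tested": 215,
        "Live births": None, "Deliveries by midwife": 12}
print(element_completeness(form))  # 75.0
```

Whether private health facilities belong in the total is exactly the question the slide raises: the percentage changes with the choice of denominator.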


TIMELINESS
• Timeliness is a very important component of data quality. Timely
processing and reporting of data make data available in time
for decision making.

Example: During monthly review meetings, if 5 out of
10 PHCs do not submit their reports on time, it will be
difficult for the M&E officer to assess
performance and develop a plan for the LGA in
particular and for the PHCs in general.

Check the date of reporting for every facility and
find out when all facilities in your LGA report data.
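
The check described above can be sketched as follows. The deadline, the facility names and the submission dates are hypothetical; the slides do not state an actual reporting deadline.

```python
from datetime import date


def late_facilities(submissions: dict, deadline: date) -> list:
    """Return facilities whose report is missing or dated after the deadline.

    submissions maps facility name -> submission date, or None if no
    report has been received.
    """
    return sorted(
        name for name, submitted in submissions.items()
        if submitted is None or submitted > deadline
    )


def timeliness_rate(submissions: dict, deadline: date) -> float:
    """Percent of facilities that reported on or before the deadline."""
    if not submissions:
        raise ValueError("no facilities listed")
    on_time = len(submissions) - len(late_facilities(submissions, deadline))
    return 100.0 * on_time / len(submissions)


# Hypothetical deadline and submission dates for three PHCs.
deadline = date(2019, 5, 7)
submissions = {
    "PHC 1": date(2019, 5, 3),
    "PHC 2": date(2019, 5, 10),
    "PHC 3": None,
}
print(late_facilities(submissions, deadline))  # ['PHC 2', 'PHC 3']
```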
Causes Of Decreased Completeness And Timeliness
• Poor internet connectivity
• Lack of hardware in some facilities
• Lack of staff
• Lack of supervision
• Staff attitude
• Poor motivation
• Knowledge gaps
• Violence
ACCURACY AND RELIABILITY
• Accuracy refers to the correctness of the data collected in terms of the actual
number of services provided or health events organized.

• Inaccurate data will yield incorrect conclusions during analysis and
interpretation.

• Small errors at facility level will accumulate into bigger
mistakes at higher levels, since data from various providers/facilities
are aggregated.
ACCURACY AND RELIABILITY…
Poor data accuracy/reliability could be due to the following four factors:
• Ambiguity about data element
• Data entry errors
• Dishonesty in reporting
• Systemic errors
Example: Examine ANC data reported by all the
PHCs of District X and check for accuracy in data.

Data elements                                           PHC 1  PHC 2  PHC 3  PHC 4  PHC 5  TOTAL
Total number of pregnant women attending ANC 1 visit      230    367    359    330    256   1542
PMTCT HTS given to ANC 1 attendees (Preg HIV CTR)         215    406    500    330    230   1681
ANC 1 HTS coverage rate (percent of ANC1 CTR)             93%   111%   139%   100%    90%   109%
Observations
• PHC 1, 4 & 5 have reported consistent figures and no problem was found
while processing/analyzing the data.

• PHC 2 & 3 reported more PMTCT HTS beneficiaries at ANC 1 than ANC 1
attendees (coverage above 100%); looking at the figures, one can readily
spot this as a likely mistake rather than a systemic problem in reporting.

• Data from PHC 2 & 3 are nonetheless intriguing; possibly the PHCs did have a high number
of actual beneficiaries of PMTCT HTS at ANC 1, or pregnant women were
not given PMTCT HTS in past months because the PHC was out of stock and
the backlog was now being cleared. Further probing is required to identify
the reason for the discrepancy.
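
The coverage rates in the District X table can be recomputed, and the facilities above 100% flagged, with a short Python sketch. The figures are copied from the table; the function names are hypothetical and chosen only for this illustration.

```python
# Figures copied from the District X table.
anc1_attendees = {"PHC 1": 230, "PHC 2": 367, "PHC 3": 359, "PHC 4": 330, "PHC 5": 256}
hts_given = {"PHC 1": 215, "PHC 2": 406, "PHC 3": 500, "PHC 4": 330, "PHC 5": 230}


def coverage_percent(numerator: int, denominator: int) -> int:
    """Coverage rate rounded to the nearest whole percent."""
    return round(100 * numerator / denominator)


def flag_over_100(numerators: dict, denominators: dict) -> dict:
    """Return facilities whose coverage exceeds 100%, with their rate."""
    flagged = {}
    for facility, denominator in denominators.items():
        rate = coverage_percent(numerators[facility], denominator)
        if rate > 100:
            flagged[facility] = rate
    return flagged


print(flag_over_100(hts_given, anc1_attendees))  # {'PHC 2': 111, 'PHC 3': 139}

# District total, matching the TOTAL column of the table.
total_rate = coverage_percent(sum(hts_given.values()), sum(anc1_attendees.values()))
print(total_rate)  # 109
```

A flag here only marks a figure for follow-up; as the observations note, a rate above 100% may be a mistake or a real backlog being cleared.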
Data Entry Errors
• Typing errors: wrong numbers entered in the computer or MSF.
• Wrong box entry: data entered in the wrong box, e.g., ‘ANC 1 attendance’
data entered in ‘Total ANC attendance’.
• Calculation errors: basic computations happen during data entry; if the
formulas are incorrect, errors result.
Performing validation checks
• Validation is performed by comparing the values of 2 (or more) data elements that
are comparable.

Validation rule: Total Preg HIV tested positive is less than or equal to Total Preg HIV tested
Left side:       Total Preg HIV tested positive
Operator:        ≤ (less than or equal to)
Right side:      Total Preg HIV tested

Can you mention other common validation rules in the DHIS2?
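
The left side / operator / right side structure of a validation rule can be sketched in plain Python. This is not the DHIS2 API; the function name, the operator symbols and the sample reports are assumptions made for illustration.

```python
import operator

# Operators a rule may use; the symbols are an illustrative choice.
OPERATORS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


def check_rule(values: dict, left: str, op: str, right: str) -> bool:
    """Return True when the rule `left op right` holds for the reported values."""
    return OPERATORS[op](values[left], values[right])


# Hypothetical monthly report from one facility.
report = {"Preg HIV tested positive": 12, "Preg HIV tested": 215}
print(check_rule(report, "Preg HIV tested positive", "<=", "Preg HIV tested"))  # True

# A violation: more positives than tests, which calls for follow-up.
suspect = {"Preg HIV tested positive": 40, "Preg HIV tested": 30}
print(check_rule(suspect, "Preg HIV tested positive", "<=", "Preg HIV tested"))  # False
```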


Does a validation violation always indicate an error?
• It is important to note that violation of a validation rule does not
always indicate an error. Violations can be due to:
– Management issues like availability of vaccines or medicines in stock,
– Disease outbreak
– Actual improvement due to a good CQI program.

• Violation of a validation rule prompts you to enquire and check the
data until a satisfactory answer is found.
Identification of Statistical Outliers
• Statistical outliers are numbers that do not conform to the trend or
are unexpected values.

• A value that deviates from the expected range (which can also be viewed on a
stem-and-leaf plot) is identified as an outlier.

• This often helps to identify data entry errors or large computation
mistakes.
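
The slides do not say how the expected range is defined. One common choice, used here purely as an assumption, is the interquartile-range rule: a value is an outlier if it lies more than 1.5 × IQR below the first quartile or above the third. The monthly figures are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values: list, k: float = 1.5) -> list:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical monthly ANC 1 attendance for one facility; 1220 looks like
# a typing error for 122.
monthly = [120, 130, 125, 118, 135, 128, 1220, 122, 131, 127, 124, 129]
print(iqr_outliers(monthly))  # [1220]
```

As with validation rules, a flagged value is a prompt to check the register, not proof of an error.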
Systemic Errors
• Systemic errors are errors embedded in the system that
promote poor data quality.

• They can be resolved systematically by identifying the root causes and
addressing them.
DATA QUALITY ISSUES
• Are there other problems that affect data quality at facility and other
levels?

• What are some of these problems and can they be resolved?


Problem 1: Errors due to poorly designed primary registers
1. Data elements required in the reporting form are not in the register and get
missed out while reporting.

2. Data elements are present but cannot be computed easily or are prone to
recording errors.

3. Multiple registers.

• Solution: Rationalization of primary registers:
keep the service delivery recording function, the tracking function
and the computing function distinct and visible, and check that all
data required are present in the record and that it allows for computation.
Problem 1b:
Computation problem in register (data validity)
• Counting to get totals in the register should be limited to the reporting
month and the specific data element only; otherwise, errors set in.

• Ensure data entry for a new month starts on a new page; otherwise,
over-reporting of totals may set in.

• Avoid double counting.

• Avoid incomplete counting: check the dates on the pages carefully.
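
The counting rules above can be sketched in Python: count only entries dated within the reporting month, and count a repeated line once. The register layout, with a client identifier and a service date per entry, is an assumption of this sketch, as are the sample entries.

```python
from datetime import date


def monthly_total(entries: list, year: int, month: int) -> int:
    """Count register entries dated within the given reporting month.

    Each entry is a (client_id, service_date) pair. A client listed more
    than once on the same date is counted once, to avoid double counting.
    """
    seen = set()
    for client_id, service_date in entries:
        if service_date.year == year and service_date.month == month:
            seen.add((client_id, service_date))
    return len(seen)


# Hypothetical register page that mixes April and May entries.
register = [
    ("C001", date(2019, 4, 29)),  # previous month: must not be counted
    ("C002", date(2019, 5, 2)),
    ("C003", date(2019, 5, 2)),
    ("C003", date(2019, 5, 2)),   # same visit written twice
    ("C004", date(2019, 5, 20)),
]
print(monthly_total(register, 2019, 5))  # 3
```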


Problem 2: Data Definitions:
a. Misinterpretation of Data Elements
Solution

• Each data element needs to be clearly defined and interpreted, not
only in English but also in the local language, depending on the
level of the staff.

• A data dictionary must be available to every service provider
recording or reporting data, in their own language.
Data definitions
b. Consistency of terms used
• Alignment between the recording registers and the reporting form.

• Example: ‘ANC 1 Attendees’ vs. ‘Number of pregnant women attending ANC
for the first time’
Problem 3: Problems in data aggregation
1. Data are difficult to add up across hundreds of facilities, especially manually and
when disaggregated; applications are needed.
{Facility-wise data entry in an offline application computes a block and LGA
aggregation sheet, which can then be uploaded in MS Excel sheets if nothing
else is available; DHIS2, etc.}

2. Clarity is needed on which facilities get added up where. Denominators relate to
“catchment areas” for facilities.

3. Facilities report late, or not at all; rules are needed to cope with this.

4. Provide feedback as block aggregated forms / sector aggregated forms as
well as comparisons of facilities in a block.
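
A minimal Python sketch of points 1 to 3: facility values are summed per LGA using an explicit facility-to-LGA mapping, and facilities with no report are listed so that a rule can be applied to them. The mapping, the facility names and the figures are hypothetical, and this does not reproduce how DHIS2 or any offline application aggregates.

```python
from collections import defaultdict


def aggregate_by_lga(reports: list, facility_lga: dict) -> tuple:
    """Sum facility values per LGA and list facilities with no report.

    reports is a list of (facility, value) pairs for one data element;
    facility_lga maps every expected facility to its LGA.
    """
    totals = defaultdict(int)
    reported = set()
    for facility, value in reports:
        totals[facility_lga[facility]] += value
        reported.add(facility)
    missing = sorted(set(facility_lga) - reported)
    return dict(totals), missing


# Hypothetical facility-to-LGA mapping and one month of reports.
facility_lga = {"PHC 1": "LGA A", "PHC 2": "LGA A", "PHC 3": "LGA B"}
reports = [("PHC 1", 230), ("PHC 3", 359)]
totals, missing = aggregate_by_lga(reports, facility_lga)
print(totals)   # {'LGA A': 230, 'LGA B': 359}
print(missing)  # ['PHC 2']
```

Returning the non-reporting facilities alongside the totals keeps an incomplete LGA total from being read as a complete one.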
Problem 4: Confirmation and Error Management Procedures
• There is no clear delegation of powers for approving or confirming data.
• This is especially needed for late-reporting facilities, non-reporting facilities,
cumulative data coming in, and error management.

• Solution: Written guidelines should be in place.


Problem 5: Logistic Problems
Non-reporting/inconsistent reporting can be due to:
• Form problems: shortage of pre-printed forms, lack of standardization of
forms, poor-quality photocopies, etc.
• Traveling time to submit reports.
• Lack of staff or hardware for data entry.

• Solutions
• Supply enough forms to last six months at a time.
• Attend to hardware/staff problems or relax schedules accordingly.
• Use mobile communication to save travel and staff time, although this requires more
applications management and funding.
Problem 6: Duplication
• Data duplication leads to falsely high coverage of services and
inaccurate decision making. {It covers up for lack of private sector
data}.
• Multiple entries of the same client's data in the register / “Achieving
program targets syndrome” for a site or health facility?
• Multipoint testing for pregnant women leading to repeated testing of
the same client!
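
Where each client carries a unique identifier across testing points (an assumption; the slides do not say that one exists), repeated testing can be separated from the number of distinct clients tested. The records and testing-point names below are hypothetical.

```python
def unique_clients_tested(test_records: list) -> int:
    """Count distinct clients across all testing points.

    Each record is a (client_id, testing_point) pair.
    """
    return len({client_id for client_id, _ in test_records})


def repeat_tested(test_records: list) -> list:
    """Return client IDs that appear at more than one testing point."""
    points = {}
    for client_id, testing_point in test_records:
        points.setdefault(client_id, set()).add(testing_point)
    return sorted(cid for cid, seen in points.items() if len(seen) > 1)


# Hypothetical records: client C001 was tested at two points.
records = [("C001", "ANC"), ("C001", "Labour ward"),
           ("C002", "ANC"), ("C003", "Laboratory")]
print(len(records))                    # 4 tests performed
print(unique_clients_tested(records))  # 3 clients tested
print(repeat_tested(records))          # ['C001']
```

Reporting the 4 tests as 4 clients would overstate coverage; reporting 3 clients matches the data element "pregnant women tested".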
Problem 7: The Zero Problem: How to report nonexistent vs non-utilized services
• Example: Test kits have run out of stock; the HF report still shows
‘pregnant women tested for HIV’ cases; the HF is reporting tests done in a
private laboratory.
• What problem can this cause?
– It adversely affects data accuracy because HFs may
overestimate or underestimate Preg HIV tested.
• Solution: Follow data collection & reporting guidelines. It is suggested that these
be reported as zero, and that no distinction be made between zeros and
blanks. Agreed?
Problem 8: Wrong choice of indicators/denominators
• This refers to a common problem where the data element itself is correct but the
denominator chosen is inappropriate.

• Example: When estimating the population of an LGA, one has to extrapolate
the population from the 2006 census data to the mid-year population of the
corresponding year, then from this number derive the expected population for
different age groups and categories.

• Failure to extrapolate will lead to higher rates. Alternatively, we may be counting the
numerator only from public health facilities whereas the denominator may
include all patients seen by both public and private facilities. In some
districts migration could affect the denominator, and so on.
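
The extrapolation step can be sketched in Python. The slides do not name a projection method or a growth rate; constant compound annual growth is assumed here, and the census count, growth rate and subgroup proportion are hypothetical placeholders, not official figures.

```python
def project_population(census_population: float, annual_growth_rate: float,
                       years_since_census: float) -> float:
    """Project a census count forward assuming constant compound growth."""
    return census_population * (1 + annual_growth_rate) ** years_since_census


def expected_subgroup(population: float, proportion: float) -> float:
    """Expected size of a subgroup given its assumed share of the population."""
    return population * proportion


# Hypothetical LGA: 200,000 people in the 2006 census, 3% assumed annual
# growth, projected 13 years forward to 2019.
pop_2019 = project_population(200_000, 0.03, 13)
print(round(pop_2019))

# Hypothetical share: 5% of the population expected to be pregnant women.
print(round(expected_subgroup(pop_2019, 0.05)))
```

Using the unprojected 2006 figure as the denominator would make every rate look higher than it is, which is the failure the slide warns about.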
Problem 9: Inability to create indicators, or too many data elements for one indicator
• Each data element must contribute to 1 to 1.5 indicators.

• Data elements that are not used need to be identified and removed.

• Some “rates” need far too many data elements to compute; a high
degree of inaccuracy results.
Problem 10: Death Reporting issues
• Facilities/LGAs/States that are under-reporting deaths (for
economic or political reasons) need to be identified and followed
up.

• Issues with retrospective data reporting of deaths.

• Duplication avoidance rules need to be created.


Resolving Data Quality Issues
1. Check the denominator and how the indicator was calculated. If these are OK:
2. Triangulate with other sources of information within the same format and with DHIS: is it an error at all?
Or is it a surprising but true finding to be acted on? Then:
3. Disaggregate to the next level: see if the over- or under-reporting is uniform or whether it comes from one block/
one facility.
If the error was found in one facility report, then:
A. Make sure that it was not a data entry error, which could be systemic or random.
B. Go back to the registers and check the value, correct it, and also make a note of the change made.
C. Ensure that registers have space to record these data.
D. Make sure that your staff understand the meaning of this data element.
E. Check if there is a data collection problem.
F. In the following month, check the value to ensure that staff have understood the importance of the
procedure that you followed.
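
Step 3 can be sketched in Python by splitting an unexpected change at LGA level into each facility's contribution. Comparing against the previous month is an assumption of this sketch; the slides do not fix a baseline, and the figures are hypothetical.

```python
def change_by_facility(previous: dict, current: dict) -> dict:
    """Month-on-month change in a data element for each facility."""
    return {name: current[name] - previous[name] for name in current}


def share_of_change(previous: dict, current: dict) -> dict:
    """Each facility's percent share of the total change at the higher level."""
    changes = change_by_facility(previous, current)
    total = sum(changes.values())
    if total == 0:
        raise ValueError("no net change to disaggregate")
    return {name: round(100 * delta / total) for name, delta in changes.items()}


# Hypothetical figures: the LGA total rose by 100 in one month.
previous = {"PHC 1": 100, "PHC 2": 110, "PHC 3": 105}
current = {"PHC 1": 102, "PHC 2": 112, "PHC 3": 201}
print(share_of_change(previous, current))  # {'PHC 1': 2, 'PHC 2': 2, 'PHC 3': 96}
```

Here almost all of the rise comes from one facility, so steps A to F would start with PHC 3's registers.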
Resolving Data Quality errors - HFs/ LGAs
• Check what category of error it belongs to among Problems 1 to 10
outlined. See if there is a guideline in place to which we need
to promote strict adherence or whether a guideline needs to be issued.
There can be unique situations where strict adherence to guidelines
might not be feasible; these should be noted.

• Get “Government Feedback Orders” issued from LGA level to resolve the
errors.
• Ensure that every HMIS manager/data entry operator keeps a file
containing all such orders.

• Institute regular Data Quality Audits.


Requirements for Good Data Quality?
• Functioning information systems
• Clear definition of indicators consistently used at all levels
• Description of roles and responsibilities at all levels
• Specific reporting timelines
• Standard/compatible data-collection and reporting forms/tools with clear instructions
• Documented data review procedures to be performed at all levels
• Steps for addressing data quality challenges (missing data, double-counting, lost to
follow up, …) should be in place
• Storage policy and filing practices that allow retrieval of documents for auditing
purposes (leaving an audit trail)
Steps to conduct DQA
• Training of DQA Team/Facility staff on DQA protocol
• Random Sampling of folders to be assessed using the appropriate
software.
• Extract the folders from the Shelves.
• Conduct DQA using the appropriate DQA tool (RDQA)
• Provide on site feedback to Facility Staff.
• File copy of report at the facility for actions to be taken to improve
Data Quality.
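
The random sampling step can be sketched with Python's standard random module, which stands in here for whatever software the DQA protocol prescribes. The folder numbering, the sample size and the seed are hypothetical.

```python
import random


def sample_folders(folder_ids: list, sample_size: int, seed: int = None) -> list:
    """Draw a simple random sample of client folders without replacement."""
    if sample_size > len(folder_ids):
        raise ValueError("sample_size exceeds the number of folders")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(folder_ids, sample_size))


# Hypothetical folder numbers F001..F200; draw 20 for review.
folders = [f"F{n:03d}" for n in range(1, 201)]
selected = sample_folders(folders, 20, seed=2019)
print(len(selected))  # 20
```

Recording the seed alongside the DQA report lets the same list of folders be regenerated later, which supports the audit trail mentioned earlier.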
Summary

• Understanding the basic elements of Data Quality is key to
maintaining reliable data at the Facility, State and National
levels.

• Data should be good enough to document the performance of any
facility, state or nation and to support decision-making.
Think about this..
“Where is the wisdom we have lost in knowledge? Where is the
knowledge we have lost in information?” (T. S. Eliot)

“Where is the information we have lost in data?” (John Seely Brown)

Data → Information → Knowledge → Wisdom


Take Home Message

The key to data quality is the use of information: the more
regularly it is used, the more seriously data are entered
and problems in flow and analysis are sorted out. For this, timely
and regular feedback at every level is a must.
Group Exercise
In your working group, review a set of monthly or quarterly data that has been
submitted to the national level.

Discuss the following questions in your group:


I. How does your program assess data quality?
II. What are common data quality issues in your State?
III. What are the current ways in which your program is adjusting for data quality issues such
as incomplete reporting by health facilities?
IV. What do you think should be put in place to strengthen data quality assessment in your
program?
V. Are the data collection instruments and reporting forms standardized and compatible?
THANK YOU
