
Personalized Medicine Recommendation System

This project presents a machine learning-based personalized medicine recommendation system utilizing the CatBoost algorithm to predict appropriate medications from synthetic patient data. The system achieved strong performance metrics, including a Micro F1 Score of 0.851 and a Hamming Loss of 0.0105, and was integrated into a user-friendly web application for real-time predictions. The framework aims to enhance clinical decision-making and supports the broader goal of individualized healthcare delivery through data-driven approaches.


Personalized Medicine Recommendation System

by

Akshit Modi

An Applied Project Presented in Partial Fulfillment


of the Requirements for the Degree
Master in Biomedical Informatics/Health Informatics

Approved 01/28/2025 by the


Applied Project Mentors:

Imon Banerjee

ARIZONA STATE UNIVERSITY


04/16/2025
ABSTRACT

Background:

Personalized medication recommendation remains a critical challenge in biomedical informatics,

particularly as healthcare moves toward precision medicine. Variability in patient demographics,

clinical conditions, procedures, allergies, and lifestyle factors makes the task of prescribing

appropriate medications highly complex.[1]

Objective:

This project aimed to develop a machine learning-based personalized medicine

recommendation system using the CatBoost algorithm to predict appropriate medications from

structured synthetic patient data generated by the Synthea patient simulator.

Methods:

We trained a CatBoost multi-label classifier using key patient-specific attributes,

including age, gender, ethnicity, marital status, city, weight, height, smoking habits,

allergies, and clinical codes (conditions, procedures, and encounters). Categorical

variables were encoded using techniques such as multi-label binarization and label

encoding. To manage high-dimensional multi-hot encoded features, dimensionality

reduction via Singular Value Decomposition (SVD) was applied. A web-based

application was also built using HTML/CSS for the frontend and Flask for the

backend to enable real-time predictions, user interaction, and feedback-based

retraining.[2][3]

Results:

On the held-out validation set, the CatBoost model achieved a micro F1 score of

0.851, a Hamming loss of 0.0105, and an exact subset accuracy of 0.623. The system

was integrated into a user-friendly interface that accepts patient inputs and returns

medication predictions, which were partially or fully aligned with ground truth

medication data.

Conclusion:

This framework demonstrates the effectiveness of using CatBoost for personalized medication

recommendations by combining structured synthetic patient data with an interpretable, high-

performance model. The web interface enhances usability and enables potential integration into

clinical decision support systems, contributing to the broader goal of individualized healthcare

delivery.

BACKGROUND

In recent years, personalized medicine has emerged as a transformative approach in

healthcare, aiming to tailor treatments to individual patient profiles instead of relying on

generalized, one-size-fits-all strategies. A central pillar of this approach is the ability to

recommend appropriate medications based on a wide range of patient-specific features,

including demographics, clinical history, lifestyle factors, allergies, and comorbidities.[1]

Despite substantial progress in biomedical informatics, medication recommendation remains a

significant challenge. Traditional prescribing practices often neglect critical nuances such as

prior procedures, allergies, smoking habits, and concurrent conditions—factors that directly

influence drug efficacy and safety. According to the American Medical Association (AMA),

adverse drug events account for nearly 5% of hospital admissions and contribute to increased

healthcare costs, patient morbidity, and even mortality [1]. These realities

underscore the pressing need for intelligent systems that can support safer and more effective

prescribing.
Advancements in machine learning (ML), especially tree-based models like CatBoost, offer

robust solutions by handling heterogeneous, high-dimensional data and learning complex

interactions between patient variables. Unlike deep learning models that require extensive

unstructured text processing, CatBoost excels in scenarios with rich structured data, such as

patient demographics, procedure codes, condition codes, medication history, and vital statistics.[5]

In this project, we utilize structured synthetic patient data from the Synthea™ dataset to

simulate clinical variability. A CatBoost multi-label classifier was trained using features such

as age, gender, race, allergies, height, weight, conditions, procedures, encounters, and

geographical data. This model was then integrated into a Flask-based web application, enabling

real-time medication prediction for new patient inputs. To support continuous improvement, the

application also stores user-submitted records so that the model can be retrained

periodically.

The impact of this work lies in its ability to improve clinical decision-making through transparent,

interpretable predictions that reduce prescribing errors and optimize therapeutic outcomes. It

supports the broader goal of advancing clinical decision support systems (CDSS) in biomedical

informatics by promoting data-driven, personalized care.

METHODS

This project utilized the Synthea synthetic health record dataset, a publicly available simulation

of realistic but entirely synthetic patient records. Synthea was selected because its output contains

no real patient data, so it raises no HIPAA privacy concerns and exposes no personally identifiable information (PII).

Approximately 10,000 synthetic patient records were extracted for this study. Each record

contained structured features such as age, gender, height, weight, marital status, race, and

ethnicity, as well as multi-label categorical data including condition codes, procedure codes,
medication codes, allergy codes, and encounter types. This dataset provided a rich foundation

for training a multi-label medication recommendation model.

Data preprocessing involved several key steps to standardize and clean the dataset. For

numerical features (e.g., age, height, weight), missing values were imputed using median

values. For categorical fields like gender, marital status, or race, missing values were filled with

a default category such as “Unknown.” Multi-label fields (e.g., condition codes) were converted

from stringified lists to actual Python lists using safe evaluation, followed by multi-hot encoding

with scikit-learn's MultiLabelBinarizer. This transformation represents the multiple

codes of a single input record as one binary vector. [7][10]
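The preprocessing steps described above can be sketched without any third-party dependency. This is a minimal illustration, not the project's code: the helper names (parse_code_list, impute_median, multi_hot) and the three toy records are assumptions, and the hand-written multi_hot function stands in for what MultiLabelBinarizer would produce.

```python
import ast
from statistics import median


def parse_code_list(raw):
    """Turn a stringified list such as "[44054006, 38341003]" into a list of code strings."""
    if raw is None or str(raw).strip() == "":
        return []
    # ast.literal_eval only accepts Python literals, so it cannot execute arbitrary code.
    return [str(code) for code in ast.literal_eval(str(raw))]


def impute_median(values):
    """Replace None entries of a numeric column with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def multi_hot(rows):
    """Multi-hot encode a list of code lists; returns (sorted vocabulary, 0/1 matrix)."""
    vocab = sorted({code for row in rows for code in row})
    index = {code: i for i, code in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in rows]
    for r, row in enumerate(rows):
        for code in row:
            matrix[r][index[code]] = 1
    return vocab, matrix


# Toy records (hypothetical); the condition codes reuse those from the case examples in this report.
ages = impute_median([67, None, 45])
raw_conditions = ["[44054006, 38341003]", "[10509002, 233604007]", "[22298006]"]
conditions = [parse_code_list(s) for s in raw_conditions]
vocab, X_conditions = multi_hot(conditions)
```

Each row of X_conditions has one column per distinct code, which is why the feature space grows quickly once several multi-label fields are encoded this way.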

Given the high dimensionality resulting from multi-hot encoding across multiple features,

dimensionality reduction techniques were applied. Specifically, Singular Value

Decomposition (SVD) was used to reduce the feature space while preserving as much

variance as possible, making the model more memory-efficient and improving computational

performance.[9]
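To make the reduction step concrete, the following sketch performs a truncated SVD directly with NumPy. It is an assumption-laden illustration: the matrix is random toy data, the number of components (8) is arbitrary, and the function name truncated_svd is hypothetical. A library implementation such as scikit-learn's TruncatedSVD would normally be used instead.

```python
import numpy as np


def truncated_svd(X, n_components):
    """Project X onto its top n_components right singular vectors (no centering)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:n_components]      # shape: (n_components, n_features)
    reduced = X @ components.T          # shape: (n_samples, n_components)
    # Share of the total squared singular values kept by the retained components.
    retained = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    return reduced, components, retained


# Toy sparse multi-hot matrix: 50 records, 40 possible codes, roughly 10% ones.
rng = np.random.default_rng(0)
X = (rng.random((50, 40)) < 0.1).astype(float)

reduced, components, retained = truncated_svd(X, n_components=8)
```

At prediction time a new multi-hot row x_new would be mapped with x_new @ components.T, so the same components must be stored alongside the model.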

The target variable—Medication Codes—was also treated as a multi-label output using multi-

label binarization. The model was trained using CatBoostClassifier, a gradient boosting

algorithm that performs well with categorical data and handles class imbalance efficiently.
(Figure 1 provides a high-level overview of the complete medication recommendation pipeline.
Starting from data extraction and preprocessing, the flow progresses through dimensionality
reduction, model training using CatBoost, and culminates in web-based deployment for real-time
prediction.)

The classifier was configured in a One-vs-Rest scheme to handle multi-label classification.

Model training included cross-validation to ensure generalization and robustness, and

hyperparameter tuning was conducted to optimize metrics such as F1 score and Hamming loss.
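A One-vs-Rest scheme fits one binary classifier per medication label. The sketch below shows the idea with scikit-learn's OneVsRestClassifier on toy data. LogisticRegression is used only as a lightweight stand-in; in the setup described here a CatBoostClassifier would take its place as the base estimator. The data, the three labels, and the train/test split are all made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Toy data: 200 records, 6 numeric features, 3 binary labels (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
Y = np.column_stack([
    X[:, 0] > 0,
    X[:, 1] > 0,
    X[:, 2] + X[:, 3] > 0,
]).astype(int)

# Stand-in base estimator; one copy of it is fitted for each label column of Y.
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
model.fit(X[:150], Y[:150])

# Predictions come back as a 0/1 indicator matrix with one column per label.
Y_pred = model.predict(X[150:])
```

Because each label gets its own classifier, the labels are predicted independently, which is simple to train but ignores correlations between medications.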

Once trained, the CatBoost model and preprocessing artifacts (encoders, SVD transformers)

were saved using Joblib for deployment. A Flask-based web application was developed for

real-time usage.
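Saving the model together with its preprocessing artifacts keeps training and serving consistent. A minimal sketch with Joblib follows. The dictionary keys, the placeholder values, and the file name are assumptions, not the project's actual artifact layout.

```python
import os
import tempfile

import joblib

# Placeholder artifacts (hypothetical); in practice these would be the fitted
# classifier, the fitted encoders, and the fitted SVD transformer.
artifacts = {
    "model": {"note": "placeholder for the trained classifier"},
    "condition_vocab": ["22298006", "38341003", "44054006"],
    "svd_components": [[0.6, 0.8, 0.0]],
}

path = os.path.join(tempfile.mkdtemp(), "medication_pipeline.joblib")
joblib.dump(artifacts, path)    # written once after training
restored = joblib.load(path)    # loaded once when the web server starts
```
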
(Figure 2 illustrates the real-time medication prediction workflow from the user

interface to result storage. After a user submits the form, the Flask server receives

the input, which is then preprocessed in alignment with the training pipeline. The

preprocessed data is passed to the loaded CatBoost model to generate medication

predictions. These predictions are returned to the frontend and also saved to a CSV

file for retraining purposes.)


The frontend, created using HTML and CSS, accepts patient-specific inputs which are sent to

the Flask backend. The backend performs preprocessing consistent with the training pipeline,

applies the CatBoost model, and returns predicted medication codes to the user interface.

These predictions are also logged for future model retraining.
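The request flow described above can be sketched as a plain function that a Flask view might delegate to, for example by passing request.form as the form argument. Everything here is an assumption for illustration: the field names, the CSV columns, the function name handle_submission, and the fake_predict stand-in that replaces preprocessing plus the real model call.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone


def handle_submission(form, predict_fn, log_path):
    """Parse form fields, call the model, append the pair to a CSV log, return the response."""
    record = {
        "age": int(form["age"]),
        "gender": form.get("gender") or "Unknown",
        "conditions": [c.strip() for c in form.get("conditions", "").split(",") if c.strip()],
    }
    predicted = predict_fn(record)

    # Append to the retraining log, writing a header only when the file is new.
    is_new = not os.path.exists(log_path)
    with open(log_path, "a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["timestamp", "age", "gender", "conditions", "predicted_codes"])
        writer.writerow([
            datetime.now(timezone.utc).isoformat(),
            record["age"],
            record["gender"],
            ";".join(record["conditions"]),
            ";".join(predicted),
        ])
    return {"predicted_codes": predicted}


def fake_predict(record):
    # Stand-in for preprocessing + model inference; returns a fixed placeholder code.
    return ["000000"]


log_path = os.path.join(tempfile.mkdtemp(), "predictions_log.csv")
form = {"age": "67", "gender": "M", "conditions": "44054006, 38341003"}
response = handle_submission(form, fake_predict, log_path)
```

Keeping the logic outside the route function makes it testable without starting a web server.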

Model evaluation was carried out using standard multi-label metrics: Micro F1 Score,

Hamming Loss, and Subset Accuracy (Exact Match). These metrics were computed on the

validation dataset to quantify the model’s predictive performance. Additionally, sample outputs

were manually reviewed to verify clinical plausibility, offering a qualitative validation of the

system's real-world applicability.
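For reference, the three metrics can be written out using their standard definitions. The notation is introduced here for illustration only: N is the number of validation patients, L the number of medication labels, y_ij and ŷ_ij the true and predicted binary indicators for patient i and label j, and TP_j, FP_j, FN_j the true positive, false positive, and false negative counts for label j.

```latex
\begin{align}
\text{Micro-}F_1 &= \frac{2\sum_{j=1}^{L} TP_j}{2\sum_{j=1}^{L} TP_j + \sum_{j=1}^{L} FP_j + \sum_{j=1}^{L} FN_j} \\
\text{Hamming loss} &= \frac{1}{NL}\sum_{i=1}^{N}\sum_{j=1}^{L} \mathbb{1}\left[y_{ij} \neq \hat{y}_{ij}\right] \\
\text{Subset accuracy} &= \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}\left[\mathbf{y}_i = \hat{\mathbf{y}}_i\right]
\end{align}
```

Micro F1 pools counts over all labels, Hamming loss averages over every patient-label cell, and subset accuracy only credits a patient when the full predicted set matches.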

In summary, the system combines structured EHR data, interpretable machine learning

(CatBoost), and a deployable web interface to deliver personalized medication

recommendations efficiently and ethically.

RESULTS

Model performance was evaluated using both quantitative and qualitative

approaches after training a CatBoost-based multi-label medication

recommendation model on structured and encoded synthetic patient data from

the Synthea dataset.

Training Performance

The training phase was monitored over 600 iterations. Both training and validation

loss curves showed a consistent downward trend, indicating effective learning and

good generalization capability. The use of Truncated SVD for dimensionality

reduction and regularization techniques, such as early stopping and class

balancing, helped prevent overfitting.
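A training configuration of this kind might look like the sketch below. Only the 600 iterations come from the text. The remaining values (early-stopping patience, balanced class weights, seed) are assumptions chosen to illustrate the named techniques, and the parameter names follow the CatBoost Python package. The import is deferred so the configuration can be inspected without CatBoost installed.

```python
# Illustrative settings; only "iterations" is taken from the report.
catboost_params = {
    "iterations": 600,                 # number of boosting iterations monitored in training
    "loss_function": "Logloss",        # binary objective for each per-label classifier
    "early_stopping_rounds": 50,       # assumed patience; takes effect only with an eval_set
    "auto_class_weights": "Balanced",  # assumed way of applying class balancing
    "random_seed": 0,
    "verbose": False,
}


def build_classifier(params=catboost_params):
    """Create a CatBoostClassifier from the settings above (requires the catboost package)."""
    from catboost import CatBoostClassifier  # imported lazily on purpose
    return CatBoostClassifier(**params)
```

A typical call would be build_classifier().fit(X_train, y_train, eval_set=(X_val, y_val)), where the variable names are placeholders; early stopping needs the validation set to decide when to halt.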


(Figure 3: Micro F1 score, Hamming loss, and accuracy.)

Evaluation Metrics

Final evaluation on the held-out validation set yielded the following performance:

• Micro F1 Score: 0.851

• Hamming Loss: 0.0105

• Subset Accuracy (Exact Match): 0.623

These metrics demonstrate strong performance in a multi-label classification context:

• The Micro F1 Score of 0.851 indicates the model effectively balances precision and recall,

both crucial in clinical prediction systems.

• A Hamming Loss of about 1% reflects very few incorrect label assignments relative to

the large number of possible medication codes.

• Subset Accuracy of 62.3% shows the model predicted the entire correct set of

medications for over 60% of patients, a notable result for a problem with a large

label space.

Clinical Case Examples (Code-Based Evaluation)

To ensure robust and interoperable evaluation, inputs were encoded using SNOMED CT codes

and medication outputs using RxNorm codes. This structure enables the model to be integrated directly into electronic

health record (EHR) systems and evaluated at the clinical code level.

Input Summary (Structured Features) | True Medication Codes | Predicted Medication Codes

Age: 67, Gender: M, Conditions: [44054006, 38341003], Procedure: [80146002]… | [313782, 861467] | [313782, 387544]

Age: 82, Gender: F, Conditions: [10509002, 233604007], Procedure: [183452005]… | [321949, 763293] | [321949, 763293]

Age: 45, Gender: F, Conditions: [22298006], Procedure: [80146002]… | [313782] | [313782]

In cases where exact matches were not observed, the model often still produced therapeutically

plausible alternatives. For instance, substituting “387544” for “861467” in a cardiovascular scenario

still aligns with clinical treatment guidelines, which suggests that the model captures pharmacologic

equivalence and therapeutic intent.
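As a worked example, the three metrics can be computed by hand on the three case rows above. This is only an illustration of how the metrics behave. The three rows are not the validation set, so the values differ from the reported results.

```python
import numpy as np

# (true medication codes, predicted medication codes) for the three case rows.
cases = [
    ({"313782", "861467"}, {"313782", "387544"}),
    ({"321949", "763293"}, {"321949", "763293"}),
    ({"313782"}, {"313782"}),
]

# Build 0/1 indicator matrices over every code that appears in the cases.
labels = sorted(set().union(*(t | p for t, p in cases)))
Y_true = np.array([[code in t for code in labels] for t, _ in cases], dtype=int)
Y_pred = np.array([[code in p for code in labels] for _, p in cases], dtype=int)

tp = int(((Y_true == 1) & (Y_pred == 1)).sum())  # 4 correctly predicted codes
fp = int(((Y_true == 0) & (Y_pred == 1)).sum())  # 1 extra code (387544)
fn = int(((Y_true == 1) & (Y_pred == 0)).sum())  # 1 missed code (861467)

micro_f1 = 2 * tp / (2 * tp + fp + fn)                          # 8 / 10 = 0.8
hamming_loss = float((Y_true != Y_pred).mean())                 # 2 / 15
subset_accuracy = float((Y_true == Y_pred).all(axis=1).mean())  # 2 / 3
```

A single swapped medication in the first row costs one false positive and one false negative, and it also makes that row count as a miss for subset accuracy.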


Deployment and Web Application Interface

(Figure 4: Patient information entry form for collecting geographic and general patient data.)

(Figure 5: Full list of predicted medications returned after a prediction.)


To ensure real-time usability, the trained CatBoost model was deployed via a Flask-based web

application. The interface allows users to input structured patient data, such as age, gender,

condition codes, procedure codes, and allergy codes, through an intuitive HTML form.

Upon form submission:

1. The backend server receives the data.

2. Preprocessing steps are executed, including SNOMED CT code encoding and SVD-

based dimensionality reduction.

3. The preprocessed data is passed to the CatBoost model to generate personalized

medication predictions, returned as RxNorm codes.

After generating code-based predictions, the system uses the RxNorm API to convert those

codes into human-readable medication names. This enhances interpretability for clinicians

and end-users by bridging machine-readable predictions with standard drug terminology. For

example, a predicted RxNorm code such as 313782 is resolved to "Metformin".
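A code-to-name lookup of this kind could be sketched as follows. The report does not state which RxNorm endpoint was called, so the URL pattern and the response shape used here (the public RxNav REST "properties" resource) are assumptions, as are the function names. The sample payload uses a made-up code and name so that the parsing step can be shown without a network call.

```python
import json
import urllib.request

# Assumed endpoint: RxNav REST "properties" resource for a given RxCUI.
RXNAV_URL = "https://rxnav.nlm.nih.gov/REST/rxcui/{rxcui}/properties.json"


def parse_name(payload):
    """Extract the medication name from a properties response, or None if absent."""
    props = payload.get("properties") or {}
    return props.get("name")


def lookup_name(rxcui, timeout=10):
    """Fetch the properties for one RxCUI and return its name (performs a network request)."""
    with urllib.request.urlopen(RXNAV_URL.format(rxcui=rxcui), timeout=timeout) as resp:
        return parse_name(json.load(resp))


# Made-up payload in the assumed response shape; not a real RxNorm entry.
sample_payload = {"properties": {"rxcui": "000000", "name": "Example Drug 10 MG Oral Tablet"}}
name = parse_name(sample_payload)
```

Caching the resolved names locally would avoid repeating the same request for codes that are predicted often.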

Additionally, the system:

• Logs user inputs and predictions into a CSV file for future model retraining,

• Supports scalable deployments by maintaining modular, reproducible codebases,

• Enables a full pipeline from data ingestion → model inference → clinical output,

supporting continuous learning and adaptation to new data.

This seamless integration of machine learning inference with clinical terminology services

(RxNorm) makes the system practical for real-world deployment in electronic health record

(EHR) systems or clinical decision support tools.

DISCUSSION

The results obtained from this project support the initial objective of developing an

accurate, scalable, and user-friendly Personalized Medication Recommendation System.

The model achieved a Micro F1 Score of 0.851, Hamming Loss of 0.0105, and Exact Match
Accuracy of 62.3% on the held-out validation set, demonstrating its effectiveness in predicting

clinically meaningful medications based on individual patient profiles.

The success of the system can be attributed to several key design choices:

• The use of the Synthea synthetic dataset provided a rich, fully synthetic source of

data that allowed extensive model training without the patient privacy concerns that

apply to real records under HIPAA regulations.

• The CatBoost multi-label classification approach, combined with multi-hot encoding

and SVD dimensionality reduction, allowed efficient learning across a large number of

input features and medication labels.

• Integration of the RxNorm API post-prediction enabled conversion of coded outputs into real

medication names, increasing interpretability and clinical usability.

• Deployment through a Flask-based web application demonstrated practical

applicability, allowing for real-time medication predictions accessible to non-technical

users.

Moreover, qualitative inspection of individual model outputs revealed that even when exact

ground truth matches were not achieved, the generated medication recommendations were

often clinically appropriate, which suggests that the model captures therapeutic context

rather than only surface patterns.

Limitations

While the system shows strong predictive capabilities, there are notable limitations that must

be acknowledged:

• Incomplete Predictions: In some cases, the model fails to return any medication

predictions for a valid patient input. This could be due to data sparsity, underrepresented

combinations in training, or overly aggressive dimensionality reduction.

• Ambiguity in Multilabel Outputs: Although the model is trained to predict multiple

medications simultaneously, it may occasionally underpredict by missing one or more

medications from the ground truth, especially in complex multi-diagnosis scenarios.

• Reliance on Coded Input: The system requires structured inputs in SNOMED CT

format, which may not always be available in real-world clinical settings without prior

mapping or preprocessing.

• Interpretability: While RxNorm resolves codes into medication names, deeper insights

(e.g., why a drug was predicted) remain opaque unless paired with explainability

techniques.

These limitations, while not detrimental to the core system, underscore the need for continuous

model refinement and more diverse, real-world data validation.

Opportunities for Future Advancement

Several enhancements could further increase the utility and reliability of the system:

• Expand Input Variables: Incorporate lab results, additional vital signs, and genetic markers.

• Real Clinical Validation: Collaborate with healthcare institutions under IRB approval.

• Model Explainability: Use tools like SHAP or LIME to explain individual predictions.

• Continuous Learning: Use logged predictions to retrain models periodically.

• Cloud Deployment: Deploy on AWS/Azure for broader scalability.

• Label Coverage Optimization: Improve medication code recall for rare or

multi-combination therapies.

CONCLUSION

In conclusion, this project demonstrates the feasibility and utility of integrating structured

biomedical data with machine learning—specifically using CatBoost for multi-label

classification—to create a personalized, ethical, and scalable medication recommendation

system. The pipeline leverages SNOMED CT-coded inputs and resolves RxNorm-coded
outputs into interpretable medication names, creating a seamless bridge between clinical

terminology and algorithmic intelligence.

The system achieved strong performance across standard metrics, including a Micro F1 Score

of 0.851, a Hamming Loss of 0.0105, and an Exact Match Accuracy of 62.3%, supporting its effectiveness

in generating clinically relevant medication suggestions.

By combining robust machine learning with real-world standards like RxNorm and SNOMED

CT, the system moves beyond experimental modeling to offer a viable clinical decision support

tool. Its deployment via a Flask web application further enhances accessibility, enabling real-

time usage by researchers or clinicians.

This work represents a significant step forward in the application of biomedical informatics

methodologies to personalized medicine, and lays the groundwork for future research, clinical

validation, and scalable deployment in electronic health record (EHR) environments.


REFERENCES

1. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2016:785-794. doi:10.1145/2939672.2939785

2. Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A. CatBoost: Unbiased boosting with categorical features. Adv Neural Inf Process Syst. 2018;31. https://proceedings.neurips.cc/paper_files/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf

3. Synthea. Synthetic Patient Population Simulator. The MITRE Corporation. Accessed April 16, 2025. https://synthetichealth.github.io/synthea

4. Bodenreider O. The Unified Medical Language System (UMLS): Integrating biomedical terminology. Nucleic Acids Res. 2004;32(Database issue):D267-D270. doi:10.1093/nar/gkh061

5. U.S. National Library of Medicine. RxNorm Overview. National Institutes of Health. Accessed April 16, 2025. https://www.nlm.nih.gov/research/umls/rxnorm/

6. Steindel SJ. SNOMED CT: Standardizing clinical phrases. MLO Med Lab Obs. 2010;42(3):26-28.

7. Rajkomar A, Dean J, Kohane I. Machine learning in medicine. N Engl J Med. 2019;380(14):1347-1358. doi:10.1056/NEJMra1814259

8. Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: Review, opportunities and challenges. Brief Bioinform. 2017;19(6):1236-1246. doi:10.1093/bib/bbx044

9. Ribeiro MT, Singh S, Guestrin C. "Why Should I Trust You?": Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2016:1135-1144. doi:10.1145/2939672.2939778

10. Saria S, Butte A, Sheikh A. Better medicine through machine learning: What's real, and what's artificial? PLoS Med. 2018;15(12):e1002721. doi:10.1371/journal.pmed.1002721

11. Shortliffe EH, Sepúlveda MJ. Clinical decision support in the era of artificial intelligence. JAMA. 2018;320(21):2199-2200. doi:10.1001/jama.2018.17163

12. Zhang Y, Milinovich A, Xu Z, et al. Forecasting respiratory infectious outbreaks using explainable machine learning models: A case study on COVID-19 and influenza. Environ Res. 2021;204:111972. doi:10.1016/j.envres.2021.111972
