AI Decision Support: What Have We Learned?
Authors: Grace Golden1,2, BASc; Christina Popescu2,3, BSc, MSc; Sonia Israel2, BSc; Kelly
Perlman2,3, BSc; Caitrin Armstrong2, BASc; Robert Fratila2, BSc; Myriam Tanguay-Sela2,
BASc, & David Benrimoh2,3,4, MD.CM., MSc., MSc., FRCPC
Affiliations: 1University of Western Ontario, London, ON, Canada; 2Aifred Health Inc.,
Montreal, QC, Canada; 3McGill University, Montreal, QC, Canada; 4Stanford University, Palo
Alto, California, United States
Author Contribution:
GG and DB conceptualized the manuscript and contributed to the manuscript writing and review.
CP, SI, KP, CA, RF, and MTS contributed to the manuscript writing and review.
Abstract: Clinical decision support systems (CDSS) augmented with artificial intelligence (AI)
models are emerging as potentially valuable tools in healthcare. Despite their promise, the
development and implementation of these systems typically encounter several barriers, hindering
the potential for widespread adoption. Here we present a case study of a recently developed AI-
CDSS, Aifred Health, aimed at supporting the selection and management of treatment in major
depressive disorder. We consider both the principles espoused during development and testing of
this AI-CDSS, as well as the practical solutions developed to facilitate implementation. We also
propose recommendations to consider throughout the building, validation, training, and
implementation process of an AI-CDSS. These recommendations include: identifying the key
problem, selecting the type of machine learning approach based on this problem, determining the
type of data required, determining the format required for a CDSS to provide clinical utility,
gathering physician and patient feedback, and validating the tool across multiple settings.
Finally, we explore the potential benefits of widespread adoption of these systems, while
balancing these against implementation challenges such as ensuring systems do not disrupt the
clinical workflow, and designing systems in a manner that engenders trust on the part of end
users.
INTRODUCTION
CDSS have a long history and have been developed for various physical and psychiatric
medical conditions (refer to Supplementary for more detail).5,6,7 There are two types of CDSS:
knowledge-based (KB) and non-knowledge-based (NKB) systems.1,3 A KB CDSS is built on
static evidence-based rules and clinical knowledge and may provide diagnostic or treatment
suggestions or reminders.1,3 Meanwhile, NKB systems incorporate some form of a statistical
model that generates outputs based on observed data.3 A subset of these systems incorporate
complex statistical learning models known as artificial intelligence (AI). An NKB CDSS can also
contain a KB CDSS (e.g., a CDSS that provides predictions may also include information from
guidelines). Here, we will focus the discussion on NKB CDSS with integrated AI models
(termed AI-CDSS), which also have elements of KB systems.
AI-CDSS: BARRIERS AND CHALLENGES
Integration of AI into CDSS
Successful integration of an AI model into CDSS requires that the model reach a field-
specific standard of accuracy based on varying levels of acceptable risk and uncertainty in
different domains. Generating accurate models depends on having sufficient data of high enough quality, covering the intended population at time points relevant to the decision that needs to be made. Quality begins with obtaining large amounts of data, often from electronic medical records (EMR) or clinical trial repositories, as small datasets are prone to biased outputs and poor generalizability.3,11 Even if a model is accurate and reliable when
predicting a given endpoint, choosing that endpoint is essential as it will determine whether the
model will have clinical utility. For example, imagine a model that accurately predicts a
treatment outcome in a disease with only one effective treatment. In this case, knowing the
predicted outcome likely will not change the clinical management as clinicians are unlikely to
deny their patients a chance to try the treatment. In addition, a model that predicts outcomes across multiple treatments has greater utility when there is no clinical reason to prefer one treatment over another than when guidelines or clinical experience dictate a specific order in which to try treatments.
Adoption into Clinical Practice
When discussing the success or failure of a CDSS, the system needs to be judged by metrics specific to the field and, indeed, to the decision(s) it is intended to assist with. One
potentially universal metric is adoption (i.e., the number of clinicians who regularly use the
CDSS); while high adoption does not guarantee impact, it is a prerequisite for impact at scale.
The lack of CDSS in usual clinical practice, and especially the lack of AI-CDSS, is striking, as it
indicates that even if useful CDSS exist, they do not currently have a significant clinical impact
because of low adoption. Why might this be? CDSS have been criticized for taking too much time14; they have been met with uncertainty by clinicians who have low confidence in AI-produced results and little overall trust in the system15,16,17; and there has been resistance within the medical field to accepting AI, stemming, at least in part, from the lack of transparency of machine learning (ML) models.18,19 Given the nature of medicine, which partially relies on mechanistic information, physicians may not trust the reasoning behind a model prediction if the system does not justify why or how the prediction was determined.18,19 This lack of trust is the “black box” problem of AI, and research on AI interpretability methods is now a primary focus of the field.20
An important criticism is that incorporating CDSS into clinical practice disrupts or
interferes with workflow and often results in physician frustration and burnout, possibly due to a
lack of training or support.14,21 These examples demonstrate how, despite the potential of CDSS to improve clinical care and patient outcomes, improper design or implementation can introduce new errors or problems, decrease tool retention, or prevent adoption entirely.22
BUILDING A CDSS
The first step in building our CDSS was considering what features clinicians and patients
required via discussions with clinician and patient stakeholders and a survey of the evidence-
based depression treatment literature. We decided to solve treatment selection using AI, as
existing approaches that used classical statistics had failed to generate reliable tools to help
personalize treatment.25,29,30 We determined that improving treatment management did not require an AI approach: existing guidelines provide clear recommendations, and measurement-based, algorithm-guided care has been proven to improve patient outcomes but is simply not implemented with sufficient fidelity in clinical practice (see Table 1).23,31
The design of the AI component took place concurrently with the design of the patient
and clinician user experience (see Figure 1). The resulting application was influenced by what
clinicians and patients explained they needed, and the kinds of outputs generated by the AI were
engineered to fit into the clinical workflow and to be both useful and more easily interpretable to
a clinician. In addition, to minimize clinician and patient burden and reduce discontinuation rates, the AI required as few inputs as possible.32,33
It was also clear from early discussions that clinicians did not simply want a tool
to give an overall patient probability of remission; rather, they wanted one to help them select
between treatments. This clinician input is why a focus of the model was differential treatment benefit prediction - that is, models that allow comparison between multiple individual treatments.30 Based on this feedback, in later iterations of the model, we focused
on creating a model that generated clinically meaningful differences in predicted efficacy
between the best and worst treatments for each patient.34
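Differential treatment benefit prediction can be illustrated with a minimal sketch: a per-treatment remission-probability model whose outputs are compared across candidate drugs for one patient. Everything below (the single feature, the drug names, the weights, and the function names) is an invented stand-in for illustration, not the trained deep learning model described in the text.

```python
import math

def predict_remission(patient_features, treatment):
    # Toy stand-in for the trained model: a linear score squashed to a
    # probability so the example runs end to end. Weights are hypothetical.
    treatment_effect = {"sertraline": 0.4, "escitalopram": 0.1, "bupropion": -0.2}
    score = treatment_effect[treatment] + 0.05 * patient_features["baseline_severity"]
    return 1.0 / (1.0 + math.exp(-score))

def rank_treatments(patient_features, treatments):
    """Return (treatment, predicted remission probability) pairs, best first."""
    scored = [(t, predict_remission(patient_features, t)) for t in treatments]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

patient = {"baseline_severity": 22}
ranking = rank_treatments(patient, ["sertraline", "escitalopram", "bupropion"])
# The clinically meaningful quantity is the spread between the predicted
# best and worst options for this particular patient.
differential_benefit = ranking[0][1] - ranking[-1][1]
```

A single overall probability of remission would collapse this ranking into one number; preserving the per-treatment spread is what lets the clinician and patient weigh options together.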
Figure 1
Clinical Decision Support System Development, Implementation, and Validation Process
Once we understood what problem the AI needed to solve, we also needed to determine
how to incorporate it into the system. To assist with treatment management, the CDSS contains
an operationalized version of the Canadian Network for Mood and Anxiety Treatments
(CANMAT) 2016 guidelines.23 These operationalized guidelines provide information based on
the patient’s current status and include a treatment selection module where clinicians are given
information about treatments listed in the guidelines. At this step, we presented the AI predictions within the guideline module but visually differentiated them from the guideline-derived information. To minimize the impact of the system on physician autonomy, the AI
generates a probability of remission for each treatment in an ordered list rather than providing a
single recommendation, prompting the clinician to discuss treatment options with the patient to
encourage shared decision making.34
As noted above, a key barrier to physician adoption of AI tools is a lack of trust in these
systems because of a lack of interpretability. To address this, we created the “interpretability
report” (described in Mehltretter30), which generates the five patient variables most strongly
associated with the predicted probability of remission for each drug. Physician-perceived
relevance of this report to a simulated patient was correlated with physician trust, which
interacted with patient severity to predict physician prescription of one of the top treatments
predicted by the AI in a simulation center study.34,35
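The idea behind such a report can be sketched as a simple ablation analysis: for a fixed drug, measure how much the predicted remission probability shifts when each patient variable is removed, and surface the top five. This is an illustrative stand-in only; the actual method is described in Mehltretter30, and the model, feature names, and weights below are invented for the example.

```python
import math

def model(features):
    # Hypothetical stand-in for the trained network, with invented weights.
    w = {"baseline_severity": 0.06, "anxiety": 0.04, "age": -0.01,
         "prior_failures": -0.08, "sleep_disturbance": 0.03,
         "employment": 0.02, "comorbid_pain": -0.02}
    score = sum(w[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def interpretability_report(features, k=5):
    """Rank features by how much ablating each one shifts the prediction."""
    base = model(features)
    impact = {}
    for name in features:
        ablated = dict(features, **{name: 0.0})  # zero out one variable
        impact[name] = abs(model(ablated) - base)
    ranked = sorted(impact, key=impact.get, reverse=True)
    return ranked[:k]

patient = {"baseline_severity": 21, "anxiety": 10, "age": 45,
           "prior_failures": 2, "sleep_disturbance": 6,
           "employment": 1, "comorbid_pain": 0}
top5 = interpretability_report(patient)
```

The report format matters as much as the method: presenting a short, patient-specific list of drivers gives the clinician something to agree or disagree with, which is what supports trust.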
As we designed the CDSS that would house the AI, we simultaneously developed the AI
itself. We began with a comprehensive review of which predictors could influence treatment
outcomes in patients with major depressive disorder (MDD) to determine what our model should include and to create a
reference against which associations learned by the model could be compared. Initial predictors
identified included symptoms, demographic information, and various biomarkers.36 However,
after reviewing available data, we determined that there is a paucity of biomarker data for many
individual treatments commonly used in practice and that it was unlikely that physicians would
routinely collect this data from patients using the tool.36 As such, we focused model training
solely on easy-to-collect clinical and demographic characteristics. Part of the development
process involved determining which ML method to use. We ultimately chose deep learning
because it can find complex, non-linear patterns in data that classical statistical models struggle
to find.37 In addition, as new predictor modalities become more feasible, integration into the core
model will be easier (see Mehltretter29 for more information).
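The motivation for deep learning here (capturing non-linear structure that a linear model cannot) can be seen in a toy example: XOR is not linearly separable, but a network with one hidden layer can learn it. This from-scratch sketch is unrelated to the actual Aifred model; the architecture and hyperparameters are arbitrary.

```python
import math
import random

random.seed(0)

# XOR: a non-linear pattern no linear (classical) model can separate.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

H = 4  # hidden units
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
    out = sigmoid(sum(W2[j] * h[j] for j in range(H)) + b2)
    return h, out

def loss():
    # Cross-entropy over the four XOR cases.
    eps = 1e-9
    return -sum(y * math.log(forward(x)[1] + eps) +
                (1 - y) * math.log(1 - forward(x)[1] + eps)
                for x, y in data)

initial_loss = loss()
lr = 0.5
for _ in range(10000):  # plain stochastic gradient descent with backprop
    for x, y in data:
        h, out = forward(x)
        d_out = out - y  # gradient of cross-entropy w.r.t. pre-sigmoid output
        for j in range(H):
            d_h = d_out * W2[j] * h[j] * (1 - h[j])
            W2[j] -= lr * d_out * h[j]
            W1[j][0] -= lr * d_h * x[0]
            W1[j][1] -= lr * d_h * x[1]
            b1[j] -= lr * d_h
        b2 -= lr * d_out

final_loss = loss()
predictions = [round(forward(x)[1]) for x, _ in data]  # typically recovers 0, 1, 1, 0
```

A logistic regression on the same four points cannot drive this loss toward zero, which is the sense in which non-linear models can find patterns classical linear models structurally miss.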
An important decision when creating our model was the choice of the data source. Data
sources must cover the range of relevant predictors and include a variety of patients, treated with different medications and with clinical characteristics (e.g., comorbidities) that approximate a real-world population. Clinical trial data may carry biases due to strict
inclusion and exclusion criteria (e.g., participants are not representative of patients seen in the
clinic). Alternatively, while EMR systems have a surplus of available data, medical records may
lack rigorous outcomes measures38, data may be incomplete, missing, unstructured, or contain
errors39, or data may engender false inferences due to misinterpretation of consecutive visits or
irregularity of visits.40 EMR data can also be affected by temporal biases, wherein as time goes
on and treatment standards change, patients from different periods may not be comparable.41
There is also the risk of propensity biases, where clinicians may be more likely to treat certain
patients in a certain way.42 Importantly, strategies such as propensity models can help overcome
such issues and should be considered when using EMR data.42
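One common propensity strategy can be sketched as follows: fit a model of P(treatment | covariates), then reweight the observational sample by inverse propensity so that treated and untreated groups become comparable. The cohort, single covariate, and training loop below are toy assumptions for illustration, not the method or data of the work cited in reference 42.

```python
import math

# Toy cohort of (severity, treated) pairs in which sicker patients are more
# likely to receive the treatment -- the kind of propensity bias to correct.
cohort = [(5, 0), (6, 0), (7, 0), (8, 1), (9, 0), (10, 1),
          (11, 1), (12, 1), (13, 1), (14, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit P(treated | severity) via logistic regression, trained here with
# plain batch gradient descent to keep the example dependency-free.
w, b = 0.0, 0.0
lr = 0.01
for _ in range(5000):
    gw = gb = 0.0
    for x, t in cohort:
        err = sigmoid(w * x + b) - t
        gw += err * x
        gb += err
    w -= lr * gw / len(cohort)
    b -= lr * gb / len(cohort)

# Inverse-probability weights: treated patients get 1/p, controls 1/(1-p),
# so both groups resemble the full cohort after reweighting.
weights = [1.0 / sigmoid(w * x + b) if t else 1.0 / (1.0 - sigmoid(w * x + b))
           for x, t in cohort]
```

After reweighting, outcome comparisons between treated and untreated patients are less confounded by who was likely to be treated in the first place.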
Ultimately, we determined that the biases present in clinical trial data were easier to
identify and investigate in practice than EMR data and that sufficient data was available from
trials with looser inclusion/exclusion criteria that a dataset approximating a real clinical
population could be created. In addition, a clear advantage for clinical trial data was the presence
of unambiguous outcomes which could be used as training targets.29,30,35
Implementation
Implementing the tool into clinical practice requires validating the tool in a clinical
setting, which is necessary to build trust between the physician and the AI-powered tool.25 Pilot
and quality assurance testing also provide the opportunity to find and address software errors or
mismatches between the tool’s design and the clinical workflow before larger trials, thereby
hopefully reducing the incidence of adverse events and reducing the impact of suboptimal design
on the performance of the CDSS in a clinical trial.
Given the importance of adequate and iterative testing during the implementation phase,
one of our earliest studies used a simulation center to receive feedback from physicians on their
experience concerning the product’s utility before incorporating it into actual clinical practice
(see Benrimoh35 and Tanguay-Sela43 for more information). We found that clinicians noted the
tool could be helpful for shared decision making with patients.35 Importantly, given that early conversations with clinicians had focused on concerns about the time the tool would take, we found that the tool could be used successfully during a 10-minute session with a standardized patient, which suggested that the tool was ready for feasibility testing in the clinic.
Our next step was a feasibility study of the CDSS in a real clinical environment before a
clinical trial focused on effectiveness. The main focus of this study was time spent in each
appointment, as clinician stakeholders mentioned that lost time would be a significant barrier to
them using the CDSS in practice (see Popescu34 for more information). We found no difference
in appointment length before and after the introduction of the tool. The study also demonstrated
that physicians and patients sustained engagement with the tool beyond two weeks (a two-week
benchmark was used to signify retention success based on results in Arean et al., 2016)44,
potentially because the app was tied into clinical care.34 Crucially, we found that the tool did not
negatively impact the clinician-patient relationship for any of the patients – and in roughly half
of all cases (i.e., 46%), the use of the CDSS was even reported to improve the relationship.34
Such findings were important as they suggest that adding a “third party” (the AI-CDSS) to the clinician-patient relationship need not negatively impact it and may, in turn, increase the feasibility of the tool and the efficacy of treatment overall.45
The next step in the clinical validation of the tool is conducting a randomized controlled trial (RCT) to assess the tool’s effectiveness and safety. Secondary endpoints will also allow continued assessment of feasibility and of the tool’s impact on care processes. This is an important step since
many CDSS are not evaluated for impact on patient outcomes in RCTs.12
DISCUSSION
CDSS have the potential to improve healthcare efficiency and reduce long-term costs.46
A tool that augments the healthcare providers’ decision-making with evidence-based support
could help address the need for mental health services, potentially empowering more first-line
practitioners, such as nurse practitioners and family physicians, to provide high-quality care. In
addition, an AI-powered CDSS that could reduce the number of treatments that need to be tried
by increasing personalization while improving the quality of treatment management has the
potential to reduce the time patients remain ill, which would have a significant effect on patient
and family suffering in addition to an important reduction in the strain on struggling healthcare
systems and economies.25
Despite their potential benefits, many CDSS have failed during the implementation stage due to a lack of transparency, added time in routine practice, uncertainty about the evidence and a lack of trust in the system, or disruptions to the clinical workflow.14,15,16,17,18 The seeds of these setbacks are often sown during the initial
conceptualization stage if the tool is developed to create something innovative but unsuitable for
current needs.18 We argue that the first step towards avoiding these pitfalls is to determine the
specific problem to be addressed and to use this to drive the system's design - both in terms of
user experience and the design of the AI model. In addition, we argue that a successful CDSS will undergo a rigorous testing and validation process.47 This process must be iterative and include, at every stage, opportunities for incorporating user feedback. It may be
tempting to begin with a large clinical trial since it will reduce the time from the initial system
development to clinical implementation. However, clinical trials are expensive and resource-intensive, and they expose large numbers of patients; they should therefore only be conducted when there is confidence in probable success and a lack of harm.
CONCLUSION
This paper discusses current barriers which limit the widespread adoption of CDSS into
healthcare and provides recommendations to consider throughout the building, validation,
training, and implementation process. CDSS are innovative and efficient tools with the potential to improve healthcare delivery at the patient and system levels, but they face significant barriers that must be addressed before they can be successfully implemented.
Table 1. Unique Challenges of Building an AI CDSS and How Aifred Approached the Problem.

Challenge: Ensuring the CDSS solves the right problem.18
Approach: Identified treatment selection and management as key problems; treatment selection required the development of an AI model, while management required the digitization and operationalization of guidelines.

Challenge: Data needs to be of high quality.3,38,39,40
Approach: Did not include EMR data due to inconsistencies in quality and availability of outcomes; trained and validated the model using baseline clinical and demographic data from antidepressant clinical trials.

Challenge: Developing physician trust in AI.15,16,17,18
Approach: Developed the interpretability report, which provides up to five patient variables that were most significant in determining the probability of remission for a particular drug; conducted stakeholder needs assessments to understand what physicians wanted from the tool and how they would use it; underwent a rigorous clinical validation process and reported results.
Program, National Research Council, Canada; ERA-Permed Vision 2020 supporting IMADAPT;
Aifred Health. MTS is employed by Aifred Health and is an options holder. SI, KP, CA, RF, and
1. Berner, E. S., & La Lande, T. J. (2007). Overview of Clinical Decision Support Systems. In E. S.
Berner (Ed.), Clinical Decision Support Systems (pp. 3–22). Springer New York.
https://doi.org/10.1007/978-0-387-38319-4_1
2. Kawamoto, K., Houlihan, C. A., Balas, E. A., & Lobach, D. F. (2005). Improving clinical practice
using clinical decision support systems: A systematic review of trials to identify features critical to
3. Sutton, R. T., Pincock, D., Baumgart, D. C., Sadowski, D. C., Fedorak, R. N., & Kroeker, K. I.
(2020). An overview of clinical decision support systems: Benefits, risks, and strategies for success.
4. Sim, I., Gorman, P., Greenes, R. A., Haynes, R. B., Kaplan, B., Lehmann, H., & Tang, P. C. (2001).
Clinical Decision Support Systems for the Practice of Evidence-based Medicine. Journal of the
https://doi.org/10.1136/jamia.2001.0080527
5. Jia, P., Zhang, L., Chen, J., Zhao, P., & Zhang, M. (2016). The Effects of Clinical Decision Support
https://doi.org/10.1371/journal.pone.0167683
6. Kwan, J. L., Lo, L., Ferguson, J., Goldberg, H., Diaz-Martinez, J. P., Tomlinson, G., Grimshaw, J.
M., & Shojania, K. G. (2020). Computerised clinical decision support systems and absolute
https://doi.org/10.1136/bmj.m3216
7. Roshanov, P. S., Misra, S., Gerstein, H. C., Garg, A. X., Sebaldt, R. J., Mackay, J. A., Weise-Kelly,
L., Navarro, T., Wilczynski, N. L., Haynes, R. B., & the CCDSS Systematic Review Team. (2011).
Computerized clinical decision support systems for chronic disease management: A decision-maker-
https://doi.org/10.1186/1748-5908-6-92
8. Cohen, J. P., Bertin, P., & Frappier, V. (2020). Chester: A Web Delivered Locally Computed Chest
9. Lakhani, P., & Sundaram, B. (2017). Deep Learning at Chest Radiography: Automated
10. Larson, D. B., Chen, M. C., Lungren, M. P., Halabi, S. S., Stence, N. V., & Langlotz, C. P. (2018).
11. Squarcina, L., Villa, F. M., Nobile, M., Grisan, E., & Brambilla, P. (2021). Deep learning for the
https://doi.org/10.1016/j.jad.2020.11.104
12. Tulk Jesso, S., Kelliher, A., Sanghavi, H., Martin, T., & Henrickson Parker, S. (2022). Inclusion of
Clinicians in the Development and Evaluation of Clinical Artificial Intelligence Tools: A Systematic
13. Waring, J., Lindvall, C., & Umeton, R. (2020). Automated machine learning: Review of the state-of-
the-art and opportunities for healthcare. Artificial Intelligence in Medicine, 104, 101822.
https://doi.org/10.1016/j.artmed.2020.101822
14. Muhiyaddin, R., Abd-Alrazaq, A. A., Househ, M., Alam, T., & Shah, Z. (2020). The Impact of
Clinical Decision Support Systems (CDSS) on Physicians: A Scoping Review. Studies in Health
care setting. Journal of the American Medical Informatics Association: JAMIA, 18(3), 267–270.
https://doi.org/10.1136/amiajnl-2011-000049
16. Shibl, R., Lawley, M., & Debuse, J. (2013). Factors influencing decision support system acceptance.
17. Sousa, V. E. C., Lopez, K. D., Febretti, A., Stifter, J., Yao, Y., Johnson, A., Wilkie, D. J., & Keenan,
https://doi.org/10.1097/CIN.0000000000000185
18. Shortliffe, E. H., & Sepúlveda, M. J. (2018). Clinical Decision Support in the Era of Artificial
19. Stiglic, G., Kocbek, P., Fijacko, N., Zitnik, M., Verbert, K., & Cilar, L. (2020). Interpretability of
machine learning‐based prediction models in healthcare. WIREs Data Mining and Knowledge
20. Kleinerman, A., Rosenfeld, A., Benrimoh, D., Fratila, R., Armstrong, C., Mehltretter, J., Shneider,
E., Yaniv-Rosenfeld, A., Karp, J., Reynolds, C. F., Turecki, G., & Kapelner, A. (2021). Treatment
selection using prototyping in latent-space with application to depression treatment. PLOS ONE,
21. Jankovic, I., & Chen, J. H. (2020). Clinical Decision Support and Implications for the Clinician
1701986
22. Graham, T. A. D., Kushniruk, A. W., Bullard, M. J., Holroyd, B. R., Meurer, D. P., & Rowe, B. H.
(2008). How usability of a web-based clinical decision support system has the potential to contribute
to adverse medical events. AMIA ... Annual Symposium Proceedings. AMIA Symposium, 257–261.
23. Kennedy, S. H., Lam, R. W., McIntyre, R. S., Tourjman, S. V., Bhat, V., Blier, P., Hasnain, M.,
Jollant, F., Levitt, A. J., MacQueen, G. M., McInerney, S. J., McIntosh, D., Milev, R. V., Müller, D.
J., Parikh, S. V., Pearson, N. L., Ravindran, A. V., Uher, R., & the CANMAT Depression Work
Group. (2016). Canadian Network for Mood and Anxiety Treatments (CANMAT) 2016 Clinical
Guidelines for the Management of Adults with Major Depressive Disorder: Section 3.
https://doi.org/10.1177/0706743716659417
24. Warden, D., Rush, A. J., Trivedi, M. H., Fava, M., & Wisniewski, S. R. (2007). The STAR*D
project results: A comprehensive review of findings. Current Psychiatry Reports, 9(6), 449–459.
https://doi.org/10.1007/s11920-007-0061-3
25. Benrimoh, D., Fratila, R., Israel, S., Perlman, K., Mirchi, N., Desai, S., Rosenfeld, A., Knappe, S.,
Behrmann, J., Rollins, C., & You, R. P. (2018). Aifred health, a deep learning powered clinical
decision support system for mental health. In The NIPS ’17 Competition: Building Intelligent
26. Frank, R. G., & Zeckhauser, R. J. (2007). Custom-made versus ready-to-wear treatments: Behavioral
https://doi.org/10.1016/j.jhealeco.2007.08.002
27. Adli, M., Wiethoff, K., Baghai, T. C., Fisher, R., Seemüller, F., Laakmann, G., Brieger, P., Cordes,
J., Malevani, J., Laux, G., Hauth, I., Möller, H.-J., Kronmüller, K.-T., Smolka, M. N., Schlattmann,
P., Berger, M., Ricken, R., Stamm, T. J., Heinz, A., & Bauer, M. (2017). How Effective Is
Algorithm-Guided Treatment for Depressed Inpatients? Results from the Randomized Controlled
28. Trivedi, M. H., Rush, A. J., Crismon, M. L., Kashner, T. M., Toprac, M. G., Carmody, T. J., Key, T.,
Biggs, M. M., Shores-Wilson, K., Witte, B., Suppes, T., Miller, A. L., Altshuler, K. Z., & Shon, S. P.
(2004). Clinical Results for Patients With Major Depressive Disorder in the Texas Medication
https://doi.org/10.1001/archpsyc.61.7.669
29. Mehltretter, J., Rollins, C., Benrimoh, D., Fratila, R., Perlman, K., Israel, S., Miresco, M., Wakid,
M., & Turecki, G. (2020). Analysis of Features Selected by a Deep Learning Model for Differential
https://doi.org/10.3389/frai.2019.00031
30. Mehltretter, J., Fratila, R., Benrimoh, D. A., Kapelner, A., Perlman, K., Snook, E., Israel, S.,
Armstrong, C., Miresco, M., & Turecki, G. (2020). Differential Treatment Benefit Prediction for
Treatment Selection in Depression: A Deep Learning Analysis of STAR*D and CO-MED Data.
31. Kozicky, J. M., Schaffer, A., Beaulieu, S., McIntosh, D., & Yatham, L. N. (2022). Use of a point‐of‐
care web‐based application to enhance adherence to the CANMAT and ISBD 2018 guidelines for
32. Baumel, A., Muench, F., Edan, S., & Kane, J. M. (2019). Objective User Engagement With Mental
Health Apps: Systematic Search and Panel-Based Usage Analysis. Journal of Medical Internet
apps for depressive symptoms: A systematic review and meta-analysis. Journal of Affective
34. Popescu, C., Golden, G., Benrimoh, D., Tanguay-Sela, M., Slowey, D., Lundrigan, E., Williams, J.,
Desormeau, B., Kardani, D., Perez, T., Rollins, C., Israel, S., Perlman, K., Armstrong, C., Baxter, J.,
Whitmore, K., Fradette, M.-J., Felcarek-Hope, K., Soufi, G., … Turecki, G. (2021). Evaluating the
System for the Treatment of Depression in Adults: Longitudinal Feasibility Study. JMIR Formative
35. Benrimoh, D., Tanguay-Sela, M., Perlman, K., Israel, S., Mehltretter, J., Armstrong, C., Fratila, R.,
Parikh, S. V., Karp, J. F., Heller, K., Vahia, I. V., Blumberger, D. M., Karama, S., Vigod, S. N.,
Myhr, G., Martins, R., Rollins, C., Popescu, C., Lundrigan, E., … Margolese, H. C. (2021). Using a
powered clinical decision support system for depression treatment on the physician–patient
36. Perlman, K., Benrimoh, D., Israel, S., Rollins, C., Brown, E., Tunteng, J.-F., You, R., You, E.,
Tanguay-Sela, M., Snook, E., Miresco, M., & Berlim, M. T. (2019). A systematic meta-review of
37. Rashidi, H. H., Tran, N., Albahra, S., & Dang, L. T. (2021). Machine learning in health care and
laboratory medicine: General overview of supervised learning and Auto‐ML. International Journal of
https://doi.org/10.1192/bjp.180.2.101
39. Kristianson, K. J., Ljunggren, H., & Gustafsson, L. L. (2009). Data extraction from a semi-structured
electronic medical record system for outpatients: A model to facilitate the access and use of data for
https://doi.org/10.1177/1460458209345889
40. Zheng, K., Gao, J., Ngiam, K. Y., Ooi, B. C., & Yip, W. L. J. (2017). Resolving the Bias in
Electronic Medical Records. Proceedings of the 23rd ACM SIGKDD International Conference on
41. Yuan, W., Beaulieu-Jones, B. K., Yu, K.-H., Lipnick, S. L., Palmer, N., Loscalzo, J., Cai, T., &
Kohane, I. S. (2021). Temporal bias in case-control design: Preventing reliable predictions of the
42. Afzal, Z., Masclee, G. M. C., Sturkenboom, M. C. J. M., Kors, J. A., & Schuemie, M. J. (2019).
Generating and evaluating a propensity model using textual features from electronic medical
43. Tanguay-Sela, M., Benrimoh, D., Popescu, C., Perez, T., Rollins, C., Snook, E., Lundrigan, E.,
Armstrong, C., Perlman, K., Fratila, R., Mehltretter, J., Israel, S., Champagne, M., Williams, J.,
Simard, J., Parikh, S. V., Karp, J. F., Heller, K., Linnaranta, O., … Margolese, H. C. (2022).
Evaluating the perceived utility of an artificial intelligence-powered clinical decision support system
for depression treatment using a simulation center. Psychiatry Research, 308, 114336.
https://doi.org/10.1016/j.psychres.2021.114336
44. Arean, P. A., Hallgren, K. A., Jordan, J. T., Gazzaley, A., Atkins, D. C., Heagerty, P. J., & Anguera,
J. A. (2016). The Use and Effectiveness of Mobile Apps for Depression: Results From a Fully
https://doi.org/10.2196/jmir.6482
45. Nolan, P., & Badger, F. (2005). Aspects of the relationship between doctors and depressed patients
that enhance satisfaction with primary care. Journal of Psychiatric and Mental Health Nursing, 12(2),
146–153. https://doi.org/10.1111/j.1365-2850.2004.00806.x
46. Castillo, R. S., & Kelemen, A. (2013). Considerations for a Successful Clinical Decision Support
https://doi.org/10.1097/NXN.0b013e3182997a9c
47. Day, S., Shah, V., Kaganoff, S., Powelson, S., & Mathews, S. C. (2022). Assessing the Clinical Robustness of
Digital Health Startups: Cross-sectional Observational Analysis. Journal of Medical Internet Research, 24(6),
e37677. https://doi.org/10.2196/37677
Supplemental Material
Introduction
Historical Context
First, we will provide some context with respect to the development and use of CDSS in
healthcare. The United States National Health Care Act endorses the development of CDSS
given their cost-effectiveness and ability to integrate into some electronic medical record (EMR)
systems (Sutton et al., 2020). CDSS have various functions and advantages such as their ability
to reduce prescription errors, mitigate adverse events, reduce costs (e.g., reduce test and order
duplication), provide diagnostic and decision support, improve clinical workflow, deliver
reminders and alerts, and improve patient outcomes (Khairat et al., 2018; Sutton et al., 2020).
Since the 1970s (Shortliffe & Buchanan, 1975), CDSS have been developed for a range
of physical and psychiatric medical conditions (Jia et al., 2016; Kwan et al., 2020; Roshanov et
al., 2011). Notwithstanding increasingly sophisticated clinical or data-driven algorithms and
models, increased data volumes from studies and EMRs, and improvements in computational
storage and power, CDSS effectiveness varies considerably with each tool (Kwan et al., 2020;
Roshanov et al., 2011). For instance, a systematic review and meta-analysis conducted by Kwan
et al. (2020) reported that, across 108 studies, CDSS resulted in 5.8% more patients receiving care adherent to the standards programmed into the CDSS. However, the authors noted substantial
heterogeneity among the top quartile of studies, ranging from 10% to 62% on improved process
adherence, which suggests that CDSS interventions can have a wide range of possible impacts.
Another systematic review by Roshanov et al. (2011) investigated whether CDSS improved
chronic disease management processes and associated patient outcomes, and found that systems
addressing diabetes and dyslipidemia showed potential for improving patient outcomes.
Nevertheless, systems addressing hypertension, asthma, chronic obstructive pulmonary disease,
cardiac conditions, and other care rarely demonstrated benefits with respect to patient outcomes,
though studies in these conditions suggested that CDSS may improve the quality of care
provided to patients (Roshanov et al., 2011). Improvements in care processes may result from
the CDSS’s push features (e.g., a notification message that pops up on a mobile device), complex
decision support (e.g., guideline-based feedback and recommendations for screening, diagnosis,
and prescribing), or documentation options (Kwan et al., 2020). This observation was
corroborated by another review assessing the effects of CDSS on medication safety, where 75%
of trials demonstrated positive impacts on the process of care, but only 20% demonstrated
positive impacts on patient outcomes (Jia et al., 2016). Roshanov et al. (2011) noted that only
56% of trials investigated patient outcomes as the primary endpoint, indicating that CDSS, or the
studies aimed at investigating their impact, are not necessarily designed with the intent to
improve patient outcomes. Rather, improving the care process is often assumed to improve
outcomes or other important metrics. For instance, a CDSS may be built with the intent to allow
clinicians to see more patients by improving their efficiency or capacity, with the expectation
that this will improve outcomes overall, as more patients receive care. Alternatively, study
designers may be uncertain regarding the expected effect of CDSS on patient outcomes and
therefore focus on assessing the systems’ feasibility as the primary endpoint, without further
studying the effects on outcomes. This distinction is relevant because there are several potential
barriers that may prevent the translation from improved care processes to improved patient
outcomes. These might include a lack of confidence by end-users in the appropriateness of
outcomes being measured or of the decision support being offered, insufficient time for training
or use of the CDSS during regular practice, or limited CDSS features that do not meet clinical
needs. Additionally, many CDSS are built with patient-facing components; it is therefore
critical to consider the patient’s perception of the CDSS during the design process (Kawamoto et
al., 2005), as patient confidence in the system or their perception of its utility may drive their
adoption and, in turn, the overall effectiveness of the system.
Implementation
Given the importance of adequate and iterative testing during the implementation phase,
one of our earliest studies used a simulation center to receive feedback from physicians on their
experience concerning the product’s utility prior to incorporating it into real clinical practice (see
Benrimoh et al., 2021 and Tanguay-Sela et al., 2022 for more information). Aifred’s simulation
center study assessed the CDSS among 20 participants, 11 of whom were psychiatrists and 9 of
whom were family physicians. The study allowed us to understand how physicians would use the
system in session with patients by enabling us to observe them using it with three standardized
patients (one with mild, one with moderate, and one with severe depression). Physicians were
given basic training in using the tool and were instructed to use it however they thought
appropriate during the session (Benrimoh et al., 2021). The simulation provided several insights
concerning how physicians received the tool. Key among these was that clinicians noted that the
tool could be helpful in elements of shared decision making with patients and that the
standardized patients were most appreciative of the tool when they were ‘invited in’ by a
clinician sharing the app screen with them (Benrimoh et al., 2021). Half of the physicians
reported that they would use the tool for all patients with MDD, and 85% thought they would
use it for complex or treatment-resistant patients (Tanguay-Sela et al., 2022). We also found
that the tool could be used in under 5 minutes during a session with a standardized patient,
which suggested that it was ready to be tested for feasibility in the clinic.
This assessment of readiness prompted our feasibility study of the CDSS in a real clinical
environment, a “stress test” of sorts prior to a clinical trial focused on effectiveness. Our main
focus in this feasibility study was the time spent in each appointment, as clinicians in
stakeholder discussions earlier in development had identified lost time as the most significant
barrier to using the CDSS in practice. Seven clinicians treated 14 patients
diagnosed with MDD (Popescu et al., 2021). Comparing the baseline appointment, in which the
CDSS was not used, to subsequent visits, in which the CDSS was used, there was no significant
difference in appointment length (Popescu et al., 2021). Physicians were not required to use the
CDSS outside of patient appointments. These results allayed the concern that the
system would increase physician workload or time spent. In addition, 62% of patients and 71%
of physicians indicated that they trusted the tool, and 92% of patients and 71% of physicians felt
the tool was easy to use (Popescu et al., 2021). Finally, physicians used the application during
96% of appointments, despite being told that they were free to use it or ignore it as they saw fit
(see Popescu et al., 2021 for more information). Notably, both physicians and patients
demonstrated sustained engagement beyond two weeks with the tool (a two-week benchmark
was used to signify retention success based on results in Arean et al. (2016) where the authors
noted app usage diminished significantly after 2 weeks), potentially attributable to the fact that
the app was tied into clinical care (Popescu et al., 2021). Crucially, we found that the tool did
not negatively impact the clinician-patient relationship for any of the patients; in roughly half
of all cases, use of the CDSS was even reported to improve the relationship (Popescu et al.,
2021). This is important because there were concerns that adding a “third party” (the AI-CDSS)
to the clinician-patient relationship might negatively affect it and, in turn, reduce the feasibility
of the tool and the efficacy of treatment overall (Nolan & Badger, 2005).
References
Arean, P. A., Hallgren, K. A., Jordan, J. T., Gazzaley, A., Atkins, D. C., Heagerty, P. J., &
Anguera, J. A. (2016). The Use and Effectiveness of Mobile Apps for Depression:
Results From a Fully Remote Clinical Trial. Journal of Medical Internet Research,
18(12), e330. https://doi.org/10.2196/jmir.6482
Benrimoh, D., Tanguay-Sela, M., Perlman, K., Israel, S., Mehltretter, J., Armstrong, C., Fratila,
R., Parikh, S. V., Karp, J. F., Heller, K., Vahia, I. V., Blumberger, D. M., Karama, S.,
Vigod, S. N., Myhr, G., Martins, R., Rollins, C., Popescu, C., Lundrigan, E., …
Margolese, H. C. (2021). Using a simulation centre to evaluate preliminary acceptability
and impact of an artificial intelligence-powered clinical decision support system for
depression treatment on the physician–patient interaction. BJPsych Open, 7(1), Article 1.
https://doi.org/10.1192/bjo.2020.127
Jia, P., Zhang, L., Chen, J., Zhao, P., & Zhang, M. (2016). The Effects of Clinical Decision
Support Systems on Medication Safety: An Overview. PLOS ONE, 11(12), e0167683.
https://doi.org/10.1371/journal.pone.0167683
Kawamoto, K., Houlihan, C. A., Balas, E. A., & Lobach, D. F. (2005). Improving clinical
practice using clinical decision support systems: A systematic review of trials to identify
features critical to success. BMJ, 330(7494), 765.
https://doi.org/10.1136/bmj.38398.500764.8F
Khairat, S., Marc, D., Crosby, W., & Al Sanousi, A. (2018). Reasons For Physicians Not
Adopting Clinical Decision Support Systems: Critical Analysis. JMIR Medical
Informatics, 6(2), e24. https://doi.org/10.2196/medinform.8912
Kwan, J. L., Lo, L., Ferguson, J., Goldberg, H., Diaz-Martinez, J. P., Tomlinson, G., Grimshaw,
J. M., & Shojania, K. G. (2020). Computerised clinical decision support systems and
absolute improvements in care: Meta-analysis of controlled clinical trials. BMJ, m3216.
https://doi.org/10.1136/bmj.m3216
Nolan, P., & Badger, F. (2005). Aspects of the relationship between doctors and depressed
patients that enhance satisfaction with primary care. Journal of Psychiatric and Mental
Health Nursing, 12(2), 146–153. https://doi.org/10.1111/j.1365-2850.2004.00806.x
Popescu, C., Golden, G., Benrimoh, D., Tanguay-Sela, M., Slowey, D., Lundrigan, E., Williams,
J., Desormeau, B., Kardani, D., Perez, T., Rollins, C., Israel, S., Perlman, K., Armstrong,
C., Baxter, J., Whitmore, K., Fradette, M.-J., Felcarek-Hope, K., Soufi, G., … Turecki,
G. (2021). Evaluating the Clinical Feasibility of an Artificial Intelligence-Powered, Web-
Based Clinical Decision Support System for the Treatment of Depression in Adults:
Longitudinal Feasibility Study. JMIR Formative Research, 5(10), e31862.
https://doi.org/10.2196/31862
Roshanov, P. S., Misra, S., Gerstein, H. C., Garg, A. X., Sebaldt, R. J., Mackay, J. A., Weise-
Kelly, L., Navarro, T., Wilczynski, N. L., Haynes, R. B., & the CCDSS Systematic
Review Team. (2011). Computerized clinical decision support systems for chronic
disease management: A decision-maker-researcher partnership systematic review.
Implementation Science, 6(1), 92. https://doi.org/10.1186/1748-5908-6-92
Shortliffe, E. H., & Buchanan, B. G. (1975). A model of inexact reasoning in medicine.
Mathematical Biosciences, 23(3–4), 351–379. https://doi.org/10.1016/0025-
5564(75)90047-4
Sutton, R. T., Pincock, D., Baumgart, D. C., Sadowski, D. C., Fedorak, R. N., & Kroeker, K. I.
(2020). An overview of clinical decision support systems: Benefits, risks, and strategies
for success. npj Digital Medicine, 3(1), 17. https://doi.org/10.1038/s41746-020-0221-y
Tanguay-Sela, M., Benrimoh, D., Popescu, C., Perez, T., Rollins, C., Snook, E., Lundrigan, E.,
Armstrong, C., Perlman, K., Fratila, R., Mehltretter, J., Israel, S., Champagne, M.,
Williams, J., Simard, J., Parikh, S. V., Karp, J. F., Heller, K., Linnaranta, O., …
Margolese, H. C. (2022). Evaluating the perceived utility of an artificial intelligence-
powered clinical decision support system for depression treatment using a simulation
center. Psychiatry Research, 308, 114336.
https://doi.org/10.1016/j.psychres.2021.114336