Abstract #3037: Machine learning-based multiple cancer detections
with circulating miRNA profiles in the blood
Juntaro Matsuzaki, Yusuke Yamamoto, Ouyang Yi, Sandeep Ayyar, Ryo Miyajima, Timothy Nolan, Nobuhiro Kawai, Ken Kato, Nobuyuki Ota, Takahiro Ochiya
Division of Pharmacotherapeutics, Keio University Faculty of Pharmacy, Tokyo, Japan; Division of Cellular Signaling, National Cancer Center Research Institute, Tokyo, Japan; Preferred Medicine Inc, Burlingame, CA; Preferred Networks America Inc, Burlingame, CA; Preferred Medicine, Inc., Burlingame, CA; Department of Head and Neck Medical Oncology, National Cancer Center Hospital, Tokyo, Japan;
Preferred Networks America Inc, Burlingame, CA; Department of Molecular and Cellular Medicine, Institute of Medical Science, Tokyo Medical University, Japan
Circulating miRNA expression profiles in blood combined with machine learning can provide
biomarkers for the earliest multiple cancer detections

Background:
● A wide variety of circulating microRNAs (miRNAs) that specifically indicate many types
of cancer have been identified, and their expression profiles are considered potential
biomarkers.
● Circulating miRNAs may serve as a non-invasive liquid biopsy diagnostic tool for early
detection of many types of cancer.
● A novel blood-based diagnostic method combined with machine learning techniques
was developed using the entire circulating miRNA expression repertoire in serum,
without prior selection of miRNA marker sets.

Figure: Early vs Late Stage Breast Cancer detection
Figure: Breast Cancer data distribution by Stage
Figure: Prioritized Marker set selection and associated AUC scores

Patient characteristics:
Patient Type   N     Male   Female   Age: Mean (SD)
Breast         272   -      271      54.0 (11.8)
Lung           223   133    90       68.3 (9.8)
Colorectal     237   144    93       64.8 (11.9)
Stomach        221   152    69       68.2 (10.6)
Pancreas       99    60     39       64.7 (11.3)
Volunteers     289   142    147      60.7 (12.0)

Classification performance by class; all values are mean (std):
Class       Accuracy       Precision      Specificity    Sensitivity    Sensitivity at    Sensitivity at    Sensitivity at    AUC
                                                                        0.90 specificity  0.95 specificity  0.99 specificity
Breast      0.899 (0.026)  0.897 (0.045)  0.900 (0.047)  0.897 (0.044)  0.890 (0.052)     0.795 (0.078)     0.710 (0.117)     0.964 (0.018)
Lung        0.850 (0.055)  0.848 (0.045)  0.893 (0.028)  0.793 (0.096)  0.753 (0.133)     0.646 (0.168)     0.476 (0.199)     0.921 (0.033)
Pancreas    0.897 (0.026)  0.884 (0.045)  0.969 (0.013)  0.690 (0.086)  0.820 (0.121)     0.720 (0.098)     0.500 (0.228)     0.957 (0.016)
Colorectal  0.888 (0.041)  0.890 (0.033)  0.914 (0.024)  0.858 (0.070)  0.837 (0.100)     0.761 (0.094)     0.648 (0.130)     0.963 (0.018)
Stomach     0.910 (0.022)  0.900 (0.030)  0.924 (0.023)  0.891 (0.036)  0.905 (0.073)     0.769 (0.161)     0.583 (0.196)     0.969 (0.016)

Methods:
● Clinical serum samples were obtained from patients with five types of cancer at National
Cancer Center Japan and from non-cancer volunteers at Minoru Clinic.
○ Breast cancer (272)
○ Colorectal cancer (239)
○ Lung cancer (223)
○ Stomach cancer (221)
○ Pancreatic cancer (100)
○ Non-cancer volunteers (289)
● Serum samples were prospectively collected with standard operating procedures.
● The entire miRNA expression profile was analyzed via NGS (Illumina NovaSeq 6000).
● The resulting total miRNA expression profile was used to train machine learning
models, without prior selection of miRNAs by human intervention.
● The machine learning model was trained with a training set to test set ratio of 4:1
and was carefully monitored by 5-fold cross-validation to avoid overfitting.

Results:
● The diagnostic model provided 88% accuracy for all five cancer types (mean).
● The overall average AUROC was 0.954.
● For breast cancer, the machine learning model provided 90% accuracy and 89%
sensitivity at 90% specificity, with an AUROC of 0.964.
● High sensitivity was obtained regardless of the stage of the cancers, indicating
that the potential for early detection of cancer remains high.

Figure: ROC curve analysis and associated volcano plot

Advantages and Future direction:
● The main advantage of miRNA-based cancer diagnosis is that it is more sensitive,
even in the early stages of cancer, than other diagnostic methods such as cell-free
DNA diagnostics, for which sensitivity for many types of cancer in the early stages
remains low.
● This approach could be easily expanded to other cancer types.
● Given the potential value of early detection in fatal malignancies, further validation
in future population-based studies is justified. Many cancer research institutes are
currently conducting further clinical trials to validate this early cancer diagnosis
based on miRNA expression profiles.
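
The evaluation scheme named in the Methods and Results (a 4:1 train/test split, 5-fold cross-validation, AUROC, and sensitivity at a fixed specificity) can be illustrated with a minimal sketch. The poster does not state which model or software was used, so everything below is an assumption for illustration only: the function names are hypothetical, the split is a plain unstratified shuffle, and the scores are made-up toy values, not data from the study.

```python
import random


def train_test_split_4_to_1(n_samples, seed=0):
    """Shuffle sample indices and hold out one fifth as the test set (4:1)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = n_samples // 5
    return idx[n_test:], idx[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(labels, scores, target_spec):
    """Highest sensitivity among thresholds whose specificity >= target_spec."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        # A sample is called positive when its score is strictly above t.
        spec = sum(n <= t for n in neg) / len(neg)
        sens = sum(p > t for p in pos) / len(pos)
        if spec >= target_spec:
            best = max(best, sens)
    return best


# Index bookkeeping for a hypothetical cohort of 100 samples.
train_idx, test_idx = train_test_split_4_to_1(100)
for fold_train, fold_val in k_fold(train_idx, k=5):
    pass  # fit on fold_train, monitor on fold_val (model left unspecified)

# Toy labels (1 = cancer, 0 = non-cancer) and made-up classifier scores.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
print(auroc(labels, scores))                             # 0.9375
print(sensitivity_at_specificity(labels, scores, 0.75))  # 1.0
```

In this reading, "sensitivity at 0.90 specificity" means choosing the decision threshold so that at least 90% of non-cancer samples are correctly called negative, then reporting the fraction of cancer samples detected at that threshold. Whether the study selected thresholds this way is not stated in the poster.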