Journal of Public Health (2024) 32:1829–1834
https://doi.org/10.1007/s10389-023-01935-z
ORIGINAL ARTICLE
Evaluation of significant factors influencing the survival time of breast
cancer patients using the Cox regression model
Khanda Gharib Aziz1 · Hazhar Talaat Abubaker Blbas2,3 · Azheen Hama Tofiq4
Received: 27 February 2023 / Accepted: 3 May 2023 / Published online: 29 May 2023
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023
Abstract
Purpose The leading cause of cancer-related deaths in Iraqi women is breast cancer, followed by other malignancies. The
purpose of this research is to investigate the association between covariates (pathological and demographic characteristics)
and time to death in women with breast cancer.
Methods Data were collected from the cancer archive in the city of Sulaimani regarding 305 women who received breast
cancer diagnoses between 2011 and 2020. Using R and SPSS, Cox regression was used to investigate the relationship
between explanatory variables (sociodemographic factors, pathological factors, and treatment) and time to death as the
response variable.
Results The study's findings showed that occupation was statistically the greatest risk factor (−2.361), followed by weight
and treatment (0.784 and 1.605, respectively), while age, place of habitation, and tumor side had no statistically significant
effect on survival time in breast cancer patients.
Conclusion This study discovered that patients who receive therapy, such as chemotherapy, radiotherapy, and hormonal
therapy, alone or in combination live longer than patients who do not receive therapy. In addition, it emphasizes that not
exercising and staying at home, not eating nutritious food, and eating red meat and fried meals are the most common factors
associated with shorter survival time in breast cancer patients.
Keywords Breast cancer · Survival analysis · Hazard function · Censored data · Cox regression
* Hazhar Talaat Abubaker Blbas (corresponding author)
Khanda Gharib Aziz
Azheen Hama Tofiq
1 Faculty of Law and Administration, College of International Trade, University of Halabja, Slemani, Iraq
2 Department of Statistics, College of Administration and Economics, Salahaddin University-Erbil, Erbil,
Kurdistan Region, Iraq
3 Department of Biomedical Science, College of Science, Cihan University-Erbil, Erbil, Kurdistan Region, Iraq
4 Ministry of Health, Slemani, Iraq
Introduction
Analyzing the characteristics that influence cancer patients' survival times can help doctors diagnose and treat their
patients more effectively. The recent rise in breast cancer mortality rates in Iraq and around the world highlights
the need for more research into the disease. As a result, it appears necessary to determine the most important factors
that affect patients' survival and to promote awareness of symptomatic conditions through media, regular tests, and
screening for early detection.
Breast cancer is the most common cause of mortality among women in the world, before lung cancer (Sung et al.
2021). Breast cancer now accounts for 23% of the 1.1 million new cases diagnosed each year. In addition, breast cancer is
the most common cancer among women in Iraq and the Iraqi Kurdistan Region and the leading cause of cancer death,
followed by other cancers, accounting for about one third of all recorded female cancer cases. It is the leading cause of
death among Iraqi women aged 50 years or younger (64.9%), which suggests that the disease is more likely to affect younger
women. This is due to the lack of early cancer detection and research programs, as well as limited diagnostic and
treatment facilities (Aziz and Murad 2020). To determine the explanatory variables that predict the response variable,
regression analysis was utilized (Aroian et al. 2017; Blbas and Kahwachi 2021).
The Cox regression model, developed by D.R. Cox in 1972, is one of the most well-known statistical risk models
in survival analysis. The Cox model examines the relationship between several variables and the time until an event
of interest, such as death, occurs (Kleinbaum and Klein 1996). This model is known as a semiparametric model because it
includes two components: the baseline hazard function of time and an exponential term that is used to generate variable
hazard rates for each individual depending on which covariate groups they belong to.
Interpreting a Cox model involves examining the coefficient of each explanatory variable. A positive regression
coefficient indicates that the hazard is higher, and the prognosis therefore worse, for patients with greater values of
that variable. If the regression coefficient is negative, patients with greater values of that variable have a lower
hazard and a better prognosis.
Equation 1 illustrates the Cox regression model proposed by Collett (2015).
$$h(t, X) = h_0(t)\,\exp\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right) \tag{1}$$
where
h(t, X) is the hazard at time t for a patient with covariate values X,
h_0(t) is the baseline hazard function at time t, that is, the hazard when all covariates equal zero,
X is the vector of explanatory variables, p is the number of predictor variables, and β is the regression coefficient vector.
The hazard function from the Cox regression model is shown in Eq. 2, where y denotes the linear predictor
β_1 x_1 + β_2 x_2 + … + β_p x_p of Eq. 1:
$$h(t) = h_0(t)\,\exp(y) \tag{2}$$
The purpose of this research is to estimate the effect of explanatory variables such as age, place of habitation,
occupation, weight, tumor side, and treatment on the survival time in days among breast cancer patients using Cox
regression and then estimate the cumulative hazard function.
There have been a number of studies undertaken to identify the factors affecting survival time in breast cancer
patients. Ebrahimi et al. (2002) conducted a case–control study in Tehran, Iran, between April 1997 and April 1998 to
evaluate the risk factors for breast cancer in Iranian women. Using a brief, standardized questionnaire, demographic data
and information about risk factors were gathered. From the logistic regression analysis, odds ratios and 95% confidence
intervals were generated. The results show that family history and marital status may have an effect on the prevalence
of breast cancer in Iranian women. Akbari et al. (2009) investigated the survival of patients with breast carcinoma
and compared the outcomes of mastectomy and breast-conserving surgery. The authors used data from breast carcinoma
patients treated in a private clinic under the author's supervision from 1994 to 2007, where 464 and 441 cases
with acceptable follow-up were included to define survival. These cases underwent surgery as mastectomy or
breast-conserving surgery with other necessary treatment based on the clinical status. As a result, overall, 5-year
survival was 81%, which is comparable to that in developed countries with different healthcare delivery systems and
quality of care, and it is significantly better than other reports from Iran, regionally, and in comparable countries.
Rezaianzadeh et al. (2009) assessed a number of explanatory factors to examine breast cancer survival in southern Iran,
in which 1148 women with breast cancer were included in the data from the Fars Province Cancer Registry in southern Iran,
collected between 2000 and 2005. The results demonstrated that poor survival was associated with a lack of knowledge,
a lack of screening programs, and consequently slow access to treatment. Thomson (2012) summarized the most current
clinical and epidemiological trial data indicating a connection between diet and breast cancer recurrence, survival,
and mortality. A summary of past and current dietary intervention trials intended to lower breast cancer risk was
included in the review, as well as new epidemiological studies that evaluated risk within types of tumors.
According to the available studies, overall caloric consumption and alcohol seem to have a positive correlation with
breast cancer risk, although low-fat and high-fiber diets may be only marginally beneficial. Fruit and vegetable intake
is not clearly linked to risk, although fiber may be marginally protective, presumably through estrogen regulation. In
order to lower the risk of developing postmenopausal illness, adult weight gain should be avoided. The greatest potential
impact on overall mortality among survivors comes from diet, not breast cancer-specific events. Woods et al. (2021)
investigated whether prior primary care consultation history and pre-existing individual health status at diagnosis could
explain socioeconomic differences in survival among breast cancer patients. They used linked routine data to conduct a
retrospective cohort study of women aged 15 to 99 years diagnosed in England. They combined ecologically developed
indicators of financial deprivation with individually associated data using the English National Cancer Registry,
Clinical Practice Research Datalink, and Hospital Episodes Statistics databases. These findings show that consultation
history or pre-existing individual health status as defined in primary care have no effect on socioeconomic disparities
in survival. Differences in the effectiveness of treatment, beyond breast surgery and surgery timing, should be evaluated
as part of a larger effort to eliminate inequities in premature mortality.
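As an illustration of the Cox model in Eq. 1, the following minimal Python sketch evaluates the hazard for two patients and shows that their hazard ratio depends only on the coefficients, not on the baseline hazard. The function name cox_hazard, the coefficient values, the baseline hazard values, and the covariate codings are all hypothetical; none of them are taken from the study data.

```python
import math


def cox_hazard(baseline_hazard, betas, xs):
    """Eq. 1: h(t, X) = h0(t) * exp(beta_1*x_1 + ... + beta_p*x_p)."""
    linear_predictor = sum(b * x for b, x in zip(betas, xs))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical coefficients for two binary covariates (illustration only).
betas = [0.5, -1.0]

# Two hypothetical patients who differ only in the first covariate.
patient_a = [1, 0]
patient_b = [0, 0]

# Hypothetical baseline hazard value at some fixed time t.
h0_at_t = 0.01

hazard_a = cox_hazard(h0_at_t, betas, patient_a)
hazard_b = cox_hazard(h0_at_t, betas, patient_b)

# The baseline hazard cancels, so the ratio equals exp(beta_1).
hazard_ratio = hazard_a / hazard_b

# Repeating the calculation with a different baseline gives the same ratio.
hazard_ratio_other_baseline = (
    cox_hazard(0.02, betas, patient_a) / cox_hazard(0.02, betas, patient_b)
)
```

Under these assumed values, a positive coefficient gives a hazard ratio above 1 (higher hazard) and a negative coefficient gives a ratio below 1.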
Materials and methods
This practical study involved 305 breast cancer patients from the Breast Cancer Treatment Center in Sulaimani, the Early
Detection Center of Breast Diseases in the City Hospital in Sulaimani, and other local medical institutions in Sulaimani
over the 9 years from 2011 to 2020; patients' phone numbers were available for future follow-up. Any other
disease-related information was obtained through interviews and direct observation of the patients with their consent,
a process that took approximately 6 months. Patients who survived to the end of the study period, or whose data were not
available after a certain period, were treated as censored. Additionally, an indicator variable for patient status has
a value of 1 when the patient dies and a value of zero when the patient is alive or lost to follow-up. Demographic and
clinical data studied included age, divided into two groups (≤ 50 and > 50 years) because most of the patients in
Kurdistan are 50 years of age or younger, as indicated in Table 1. Weight is divided into two levels, ≤ 65 or > 65 kg.
According to the body mass index (BMI) equation [weight/(height)²], we determined that those weighing more than 65 kg
fall into the overweight or obese category, because most of these patients are overweight and short in stature.
According to the statistics of the Kurdistan Meteorological Directorate, in terms of weather conditions and temperatures,
Kurdistan is divided into two parts, cool areas such as Sulaimani and Halabja, and hot areas such as Kirkuk and Kalar.
The side of the chest on which the tumor lies (right or left) and the type of treatment (chemotherapy, radiation,
hormones), alone or in combination, are also included.
Finally, descriptive statistics were used to determine the percentage of demographic and pathological factors
of breast cancer patients, and Cox regression models were used to estimate the effect of explanatory variables such as
age, patients' occupations, residence, weight, tumor side, and treatment on patient survival time using both the R and
SPSS version 26 programs. The hypotheses are as follows:
H0: There is no relationship between the independent and dependent variables.
H1: The null hypothesis is not true.
Results
Table 1 shows the descriptive statistics for demographic, pathological, and treatment questions. The results show
that the majority of patients with breast cancer are aged 50 years or younger (64.9%), indicating that the majority
of patients diagnosed with breast cancer in Kurdistan are young, while 35.1% are older than 50. Next, the percentage
of patients who are homemakers (unemployed) (80.3%) is higher than that of those who are employed (19.7%), and
the majority of females (82.6%) live in cold climates. Further, the majority of females (72.8%) have a
weight greater than 65 kg, while 27.2% have a weight of 65 kg or less, and 53.1% of patients have right-side tumors.
Additionally, 96.7% of patients received treatment, including chemotherapy, radiotherapy, and hormone therapy, while
some patients did not require surgery because their cancer was found to be in a late stage, and 3.3% of patients with
breast cancer refused surgery and other forms of treatment because they were too old.
Table 1 Descriptive statistics of demographic and pathological factors of breast cancer patients
Variable              Category       Frequency (F)    %
Age (years)           ≤ 50           198              64.9
                      > 50           107              35.1
Place of habitation   Cold           252              82.6
                      Hot            53               17.4
Occupation            Unemployed     245              80.3
                      Employed       60               19.7
Weight (kg)           > 65           222              72.8
                      ≤ 65           83               27.2
Tumor side            Right side     162              53.1
                      Left side      143              46.9
Treatment             Yes            295              96.7
                      No             10               3.3
Figure 1 shows that the percentage of patients surviving or censored (86.89%) is higher than the percentage of
patients who died (13.11%) during this study.
Fig. 1 Descriptive statistics of breast cancer patients regarding survival time
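The coding of the status indicator and the percentages in Table 1 can be checked with a short sketch. The counts below are copied from Table 1; the function names, the status labels, and the three example statuses are hypothetical and serve only to illustrate the 1/0 coding described in the methods.

```python
def event_indicator(status):
    """Return 1 if the patient died, 0 if alive or lost to follow-up (censored)."""
    return 1 if status == "died" else 0


def percentage(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


TOTAL_PATIENTS = 305

# Frequencies as reported in Table 1.
table1_counts = {
    "Age <= 50": 198,
    "Age > 50": 107,
    "Cold": 252,
    "Hot": 53,
    "Unemployed": 245,
    "Employed": 60,
    "Weight > 65": 222,
    "Weight <= 65": 83,
    "Right side": 162,
    "Left side": 143,
    "Treatment yes": 295,
    "Treatment no": 10,
}

table1_percentages = {
    label: percentage(count, TOTAL_PATIENTS)
    for label, count in table1_counts.items()
}

# Hypothetical follow-up statuses for three patients (illustration only).
example_statuses = ["died", "alive", "lost to follow-up"]
example_indicators = [event_indicator(s) for s in example_statuses]
```

Each pair of categories sums to 305, and the recomputed percentages agree with the % column of Table 1.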
A significance test of the following hypotheses is used to evaluate the model's utility.
H0: The explanatory variables have no impact at all on the response variable.
H1: H0 is not true.
Table 2 displays the results of the likelihood ratio test for the fitted Cox regression model under the enter and
forward stepwise likelihood-ratio (LR) methods; the −2 log likelihood values are 400.583 and 403.468, respectively.
Under both the enter and forward LR methods, the model is significant and acceptable because the p-values
(0.000 and 0.018, respectively) are less than the significance level of α = 0.05.
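The p-values quoted for Table 2 can be approximately reproduced from the chi-square statistics with a short sketch. The degrees of freedom are an assumption, since the article does not report them: 6 for the enter method (six covariates entered together) and 1 for the reported forward step. The function names are hypothetical, and only closed-form expressions for the chi-square upper tail are used.

```python
import math


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# Chi-square statistics from Table 2; the df values are assumptions.
p_enter = chi2_upper_tail_even_df(27.292, df=6)
p_forward = chi2_upper_tail_df1(5.563)
```

Under these assumptions, p_enter is far below 0.001, which statistical software commonly displays as 0.000, and p_forward rounds to 0.018.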
Table 2 Likelihood ratio test of postulated model between enter and forward methods
Method              −2 Log likelihood    Chi-square    p-value
Enter               400.583              27.292        0.000
Forward stepwise    403.468              5.563         0.018
On the other hand, the results show that survival is highest shortly after breast cancer diagnosis, because the longer
the time, the lower the chances of survival, especially for patients in advanced stages of the disease. Woods et al.
(2021) confirmed that the risk increases with increasing time, especially in stages III and IV.
The survival probability of patients gradually decreases, as shown in Fig. 2. The survival probability of patients who
spent roughly 1 to 2250 days in the hospital receiving treatments is roughly equal to 0.9, and the survival probability
of patients who spent close to 4000 days receiving treatments tends to zero.
Fig. 2 Survival time according to Cox regression model
Figure 3 clearly indicates that the cumulative hazard increases with time, which leads to an increasing number of
deaths for the patients under the study. Patients who spend approximately more than 2000 days in hospital have a low
chance of survival or have a 0.000 survival rate. It is also clear that patients with a shorter hospital stay have a
very high survival rate.
Fig. 3 Cumulative hazard according to the Cox regression model
Discussion
The independent variables occupation and weight and the dependent variable (time) are shown in Table 3 to have weak
negative statistically significant relationships, whereas the independent variable (treatment) and the dependent
variable had a weak positive statistically significant relationship. However, the independent variables age, location,
and tumor side did not have a statistically significant relationship with the dependent variable (time).
With advancements in screening systems and treatments, breast cancer survival has gradually increased in industrialized
countries, where it currently stands around 85%. In contrast, survival rates in developing countries remain around
50–60%. Akbari et al. (2009) reported 5-year and 10-year survival rates of 81% and 77%, respectively. On the other
hand, there is ongoing debate on how early diagnosis of breast cancer affects a woman's chance of survival. Despite
the fact that younger women (under 50) demonstrated a higher propensity for tumor recurrence following surgery
than did those over 50 years (Aziz and Murad 2020), this study showed that age was not associated with survival of
the patients, with a coefficient of 0.385 and standard error of 0.327. This result agrees with that of many previous
studies.
Table 3 shows that there is no significant difference between the results of the enter and forward LR methods,
because both methods have the same three significant
Table 3 Parameter estimates for all covariates and all transitions using the Cox regression model for enter and forward
stepwise likelihood-ratio (LR) methods
Method       Independent variable    B        Standard error   Wald test   p-value   Exp (B)   95% CI lower   95% CI upper   Correlation coefficient
Enter LR     Age                     0.385    0.327            1.380       0.240     1.469     0.773          2.791          −0.027
             Place of habitation     0.425    0.393            1.166       0.280     1.529     0.707          3.307          −0.079
             Occupation              −2.333   1.014            5.297       0.021     0.097     0.013          0.707          −0.171**
             Weight                  0.747    0.325            5.292       0.021     2.111     1.117          3.989          −0.042*
             Tumor side (location)   −0.189   0.331            0.324       0.569     0.828     0.432          1.585          0.079
             Treatment               1.607    0.549            8.576       0.003     4.990     1.702          14.634         0.155**
Forward LR   Occupation              −2.361   1.013            5.425       0.020     0.094     0.013          0.688
             Weight                  0.784    0.323            5.886       0.015     2.191     1.163          4.128
             Treatment               1.605    0.531            9.133       0.003     4.979     1.758          14.102
B is the regression coefficient; the 95% CI is for Exp (B).
* Correlation is significant at the 0.05 significance level.
** Correlation is significant at the 0.01 significance level.
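The columns of Table 3 are linked by standard formulas: Exp (B) is the exponential of the coefficient, the Wald statistic is (B / SE)², and a 95% confidence interval for Exp (B) is exp(B ± 1.96 × SE). The sketch below recomputes these quantities for the forward LR rows. The use of the normal quantile 1.96 is an assumption about how the software formed the intervals, and the function and variable names are hypothetical.

```python
import math


def wald_summary(b, se, z=1.96):
    """Hazard ratio, Wald statistic, and approximate 95% CI from B and its SE."""
    return {
        "hazard_ratio": math.exp(b),
        "wald": (b / se) ** 2,
        "ci_lower": math.exp(b - z * se),
        "ci_upper": math.exp(b + z * se),
    }


# (B, standard error) for the forward LR rows of Table 3.
forward_lr_rows = {
    "Occupation": (-2.361, 1.013),
    "Weight": (0.784, 0.323),
    "Treatment": (1.605, 0.531),
}

summaries = {
    name: wald_summary(b, se) for name, (b, se) in forward_lr_rows.items()
}
```

The recomputed values match the table to within the rounding of B and the standard error; small differences in the last digit are expected because the published inputs are rounded to three decimals.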
factors, including occupation, weight, and treatment, with slightly different coefficients.
As indicated by the forward LR method, there was a significant association between the occupation of breast cancer
patients and survival, which has been reported previously. A total of 245 patients (housewives) had increased mortality
risk compared to those who were employed. A coefficient of −2.361, standard error of 1.013, and p-value of 0.020
mean that an increase of one unit in the occupation variable of the patient leads to a decrease in survival time by
2.361 units. This underlines the notion that education may lead to greater health awareness, better observation of
breast-related symptoms, and shorter delays in seeking medical attention (Galobardes et al. 2006). Compared to
experienced patients, death rates were greater among housewives and occupational groups with less experience. It is
still unclear whether the difference is due to a delay in diagnosis or a variation in the biology of tumors between
groups with lower levels of education and groups that are more advantaged. Breast cancer diagnosis and treatment delays
were greater for lower social class groups than for higher social class groups, according to a study of cancer patients
in the UK.
Next, the study showed that there was a significant association for patient weight: for those patients with weight
greater than 65 kg, the regression coefficient was 0.784, with standard error of 0.323 and p-value of 0.015, which means
that increasing one unit in patient weight reflected an increase in survival time by 0.784. Moreover, Coates and
colleagues came to the conclusion that decreased risk associated with weight gain in young women was only applicable to
less aggressive cancers (Stephenson and Rose 2003).
This study also indicates that the explanatory variable treatment of patients has a statistically significant effect on
survival time, with a coefficient of 1.605, standard error of 0.531, and p-value of 0.003, which means that
increasing one unit of patient treatment leads to an increase in survival time by 1.605 units.
Furthermore, the study found no significant relationship for people's place of habitation and hot or cold weather,
despite the fact that many studies show that people who live in cold weather have a longer life span. Also, the
explanatory variable tumor side (left or right) had no statistically significant effect on survival time.
Conclusion
1. Breast cancer is currently being diagnosed in Kurdish women in Sulaimani, Iraq, with over 60% of patients
under the age of 50.
2. There are correlations between survival time and not exercising, staying at home, not eating nutritious food,
and eating red meat and fried meals, since a majority of the patients are homemakers.
3. In addition, this study discovered that patients who receive therapy, such as chemotherapy, radiotherapy, and
hormonal therapy alone or in combination, live longer than patients who do not receive therapy, demonstrating
that receiving therapies prolongs survival.
4. Compared to patients who spend more than 4000 days in the hospital, those who stay less time there have the
highest likelihood of survival.
5. There is no doubt that patients who spend more than 2000 days in hospital have a low chance of survival,
meaning that time increases the hazard.
6. Assessing the survival rate is a common method of determining a cancer patient's prognosis, and is of great
concern in planning and evaluation of cancer control measures. The number of survivors is growing as a
result of advancements in breast cancer diagnosis and
treatment methods. In order to increase the quantity and quality of life for breast cancer patients, it is crucial to
determine the variables that may affect survival.
7. Only a small number of centers in developing countries, including Iraq, offer multimodal protocol-based breast
cancer treatments; as a result, most patients with breast cancer receive insufficient care because there are not enough
high-quality options available due to a lack of financial resources.
Author contributions Conceptualization: Hazhar Blbas, Khanda Aziz, and Azheen Tofiq
Methodology: Hazhar Blbas and Khanda Aziz
Data curation: Khanda Aziz
Data analysis: Hazhar Blbas and Khanda Aziz
Interpretation of results: Hazhar Blbas
Writing and editing: Hazhar Blbas, Khanda Aziz, and Azheen Tofiq
Supervision: Hazhar Blbas
Data availability The data used to support the findings of this study are included within the article.
Declarations
Ethics approval All of the participants provided written informed consent for their involvement in the study after the
research procedure was approved by the Ethics Committee at Salahaddin University.
Consent to participate The data for this study were obtained from the archived records in the city of Sulaimani for
305 women who received breast cancer diagnosis.
Conflicts of interest The authors declare that they have no conflicts of interest to report regarding the present study.
References
Akbari ME, Khayamzadeh M, Khoushnevis SJ, Nafisi N, Akbari A (2009) Five and ten years survival in breast cancer
patients mastectomies vs. breast conserving surgeries personal experience
Aroian K, Uddin N, Blbas H (2017) Longitudinal study of stress, social support, and depression in married Arab
immigrant women. Health Care Women Int 38(2):100–117. https://doi.org/10.1080/07399332.2016.1253698
Aziz KG, Murad AG (2020) Analyzing competing factors in Breast cancer by using Multistate model. Halabja. Univ J
5(3):378–397. https://doi.org/10.32410/huj-10334
Blbas HT, Kahwachi WT (2021) A Comparison Between New Modification of Adaptive Nadaraya-Watson Kernel and Classical
Adaptive Nadaraya-Watson Kernel Methods in Nonparametric Regression. Cihan Univ-Erbil Scientific J 5(2):32–37.
https://doi.org/10.24086/cuesj.v5n2y2021.pp32-37
Collett D (2015) Modelling survival data in medical research. CRC press, London
Ebrahimi M, Vahdaninia M, Montazeri A (2002) Risk factors for breast cancer in Iran: a case-control study. Breast
Cancer Res 4(5):1–4. https://doi.org/10.1186/bcr454
Galobardes B, Shaw M, Lawlor DA, Lynch JW, Smith GD (2006) Indicators of socioeconomic position (part 1). J Epidemiol
Commun Health 60(1):7–12
Kleinbaum DG, Klein M (1996) Survival analysis a self-learning text. Springer
Rezaianzadeh A, Peacock J, Reidpath D, Talei A, Hosseini SV, Mehrabani D (2009) Survival analysis of 1148 women
diagnosed with breast cancer in Southern Iran. BMC cancer 9(1):1–1. https://doi.org/10.1186/1471-2407-9-168
Stephenson GD, Rose DP (2003) Breast cancer and obesity: an update. Nutrition Cancer 45(1):1–6.
https://doi.org/10.1207/S15327914NC4501_1
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F (2021) Global cancer statistics 2020:
GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: Cancer J Clin 71(3):209–249
Thomson CA (2012) Diet and breast cancer: understanding risks and benefits. Nutrition Clin Pract 27(5):636–650
Woods LM, Rachet B, Morris M, Bhaskaran K, Coleman MP (2021) Are socio-economic inequalities in breast cancer survival
explained by peri-diagnostic factors? BMC Cancer 21(1):485. https://doi.org/10.1186/s12885-021-08087-x
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a
publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript
version of this article is solely governed by the terms of such publishing agreement and applicable law.