2124 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 15, NO. 4, APRIL 2019

A Knowledge-Based Recommendation System That Includes Sentiment Analysis and Deep Learning

Renata Lopes Rosa, Gisele Maria Schwartz, Wilson Vicente Ruggiero, and Demóstenes Zegarra Rodríguez, Senior Member, IEEE

Abstract—Online social networks provide relevant information on users' opinions about different themes. Thus, applications such as monitoring and recommendation systems (RS) can collect and analyze this data. This paper presents a knowledge-based recommendation system (KBRS), which includes an emotional health monitoring system to detect users with potential psychological disturbances, specifically, depression and stress. Depending on the monitoring results, the KBRS, based on ontologies and sentiment analysis, is activated to send happy, calm, relaxing, or motivational messages to users with psychological disturbances. Also, the solution includes a mechanism to send warning messages to authorized persons in case a depression disturbance is detected by the monitoring system. The detection of sentences with depressive and stressful content is performed through a convolutional neural network and a bidirectional long short-term memory recurrent neural network (RNN); the proposed method reached an accuracy of 0.89 and 0.90 to detect depressed and stressed users, respectively. Experimental results show that the proposed KBRS reached a rating of 94% of very satisfied users, as opposed to 69% reached by an RS that uses neither a sentiment metric nor ontologies. Additionally, subjective test results demonstrated that the proposed solution consumes low memory, processing, and energy from current mobile electronic devices.

Index Terms—Deep learning, knowledge personalization and customization, recommendation system, sentiment analysis, social networks.

Manuscript received June 7, 2018; revised July 24, 2018; accepted August 6, 2018. Date of publication August 24, 2018; date of current version April 3, 2019. Paper no. TII-18-1482. (Corresponding author: Demóstenes Zegarra Rodríguez.)
R. L. Rosa and D. Z. Rodríguez are with the Federal University of Lavras, Lavras MG 37200-000, Brazil (e-mail: [email protected]; [email protected]).
G. M. Schwartz is with the Biosciences Institute of Rio Claro, Universidade Estadual Paulista Júlio de Mesquita Filho, São Paulo 13506-900, Brazil (e-mail: [email protected]).
W. V. Ruggiero is with the Polytechnic School of the University of São Paulo, São Paulo 05508-010, Brazil (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TII.2018.2867174
1551-3203 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The number of active online social network (OSN) users has grown considerably, and some studies indicate there will be 2.95 billion users by the end of 2020 [1]. This high number of users on OSN is mainly due to the increase in the number of mobile devices, such as smartphones and tablets, connected to the Internet. Currently, OSN have become a rich and universal means of expressing opinions and feelings, and they reflect the bad habits or wellness practices of each user. In recent years, the analysis of the messages posted on OSN has been used by many applications [2], [3] in the industry of health care informatics.

The sentiments and emotions expressed in the messages posted on OSN provide clues to different aspects of the behavior of users; for instance, sentences containing words with negative meaning may indicate sadness, stress, or dissatisfaction [4]. Conversely, it can be inferred that if a person is in a positive mood state, this person can be more self-confident and emotionally stable [5]. Users have different behaviors on OSN: if the sentiment intensity value of posted sentences remains at low levels, or if it frequently changes from high to low levels and vice versa, these facts can indicate some emotional disturbance, such as depression or stress events [6]. Hancock et al. [7] and Liu [8] observed that users write short sentences when they are experiencing a period of depression. Also, these users use the first person pronoun in their sentences and suffer from chronic insomnia. Therefore, their behavior can be reflected in the sentences posted on OSN. The presence of certain words in the sentences can be monitored and analyzed to identify users at a high risk of attempting suicide, so that an appropriate intervention can take place [9].

Depression is one of the most prevalent mental disorders in all regions and cultures around the world [10]. Unfortunately, the depression recognition rate remains low. Most of the studies about health systems [11]–[13] use sensor devices to detect mental disorders. Berbano et al. [14] proposed a classifier that is trained using electroencephalogram signals and is able to detect stress with an average accuracy of 80.45% using fourfold cross validation. Ham et al. [15] used heart rate variability data to propose a classification model that considers different stress levels (baseline, mild stress, and severe stress), reaching accuracy values of 74%, 81%, and 82%, respectively.

There is a scarce number of studies that use textual information from OSN data to detect physiological disorders. Xue et al. [16] use different machine learning (ML) classifiers to perform emotion classification focused on psychological disorders from microblogs, reaching an average accuracy of 80%.

Authorized licensed use limited to: National Central University. Downloaded on September 22,2023 at 03:03:03 UTC from IEEE Xplore. Restrictions apply.
ROSA et al.: KNOWLEDGE-BASED RECOMMENDATION SYSTEM THAT INCLUDES SENTIMENT ANALYSIS AND DEEP LEARNING 2125

The model proposed by Tsugawa et al. [17] to detect stress based on information from Twitter activities reached an accuracy of 69%. Choudhury et al. [18] studied the causes of postpartum depression using OSN information. ML algorithms are also used in studies about mood monitoring systems analyzing messages from OSN [19], reaching an accuracy of 57%. Ma and Hovy [20] introduce a network architecture to analyze sentence meaning through character-level representations by using a combination of long short-term memory (LSTM), a convolutional neural network (CNN), and a conditional random field (CRF). Lample et al. [21] combine recurrent neural networks (RNNs) with CRFs to obtain the best results on named entity recognition (NER) datasets. A bidirectional LSTM (BLSTM), an improved version of the LSTM, is also used for labeling tasks.

The deep learning approach has been explored in several areas [22], such as personality analysis [23], age group classification on OSN [24], and sentiment analysis [25], among others. However, this approach is not widely explored in psychopathology studies. In this context, our study intends to test the performance of deep learning algorithms in the detection of depressed, stressed, and nondepressed and nonstressed users.

A recommendation system (RS) application can be used as a method to enhance the user's emotional health, improving the person's mood in case of negative emotional states [26]. RS based on ontology is being used for health purposes [27], presenting reliable results for disease treatment plans.

In this context, the main goal of this paper is to introduce an RS that uses an approach named knowledge-based recommendation system (KBRS), which aggregates an ontology collection for health scenarios, named Nuadu [28], which is not addressed in other RSs designed to improve emotional health. The proposed KBRS also includes the sentiment analysis approach and an emotional health monitoring system. The monitoring system filters sentences from an OSN, which allows it to identify potential users with depression or stress conditions. To accomplish this task, an objective method based on a BLSTM-RNN, along with a CNN, is used to detect potential psychological disorders. Later, the KBRS is activated to send happy, calm, relaxing, or motivational messages to these users. These messages have different intensity levels depending on the sentiment intensity of the sentences posted on an OSN, which is determined by an enhanced sentiment analysis metric, eSM2. This proposed metric is based on a word-dictionary, considering the Portuguese language, and enhanced with additional information such as the user's profile data, the user's geographic location, and the theme of the sentence. Furthermore, in the cases of depression detection, the solution sends warning messages to authorized people who are previously registered in the system. According to the subjective test results, users reported high satisfaction with the KBRS, improving their emotional states. Tests were also performed with a traditional RS without the Nuadu ontology and the eSM2, to compare it with the proposed KBRS. Furthermore, subjective tests reported that the application running on the user's mobile electronic device had low complexity and low power consumption.

In short, this paper proposes the following.
1) An innovative solution to monitor and to detect potential users with emotional disturbances, based on the classification of sentences with depressive or stressed content. A CNN is used for character-level representation and a BLSTM-RNN for the disorder entity recognition.
2) An improved RS that incorporates a personalized sentiment metric, named eSM2, and a health ontology approach, specifically the Nuadu ontology.
3) An enhanced sentiment metric. It is studied and validated that the sentiment metric performance is improved by incorporating the user's profile information, the geographic location, and the theme of the sentence.
4) An application installed on a mobile device, which is easy to use and consumes low memory, processing, and energy.

The remainder of this paper is structured as follows. Section II presents the related work about RS, emotional health monitoring systems, the machine learning and deep learning approach, and sentiment and affective analysis. Section III presents the methodology of the proposed RS, using depression and stress detection by machine learning with the CNN, BLSTM-RNN model and the eSM2 sentiment metric for mood assessment. In Section IV, the experimental results are presented along with discussions. Finally, Section V presents the conclusions and outlines the future work.

II. RELATED WORK

A. Sentiment and Affective Analysis

Sentiment analysis helps industries to formulate marketing strategies, support after-sale services [29], and develop health monitoring systems and RS [3], among other services.

Sentiment analysis can be performed by:
1) machine learning [30];
2) a lexicon-based technique using a word-dictionary of textual information or a corpus-based approach, in which the polarity value is computed based on the occurrences of the terms in the corpus; and
3) a hybrid technique, which combines machine learning and word-dictionary approaches.

The machine learning approach needs a large amount of data to obtain reliable results from sentiments; for instance, Chen et al. [31] perform the machine learning approach with a neural network model using BiLSTM-CRF and CNN, using 14 492 sentences in the training phase.

The lexicon-based technique uses an intensity scale with emotional words; examples of word-dictionaries are WordNet, Sentimeter-Br2 [32], [33], and eSM [3]. This approach is used in this research to perform sentiment analysis.

Sentimeter-Br2 is a word-dictionary with its respective sentiment intensity (positive or negative words), considering n-grams, verbal tenses, and adverbs. The sentiment intensity value of an S-sentence, using the Sentimeter-Br2, is calculated as follows:

$$\text{Sentimeter-Br2}(S) = \frac{S_U + S_B + S_T}{k + p + q + r} \tag{1}$$

where Sentimeter-Br2(S) is the global sentiment intensity of the S-sentence; $k$ is related to the sentence tense, with $k = 1$,
if the sentence has a verb in the past participle, and $k = 0$ if the sentence is in another tense or does not have a verb; $p$ is the total number of unigrams in the S-sentence, with the exception of words with no sentiment intensity value (stopwords); $q$ is the total number of bigrams; $r$ is the total number of trigrams; $S_T$ represents the sentiment score of a trigram; $S_B$ is the sentiment score of a bigram; and $S_U$ represents the sentiment score of a unigram.

The eSM is an enhanced sentiment metric based on Sentimeter-Br2 that additionally considers age, gender, and educational level, which are obtained from the user profile. The sentiment intensity of an S-sentence using eSM is given by (2). This relation was obtained from subjective test results, in which each person posted sentences on social networks. Then, these sentences were scored by both the same person who posted the sentences and the eSM relation. The eSM formulation shows the relation between the sentiment intensity and the user profile characteristics

$$\text{eSM}(S) = \text{Sentimeter-Br2}(S) \cdot C \cdot \exp\left(a_1 A_1 + \cdots + a_n A_n + g_1 M + g_2 F + e_1 G + e_2\, nG\right) \tag{2}$$

where $C$ represents a scale constant; $a_1, \ldots, a_n$ represent binary factors related to age ranges (if one of them is equal to one, the others are zero); $A_1, \ldots, A_n$ are the weight factors of each age range, considering four ranges; $g_1$ and $g_2$ are binary factors related to gender; $M$ and $F$ are the weight factors of gender, man or woman, respectively; $e_1$ and $e_2$ represent binary factors related to educational level (higher education or not); and $G$ and $nG$ are the weight factors of educational level, higher education or not, respectively.

Huffaker and Calvert [34] showed that teenagers behave differently in blogs, observing some peculiarities in the writing style. In our study, the parameters are extended to be used in a sentiment metric; specifically, the geographic location of the user and the theme of the sentence captured on an OSN.

Sentiment analysis is related to positive, negative, or neutral classification. Affective analysis is not limited to three sentiment polarities because it considers the different emotions, such as sadness and anger; although they have the same sentiment polarity (negative), the emotions are totally different. Emoticons, icons for representing emotions [35], and expressions such as "LOL" (laughing out loud) carry a sentiment and affective meaning, and are commonly found in sentences extracted from OSN. Thus, affective analysis is used for measuring the person's emotion.

In sentiment and affective analysis, there are some points to be explored, such as whether the user profile influences the sentiment metric performance, which characteristics must be considered, and how to perform the association between the user's profile and the sentiment metric. For instance, the sentiment intensity determined by a metric can change depending on the gender [36].

It is worth noting that, currently, few studies about lexicon-based metrics take into account profile parameters. In this research, our proposed sentiment metric, eSM2, complements the eSM by considering the user's geographic location and the theme of the sentence.

B. Recommendation System

An RS predicts useful items for the user, considering what the user may be interested in. For this prediction, some data are extracted, for example, the user's profile, the user's preferences, and past behavior [37].

There are commonly three RS approaches: content based, collaborative filtering, and hybrid based. The content-based approach works with the description of an item and the profile of the user's preference; the suggestion of items is based on what the user already liked. Collaborative filtering analyzes the user's behavior and preferences and explores similar preferences among people [38]. The hybrid approach combines both methods.

Traditional RSs are developed based only on words searched by the user [39] on the Internet, but many of these words could have been searched too far back, and the search would therefore not represent current information.

Sentiment analysis began to be explored in RS [40] to suggest more up-to-date contents based on the person's mood. The semantic technique is based on some knowledge base defined as an ontology, considering the user's opinions to complete the lack of information through inferences [41]. As stated before, a KBRS uses a knowledge base and offers many benefits [41], one of which is the treatment of the cold-start problem.

This paper proposes a KBRS that incorporates a sentiment metric, which is the difference from the aforementioned systems. For performance comparison purposes, tests were performed using both a content-based traditional RS and the proposed KBRS.

C. Emotional Health Monitoring System

Emotional disorder problems, more specifically depression and stress disorders, need continuous monitoring. Commonly, due to the unpredictable behavior of these disorders, the mood assessment is captured by traditional standard procedures through rating scales and questionnaires [42]. Also, there are solutions to treat human body signals, such as PSYCHE [43], which is a sentiment analysis solution based on portable sensing devices. The voice signal has also been studied for emotional assessment [44], along with attitudinal indicators, such as sleep quality, galvanic skin response, activity, and gesture.

Studies in psychiatry [45]–[47] found that linguistic styles can indicate depression disorders. Nguyen et al. [48] used the lexicon approach to detect the most used words in a group of depressive users. Anger, according to [6], [49], indicates a negative emotion, and it can represent stress disorders. These studies perform textual and linguistic feature analysis, but they are not explored in a health monitoring system. The advantage of textual analysis is that this technique does not need special and specific equipment; thus, it is a cheaper solution.

The mental health of a person can be reflected in his/her mood. There are applications to improve the user's optimism [50], which log self-reported mood, in which the mood charts
have been recommended by psychiatrists and therapists, and clients monitor their own mental health. Projects for depression detection have been implemented by Wang et al. [51], reaching an accuracy around 80% through a model based on sentiment analysis in microblogs. However, the research by Wang et al. [51] and [52] focuses only on depression monitoring, and an RS is not implemented.

As stated before, OSNs have an enormous volume of data that can be used for monitoring the users' mental health. Providing continuous emotional monitoring at home, job, or school/university is a hard task; therefore, OSN is a way of capturing the users' emotions in these environments. One of the great challenges of related works is to propose a solution that gets a high accuracy for the emotional disorder detection. This paper intends to confront this problem.

D. Machine Learning and Deep Learning Approach

ML is a useful technique to conduct classification [53], statistical analysis, feature selection, and data normalization. ML addresses classification problems in general, including affective analysis and the mood detection problem. This technique uses algorithms to perform a supervised or unsupervised method.

The word2vec [54] is a popular approach for capturing words to send to an ML algorithm. It allows modeling words as vectors, and it is based on the skip-gram and continuous bag-of-words (CBOW) models to compute the distributed representations of words. The CBOW model predicts a word from its context, and the skip-gram model predicts the context given a word. The work described in [55] presents a sentiment-driven and a standard embedding associated with a variety of pooling functions to extract the sentiment of Twitter comments.

ML is frequently used to perform emotion classification [56], which is based on six different classes, representing the "Ekman" model of emotion [57]: anger, disgust, fear, joy, sadness, and surprise. Stress and depressive messages are detected by negative sentences that contain one or more of these emotion classes.

The SVM algorithm has been widely used for emotion classification tasks, with good generalization properties [58]. SVM is used for classification with several feature selection techniques in the depression detection context [59]. The SMO algorithm is used to train SVM; SMO was the most accurate for predicting depression among senior citizens [60].

Random Forest and Naïve Bayes algorithms are also used for sentiment analysis, for detecting psychological disturbances on OSN; the results present an accuracy of about 0.8 [16].

Deep learning algorithms have been used to extract sentiment features in conjunction with semantic features [25].

Recent studies using deep CNNs [61], [62] presented significant performance improvement in natural language processing (NLP) tasks. Collobert et al. [61] used word embeddings in a CNN to solve NLP problems, such as part-of-speech tagging and semantic labeling.

For NER problems, which involve identifying noun phrases in classes, the solution used for the Vietnamese language [63] uses a deep learning method composed of a combination of BLSTM, CNN, and CRF models, reaching an F1-score of 88.59%. BLSTM works with labeling tasks considering past and future contexts.

In this paper, CNNs perform the character-level representations to feed a BLSTM based on RNN, which performs the entity representations. Another BLSTM-RNN, stacked on the first BLSTM-RNN, performs the relation between entities. The main goal of the study is to improve the classification accuracy of sentences posted by users that present depression and stress disorders.

III. METHODOLOGY

This section presents the methodology to build the proposed KBRS solution. The first step in this research was to conduct subjective tests to help determine the model of the proposed KBRS. Later, each component of the KBRS is explained.

A. Subjective Tests

Subjective tests were performed in a laboratory environment to determine the eSM2 metric parameters. The tests helped to establish users' preferences regarding the kind of messages they would like to receive, and to evaluate the KBRS resources consumed in the electronic device. Finally, a remote method was used to validate the performance of the monitoring system and the RS.

Assessors were selected to answer questions about their emotional state and write sentences on the social network Facebook. Some of the users were diagnosed with problems of acute stress and mild to moderate depression level, according to their clinical history.

The sentences were remotely extracted from an OSN with negative and positive nouns, adjectives, and verbs. The sentences were analyzed by the machine learning algorithms, including the CNN, BLSTM-RNN model, for classification of sentences with depressive, stress, and nondepressive and nonstress content. It is important to highlight that the assessors were instructed to write sentences on the OSN only if they were feeling motivated to do so, simulating a real situation of writing posts in their daily routine.

The tests with the assessors used for the sentiment metric were performed in two phases, face-to-face and remote subjective tests. The face-to-face data collection was conducted in a laboratory environment to find which parameters of the user's profile could affect the sentiment intensity value of a sentence. The collection task also helped to evaluate the performance of the monitoring system. The remote subjective method was performed to validate the proposed sentiment metric. Thus, the initial task in a laboratory helped to prove the initial hypothesis, and the remote method validated it.

TABLE I
PROFILES ANALYZED ON SOCIAL NETWORK AND RESPECTIVE VALUES

The data collection process was performed by Portuguese native-speaking assessors. The 146 assessors, comprising 74 men and 72 women with ages ranging from 18 to 43 years old, had different profiles, such as region of birth (north, south, and southeast of Brazil) and educational level, among other characteristics, presented in Table I. The ages were separated into three groups (A1, A2, and A3) and the sentences were separated into three themes (entertainment, work or study, and family).
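
As a minimal sketch of how these profile attributes could feed the eSM relation in (2), the following Python example encodes a profile as binary factors and applies the exponential correction to a Sentimeter-Br2 score. The record type `AssessorProfile`, the helper names, and every weight value are assumptions made for illustration; the fitted weights are not listed in this text, and the sketch uses the three age groups of this section rather than a fixed number of ranges.

```python
import math
from dataclasses import dataclass

# Placeholder weights (assumptions): the fitted values are not given in
# this text, so these numbers only make the sketch runnable.
AGE_WEIGHTS = {"A1": 0.10, "A2": 0.00, "A3": -0.10}        # A_1 .. A_n
GENDER_WEIGHTS = {"man": -0.05, "woman": 0.05}              # M, F
EDUCATION_WEIGHTS = {"higher": 0.02, "not_higher": -0.02}   # G, nG
SCALE_C = 1.0                                               # C


@dataclass(frozen=True)
class AssessorProfile:
    """Hypothetical record for the profile attributes used by eSM."""
    age_group: str   # "A1", "A2" or "A3"
    gender: str      # "man" or "woman"
    education: str   # "higher" or "not_higher"


def binary_factors(value, weights):
    """Return one binary factor per category: 1 for `value`, 0 otherwise."""
    if value not in weights:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if category == value else 0 for category in weights]


def profile_exponent(profile):
    """Sum of factor * weight over age, gender and education, as in (2)."""
    total = 0.0
    for value, weights in (
        (profile.age_group, AGE_WEIGHTS),
        (profile.gender, GENDER_WEIGHTS),
        (profile.education, EDUCATION_WEIGHTS),
    ):
        factors = binary_factors(value, weights)
        total += sum(f * w for f, w in zip(factors, weights.values()))
    return total


def esm(base_score, profile):
    """Scale a Sentimeter-Br2 score by C * exp(profile exponent)."""
    return base_score * SCALE_C * math.exp(profile_exponent(profile))


profile = AssessorProfile(age_group="A1", gender="woman", education="higher")
print(binary_factors(profile.age_group, AGE_WEIGHTS))  # [1, 0, 0]
print(esm(2.0, profile))
```

Because exactly one factor per attribute equals one, the exponent reduces to the sum of the three weights selected by the profile, so the correction is a single multiplicative factor applied to the base score.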
A web interface questionnaire was presented to assessors with a question about which words could identify their emotional state, in stressful conditions or at moments of depression. Furthermore, another question was presented about what kind of message the user would prefer in depression and stress conditions (happy, calm, relaxing, or motivational messages), out of which the person could choose one or two kinds of messages. The kinds of messages chosen by the users are used in the RS. Additionally, users indicated the number of messages and the period of day when they would like to receive the messages.

The questionnaire helped to determine which characteristics of a person could affect a sentiment metric. Later, the assessors wrote sentences on OSN, which were captured by a script. The sentences were classified by both the person who wrote the sentences and the Sentimeter-Br2 metric; each sentence was classified on a sentiment scale from −5 to +5.

In the second phase, the assessors' messages were remotely monitored over a 5-week test period; all the sentences captured were also analyzed by both the Sentimeter-Br2 sentiment metric and the assessor who posted the sentences. At the end of the 5-week period, the assessors came back to the laboratory to test the performance of the KBRS on a standard mobile phone. In total, 27 308 sentences were extracted from the OSN and evaluated by the sentiment metric. A sentiment correction factor based on the user's profile was modeled using the subjective test results. This modeled correction factor can be applied to traditional sentiment metrics.

The update of the Sentimeter-Br2 dictionary, considering new slang and expressions, was performed by specialists, who added new words and their respective scores.

B. Proposed Knowledge-Based Recommendation System

The KBRS contains the emotional health monitoring system, which uses the deep learning model and the sentiment metric named eSM2. A high-level view of the proposed KBRS architecture is shown in Fig. 1.

Fig. 1. Architecture of the proposed KBRS, which considers sentences extracted from social networks.

The sentences are extracted from an OSN, as shown in Fig. 1. The emotional health monitoring system identifies which sentences present stress or depression content using machine learning algorithms and the emotion of the sentence content. The monitoring system is able to send warning messages to people that are previously registered. Later, the selected sentences are analyzed by the sentiment metric (eSM2) and the sentiment intensity is used as input of the recommendation engine. The KBRS server establishes a communication with the KBRS client application, in which the user receives a specific message according to his/her profile, ontology aspects, and the sentiment value calculated from his/her sentences extracted from OSNs.

The system consists of the following components.
1) User profile and user data: database built from the data captured from OSNs.
2) Messages: there is a database with 360 messages, 90 messages for each kind (relaxing, motivational, happy, or calm messages), to be suggested to the user by the recommendation engine. The users can previously choose one or two kinds of messages to receive when they undergo a period of stress or depression. The messages were written by three specialists in psychology and validated by three other specialists.
3) Depression or stress detection by machine learning: the sentences are extracted from OSN and they are filtered by machine learning to detect depression or stress conditions. It is implemented in the emotional health monitoring system.
4) Sentiment analysis by eSM2: the sentences are filtered and scored by the eSM2 sentiment metric, from −5 to +5. This range was tested and validated in previous studies [3], [64]. The sentiment intensity of the sentence will determine the message intensity. There are three levels of messages: extreme, intermediate, and lower. If the monitoring system detects that the user is very or deeply stressed, a very positive message is sent to him or her. The sentiment intensity range and the respective message intensity levels are presented in Table II; the message intensity levels were determined according to the users' opinions. Examples of very positive messages include intensity
adverbs, such as much, very, strongly, among others. Also, the kind of message corresponds to the preference defined in the tests in the first phase.

TABLE II
SENTIMENT INTENSITY RANGE AND RESPECTIVE MESSAGE INTENSITY LEVEL

5) Ontologies: mechanism of classes, objects, and relationships used for the recommendation engine. The ontologies are expressed in the Web Ontology Language (OWL). The data of each class can be extracted from OSN. The Nuadu ontology collection is used for health scenarios. In this paper, the following Nuadu classes were used.
a) Personal ontology: describes the personal information stored for each individual, including gender, work, studies, and other preferences.
b) Activity ontology: describes information about the person’s activities. The entries may indicate changes in the routine of the person.
c) Sleep ontology: describes users’ routines and schedules.
d) Risk ontology: describes information about smoking and alcohol consumption; these are unhealthy habits that indicate a tendency toward greater stress and disease development.
e) Context ontology: describes the environment of the person (home, study, work, or travel). This information is important because it can explain a period of missing activity entries or a change in sleeping times.
6) Recommendation engine: mechanism responsible for generating a list of recommendations.

In the proposed system, users’ personal information and context information on Facebook are used. However, users do not always post this information on their Facebook account. If users do not post personal information, default information is used, such as a sleep routine of 8 h, no unhealthy habits, and no preferences about work or study. It is important to note that in our tests only 5% of the users did not post this information.

A traditional RS is also implemented, in which only the words searched by a person on an OSN are used to feed the system, forming a content-based RS. For the sake of simplicity, the traditional content-based RS will not be explained in this section.

C. Emotional Health Monitoring System by Machine Learning Using the BLSTM-RNN Model

In the emotional health monitoring system, users’ sentences are extracted from an OSN and filtered through the machine learning approach. If a stress or depression disorder is identified, the KBRS is activated.

In the solution, the main characteristics of depression or stress are short texts, negative emotions, low values of sentiment intensity, and use of the first-person pronoun. If depressive sentences are detected by machine learning, warning messages are sent through different options (voice message or e-mail). These messages are only sent to authorized people who were previously registered in the system.

The machine learning approach was used to identify sentences with depressive, stress, and nondepressive and nonstress content extracted from an OSN. These sentences were filtered using models containing expressions and emoticons that indicate stress and depression, such as “hate my life,” “feeling sad,” “I am stressed,” among others. Positive emotional sentences were also filtered to improve the classification of depression versus nondepression sentences, thereby decreasing the false positives in depression detection. In order to distinguish depression and stress conditions, the emotion of the sentence content is considered. Thus, sentences related to anger, disgust, and surprise emotions are associated with stress [6], [49], and depression is associated with fear and sadness emotions [48].

The dataset used to classify stress, depression, and nonstress and nondepression expressions in the training phase was built using sentences written by 146 assessors on an OSN. In total, 27 308 labeled Facebook messages were used, of which 23.700% and 26.197% correspond to depression and stress sentences, respectively, and 50.103% are related to nonstress and nondepression sentences. It is worth noting that an additional 146 assessors participated in the testing phase.

The CNN with BLSTM-RNN model, SMO, random forest, and Naïve Bayesian classification were used in this paper. Studies [65] present good results using approaches such as the hidden Markov model and the Gaussian mixture model. However, preliminary tests in our work presented low values of accuracy and F1-score using these approaches; therefore, they were not used in further tests.

This research used the Theano library [66] to implement the deep learning architecture and the other algorithms. In all the experiments, tenfold cross validation was used to test and to evaluate the accuracy of the stress or depression detection by machine learning techniques. The classification was performed on binary attributes: depressed/nondepressed and stressed/nonstressed sentences.

In the deep learning architecture, the CNN computes the character-level representation with characters serving as inputs. The convolutional kernel of the CNN performs the convolutions for the characters of the words; for each convolution i, the kernel output ko is computed as follows:

$$ko_i = \mathrm{htaf}(M_i \, rc_i + b_i) \qquad (3)$$

where $M_i$ is the parameter matrix, $b_i$ is the learned bias vector, htaf is the hyperbolic tangent activation function (HTAF), and $rc_i$ is the character-level representation of word $i$.
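
As a rough illustration of (3), the following Python sketch applies a hyperbolic tangent to an affine map of character windows. It is not the authors' Theano implementation: the function names `char_conv` and `max_pool`, the sliding-window construction of rc_i, the window width, the max-pooling step, and the toy vectors are all assumptions made only for this example.

```python
import math

def char_conv(char_vectors, M, b, width=2):
    """Return ko_i = tanh(M * rc_i + b) for every window of `width` characters.

    char_vectors: list of per-character embedding vectors (hypothetical inputs).
    M: parameter matrix given as a list of rows; b: bias vector.
    """
    outputs = []
    for start in range(len(char_vectors) - width + 1):
        # rc_i: concatenation of the character vectors inside the current window
        rc = [x for vec in char_vectors[start:start + width] for x in vec]
        # One output per row of M: tanh of the affine map in (3)
        ko = [math.tanh(sum(m * x for m, x in zip(row, rc)) + bias)
              for row, bias in zip(M, b)]
        outputs.append(ko)
    return outputs

def max_pool(outputs):
    """Element-wise maximum over all windows (an assumed pooling step)."""
    return [max(column) for column in zip(*outputs)]

# Toy example: three characters with 2-dimensional embeddings, window width 2.
chars = [[0.5, 0.0], [0.0, 1.0], [-0.5, 0.2]]
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
b = [0.0, 0.0]

kernel_outputs = char_conv(chars, M, b)         # one vector per character window
word_representation = max_pool(kernel_outputs)  # fixed-size vector for the word
print(kernel_outputs)
print(word_representation)
```

Because tanh is bounded, every kernel output lies in (-1, 1) regardless of the word length, and the pooling step (if one is used) yields a vector whose size depends only on the number of rows of M.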


Fig. 2. Mechanism of the BLSTM-RNN method for classifying the relation of the sentence “imposition causes stress”.

In HTAF, each network layer presents a bias node that is connected to all other nodes. The HTAF is used for the hidden and output layers for calculating the backpropagated error signal of the deep learning architecture.

The Softmax layer is responsible for finding the probability P of the relation labels, according to (4). The hidden activation layer h is derived from the input characters

$$P = \mathrm{softmax}(M_i h + b_i). \qquad (4)$$

Another test was performed using the SVM classifier instead of the Softmax function for performance comparison purposes.

Fig. 2 presents the neural network model, which used the output vectors of BLSTM to feed the HTAF layer; the character-level representation, along with the word embedding vector, served to feed the BLSTM-RNNs. The output vectors of BLSTM are sent to the DE layer to choose the label sequence; the DE represents the disease extraction (stress or depression) by the Softmax output layer.

The hidden states ha, hb account for capturing information in the next steps in the direct and reverse directions. The LSTM output performs the bottom-up (↑ h) and top-down (↓ h) computation; the bottom-up computation captures the information in the current step and in the previous steps of the neural network model, and the top-down computation calculates the information using the reversed inputs. The direct and reverse directions are performed until the disease extraction is reached. An example of a captured sentence is “imposition causes stress,” in which the subject of the sentence (imposition) represents a1 and the object of the sentence (stress) represents b1; the y parameter represents the input vector in an LSTM unit, in which y1 is the first word, y2 is the second word, and y3 is the third word. The DE layer generates the label stress/nonstress in the example of Fig. 2.

The tests used a batch size of 10, momentum equal to 0.8, a learning rate of 0.01, 50 epochs, and a dropout rate of 0.5. These values were chosen after experiments with different values until the best results were reached.

The F1-score, accuracy, and precision-recall area under the curve (PR AUC) [67] were used as performance metrics to compare the results obtained by different machine learning algorithms. The F1-score is the harmonic mean of recall and precision. The PR AUC is commonly used in cases of unbalanced data.

D. eSM2 Sentiment Metric

The correction factor used herein depends on the age range, gender, educational level, and geographic location of the person, as well as the theme of the sentence. These data were obtained from 146 volunteer assessors. They could also provide this information on the application interface. Therefore, the user had the right to make their data available or not, in accordance with the privacy policies. Also, 29 assessors were identified with a depressive profile, and only these assessors were requested to provide the contacts of authorized people who would receive warning messages.

The correction factor applied to sentiment analysis based on a user profile is necessary because a sentence can be scored with different sentiment values, depending on the user’s characteristics and the theme. The parameters age, gender, educational level, and geographic location were considered because they were identified in preliminary tests as the most influential factors on the global score obtained by a sentiment metric. Also, these factors can be extracted from OSN using PHP and AJAX. The theme is identified by an automatic script based on keywords.

The proposed sentiment intensity metric, named eSM2, is introduced in (5) for a sentence S. The eSM2 is based on Sentimeter-Br2 and a correction factor that uses different parameters related to the user profile. Second- and third-order polynomial functions were also tested, but the exponential function obtained the best performance:

$$\mathrm{eSM2}(S) = \text{Sentimeter-Br2}(S) \cdot C \cdot \exp(a_1 A_1 + \cdots + a_n A_n + g_1 M + g_2 F + e_1 E_1 + \cdots + e_n E_n + t_1 T_1 + \cdots + t_n T_n + l_1 L_1 + \cdots + l_n L_n) \qquad (5)$$

where

$C$: scale constant;
$a_1 \ldots a_n$: binary factors related to age ranges; if one of them is equal to one, the others are zero. $A_1 \ldots A_n$ are the weight factors of each age range; this work considered three ranges;
$g_1$ and $g_2$: binary factors related to gender; if one of them is equal to one, the other is zero. $M$ and $F$ are the weight factors of gender, man or woman, respectively;
$t_1 \ldots t_n$: binary factors related to themes; if one of them is equal to one, the others are zero. $T_1 \ldots T_n$ are the weight factors of each theme. This paper considered three themes (entertainment, work/studies, and family);


$l_1 \ldots l_n$: binary factors related to geographic location; if one of them is equal to one, the others are zero. $L_1 \ldots L_n$ are the weight factors of each geographic location; this paper considered three locations in Brazil (north, south, and southeast);
$e_1 \ldots e_n$: binary factors related to educational level; if one of them is equal to one, the others are zero; and $E_1 \ldots E_n$ are the weight factors of each educational level. This paper considered three educational levels (higher education, bachelor or equivalent, master or superior); these levels were chosen according to UNESCO [68] educational levels.

Each assessor evaluated the sentiment value of 20 sentences in the laboratory to measure the performance results of the eSM2. With this information, each evaluated sentence represents an equation with 15 unknown variables; thus, in total, there are 2920 equations and an overdetermined system is obtained. To solve this problem, each equation is first linearized, and then the least squares method, specifically a pseudoinverse method, is used.

The following statistical functions were considered for the performance assessment of (5): the root-mean-square error (RMSE), the maximum error, and the average error.

E. KBRS Client-Server Architecture

The KBRS client and server applications are described as follows.

An application is installed on the mobile device using the client-server architecture through a web application in PHP: Hypertext Preprocessor (PHP), JavaScript Object Notation (JSON), and HTML5.

As stated before, the first step towards using the emotional health monitoring system and the KBRS is to access the application interface running on the user device and to grant the necessary permissions for the application to extract the user’s profile information.

On the other hand, a Web server stores information such as the user profile information and the message databases. Also, different tasks are performed in the Web server, such as the recommendation engine, the sentiment analysis metric, and the emotion estimation of each analyzed sentence. It is important to note that the user’s device is not overloaded because this device is solely used to receive messages.

The sentences and user profile are extracted periodically from the Facebook social network by an automatic script. The depression or stress detection is performed by machine learning, and the filtered sentences are scored by the eSM2 sentiment metric to find the sentiment intensity.

Four types of messages, which can vary in sentiment intensity level and in ontology use, are sent to the users’ application, considering the kind of messages chosen by each user. In the first type of message, the KBRS uses sentiment analysis with the eSM2 metric; in the second one, the KBRS uses a sentiment metric and no ontology; the third message does not contain the sentiment metric but it contains the ontology. The last message contains neither the sentiment metric nor the ontology. If the user does not write a sentence on an OSN, a default positive message is sent to the user’s application client side.

IV. EXPERIMENTAL RESULTS

This section describes the experimental results regarding the performance evaluation of the emotional health monitoring system, the definition of the eSM2 sentiment metric parameters, and the performance evaluation of the proposed KBRS.

A. Performance Evaluation of the Emotional Health Monitoring System

As stated before, 27 308 labeled messages were used in the training phase. Preliminary test results showed that using 75% of these messages presented similar classification accuracy to using 100% of them, which indicates that the data used are sufficient to train the deep learning model. In this phase, the CNN BLSTM-RNN using Softmax obtained the best performance considering all the performance assessment parameters.

In the experiments performed in the testing phase, an additional 146 assessors participated in subjective tests to evaluate the proposed model. A total of 25 192 sentences were extracted from an OSN over five weeks.

Table III shows the performance of the machine learning algorithms used in the testing phase. Specifically, Table III presents the F1-score, accuracy, and PR AUC of the classification of depressed, stressed, and nondepressed and nonstressed sentences.

The CNN BLSTM-RNN using Softmax had the best performance, according to Table III. The results presented an accuracy of 0.89, 0.90, and 0.93 to detect depressed, stressed, and nonstressed and nondepressed sentences, respectively.

The Softmax presented the best results because it is optimized for back-propagation networks [69].

In order to evaluate the performance of the classifier methodology, participants were monitored, and after each day during the test period, they classified their mood with an emotion icon (emoticon).

In the case of users with a depressive profile, 100% of the participants considered it useful to send a notification message to authorized persons. These messages were sent by e-mail and short-message services.

It is important to note that the results obtained in the present work reached a stress and depression classification accuracy higher than 88%, outperforming the results obtained by similar works [11]–[17], [19].

B. Definition of the eSM2 Sentiment Metric Parameters

Fig. 3 presents the average weight values of each parameter considered in the eSM2 model and obtained from the experimental test results.

The results show that gender, age range, geographic location, and theme parameters are the most influential factors in a sentiment metric. It is important to note that the application extracts user characteristics only if the user allows it. Also, the application does not extract characteristics such as religion and race; therefore, ethical standards are not violated.

Table IV presents the performance assessment of eSM2 considering RMSE, maximum error and average error in relation


to the sentiment intensity scored by the assessors in subjective face-to-face tests.

TABLE III
F1-SCORE, ACCURACY AND PR-AUC VALUES FOR DEPRESSION, STRESS AND NON-DEPRESSION/NON-STRESSED SENTENCES DETECTION OBTAINED BY MACHINE LEARNING ALGORITHMS

Fig. 3. Weight factors of the parameters used in the eSM2 sentiment metric obtained from subjective test results.

TABLE IV
PERFORMANCE ASSESSMENT OF eSM2 CONSIDERING RMSE, MAXIMUM ERROR AND AVERAGE ERROR IN RELATION TO THE SENTIMENT INTENSITY SCORED BY THE ASSESSORS

TABLE V
PERCEIVED VALUE OF CONSUMED RESOURCES IN THE ELECTRONIC DEVICE AND RESPECTIVE EVALUATION IN ACCORDANCE WITH A 5-POINT SCALE

TABLE VI
PERCENTAGE OF PERCEIVED VALUE OF FOUR KINDS OF RECOMMENDATION SYSTEM CONSIDERING AND NOT CONSIDERING THE eSM2 SENTIMENT METRIC AND CONSIDERING AND NOT CONSIDERING THE ONTOLOGY IN THE RECOMMENDATION SYSTEM

C. Performance Evaluation of the Proposed Recommendation System

The application of KBRS at the end-user device was tested in the laboratory environment, using a mobile device with a wireless interface (Wi-Fi). The application was built to run on a device with a 1400 MHz 32-bit quad-core processor and 1 GB of RAM, in which only one core of the processor is used to execute KBRS. The parameters were rated on a 5-point scale, from 1 to 5, and average values were used; 5 represents the best result and 1 the worst score.

Table V presents the results of the evaluation of the recommendation message latency, the energy consumption by KBRS (considering ontology and eSM2), and the apparent network resource consumption. These tests were performed on the dedicated, standardized equipment used by the assessors in the laboratory.

Ergonomic aspects were also analyzed by the assessors, who answered the following questions, using a scale from 1 to 6, in which 1 means very unsatisfied, 5 means very satisfied, and 6 means no opinion.
1) How would you rate the time needed to get recommendations?
2) How would you rate the appearance and interface?
3) How would you rate the variety of the proposed messages?
4) How would you rate the general usability?

The results of this analysis indicated that 92% of the assessors liked the variety of the proposed messages and the interface of the application running on the user device. Finally, 89% of the assessors classified the general usability of KBRS with the maximum score.

As stated before, in the data collection process, the users chose one or two kinds of messages (happy, calm, relaxing, or motivational messages) that they would prefer to receive in depression or stress conditions. These messages are used by KBRS. Additionally, users indicate the number of messages and the period of day when they would like to receive the messages.

Table VI presents the results considering the KBRS, which is based on ontology, whereas the traditional content-based RS,


named “no ontology” is used with the eSM2 and without the eSM2. The results from KBRS using the eSM2 metric and ontology reached 94% of satisfied users.

The results of Table VI are related to the users’ satisfaction level with message suggestions received on the user’s device. The answer options follow a scale based on adjectives: very good, good, neutral, poor, and very poor.

In order to compare the eSM and eSM2 performances, experiments that considered the eSM were also performed. Table VII presents the results of the eSM metric, considering the same test scenarios presented in Table VI. The results from KBRS using the eSM metric and ontology reached 89% of satisfied users.

TABLE VII
PERCENTAGE OF PERCEIVED VALUE OF FOUR KINDS OF RECOMMENDATION SYSTEM CONSIDERING AND NOT CONSIDERING THE eSM SENTIMENT METRIC AND CONSIDERING AND NOT CONSIDERING THE ONTOLOGY IN THE RECOMMENDATION SYSTEM

V. CONCLUSION AND FUTURE WORK

In order to improve the KBRS performance, the eSM2 was modeled, considering user profile parameters, geographical location, and the theme of the sentence to identify the sentiment intensity of a message. The last two parameters are not considered in current sentiment metrics. The performance assessment of the eSM and eSM2 metrics was performed, and the results obtained by the eSM2 were superior in the perceptual evaluation of the RS. This fact demonstrated the relevance of using additional user profile parameters to improve the sentiment metric performance.

Also, the ontology concept was used in the proposed KBRS. It is important to note that the correction factor proposed in eSM2, based on the user’s profile, can be applied to other sentiment metrics.

Currently, there are few works that use OSN data to detect stress conditions. The solution for monitoring the depressed or stressed condition in OSN users, using the CNN for character-level representation and the BLSTM-RNN for the disorder entity recognition, presented an accuracy for depression and stress detection of 0.89 and 0.90, respectively. These accuracy values are higher than the results obtained in related works.

In the performance assessment tests, the proposed KBRS was compared to another KBRS that does not consider a sentiment metric and ontology. Results demonstrate that the proposed KBRS outperforms the RS without sentiment metric and ontology, reaching 94% and 69% of very satisfied users, respectively. According to the users, an RS that does not consider ontology and a sentiment metric produces very poor suggestions, using more generic and nonpersonalized content. The best result obtained by the KBRS demonstrates the effectiveness of the use of ontology and especially the use of a personalized sentiment analysis instead of a general sentiment analysis. In general, the recommended messages sent to the appropriate users improved their emotional state, and this is the most significant contribution of this research.

Furthermore, the proposed KBRS uses an application at end-user devices which is not based on a complex programming language, thus consuming fewer resources from current electronic devices. Also, the client device interface is easy to use. Considering the satisfactory results obtained by the proposed KBRS, in future works, it can be applied in other services, such as customer complaint systems and user help systems, to detect abrupt changes in customers’ emotions.

REFERENCES
[1] I.-R. Glavan, A. Mirica, and B. Firtescu, “The use of social media for communication,” Official Stat. Eur. Level. Romanian Statist. Rev., vol. 4, pp. 37–48, Dec. 2016.
[2] M. Al-Qurishi, M. S. Hossain, M. Alrubaian, S. M. M. Rahman, and A. Alamri, “Leveraging analysis of user behavior to identify malicious activities in large-scale social networks,” IEEE Trans. Ind. Inform., vol. 14, no. 2, pp. 799–813, Feb. 2018.
[3] R. L. Rosa, D. Z. Rodríguez, and G. Bressan, “Music recommendation system based on user’s sentiments extracted from social networks,” IEEE Trans. Consum. Electron., vol. 61, no. 3, pp. 359–367, Oct. 2015.
[4] R. Rosa et al., “Monitoring system for potential users with depression using sentiment analysis,” in Proc. IEEE Int. Conf. Consum. Electron., Sao Paulo, Brazil, Jan. 2016, pp. 381–382.
[5] I. B. Weiner and R. L. Greene, Handbook of Personality Assessment. Hoboken, NJ, USA: Wiley, 2008.
[6] H. Lin et al., “Detecting stress based on social interactions in social networks,” IEEE Trans. Knowl. Data Eng., vol. 29, no. 9, pp. 1820–1833, Sep. 2017.
[7] J. T. Hancock, K. Gee, K. Ciaccio, and J. M.-H. Lin, “I’m sad you’re sad: Emotional contagion in cmc,” in Proc. ACM Conf. Comput. Supported Cooperative Work, 2008, pp. 295–298.
[8] B. Liu, “Many facets of sentiment analysis,” A Practical Guide to Sentiment Analysis, New York, NY, USA: Springer, Jan. 2017, pp. 11–39.
[9] Y. P. Huang, T. Goh, and C. L. Liew, “Hunting suicide notes in web 2.0 - preliminary findings,” in Proc. 9th IEEE Int. Symp. Multimedia Workshops, Dec. 2007, pp. 517–521.
[10] World Health Organization, “World health statistics 2016: Monitoring health for the sdgs sustainable development goals, world health statistics annual,” World Health Org., p. 161, 2016.
[11] Y. Zhang, C. Xu, H. Li, K. Yang, J. Zhou, and X. Lin, “Healthdep: An efficient and secure deduplication scheme for cloud-assisted ehealth systems,” IEEE Trans. Ind. Inform., to be published.
[12] G. Sannino, I. D. Falco, and G. D. Pietro, “A continuous non-invasive arterial pressure (cnap) approach for health 4.0 systems,” IEEE Trans. Ind. Inform., to be published.
[13] H. Thapliyal, V. Khalus, and C. Labrado, “Stress detection and management: A survey of wearable smart health devices,” IEEE Consum. Electron. Mag., vol. 6, no. 4, pp. 64–69, Oct. 2017.
[14] A. E. U. Berbano, H. N. V. Pengson, C. G. V. Razon, K. C. G. Tungcul, and S. V. Prado, “Classification of stress into emotional, mental, physical and no stress using electroencephalogram signal analysis,” in Proc. IEEE Int. Conf. Signal Image Process. Appl., Sep. 2017, pp. 11–14.
[15] J. Ham, D. Cho, J. Oh, and B. Lee, “Discrimination of multiple stress levels in virtual reality environments using heart rate variability,” in Proc. 39th IEEE Eng. Med. Biol. Soc. Annu. Int. Conf., Jul. 2017, pp. 3989–3992.
[16] Y. Xue, Q. Li, L. Jin, L. Feng, D. A. Clifton, and G. D. Clifford, “Detecting adolescent psychological pressures from micro-blog,” in Health Information Science. New York, NY, USA: Springer, 2014, pp. 83–94.
[17] S. Tsugawa, Y. Kikuchi, F. Kishino, K. Nakajima, Y. Itoh, and H. Ohsaki, “Recognizing depression from twitter activity,” in Proc. 33rd Annu. ACM Conf. Human Factors Comput. Syst., 2015, pp. 3187–3196.
[18] M. De Choudhury, S. Counts, E. J. Horvitz, and A. Hoff, “Characterizing and predicting postpartum depression from shared facebook data,” in Proc. 17th ACM Conf. Comput. Supported Cooperative Work, Social Comput., 2014, pp. 626–638.


[19] R. Rodrigues, R. das Dores, C. Camilo-Junior, and C. Rosa, “Sentihealth-cancer: A sentiment analysis tool to help detecting mood of patients in online social networks,” Int. J. Med. Inform., vol. 1, no. 85, pp. 80–95, 2016.
[20] X. Ma and E. Hovy, “End-to-end sequence labeling via bi-directional lstm-cnns-crf,” in Proc. 54th Annu. Meet. Assoc. Comput. Linguistics (Volume 1: Long Papers). Berlin, Germany: Association for Computational Linguistics, Aug. 2016, pp. 1064–1074.
[21] G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer, “Neural architectures for named entity recognition,” in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics, Human Lang. Technol., San Diego, CA, USA, May 2016, pp. 260–270.
[22] M. Khodayar, O. Kaynak, and M. E. Khodayar, “Rough deep neural architecture for short-term wind speed forecasting,” IEEE Trans. Ind. Inform., vol. 13, no. 6, pp. 2770–2779, Dec. 2017.
[23] N. Majumder, S. Poria, A. Gelbukh, and E. Cambria, “Deep learning-based document modeling for personality detection from text,” IEEE Intell. Syst., vol. 32, no. 2, pp. 74–79, Mar. 2017.
[24] R. G. Guimarães, R. L. Rosa, D. D. Gaetano, D. Z. Rodríguez, and G. Bressan, “Age groups classification in social network using deep learning,” IEEE Access, vol. 5, pp. 10 805–10 816, 2017.
[25] O. Araque, I. Corcuera-Platas, J. F. Sánchez-Rada, and C. A. Iglesias, “Enhancing deep learning sentiment analysis with ensemble techniques in social applications,” Expert Syst. Appl., vol. 77, pp. 236–246, 2017.
[26] Y. Chen, M. L.-J. Yann, H. Davoudi, J. Choi, A. An, and Z. Mei, “Contrast pattern based collaborative behavior recommendation for life improvement,” in Proc. Pacific-Asia Conf. Knowl. Discovery Data Mining, Jun. 2017, pp. 106–118.
[27] H. Hu, A. Elkus, and L. Kerschberg, “A personal health recommender system incorporating personal health records, modular ontologies, and crowd-sourced data,” in Proc. IEEE/ACM Int. Conf. Adv. Soc. Netw. Anal. Mining, Aug. 2016, pp. 1027–1033.
[28] A. Sachinopoulou, J. Leppanen, H. Kaijanranta, and J. Lahteenmaki, “Ontology-based approach for managing personal health and wellness information,” in Proc. 29th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Aug. 2007, pp. 1802–1805.
[29] J. Tang et al., “Quantitative study of individual emotional states in social networks,” IEEE Trans. Affect. Comput., vol. 3, no. 2, pp. 132–144, Apr. 2012.
[30] S. Rendle, “Factorization machines with libfm,” ACM Trans. Intell. Syst. Technol., vol. 3, no. 3, pp. 57-1–57-22, May 2012.
[31] T. Chen, R. Xu, Y. He, and X. Wang, “Improving sentiment analysis via sentence type classification using bilstm-crf and CNN,” Expert Syst. Appl., vol. 72, pp. 221–230, 2017.
[32] R. L. Rosa, D. Z. Rodríguez, and G. Bressan, “Music recommendation system based on user’s sentiments extracted from social networks,” in Proc. IEEE Int. Conf. Consum. Electron., Jan. 2015, pp. 383–384.
[33] R. L. Rosa, D. Z. Rodríguez, and G. Bressan, “Sentimeter-br: A new social web analysis metric to discover consumers’ sentiment,” in Proc. IEEE Int. Symp. Consum. Electron., Jun. 2013, pp. 153–154.
[34] D. A. Huffaker and S. L. Calvert, “Gender, identity, and language use in teenage blogs,” J. Comput.-Mediated Commun., vol. 10, no. 2, 2005.
[35] A. Fraisse and Paroubek, “Twitter as a comparable corpus to build multilingual affective lexicons,” in Proc. Workshop Building Using Comparable Corpora, Reykjavik, Iceland, May 2014, pp. 26–31.
[36] M. Thelwall, D. Wilkinson, and S. Uppal, “Data mining emotion in social network communication: Gender differences in myspace,” J. Amer. Soc. Inform. Sci. Technol., vol. 61, no. 1, pp. 190–199, 2010.
[37] H. Garcia-Molina, G. Koutrika, and A. Parameswaran, “Information seeking: Convergence of search, recommendations, and advertising,” Commun. ACM, vol. 54, no. 11, pp. 121–130, Nov. 2011.
[38] S. Łazaruk, J. Dzikowski, M. Kaczmarek, and W. Abramowicz, “Semantic web recommendation application,” in Proc. Federated Conf. Comput. Sci. Inform. Syst., Sep. 2012, pp. 1055–1062.
[43] R. Paradiso, A. M. Bianchi, K. Lau, and E. P. Scilingo, “Psyche: Personalised monitoring systems for care in mental health,” in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol., Aug. 2010, pp. 3602–3605.
[44] K. R. Scherer, J. Sundberg, L. Tamarit, and G. L. Salomao, “Comparing the acoustic expression of emotion in the speaking and the singing voice,” Comput. Speech Lang., vol. 29, no. 1, pp. 218–235, 2015.
[45] R. Ramirez-Esparza, C. Chung, E. Kacewicz, and J. Pennebaker, “The psychology of word use in depression forums in english and in spanish: Testing two text analytic approaches,” in Proc. 2nd Int. Conf. Weblogs Social Media, 2008, pp. 102–108.
[46] S. Rude, E.-M. Gortner, and J. Pennebaker, “Language use of depressed and depression-vulnerable college students,” Cogn. Emotion, vol. 18, no. 8, pp. 1121–1133, 2004.
[47] S. W. Stirman and J. Pennebaker, “Word use in the poetry of suicidal and nonsuicidal poets,” Psychosomatic Med., vol. 63, no. 4, pp. 517–522, 2001.
[48] T. Nguyen, D. Phung, B. Dao, S. Venkatesh, and M. Berk, “Affective and content analysis of online depression communities,” IEEE Trans. Affect. Comput., vol. 5, no. 3, pp. 217–226, Jul. 2014.
[49] R. Fan, J. Zhao, Y. Chen, and K. Xu, “Anger is more influential than joy: Sentiment correlation in weibo,” CoRR, vol. 1309, 2013.
[50] E. Kanjo, L. Al-Husain, and A. Chamberlain, “Emotions in context: Examining pervasive affective sensing systems, applications, and analyses,” Pers. Ubiquitous Comput., vol. 19, no. 7, pp. 1197–1212, Oct. 2015.
[51] X. Wang, C. Zhang, Y. Ji, L. Sun, L. Wu, and Z. Bao, “A depression detection model based on sentiment analysis in micro-blog social network,” in Proc. Trends Appl. Knowl. Discovery Data Mining, 2013, pp. 201–213.
[52] M. R. Morales and R. Levitan, “Speech vs. text: A comparative analysis of features for depression detection systems,” in Proc. IEEE Spoken Lang. Technol. Workshop, Dec. 2016, pp. 136–143.
[53] G. A. Susto, A. Schirru, S. Pampuri, S. McLoone, and A. Beghi, “Machine learning for predictive maintenance: A multiple classifier approach,” IEEE Trans. Ind. Inform., vol. 11, no. 3, pp. 812–820, Jun. 2015.
[54] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” CoRR, Mar. 2013, pp. 1301–1310.
[55] D.-T. Vo and Y. Zhang, “Target-dependent twitter sentiment classification with rich automatic features,” in Proc. 24th Int. Conf. Artif. Intell., 2015, pp. 1347–1353.
[56] Muljono, N. A. S. Winarsih and C. Supriyanto, “Evaluation of classification methods for indonesian text emotion detection,” in Proc. Int. Seminar Appl. Technol. Inform. Commun., Aug. 2016, pp. 130–133.
[57] P. Ekman, “An argument for basic emotions,” Cognition Emotion, pp. 169–200, 1992.
[58] B. Schuller, A. Batliner, S. Steidl, and D. Seppi, “Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge,” Speech Commun., vol. 53, no. 9-10, pp. 1062–1087, Nov. 2011.
[59] S. Alghowinem et al., “Multimodal depression detection: fusion analysis of paralinguistic, head pose and eye gaze behaviors,” IEEE Trans. Affect. Comput., to be published.
[60] I. Bhakta and A. Sau, “Prediction of depression among senior citizens using machine learning classifiers,” Int. J. Comput. Appl., vol. 144, no. 7, pp. 11–16, 2016.
[61] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, “Natural language processing (almost) from scratch,” J. Mach. Learn. Res., vol. 12, pp. 2493–2537, Nov. 2011.
[62] S. Poria, E. Cambria, and A. F. Gelbukh, “Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis,” in Proc. Conf. Empirical Methods Natural Lang., 2015, pp. 2539–2544.
[63] T.-H. Pham and P. Le-Hong, “End-to-end recurrent neural network models for vietnamese named entity recognition: Word-level vs. character-level,” in Proc. 15th Int. Conf. Pacific Assoc. Comput. Ling., 2017, pp. 219–232.
[39] I. Cantador, P. Castells, and A. Bellogin, “An enhanced semantic layer [64] F. Arup Nielsen, “A new anew: Evaluation of a word list for sentiment
for hybrid recommender systems: Application to news recommendation,” analysis in microblogs,” in Proc. ESWC2011 Workshop ’Making Sense
Int. J. Semantic Web Inf. Syst., vol. 7, no. 1, pp. 44–78, 2011. Microposts, Sep. 2011, pp. 93–98.
[40] A. C. M. Fong, B. Zhou, S. C. Hui, G. Y. Hong, and T. A. Do, “Web [65] C. Quan and F. Ren, “Weighted high-order hidden markov models for
content recommender system based on consumer behavior modeling,” compound emotions recognition in text,” Inform. Sci., vol. 329, pp. 581–
IEEE Trans. Consum. Electron., vol. 57, no. 2, pp. 962–969, May 2011. 596, 2016.
[41] E. P. J. M. M. del Castillo and J. A. Delgado-López, “Analysis of the state [66] J. Bergstra et al., “Theano: A CPU and GPU math expression compiler,”
of the topic,” Hipertext.net, no. 6, pp. 3602–3605, Aug. 2008. in Proc. Python Sci. Comput. Conf., 2010, vol. 4, no. 3.
[42] H.-J. E. Harmon-Jones C and Bastian B, “The discrete emotions question- [67] Y. Tang, Y. Q. Zhang, N. V. Chawla, and S. Krasser, “Svms modeling
naire: A new tool for measuring state self-reported emotions,” PLoS ONE, for highly imbalanced classification,” IEEE Trans. Syst., Man, Cybern., B,
vol. 11, no. 8, 2016, pp. 295–298. vol. 39, no. 1, pp. 281–288, Feb. 2009.


[68] UNESCO, International Standard Classification of Education, 2006.
[69] Y. L. Cun et al., “Advances in neural information processing systems 2,” D. S. Touretzky, Ed. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1990, ch. Handwritten Digit Recognition with a Back-propagation Network, pp. 396–404.

Renata Lopes Rosa received the M.S. degree from the University of São Paulo, São Paulo, Brazil, and the Ph.D. degree from the Polytechnic School at the University of São Paulo, São Paulo, Brazil, in 2009 and 2015, respectively.
She is currently an Adjunct Professor at the Department of Computer Science, Federal University of Lavras, Brazil. She has solid knowledge of computer science based on 15 years of professional experience working with systems, networks, and programming languages for solutions in desktop, web, and mobile environments. Her current research interests include computer networks, operating systems, quality of experience of multimedia services, social networks, sentiment analysis, and recommendation systems.

Gisele Maria Schwartz received the degree in physical education from the University of São Paulo, São Paulo, Brazil, in 1975, the Master’s degree in physical education from the State University of Campinas, Campinas, Brazil, in 1991, and the Ph.D. degree in school psychology and human development from the University of São Paulo, São Paulo, in 1997 and 2004, respectively.
Later appointments include a postdoctorate at the Université du Québec à Trois-Rivières, Canada, in 2011, a Visiting Fellowship at the University of Birmingham, U.K., in 2013, and a Senior Internship (CAPES) at the University of Lisbon, Lisbon, Portugal, in 2016. Schwartz is an Adjunct Professor at the Paulista State University Júlio de Mesquita Filho, with activities in the Postgraduate Program in Motor Science, Physical Education, Sports and Leisure research and in the Postgraduate Program in human development and technologies, body and culture, working mainly with the following themes: leisure psychology, adventure activities, virtual environment, physical education, playful attitude and behavior, and sports management.

Wilson Vicente Ruggiero received the Graduate and Master’s degrees in electrical engineering from the Universidade de São Paulo, São Paulo, Brazil, in 1972 and 1975, respectively, and the Ph.D. degree in computer science from the University of California, Los Angeles, Los Angeles, CA, USA, in 1978.
He is currently a Full Professor at the Department of Computer Engineering, University of São Paulo, the Director of LARC (Laboratory of Computer Architecture and Network), Area Coordinator of FAPESP, and President of the Innovation and Research Council at Scopus Tecnologia S.A. He has experience in electrical engineering, working on the following subjects: information security, network technology, networks, e-learning, and performance evaluation of computer and communication systems.

Demóstenes Zegarra Rodríguez (SM’xx) received the B.S. degree in electronic engineering from the Pontifical Catholic University of Peru, Lima, Peru, and the M.S. and Ph.D. degrees from the University of São Paulo, São Paulo, Brazil, in 2009 and 2013, respectively.
He is currently an Adjunct Professor at the Department of Computer Science, Federal University of Lavras, Lavras, MG, Brazil. He has solid knowledge of telecommunication systems and computer science based on 15 years of professional experience in major companies. His research interests include QoS and QoE in multimedia services, digital TV, recommendation systems, signal processing, sentiment analysis, codecs, machine learning, networks, electronics, and the architecture of solutions in telecommunication systems.
