
Fake Review Detection Using Machine Learning Algorithm On Online Product Selling Platforms Publication Paper

The document discusses a novel approach to detecting fake reviews on online platforms using machine learning, specifically a logistic regression technique. It highlights the challenges posed by deceptive reviews and the limitations of current detection methods, while proposing a new framework that incorporates natural language processing for improved accuracy. The paper also covers software specifications, system architecture, and testing methodologies related to the proposed solution.

Uploaded by

Meenakshi

International Journal of Advances in Engineering and Management (IJAEM)

Volume 4, Issue 6, pp: 1243-1248 www.ijaem.net ISSN: 2395-5252

FAKE REVIEW DETECTION USING MACHINE LEARNING ALGORITHM ON ONLINE PRODUCT SELLING PLATFORM

1Raguraman P.J, 2Kayathri V, 3Meenakshi D, 4Nandhini R
1Assistant Professor, Department of Computer Science and Engineering, Paavai College of Engineering
2,3,4Undergraduate student, Department of Computer Science and Engineering, Paavai College of Engineering
--------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: 16-06-2022 Date of Acceptance: 17-06-2022
---------------------------------------------------------------------------------------------------------------------------------------

ABSTRACT: Fake review detection attracts many researchers' attention because of its negative impact on society. Most existing fake review detection approaches focus mainly on semantic analysis of review content. We propose a novel fake review detection approach based on a logistic regression technique. The increasing popularity of online review systems motivates competing sellers and service providers with malevolent intent to manipulate consumers by fabricating product and service reviews. Immoral actors use Sybil accounts, bot farms, and purchased authentic accounts to promote products and vilify competitors. Facing the continuous advancement of review spamming techniques, the research community should step back, assess the approaches explored to date to combat fake reviews, and regroup to define new ones. This paper reviews the literature on Fake Review Detection (FRD) on online platforms. It covers both basic research and commercial solutions, and discusses the reasons behind the limited success that current approaches and regulations have had in preventing damage due to deceptive reviews.

KEYWORDS: Fake Review Detection, Web Scraping, Spam Review Detection, Machine Learning, Fraud Detection, Google Reviews.

I. INTRODUCTION
Predictive analytics tools are powered by several different models and algorithms that can be applied to a wide range of use cases. Determining which predictive modeling techniques are best for your company is key to getting the most out of a predictive analytics solution and leveraging data to make insightful decisions in the statistical context. Machine Learning is defined as an application of artificial intelligence where available information is used through algorithms to process or assist the processing of statistical data.
While Machine Learning involves concepts of automation, it requires human guidance. Machine Learning involves a high level of generalization in order to get a system that performs well on yet unseen data instances.
[1]. Machine learning is a relatively new discipline within Computer Science that provides a collection of data analysis techniques. Some of these techniques are based on well-established statistical methods (e.g. logistic regression and principal component analysis) while many others are not.
[2]. Most statistical techniques follow the paradigm of determining a particular probabilistic model that best describes observed data among a class of related models. Similarly, most machine learning techniques are designed to find models that best fit data (i.e. they solve certain optimization problems), except that these machine learning models are no longer restricted to probabilistic ones.
[3]. Therefore, an advantage of machine learning techniques over statistical ones is that the latter require underlying probabilistic models while the former do not. Even though some machine learning techniques use probabilistic models, the classical statistical techniques are most often too stringent for the oncoming Big Data era, because data sources are increasingly complex and multi-faceted. Prescribing probabilistic models relating variables from disparate data sources that are plausible and amenable to statistical analysis might be extremely difficult if not impossible.
[4]. While machine learning and predictive analytics can be a boon for any organization, implementing these solutions haphazardly, without considering how they will fit into everyday operations, will drastically hinder their ability to deliver the insights the organization needs.
[5]. A user review is a review conducted by any person who has access to the internet and publishes their experience to a review site or social media platform following product testing or the evaluation of a service. User reviews are commonly provided by consumers who volunteer to write the review, rather than professionals who are paid to evaluate the product or service.

DOI: 10.35629/5252-45122323 | Impact Factor value 7.429 | ISO 9001: 2008 Certified Journal Page 50

User reviews might be compared to professional nonprofit reviews from a consumer organization, or to promotional reviews from an advertiser or company marketing a product. Growth of social media platforms has enabled the facilitation of interaction between consumers after a review has been placed on online communities such as blogs, internet forums or other popular platforms.

II. SOFTWARE ANALYSIS
EXISTING SYSTEM
In the existing method, a multi-task learning model for fake review detection, FDML, has been presented. It is based on the following observations:
a) Certain topics have higher percentages of fake reviews.
b) Certain news authors have a stronger intention to publish fake news.
The FDML model investigates the impact of topic labels on fake reviews and, at the same time, introduces contextual information about the news to boost detection performance on short fake reviews. The existing methods and regulations have not yet been able to eradicate the damaging effects of fake review activity in practice. In doing so, we point at the difficulties associated with combating the different types of malignant influencers.

PROPOSED SYSTEM
In the proposed method, we propose a fake review detection technique with a logistic regression architecture. For preprocessing, Natural Language Processing (NLP) steps are performed to extract information from the text data. After that, classification is carried out using logistic regression.
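The PROPOSED SYSTEM pipeline (text preprocessing followed by logistic regression) can be sketched as below. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the paper does not specify its NLP steps, so a simple bag-of-words count vector stands in for them, the four toy reviews and their labels are made up, and every function name and hyperparameter (`tokenize`, `train`, `epochs`, `lr`) is hypothetical. The classifier is plain batch gradient descent on the log-loss, written with the standard library only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token in the corpus to a column index."""
    vocab = {}
    for doc in documents:
        for token in tokenize(doc):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Return a bag-of-words count vector (one entry per vocabulary token)."""
    counts = Counter(tokenize(text))
    return [counts.get(token, 0) for token in vocab]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(features, labels, epochs=500, lr=0.1):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # derivative of the log-loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(text, vocab, weights, bias):
    """Estimated probability that a review is fake (label 1)."""
    x = vectorize(text, vocab)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up toy data: 1 = fake, 0 = genuine (assumption, not from the paper).
reviews = [
    "best product ever buy now amazing deal",
    "amazing amazing best deal buy now",
    "battery lasts two days but the case scratches easily",
    "arrived late and the manual was unclear",
]
labels = [1, 1, 0, 0]

vocab = build_vocabulary(reviews)
features = [vectorize(r, vocab) for r in reviews]
weights, bias = train(features, labels)

fake_score = predict_proba("amazing deal buy now", vocab, weights, bias)
genuine_score = predict_proba("the manual was unclear", vocab, weights, bias)
print(f"fake-looking review:    {fake_score:.2f}")
print(f"genuine-looking review: {genuine_score:.2f}")
```

On a real dataset one would hold out a test split and likely use an established library; the hand-written loop is kept here only to make the optimization step visible.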

III. SYSTEM SPECIFICATION

HARDWARE REQUIREMENTS
Processor : Pentium – IV
RAM       : 4 GB (min)
Hard Disk : 20 GB

SOFTWARE REQUIREMENTS
Operating System : Windows 7 or 8
Software         : Python IDLE

PRODUCTIVITY AND SPEED
It is widely claimed within development circles that developing Python applications can be up to 10 times faster than developing the same application in Java or C/C++. The time saving can be explained by the clean object-oriented design, enhanced process control capabilities, and strong integration and text processing capacities. Moreover, its own unit testing framework contributes substantially to its speed and productivity.
The platform module in Python is used to access the underlying platform's data, such as hardware, operating system, and interpreter version information, for the machine where the program is running.
There are four functions for getting information about the current Python interpreter. python_version() and python_version_tuple() return different forms of the interpreter version with major, minor, and patch level components. python_compiler() reports on the compiler used to build the interpreter. And python_build() returns the build number and build date of the interpreter.

PYTHON IS POPULAR FOR WEB APPS
Web development shows no signs of slowing down, so technologies for rapid and productive web development still prevail within the market. Along with JavaScript and Ruby, Python, with its most popular web framework Django, has great support for building web apps and is rather popular within the web development community.
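The four interpreter functions of the platform module mentioned under PRODUCTIVITY AND SPEED can be tried out with a short script such as the following. The helper name `interpreter_report` and the dictionary keys are hypothetical; the `platform` calls themselves are part of the Python standard library, and the printed values depend on the machine the script runs on.

```python
import platform


def interpreter_report():
    """Collect interpreter and host details exposed by the platform module."""
    # python_build() returns a (build number, build date) tuple of strings.
    build_number, build_date = platform.python_build()
    return {
        "version": platform.python_version(),              # e.g. "3.x.y"
        "version_tuple": platform.python_version_tuple(),  # ("3", "x", "y")
        "compiler": platform.python_compiler(),
        "build_number": build_number,
        "build_date": build_date,
        "system": platform.system(),    # operating system name
        "machine": platform.machine(),  # hardware architecture
    }


report = interpreter_report()
for key, value in report.items():
    print(f"{key}: {value}")
```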

IV. SOFTWARE DESCRIPTION

FRONT END: PYTHON
Python is an interpreted, high-level, general-purpose programming language. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library.
Python is a multi-paradigm programming language. Object-oriented programming and structured programming are fully supported, and many of its features support functional programming and aspect-oriented programming (including by metaprogramming and metaobjects (magic methods)). Many other paradigms are supported via extensions, including design by contract and logic programming.
Python uses dynamic typing and a combination of reference counting and a cycle-detecting garbage collector for memory management. It also features dynamic name resolution (late binding), which binds method and variable names during program execution.
Python is meant to be an easily readable language. Its formatting is visually uncluttered, and it often uses English keywords where other languages use punctuation. Unlike many other languages, it does not use curly brackets to delimit blocks, and semicolons after statements are optional. It has fewer syntactic exceptions and special cases than C or Pascal.

OPEN-SOURCE AND FRIENDLY COMMUNITY
As stated on the official website, Python is developed under an OSI-approved open source license, making it freely usable and distributable. Additionally, development is driven by the community, which actively participates in and organizes conferences, meet-ups, hackathons, etc., fostering friendliness and knowledge-sharing.

BROAD APPLICATION
The language is said to be relatively simple, so you can get results quickly without spending too much time on constant improvements or digging into the complex engineering details of the technology. Even though Python programmers are in high demand these days, the language's friendliness and attractiveness only help to increase the number of those eager to master it.
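The dynamic typing, late binding, and magic methods described under FRONT END: PYTHON can be illustrated with a small toy example. The `Review` class and its attributes are hypothetical and exist only for this illustration; they are not part of the paper's system.

```python
class Review:
    """Toy class showing magic methods (hypothetical, for illustration)."""

    def __init__(self, text, stars):
        self.text = text
        self.stars = stars

    def __len__(self):
        # Magic method: lets len(review) return the number of words.
        return len(self.text.split())

    def __lt__(self, other):
        # Magic method: lets sorted() order reviews by star rating.
        return self.stars < other.stars

    def summary(self):
        return f"{self.stars} stars, {len(self)} words"


# Dynamic typing: the same name may refer to objects of different types.
value = 5
value = "five"


def describe(review):
    # Late binding: `summary` is looked up each time this line runs.
    return review.summary()


r = Review("works as described", 4)
first = describe(r)

# Rebinding the method at run time changes what describe() calls next.
Review.summary = lambda self: f"rating={self.stars}"
second = describe(r)

ordered = sorted([Review("great", 5), Review("poor", 2)])
print(first, "|", second, "|", [rev.stars for rev in ordered])
```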

V. SYSTEM DESIGN

SYSTEM ARCHITECTURE
A system architecture is the conceptual model that defines the structure, behavior, and more views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviors of the system.

UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group. The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML. The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems.

USE CASE DIAGRAM
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases.

SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of test. Each test type addresses a specific testing requirement.
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly, and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application. It is done after the completion of an individual unit, before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
Integration tests are designed to test integrated software components to determine if they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.

SOURCE CODE

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Text;
    using System.Windows.Forms;
    using System.Data.SqlClient;

    namespace IllusionPin
    {
        public partial class BankHome : Form
        {
            // Connection to the local SQL Server Express database file.
            SqlConnection con = new SqlConnection(@"Data Source=.\SQLEXPRESS;AttachDbFilename=IllusionPin\IllusionPin\illusiontb.mdf;Integrated Security=True;User Instance=True");
            SqlCommand cmd;

            public BankHome()
            {
                InitializeComponent();
            }

            private void BankHome_Load(object sender, EventArgs e)
            {
            }

            // Rejects any key that is not a digit (top row or numeric keypad) or Backspace.
            private void textBox4_KeyDown(object sender, KeyEventArgs e)
            {
                if (e.KeyCode < Keys.D0 || e.KeyCode > Keys.D9)
                {
                    if (e.KeyCode < Keys.NumPad0 || e.KeyCode > Keys.NumPad9)
                    {
                        if (e.KeyCode != Keys.Back)
                        {
                            //nonnumberenter = true;
                            string abc = "Please enter numbers only.";
                            textBox5.Text = "";
                            DialogResult result1 = MessageBox.Show(abc.ToString(), "Validate numbers", MessageBoxButtons.OK);
                        }
                    }
                }
                if (Control.ModifierKeys == Keys.Shift)
                {
                    //nonnumberenter = true;
                    string abc = "Please enter numbers only.";
                    DialogResult result1 = MessageBox.Show(abc.ToString(), "Validate numbers", MessageBoxButtons.OK);
                }
            }

            // ... (method header and the definition of pattern are not shown)
                if (System.Text.RegularExpressions.Regex.IsMatch(textBox5.Text, pattern))
                {
                    //MessageBox.Show("Valid Email address ");
                }
                else
                {
                    textBox4.Text = "";
                    MessageBox.Show("Not a valid Email address ");
                }
            }

            // Computes the age in whole calendar years from the selected date.
            private void dateTimePicker1_ValueChanged(object sender, EventArgs e)
            {
                int age = DateTime.Today.Year - dateTimePicker1.Value.Year;
                textBox3.Text = age.ToString();
                if (age < 18)
                {
                    //MessageBox.Show("Age Limit Low!");
                }
            }

            private void button1_Click(object sender, EventArgs e)
            {
                string gender;
                if (radioButton1.Checked == true)
                {
                    // ...
                }
            }
        }
    }
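The unit-testing approach described under SYSTEM TESTING (test cases that check that program inputs produce valid outputs, one unit at a time and before integration) can be sketched with Python's built-in unittest framework. The unit under test, `normalize_review`, is a hypothetical preprocessing helper invented for this illustration; the paper does not list its own test cases.

```python
import unittest


def normalize_review(text):
    """Hypothetical unit: lowercase, trim, and collapse runs of whitespace."""
    return " ".join(text.lower().split())


class NormalizeReviewTest(unittest.TestCase):
    """Each test gives one defined input and checks the expected result."""

    def test_lowercases_and_trims(self):
        self.assertEqual(normalize_review("  Great PRODUCT "), "great product")

    def test_collapses_internal_whitespace(self):
        self.assertEqual(
            normalize_review("fast\t\tshipping\nagain"), "fast shipping again"
        )

    def test_blank_input_gives_empty_string(self):
        self.assertEqual(normalize_review("   "), "")


def run_suite():
    """Run the test case above and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeReviewTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


print("all unit tests passed:", run_suite())
```

An integration test would instead call several such units together, for example normalization followed by classification, and check the combined result.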

VI. CONCLUSION
We focused on the task of identifying spam reviews. After analyzing the reviews in the datasets, we proposed the hypothesis that fine-grained aspect information can be used as a new scheme for fake review detection, and reconstructed the representation of reviews from four perspectives: users, products, review text, and fine-grained aspects. We proposed a multilevel interactive attention neural network model with aspect plan; to optimize the model's objective function, we transformed the implicit relationship between users, reviews and products into a regularization term. To verify the effectiveness of the logistic regression, we conducted extensive experiments on three public datasets. Our experiments showed that classification performance improved significantly and that the method outperforms the state-of-the-art methods for fake review detection tasks, demonstrating the effectiveness and feasibility of our proposed scheme.

FUTURE ENHANCEMENT
Our future research will address cross-domain issues, for which we need to further obtain fine-grained aspects in the relevant domain. We also plan to predict whether a review is system-generated or genuine.

SOME OF THE ADVANTAGES FROM THE ABOVE RESULTS
a) Higher accuracy of the model.
b) The performance of the model is high.
c) The proposed model is able to work with different kinds of datasets.

REFERENCES
[1]. R. Filieri and F. McLeay, “E-WOM and accommodation: An analysis of the factors that influence travelers’ adoption of information from online reviews,” J. Travel Res., vol. 53, no. 1, pp. 44–57, Jan. 2014.
[2]. E. Kauffmann, J. Peral, D. Gil, A. Ferrández, R. Sellers, and H. Mora, “A framework for big data analytics in commercial social networks: A case study on sentiment analysis and fake review detection for marketing decision-making,” Ind. Marketing Manage., vol. 90, pp. 523–537, Oct. 2020.
[3]. N. Jindal and B. Liu, “Review spam detection,” in Proc. 16th Int. Conf. World Wide Web, 2007, pp. 1189–1190.
[4]. A. Mukherjee, V. Venkataraman, B. Liu, and N. S. Glance, “What Yelp fake review filter might be doing,” in Proc. ICWSM, 2013, pp. 409–418.
[5]. S. Rayana and L. Akoglu, “Collective opinion spam detection: Bridging review networks and metadata,” in Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2015, pp. 985–994.
[6]. F. Li, M. Huang, Y. Yang, and X. Zhu, “Learning to identify review spam,” in Proc. IJCAI 22nd Int. Joint Conf. Artif. Intell., vol. 3, 2011, pp. 2488–2493.
[7]. X. Hu, J. Tang, H. Gao, and H. Liu, “Social spammer detection with sentiment information,” in Proc. IEEE Int. Conf. Data Mining, Dec. 2014, pp. 180–189.
[8]. S. Kc and A. Mukherjee, “On the temporal dynamics of opinion spamming: Case studies on Yelp,” in Proc. 25th Int. Conf. World Wide Web, Apr. 2016, pp. 369–379.
[9]. Y. Ren and Y. Zhang, “Deceptive opinion spam detection using neural network,” in Proc. 26th Int. Conf. Comput. Linguistics, Tech. Papers COLING, Dec. 2016, pp. 140–150.
[10]. X. Wang, K. Liu, and J. Zhao, “Handling cold-start problem in review spam detection by jointly embedding texts and behaviors,” in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics (Long Papers), vol. 1, 2017, pp. 366–376. [Online]. Available: https://www.aclweb.org/anthology/P17-1034.pdf


[11]. C. Yuan, W. Zhou, Q. Ma, S. Lv, J. Han, and S. Hu, “Learning review representations from user and product level information for spam detection,” in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2019, pp. 1444–1449.
[12]. Y. Lu, M. Castellanos, U. Dayal, and C. Zhai, “Automatic construction of a context-aware sentiment lexicon: An optimization approach,” in Proc. 20th Int. Conf. World Wide Web - WWW, 2011, pp. 347–356.
[13]. G. Ji, S. He, L. Xu, K. Liu, and J. Zhao, “Knowledge graph embedding via dynamic mapping matrix,” in Proc. 53rd Annu. Meeting Assoc. Comput. Linguistics 7th Int. Joint Conf. Natural Lang. Process. (Long Papers), vol. 1, 2015, pp. 687–696. [Online]. Available: https://www.aclweb.org/anthology/P15-1067.pdf
[14]. N. Jindal and B. Liu, “Opinion spam and analysis,” in Proc. Int. Conf. Web Search Web Data Mining WSDM, 2008, pp. 219–230.
[15]. J. Li, M. Ott, C. Cardie, and E. Hovy, “Towards a general rule for identifying deceptive opinion spam,” in Proc. 52nd Annu., vol. 1, 2014, pp. 1566–1576. [Online]. Available: https://www.aclweb.org/anthology/P14-1147.pdf
