University Institute of Information Technology,
PMAS-Arid Agriculture University, Rawalpindi Pakistan
Website Vulnerability Scanning System
By
Mudasir Rasheed 18-Arid-2971
Tayyab Ali Bhutto 19-Arid-1219
Supervisor
Dr. Muhammad Habib
Bachelor of Science in Software Engineering (2019-2022)
The candidate confirms that the work submitted is their own and appropriate credit has been given
where reference has been made to the work of others.
DECLARATION
We hereby declare that this project, neither in whole nor in part, has been copied from any
source. It is further declared that we have developed this software documentation and the
accompanying report entirely through our personal efforts. If any part of this project is proved to
be copied from any source or found to be a reproduction of some other work, we will stand by the
consequences. No portion of the work presented has been submitted in support of any application for
any other degree or qualification of this or any other university or institute of learning.
Mudasir Rasheed Tayyab Ali Bhutto
CERTIFICATE OF APPROVAL
It is to certify that the final year project of BS (CS) “Website Vulnerability Scanning System”
was developed by “Mudasir Rasheed, 18-ARID-2971” and “Tayyab Ali Bhutto, 19-ARID-1219”
under the supervision of “Dr. Muhammad Habib” and that, in their opinion, it is fully
adequate in scope and quality for the degree of Bachelor of Science in Computer Science.
Supervisor
External Examiner
Administrator UIIT
Acknowledgement
All praise is to Almighty Allah, who bestowed upon us a minute portion of His boundless
knowledge, by virtue of which we were able to accomplish this challenging task.
We are greatly indebted to our project supervisor, “Dr. Muhammad Habib”, for his personal
supervision, advice, and valuable guidance toward the completion of this project. We are deeply
indebted to him for his encouragement and continual help during this work.
We are also thankful to our parents and families, who have been a constant source of
encouragement for us and taught us the values of honesty and hard work.
Mudasir Rasheed Tayyab Ali Bhutto
Contents
Chapter 1: Introduction
1.1. Brief
1.2 Relevance to Course Modules
1.3 Project Background
1.4. Literature Review
1.5. Analysis from Literature Review
1.6. Methodology and Software Lifecycle for This Project
Chapter 2: Problem Definition
2.1. Problem Statement
2.2. Deliverables and Development Requirements
2.3. Proposed System
2.4.1 Projects Deliverable
2.4.2 Development Requirements
2.5. Operating Environment
2.6. Assumptions and Dependencies:
Chapter 3: Requirement Analysis
3.1 Use Case Model
3.2. Functional Requirements
3.3. Non-Functional Requirements
3.3.2 Actors Description
3.4 Use Case Description
Chapter 4: Design and Architecture
4.1 System Architecture
4.2 System Design
4.2.1 UML Structural Diagrams
[Link] Component Diagram
4.3 UML Behavioral Diagram
4.3.1 Activity Diagram
4.4 Class Diagram
4.5 Sequence Diagram
4.6 Landing Page
4.7 Registration Page
4.8 Website Vulnerability Scanner
4.9 XSS Scanner
4.10 SQL Injection Scanner
Chapter 1: Introduction
In this chapter we give an overview of the whole project and a brief introduction, and explain how
it is relevant to the courses we studied during our degree. We also discuss the literature review,
its analysis, and the methodology used in the project.
1.1. Brief
With the growth of cybersecurity concerns, website security is significant because a website
contains critical information about an organization. Nowadays website penetration is common, and
even a novice attacker can carry it out. Regardless of their benefits, web applications raise
various security concerns that stem from improper coding. Serious attacks or vulnerabilities allow
attackers to gain direct and public access to the backend databases that store data. Many of these
databases contain valuable information, making them a frequent target of hackers. In this project
we focus on detecting cross-site scripting vulnerabilities with a predefined Python tool, which
also has website crawling functionality that collects every endpoint (URL) of a website.
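The crawling step mentioned above can be illustrated with a minimal sketch. It is not the project's actual crawler: the names `LinkCollector` and `extract_endpoints` and the host `target.example` are assumptions made for illustration, and the sketch only extracts same-host links from one page that has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_endpoints(base_url, html_text):
    """Return the sorted, de-duplicated same-host URLs linked from one page."""
    collector = LinkCollector()
    collector.feed(html_text)
    host = urlparse(base_url).netloc
    found = set()
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).netloc == host:  # stay on the target site
            found.add(absolute)
    return sorted(found)


# Hypothetical page content used only to demonstrate the function.
page = (
    '<a href="/login">Login</a> '
    '<a href="search?q=1">Search</a> '
    '<a href="https://other.example/x">External</a>'
)
endpoints = extract_endpoints("https://target.example/app/", page)
```

A full crawler would repeat this for every discovered URL while keeping a set of visited pages; that loop is omitted here.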
In the current situation, cyber risks are a major threat to small as well as big organizations.
Although big companies have the power to prevent and fight these threats, small companies and
start-ups lack the financial as well as physical capacity to do so. By eliminating the high-cost
factor and providing a stripped-down, bare-bones version of the complex software used to identify
and resolve loopholes in a company's web pages, we aim to fill this gap and show what affordable
software can provide.
Cyber risk is now at the centre of the international agenda, as high-profile breaches and hacking
raise fears that such attacks and other security breaches could endanger the global economy.
Cybercrime is estimated to cost the global economy over US $400 billion every year, according to
estimates by the Centre for Strategic and International Studies. In 2017, some 10,000 companies
in the United States had their systems compromised by criminals, the Centre reports. There is
therefore a need for automated software that helps recognize loopholes in web applications.
It is quite common to confront discrepancies within the paperwork such as counterfeit titles,
forged documents, and a complete loss of the record. Such situations lead to expensive court
battles between conflicted parties.
In layman’s terms, the basic concept is to create software that performs the major tasks of the
much more expensive software used by companies while keeping costs down. The software will scan
full websites, find vulnerabilities, and give information on how to fix them.
1.2 Relevance to Course Modules
Almost all the technologies used in the application “Website Vulnerability Scanning” are related to
our course modules.
• Web Technology
• JavaScript
1.3 Project Background
The first reported instance of a Web application attack was perpetrated in 2000 by a 17-year-old
Norwegian boy. While making online transactions with a large bank, he noticed that the URLs of
the pages he was opening displayed his account number as one of the parameters. He then
substituted his account number with the account numbers of random bank customers to gain
access to the customers’ accounts and personal details.
On October 31, 2001, the website of Acme Art Inc. was hacked and all the credit card numbers
from its online store’s database were extracted and displayed on a Usenet newsgroup. This
breach was reported to the public by the media and the company lost hundreds of thousands of
dollars due to orders withdrawn by wary customers. The company also lost its second phase of
funding by a venture capital firm.
Similarly, the 2002 turnover report of a Swedish company was accessed prior to its scheduled
publication. The perpetrator simply changed the year parameter in the URL of the previous
year’s report to that of the present year to gain complete access.
In another 2002 incident, applicants to Harvard Business School accessed their admission status
before the results were officially announced by manipulating the online Web application. This
third-party Web application was also used by other universities. Upon receiving replies to their
applications from these other schools, the applicants examined the URL of the reply and found
two parameters that depicted the unique IDs of that school’s students. Then, they simply
substituted the values in those two parameters in the reply to URL with their Harvard IDs, which
returned the desired information. This procedure, posted on a [Link] online forum,
was subsequently employed by over a hundred students eager to know their admission status.
When the authorities detected this leakage, these students were denied admission.
In June 2003, hackers detected that the Web applications of the fashion label Guess and pet
supply retailer Petco contained SQL injection vulnerabilities. As a result, the credit card
information of almost half a million customers was stolen.
Website defacement is another major problem resulting from Web application attacks. Hackers
have learned to modify the source code of many websites. During the 2004 Christmas holidays,
the “Santy” worm entered Web application servers, defacing 40,000 websites in a single day. On
November 29, 2004, SCO’s website logo was replaced by the text, “We own all your code, pay
us all your money.” Similarly, on December 6, 2004, the homepage of Picasa, the picture sharing
facility from Google, was hacked and replaced with a totally blank page. It is powerful and
validating.
1.4. Literature Review
Most of the transaction information and customer information for web applications is stored in
backend databases. One of the vulnerabilities of these web applications is the SQL (Structured
Query Language) injection attack. Web application sessions are also prone to session hijacking if
the adversary can get hold of the session ID. Considering that various tools are available to
retrieve HTTP session cookies, web applications are very vulnerable to session hijacking attacks.
Though many ways have been proposed to defend databases against SQL injection attacks, there is no
sure-shot way to prevent them. This project proposes an efficient technique for the prevention of
SQL injection and session hijacking attacks. A hashing technique is used to implement the
prevention of these attacks.
Second-order SQL injection is a serious threat to Web applications, and it is more difficult to
detect than first-order SQL injection. The attack payload of a second-order SQL injection comes
from untrusted user input that has been stored in a database or filesystem. The SQL statement
submitted by a web application is usually assembled dynamically from a trusted constant string in
the program and untrusted user input, and the DBMS is unable to distinguish the trusted and
untrusted parts of a SQL statement. The paper presents a method of detecting second-order SQL
injection attacks based on Instruction Set Randomization (ISR). The method randomizes the trusted
SQL keywords contained in Web applications to dynamically build new SQL instruction sets and adds
a proxy server in front of the DBMS. The proxy detects whether a received SQL instruction contains
standard SQL keywords in order to find attack behaviour. Experimental results show that this system
can effectively detect second-order SQL injection attacks and has a low processing cost.
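The keyword randomization idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method from the reviewed paper: the suffix `KEY`, the keyword list, and the helpers `build_query` and `proxy_check` are all hypothetical. Trusted keywords written by the developer carry the secret suffix, so any plain SQL keyword that reaches the proxy must have come from untrusted data.

```python
import re

KEY = "_k9x2"  # hypothetical secret suffix, known only to the application and the proxy
KEYWORDS = {"SELECT", "FROM", "WHERE", "OR", "AND", "UNION",
            "INSERT", "UPDATE", "DELETE", "DROP"}

# Matches a randomized keyword such as SELECT_k9x2 and captures the plain keyword.
RANDOMIZED = re.compile(
    r"\b(" + "|".join(sorted(KEYWORDS)) + r")" + re.escape(KEY) + r"\b"
)


def build_query(user_input):
    """Application side: trusted keywords are written in randomized form."""
    return f"SELECT{KEY} * FROM{KEY} users WHERE{KEY} name = '{user_input}'"


def proxy_check(randomized_sql):
    """Proxy side: reject plain keywords, then restore the standard ones."""
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", randomized_sql):
        if token.upper() in KEYWORDS:
            raise ValueError(f"possible injection: unrandomized keyword {token!r}")
    return RANDOMIZED.sub(r"\1", randomized_sql)


clean_sql = proxy_check(build_query("alice"))
```

Because the check runs on the final statement, it applies equally when the injected text was stored earlier and concatenated later, which is the second-order case. A drawback of this naive version is that legitimate data containing a word such as "or" is also rejected.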
1.5. Analysis from Literature Review
In web applications there are many ways to exploit XSS vulnerabilities in parameters. Improper
input validation lets an application take input directly from users and store it in the database
without any checks, which makes the website vulnerable. To overcome this, a new methodology is
proposed to find such vulnerabilities in websites. According to a recent survey, cross-site
scripting is among the vulnerabilities that most commonly affect websites, so the focus has been
mainly on finding and exploiting XSS [10]. It has been observed that encoding, where present, can
be bypassed with certain payloads, which are also included in this tool. One study says that tools
with existing features should be upgraded with new functionality in line with industry standards,
so the primary focus is on exploiting the OWASP Top 10 vulnerabilities.
Nowadays cross-site scripting attacks happen because developers introduce vulnerabilities into
their code. Every developer bears responsibility for these attacks, because developers should
understand what kinds of attacks are possible on web applications. Do not trust user input, because
the user can insert any kind of value, and always use filters, as they reduce these attacks.
Developers should convert whatever is written between any two tags, which are enclosed in '<' and
'>'. XSS holes can harm your application because attackers will disclose them to the public, and
often everyone can then see your personal data. Filtering alone does not provide a proper solution
for cross-site scripting attacks. However, attacks can be reduced if developers convert special
characters such as <, >, (, ), ", ', #, and & to their HTML entity equivalents (for example,
" to &quot;).
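The character conversion described above can be shown with Python's standard `html.escape` function. This is a minimal sketch rather than a complete defence: `html.escape` converts only &, <, >, " and ', so parentheses and # are left unchanged, and the helper name `encode_for_html` is an assumption made for illustration.

```python
import html


def encode_for_html(untrusted):
    """Encode untrusted text before placing it inside an HTML page."""
    # quote=True also converts double and single quotes.
    return html.escape(untrusted, quote=True)


sample = '<script>alert("xss")</script>'
encoded = encode_for_html(sample)
```

After encoding, the browser displays the text instead of executing it as a script. Output placed inside JavaScript, URLs, or CSS needs encoding rules specific to those contexts.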
Attacks against web services are numerous. They target not only the service itself but also
everything related to the website, such as the server, the host system, and the backend. They are
therefore critical to the information system as a whole. The purpose of this study is to present
the tools and techniques that an attacker can use, to examine the impact of such attacks on the
system, and to propose methods to prevent and detect successful XSS attacks.
Fig 1.1 Analysis of Vulnerabilities
1.6. Methodology and Software Lifecycle for This Project
The waterfall model is a classical model used in the system development life cycle to create a
system with a linear and sequential approach. It is termed a waterfall because the model develops
systematically from one phase to another in a downward fashion. The waterfall approach does
not define the process to go back to the previous phase to handle changes in requirements. The
waterfall approach is the earliest approach that was used for software development.
Fig 1.2 Project Methodology
Chapter 2: Problem Definition
This chapter discusses the precise problem to be solved and the expected outcome.
2.1. Problem Statement
With the rapid development of the Internet, Web security issues have become increasingly
prevalent; hackers will exploit Web vulnerabilities to infiltrate websites, resulting in numerous
security incidents. Web vulnerability scanners on the market have several issues, including
insufficient scanning accuracy, large software, low scalability, and so on.
Traditional scanners generally obtain the URLs of the website via a crawler, send requests to the
website that carry attack payloads in their parameters, and output the corresponding vulnerability
report if the payload is successfully verified in the response.
Based on these security threats, using vulnerability scanners to detect vulnerabilities on websites
has some value. This Website Vulnerability Scanner uses a callable plug-in framework to
automate the scanning process, send a request with parameters to the target website, and detect
website vulnerabilities based on the response.
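The request-and-response check described in this section can be sketched as follows. It is an illustration under stated assumptions, not the project's plug-in framework: `inject`, `scan_reflected_xss`, the probe string, and `fake_fetch` are hypothetical names. The `fetch` argument is any callable that returns the response body for a URL, and `fake_fetch` stands in for a real HTTP client so the example runs offline.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

PAYLOAD = "<script>alert(1)</script>"  # illustrative probe string


def inject(url, param, payload):
    """Return url with the value of one query parameter replaced by payload."""
    parts = urlparse(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = [(k, payload if k == param else v) for k, v in pairs]
    return urlunparse(parts._replace(query=urlencode(query)))


def scan_reflected_xss(url, fetch, payload=PAYLOAD):
    """Return the query parameters whose payload is reflected unencoded."""
    findings = []
    names = [k for k, _ in parse_qsl(urlparse(url).query, keep_blank_values=True)]
    for param in names:
        body = fetch(inject(url, param, payload))
        if payload in body:  # payload came back without encoding
            findings.append(param)
    return findings


def fake_fetch(url):
    """Stand-in server: echoes 'q' unencoded and 'id' HTML-encoded."""
    params = dict(parse_qsl(urlparse(url).query))
    return (
        f"<p>Results for {params.get('q', '')}</p>"
        f"<p>Item {html.escape(params.get('id', ''))}</p>"
    )


vulnerable = scan_reflected_xss("https://target.example/search?q=shoes&id=7", fake_fetch)
```

Here only `q` is reported, because the stand-in server encodes `id` before echoing it. Such a scan should be run only against sites the user is authorized to test.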
2.2. Deliverables and Development Requirements
In this project, the deliverables consist of an input in the form of equipment (i.e., hardware
and software components), a process applied to that input (i.e., the development phases in which
development is completed), and a resulting output. In this case, the product deliverables are the
completed parts or modules of the project. The input will be hardware and software-based
components. The project is divided into different modules, and each module is a major milestone
in the project.
2.3. Proposed System
System architecture helps in better understanding of the process involved at various stages. It
helps in making the system design efficient and in understanding the nature of the system. It
defines the analyses, findings, and more views of a system. It gives a formal description and
representation of a system, organized in a way that supports the structures and behaviours of the
system.
The main function of this system is to integrate and collect resources from different tools to
generate payloads. It then scans the entire website with the help of these tools to find affected
endpoints. The generated payloads are used to exploit the vulnerabilities in the affected
endpoints, and a report is then generated based on user preference.
2.4. Project Deliverable
Following are the deliverable and development requirements:
2.4.1 Projects Deliverable
In this project, the deliverables consist of an input in the form of equipment (i.e., hardware
and software components), a process applied to that input (i.e., the development phases in which
development is completed), and a resulting output (the completed project, “Website Vulnerability
Scanning”). In this case, the product deliverables are the completed parts or modules of the
project. The input will be hardware and software-based components. The project is divided into
different modules, and each module is a major milestone in the project.
2.4.2 Development Requirements
Development requirements are the requirements without which development is not possible. They can
be hardware, software, or any other kind of requirement. These include software and hardware
equipment, time and data constraints, budget, planning, adherence to the SDLC, etc. Development
requirements are met to make sure that the result does not differ from what is expected and that
it can perform its functionality accurately without any glitches.
2.5. Operating Environment
Operating environment for the “Website Vulnerability Scanning” is as listed below:
Operating System: Windows 10, macOS, Ubuntu
Database: MySQL
2.6. Assumptions and Dependencies:
Assumptions:
The application developed from this technique is more efficient than others. The software gives
trust, security, and fast scanning.
Dependencies:
This software is dependent on internet connectivity.
Chapter 3: Requirement Analysis
In this chapter, we define all the requirements of the proposed system, including functional and
non-functional requirements. We also discuss the use cases of the system and see how our system
responds to them.
3.1 Use Case Model
In the Unified Modeling Language (UML), a use case diagram summarizes the system's users (also
known as actors) and their interactions with the system. Following are the use cases of the
Website Vulnerability Scanning system.
Fig 3.1 System Use Case
3.2. Functional Requirements
• OUTPUT DESIGN
Outputs from computer systems are required primarily to communicate the results of processing
to users. They are also used to provide a permanent copy of the results for later consultation. The
various types of outputs in general are:
a. External Outputs, whose destination is outside the organization.
b. Internal Outputs, whose destination is within the organization.
c. User’s main interface with the computer.
d. Operational outputs, whose use is purely within the computer department.
e. Interface outputs, which involve the user in communicating directly with the system.
• INPUT DESIGN
Input design is a part of overall system design. The main objective during the input design is as
given below:
a) To produce a cost-effective method of input.
b) To achieve the highest possible level of accuracy.
c) To ensure that the input is acceptable and understood by the user.
• INPUT STAGES
The main input stages can be listed as below:
a) Data recording
b) Data transcription
c) Data conversion
d) Data verification
e) Data control
f) Data transmission
g) Data validation
h) Data correction
• ERROR AVOIDANCE
At this stage care must be taken to ensure that input data remains accurate from the stage at
which it is recorded up to the stage at which it is accepted by the system. This can be achieved
only by means of careful control each time the data is handled.
• ERROR DETECTION
Even though every effort is made to avoid errors, a small proportion of errors is still likely to
occur. These errors can be discovered by using validations to check the input data.
• DATA VALIDATION
Procedures are designed to detect errors in data at a lower level of detail.
Data validations have been included in the system in almost every area where there is a
possibility for the user to commit errors. The system will not accept invalid data.
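For a scanner, the most important input to validate is the target URL. The following is a minimal sketch of such a check, assuming that only http and https targets are accepted; the function name `validate_target_url` and its error messages are hypothetical and not taken from the project.

```python
from urllib.parse import urlparse


def validate_target_url(raw):
    """Return the trimmed URL, or raise ValueError if it is not scannable."""
    candidate = raw.strip()
    parts = urlparse(candidate)
    if parts.scheme not in ("http", "https"):
        raise ValueError("URL must start with http:// or https://")
    if not parts.hostname:
        raise ValueError("URL must include a host name")
    return candidate


accepted = validate_target_url("  https://target.example/login  ")
```

Inputs such as `example.com` (no scheme) or `javascript:alert(1)` (wrong scheme) are rejected before any request is sent.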
3.3. Non-Functional Requirements
a) User can change their authentication credentials.
b) Privacy of information should be audited.
c) Every unsuccessful attempt by a user to access an item of data shall be recorded on an
audit trail.
3.3.2 Actors Description
An actor is a person, organization, or external system that plays a role in one or more
interactions with the system. We have these actors.
Admin
The admin manages the database, which contains information regarding the institute and employees.
User
Cybersecurity professional who wants to scan the site to check if it is vulnerable or not.
3.4 Use Case Description
Use Case ID: ID-01
Use Case Name: Authentication
Actors: Users
Description: The user authenticates himself/herself by registering.
Trigger: When the user gets started with the application.
Preconditions: The user must have a website URL.
Post conditions: The user will log in successfully and generate the report.
Normal Flow: The user enters the credentials.
After authentication, the user logs in successfully.
Alternative Flows: If account verification fails, the user tries again
until verification succeeds.
Exceptions: If the user is not authenticated, an error message will appear.
Special Requirement: The user must log in for unlimited scans.
Assumptions: None
Notes and Issues: The user has three attempts to authenticate; after that, the
account will be blocked.
Table 3.4.1: Authentication
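The three-attempt rule noted in the table could be enforced along the following lines. This is a sketch under stated assumptions, not the project's implementation: `LoginGuard` is a hypothetical class, `verify` is any callable that checks a username and password (for example against hashed passwords in the database), and failure counts are kept in memory rather than stored persistently.

```python
class LoginGuard:
    """Blocks an account after a fixed number of consecutive failed logins."""

    def __init__(self, verify, max_attempts=3):
        self._verify = verify            # callable(username, password) -> bool
        self._max_attempts = max_attempts
        self._failures = {}              # username -> consecutive failures

    def login(self, username, password):
        """Return 'ok', 'failed', or 'blocked'."""
        if self._failures.get(username, 0) >= self._max_attempts:
            return "blocked"
        if self._verify(username, password):
            self._failures[username] = 0  # success resets the counter
            return "ok"
        self._failures[username] = self._failures.get(username, 0) + 1
        if self._failures[username] >= self._max_attempts:
            return "blocked"
        return "failed"


# Hypothetical verifier used only for demonstration.
guard = LoginGuard(lambda user, pw: pw == "correct-password")
results = [guard.login("alice", "wrong") for _ in range(3)]
after_block = guard.login("alice", "correct-password")
```

Once blocked, even the correct password is refused until an administrator or a timeout resets the counter; that reset path is left out of the sketch.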
Use Case ID: U_ID_02
Use Case Name: Scanning
Actors: User
Description: The system shows a form in which the user provides the URL of the
target website, and the system scans it.
Trigger: When the user wants to scan a website.
Preconditions: The user should be authenticated.
The user should have an internet connection.
Post conditions: The system will show the vulnerabilities found.
Normal Flow: The user can scan websites without authentication a limited
number of times; for unlimited access, the user must
authenticate himself/herself.
Alternative Flows: None
Exceptions: None
Special Requirements: None
Assumptions: The web application will be connected to the system and checked
for vulnerabilities.
Notes and Issues: If a vulnerability is found, the user can get a report.
Table 3.4.2: Scanning
Chapter 4: Design and Architecture
This chapter discusses the design and architecture of our system.
4.1 System Architecture
Proposed Architecture for Web Vulnerability Scanning System.
Fig 4.1 System Architecture
4.2 System Design
Systems design is the process of defining elements of a system like components, modules,
architecture and their interfaces and data for a system based on the specified requirements.
The purpose of the System Design process is to provide sufficient detailed data and information
about the system. Following is the system design of the Web Vulnerability Scanning System.
4.2.1 UML Structural Diagrams
Following are the UML structural diagrams of our system:
[Link] Component Diagram
Fig 4.2 Component Diagram
[Link] Package Diagram
[Diagram labels: Presentation layer, User interface, Vulnerability scanning, XSS Scanning, User,
Admin, Business logic, Presentation component, Backend code, Admin DB, Vulnerability scanner DB]
Fig 4.3 Package Diagram
4.3 UML Behavioral Diagram
Behavioral diagrams are another important group of UML diagrams, used to describe the dynamic
aspects of the system. An activity diagram represents the flow from one activity to another, where
an activity can be described as an operation of the system.
4.3.1 Activity Diagram
Fig 4.4 Activity Diagram
4.4 Class Diagram
Fig 4.5 Class Diagram
4.5 Sequence Diagram
Fig 4.6 Sequence Diagram
4.6 Landing Page
Fig 4.7 Home Page
4.7 Registration Page
Fig 4.8 Registration Page
4.8 Website Vulnerability Scanner
Fig 4.9 Web Scanner
4.9 XSS Scanner
Fig 4.10 Cross-site Scripting
4.10 SQL Injection Scanner
Fig 4.11 SQL Injection