Final Year Project
A Report Submitted
in partial fulfilment of the requirements for the degree of
Bachelor of Engineering
By
Bishwas Sagar (Roll No.: 10331720018)
Md. Musharraf (Roll No.: 10331720032)
Avinash Kumar (Roll No.: 10331720016)
Sayan Samanta (Roll No.: 10331720047)
Signature of students
……………………………………
Bishwas Sagar (Roll No.: 10331720018)
……………………………………
Md Musharraf (Roll No.: 10331720032)
…………………………………..
Avinash Kumar (Roll No.: 10331720016)
……………………………………
Sayan Samanta (Roll No.: 10331720047)
ACKNOWLEDGEMENT
INDEX
Sl. No. Content
Abstract
CHAPTER 1
INTRODUCTION
1.1 Introduction
1.2 Motive and Objective
CHAPTER 2
SOFTWARE DESIGN
2.1 Overview
2.2 Email Checker
2.3 Malware Analyzer
2.4 Password Checker
2.5 Password Generator
CHAPTER 3
TOOLS & LIBRARIES USED
3.1 Python
3.2 VirusTotal
3.3 Have I Been Pwned
3.4 GetPass
3.5 Requests
3.6 Colorama
3.7 Art
3.8 Hashlib
3.9 Sys
3.10 Time
3.11 Threading
3.12 Tkinter
3.13 Pyperclip
CHAPTER 4
PROGRAM FLOW
4.1 Overview
4.2 EmailChecker.py
4.3 MalwareAnalyzer.py
4.4 PasswordChecker.py
4.5 PasswordGenerator.py
CHAPTER 5
CONCLUSION
5.1 Conclusion
REFERENCES
SAMPLE CODE
Chapter 1
ABSTRACT
In this online world, news about malware attacks on organizations, data
breaches affecting millions of users, and other cyber threats has become
increasingly common. The general public has little information about these
threats and is not safe from them. On average, a user's private information has
been exposed in at least 3 breaches. Secrets such as passwords may be stolen or
unintentionally exposed publicly by their owner. In such cases, they are deemed
compromised and need to be revoked and replaced immediately. Unfortunately, it
is rarely possible for users to know whether their secrets have been publicly
exposed.
Closing this gap, we introduce "Malware Analyzer and Password Security Tool",
an open-source solution for proactive cybersecurity. It uses signature-based
checks to help users test files for possible malware, and it queries public
breach databases to determine whether an email address or password has appeared
in a data breach. The tool provides an intuitive command-line interface for
users to analyze suspicious files, check whether email addresses or passwords
have been compromised, and generate secure random passwords.
This tool aims to promote cybersecurity best practices among common users
through proactive notifications, integrity checks, and assistance with improving
security posture. Ongoing and future enhancements such as heuristics-based
malware prediction, password manager integration, and cross-platform
accessibility aim to further its adoption.
INTRODUCTION
phone number, it makes it simple for attackers to guess their passwords. Users
make such choices because they help them recall passwords quickly. They also
occasionally reuse the same password across several websites to reduce the
number of passwords they need to remember. Another issue is that users seldom
select passwords that are both difficult to guess and simple to remember.
Experts have offered varied advice and standards to help users select strong
passwords. To address this problem, this report presents an automated method
that combines text, symbols, and numbers to create strong, user-friendly
passwords. The suggested password generator provides diffusion,
unpredictability, and confusion, which are vital if an attacker attempts to
crack the passwords. We also conduct an entropy analysis and explore the
implications of our results for the NIST password policy guidelines.
Recent statistics paint a grim picture: over 90% of login credentials available
online come from data breaches. The average internet user has at least 25
compromised accounts, and over 3 billion credential pairs are available to
cybercriminals. Around 70% of businesses have experienced some form of
malware attack, at an average cost of $2.6 million. Further, 81% of breaches
leverage stolen or weak passwords.
This underscores the need for accessible cybersecurity tools tailored for the
common user to proactively assess threats and improve security posture. Closing
this gap, our motivation was to create an easy-to-use integrated toolkit covering
key aspects like malware scanning, breach monitoring, and password best
practices. The tool aims to empower users to verify integrity, check exposure
status, and fix vulnerabilities before extensive damage occurs. The specific
objectives were:
1. File Integrity Checks: Enable users to scan any file on their local systems
against an aggregated set of threat intelligence to identify malware
proactively before execution. This prevents infection while saving time and
overhead of post-execution antivirus scans.
2. Breach Monitoring: Continuously monitor and alert users if their email
addresses or passwords show up in public data dumps or breaches. This
allows rapid response to contain threats and prevent lateral movement.
3. Secure Passwords: Assist users in generating complex random passwords
and securely storing them in a local vault to prevent reuse across sites and
services. This limits damage in case of a future breach.
4. Accessibility and Portability: Craft a lightweight command-line interface
with minimal dependencies, making the tool easy to install and run on
anything from low-end systems such as the Raspberry Pi up to servers.
Chapter 2
SOFTWARE DESIGN
The proposed tool is written entirely in Python and is divided into five
parts.
The first part, driver.py, gives the user an interactive interface to the
tool's features and connects the other four parts: Email Checker,
Malware Analyzer, Password Checker, and Password Generator.
• The main script provides a menu-driven interface with the following options:
1. Scan a file for malware.
2. Check if an email has been leaked on the dark web.
3. Check if a password has been leaked on the dark web.
4. Generate and copy a secure password.
5. Exit the program.
• The menu is displayed using the Colorama library for colored output and the
Art library for ASCII art.
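The menu dispatch described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual driver.py: the four handler functions are hypothetical stand-ins for the feature modules, and plain text is used in place of the Colorama and Art output so that the sketch has no third-party dependencies.

```python
# Hypothetical stand-ins for the four feature modules.
def scan_file():
    return "malware scan started"


def check_email():
    return "email check started"


def check_password():
    return "password check started"


def generate_password():
    return "password generated"


# Menu key -> (label shown to the user, handler to invoke).
MENU = {
    "1": ("Scan a file for malware", scan_file),
    "2": ("Check if an email has been leaked", check_email),
    "3": ("Check if a password has been leaked", check_password),
    "4": ("Generate and copy a secure password", generate_password),
}
EXIT_KEY = "5"


def render_menu():
    """Build the menu text, one numbered option per line."""
    lines = [f"{key}. {label}" for key, (label, _) in MENU.items()]
    lines.append(f"{EXIT_KEY}. Exit the program")
    return "\n".join(lines)


def dispatch(choice):
    """Run the handler for a menu choice; return None if it is unknown."""
    entry = MENU.get(choice.strip())
    return entry[1]() if entry else None


def main():
    """Interactive loop: show the menu until the user picks the exit option."""
    while True:
        print(render_menu())
        choice = input("Select an option: ")
        if choice.strip() == EXIT_KEY:
            break
        result = dispatch(choice)
        print(result if result is not None else "Invalid option, try again.")
```

Calling `main()` starts the interactive loop. Keeping the option table separate from the loop means a new feature only needs one more entry in `MENU`.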
Email Checker (EmailChecker.py):
EMAIL CHECKER
This feature lets a user check whether their email address appears in known
data breaches and, if it does, reports the breaches in which it was found.
[Figure: output shown when a breach is found]
MALWARE ANALYZER
This feature lets a user choose a file from their filesystem for malware
analysis and informs them if the file is found to be malicious.
It uses signature-based checks to identify malware.
Figure 2.4 (Browse Window to Choose File)
PASSWORD CHECKER
This feature lets a user check whether their password has been found in
known data breaches.
Figure 2.8 (If Password Has Not Been Found in The Breach)
PASSWORD GENERATOR
This feature generates a secure random password for the user and copies it
to the clipboard.
Chapter 3
TOOLS & LIBRARIES USED
Python
Python was selected as the core language due to its remarkable versatility,
readability, and robust ecosystem of libraries. Its clear and concise syntax
facilitates easy comprehension, making it an ideal choice for a wide range of
applications. The language's versatility allows seamless integration of security,
cryptography, and engineering-focused libraries, fostering the development of
robust and secure systems.
VirusTotal
intelligence platforms. By aggregating data from over 70 antivirus scanners and
other security sources, VirusTotal provides a consensus assessment of potential
threats.
Have I Been Pwned
Have I Been Pwned (HIBP) is a crucial online service that offers public access to
comprehensive breach databases and APIs, empowering individuals and
organizations to monitor the security status of their email addresses and
passwords. By aggregating data from thousands of security incidents, HIBP
compiles a repository of compromised identities and secrets, numbering in the
billions. This vast database enables users to proactively assess whether their
credentials have been compromised in any notable data breaches.
HIBP's breach status monitoring is facilitated through user-friendly interfaces and
programmable APIs, allowing seamless integration into various security systems
and applications. Users can input their email addresses or passwords into the
system, which then cross-references this information with its extensive database.
If a match is found, users receive immediate alerts, prompting them to take swift
action such as changing passwords and enhancing their online security.
GetPass
The primary use case for `getpass` lies in scenarios where confidentiality is
paramount, such as authentication processes or handling sensitive data. By
obscuring user input, it prevents prying eyes from capturing sensitive
information, enhancing overall security.
effectiveness depends on the broader security measures implemented within the
application.
Example Usage
Here's a simple example demonstrating the use of the getpass library to collect a
password from the user:
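A minimal sketch of such usage is given here. The function name `read_password` and its `reader` parameter are assumptions introduced for illustration (the parameter lets the function be exercised without a terminal); only `getpass.getpass` itself comes from the standard library.

```python
import getpass


def read_password(prompt="Enter password: ", reader=getpass.getpass):
    """Prompt for a password without echoing the typed characters."""
    password = reader(prompt)
    if not password:
        raise ValueError("password must not be empty")
    return password
```

Calling `read_password()` in a terminal shows the prompt and hides the input as it is typed.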
Requests
The Requests library in Python is a popular and versatile tool for making HTTP
requests. It simplifies the process of sending HTTP requests and handling
responses, making it a go-to choice for developers when working with web APIs
or conducting web scraping. The Requests library provides a simple and elegant
API for sending HTTP requests. With just a few lines of code, developers can
perform common HTTP methods such as GET, POST, PUT, and DELETE.
Requests supports various authentication mechanisms, including basic and
digest authentication, OAuth through the requests-oauthlib extension, and
custom authentication schemes. This makes it a suitable choice for accessing
APIs that require authentication.
Example Usage:
Here is a simple example demonstrating how to use the Requests library to make
a GET request.
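One possible shape for such a request is sketched below. The helper name `fetch_json` and its `session` parameter are assumptions for illustration; the parameter allows a stub to be passed in for offline testing, and Requests is imported only when no session is supplied.

```python
def fetch_json(url, session=None, timeout=10):
    """Send a GET request and return the decoded JSON body."""
    if session is None:
        import requests  # third-party: pip install requests

        session = requests.Session()
    response = session.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing them.
    response.raise_for_status()
    return response.json()
```

Passing an explicit `timeout` matters in practice, because Requests otherwise waits indefinitely for an unresponsive server.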
Colorama
Colorama is a Python library designed to simplify adding color to terminal text.
It provides an easy-to-use interface for adding colored output to command-line
applications. This library has become a valuable asset for developers seeking to
enhance the visual appeal and user experience of their command-line interfaces
(CLIs). Colorama works seamlessly across different operating systems, including
Windows, macOS, and Linux. This ensures consistent behavior and appearance
regardless of the underlying platform.
Colorama leverages ANSI escape codes to apply colors to text. This widely
supported standard ensures compatibility with a broad range of terminal
emulators and environments.
Example Usage
Here's a simple example illustrating the use of Colorama to print colored text in
a Python script:
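A minimal sketch follows. The helper `colored` and the two sample messages are illustrative, not taken from the project. As an assumption for this sketch, the code falls back to the raw ANSI escape codes when Colorama is not installed, so that it still runs.

```python
try:
    from colorama import Fore, Style, init

    init()  # makes ANSI color codes work on Windows consoles as well
    GREEN, RED, RESET = Fore.GREEN, Fore.RED, Style.RESET_ALL
except ImportError:
    # Fallback (assumption for this sketch): the equivalent raw ANSI codes.
    GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def colored(text, color):
    """Wrap text in a color code and reset the style afterwards."""
    return f"{color}{text}{RESET}"


print(colored("[+] No breach found", GREEN))
print(colored("[!] Breach found", RED))
```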
Art
The Art library enables integration of ASCII art designs into text interfaces. In
this tool, ASCII art is leveraged to create visually appealing custom banners and
headers for the main menu and other application areas. This enhances the
aesthetics and makes the command line tool more engaging compared to plain
text. The use of art contributes to a positive overall user experience.
Hashlib
Hashlib is a Python library that provides a secure and efficient interface for hash
functions, commonly used in various security and data integrity applications. This
library simplifies the process of generating hash values for data, providing a
robust foundation for tasks such as password hashing, digital signatures, and data
integrity verification.
Hash functions play a crucial role in computer science and information security
by converting variable-length data into a fixed-size hash value.
Hashlib supports a variety of hash algorithms, including MD5, SHA-1, SHA-224,
SHA-256, SHA-384, SHA-512, and more. This allows developers to choose the
appropriate algorithm based on their specific requirements.
Example Usage:
Here is a simple example demonstrating the use of Hashlib to calculate the SHA-
256 hash of a string in Python:
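A minimal sketch, with `sha256_hex` as an illustrative helper name:

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    # Hash functions operate on bytes, so the string is encoded first.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(sha256_hex("hello"))
```

The same input always produces the same digest, while any change to the input produces an unrelated one. This property is what allows a file's hash to act as its signature.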
Sys
The sys module in Python provides access to some variables used or maintained
by the Python interpreter and functions that interact strongly with the interpreter.
It is part of the Python Standard Library and is always available for use in Python
programs.
The sys.argv variable is a list in Python, which contains the command-line
arguments passed to the script. The first element (sys.argv[0]) is the name of the
script itself.
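As a brief sketch of how a script might read an optional argument from this list (the helper name `target_from_args` is hypothetical):

```python
import sys


def target_from_args(argv):
    """Return the first argument after the script name, or None if absent."""
    return argv[1] if len(argv) > 1 else None


# When run as "python script.py sample.bin", this prints "sample.bin".
print(target_from_args(sys.argv))
```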
Time
Time is a critical aspect of programming, and the time module in Python
facilitates the manipulation and representation of time-related information. It
provides access to the system's clock and allows developers to measure the time
taken by a program, format and parse dates, and create time delays.
Example Usage:
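As a minimal sketch, the following measures how long a call takes, creates a short delay, and formats the current local time. The helper name `timed` is illustrative.

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# time.sleep creates a delay; timed() measures how long it actually took.
_, elapsed = timed(time.sleep, 0.1)
print(f"slept for {elapsed:.3f} s")

# Format the current local time as a readable timestamp.
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
```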
Threading
Threading is a technique that lets a program run several tasks concurrently.
Python's threading module allows developers to create and manage threads,
which are lightweight units of execution that share the same memory space. In
the default CPython build, the global interpreter lock allows only one thread
to execute Python bytecode at a time, so threads are best suited to I/O-bound
work such as waiting on network requests rather than CPU-bound parallelism.
Example Usage
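A minimal sketch of starting several threads and collecting their results is shown here. The worker function and its workload are illustrative. A lock guards the shared dictionary because threads share memory.

```python
import threading

results = {}
results_lock = threading.Lock()


def worker(name, n):
    """Compute a sum and store it under this worker's name."""
    total = sum(range(n))
    # Serialize writes to the shared dictionary.
    with results_lock:
        results[name] = total


threads = [
    threading.Thread(target=worker, args=(f"t{i}", 1000 * (i + 1)))
    for i in range(3)
]
for thread in threads:
    thread.start()
# join() blocks until each thread has finished.
for thread in threads:
    thread.join()

print(results)
```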
Tkinter
Tkinter is a Python library that facilitates the creation of graphical user interfaces.
It is a thin object-oriented layer on top of the Tcl/Tk GUI toolkit. Tkinter provides
a set of tools for constructing GUI applications with a variety of widgets and
layout options. Its simplicity and ease of integration make it a popular choice for
both beginners and experienced developers.
Tkinter includes a rich set of widgets (GUI elements) such as labels, buttons,
entry fields, text boxes, and more. These widgets can be customized and arranged
to create interactive and visually appealing interfaces.
Tkinter provides geometry managers (pack, grid, and place) to control the
placement and organization of widgets within a window. This makes it easy to
create responsive and flexible layouts.
Example Usage:
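A minimal sketch, assuming the tool uses Tkinter for a file-selection dialog like the browse window in Figure 2.4. The function name `choose_file` and the dialog title are hypothetical. Tkinter is imported inside the function so that the module still loads on systems without a display.

```python
def choose_file():
    """Open a native file-selection dialog and return the chosen path."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialog is shown
    path = filedialog.askopenfilename(title="Select a file to scan")
    root.destroy()
    # An empty result means the user cancelled the dialog.
    return path or None
```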
Pyperclip
Example Usage:
Here's a simple example demonstrating the use of Pyperclip to copy and paste
text:
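A minimal sketch using Pyperclip's `copy` and `paste` functions. The wrapper name `copy_to_clipboard` and its `clipboard` parameter are assumptions for illustration; the parameter lets a stand-in object be supplied on systems without a clipboard.

```python
def copy_to_clipboard(text, clipboard=None):
    """Copy text to the clipboard and return what the clipboard now holds."""
    if clipboard is None:
        import pyperclip  # third-party: pip install pyperclip

        clipboard = pyperclip
    clipboard.copy(text)
    return clipboard.paste()
```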
Chapter 4
PROGRAM FLOW
Driver.py
The script imports and coordinates plugins to handle the underlying functionality
of each option. For example, the MalwareAnalyzer module interfaces with
VirusTotal APIs to upload and scan files. The EmailChecker and
PasswordChecker modules query the Have I Been Pwned databases to check
breach status, and PasswordGenerator uses cryptography to create strong
passwords for users.
After printing an ASCII art header and menu, driver.py prompts for necessary
inputs, clears the console, prints a custom header, and invokes the corresponding
plugin workflow. It handles all user interaction and flow coordination
responsibilities, keeping the plugins focused strictly on their domain logic. This
modular architecture should prove scalable, with the flexibility to easily onboard
new plugins expanding the capabilities of the toolkit. Robust error handling and
connectivity checks aim to make the toolkit resilient and user-friendly. Overall,
driver.py delivers an extensible framework for security tools, avoiding repetition
through reusable plugins.
EmailChecker.py
This part of the program lets a user check whether their email address appears
in known data breaches and, if it does, reports where it was found.
When the user chooses the email checker function, the program first confirms
connectivity by checking whether the HIBP database is accessible. It then asks
the user to enter their email address, which is checked against the breach
database. If the email is found, the user is shown the breaches in which it
appeared, with details such as breach names, dates, and other relevant
information. If the email is not found in any data breach, the user is
informed that they are safe.
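The lookup step described above might be sketched as follows. The report does not state which HIBP endpoint the module calls, so this sketch assumes the HIBP v3 `breachedaccount` endpoint, which requires an API key. The function name `check_email`, the user-agent string, and the `session` parameter are hypothetical.

```python
from urllib.parse import quote

# Assumed endpoint: HIBP API v3, breaches for a single account.
HIBP_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/{account}"


def check_email(email, api_key, session=None):
    """Return a list of (breach name, breach date) pairs for an email."""
    if session is None:
        import requests  # third-party: pip install requests

        session = requests.Session()
    response = session.get(
        HIBP_URL.format(account=quote(email.strip())),
        headers={"hibp-api-key": api_key, "user-agent": "security-toolkit-sketch"},
        # Ask for full breach records instead of names only.
        params={"truncateResponse": "false"},
        timeout=10,
    )
    # HIBP answers 404 when the account appears in no known breach.
    if response.status_code == 404:
        return []
    response.raise_for_status()
    return [(item["Name"], item["BreachDate"]) for item in response.json()]
```

An empty list corresponds to the "not found in any breach" branch of the flow, and a non-empty list carries the breach names and dates shown to the user.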
Figure 4.2 (Flow Chart for Email Checker)
MalwareAnalyzer.py
In the process of pre-emptively assessing the safety of a file before execution, the
system employs a robust mechanism. When a user initiates a scan of a specific
file, the code systematically calculates the cryptographic hash of the given file.
Subsequently, the calculated hash is transmitted securely via an API call to the
VirusTotal database, a reputable repository of known malicious file signatures.
The VirusTotal API, in turn, cross-references the provided hash with its extensive
database, which aggregates hash values associated with previously identified
malicious files. If a match is found, indicating that the calculated hash
corresponds to a known malicious file, the system promptly flags the file as
potentially harmful. Conversely, if the calculated hash does not match any
entry in the VirusTotal database, the tool reports that no known threat was
detected. Such a result is not proof of safety, because signature-based checks
cannot recognize new or modified malware.
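The hash-and-lookup flow can be sketched as follows. The report does not name the hash algorithm or the API version, so SHA-256 and the VirusTotal v3 file-report endpoint are assumptions here. The function names and the `session` parameter are hypothetical.

```python
import hashlib

# Assumed endpoint: VirusTotal API v3 file report, looked up by hash.
VT_URL = "https://www.virustotal.com/api/v3/files/{file_hash}"


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup_hash(file_hash, api_key, session=None):
    """Return 'malicious', 'clean', or 'unknown' for a file hash."""
    if session is None:
        import requests  # third-party: pip install requests

        session = requests.Session()
    response = session.get(
        VT_URL.format(file_hash=file_hash),
        headers={"x-apikey": api_key},
        timeout=30,
    )
    # 404 means VirusTotal has no record of this hash.
    if response.status_code == 404:
        return "unknown"
    response.raise_for_status()
    stats = response.json()["data"]["attributes"]["last_analysis_stats"]
    return "malicious" if stats.get("malicious", 0) > 0 else "clean"
```

Returning a separate "unknown" verdict keeps a hash that VirusTotal has never seen distinct from one that scanners have examined and not flagged.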
PasswordChecker.py
The use of the Hashlib library ensures a secure and standardized approach to
hashing, while the API call to HIBP leverages external intelligence to evaluate
the password's breach status.
This systematic approach enhances the security posture of the system by
proactively identifying compromised passwords and notifying users to take
corrective actions. It aligns with best practices in password security, offering a
robust solution to mitigate the risks associated with the use of compromised
passwords.
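A sketch of one way to combine Hashlib with an HIBP lookup is given below. It assumes the HIBP Pwned Passwords range endpoint, where only the first five characters of the password's SHA-1 hash are sent and the matching is done locally. The report does not confirm that the module uses this endpoint, and the function names and the `session` parameter are hypothetical.

```python
import hashlib

# Assumed endpoint: Pwned Passwords range search (k-anonymity model).
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_sha1(password):
    """Return the SHA-1 hash of a password as (5-char prefix, 35-char suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, body):
    """Find the suffix in a 'SUFFIX:COUNT' listing; return its count or 0."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password, session=None):
    """Return how many times a password appears in known breaches."""
    prefix, suffix = split_sha1(password)
    if session is None:
        import requests  # third-party: pip install requests

        session = requests.Session()
    # Only the hash prefix leaves the machine, never the password itself.
    response = session.get(RANGE_URL.format(prefix=prefix), timeout=10)
    response.raise_for_status()
    return count_in_range(suffix, response.text)
```

A result of 0 means the password was not found in the returned range. Any positive count indicates that the password has appeared in a breach and should be changed.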
PasswordGenerator.py
This Python script offers a practical solution for users seeking to enhance their
password security by providing a convenient and customizable tool for generating
strong and secure passwords. The incorporation of advanced randomization
techniques, exclusion of visually ambiguous characters, and clipboard integration
collectively contribute to the script's effectiveness in promoting robust
cybersecurity practices. The research presented herein underscores the
importance of password management and highlights the role of open-source tools
in empowering users to fortify their digital defenses.
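The generation step can be sketched as follows. The report does not list the exact character pool or the source of randomness, so the symbol set, the set of ambiguous characters, and the use of the standard `secrets` module are assumptions made for this sketch. Clipboard integration is omitted.

```python
import math
import secrets
import string

# Assumed set of visually ambiguous characters to exclude.
AMBIGUOUS = set("O0oIl1|")
# Assumed pool: letters, digits, and a few symbols, minus ambiguous ones.
ALPHABET = "".join(
    ch
    for ch in string.ascii_letters + string.digits + "!@#$%^&*-_=+"
    if ch not in AMBIGUOUS
)


def generate_password(length=16):
    """Build a password from cryptographically secure random choices."""
    if length < 8:
        raise ValueError("length must be at least 8")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length, alphabet_size=len(ALPHABET)):
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


print(generate_password())
print(f"{entropy_bits(16):.1f} bits")
```

With this pool of 68 characters, a 16-character password carries roughly 97 bits of entropy. Excluding the ambiguous characters costs only a fraction of a bit per character.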
Chapter 5
CONCLUSION
While the current tool provides comprehensive capabilities for security best
practices, several promising enhancements have been identified to expand
protections even further. One such effort involves exploring integration with
additional antivirus engines beyond VirusTotal to reinforce malware detection
reliability through consensus scanning. Complementing signature-based
approaches with heuristics and machine learning is also on the technology
roadmap.
REFERENCES
[1] Henry Hosseini, Julian Rengstorf, and Thomas Hupperich: "Automated Search for Leaked
Private Keys on the Internet: Has Your Private Key Been Pwned?"
[2] F. Z. Glory, A. Ul Aftab, O. Tremblay-Savard and N. Mohammed, "Strong Password
Generation Based on User Inputs," 2019 IEEE 10th Annual Information Technology,
Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, Canada,
2019, pp. 0416-0423, doi: 10.1109/IEMCON.2019.8936178.
[3] AWS Labs (2019). awslabs/git-secrets: Prevents you from committing secrets and
credentials into git repositories. https://github.com/awslabs/git-secrets. (Accessed on
27/05/2022)
[4] Richard Shay, Saranga Komanduri, Patrick Gage Kelley, Pedro Giovanni Leon, Michelle
L. Mazurek, Lujo Bauer, Nicolas Christin, and Lorrie Faith Cranor. 2010. Encountering
stronger password requirements: user attitudes and behaviors. In Proceedings of the Sixth
Symposium on Usable Privacy and Security (SOUPS '10). Association for Computing
Machinery, New York, NY, USA, Article 2, 1–20.
https://doi.org/10.1145/1837110.1837113
[5] Robert Morris and Ken Thompson. 1979. Password security: a case history. Commun.
ACM 22, 11 (Nov. 1979), 594–597. https://doi.org/10.1145/359168.359172
[6] Gaurav Sood and Ken Cor. 2019. Pwned: The Risk of Exposure From Data Breaches. In
Proceedings of the 10th ACM Conference on Web Science (WebSci '19). Association for
Computing Machinery, New York, NY, USA, 289–292.
https://doi.org/10.1145/3292522.3326046
[7] James Scott, Sr. Fellow, ICIT. Signature Based Malware Detection. 2017. Institute for
Critical Infrastructure Technology
[8] Venugopal, Deepak and Hu, Guoning. ‘Efficient Signature Based Malware Detection on
Mobile Devices’. 1 Jan. 2008 : 33 – 49. Print.
[9] Saha, A., Denning, T., Srikumar, V., and Kasera, S. K. (2020). Secrets in Source Code:
Reducing False Positives using Machine Learning. In 2020 International Conference on
COMmunication Systems NETworkS (COMSNETS), pages 168–175. ISSN: 2155-2509.
[10] Hunt, T. (2022). Have I Been Pwned: Check if your email has been compromised in a
data breach. https://haveibeenpwned.com/. (Accessed on 27/05/2022).