
CAREER PREDICTION WEBSITE USING

MACHINE LEARNING
A Mini Project thesis submitted to the JAWAHARLAL NEHRU
TECHNOLOGICAL UNIVERSITY HYDERABAD in Partial Fulfillment of the
requirement for the award of the degree of

BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING(AI&ML)
Submitted By
SHAHANAZ BEGUM : (22D31A6644)
MD.ABDUL RAHMAN : (22D31A6633)
J.SAI LAXMAN : (22D31A6623)

K.SAKETH : (22D31A6628)

Under the Guidance of


Mrs. V.CHAITHANYA
Asst. Professor, Dept. of CSE

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


INDUR INSTITUTE OF ENGINEERING AND TECHNOLOGY
(Affiliated to J.N.T.U.H, Hyderabad)
Ponnala (Vil), Siddipet (Dist.), Telangana State – 502 277.
June 2025

Date: / /2025
CERTIFICATE
This is to certify that the thesis “CAREER PREDICTION WEBSITE
USING MACHINE LEARNING” being
submitted by
SHAHANAZ BEGUM : (22D31A6644)
MD.ABDUL RAHMAN : (22D31A6633)
J.SAI LAXMAN : (22D31A6623)
K.SAKETH : (22D31A6628)
In partial fulfilment for the award of “BACHELOR OF TECHNOLOGY” in
the Department of “COMPUTER SCIENCE & ENGINEERING”(B.Tech) to the
“JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD” is
a record of bonafide Mini Project Work carried out by them under our guidance and
supervision.
The results embodied in this thesis have not been submitted to any other
University or Institute for the award of any degree or diploma.

Mrs. V. CHAITHANYA Dr. T. BENARJI


PROJECT GUIDE Professor &
Asst. Professor, Dept. of CSE Head Dept. of CSE

EXTERNAL EXAMINER

ACKNOWLEDGEMENT

We are thankful to Mrs. V. CHAITHANYA, Project Guide, Asst. Prof., Dept. of CSE,
who guided us greatly with her helpful suggestions to complete our project. She is a
research-oriented personality with high-end technical exposure.

We are thankful to Dr. T. Benarji, Head, Dept. of CSE, Indur Institute of
Engineering & Technology, for extending his help in the department’s academic
activities during the course. He is a dynamic and enthusiastic personality in
academic activities.

We extend our thanks to Dr. V.P. Raju, Principal, Indur Institute of Engineering &
Technology, Siddipet, for extending his help throughout the duration of this project.

We sincerely thank all the lecturers of the Dept. of CSE for their motivation
during our B.Tech course.

We would like to thank all of our friends for their timely help and encouragement.
SHAHANAZ BEGUM : (22D31A6644)
MD.ABDUL RAHMAN : (22D31A6633)
J.SAI LAXMAN : (22D31A6623)
K.SAKETH : (22D31A6628)

DECLARATION
We hereby declare that the project entitled “CAREER PREDICTION WEBSITE USING
MACHINE LEARNING” is submitted in partial fulfilment of the requirement for the
award of the degree of Bachelor of Technology in Computer Science and Engineering.
This dissertation is our original work; the project has not formed the basis for the
award of any degree, associateship, fellowship or any other similar title, and no part
of it has been published or sent for publication at the time of submission.

BY:

SHAHANAZ BEGUM : (22D31A6644)


MD.ABDUL RAHMAN : (22D31A6633)
J.SAI LAXMAN : (22D31A6623)

K.SAKETH : (22D31A6628)

ABSTRACT

Most students across the country are constantly in a dilemma about their career
path after senior secondary schooling. At the age of 18, most students do not yet
have the maturity required to choose the right career path. Almost all students
have a series of questions, and a complex and confusing thought process, about
which field to pursue after 12th.
Also, most people doubt whether they have adequate skills. In this paper we
therefore discuss career prediction, which combines basic web development,
consisting of an exhaustive questionnaire, with machine learning classification
approaches such as Decision Trees and the KNN algorithm, to predict a career or
field that the student can pursue according to his or her interests.
The Python programming language is used.

CONTENTS

1 INTRODUCTION 1

1.1 Literature Survey 2


2 SYSTEM ANALYSIS 4
2.1 Existing System 4
2.1.1 Threshold-based monitoring 4
2.1.2 Traditional statistical models 4
2.2 Proposed System 4
2.2.1 Advantages of Proposed System 4
2.3 Feasibility Study 5
2.3.1 Economical Feasibility 5
2.3.2 Technical Feasibility 5
2.3.3 Social Feasibility 5
2.4 Requirement Analysis 6

2.5 Requirement Specification 6


2.6 System Specification 7
2.6.1 Hardware Requirement 7
2.6.2 Software Requirement 7
2.7 System Environment 7
2.7.1 Python 7
2.7.2 Django 24
3 SYSTEM DESIGN 37
3.1 Data Flow Diagram 37
3.2 UML Diagrams 38
3.3.1 Use case Diagram 39
3.3.2 Class Diagram 39
3.3.3 Sequence Diagram 40
3.3.4 Activity Diagram 40

4 IMPLEMENTATION 42
4.1 Modules 42
4.1.1 User 42
4.1.2 Admin 42
4.1.3 Machine Learning 42
4.2 Input and Output Design 43
4.2.1 Input Design 43
4.2.2 Output Design 44
5 RESULTS 46
5.1 Home Page 46
5.2 User Register Form 46
5.3 Admin Login Page 47
5.4 Admin Home Page 47
5.5 User View Page 48
5.6 User Login Page 48
5.7 User Home Page: 49
5.8 Dataset View: 49

5.9 K-Nearest Neighbours Confusion Matrix 50

5.10 Confusion Matrix (Decision Tree) 50

5.11 Entropy Confusion Matrix 51

5.12 ML Score 51

5.13 Prediction Form: 52

6 SYSTEM TESTING 53
6.1 Unit Testing 53
6.2 Integration Testing 53
6.3 Functional Testing 54
6.4 System Testing 54
6.5 White Box Testing 55
6.6 Black Box Testing 55

6.7 Unit Testing 55
6.8 Integration Testing 56
6.9 Acceptance Testing 56
7 CONCLUSION 57
7.1 Conclusion 58
7.2 Future Enhancement 58
8 BIBLIOGRAPHY 59

9 APPENDICES 62

LIST OF FIGURES
FIGURE NO. FIGURE NAME PAGE NO.
Fig 2.1 Django’s Architecture 23
Fig 2.2 Model View Template Architecture 24
Fig 3.1 Use case Diagram 36
Fig 3.2 Class Diagram 37
Fig 3.3 Sequence Diagram 38
Fig 3.4 Activity Diagram 39
Fig 5.1 Home page 44
Fig 5.2 Registration form 44
Fig 5.3 Admin Login page 45
Fig 5.4 Admin home page 45
Fig 5.5 User View page 46
Fig 5.6 User login page 46
Fig 5.7 User home page 47
Fig 5.8 Dataset view page 47
Fig 5.9 K-Nearest Neighbours 48
Fig 5.10 Confusion Matrix 48
Fig 5.11 Entropy Confusion Matrix 49
Fig 5.12 ML Score 49
Fig 5.13 Prediction page 50

CHAPTER 1
INTRODUCTION

When choosing a career path, it is important to consider not only the course, but also
what you aspire to become after graduation and where your interests lie. Career advice
helps by assisting you in understanding yourself, your skills and your abilities. Machine
learning is used in various fields and industries, for tasks such as clinical analysis,
image processing, classification, and regression. It gives systems the ability to learn
and improve automatically without being explicitly programmed. Machine learning can be
applied in three ways: supervised learning, unsupervised learning, and reinforcement
learning. Simply put, machine learning is the science of getting computers to learn and
act like humans. Analyzing a student's abilities is very important and should guide the
student on the right path. This concerns the choice of career and related training, then a
job, then whether to stay or change to another job, additional formal and informal
training courses, and so on. Many people encounter difficulties when making such
decisions, which often hinder them or force them to choose suboptimal options. The IT
revolution [1] is influencing individuals' career choices in two ways. First, there is
increasing demand for employees such as engineers, mathematicians, scientists and
technical experts [1], but several jobs could be lost to robotization [1]. Second,
information and communication technology has facilitated access to a variety of judgments
and analyses, both during individual career counseling and through various self-help
websites. Therefore, career advisors must constantly expand their knowledge and skills to
obtain new sources of information about available career assessments and environments
that can help them find the best assessments.

1.1 LITERATURE SURVEY

In most universities, proper counseling sessions and guidance are not provided to
students, which leads to confusion. In [2] we see that this may lead to students
enduring a course of study that they might not want, a lack of engagement, and a
sense of being directionless, while [3] argued that it may eventually lead to
disengagement from the subject. It is therefore necessary to draw attention to how
students choose the course they want to pursue. [4] stated that a student's decision
to choose a career involves multiple facets. [5] reinforced the role that parents play
in the career choices and aspirations of their children. [6] contended that parents
are generally unaware of the massive impact and role they have in their child’s
career decision. In contrast to [6], [7] reported that the influence of parents on the
choice of course is minimal, noting that guardians only persuade their wards to focus
on the school curriculum. This is still a very critical form of influence. [8] insisted
that educational mentors and guardians should work collectively with their wards to
reinforce this influence, which moves students towards their academics, with the
motive of enhancing the positive aspects that improve the prospects of school
students [9]. The practice of confused students consulting professional educational
counselors is highly impactful. Previous research in [10] established that some of the
variables to be considered when advising students about their career choices are the
student's interests, capabilities, abilities and character. However, [10] also argued
that a student's passions, which are a significant factor for a prosperous career, are
sometimes ignored by parents. If intrinsic interest is lacking, no amount of
inspiration can increase the student’s potential [11]. As per [12], some students are
compelled to make the same choices as their parents. It is worth noting that peers are
often influenced to opt for certain courses at further-education institutions according
to the institutions' ranking and status, as well as potential employment
opportunities [13]. [14] opined that the employment opportunities present in a
particular field are another vital determining factor. [15] discovered that the
prestige of a job role was a powerful motivator influencing students' career choices.
This prestige may manifest itself in how people in that profession dress [16]. As this
study is about guidance and regulation systems, [17] established that details of the
course, self-exploration, generational financial status, and parental and peer-group
constraints are major governing factors in students' choices.
According to the research by [18], individuals were most affected by prior course
knowledge, connections to professionals in the field, and work-related course
experiences. In Nigeria, the results of the Senior School Certificate Examination
(SSCE), the Joint Admission Matriculation Board's (JAMB's) benchmarks for the Unified
Tertiary Matriculation Examination (UTME), and the Post-UTME results, which are
typically conducted by selected Nigerian universities, are among the most recent
factors [19]. [20] noted that many candidates were dumped by the latter into courses
that they have no interest in, because they could not meet the cut-off level of the
selected universities. [21] believed that hiring teachers who are knowledgeable about
the subject matter is necessary to draw students to a path of study. Apart from this,
the pros and cons of a course also define and further alter students' perception
[22], [23]. This suggests that the morals of a profession can affect students' notions
of it [24]. In the age of networks and media, [25] believes that media visibility can
strongly influence secondary school pupils' choice of profession. The literature
studied identified determinants of career options among senior school students but
did not consider the use of computing advances to collect information about a field
for students. Insights from [26], [27], [28] indicate that expert systems advise high
school students by using a student's characteristics to recommend related courses.
These studies did not pay particular attention to less popular courses, such as the
construction profession, by providing data about the profession, university admission
requirements, faculties, and the outlook for career opportunities in construction to
support further career planning.

CHAPTER 2
SYSTEM ANALYSIS
2.1 EXISTING SYSTEM
2.1.1 Threshold-based monitoring
• Utilizes pre-defined thresholds on device attributes to detect anomalies.
Advantage: Simple to implement.
Disadvantage: Requires extensive domain expertise, low generalizability, and high
false alarms.
2.1.2 Traditional statistical models
• Applies methods like linear regression on device metrics to predict failure.
Advantage: Interpretable, low complexity.
Disadvantage: Often oversimplified, incapable of capturing complex patterns.

2.2 PROPOSED SYSTEM
The proposed career prediction system integrates machine learning models,
including Decision Tree and K-Nearest Neighbours (KNN) classifiers.
2.2.1 ADVANTAGES AND DISADVANTAGES
Advantages:
• Adaptive, adept at handling complex relationships in data.
• Automatically detects novel failure patterns.
• Scales effectively for large datasets.
• Offers insights into feature importance.
Disadvantages:
• Requires substantial training data.
• Lacks interpretability.
• Tedious hyperparameter tuning.
• Risk of overfitting.
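
The abstract names the KNN algorithm as one of the classification approaches used. As a rough illustration of that idea only, the following is a minimal pure-Python sketch of a k-nearest-neighbours vote. The three questionnaire features, the scores, the career labels and the function name predict_career are all invented for this example and are not taken from the project's dataset or code.

```python
import math
from collections import Counter

# Hypothetical training data: questionnaire scores (maths, creativity, communication)
# on a 0-10 scale, each labelled with a career field. All values are invented.
TRAINING = [
    ((9, 3, 4), "Engineering"),
    ((8, 2, 5), "Engineering"),
    ((3, 9, 6), "Design"),
    ((2, 8, 7), "Design"),
    ((4, 5, 9), "Management"),
    ((5, 4, 8), "Management"),
]


def predict_career(scores, k=3):
    """Return the majority label among the k training points closest to scores."""
    # Sort training points by Euclidean distance to the query point.
    neighbours = sorted(TRAINING, key=lambda item: math.dist(item[0], scores))
    # Vote among the labels of the k nearest points.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


print(predict_career((8, 3, 3)))
print(predict_career((3, 9, 5)))
```

A real deployment would normally rely on a library implementation and a much larger labelled dataset; the sketch only shows how the distance ranking and the majority vote fit together.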

2.3 FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business proposal
is put forth with a very general plan for the project and some cost estimates. During
system analysis, the feasibility study of the proposed system is carried out to ensure
that the proposed system is not a burden to the company. For feasibility analysis,
some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are:
• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY
2.3.1 ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have
on the organization. The amount of funds that the company can put into the research
and development of the system is limited. The expenditures must be justified. The
developed system is well within the budget, and this was achieved because most
of the technologies used are freely available. Only the customized products had to be
purchased.

2.3.2 TECHNICAL FEASIBILITY


This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on
the available technical resources, as this would in turn place high demands on the
client. The developed system must have modest requirements, as only minimal or no
changes are required for implementing this system.
2.3.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, but must instead accept it as a necessity.
The level of acceptance by the users depends solely on the methods that are employed
to educate the user about the system and to make him familiar with it. His level of
confidence must be raised so that he is also able to make some constructive criticism,
which is welcomed, as he is the final user of the system.

HARDWARE AND SOFTWARE REQUIREMENTS


2.4 REQUIREMENT ANALYSIS
The project involved analyzing the design of a few applications so as to make the
application more user-friendly. To do so, it was important to keep the navigation
from one screen to the other well ordered while reducing the amount of typing the
user needs to do. In order to make the application more accessible, the browser
version had to be chosen so that it is compatible with most browsers.
2.5 REQUIREMENT SPECIFICATION
Functional Requirements
• Graphical User interface with the User.
Software Requirements
For developing the application the following are the Software Requirements:
1. Python
2. Django
Operating Systems supported
1. Windows 10 64 bit OS
Technologies and Languages used to Develop
1. Python
2. Machine Learning
Debugger and Emulator
❖ Any Browser (Particularly Chrome)
Hardware Requirements
For developing the application the following are the Hardware Requirements:
❖ Processor: Intel i9
❖ RAM: 32 GB
❖ Space on Hard Disk: minimum 1 TB

2.6 SYSTEM SPECIFICATION:

2.6.1 HARDWARE REQUIREMENTS:


❖ System : Intel i7
❖ Hard Disk : 1 TB.
❖ Monitor : 14" Colour Monitor.
❖ Mouse : Optical Mouse.
❖ RAM : 8 GB.
2.6.2 SOFTWARE REQUIREMENTS:
❖ Operating system : Windows 10.
❖ Coding Language : Python.
❖ Front-End : HTML, CSS.
❖ Designing : HTML, CSS, JavaScript.
❖ Data Base : SQLite.

2.7 SYSTEM ENVIRONMENT

2.7.1 PYTHON
Python is a general-purpose interpreted, interactive, object-oriented, and
high-level programming language. An interpreted language, Python has a design
philosophy that emphasizes code readability (notably using whitespace indentation to
delimit code blocks rather than curly brackets or keywords), and a syntax that allows
programmers to express concepts in fewer lines of code than might be used in languages
such as C++ or Java. It provides constructs that enable clear programming on both small
and large scales. Python interpreters are available for many operating systems. CPython,
the reference implementation of Python, is open source software and has a community-
based development model, as do nearly all of its variant implementations. CPython is
managed by the non-profit Python Software Foundation. Python features a dynamic type
system and automatic memory management. It supports multiple programming
paradigms, including object-oriented, imperative, functional and procedural, and has a
large and comprehensive standard library.
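
To make the remark about multiple paradigms and dynamic typing concrete, here is a small sketch that computes the same sum of squares in three styles. The data and the class name SquareSummer are chosen purely for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Imperative / procedural style: explicit loop and a mutable accumulator.
total = 0
for n in numbers:
    total += n * n

# Functional style: compose map and reduce without mutating any state.
total_functional = reduce(lambda acc, sq: acc + sq, map(lambda n: n * n, numbers), 0)


# Object-oriented style: wrap the data and the behaviour in a class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


total_oo = SquareSummer(numbers).total()

# Dynamic typing: the same name can be rebound to objects of different types.
value = 42
value = "forty-two"

print(total, total_functional, total_oo, type(value).__name__)
```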

Interactive Mode Programming

Invoking the interpreter without passing a script file as a parameter brings up the
following prompt −

    $ python
    Python 2.4.3 (#1, Nov 11 2010, 13:34:43)
    [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

Type the following text at the Python prompt and press Enter −

    >>> print("Hello, Python!")

In Python 3, print is a function, so the parentheses are required; the older statement
form print "Hello, Python!" is accepted only by Python 2 releases such as the version
2.4.3 shown in the banner above. Either way, this produces the following result −

    Hello, Python!

Script Mode Programming

Invoking the interpreter with a script parameter begins execution of the script
and continues until the script is finished. When the script is finished, the interpreter is no
longer active.

Let us write a simple Python program in a script. Python files have the extension .py.
Type the following source code in a test.py file −

    print("Hello, Python!")

We assume that you have the Python interpreter set in the PATH variable. Now, try to run
this program as follows −

    $ python test.py

This produces the following result −

    Hello, Python!

Let us try another way to execute a Python script. Here is the modified test.py file −

    #!/usr/bin/python

    print("Hello, Python!")

We assume that you have the Python interpreter available in the /usr/bin directory. Now,
try to run this program as follows −

    $ chmod +x test.py     # This is to make file executable
    $ ./test.py

This produces the following result −

    Hello, Python!

Python Identifiers

A Python identifier is a name used to identify a variable, function, class, module


or other object. An identifier starts with a letter A to Z or a to z or an underscore (_)
followed by zero or more letters, underscores and digits (0 to 9).

Python does not allow punctuation characters such as @, $, and % within identifiers.
Python is a case sensitive programming language. Thus, Manpower and manpower are
two different identifiers in Python.

Here are naming conventions for Python identifiers −

• Class names start with an uppercase letter. All other identifiers start with a
lowercase letter.
• Starting an identifier with a single leading underscore indicates that the identifier
is private.
• Starting an identifier with two leading underscores indicates a strongly private
identifier.
• If the identifier also ends with two trailing underscores, the identifier is a
language-defined special name.
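
The identifier rules and underscore conventions above can be checked directly in Python 3. The sketch below uses the built-in str.isidentifier() method; the candidate names and the Account class are hypothetical and exist only for illustration.

```python
# Check candidate names against Python's identifier rules.
candidates = ["Manpower", "manpower", "_private", "2fast", "cost$", "total_1"]

for name in candidates:
    # str.isidentifier() applies the lexical rule for identifiers.
    print(f"{name!r:12} valid identifier: {name.isidentifier()}")

# Identifiers are case sensitive: these are two distinct variables.
Manpower = 10
manpower = 20
print(Manpower != manpower)


class Account:
    """Hypothetical class used only to illustrate the underscore conventions."""

    def __init__(self):
        self._balance = 0      # single underscore: private by convention only
        self.__pin = 1234      # double underscore: name-mangled to _Account__pin


acct = Account()
print(hasattr(acct, "_Account__pin"))
```

Note that isidentifier() checks only the shape of a name; it also returns True for reserved words, which still cannot be used as variable names.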

Reserved Words

The following list shows the Python keywords. These are reserved words and you cannot
use them as constant or variable or any other identifier names. The keywords listed
here contain lowercase letters only. The list reflects Python 2; in Python 3, exec and
print are no longer keywords, while False, None, True and nonlocal, among others, have
been added.

and        exec       not
assert     finally    or
break      for        pass
class      from       print
continue   global     raise
def        if         return
del        import     try
elif       in         while
else       is         with
except     lambda     yield
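
Because the set of reserved words differs between Python versions, it can be safer to ask the interpreter itself. A minimal sketch using the standard keyword module:

```python
import keyword

# keyword.kwlist holds the reserved words of the interpreter running this code.
print(len(keyword.kwlist), "keywords:", ", ".join(keyword.kwlist))

# Words that were reserved in Python 2 but are ordinary names in Python 3.
for word in ("print", "exec"):
    print(word, "is keyword:", keyword.iskeyword(word))

# A reserved word looks like an identifier but cannot be assigned to.
try:
    compile("class = 1", "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```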

Lines and Indentation

Python provides no braces to indicate blocks of code for class and function
definitions or flow control. Blocks of code are denoted by line indentation, which is
rigidly enforced.

The number of spaces in the indentation is variable, but all statements within the block
must be indented the same amount. For example −

    if True:
        print("True")
    else:
        print("False")

However, the following block generates an error, because the last line is not indented
by the same amount as the statement before it −

    if True:
        print("Answer")
        print("True")
    else:
        print("Answer")
      print("False")

Thus, in Python all the contiguous lines indented with the same number of spaces form
a block. The following example has various statement blocks −

Note − Do not try to understand the logic at this point of time. Just make sure you
understand the various blocks even though they are without braces.

    #!/usr/bin/python
    import sys

    # Sentinel that ends input; the value is an assumption, as the original
    # listing leaves file_finish undefined.
    file_finish = "end"

    file_name = input("Enter filename: ")
    if len(file_name) == 0:
        print("Next time please enter something")
        sys.exit()

    try:
        # open file stream for writing
        file = open(file_name, "w")
    except IOError:
        print("There was an error writing to", file_name)
        sys.exit()

    print("Enter '" + file_finish + "' when finished")
    while True:
        file_text = input("Enter text: ")
        if file_text == file_finish:
            break
        file.write(file_text)
        file.write("\n")
    # close the file once input is finished
    file.close()

    try:
        # reopen the same file for reading
        file = open(file_name, "r")
    except IOError:
        print("There was an error reading file")
        sys.exit()
    file_text = file.read()
    file.close()
    print(file_text)

Multi-Line Statements

Statements in Python typically end with a new line. Python does, however, allow
the use of the line continuation character (\) to denote that the line should continue. For
example −

    total = item_one + \
            item_two + \
            item_three

Statements contained within the [], {}, or () brackets do not need to use the line
continuation character. For example −

    days = ['Monday', 'Tuesday', 'Wednesday',
            'Thursday', 'Friday']

Quotation in Python

Python accepts single ('), double (") and triple (''' or """) quotes to denote string literals,
as long as the same type of quote starts and ends the string.

The triple quotes are used to span the string across multiple lines. For example, all the
following are legal −

    word = 'word'
    sentence = "This is a sentence."
    paragraph = """This is a paragraph. It is
    made up of multiple lines and sentences."""

Comments in Python

A hash sign (#) that is not inside a string literal begins a comment. All characters
after the # and up to the end of the physical line are part of the comment and the Python
interpreter ignores them.

    #!/usr/bin/python

    # First comment
    print("Hello, Python!")  # second comment

This produces the following result −

    Hello, Python!

You can type a comment on the same line after a statement or expression −

    name = "Madisetti"  # This is again comment

You can comment multiple lines as follows −

    # This is a comment.
    # This is a comment, too.
    # This is a comment, too.
    # I said that already.

A triple-quoted string that is not assigned to anything is evaluated and then discarded
by the Python interpreter, so it can also be used as a multiline comment:

    '''
    This is a multiline
    comment.
    '''

Using Blank Lines

A line containing only whitespace, possibly with a comment, is known as a blank line and
Python totally ignores it.

In an interactive interpreter session, you must enter an empty physical line to terminate a
multiline statement.

Waiting for the User

The following line of the program displays the prompt, the statement saying “Press the
enter key to exit”, and waits for the user to take action −

    #!/usr/bin/python

    input("\n\nPress the enter key to exit.")

Here, "\n\n" is used to create two new lines before displaying the actual line. Once the
user presses the Enter key, the program ends. This is a nice trick to keep a console
window open until the user is done with an application.

Multiple Statements on a Single Line

The semicolon ( ; ) allows multiple statements on a single line, provided that no
statement starts a new code block. Here is a sample snippet using the semicolon.

    import sys; x = 'foo'; sys.stdout.write(x + '\n')

Multiple Statement Groups as Suites

A group of individual statements that make up a single code block is called a suite
in Python. Compound or complex statements, such as if, while, def, and class require a
header line and a suite.

Header lines begin the statement (with the keyword) and terminate with a colon ( : ) and
are followed by one or more lines which make up the suite. For example −

    if expression :
        suite
    elif expression :
        suite
    else :
        suite

Command Line Arguments

Many programs can be run to provide you with some basic information about how
they should be run. Python enables you to do this with -h −

    $ python -h
    usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
    Options and arguments (and corresponding environment variables):
    -c cmd : program passed in as string (terminates option list)
    -d     : debug output from parser (also PYTHONDEBUG=x)
    -E     : ignore environment variables (such as PYTHONPATH)
    -h     : print this help message and exit

You can also program your script in such a way that it should accept various options.
Command Line Arguments is an advanced topic and should be studied a bit later once
you have gone through rest of the Python concepts.
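
As a preview of how a script can accept its own options, here is a minimal sketch built on the standard argparse module. The option names --name and --shout and the helper functions are hypothetical, chosen only for this illustration.

```python
import argparse


def build_parser():
    """Build a parser with two illustrative options (names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Greet a user from the command line.")
    parser.add_argument("--name", default="Python", help="name to greet")
    parser.add_argument("--shout", action="store_true", help="print in upper case")
    return parser


def greet(argv):
    """Parse the given argument list and return the greeting text."""
    args = build_parser().parse_args(argv)
    message = f"Hello, {args.name}!"
    return message.upper() if args.shout else message


# The argument lists below stand in for what the shell would pass to the script.
print(greet(["--name", "Zara"]))
print(greet(["--shout"]))
```

In a real script the list would usually come from sys.argv[1:], and argparse would also generate a -h help message for the script automatically.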

Python Lists

The list is the most versatile datatype available in Python. It can be written as
a list of comma-separated values (items) between square brackets. An important point
about a list is that its items need not be of the same type.

Creating a list is as simple as putting different comma-separated values between square
brackets. For example −

    list1 = ['physics', 'chemistry', 1997, 2000]
    list2 = [1, 2, 3, 4, 5]
    list3 = ["a", "b", "c", "d"]

Similar to string indices, list indices start at 0, and lists can be sliced, concatenated and
so on.
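
A short sketch of the list operations just mentioned (indexing, slicing, concatenation), together with in-place modification; the variable names are illustrative.

```python
list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]

first = list1[0]            # indexing starts at 0
middle = list2[1:4]         # slice: items at positions 1, 2 and 3
combined = list1 + list2    # concatenation builds a new list

list1[2] = 2001             # lists are mutable: replace an item in place
list2.append(6)             # grow the list at the end

print(first)
print(middle)
print(combined)
print(list1)
print(list2)
```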

Python Tuples

A tuple is an immutable sequence of Python objects. Tuples are sequences, just like lists.
The differences between tuples and lists are that tuples cannot be changed, unlike lists,
and that tuples use parentheses, whereas lists use square brackets.

Creating a tuple is as simple as putting different comma-separated values. Optionally you
can put these comma-separated values between parentheses also. For example −

    tup1 = ('physics', 'chemistry', 1997, 2000)
    tup2 = (1, 2, 3, 4, 5)
    tup3 = "a", "b", "c", "d"

The empty tuple is written as two parentheses containing nothing −

    tup1 = ()

To write a tuple containing a single value you have to include a comma, even though there
is only one value −

    tup1 = (50,)

Like string indices, tuple indices start at 0, and they can be sliced, concatenated, and so
on.

Accessing Values in Tuples

To access values in a tuple, use square brackets with an index, or a slice of indices,
to obtain the value or values available at that position. For example −

    #!/usr/bin/python

    tup1 = ('physics', 'chemistry', 1997, 2000)
    tup2 = (1, 2, 3, 4, 5, 6, 7)
    print("tup1[0]:", tup1[0])
    print("tup2[1:5]:", tup2[1:5])

When the above code is executed, it produces the following result −

    tup1[0]: physics
    tup2[1:5]: (2, 3, 4, 5)


Accessing Values in Dictionary

To access dictionary elements, you can use the familiar square brackets along with the
key to obtain its value. Following is a simple example −

    #!/usr/bin/python

    # Named student rather than dict, so that the built-in dict type is not shadowed.
    student = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
    print("student['Name']:", student['Name'])
    print("student['Age']:", student['Age'])

When the above code is executed, it produces the following result −

    student['Name']: Zara
    student['Age']: 7

If we attempt to access a data item with a key which is not part of the dictionary, we get
an error as follows −

    #!/usr/bin/python

    student = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
    print("student['Alice']:", student['Alice'])

When the above code is executed, the lookup fails before anything is printed, and the
traceback ends with the following line –

    KeyError: 'Alice'
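
When a key may legitimately be absent, the KeyError can be avoided or handled. The sketch below shows three common options using only built-in dictionary features; the dictionary contents are illustrative.

```python
student = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}

# Membership test: check for the key before indexing.
if 'Alice' in student:
    print(student['Alice'])
else:
    print("no entry for 'Alice'")

# dict.get returns a default (None unless one is given) instead of raising KeyError.
school = student.get('School', 'unknown')
print(school)

# Catching the exception is an option when a missing key is exceptional.
try:
    age = student['Age']
    missing = student['Alice']
except KeyError as err:
    print("missing key:", err)
```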

Updating Dictionary

You can update a dictionary by adding a new entry or a key-value pair, modifying an
existing entry, or deleting an existing entry as shown below in the simple example −

    #!/usr/bin/python

    student = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
    student['Age'] = 8                  # update existing entry
    student['School'] = "DPS School"    # add new entry
    print("student['Age']:", student['Age'])
    print("student['School']:", student['School'])

When the above code is executed, it produces the following result −

    student['Age']: 8
    student['School']: DPS School

Delete Dictionary Elements

You can either remove individual dictionary elements or clear the entire contents of a
dictionary. You can also delete entire dictionary in a single operation.

To explicitly remove an entire dictionary, just use the del statement. Following is a simple
example −

Live Demo

#!/usr/bin/python

dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}

19
del dict['Name']; # remove entry with key 'Name'

dict.clear(); # remove all entries in dict

del dict ; # delete entire dictionary

print "dict['Age']: ", dict['Age']

print "dict['School']: ", dict['School']

This produces the following result. Note that an exception is raised because, after del dict,
the name dict no longer refers to our dictionary but to Python's built-in dict type −

dict['Age']:

Traceback (most recent call last):

File "test.py", line 8, in <module>

print "dict['Age']: ", dict['Age'];

TypeError: 'type' object is unsubscriptable

Note − del() method is discussed in subsequent section.

Properties of Dictionary Keys

Dictionary values have no restrictions. They can be any arbitrary Python object, either
standard objects or user-defined objects. However, the same is not true for the keys.

There are two important points to remember about dictionary keys −

(a) More than one entry per key is not allowed, which means no duplicate keys are allowed.
When duplicate keys are encountered during assignment, the last assignment wins. For
example −


#!/usr/bin/python

dict = {'Name': 'Zara', 'Age': 7, 'Name': 'Manni'}

print "dict['Name']: ", dict['Name']

When the above code is executed, it produces the following result −

dict['Name']: Manni

(b) Keys must be immutable, which means you can use strings, numbers, or tuples as
dictionary keys, but something like ['key'] is not allowed. Following is a simple example −


#!/usr/bin/python

dict = {['Name']: 'Zara', 'Age': 7}

print "dict['Name']: ", dict['Name']

When the above code is executed, it produces the following result −

Traceback (most recent call last):

File "test.py", line 3, in <module>

dict = {['Name']: 'Zara', 'Age': 7};

TypeError: unhashable type: 'list'
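When a key naturally has several parts, a tuple can be used where a list cannot. The following is a minimal sketch in Python 3 syntax; the names scores and key_as_list and the sample values are illustrative assumptions.

```python
# Keys must be hashable, so a tuple works where a list does not.
# The names and values below are illustrative assumptions.
scores = {('Zara', 'Maths'): 91, ('Zara', 'Physics'): 84}

# Look up a value with a tuple key.
maths = scores[('Zara', 'Maths')]

# A list key raises TypeError; converting it to a tuple fixes that.
key_as_list = ['Zara', 'Physics']
try:
    scores[key_as_list]
    list_key_failed = False
except TypeError:
    list_key_failed = True

physics = scores[tuple(key_as_list)]
print(maths, physics, list_key_failed)
```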

Tuples are immutable which means you cannot update or change the values of tuple
elements. You are able to take portions of existing tuples to create new tuples as the
following example demonstrates −


#!/usr/bin/python

tup1 = (12, 34.56);

tup2 = ('abc', 'xyz');

# Following action is not valid for tuples

# tup1[0] = 100;

# So let's create a new tuple as follows

tup3 = tup1 + tup2;

print tup3;

When the above code is executed, it produces the following result −

(12, 34.56, 'abc', 'xyz')

Delete Tuple Elements

Removing individual tuple elements is not possible. There is, of course, nothing wrong
with putting together another tuple with the undesired elements discarded.

To explicitly remove an entire tuple, just use the del statement. For example −


#!/usr/bin/python

tup = ('physics', 'chemistry', 1997, 2000);

print tup;

del tup;

print "After deleting tup : ";

print tup;

This produces the following result. Note that an exception is raised because, after del tup,
the tuple does not exist any more −

('physics', 'chemistry', 1997, 2000)

After deleting tup :

Traceback (most recent call last):

File "test.py", line 9, in <module>

print tup;

NameError: name 'tup' is not defined
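Although individual elements cannot be removed, a new tuple without the undesired elements can be put together as described above. The following is a minimal sketch in Python 3 syntax; the names without_chemistry and only_strings are illustrative assumptions.

```python
# Tuples cannot be modified in place, so "removing" an element means
# building a new tuple. Names here are illustrative.
tup = ('physics', 'chemistry', 1997, 2000)

# Drop the element at index 1 by concatenating the slices around it.
without_chemistry = tup[:1] + tup[2:]

# Keep only the string elements by filtering into a new tuple.
only_strings = tuple(item for item in tup if isinstance(item, str))

print(without_chemistry)
print(only_strings)
```

The original tuple is left unchanged in both cases.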

2.7.2 DJANGO
Django is a high-level Python Web framework that encourages rapid
development and clean, pragmatic design. Built by experienced developers, it takes care
of much of the hassle of Web development, so you can focus on writing your app without
needing to reinvent the wheel. It’s free and open source.

Django's primary goal is to ease the creation of complex, database-driven websites.
Django emphasizes reusability and "pluggability" of components, rapid development, and
the principle of don't repeat yourself. Python is used throughout, even for settings files
and data models.

Fig 2.1 Django’s Architecture

Django also provides an optional administrative create, read, update and delete interface
that is generated dynamically through introspection and configured via admin models.

Fig 2.2 Model View Template Architecture

Create a Project
Whether you are on Windows or Linux, just get a terminal or a cmd prompt and
navigate to the place you want your project to be created, then use this code −

$ django-admin startproject myproject

This will create a "myproject" folder with the following structure −

myproject/
   manage.py
   myproject/
      __init__.py
      settings.py
      urls.py
      wsgi.py

The Project Structure

The “myproject” folder is just your project container, it actually contains two elements −

manage.py − This file is a kind of project-local django-admin for interacting with your
project via the command line (start the development server, sync db...). To get a full list of
commands accessible via manage.py, you can use the code −

$ python manage.py help

The “myproject” subfolder − This folder is the actual python package of your project. It
contains four files −

__init__.py − Tells Python to treat this folder as a package.

settings.py − As the name indicates, your project settings.

urls.py − All links of your project and the function to call. A kind of ToC of your project.

wsgi.py − If you need to deploy your project over WSGI.

Setting Up Your Project

Your project is configured in myproject/settings.py. Following are some important
options you might need to set −

DEBUG = True

This option lets you set whether your project is in debug mode or not. Debug mode gives you
more information about your project's errors. Never set it to ‘True’ for a live project.
However, this has to be set to ‘True’ if you want the Django development server to serve
static files. Do it only in development mode.
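One way to keep debug mode off on a live server is to read the flag from the environment, so that it is disabled unless explicitly requested. The following is a hedged sketch in plain Python 3; the environment variable name DJANGO_DEBUG and the helper read_debug_flag are hypothetical and are not part of Django or of this project.

```python
import os

def read_debug_flag(environ=None):
    """Return True only when DJANGO_DEBUG holds a truthy string.

    DJANGO_DEBUG is a hypothetical variable name chosen for this sketch.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get('DJANGO_DEBUG', 'false')
    return raw.strip().lower() in ('1', 'true', 'yes')

# In settings.py this could be used as: DEBUG = read_debug_flag()
# A dictionary is passed here only to make the example self-contained.
DEBUG = read_debug_flag({'DJANGO_DEBUG': 'True'})
print(DEBUG)
```

Because the default is 'false', forgetting to set the variable on a live server leaves debug mode off.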

DATABASES = {
   'default': {
      'ENGINE': 'django.db.backends.sqlite3',
      'NAME': 'database.sql',
      'USER': '',
      'PASSWORD': '',
      'HOST': '',
      'PORT': '',
   }
}

The database is set in the ‘DATABASES’ dictionary. The example above is for the SQLite
engine. As stated earlier, Django also supports −

MySQL (django.db.backends.mysql)

PostgreSQL (django.db.backends.postgresql_psycopg2)

Oracle (django.db.backends.oracle)

MongoDB, a NoSQL database (django_mongodb_engine)

Before setting any new engine, make sure you have the correct db driver installed.

You can also set other options like: TIME_ZONE, LANGUAGE_CODE, TEMPLATE…

Now that your project is created and configured make sure it's working −

$ python manage.py runserver

You will get something like the following on running the above code −

Validating models...

0 errors found

September 03, 2015 - 11:41:50

Django version 1.6.11, using settings 'myproject.settings'

Starting development server at http://127.0.0.1:8000/

Quit the server with CONTROL-C.

A project is a sum of many applications. Every application has an objective and can be
reused into another project, like the contact form on a website can be an application, and
can be reused for others. See it as a module of your project.

Create an Application
We assume you are in your project folder. In our main “myproject” folder, the
same folder as manage.py, run −

$ python manage.py startapp myapp

You just created the myapp application and, as with the project, Django creates a “myapp”
folder with the application structure −

myapp/
   __init__.py
   admin.py
   models.py
   tests.py
   views.py

__init__.py − Just to make sure python handles this folder as a package.

admin.py − This file helps you make the app modifiable in the admin interface.

models.py − This is where all the application models are stored.

tests.py − This is where your unit tests are.

views.py − This is where your application views are.

Get the Project to Know About Your Application

At this stage we have our "myapp" application, now we need to register it with our Django
project "myproject". To do so, update INSTALLED_APPS tuple in the settings.py file of
your project (add your app name) −

INSTALLED_APPS = (
   'django.contrib.admin',
   'django.contrib.auth',
   'django.contrib.contenttypes',
   'django.contrib.sessions',
   'django.contrib.messages',
   'django.contrib.staticfiles',
   'myapp',
)

Creating forms in Django is really similar to creating a model. Here again, we just need
to inherit from a Django class, and the class attributes will be the form fields. Let's add a
forms.py file in the myapp folder to contain our app forms. We will create a login form.

myapp/forms.py

# -*- coding: utf-8 -*-
from django import forms

class LoginForm(forms.Form):
   # The field is named "username" to match the view and the template
   username = forms.CharField(max_length = 100)
   password = forms.CharField(widget = forms.PasswordInput())

As seen above, the field type can take a "widget" argument for HTML rendering; in our case,
we want the password to be hidden, not displayed. Many other widgets are present in
Django: DateInput for dates, CheckboxInput for checkboxes, etc.

Using Form in a View

There are two kinds of HTTP requests, GET and POST. In Django, the request object
passed as parameter to your view has an attribute called "method" where the type of the
request is set, and all data passed via POST can be accessed via the request.POST
dictionary.

Let's create a login view in our myapp/views.py −

# -*- coding: utf-8 -*-
from django.shortcuts import render
from myapp.forms import LoginForm

def login(request):
   username = "not logged in"

   if request.method == "POST":
      # Get the posted form
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
   else:
      # Not a POST request: create an empty form
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username" : username})

The view will display the result of the login form posted through the loggedin.html. To
test it, we will first need the login form template. Let's call it login.html.

<html>

<body>

<form name = "form" action = "{% url "myapp.views.login" %}"

method = "POST" >{% csrf_token %}

<div style = "max-width:470px;">

<center>

<input type = "text" style = "margin-left:20%;"

placeholder = "Identifiant" name = "username" />

</center>

</div>

<br>

<div style = "max-width:470px;">

<center>

<input type = "password" style = "margin-left:20%;"

placeholder = "password" name = "password" />

</center>

</div>

<br>

<div style = "max-width:470px;">

<center>

<button style = "border:0px; background-color:#4285F4; margin-top:8%;

height:35px; width:80%;margin-left:19%;" type = "submit"

value = "Login" >

<strong>Login</strong>

</button>

</center>

</div>

</form>

</body>

</html>

The template will display a login form and post the result to our login view above. You
have probably noticed the {% csrf_token %} tag in the template, which is there to prevent
Cross-site Request Forgery (CSRF) attacks on your site.
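The idea behind a CSRF token can be sketched in plain Python: the server issues an unpredictable token, embeds it in the form, and rejects any POST whose token does not match. This is a conceptual illustration only and not Django's actual implementation; the function names issue_token and is_valid_token are hypothetical.

```python
import hmac
import secrets

def issue_token():
    """Create an unpredictable token to embed in the rendered form."""
    return secrets.token_hex(32)

def is_valid_token(expected, submitted):
    """Compare the two tokens in constant time to avoid timing leaks."""
    return hmac.compare_digest(expected, submitted)

token = issue_token()
print(is_valid_token(token, token))     # genuine form submission
print(is_valid_token(token, 'forged'))  # request forged by another site
```

A page on another site cannot read the token, so a request it forges fails the comparison.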

Once we have the login template, we need the loggedin.html template that will be
rendered after form treatment.

<html>

<body>

You are : <strong>{{username}}</strong>

</body>

</html>

Now, we just need our pair of URLs to get started: myapp/urls.py

from django.conf.urls import patterns, url

from django.views.generic import TemplateView

urlpatterns = patterns('myapp.views',

url(r'^connection/',TemplateView.as_view(template_name = 'login.html')),

url(r'^login/', 'login', name = 'login'))

When accessing "/myapp/connection", we will get the following login.html template rendered −

Setting Up Sessions

In Django, enabling session is done in your project settings.py, by adding some lines to
the MIDDLEWARE_CLASSES and the INSTALLED_APPS options. This should be
done while creating the project, but it's always good to know, so
MIDDLEWARE_CLASSES should have −

'django.contrib.sessions.middleware.SessionMiddleware'

And INSTALLED_APPS should have −

'django.contrib.sessions'

By default, Django saves session information in the database (django_session table or
collection), but you can configure the engine to store the information in other ways, such as
in a file or in a cache.

When session is enabled, every request (first argument of any view in Django) has a
session (dict) attribute.
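The principle of a server-side session can be sketched without Django: the data stays on the server, and the browser only holds a random session id. This is a simplified illustration, not Django's session engine; SESSION_STORE, create_session and load_session are hypothetical names, and a real store would be the database, a file, or a cache rather than a dictionary in memory.

```python
import secrets

# In-memory stand-in for the django_session table (illustration only).
SESSION_STORE = {}

def create_session(data):
    """Store data on the server and return the id to send as a cookie."""
    session_id = secrets.token_hex(16)
    SESSION_STORE[session_id] = dict(data)
    return session_id

def load_session(session_id):
    """Return the stored data, or an empty dict for an unknown id."""
    return SESSION_STORE.get(session_id, {})

sid = create_session({'username': 'zara'})
print(load_session(sid).get('username'))
print(load_session('unknown-id'))
```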

Let's create a simple sample to see how to create and save sessions. We have built a simple
login system before (see Django form processing chapter and Django Cookies Handling
chapter). Let us save the username in a cookie so, if not signed out, when accessing our
login page you won’t see the login form. Basically, let's make our login system we used
in Django Cookies handling more secure, by saving cookies server side.

For this, first let's change our login view to save the username in the server-side session −

def login(request):
   username = 'not logged in'

   if request.method == 'POST':
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
         # Keep the username in the server-side session
         request.session['username'] = username
   else:
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username" : username})

Then let us create formView view for the login form, where we won’t display the form if
cookie is set −

def formView(request):
   # Skip the login form when the session already holds a username
   if 'username' in request.session:
      username = request.session['username']
      return render(request, 'loggedin.html', {"username" : username})
   else:
      return render(request, 'login.html', {})

Now let us change the urls.py file to change the url so it pairs with our new view −

from django.conf.urls import patterns, url

from django.views.generic import TemplateView

urlpatterns = patterns('myapp.views',

url(r'^connection/','formView', name = 'loginform'),

url(r'^login/', 'login', name = 'login'))

When accessing /myapp/connection, you will get to see the following page

CHAPTER 3

SYSTEM DESIGN

3.1 DATA FLOW DIAGRAM:


1. The DFD is also called a bubble chart. It is a simple graphical formalism that can
be used to represent a system in terms of the input data to the system, the various processing
carried out on this data, and the output data generated by the system.

2. The data flow diagram (DFD) is one of the most important modeling tools. It is
used to model the system components. These components are the system process, the data
used by the process, an external entity that interacts with the system and the information
flows in the system.

3. DFD shows how the information moves through the system and how it is modified
by a series of transformations. It is a graphical technique that depicts information flow
and the transformations that are applied as data moves from input to output.

4. A DFD may be used to represent a system at any level of abstraction. A DFD may be
partitioned into levels that represent increasing information flow and functional detail.

3.3 UML DIAGRAMS


UML stands for Unified Modeling Language. UML is a standardized general-
purpose modeling language in the field of object-oriented software engineering. The
standard is managed, and was created by, the Object Management Group.

The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a
meta-model and a notation. In the future, some form of method or process may also be
added to, or associated with, UML.

The Unified Modeling Language is a standard language for specifying,
visualizing, constructing, and documenting the artifacts of software systems, as well as
for business modeling and other non-software systems.

The UML represents a collection of best engineering practices that have proven successful
in the modeling of large and complex systems.

The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the
design of software projects.

GOALS:

The Primary goals in the design of the UML are as follows:

1. Provide users with a ready-to-use, expressive visual modeling language so that they
can develop and exchange meaningful models.

2. Provide extensibility and specialization mechanisms to extend the core concepts.

3. Be independent of particular programming languages and development processes.

4. Provide a formal basis for understanding the modeling language.

5. Encourage the growth of the OO tools market.

6. Support higher-level development concepts such as collaborations, frameworks,
patterns and components.

7. Integrate best practices.

3.3.1 USE CASE DIAGRAM:

A use case diagram in the Unified Modeling Language (UML) is a type of
behavioral diagram defined by and created from a use-case analysis. Its purpose is to
present a graphical overview of the functionality provided by a system in terms of actors,
their goals (represented as use cases), and any dependencies between those use cases. The
main purpose of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.

Fig 3.1 Use Case Diagram

3.3.2 CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language
(UML) is a type of static structure diagram that describes the structure of a system by
showing the system's classes, their attributes, operations (or methods), and the
relationships among the classes. It explains which class contains information.

Fig 3.2 Class Diagram

3.3.3 SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in what order.
It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called
event diagrams, event scenarios, and timing diagrams.

Fig 3.3 Sequence Diagram

3.3.4 ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the Unified
Modeling Language, activity diagrams can be used to describe the business and
operational step-by-step workflows of components in a system. An activity diagram
shows the overall flow of control.

Fig 3.4 Activity Diagram

CHAPTER 4

IMPLEMENTATION

4.1 MODULES:

• User

• Admin

• Machine Learning

MODULES DESCRIPTION:

4.1.1 User:
The user registers first. Registration requires a valid email address and mobile
number for further communication. Once the user has registered, the admin activates the
account, and only then can the user log in to the system. The user can upload a dataset
whose columns match those of our dataset. For the algorithms to run, the data must be
in float format. Here we used the Employment Scam Aegean Dataset (EMSCAD), which contains
18000 samples. The user can also add new data to the existing dataset through our
Django application. When the user clicks Classification on the web page, the accuracy,
macro average, and weighted average are calculated for each algorithm. The user can
view both the ML results and the prediction results.
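The two upload requirements mentioned above (matching columns and float-format data) could be checked along the following lines. This is a minimal sketch under stated assumptions, not the project's actual upload code: the column names in EXPECTED_COLUMNS and the function load_numeric_rows are hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = ['feature_a', 'feature_b', 'label']  # hypothetical names

def load_numeric_rows(csv_text, expected=EXPECTED_COLUMNS):
    """Parse CSV text, check the header, and convert every cell to float."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != expected:
        raise ValueError('columns do not match: %r' % (reader.fieldnames,))
    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, record in enumerate(reader, start=2):
        try:
            rows.append([float(record[name]) for name in expected])
        except (TypeError, ValueError):
            raise ValueError('non-numeric value on line %d' % line_no)
    return rows

sample = "feature_a,feature_b,label\n1,2.5,0\n3,4,1\n"
print(load_numeric_rows(sample))
```

Rejecting the file before any algorithm runs gives the user an immediate, specific message instead of a failure during classification.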

4.1.2 Admin:
The admin logs in with his login details. The admin activates the registered users;
only after activation can a user log in to the system. The admin can view the overall
data in the browser. When the admin clicks Results on the web page, the accuracy,
macro average, and weighted average calculated for each algorithm are displayed. Once
all algorithms have finished executing, the admin can see the overall accuracy on the
web page, along with the classification results.
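To make the reported figures concrete, the sketch below computes accuracy together with the macro average and weighted average of per-class precision in plain Python 3. It is an illustration under assumptions: the function name precision_summary and the toy labels are hypothetical, and the project itself may obtain these figures from a library report rather than from code like this.

```python
from collections import Counter

def precision_summary(y_true, y_pred):
    """Return accuracy plus macro- and weighted-average precision."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    # Per-class precision: correct predictions / all predictions of the class.
    precision = {
        label: (correct[label] / predicted[label]) if predicted[label] else 0.0
        for label in labels
    }
    accuracy = sum(correct.values()) / len(y_true)
    # Macro average treats every class equally.
    macro = sum(precision.values()) / len(labels)
    # Weighted average weights each class by its number of true samples.
    weighted = sum(precision[l] * support[l] for l in labels) / len(y_true)
    return accuracy, macro, weighted

y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
print(precision_summary(y_true, y_pred))
```

On imbalanced data the macro and weighted averages can differ noticeably, which is why both are reported alongside accuracy.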

4.1.3 Machine learning:


This project proposes to use different data mining techniques and classification
algorithms, namely KNN, decision tree, support vector machine, naïve Bayes classifier,
random forest classifier, multilayer perceptron, and deep neural network, to predict whether
a job post is real or fraudulent. The accuracy, macro average, and weighted average of each
classifier were calculated and displayed in the results. The classifier that achieves the
highest accuracy is taken to be the best classifier.
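As an illustration of the simplest of these classifiers, the following is a minimal K-Nearest Neighbours sketch in plain Python (math.dist requires Python 3.8 or later). It is not the project's implementation; the function knn_predict, the toy two-feature points, and the choice k=3 are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Pair every training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy data: two numeric features per sample (hypothetical values).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ['real', 'real', 'real',
          'fraudulent', 'fraudulent', 'fraudulent']

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

An odd value of k avoids tied votes when there are two classes.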

4.2 INPUT AND OUTPUT DESIGN

4.2.1 INPUT DESIGN


The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation, that is, the
steps necessary to put transaction data into a usable form for processing. This can be
achieved by instructing the computer to read data from a written or printed document, or
by having people key the data directly into the system. The design of input
focuses on controlling the amount of input required, controlling errors, avoiding delay,
avoiding extra steps, and keeping the process simple. The input is designed to provide
security and ease of use while retaining privacy. Input design
considers the following things:

➢ What data should be given as input?
➢ How should the data be arranged or coded?
➢ The dialog to guide the operating personnel in providing input.
➢ Methods for preparing input validations and the steps to follow when errors
occur.

OBJECTIVES

1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process
and to show management the correct direction for getting correct information from
the computerized system.

2. It is achieved by creating user-friendly data entry screens that can handle large
volumes of data. The goal of designing input is to make data entry easier and free
from errors. The data entry screen is designed so that all data manipulations
can be performed. It also provides record viewing facilities.

3. When data is entered, it is checked for validity. Data can be entered with the help
of screens. Appropriate messages are provided as and when needed so that the user is
never left confused. Thus the objective of input design is to create an input layout that
is easy to follow.
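The validity check and user-facing messages described in these objectives could look like the following for the registration screen. This is a hedged sketch only: the function validate_registration, the simplified email pattern, and the assumption of 10-digit mobile numbers are illustrative and are not taken from the project.

```python
import re

# Deliberately simple patterns for illustration; both are assumptions.
EMAIL_PATTERN = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')
MOBILE_PATTERN = re.compile(r'^\d{10}$')  # assumes 10-digit numbers

def validate_registration(email, mobile):
    """Return a list of user-facing error messages (empty when valid)."""
    errors = []
    if not EMAIL_PATTERN.match(email):
        errors.append('Enter a valid email address.')
    if not MOBILE_PATTERN.match(mobile):
        errors.append('Enter a 10-digit mobile number.')
    return errors

print(validate_registration('user@example.com', '9876543210'))
print(validate_registration('not-an-email', '12345'))
```

Returning all messages at once lets the screen show every problem in a single pass instead of one error per submission.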

4.2.2 OUTPUT DESIGN

A quality output is one that meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are communicated
to the users and to other systems through outputs. Output design determines how the
information is to be displayed for immediate need, as well as the hard copy output. It is the
most important and direct source of information for the user. Efficient and intelligent output
design improves the system's relationship with the user and helps decision-making.

1. Designing computer output should proceed in an organized, well thought out manner;
the right output must be developed while ensuring that each output element is designed
so that people find the system easy and effective to use. When analysts design
computer output, they should identify the specific output that is needed to meet the
requirements.

2. Select methods for presenting information.

3. Create documents, reports, or other formats that contain information produced by the
system.

The output form of an information system should accomplish one or more of the
following objectives.

• Convey information about past activities, current status, or projections of the
future.

• Signal important events, opportunities, problems, or warnings.

• Trigger an action.

• Confirm an action.

CHAPTER 5

RESULTS

5.1 Home page

5.2 Registration form

5.3 Admin Login page

5.4 Admin home page

5.5 User View page

5.6 User login page

5.7 User home page

5.8 Dataset view page

5.9 K-Nearest Neighbours Confusion Matrix

5.10 Confusion Matrix (Decision Tree)

5.11 Entropy Confusion Matrix

5.12 ML Score

5.13 Prediction page

CHAPTER 6

SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of
trying to discover every conceivable fault or weakness in a work product. It provides a
way to check the functionality of components, subassemblies, assemblies, and/or a
finished product. It is the process of exercising software with the intent of ensuring that
the software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a specific
testing requirement.

TYPES OF TESTS

6.1 Unit testing


Unit testing involves the design of test cases that validate that the
internal program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is the testing
of individual software units of the application. It is done after the completion of an
individual unit and before integration. This is structural testing that relies on knowledge
of the unit's construction and is invasive. Unit tests perform basic tests at the component
level and test a specific business process, application, and/or system configuration. Unit
tests ensure that each unique path of a business process performs accurately to the
documented specifications and contains clearly defined inputs and expected results.

6.2 Integration testing


Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is more
concerned with the basic outcome of screens or fields. Integration tests demonstrate that
although the components were individually satisfactory, as shown by successful unit
testing, the combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination of
components.

6.3 Functional test
Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system documentation,
and user manuals.

Functional testing is centered on the following items:

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures : interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on
requirements, key functions, or special test cases. In addition, systematic coverage
pertaining to identified business process flows, data fields, predefined processes, and
successive processes must be considered for testing. Before functional testing is
complete, additional tests are identified and the effective value of current tests is
determined.

6.4 System Test


System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An example
of system testing is the configuration oriented system integration test. System testing is
based on process descriptions and flows, emphasizing pre-driven process links and
integration points.

6.5 White Box Testing


White Box Testing is testing in which the software tester has
knowledge of the inner workings, structure, and language of the software, or at least its
purpose. It is used to test areas that cannot be reached from a black box
level.

6.6 Black Box Testing


Black Box Testing is testing the software without any knowledge of the
inner workings, structure, or language of the module being tested. Black box tests, like most
other kinds of tests, must be written from a definitive source document, such as a
specification or requirements document.
It is testing in which the software under test is treated as a black box: you cannot “see”
into it. The test provides inputs and responds to outputs without considering how the
software works.

6.7 Unit Testing


Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit testing
to be conducted as two distinct phases.

Test strategy and approach

Field testing will be performed manually and functional tests will be
written in detail.

Test objectives

• All field entries must work properly.

• Pages must be activated from the identified link.

• The entry screen, messages and responses must not be delayed.

Features to be tested

• Verify that the entries are of the correct format

• No duplicate entries should be allowed

• All links should take the user to the correct page.
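Two of the features listed above (correct entry format and no duplicate entries) can be expressed as automated unit tests. The sketch below uses Python's standard unittest module; the function add_entry and its rules are hypothetical stand-ins for the project's own entry handling, which is not shown in this document.

```python
import unittest

def add_entry(entries, value):
    """Add a non-empty, non-duplicate entry; raise ValueError otherwise."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError('entry must not be empty')
    if cleaned in entries:
        raise ValueError('duplicate entry')
    entries.append(cleaned)
    return entries

class AddEntryTests(unittest.TestCase):
    def test_valid_entry_is_stored(self):
        self.assertEqual(add_entry([], ' alice '), ['alice'])

    def test_duplicate_entry_is_rejected(self):
        with self.assertRaises(ValueError):
            add_entry(['alice'], 'alice')

    def test_empty_entry_is_rejected(self):
        with self.assertRaises(ValueError):
            add_entry([], '   ')

# Run the three test cases and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddEntryTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method checks one rule, so a failure points directly at the feature that broke.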

6.8 Integration Testing
Software integration testing is the incremental integration testing of two or
more integrated software components on a single platform to produce failures caused by
interface defects.

The task of the integration test is to check that components or software applications (for
example, components in a software system or, one step up, software applications at the
company level) interact without error.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

6.9 Acceptance Testing


User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

CHAPTER 7

CONCLUSION

7.1 CONCLUSION

The motive of this research was to design and develop a website for career
prediction that predicts fitting options for a candidate choosing a suitable field. The
options predicted by the proposed system are more valid and precise than those of present
career guidance systems in the field. We used the KNN algorithm to classify the skill sets
of the candidate and predict a suitable discipline from the answers to the MCQs
that the candidate filled in as feedback. The K-Means Clustering algorithm is used to
form clusters by dividing the student responses for a particular skill set and predicting
the rate of success for the respective fields in each cluster. For field-specific prediction,
the success rate in each cluster is determined; higher success rates and
lower failure rates were expected. In this project, various career prediction systems were
studied thoroughly in order to build a web-based application with the expected results. More
research is needed to better understand the framework's accuracy rate, to introduce
additional features, and to remove outliers in the framework.

7.2 FUTURE ENHANCEMENT

More research is needed to better understand the framework's accuracy rate, to introduce
additional features, and to remove outliers in the framework.

CHAPTER 8

BIBLIOGRAPHY
[1] Viktória Kulcsár, Anca Dobrean, Itamar Gati. "Challenges and difficulties in career
decision making: Their causes, and their effects on the process and the decision", Journal
of Vocational Behavior, 2020.

[2] K. O. Akyina, G. Oduro-Okyireh, and B. Osei-Owusu, “Assessment of the Rationality
of Senior High School students’ Choices of Academic Programmes in Kwabre East
District of Ghana,” Journal of Edu. and Practice. 2014, vol. 28 (5), pp. 15 – 19.

[3] B. Redmond, S. Quin, C. Devitt, and J. Archbold, “A Qualitative Investigation into
the Reasons Why Students Exit From the First Year of Their Programme and UCD,”
University College Dublin, School of Applied Social Science, October, 2011.

[4] V. I. Igbinedion, “Perception of Factors that Influence Students’ Vocational Choice of
Secretarial Studies in Tertiary Institutions in Edo State of Nigeria,” Eur. Journal of
Educational Studies, 2011, vol. 3 (2), pp. 325 – 337.

[5] C. Clutter, “The Effects of Parental Influence on Their Children’s Career Choices,”
Published Master of Science Thesis Submitted to the School of Family Studies and
Human Service, College of Human Ecology, Kansas State University, Manhattan, Kansas,
2010.

[6] K. A. Jungen, “Parental Influence and Career Choice: How Parents Affect the Career
Aspirations of Their Children,” Published M.Sc. Guidance and Counselling Project
Submitted to the Graduate School, University of Wisconsin-Stout, Menomonie, WI,
2008.

[7] M. Brownsom, “Parental Influence on Career Choice of Secondary School Children


in Ondo West Local Government Area of Ondo State,” Journal of Home Eco. Research,
2014, vol. 20 (1), pp. 89 - 99.

[8] I. H. Alika, “Parental and Peer Group Influence as Correlates of Career Choice in
Humanities among Secondary School Students in Edo State, Nigeria,” Journal of
Research in Edu. and Society, 2010, vol. 1 (1), pp. 178 - 185.

57
[9] B. O. Ehigbor, and T. N. Akinlosotu, “Parents’ Occupation as Correlate of Students’
Career Aspiration in Public Secondary Schools in Ekpoma Metropolis,” Int. Journal of
Arts and Humanities, 2016, vol. 5 (3), pp. 197 – 212.

[10] M. D. Eremie, “Comparative Analysis of Factors Influencing CareerChoices among


Senior Secondary School Students in Rivers State, Nigeria,” Arabian Journal of Bus. and
Mgt Review (OMAN Chapter), 2014, vol. 4 (4), pp. 20 – 25

[11] A. F. Egunjobi, T. M. Salisu, and O. I. Ogunkeye, “Academic profile and career


choice of fresh undergraduates of library and information science in a Nigerian University
of Education,” Annals of Library and Information Studies, 2013, vol. 60, pp. 296 – 303.

[12] I. A. Durosaro, and M. A. Nuhu, “An evaluation of the relevance of career choice to
school Subject selection among school going adolescents in Ondo state,” Asian Journal
of Mgt Sci. and Edu., 2012, vol. 1 (2), pp. 140 – 145.

[13] R. I. Sabir, W. Ahmad, R. U. Ashraf, and N. Ahmad, “Factors Affecting University


and Course Choice: A Comparison of Undergraduate Engineering and Business Students
in Central Punjab, Pakistan,” Journal of Basic and Appl. Scientific Research, 2013, vol.
3 (10), pp. 298 – 305

[14] T. Adanu, and J. Amekuedee, “Factors influencing the choice of librarianship as a


course of study at the Diploma level in Ghana,” Info. Devt., 2010, vol. 26 (4), pp. 314 –
319.

[15] W. Stanley, “The relationship between fear of Success, Self-Concept and Career
making,” Report-Research 9143: Speeches/meeting papers(150), U.S. Kentucky, 1996.

58
CHAPTER 9

APPENDICES

SAMPLE CODE

Admin views

from django.shortcuts import render
from django.contrib import messages
from users.models import UserRegistrationModel


def AdminLoginCheck(request):
    """Validate the admin login form and open the admin home page."""
    if request.method == 'POST':
        usrid = request.POST.get('loginid')
        pswd = request.POST.get('pswd')
        print("User ID is = ", usrid)
        # The admin credentials are hard-coded in this sample.
        if usrid == 'admin' and pswd == 'admin':
            return render(request, 'admins/AdminHome.html')
        messages.error(request, 'Please Check Your Login Details')
    return render(request, 'AdminLogin.html', {})


def AdminHome(request):
    return render(request, 'admins/AdminHome.html', {})


def RegisterUsersView(request):
    """List every registered user."""
    data = UserRegistrationModel.objects.all()
    return render(request, 'admins/viewregisterusers.html', {'data': data})


def ActivaUsers(request):
    """Activate the user whose id arrives in the 'uid' query parameter."""
    if request.method == 'GET':
        user_id = request.GET.get('uid')
        status = 'activated'
        print("PID = ", user_id, status)
        UserRegistrationModel.objects.filter(id=user_id).update(status=status)
    # Re-render the user list for every request method.
    data = UserRegistrationModel.objects.all()
    return render(request, 'admins/viewregisterusers.html', {'data': data})

Base.html :

{% load static %}
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta content="width=device-width, initial-scale=1.0" name="viewport">
    <title>Career Prediction</title>
    <meta content="" name="description">
    <meta content="" name="keywords">

    <!-- Google Fonts -->
    <link href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,700i|Jost:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,500i,600,600i,700,700i" rel="stylesheet">

    <!-- Vendor CSS Files -->
    <link href="{% static 'assets/vendor/aos/aos.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/bootstrap/css/bootstrap.min.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/bootstrap-icons/bootstrap-icons.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/boxicons/css/boxicons.min.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/glightbox/css/glightbox.min.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/remixicon/remixicon.css' %}" rel="stylesheet">
    <link href="{% static 'assets/vendor/swiper/swiper-bundle.min.css' %}" rel="stylesheet">

    <!-- Template Main CSS File -->
    <link href="{% static 'assets/css/style.css' %}" rel="stylesheet">
</head>

<body>
    <style>
        span {
            color: white;
        }
        header {
            background-color: green;
        }
        body {
            background-image: url({% static 'assets/img/girish.jpg' %});
        }
    </style>

    <!-- ======= Header ======= -->
    <header id="header" class="fixed-top ">
        <div class="container d-flex align-items-center">
            <h1 class="logo me-auto"><a href="index.html"><span>Career Prediction Using Machine Learning</span></a></h1>
            <!-- Uncomment below if you prefer to use an image logo -->
            <!-- <a href="index.html" class="logo me-auto"><img src="assets/img/logo.png" alt="" class="img-fluid"></a>-->

            <nav id="navbar" class="navbar">
                <ul>
                    <li><a class="nav-link scrollto active" href="{% url 'index' %}">Home</a></li>
                    <li><a class="nav-link scrollto" href="{% url 'AdminLogin' %}">Admin</a></li>
                    <li><a class="nav-link scrollto" href="{% url 'UserLogin' %}">User</a></li>
                    <li><a class="nav-link scrollto" href="{% url 'UserRegister' %}">Register</a></li>
                </ul>
                <i class="bi bi-list mobile-nav-toggle"></i>
            </nav><!-- .navbar -->
        </div>
    </header><!-- End Header -->

    {% block contents %}
    {% endblock %}

    <!-- ======= Footer ======= -->
    <footer id="footer">
    </footer><!-- End Footer -->

    <div id="preloader"></div>
    <a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i class="bi bi-arrow-up-short"></i></a>

    <!-- Vendor JS Files -->
    <script src="{% static 'assets/vendor/aos/aos.js' %}"></script>
    <script src="{% static 'assets/vendor/bootstrap/js/bootstrap.bundle.min.js' %}"></script>
    <script src="{% static 'assets/vendor/glightbox/js/glightbox.min.js' %}"></script>
    <script src="{% static 'assets/vendor/isotope-layout/isotope.pkgd.min.js' %}"></script>
    <script src="{% static 'assets/vendor/swiper/swiper-bundle.min.js' %}"></script>
    <script src="{% static 'assets/vendor/waypoints/noframework.waypoints.js' %}"></script>
    <script src="{% static 'assets/vendor/php-email-form/validate.js' %}"></script>

    <!-- Template Main JS File -->
    <script src="{% static 'assets/js/main.js' %}"></script>
</body>
</html>

User Views :

from django.conf import settings
from django.contrib import messages
from django.shortcuts import render

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from .forms import UserRegistrationForm
from .models import UserRegistrationModel


def UserRegisterActions(request):
    """Register a new user from the submitted registration form."""
    if request.method == 'POST':
        form = UserRegistrationForm(request.POST)
        if form.is_valid():
            print('Data is Valid')
            form.save()
            messages.success(request, 'You have been successfully registered')
            # Show a blank form again after a successful registration.
            form = UserRegistrationForm()
        else:
            messages.error(request, 'Email or mobile number already exists')
            print("Invalid form")
    else:
        form = UserRegistrationForm()
    return render(request, 'UserRegistrations.html', {'form': form})


def UserLoginCheck(request):
    """Log a user in if the credentials match an activated account."""
    if request.method == "POST":
        loginid = request.POST.get('loginid')
        pswd = request.POST.get('pswd')
        print("Login ID = ", loginid)  # the password is not written to the log
        try:
            # This sample stores and compares passwords in plain text.
            check = UserRegistrationModel.objects.get(loginid=loginid, password=pswd)
        except UserRegistrationModel.DoesNotExist as e:
            print('Exception is ', str(e))
            messages.error(request, 'Invalid login id or password')
            return render(request, 'UserLogin.html', {})
        status = check.status
        print('Status is = ', status)
        if status == "activated":
            # Remember the logged-in user in the session.
            request.session['id'] = check.id
            request.session['loggeduser'] = check.name
            request.session['loginid'] = loginid
            request.session['email'] = check.email
            print("User id At", check.id, status)
            return render(request, 'users/UserHomePage.html', {})
        messages.error(request, 'Your account has not been activated yet')
    return render(request, 'UserLogin.html', {})


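def UserHome(request):
    return render(request, 'users/UserHomePage.html', {})


def DatasetView(request):
    """Show the first 100 rows of the dataset as an HTML table."""
    path = settings.MEDIA_ROOT + '/' + 'roo_data.csv'
    df = pd.read_csv(path, nrows=100)
    # Convert the rows to an HTML table string for the template.
    html_table = df.to_html()
    return render(request, 'users/viewdataset.html', {'data': html_table})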

def ml(request):
    """Train Decision Tree and KNN classifiers and report their metrics."""
    from sklearn.preprocessing import LabelEncoder, Normalizer

    # Load the dataset from the project's media folder.
    dataset = pd.read_csv(settings.MEDIA_ROOT + '/' + 'roo_data.csv')

    # Encode the categorical columns as integer codes.
    labelencoder = LabelEncoder()
    categorical_columns = ['certifications', 'talentteststaken', 'olympiads',
                           'memorycapabilityscore', 'Interestedsubjects',
                           'Typeofcompanywanttosettlein', 'ManagementorTechnical',
                           'workedinteamsever', 'SuggestedJobRole']
    for column in categorical_columns:
        dataset[column] = labelencoder.fit_transform(dataset[column])

    # Row-normalised copy of the data, appended to the encoded columns
    # (neither array is used by the classifiers below).
    normalized_data = Normalizer().fit_transform(dataset)
    print(normalized_data.shape)
    df1 = np.append(normalized_data, dataset, axis=1)

    # Features: every column except the target and the four percentage columns.
    col_to_drop = ['SuggestedJobRole',
                   'PercentageinComputerNetworks',
                   'PercentageinSoftwareEngineering',
                   'PercentageinProgrammingConcepts',
                   'percentageinAlgorithms']
    X = dataset.drop(col_to_drop, axis=1)
    y = dataset['SuggestedJobRole']

    # =========================== Decision Tree & KNN ===========================
    from sklearn import tree
    from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
    from sklearn.neighbors import KNeighborsClassifier

    # Hold out 20% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

    # Decision Tree classifier (default Gini criterion)
    clf = tree.DecisionTreeClassifier()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # K-Nearest Neighbors classifier (default of 5 neighbours)
    knn_model = KNeighborsClassifier()
    knn_model.fit(X_train, y_train)
    knn_y_pred = knn_model.predict(X_test)

    # Calculate and print accuracy for K-Nearest Neighbors classifier
    knn_accuracy = accuracy_score(y_test, knn_y_pred)
    print("K-Nearest Neighbors Accuracy:", knn_accuracy * 100)

    # Calculate and print confusion matrix for K-Nearest Neighbors classifier
    knn_cm = confusion_matrix(y_test, knn_y_pred)
    print("K-Nearest Neighbors Confusion Matrix:")
    print(knn_cm)

    # Lookup table from numeric class code to job title.
    # NOTE: LabelEncoder numbers classes from 0 in sorted order, so these keys
    # should be checked against the encoder's classes_ before relying on them.
    job_counts = {
        1: 'Network Security Administrator',
        2: 'Network Security Engineer',
        3: 'Network Engineer',
        4: 'Project Manager',
        5: 'Database Administrator',
        6: 'Portal Administrator',
        7: 'Information Technology Manager',
        8: 'Software Engineer',
        9: 'UX Designer',
        10: 'Design & UX',
        11: 'Software Developer',
        12: 'CRM Business Analyst',
        13: 'Business Systems Analyst',
        14: 'Database Developer',
        15: 'Solutions Architect',
        16: 'Software Systems Engineer',
        17: 'Software Quality Assurance (QA) / Testing',
        18: 'Database Manager',
        19: 'Web Developer',
        20: 'CRM Technical Developer',
        21: 'Technical Support',
        22: 'Quality Assurance Associate',
        23: 'Data Architect',
        24: 'Systems Security Administrator',
        25: 'Information Technology Auditor',
        26: 'Technical Services/Help Desk/Tech Support',
        27: 'Technical Engineer',
        28: 'Applications Developer',
        29: 'Systems Analyst',
        30: 'E-Commerce Analyst',
        31: 'Information Security Analyst',
        32: 'Business Intelligence Analyst',
        33: 'Mobile Applications Developer',
        34: 'Programmer Analyst'
    }

    # Print the predicted job title for the first test row (K-Nearest Neighbors)
    msg_knn = job_counts.get(knn_y_pred[0], 'Prediction not found in dictionary')
    print(msg_knn)

    # Calculate and print classification report for K-Nearest Neighbors classifier
    classification_rep_knn = classification_report(y_test, knn_y_pred)
    print("K-Nearest Neighbors Classification Report:")
    print(classification_rep_knn)

    # Plot confusion matrix for K-Nearest Neighbors (one row and column per class)
    plt.figure(figsize=(8, 6))
    sns.heatmap(knn_cm, annot=True, fmt="d", cmap="Blues")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.title("K-Nearest Neighbors Confusion Matrix")
    plt.show()

    # Calculate and print accuracy for the Decision Tree classifier
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)

    # Calculate and print confusion matrix
    cm = confusion_matrix(y_test, y_pred)
    print("Confusion Matrix:")
    print(cm)

    # Calculate and print classification report (includes precision, recall, f1-score)
    classification_rep = classification_report(y_test, y_pred)
    print("Classification Report:")
    print(classification_rep)

    # Plot confusion matrix
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.title("Confusion Matrix")
    plt.show()

    # Decision Tree Classifier with entropy
    clf_entropy = tree.DecisionTreeClassifier(criterion="entropy", random_state=10)
    clf_entropy.fit(X_train, y_train)
    entropy_y_pred = clf_entropy.predict(X_test)

    # Calculate and print accuracy for entropy-based classifier (as a percentage)
    entropy_accuracy = accuracy_score(y_test, entropy_y_pred)
    print("Entropy Accuracy:", entropy_accuracy * 100)

    # Calculate and print confusion matrix for entropy-based classifier
    cm_entropy = confusion_matrix(y_test, entropy_y_pred)
    print("Entropy Confusion Matrix:")
    print(cm_entropy)

    # Calculate and print classification report for entropy-based classifier
    classification_rep_entropy = classification_report(y_test, entropy_y_pred)
    print("Entropy Classification Report:")
    print(classification_rep_entropy)

    # Plot confusion matrix for entropy-based classifier
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm_entropy, annot=True, fmt="d", cmap="Blues")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.title("Entropy Confusion Matrix")
    plt.show()

    # Pass the metrics to the Django template as percentages
    return render(request, 'users/ml.html', {
        'accuracy': accuracy * 100,
        'entropy_accuracy': entropy_accuracy * 100,
        'knn_accuracy': knn_accuracy * 100,
        'msg_knn': msg_knn
    })

def predictTrustWorthy(request):
    """Predict a job role from the skill values submitted in the form."""
    if request.method == 'POST':
        hackathons = request.POST.get("hackathons")
        codingskillsrating = request.POST.get("codingskillsrating")
        certifications = request.POST.get("certifications")
        talentteststaken = request.POST.get("talentteststaken")
        olympiads = request.POST.get("olympiads")
        memorycapabilityscore = request.POST.get("memorycapabilityscore")
        Interestedsubjects = request.POST.get("Interestedsubjects")
        Typeofcompanywanttosettlein = request.POST.get("Typeofcompanywanttosettlein")
        ManagementorTechnical = request.POST.get("ManagementorTechnical")
        workedinteamsever = request.POST.get("workedinteamsever")

        # Loading the dataset and dropping rows with missing values
        path = settings.MEDIA_ROOT + '/' + 'roo_data.csv'
        df = pd.read_csv(path)
        data = df.dropna()

        # Selecting relevant columns for training
        features = ['hackathons', 'codingskillsrating',
                    'certifications', 'talentteststaken', 'olympiads', 'memorycapabilityscore',
                    'Interestedsubjects', 'Typeofcompanywanttosettlein',
                    'ManagementorTechnical', 'workedinteamsever']

        # One-hot encoding the categorical features
        X = pd.get_dummies(data[features])

        # Creating the test set from the submitted form values
        test_set = {
            'hackathons': hackathons,
            'codingskillsrating': codingskillsrating,
            'certifications': certifications,
            'talentteststaken': talentteststaken,
            'olympiads': olympiads,
            'memorycapabilityscore': memorycapabilityscore,
            'Interestedsubjects': Interestedsubjects,
            'Typeofcompanywanttosettlein': Typeofcompanywanttosettlein,
            'ManagementorTechnical': ManagementorTechnical,
            'workedinteamsever': workedinteamsever}

        # Creating a DataFrame for the test set
        test_df = pd.DataFrame([test_set])

        # Form values arrive as strings; cast the numeric features so that they
        # are not one-hot encoded as if they were categories.
        for col in features:
            if pd.api.types.is_numeric_dtype(data[col]):
                test_df[col] = pd.to_numeric(test_df[col])

        # One-hot encoding the test set
        test_df = pd.get_dummies(test_df)

        # Matching the columns and their order to the training set; dummy
        # columns absent from the form input are filled with 0.
        test_df = test_df.reindex(columns=X.columns, fill_value=0)

        # Preparing the data for training
        y = data['SuggestedJobRole']
        X_train, X_test, Y_train, Y_test = train_test_split(
            X, y, test_size=0.2, random_state=101)

        # Initializing and training the model
        OBJ = RandomForestClassifier(n_estimators=10, criterion='entropy')
        OBJ.fit(X_train, Y_train)

        # Making a prediction for the submitted form
        print(test_df.values)
        y_pred = OBJ.predict(test_df)
        print(y_pred)
        msg = y_pred[0]
        return render(request, "users/predictForm.html", {'predict': msg})
    else:
        return render(request, 'users/predictForm.html', {})
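
The column-matching step used when a single form submission is encoded for prediction can
be examined in isolation. The sketch below re-implements the idea in plain Python, without
pandas, and is only an illustration under stated assumptions: the helper names one_hot and
align, the two training rows, and the category values "python" and "hadoop" are
hypothetical and are not taken from the project's dataset.

```python
def one_hot(row, categorical):
    """Encode one record: categorical fields become 'name_value' indicator
    columns, every other field is cast to a number."""
    encoded = {}
    for name, value in row.items():
        if name in categorical:
            encoded[f"{name}_{value}"] = 1
        else:
            encoded[name] = float(value)  # form values arrive as strings
    return encoded


def align(encoded, training_columns):
    """Order the record like the training matrix; absent columns become 0
    and columns unknown to the training data are dropped."""
    return [encoded.get(column, 0) for column in training_columns]


# Hypothetical training rows with one numeric and one categorical feature.
training_rows = [
    {"hackathons": 2, "certifications": "python"},
    {"hackathons": 5, "certifications": "hadoop"},
]
categorical = {"certifications"}

# The training column set is the union of all encoded column names.
training_columns = sorted(
    {column for row in training_rows for column in one_hot(row, categorical)}
)

form_input = {"hackathons": "3", "certifications": "python"}
vector = align(one_hot(form_input, categorical), training_columns)

# A category that never occurred in training maps to all-zero indicators.
unseen = align(one_hot({"hackathons": "1", "certifications": "r programming"},
                       categorical), training_columns)
print(training_columns, vector, unseen)
```

Casting the numeric field before encoding matters: if the string "3" were treated as a
category, it would produce an indicator column that the trained model has never seen, and
the numeric column itself would be filled with 0.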
