Resume Screening for Recruitment
Table of Contents
Executive Summary
Project Objective
Scope
Methodology
Artifacts Used
Technical Coverage
Results
Challenges and Resolutions
Conclusion
References
Project Coding
Executive Summary
This project focuses on automating the resume screening process using a Python-based
approach. With the increase in job applications, recruiters face a significant burden in
shortlisting qualified candidates. The proposed system extracts and analyzes candidate
qualifications using Natural Language Processing (NLP), enabling HR teams to efficiently
identify resumes that match predefined job requirements and streamline the recruitment
workflow.
Project Objective
The primary objective is to create an automated resume screening tool capable of
identifying relevant skills and educational qualifications from resumes. This system aims to
reduce manual intervention and accelerate the initial screening process in hiring workflows
by generating a relevance score for each candidate.
Scope
This project supports parsing and analyzing resumes in both PDF and DOCX formats. It
defines a set of required skills and educational levels, matches those against the resume
contents, and returns a structured report. This tool is especially useful for pre-screening in
large recruitment drives and can be extended to include experience levels and certifications.
Methodology
The methodology includes the following steps:
1. Text Extraction: Using PyPDF2 for PDF and docx2txt for DOCX.
2. Preprocessing: Converting text to lowercase and tokenizing using NLTK.
3. NLP Parsing: spaCy's English language model (en_core_web_sm) is loaded to support linguistic analysis of the extracted text.
4. Keyword Matching: Identifying matches against a predefined set of skills and education.
5. Scoring: A score is computed based on the number of matches.
This process ensures that only the most relevant resumes move forward in the recruitment
pipeline; the matching and scoring steps are sketched below.
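The following minimal sketch covers steps 2, 4, and 5, assuming the NLTK punkt tokenizer data has been downloaded; the sample sentence and the reduced skill set are illustrative only:
from nltk.tokenize import word_tokenize

# Illustrative subset of the full keyword set defined in the Project Coding section.
REQUIRED_SKILLS = {"python", "sql", "nlp"}

resume_text = "Experienced analyst skilled in Python, SQL and NLP."
tokens = set(word_tokenize(resume_text.lower()))   # step 2: normalize case and tokenize

matched = REQUIRED_SKILLS & tokens                 # step 4: keyword matching via set intersection
score = len(matched)                               # step 5: one point per matched keyword
print(matched, score)                              # e.g. {'python', 'sql', 'nlp'} 3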
Artifacts Used
- Python 3.10+
- PyPDF2: Library for PDF parsing
- docx2txt: Module for reading DOCX files
- spaCy: NLP processing
- NLTK: Tokenization
- Sample resume files
- Jupyter Notebook or Python CLI for testing (see the environment check below)
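The libraries above also need one-time data and model downloads. This check is a minimal sketch, assuming the packages from the list are already installed:
import nltk
import spacy

# Fetch the tokenizer data used by nltk.word_tokenize (no-op if already present).
nltk.download("punkt", quiet=True)
# Raises OSError if the en_core_web_sm model has not been downloaded.
nlp = spacy.load("en_core_web_sm")
print("Environment ready:", nlp.meta["name"])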
Technical Coverage
This solution leverages core Natural Language Processing techniques, including
tokenization and named entity recognition. Text extraction uses robust Python libraries
designed for various file types. Skill and education matching is efficiently implemented
using set operations. The result is a lightweight, extensible, and scalable resume filtering
engine that can be integrated into larger ATS systems.
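As an illustration, the loaded spaCy model exposes named entities that could supplement keyword matching; the sample sentence below is illustrative, not taken from an actual resume:
import spacy

# Sketch of spaCy's named entity output on a resume-style sentence.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Jane Doe earned a Master of Science at Example University in 2019.")
for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g. PERSON, ORG, DATE entities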
Results
The system was tested on resumes in both supported formats and consistently identified
matching skills and education qualifications. Each processed resume returned:
- List of matched skills
- Matched education qualifications
- Total score
The tool reduced the manual screening effort and produced consistent outputs across both
document formats; an illustrative output appears below.
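For illustration only, a result of the kind described above has the following shape (the values are hypothetical, not taken from a test run):
# Hypothetical output; the keys mirror the dictionary returned by screen_resume().
{
    "matched_skills": ["python", "sql"],
    "matched_education": ["bachelor"],
    "score": 3
}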
Challenges and Resolutions
Challenges:
- Inconsistent resume formats caused difficulties in text extraction.
- Some PDFs lacked clear structure for parsing.
- Tokenization mismatches due to case sensitivity.
Resolutions:
- Normalized all text inputs to lowercase (see the sketch after this list).
- Used reliable libraries like PyPDF2 and docx2txt.
- Defined keyword sets carefully to match common resume terms.
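The sketch below illustrates that normalization step; the normalize_text helper and the phrase check are assumptions added for illustration, not part of the original code:
import re

def normalize_text(raw_text):
    # Assumed helper: lowercase and collapse irregular whitespace left over from messy extractions.
    return re.sub(r"\s+", " ", raw_text.lower()).strip()

# Multi-word keywords such as "machine learning" are matched as phrases in the
# normalized text, which avoids case and tokenization mismatches.
text = normalize_text("Machine\nLearning   engineer with SQL experience")
print("machine learning" in text)   # True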
Conclusion
This project demonstrates a practical implementation of automated resume screening. It
simplifies the initial recruitment phase by identifying qualified candidates based on
required skill sets and education. With further enhancements such as experience detection
and machine learning models, the system can evolve into a comprehensive ATS component.
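One possible direction for the machine learning enhancement mentioned above is ranking resumes by TF-IDF similarity to a job description; the scikit-learn dependency, the sample texts, and the ranking logic are assumptions for this sketch, not part of the current system:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Sketch of a possible future enhancement: rank resumes by TF-IDF cosine
# similarity to a job description.
job_description = "Seeking a data analyst with Python, SQL and NLP experience."
resumes = [
    "Python developer with SQL and machine learning background.",
    "Graphic designer experienced in branding and illustration.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform([job_description] + resumes)
similarities = cosine_similarity(vectors[0:1], vectors[1:]).flatten()
for text, similarity in zip(resumes, similarities):
    print(f"{similarity:.2f}  {text}")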
References
- https://spacy.io
- https://www.nltk.org
- https://pypi.org/project/docx2txt/
- https://pythonhosted.org/PyPDF2/
Project Coding
The following code represents the core logic of the resume screening system:
import docx2txt
import PyPDF2
import nltk
import spacy
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)   # tokenizer data required by word_tokenize

# spaCy's English model is loaded for the NLP parsing step; the keyword
# matching below relies on NLTK tokens.
nlp = spacy.load("en_core_web_sm")

REQUIRED_SKILLS = {"python", "machine learning", "data analysis", "nlp", "sql"}
REQUIRED_EDUCATION = {"bachelor", "master", "phd"}

def extract_text_from_pdf(pdf_path):
    # Concatenate the text of every page; extract_text() may return None for image-only pages.
    text = ""
    with open(pdf_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        for page in reader.pages:
            text += (page.extract_text() or "") + " "
    return text

def extract_text_from_docx(docx_path):
    # Return the plain text content of a DOCX file.
    return docx2txt.process(docx_path)

def screen_resume(resume_text):
    # Normalize case, tokenize, and match the required keyword sets.
    resume_text = resume_text.lower()
    tokens = set(word_tokenize(resume_text))
    matched_skills = set()
    for skill in REQUIRED_SKILLS:
        if " " in skill:
            # Multi-word skills such as "machine learning" never appear as single
            # tokens, so they are matched as phrases in the full text.
            if skill in resume_text:
                matched_skills.add(skill)
        elif skill in tokens:
            matched_skills.add(skill)
    matched_education = REQUIRED_EDUCATION.intersection(tokens)
    score = len(matched_skills) + len(matched_education)
    return {
        "matched_skills": sorted(matched_skills),
        "matched_education": sorted(matched_education),
        "score": score
    }

def process_resume(file_path):
    # Dispatch on file extension and screen the extracted text.
    if file_path.endswith(".pdf"):
        text = extract_text_from_pdf(file_path)
    elif file_path.endswith(".docx"):
        text = extract_text_from_docx(file_path)
    else:
        return "Unsupported file format"
    return screen_resume(text)

if __name__ == "__main__":
    resume_path = "ATS classic HR resume.docx"
    result = process_resume(resume_path)
    print(result)
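As a usage illustration, process_resume() could be applied to a folder of resumes and the candidates ranked by score; the folder name and the ranking logic below are assumptions for this sketch rather than part of the original code:
import os

# Hypothetical batch run: screen every supported file in a "resumes" folder
# and rank candidates by descending relevance score.
def rank_resumes(folder="resumes"):
    ranked = []
    for name in os.listdir(folder):
        if name.endswith((".pdf", ".docx")):
            report = process_resume(os.path.join(folder, name))
            ranked.append((report["score"], name, report))
    return sorted(ranked, key=lambda item: item[0], reverse=True)

for score, name, report in rank_resumes():
    print(f"{name}: score {score}, skills {report['matched_skills']}")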