A NEW METHOD TO HELP THE HUMAN RESOURCES STAFF TO FIND THE RIGHT CANDIDATES, BASED ON DEEP LEARNING
The paper presents two useful tools that help HR staff find the right
candidates for the company's needs, automatically and objectively.
• The first one is Sentiment Analysis of candidates' opinions about the
company.
• The second one is Matching between the company's requirements and
candidates' CVs.
Both tools use NLP (Natural Language Processing) based on DL (Deep Learning).
The following presents the second tool.
The tool seeks to measure the degree of similarity between the semantics
(meaning) of the company's requirement and the semantics of each CV.
The main steps for computing the similarity between two texts are:
• The creation of a very large database of sentences in which each word occurs
in all of its possible grammatical forms.
• Processing the database in several ways, for example, tokenization and
elimination of the full stop between sentences, so that a single very large
sentence is created (a minimal sketch follows below).
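Below is a minimal sketch of this preprocessing step, using plain Python string handling as a stand-in for a real tokenizer; the toy sentences are illustrative assumptions, not the actual training corpus.

```python
# Minimal preprocessing sketch: lower-case each sentence, drop its final
# full stop, and tokenize, so all sentences merge into one long token stream.
corpus_sentences = [
    "The candidate writes C++ code.",
    "The company needs a C++ programmer.",
]

tokens = []
for sentence in corpus_sentences:
    cleaned = sentence.lower().rstrip(".")  # eliminate the stop point between sentences
    tokens.extend(cleaned.split())          # naive whitespace tokenization

print(tokens)  # one "very large sentence" as a single token sequence
```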
• Transforming each distinct word from the database into a 300-dimension vector.
The operation is called word2vec or word embedding.
Two methods are used:
• CBOW (Continuous Bag-of-Words model),
• skip-gram (Continuous Skip-gram model); skip-gram is used more often.
When trained on a large dataset, these models capture a substantial amount of
semantic information; closely related words have similar vector representations.
The resulting semantic vector space models the language.
The database of vectors corresponding to the words is reused in the
next step through transfer learning (see the sketch below).
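A minimal sketch of this step, assuming the gensim library as a stand-in skip-gram word2vec implementation (the slides do not name a specific library, and the toy corpus is illustrative):

```python
from gensim.models import Word2Vec

# Toy corpus; in practice this is the very large sentence database built above.
sentences = [
    ["the", "candidate", "writes", "c++", "code"],
    ["the", "company", "needs", "a", "c++", "programmer"],
]

# sg=1 selects the skip-gram architecture; vector_size=300 matches the
# 300-dimension word vectors described above.
model = Word2Vec(sentences, vector_size=300, sg=1, window=5, min_count=1)

vec = model.wv["programmer"]            # 300-dim embedding of one word
print(vec.shape)                        # (300,)
print(model.wv.most_similar("code"))    # closely related words have similar vectors
```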
• Transforming every text that needs to be semantically analyzed into a 512-
dimension vector, regardless of the size of the text. The operation is called
text2vec or text embedding. Several methods are used, namely:
• Averaging Networks
• Deep Averaging Networks (DAN)
• Transformer: it uses a self-attention mechanism that directly models the
relationships between all words in a sentence, regardless of their respective
positions (the best method; its core formula is recalled below).
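For reference, the standard scaled dot-product self-attention at the heart of the Transformer can be written as follows (this formulation comes from the general Transformer literature, not from the slides themselves):

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices derived from the word vectors, and $d_k$ is the key dimension.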
After training, in the 512-dim vector space thus created, close vectors
correspond to texts that have a close meaning.
• Measuring the similarity of meanings between two texts is done by
measuring the angle between the corresponding vectors (see the formula below).
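In practice the angle is usually reported through its cosine, which equals 1 for identical directions and decreases as the angle grows (the slides do not name the exact metric, so cosine similarity is an assumption consistent with the scores reported below):

$$\mathrm{sim}(u, v) = \cos\theta = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$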
• In the following example, the Company Requirement and each CV are converted into their
corresponding 512-dim vectors, and then the semantic similarity between the Company
Requirement and each CV is calculated.
• Company Requirement: "C++ programmer."
• CV 1: "I've worked as a programmer for five years. I have created and customized software applications and tools
using advanced development and coding techniques. I've managed all phases of application design, from coding
and prototyping through system testing, integration, and deployment. I've developed and enhanced programs
using Java, C, and C++, contributing to solutions that streamlined processes, increased accuracy, and lowered costs."
• CV 2: "I've served as a member of the electronic placement project team, developing strategies to ensure compliance
with new industry standards for electronic exchange. I've developed custom software solutions that help
customers make more informed decisions, improving their use of capital and managing risk more effectively."
Results obtained for similarity, using the TensorFlow framework with the
"universal-sentence-encoder/2" module, are:
Matching between Company Requirement and CV 1 = 0.63089013; matching - OK
Matching between Company Requirement and CV 2 = 0.39129144; no matching - OK
Result = 1 means a perfect match; the matching threshold is 0.5. A minimal sketch of how such scores can be computed follows below.
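A minimal sketch of the matching computation, assuming TensorFlow Hub with a newer TF2-compatible release of the Universal Sentence Encoder (the slides used "universal-sentence-encoder/2", so the exact scores may differ slightly; the abridged CV strings are placeholders for the full texts above):

```python
import numpy as np
import tensorflow_hub as hub

# Load the Universal Sentence Encoder (TF2 release; the slides used version 2).
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

requirement = "C++ programmer."
cvs = [
    "I've worked as a programmer for five years. ...",                        # CV 1 (abridged)
    "I've served as a member of the electronic placement project team. ...",  # CV 2 (abridged)
]

# One 512-dim vector per text, regardless of text length.
vectors = embed([requirement] + cvs).numpy()

def cosine(u, v):
    """Cosine similarity between two vectors (1.0 = identical direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

THRESHOLD = 0.5
for i, cv_vector in enumerate(vectors[1:], start=1):
    score = cosine(vectors[0], cv_vector)
    verdict = "matching" if score >= THRESHOLD else "no matching"
    print(f"Matching between Company Requirement and CV {i} = {score:.8f}; {verdict}")
```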