Matching Keywords, Partial Keywords, and Keyword Logic

March 14, 2012 by Jon Ciampi · 5 Comments 

On a conceptual basis, job keywords make sense to everyone.  However, when you dig into the details of how computers identify keywords, many challenges arise.  I am going to explain the logic to clear up the confusion and hopefully start an ongoing discussion in the comments section about keywords.  I will use layman’s terms to remove the complexity of understanding computational linguistics.  For those looking for more info on the subject, a good introductory article from Microsoft is here.

Computers Look for Different Keywords than Humans

As for job keywords, a lot of advice suggests focusing on industry keywords and functional keywords.  While this advice is still valid, a new era has emerged in which computers look for keywords in a résumé before a hiring manager reviews it.

Unlike humans, computers do not try to decipher meaning from individual words (e.g., does “manage” mean managing people or managing products?).  Instead, they apply complex mathematical formulas to determine the words and phrases that can precisely and compactly represent the content of the job description.  Then, these phrases are searched for in the résumé.  Based on the search, a complex ranking system is used to compare one candidate’s résumé to another’s.  Complex?  Yes!

I think a good way to show how a computer identifies keywords is to use a sample job description.  Let’s focus on just a few lines of the Requirements section:

Requirements:
Bachelors degree in a relevant scientific discipline or equivalent.
At least 2 years of relevant experience as a CRA in the biotech / pharmaceutical industry.
3+ years CRA experience is preferred 
Knowledge of GCP and ICH guidelines

Computers identify keywords by determining how often phrases are used among other job descriptions, then the computer looks for the phrases in a candidate’s résumé and ranks the candidate based on the findings.  To understand the process, we can break it into four parts: 1. the computer identifies keyword phrases, 2. the computer determines the frequency, or “significance”, of each phrase, 3. the computer searches the résumé for matching or partially matching keyword phrases, and 4. the computer assigns a rank based on the matches and partial matches.

1. Identifying the Keyword Phrases

The computer begins by analyzing the job description to identify all the keyword phrases in it.  What is a “phrase”?  A “phrase” is one or more words in succession from the job description.  Phrases can be single words like “CRA” from our example, or longer strings of words like “Bachelors degree in a relevant scientific discipline”.
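As a rough illustration of “one or more words in succession”, here is a minimal Python sketch.  The function names (`tokenize`, `phrases`) and the `max_words` cap are assumptions made for this example; the article does not say how any real system splits a job description.

```python
import re

def tokenize(text):
    """Split text into word tokens, dropping punctuation such as '/' and '.'."""
    return re.findall(r"[A-Za-z0-9+]+", text)

def phrases(text, max_words=5):
    """Return every run of 1..max_words consecutive words in `text`."""
    words = tokenize(text)
    found = []
    for start in range(len(words)):
        for length in range(1, max_words + 1):
            if start + length > len(words):
                break  # ran past the end of the text
            found.append(" ".join(words[start:start + length]))
    return found

# One line from the sample Requirements section.
line = "Knowledge of GCP and ICH guidelines"
result = phrases(line, max_words=3)
print(result)
```

With a cap of three words, the six-word line yields single words such as “GCP” as well as longer runs such as “GCP and ICH”.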

2. Determining the Frequency of the Phrases

With the phrases identified, the computer next counts how many times each phrase is found in all the other job descriptions.  The more often a phrase is found, the higher its frequency count; the less often it is found, the lower the count.  For instance, let’s look at the first two lines of our sample job description.

Requirements:
Bachelors degree in a relevant scientific discipline or equivalent.
At least 2 years of relevant experience as a CRA in the biotech / pharmaceutical industry.

If we had 10 other job descriptions and counted the frequency of the phrases, we might end up with something like this:

Phrase                                        Frequency
Bachelors                                     10
Bachelors degree                              10
Bachelors degree in                           10
Bachelors degree in a                         8
relevant                                      10
relevant scientific                           7
relevant scientific discipline                5
relevant scientific discipline or equivalent  4
At                                            10
At least                                      8
At least 2                                    3
At least 2 years                              3
At least 2 years of                           3
relevant                                      10
relevant experience                           10
relevant experience as                        10
relevant experience as a                      10
CRA                                           2
CRA in                                        1
CRA in the                                    1
CRA in the biotech                            1
CRA in the biotech pharmaceutical             1

What the computer does is start with a word and get a count for its frequency (i.e., how many times was it found in all job descriptions). Then, it will add on additional words and get a count.
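The counting step can be sketched in Python as follows.  This is only an illustration under stated assumptions: the three-entry `corpus` is made up for the example, the helper names are hypothetical, and “frequency” is taken to mean the number of job descriptions that contain a phrase, which is one possible reading of the description above.

```python
from collections import Counter

def word_runs(text, max_words=4):
    """All distinct runs of 1..max_words consecutive words (lowercased).

    Sentence boundaries are ignored in this simplified sketch.
    """
    words = text.lower().replace("/", " ").replace(".", " ").split()
    runs = set()
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_words, len(words)) + 1):
            runs.add(" ".join(words[start:end]))
    return runs

def phrase_frequency(job_descriptions, max_words=4):
    """Count how many job descriptions contain each phrase."""
    counts = Counter()
    for description in job_descriptions:
        # word_runs returns a set, so each description counts a phrase once.
        counts.update(word_runs(description, max_words))
    return counts

# Hypothetical mini-corpus, for illustration only.
corpus = [
    "Bachelors degree in a relevant scientific discipline or equivalent.",
    "Bachelors degree in accounting. At least 5 years of relevant experience.",
    "At least 2 years of relevant experience as a CRA in the biotech / pharmaceutical industry.",
]
freq = phrase_frequency(corpus)
for phrase in ("relevant", "bachelors degree", "at least 2 years", "cra"):
    print(phrase, freq[phrase])
```

Even in this tiny corpus the pattern from the table appears: generic phrases such as “relevant” are found everywhere, while “cra” is found in only one description.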

Once the frequency is determined, the computer decides which phrases are the keywords for a job.  With the information we have above, we could claim that the phrases that show up less frequently are the most important for this job, and the phrases that show up more frequently are too generic.  For instance, if a phrase appears in all 10 job descriptions, we may conclude it is not important (this is the case with “Bachelors degree”).  However, the phrase “CRA in the biotech pharmaceutical” is unique to this job.  Therefore, we could assert that “any phrase with a count of 3 or less is a keyword phrase”.
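The “count of 3 or less” rule is easy to express in code.  The sketch below uses a handful of counts copied from the table above; the function name `select_keywords` and the `max_count` parameter are assumptions for illustration, and a real system may well use a different cutoff or a different rule altogether.

```python
# A few phrase counts taken from the table above.
frequency = {
    "Bachelors degree": 10,
    "relevant scientific discipline": 5,
    "At least": 8,
    "At least 2 years": 3,
    "relevant experience": 10,
    "CRA": 2,
    "CRA in the biotech pharmaceutical": 1,
}

def select_keywords(frequency, max_count=3):
    """Keep only the phrases found in max_count or fewer job descriptions."""
    return [phrase for phrase, count in frequency.items() if count <= max_count]

keywords = select_keywords(frequency)
print(keywords)
# ['At least 2 years', 'CRA', 'CRA in the biotech pharmaceutical']
```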

But These Don’t Look Like Keywords

These phrases can look incomplete or like gibberish because of the extra words that make a phrase less frequent.  The computer is not looking for grammatical or commonly used phrases.  For example, you may believe “2 years of experience” is the keyword, but a computer may say “At least 2 years of” is the keyword because “2 years of experience” shows up in too many job descriptions.

3. Searching Résumés for Keyword Phrases

Once the computer has a set of keyword phrases, it next searches a résumé for them.  If it finds a match or partial match, it assigns a score.  Let’s use “at least 2 years experience” as the keyword phrase.  If the résumé had “I have more than 2 years experience in…”, we would get a partial match, with “2 years experience” being the overlap.  Changing the tense of a word, or adding or removing plurality or possession, will also result in a partial match.  Partial matches are not bad.  It is unlikely any résumé will match the job description exactly without copying it word for word.  Therefore, the goal is to eliminate any missing keywords and fill your résumé with matched and partially matched keywords.
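One simple way to picture a partial match is to find the longest run of consecutive words shared by the keyword phrase and the résumé.  The Python sketch below does exactly that for the example above.  It is an assumption about how overlap could be measured, not a description of any particular system, and it does not handle changes in tense or plurality.

```python
import re

def words(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def longest_shared_run(keyword, resume):
    """Longest run of consecutive words that appears in both texts."""
    kw, rs = words(keyword), words(resume)
    best = []
    for i in range(len(kw)):
        for j in range(len(rs)):
            k = 0
            while i + k < len(kw) and j + k < len(rs) and kw[i + k] == rs[j + k]:
                k += 1
            if k > len(best):
                best = kw[i:i + k]
    return best

def match_score(keyword, resume):
    """1.0 for an exact match, a fraction for a partial match, 0.0 for none."""
    kw = words(keyword)
    if not kw:
        return 0.0
    return len(longest_shared_run(keyword, resume)) / len(kw)

keyword = "at least 2 years experience"
resume = "I have more than 2 years experience in"
overlap = longest_shared_run(keyword, resume)
score = match_score(keyword, resume)
print(overlap, score)
# ['2', 'years', 'experience'] 0.6
```

Here three of the five keyword words are matched in order, so the phrase counts as a partial match rather than a miss.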

4. Assigning a Résumé Rank (i.e., Job Fit Rating or Resumeter Rating)

Once the computer has a list of all the matched and partially matched keywords, the computer assigns a rank or value.  The rank is weighted based on the matches and the frequency of the keyword phrase.  A keyword phrase that is less frequently found will get a higher weight than a keyword phrase that is more frequently found.  An exact match will get a higher weight than a partial match.  Within the partial match, the closer to the exact phrase you can get, the higher the rank.  The computer takes all of these into account and assigns a weighting.
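To make the weighting idea concrete, here is a small Python sketch.  The formula is an assumption chosen for illustration (each phrase is weighted by 1 divided by its frequency, and multiplied by a match score between 0 and 1); the actual formula used by any given system is not described here.  The two résumés and their match scores are hypothetical.

```python
def resume_rating(matches, frequency):
    """Rate a resume between 0.0 and 1.0.

    matches:   phrase -> match score (1.0 exact, fraction for partial)
    frequency: phrase -> number of job descriptions containing the phrase
    Rarer phrases get a larger weight (assumed here to be 1 / frequency).
    """
    total = sum(1 / frequency[p] for p in frequency)
    earned = sum(matches.get(p, 0.0) / frequency[p] for p in frequency)
    return earned / total if total else 0.0

# Counts taken from the table earlier in the article.
frequency = {"CRA": 2, "At least 2 years": 3, "Bachelors degree": 10}

# Hypothetical resumes: A matches the rare phrase "CRA" exactly,
# B only partially matches "At least 2 years".
resume_a = {"CRA": 1.0, "Bachelors degree": 1.0}
resume_b = {"At least 2 years": 0.5, "Bachelors degree": 1.0}

rating_a = resume_rating(resume_a, frequency)
rating_b = resume_rating(resume_b, frequency)
print(round(rating_a, 2), round(rating_b, 2))
```

Under these assumptions résumé A ranks above résumé B, because an exact match on a rare phrase outweighs a partial match on a more common one.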

For every résumé that comes in, a rating can be assigned.

Does it work?

Using our example, let’s do a simple test to see if the process works.  Let’s assume we get hundreds of résumés. If 10 résumés have “CRA” or “CRA in the biotech” versus 100 résumés that have “bachelors degree”, a hiring manager could quickly narrow the applicant list to just 10.  While the hiring manager may miss out on a strong candidate who does not have this term, they do avoid having to read through hundreds of résumés.   There are plenty of arguments on why this may not result in the best hiring decisions, but in today’s economy where employees are required to do more with less, these systems are here to stay.

 

About Jon Ciampi
Jonathan Ciampi is the President and founder of Preptel Corporation. Before Preptel, Mr. Ciampi worked for eight years at SumTotal Systems, a talent management software solutions company, where he most recently served as General Manager and Vice President of the Performance Management and OnDemand Solutions divisions. Prior to this role, Mr. Ciampi was Vice President of Global Marketing. Mr. Ciampi began his career at Wells Fargo Bank and Oracle before founding his own company, nSeconds Corporation, which was acquired by HireRocket.

Comments

5 Responses to “Matching Keywords, Partial Keywords, and Keyword Logic”
  1. jwalser says:

    Hi Jon,

    The term “job description” seems to be used interchangeably with “resume” in the article.

    For example, “Computers identify keywords by determining how often phrases are used among other job descriptions, then the computer looks for the phrases in a candidate’s résumé and ranks the candidate based on the findings.” Shouldn’t this be, “Computers identify keywords by determining how often phrases are used among other resumes submitted for the job…”?

  2. ts says:

    Hi jwalser,

    I think “Computers identify keywords by determining how often phrases are used among other job descriptions, then the computer looks for the phrases in a candidate’s résumé and ranks the candidate based on the findings.” is the right one, because the computer looks for what’s unique in the job description (what’s in this particular job description and not in others -other job descriptions- ).

  3. Shantun Thakur says:

    Is this service available in NZ? If not, how can I utilize this service for job hunting in NZ?

  4. Jon Ciampi says:

    You can use this in NZ. We don’t provide any translation so you may see a few inappropriate “Z”, but otherwise there should not be any issues.

  5. Mike Precure says:

    This is certainly an interesting point of view. It also explains why I get so few responses to applications I send to those sites that use an ATS. But, with so many high quality candidates filtered out, I would think the savings of using an ATS would be negated by the cost of missing high quality applicants very quickly.

    Are there any studies on this?

    - Mike

