Scrape Indeed
What is it?
- Node.js package that allows flexible job searching of Indeed's job postings.
- Uses ES6 Promises to handle asynchronous control flow.
Why use it?
- Allows your web app to get full job posting data from Indeed.
- Works for 63 countries (see https://www.indeed.com/worldwide).
- You don't have to deal with Indeed's cluttered interface.
How to use it?
Installation
Install from NPM registry:
npm install scrape-indeed
Basic usage
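    // Require our module.
    // NOTE: the original require expression was lost in extraction; this
    // assumes the package exports the service object directly.
    const IndeedService = require('scrape-indeed');

    // Test with: node test.js 'Programmer' 'Vancouver' 'Canada' 25 50
    let options = {
      title: process.argv[2],    // Programmer
      location: process.argv[3], // Vancouver
      country: process.argv[4],  // Canada
      radius: process.argv[5],   // 25 kilometer radius
      count: process.argv[6]     // 50 job postings
    };

    // query() returns a Promise. The handlers below are an assumed, minimal
    // continuation; the original ones were lost.
    IndeedService.query(options)
      .then(function (data) { console.log(data); })
      .catch(function (err) { console.error(err); });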
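The mapping from command-line arguments to the options object can be factored into a small helper. This is only a sketch: `buildOptions` is a hypothetical name, not part of scrape-indeed, and the check that `radius` and `count` are whole numbers is an assumption about what the package expects. Values are kept as strings, as they arrive from `process.argv`.

```javascript
// Hypothetical helper (not part of scrape-indeed): turns process.argv into the
// options object used in the basic usage example.
function buildOptions(argv) {
  // argv[0] is the node binary and argv[1] the script path, so user
  // arguments start at index 2.
  const args = argv.slice(2);
  if (args.length < 5) {
    throw new Error('Usage: node test.js <title> <location> <country> <radius> <count>');
  }

  const [title, location, country, radius, count] = args;

  // Assumption: radius and count should be whole numbers.
  for (const [name, value] of [['radius', radius], ['count', count]]) {
    if (!/^\d+$/.test(value)) {
      throw new Error(name + ' must be a whole number, got "' + value + '"');
    }
  }

  return { title, location, country, radius, count };
}
```

The returned object has the same five keys as `options`, so it could be built with `buildOptions(process.argv)` and passed to the search in its place.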
That's great. But that only gives us n job postings.
- We can ask for the next n ads by using IndeedService.nextPage()
- We can see which ad index we're currently at by using IndeedService.parameters.adIndex
- Once we've performed a search, the returned data object has a property containing the total number of job postings: data.featuredAdCount
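    // Require our module.
    // NOTE: the original require expression was lost in extraction; this
    // assumes the package exports the service object directly.
    const IndeedService = require('scrape-indeed');

    let options = {
      title: process.argv[2],
      location: process.argv[3],
      country: process.argv[4],  // Canada
      radius: process.argv[5],   // 25 kilometer radius
      count: process.argv[6]     // 50 job postings
    };

    // Get initial Indeed data using IndeedService.query()
    // The rest of the original example was lost; the handlers below are an
    // assumed, minimal continuation.
    IndeedService.query(options)
      .then(function (data) {
        // Total number of job postings for this search.
        console.log(data.featuredAdCount);
        // Ad index we're currently at.
        console.log(IndeedService.parameters.adIndex);
      })
      .catch(function (err) { console.error(err); });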
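Paging through every result could look roughly like the sketch below. It runs against a stand-in service (`createFakeService`, hypothetical) so it works offline. The behavior it imitates is an assumption based only on the names in this README: that `nextPage()` returns a Promise resolving to the same kind of object as `query()`, that `parameters.adIndex` advances by the number of postings fetched, and that `featuredAdCount` is the total. `collectAll` is likewise a hypothetical helper.

```javascript
// Stand-in for scrape-indeed so the sketch runs without network access.
// Its behavior is assumed, not taken from the real package.
function createFakeService(totalAds) {
  const service = {
    parameters: { adIndex: 0, count: 0 },

    // Builds one page of fake postings and advances adIndex.
    _page() {
      const start = service.parameters.adIndex;
      const end = Math.min(start + service.parameters.count, totalAds);
      const jobList = [];
      for (let i = start; i < end; i++) {
        jobList.push({ title: 'Posting ' + i });
      }
      service.parameters.adIndex = end;
      return Promise.resolve({ featuredAdCount: totalAds, jobList: jobList });
    },

    query(options) {
      service.parameters.adIndex = 0;
      service.parameters.count = Number(options.count);
      return service._page();
    },

    nextPage() {
      return service._page();
    }
  };
  return service;
}

// Collects postings page by page until adIndex reaches the advertised total
// or a page comes back empty (a guard against looping forever).
async function collectAll(service, options) {
  let data = await service.query(options);
  const jobs = data.jobList.slice();

  while (service.parameters.adIndex < data.featuredAdCount) {
    data = await service.nextPage();
    if (data.jobList.length === 0) break;
    jobs.push(...data.jobList);
  }
  return jobs;
}
```

If the real service behaves as assumed, passing it to `collectAll` in place of the fake one should work the same way. Fetching every page can mean many requests, so capping the number of pages may be sensible.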
What does the data look like?
Look at the table to see the different kinds of data available.
IndeedService.query() returns an object containing ...
name | datatype | description |
---|---|---|
salaryList | array | List of links to job searches sorted by salary ($50000+, $70000+, etc.) |
jobTypeList | array | ... sorted by job type (SALARY, CONTRACT, HOURLY, etc.) |
locationList | array | ... sorted by location (Toronto, Newmarket, Richmond Hill, etc.) |
companyList | array | ... sorted by company |
titleList | array | ... sorted by job title (Senior Web Developer, Junior Dev, C Developer, etc.) |
jobList | array | List of all main job postings [JSON format] |
Below are the fields of a main job posting. jobList contains a list of these main postings.
Each jobList entry contains ...
name | datatype | description |
---|---|---|
href | string | A complete URL to the Canadian job posting |
title | string | Job title of posting |
isSponsored | boolean | Indicates whether the posting is sponsored. Sponsored ads appear first or last in the results |
company | string | Company name of job posting |
location | string | Geographical location of job |
salary | string | Indicates salary/hourly wage |
summary | string | Short summary of the job posting |
datePosted | string | Number of days since the job was posted. Maximum is 30+ days |
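As a sketch of working with these fields, the snippet below filters out sponsored postings and formats a one-line label. The sample postings are invented for illustration and only follow the field names in the table. `organicPostings` and `formatPosting` are hypothetical helpers, not part of the package.

```javascript
// Made-up postings shaped like the jobList table above.
const sampleJobList = [
  {
    href: 'https://example.com/job/1',
    title: 'Senior Web Developer',
    isSponsored: true,
    company: 'Example Corp',
    location: 'Toronto',
    salary: '$70000',
    summary: 'Build web apps.',
    datePosted: '3'
  },
  {
    href: 'https://example.com/job/2',
    title: 'Junior Dev',
    isSponsored: false,
    company: 'Sample Inc',
    location: 'Newmarket',
    salary: 'N/A',
    summary: 'Entry-level role.',
    datePosted: '30+'
  }
];

// Keeps only postings that are not sponsored.
function organicPostings(jobList) {
  return jobList.filter(function (job) {
    return !job.isSponsored;
  });
}

// Formats a posting as "title at company (location)".
function formatPosting(job) {
  return job.title + ' at ' + job.company + ' (' + job.location + ')';
}
```

With real results, the same helpers would be applied to `data.jobList` inside the Promise handler.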
Known issues?
- Some main job postings will be missing data ('N/A')
- This is because job posters don't provide all information
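One way to handle the missing values is to normalize them before use. The sketch below assumes a missing field holds exactly the string 'N/A', as the note above suggests. `missingFields` and `normalizePosting` are hypothetical helpers.

```javascript
// Assumption: scrape-indeed marks missing fields with this exact string.
const MISSING = 'N/A';

// Returns the names of the fields a posting is missing.
function missingFields(job) {
  return Object.keys(job).filter(function (key) {
    return job[key] === MISSING;
  });
}

// Returns a copy of the posting with 'N/A' values replaced by null,
// leaving the original object untouched.
function normalizePosting(job) {
  const result = {};
  for (const [key, value] of Object.entries(job)) {
    result[key] = value === MISSING ? null : value;
  }
  return result;
}
```

Replacing the marker with `null` lets later code use ordinary checks such as `job.salary === null` instead of comparing strings.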
Backlog
- Allow all North American jobs to be searched, rather than only Canada. (0.5.0)
- Allow a single object to be passed into the query function, rather than a separate parameter for each search token. (0.4.0)
- Allow a wide range of job postings to be searched, rather than the default 10 per query. (0.4.0)
- Create NPM registry to enable npm install. (0.3.2)