AutoAI: A Powerful Tool in Detection of Fake Job Posts

May 2, 2021

Was there ever a time when you thought you were the perfect candidate for a job, you even excelled at the interview, but the phone call never came? You are not alone. Most of us have applied to a fake job post at some point, sometimes without even realizing that the position was not real. Now more than ever, it is important to have tools against fake ads. Due to Covid-19, online hiring, remote jobs and online networking sites are on the rise, which makes it easier to find jobs and hire potential colleagues, but it has also led to a rise in online job scams. Some are annoying but “harmless”, while others can cause real financial loss. But how can we spot fake job posts and avoid fraud? As part of a project, we started working with Watson Studio AutoAI and recognized that, with the right set of data, AutoAI can in fact serve as a powerful tool to help spot potential job scams.

Why do these fake job postings even exist?

When you think about it, it sounds absurd: why would someone go to so much trouble to post a vacancy without ever intending to fill it?

In many cases the employer just wants to collect information about the job market: how easy it is to replace current workers, how in-demand a position is, or how much to pay for a given job, to name a few examples. We can’t forget about nepotism either. It has always existed, and today’s job market is no exception. When the boss wants to employ a relative, the company will post an ad even though the position has already been filled, in order to show that its hiring process is fair and square. Many recruiters also flood the market with fake job posts to collect CVs and to widen their network of potential applicants.

While these can be frustrating, other job scams have the sole purpose of stealing your money or even your identity. If someone asks for your financial data, or wants you to transfer money disguised as a registration fee before you can start the job, be cautious: it is a warning sign of a job scam.

Whatever the reasons, they are no consolation for job seekers, especially now that a good many people have lost their jobs and are desperate to find new ones. Many of them have fallen victim to these job scams.

LinkedIn, Indeed, Glassdoor and other big networking sites give tips on how to detect these fake job postings. But lots of people have already lost faith in online job networking sites. It is a vicious circle. Since fake vacancy posts are not filtered out, the job market loses credibility, which encourages even more unprofessional behavior from recruiters and employers, who post more spam. In return, job applicants take the ads less seriously and apply to everything to try their luck. As the number of applications increases, recruiters won’t take the time to reply to every one of them, just the promising ones. And the circle goes on and on…

So the big question is: besides raising awareness of the issue and urging people to be more cautious about the job posts they apply for, could these big job boards do a better job of policing the jobs posted on their sites?

With Data Science, Yes.

Can we get even faster and more accurate answers?

With AutoAI, Absolutely.

A Kind of Magic

Why AutoAI, you could ask.

When you look up data scientist on job search sites, you will see that it is a shortage profession. Too much of a data scientist’s valuable time is spent on dull and repetitive tasks, which could and should be automated. With better resource management, fewer data scientists would leave their chosen field.

AutoAI allows us to optimize the productivity of data scientists. It enables them to gather, read, analyze and interpret large sets of data in a short time. Instead of spending weeks preparing the data and building the model, they can focus on the fun part of a data expert’s job: analyzing and evaluating the model. Using Watson Studio’s new feature, you can build and deploy a machine learning model without writing a single line of code. The tool does most of the work for you.

If you are up for it, you can experiment with AutoAI on your own. Don’t worry, we will walk you through, step by step, how to use AutoAI like a pro. It is easy to use, takes little time or effort on your part and works like magic.

5 steps to deploying your own AI on IBM Cloud:

Search for an interesting dataset
Provision a Watson Studio service
Create an AutoAI experiment
Run the experiment
Test your predictions

Step 1 Search for an interesting dataset

There are thousands of datasets you can build predictions on
Go to Kaggle, a community-driven site that focuses on machine learning

Step 2 Provision a Watson Studio service

Click the Catalog button on the top
Select Service from the catalog
Search for Watson Studio and click on it

You are now at the deployment page
Select a location and the Lite plan for a free experience

After the service has been created, go to the Resource list
Look for your Watson Studio in the services section and click on it

Click on the “Get Started” icon; it will take you to your Watson Studio

Step 3 Create the AutoAI experiment

Click create a project
Click create an empty project
Create a new project with a unique project name

On the project page, click on the Assets section and upload your data on the right side

Click on Add to project on the top right and select AutoAI experiment

Give the experiment a unique name and create it

Select your dataset from your project and proceed with the experiment

Step 4 Predict with AutoAI

Select what to predict from your data

You can predict any data category from your dataset

In the experiment settings you can set the type of prediction, the metric the experiment optimizes for and the algorithms to use

After you create the experiment, a Relationship map will appear; it visualizes the work AutoAI is doing

At the bottom you will find your pipelines, each with a different prediction score; choose the top-ranked one

Click on Save as and choose Model as the asset type
Save the model
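
Choosing the top pipeline in this step is simply a ranking by score. A minimal Python sketch of that idea follows. The pipeline names and scores are made-up placeholders, not results from our experiment, and best_pipeline is a hypothetical helper, not part of AutoAI.

```python
def best_pipeline(scores):
    """Return the (name, score) pair with the highest score."""
    return max(scores.items(), key=lambda item: item[1])


# Placeholder values for illustration only, not real experiment results.
example_scores = {"Pipeline 1": 0.962, "Pipeline 2": 0.971, "Pipeline 3": 0.968}

name, score = best_pipeline(example_scores)
print(f"Top pipeline: {name} ({score:.3f})")
```

In the tool itself you do not need any of this: the pipeline leaderboard is already sorted for you.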

Step 5 Test your predictions

Navigate to the created model and click on promote to deployment space
Create or use a target space

Navigate to your new deployment page, click on the top left main navigator icon, and look for your deployment
You will see your previously created model
Click on deploy

Create an online deployment with a unique name

Click on your newly created online deployment and choose the Test tab
Input some data and let the magic happen: the AI will return a prediction for every record you enter
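
If you prefer to paste JSON into the test form instead of typing values one by one, the input can be prepared as a small fields/values structure. The sketch below only builds and prints such a payload; it does not call any service. We assume the usual "input_data" layout of Watson Machine Learning online deployments, and the column names (title, description, has_company_logo) are illustrative assumptions that must be replaced by the columns of your own dataset. build_scoring_payload is a hypothetical helper of ours.

```python
import json


def build_scoring_payload(records):
    """Turn a list of dicts (one per job post) into a fields/values payload."""
    fields = list(records[0].keys())
    values = [[record[field] for field in fields] for record in records]
    return {"input_data": [{"fields": fields, "values": values}]}


# Hypothetical job post; the column names must match your own dataset.
post = {
    "title": "Data entry clerk",
    "description": "Work from home, pay a small registration fee to start.",
    "has_company_logo": 0,
}

payload = build_scoring_payload([post])
print(json.dumps(payload, indent=2))
```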

Conclusion

When we tested AutoAI, we were looking for answers to these questions:

Can an AI be of assistance to detect job scams?
Can AutoAI really predict what the recruiter’s intention was by analyzing the job posts?

Artificial intelligence has revolutionized many industries. Machines and technology have evolved to the point where, in addition to anticipating human decision-making, they can predict future events as well. We have all heard about how data science and artificial intelligence are used to increase revenue, create a better user experience and target users with customized advertisements. But little is said about beneficial applications of data science, such as the work on Kaggle that uses data science to predict poverty levels in order to identify where the need for social welfare assistance is highest. In Cali, Colombia, an open platform is available to the public that can predict where homicides are most likely to occur. A company in the USA has developed a system that can anticipate a driver’s actions before they happen. Only our imagination limits how we use data science and artificial intelligence for social impact. Whether to prevent accidents or crimes, AI can be a powerful tool.

We trained and tested the prediction model created by AutoAI, which we have named baby Sherlock. Do not forget, Sherlock needs a Watson, so click here to test the model while Watson is making magic in the background: https://sherlock.kisscloud.eu

After testing you will see that baby Sherlock is still in its infancy. When the amount of data available to train the model is limited, the quality of the model’s output suffers. As a result, it sometimes gives us false positives, or assigns a low probability to a job post that is actually fake. With an even bigger data set, imagine the possible outputs.
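
To put numbers on false positives and missed fakes, predictions can be compared with the known labels. The following Python sketch computes precision and recall for the "fake" class. The two label lists are toy values invented for illustration, not output from baby Sherlock, and the helper names are our own.

```python
def confusion_counts(actual, predicted):
    """Count outcomes for the positive class (1 = fake, 0 = real)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # fake, flagged
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # real, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # fake, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # real, passed
    return tp, fp, fn, tn


def precision_recall(actual, predicted):
    """Precision: share of flagged posts that are fake. Recall: share of fakes caught."""
    tp, fp, fn, _ = confusion_counts(actual, predicted)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A false positive lowers precision, while a fake post that slips through lowers recall, so the two numbers describe the two kinds of mistakes mentioned above.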

“The more the quantity and the better the quality of data, the more accurate and reliable the results.”

The data set that we used comes from Kaggle. It contains 18K job descriptions, of which about 800 are fake. The data consists of both textual information and meta-information about the jobs.
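
These proportions matter when reading any accuracy figure. A quick calculation, using only the rounded numbers quoted above (18K posts, about 800 fake), shows how rare the fake class is and how well a "model" would score if it simply called every post real:

```python
total_posts = 18_000  # "18K" from the text above, rounded
fake_posts = 800      # "about 800" from the text above, rounded

fake_share = fake_posts / total_posts
baseline_accuracy = (total_posts - fake_posts) / total_posts

print(f"Fake share: {fake_share:.1%}")
print(f"Accuracy of always predicting 'real': {baseline_accuracy:.1%}")
```

With these rounded figures, fewer than one post in twenty is fake, so a high accuracy on its own says little about how many scams are actually caught.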

The dataset is very valuable as it can be used for the following tasks:

Create a classification model that uses text data features and meta-features to predict which job descriptions are fraudulent and which are real.
Identify key traits/features (words, entities, phrases) of job descriptions which are fraudulent in nature.
Run a contextual embedding model to identify the most similar job descriptions.
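
As a rough illustration of the second task, words can be ranked by how much more often they appear in fake posts than in real ones. This is a deliberately simple Python sketch, not the method AutoAI uses: the example posts are invented, and fraud_leaning_words is a hypothetical helper of ours.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count in how many texts each lowercase word appears."""
    counts = Counter()
    for text in texts:
        counts.update(set(re.findall(r"[a-z']+", text.lower())))
    return counts


def fraud_leaning_words(fake_texts, real_texts, min_fake_docs=2):
    """Rank words by (share of fake posts containing them) minus (share of real posts)."""
    fake_counts = word_counts(fake_texts)
    real_counts = word_counts(real_texts)
    scored = []
    for word, n_fake in fake_counts.items():
        if n_fake < min_fake_docs:
            continue  # skip words seen in too few fake posts
        fake_rate = n_fake / len(fake_texts)
        real_rate = real_counts[word] / len(real_texts)
        scored.append((word, fake_rate - real_rate))
    # Highest difference first, ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented example posts, for illustration only.
fake = [
    "Earn money fast, no experience needed, pay a registration fee",
    "Work from home and earn money, send your bank details",
    "Earn cash fast, registration fee required",
]
real = [
    "Software engineer with Python experience needed",
    "Accountant needed, work from our Budapest office",
]

for word, score in fraud_leaning_words(fake, real):
    print(f"{word}: {score:.2f}")
```

On a real dataset you would want far more text, stop-word handling and a proper statistical measure, but the idea of contrasting the two classes stays the same.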

If big job boards exploited the potential of AutoAI and similar technologies, it would become really hard for fraudsters to post scams. That doesn’t mean there would be no more fake job ads, since the system is not infallible, but job scammers would have fewer opportunities to prey on those who are most at risk.