A step-by-step guide to a real-world implementation of a machine learning solution.
Do you remember the rush of excitement after leaving a conference talk or finishing an article about a Machine Learning (ML) or AI success story? How easy and straightforward it all seemed? And then, back at the office, trying to replicate that story: the frustration of your team's lack of time, the absence of quality — “Kaggle-grade” — data, the shortage of computing power at your disposal, or the daunting prospect of implementing all the best practices such as TDD, CI/CD, DevOps, MLOps, and Agile project management, let alone ethical AI?
This is the first post in a series about our journey: how we formalized the business problem, the challenges we faced, the technical details of our data pipelines, the ML techniques we used, our validation methods, and how we implemented some of the best practices (CI/CD, MLOps) to deliver a full-blown product to our colleagues. In this first post, we describe our first NLP project, applied to one important aspect of our core business: recruitment.
Because this experience covered many aspects of ML projects, from stating the problem to deploying the solution, and because it was conducted under the conditions most SMEs would face (time and resource constraints), we wanted to share it with a broader audience.
So let’s dive in!
Step 1: Problem Statement
To relate a journey that leads us somewhere, it is useful to describe where we come from. Qim is a consulting company operating in Switzerland (Geneva and Lausanne, and recently Zürich) and France. With about 300 employees, it ranks among the top 10 consulting firms in Romandy. Our clients send us requests for many different types of profiles: analysts, developers, data engineers, architects, DevOps engineers, project managers, …
As new technologies emerge and, more importantly, are adopted by companies and agencies, the need for professionals mastering them becomes more pressing. As a consulting firm, we need to hire the best profiles to successfully meet our customers’ expectations and needs. Therefore, our recruiters have the difficult responsibility to find the perfect match between customer demands and candidates’ skills and interests.
Given the types of profiles sought after, it takes very seasoned recruiters to understand all these CVs in detail and spot the right candidate. Recruiters often have to skim through many — sometimes hundreds of — CVs to filter out potential candidates. As you can imagine, this process is time-consuming and can at times be frustrating.
The team of Data Scientists and ML Engineers at Qim has strong experience supporting our customers in their data and analytics projects, and our newly created Center of Expertise enabled us to apply our NLP skills to this recruitment challenge.
Because good data science starts with a good understanding of the business needs, we began by working closely with the recruiters: for every job description they work on, recruiters must find relevant candidates.
Therefore, we expressed our recruiters’ need as follows:
“Recruiters want a system that takes a job description as input, and returns the relevant CVs from our database”
Step 2: Assess possibilities
Now that we have formulated the problem so that it clearly states “what needs to be done”, let's quickly check what we have at our disposal to solve it:
From Table 1, we observe that we have most of what is needed to conduct a successful AI project except Clean Data and Labeled Data. Even though a good data processing pipeline will help us get clean data (we will dedicate an article to the pipeline we implemented on AWS), the lack of labeled data is definitely an issue. Indeed, how are we supposed to train a machine learning system without valid examples?
The first, obvious, solution to this issue consists of manually selecting relevant candidates for a large and representative set of job descriptions, and recording those matches as labels. Although this option would solve the issue directly, it would require a lot of time and effort. We therefore considered a second option before going down that road: redefining the problem to use more accessible information, as described in the next section.
Step 3: Formalize the problem
Now that we know what needs to be done and have validated that we have the appropriate elements (commitment, skills, data, resources), let’s see how we can achieve our goal.
Because we lack proper Labeled Data, we adapted the original recruiters’ need as follows:
“Recruiters want a system that takes a job description as input, and orders the CVs from our database by relevance”
In doing so, we preserve most of the value for our recruiters, who will still be able to find relevant candidates more quickly, while reducing the overall effort needed to build a Labeled Dataset. Indeed, we can now formalize our problem as a sorting (ranking) task, which implies that we need some way to evaluate the similarity between two documents (a job description and a CV).
Among many options, a very popular method for evaluating document similarity relies on word embeddings: representations of the meaning of words learned directly from their distributions in texts. These representations are used in every natural language processing application that makes use of meaning (Jurafsky & Martin, 2020). In a future post, we'll describe the process of designing, training, and evaluating document sorting methods relying on word embeddings in a controlled setup. For the time being, let's assume that we have a method to compute the similarity between two documents. How can we make sure that this measure actually helps solve our problem?
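To make the idea concrete, here is a minimal sketch of one such similarity measure in Python, assuming a pre-trained word-vector model loaded with gensim; the file name vectors.bin and the naive whitespace tokenization are placeholders, not our actual setup.

```python
# A toy embedding-based similarity: average the word vectors of a document,
# then compare two documents with cosine similarity.
import numpy as np
from gensim.models import KeyedVectors

# Hypothetical pre-trained vectors; our real models are covered in a later post.
word_vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

def embed(document: str) -> np.ndarray:
    """Represent a document as the average of its word embeddings."""
    tokens = [t for t in document.lower().split() if t in word_vectors]
    if not tokens:
        return np.zeros(word_vectors.vector_size)
    return np.mean([word_vectors[t] for t in tokens], axis=0)

def similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity between two document embeddings."""
    a, b = embed(doc_a), embed(doc_b)
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0
```

Given such a measure, sorting the CV database for a job description is a one-liner: `sorted(cvs, key=lambda cv: similarity(job_description, cv), reverse=True)`.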
Step 4: Train and evaluate models
We still lack a proper way to validate that we correctly sort our entire CV database given a job description, and again, producing such a dataset directly would require a large amount of time and effort.
We started looking at some of the CVs we had access to, and because Qim is active in the IT space, most candidates could be associated with one or more of these job roles: front-end developer, back-end developer, architect, project manager, analyst, security professional, system administrator, support staff, or data professional. Because job roles are very relevant to recruiters when considering a candidate, we used them as labels to evaluate our models. In essence, two documents sharing a common label are deemed similar.
Although we now have a simple way of creating Labeled Data and evaluating models, these remain proxy values, and proper evaluations on actual job descriptions must still be conducted. However, this requires the help of our recruiters, whose time is better spent finding candidates for actual job descriptions than evaluating models, which is why only the best model according to our proxy benchmark was evaluated with their help.
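To illustrate how such a proxy benchmark can work, here is a hedged sketch that scores a ranking with a precision@k-style metric, counting a CV as relevant when it shares at least one job role with the query document; the function names and the choice of metric are illustrative assumptions, not our exact benchmark.

```python
# Proxy evaluation: two documents are deemed "similar" when they share a
# job-role label, so we measure how many of the k best-ranked CVs share a
# role with the query job description.
from typing import Callable

def precision_at_k(
    query_text: str,
    query_roles: set[str],
    cvs: list[tuple[str, set[str]]],            # (cv_text, cv_roles) pairs
    similarity: Callable[[str, str], float],    # e.g. the measure sketched above
    k: int = 10,
) -> float:
    """Fraction of the k most similar CVs sharing a role with the query."""
    if not cvs:
        return 0.0
    ranked = sorted(cvs, key=lambda cv: similarity(query_text, cv[0]), reverse=True)
    hits = sum(1 for _, roles in ranked[:k] if roles & query_roles)
    return hits / min(k, len(ranked))
```

Averaging this score over a held-out set of labeled documents gives a cheap, repeatable way to compare candidate models before involving the recruiters.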
To conduct this real-world evaluation, we designed and developed a minimal UI that enabled participating recruiters to use our model for their actual searches and to provide feedback about the results.
Step 5: Integrate the model in the workflow
There are many ways to integrate a model into existing workflows, and many considerations drive the decision. In our case, because the tool used by Qim's recruiters is developed by Qim, we have more freedom when it comes to adapting the workflow to best integrate our new model.
For example, recruiters add a new job description to the system before starting their search for candidates, so that interesting CVs can be linked to that job description. In this case, we could call the model as soon as the job description's PDF is uploaded, providing recruiters with results ordered by relevance during their search, as sketched below. With additional modifications to the existing UI, relevant candidates could even be shown as soon as the model's results are available, saving the recruiter even more time!
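As a rough illustration of that integration point (not our production code), the sketch below shows a hypothetical upload hook: extract the text from the uploaded PDF, rank the CV database by similarity, and return the ordered ids for the recruiter's search screen. The pypdf-based extraction and the function signatures are assumptions for the sake of the example.

```python
# Hypothetical hook called by the recruitment tool when a job description
# PDF is uploaded: extract its text, then order the CVs by relevance.
from typing import Callable
from pypdf import PdfReader

def extract_text(pdf_path: str) -> str:
    """Naive PDF-to-text step; our real pipeline is covered in a later post."""
    return "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)

def on_job_description_uploaded(
    pdf_path: str,
    cvs: list[tuple[str, str]],                  # (cv_id, cv_text) pairs
    similarity: Callable[[str, str], float],     # e.g. the measure sketched earlier
) -> list[str]:
    """Return CV ids ordered from most to least relevant to the upload."""
    job_text = extract_text(pdf_path)
    ranked = sorted(cvs, key=lambda cv: similarity(job_text, cv[1]), reverse=True)
    return [cv_id for cv_id, _ in ranked]
```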
This kind of integration is very different from the minimal UI discussed in the previous section, because that interface was made to collect feedback, not to streamline the job. However, it is a good intermediate step for evaluating whether the model is worth the integration development costs.
When integrating the model, it is useful to plan for its monitoring and to prepare for its retraining. Even though this significantly increases integration costs, planning for issues such as new job roles, techniques, and frameworks, and for future improvements such as accepting CVs in German or Italian in addition to French and English, will lower maintenance and model improvement costs down the road. We will present the techniques and tools we used to implement this MLOps cycle in a future post.
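To give an idea of what such monitoring could look like, here is a minimal sketch that records the top similarity score of every search and flags the model for review when recent scores drift well below a deployment-time baseline, which could hint at new job roles or technologies the model does not cover. The window size and drop threshold are illustrative assumptions.

```python
# Toy drift monitor: compare a rolling window of recent top scores against
# a baseline gathered at deployment time, and flag when they sag.
from collections import deque
from statistics import mean

class ScoreMonitor:
    def __init__(self, baseline: list[float], window: int = 200, drop_ratio: float = 0.8):
        self.baseline = baseline                 # top scores observed at deployment
        self.recent: deque[float] = deque(maxlen=window)
        self.drop_ratio = drop_ratio

    def record(self, top_score: float) -> None:
        """Log the best similarity score of one search."""
        self.recent.append(top_score)

    def needs_review(self) -> bool:
        """True when the recent average falls well below the baseline average."""
        if not self.baseline or len(self.recent) < self.recent.maxlen:
            return False
        return mean(self.recent) < self.drop_ratio * mean(self.baseline)
```

A flag from such a monitor would not trigger retraining automatically; it would simply tell us it is time to look at the recent searches and decide.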
Conclusions
In this article, we briefly presented our 5-step approach applied to our business problem. We exposed some of the considerations that shaped our MLOps framework. If we had to map those 5 steps onto the traditional MLOps cycle, it would look as follows:

(Figure: our 5 steps mapped onto the traditional MLOps cycle.)
This introductory post will be followed by:
- an in-depth description of our PDF processing pipeline
- the algorithm and evaluation techniques that we used
- the MLOps tools used to implement the cycle.
We hope that you’ll appreciate those posts too!