How to screen machine learning skills
Do you need to hire someone with machine learning skills? Not sure what it really is?
Machine learning is the process of enabling computers to perform tasks that, until recently, were carried out exclusively by humans.
Before machine learning became practical, software and computer systems knew only what a programmer told them. The result was software that could not innovate and had to be given explicit commands to function.
Machine learning allows organizations to transform large data sets into statistical knowledge and actionable intelligence. This valuable knowledge can be integrated into everyday business processes and operational activities to respond to changing market demands or business circumstances. Aside from automating repeatable tasks, companies globally use machine learning to help improve their businesses’ operations and scalability.
Because machines can process far more data than humans, they can organize and scan data far more quickly than any person can. This makes software not only more useful but also more effective.
This is super relevant to a hiring manager without a strong technical background. It’s their role to decide if a candidate has the right machine learning skills that are necessary to be successful. So let’s delve a little deeper into machine learning and the best ways to screen a machine learning expert.
1. What is machine learning?
Machine learning is a subset of AI. That is, all machine learning counts as AI, but not all AI counts as machine learning.
Machine learning algorithms use statistics to find patterns in data, usually in large amounts of it. Data, in this instance, encompasses a wide range of things: numbers, words, images, clicks, anything that can be processed by a computer. Basically, if it can be digitally stored, it can be fed into a machine learning algorithm.
Machine learning is essentially a form of ‘self-programming’. Machine learning algorithms automatically build a mathematical model from sample data, known as “training data”, and use that model to make decisions. A machine learning model is a program that has been trained to recognize certain types of patterns. You train a model on a set of data using an algorithm that lets it reason over and learn from that data. The resulting decisions are made without being explicitly programmed by a human, and voilà: artificial intelligence at your fingertips.
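For readers who want to see what “building a model from training data” looks like, here is a minimal sketch in plain Python. It fits a straight line to a handful of points by ordinary least squares, one of the simplest learning algorithms, and then uses the fitted line to predict a value it has never seen. The data and the names `fit_line` and `predict` are made up for illustration.

```python
def fit_line(xs, ys):
    """Learn a slope and an intercept from training data (ordinary least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: five inputs and the outputs observed for them.
inputs = [1, 2, 3, 4, 5]
outputs = [35, 40, 45, 50, 55]

model = fit_line(inputs, outputs)
print(predict(model, 6))  # → 60.0
```

Nobody wrote a rule saying “add 5 for every step”; the program derived it from the examples, which is the sense in which the text calls machine learning self-programming.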
1.1. What is AI?
Artificial intelligence is the concept of computer systems performing tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
In artificial intelligence, machines mimic cognitive functions that are associated with human minds, such as ‘learning’ and ‘problem-solving’.
1.2. What is machine learning used for?
We use the power of machine learning for a variety of modern-day services: recommendation systems like those on Netflix, YouTube, and Spotify; search engines like Google and Baidu; social-media feeds like Facebook and Twitter; and voice assistants like Siri and Alexa. The list is endless.
While you use these services, each platform collects as much data about you as possible: for example, which genres you like watching, which links you click, and which statuses you react to. This data is then fed to algorithms that make calculated inferences about what you might want next. The process is quite basic: find the pattern, apply the pattern. Yet it is ubiquitous in almost all technology we access today.
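The “find the pattern, apply the pattern” idea can be sketched in a few lines of Python. This is a toy illustration only, not how any of the platforms named above actually work: it scores titles a user has not seen by how much that user’s history overlaps with other users’ histories. The user names, titles, and the `recommend` function are all hypothetical.

```python
from collections import Counter

# Hypothetical viewing histories: user -> set of titles watched.
histories = {
    "ana": {"Drama A", "Drama B", "Thriller C"},
    "ben": {"Drama A", "Drama B", "Comedy D"},
    "cy": {"Drama A", "Thriller C"},
    "dee": {"Comedy D", "Comedy E"},
}


def recommend(user, histories, top_n=2):
    """Suggest unseen titles watched by users with overlapping histories."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # find the pattern: shared taste
        for title in titles - seen:   # apply the pattern: score unseen titles
            scores[title] += overlap
    return [title for title, score in scores.most_common(top_n) if score > 0]


print(recommend("cy", histories))  # → ['Drama B', 'Comedy D']
```

Real recommendation systems use far richer signals and models, but the two steps are the same.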
Other uses for machine learning include making predictions (e.g. future user purchase behavior, credit risk, fluctuations in the housing market), detecting anomalies (e.g. when wire fraud is committed or factory equipment is close to failure), and generating new content (e.g. translating text into a foreign language, finding the best route to a location, guiding a robot that automatically cleans surfaces).
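As a rough illustration of anomaly detection, the sketch below flags values that sit unusually far from the average, measured in standard deviations (a z-score). This is one simple statistical technique chosen for illustration; production fraud or equipment monitoring would typically use more robust methods. The transfer amounts, the threshold, and the `find_anomalies` name are assumptions.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds `threshold` in absolute value."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical daily transfer amounts; one is far outside the usual range.
transfers = [120, 95, 130, 110, 105, 98, 125, 5000, 115, 102]
print(find_anomalies(transfers))  # → [5000]
```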
1.3. What is the function of a machine learning engineer?
Someone with machine learning skills is usually referred to as a machine learning engineer. The role is quite new, although the term ‘machine learning’ was first coined in 1959 by Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence.
A machine learning engineer is primarily responsible for building, developing, and maintaining a business’s machine learning models.
The role encompasses selecting the right machine learning method for the company as well as the preferred method of model evaluation. The engineer is also responsible for quality control and for overseeing deployment to production. After deployment, the ML engineer monitors and adjusts the model as the market situation changes. Their responsibilities include:
- Running machine learning experiments using a programming language with machine learning libraries,
- Deploying machine learning solutions into production,
- Optimizing solutions for performance and scalability,
- Data engineering, i.e. ensuring a good data flow between database and backend systems,
- Implementing custom machine learning code,
- Data analysis.
1.4. Are positions in machine learning similar to any other jobs?
A machine learning engineer role is a specialized position, similar to a data scientist – but a data scientist is trained to perform more varied tasks.
There’s overlap between the two, and data scientists with software engineering backgrounds often move into machine learning engineer roles. Data scientists focus on analyzing data, providing business insights, and prototyping models, while machine learning engineers focus on coding and deploying complex, large-scale machine learning products.
2. What is important for an IT Recruiter to know about machine learning?
The implementation of machine learning essentially means a system is no longer limited by its programmers’ vision. A machine is now able to learn its own methods through new and innovative processes that programmers or analysts may not have even considered.
This is very useful because it allows programmers to create software with a specific goal in mind, without having to spell out every step of how that goal is reached.
Programming computers by hand to interpret such vast amounts of information has become challenging for even the best programmers. Machine learning allows for the creation of methodologies beyond human planning and foresight.
2.1. How often does the environment/challenges faced change?
The landscape of machine learning changes constantly. Data sets keep getting bigger and problems keep getting harder, so new techniques are developed and new frameworks follow.
2.2. Are there many resources/tools/technologies (libraries, frameworks, etc.) available for machine learning?
A lot of tools for machine learning are available in Python, while R is less common. Some deep learning frameworks are available in C++ or Java, because these languages are faster and more memory-efficient than Python. In Python, the most popular libraries include pandas, scikit-learn, PyTorch, and TensorFlow.
2.3. What machine learning skills, tools, and techniques should an engineer be familiar with?
A successful machine learning engineer should have a strong mathematical mind. They must also be proficient in both programming and statistics, and use their problem-solving skills to build in-depth knowledge of machine learning models. Python is the undisputed lingua franca of machine learning.
2.4. What AI skills, tools, and techniques should a machine learning engineer be familiar with?
A machine learning engineer needs a good understanding of programming languages, preferably Python, R, Java, and C++. A good understanding of matrices, vectors, and matrix multiplication is also recommended. Moreover, knowledge of derivatives and integrals and their applications is essential for understanding even simple concepts like gradient descent. A solid foundation in algorithm theory is a must.
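To show why derivatives matter for gradient descent, here is a minimal Python sketch. It minimizes the function f(w) = (w - 3)^2 by repeatedly stepping in the direction opposite to the derivative. The function, starting point, learning rate, and the names `gradient_descent` and `grad_f` are chosen purely for illustration; real models apply the same idea to many parameters at once.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# f(w) = (w - 3)^2 has derivative f'(w) = 2 * (w - 3) and its minimum at w = 3.
def grad_f(w):
    return 2 * (w - 3)


w_min = gradient_descent(grad_f, start=0.0)
print(round(w_min, 4))  # → 3.0
```

A candidate who cannot explain where `grad_f` comes from will struggle with the training procedures used by most machine learning libraries.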
Experience with neural network architectures is also valuable, since neural networks are the most effective way of tackling problems such as translation, speech recognition, and image classification, which play a pivotal role in AI.
Good communication and rapid prototyping skills, as well as wide domain knowledge, are essential for a machine learning engineer.
2.5. What type of experience is important to look for in a machine learning engineer?
For research-only projects, academic or scientific experience is the most important. For building production models, previous experience with taking other models to production is the best indicator.
3. How to verify machine learning skills in the screening phase?
Most recruiters prioritize skills testing when looking for the ideal candidate. Ultimately, hiring someone who lacks technical skills can be a costly mistake. However, successful machine learning engineers also have valuable traits that a skill test alone cannot identify. A lot of these you can’t learn from a book.
So, what are they and how do you identify them?
Also, quite ironically, firms and recruiters are increasingly turning to AI and machine learning-based solutions to find the right hire.
3.1. What to take into account when screening a CV?
Machine learning engineers should be fluent in mathematical and statistical concepts, including linear algebra, multivariate calculus, derivatives, integrals, variance, and standard deviation.
They must also know the basic concepts of probability, such as Bayes’ rule, Gaussian mixture models, and Markov decision processes. Previous experience with machine learning libraries is a must.
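For recruiters unfamiliar with Bayes’ rule, the short Python sketch below works through one example: given that a detector has flagged a transaction, how likely is it that the transaction is really fraudulent? All the rates are invented for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' rule for a yes/no hypothesis H and evidence E."""
    # P(E) = P(E | H) * P(H) + P(E | not H) * P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical detector: 1% of transactions are fraudulent, the detector flags
# 90% of fraud, and it wrongly flags 5% of legitimate transactions.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(p, 3))  # → 0.154
```

Even with a seemingly accurate detector, only about 15% of flagged transactions are fraudulent under these assumptions, because fraud is rare to begin with. A candidate who is comfortable with probability should be able to explain this result.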
The candidate should have a computer science or software engineering background and be fluent in at least one programming language, with sufficient coding experience, claims Tsisana Caryn, HR specialist from Assignment Writing Services. It’s vital to have an in-depth understanding of computer science concepts like data structures, computer architecture, algorithms, computability, and complexity.
Be sure to check whether the candidate has decent business acumen and a well-rounded understanding of business fundamentals and principles. A candidate who can quantify their achievements within an organization has a big advantage.
3.2. What glossary terms are important to know in machine learning (including frameworks, libraries, and language versions)?
- Classical machine learning – solving tasks using models like linear or logistic regression, decision trees, random forests, boosting, support vector machines, non-negative matrix factorization, K-means, k-nearest neighbors.
- Neural network – a kind of machine learning inspired by the workings of the human brain. It’s a computing system made up of interconnected units (like neurons) that processes information by responding to external inputs, relaying information between each unit. The process requires multiple passes at the data to find connections and derive meaning from undefined data.
- Deep learning – solving tasks using neural networks, which loosely mimic the brain. Some types of neural networks include convolutional neural networks and recurrent neural networks. Deep learning is used for detecting objects, recognizing speech, translating languages, and making decisions. Deep learning models can also learn without human supervision, drawing from data that is both unstructured and unlabeled.
Data manipulation libraries | In Python: NumPy, pandas; in R: dplyr, tidyr |
Distributed data manipulation libraries | In Python: Dask; in Scala, Java, and Python: Spark |
General machine learning libraries | In Python: scikit-learn; in Python, R, Java, Scala, C++: H2O.ai; in R: caret, e1071 |
Deep learning libraries | In Python: Keras, TensorFlow, PyTorch; in R: nnet; in C++: Caffe |
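To make one of the classical models from the glossary concrete, here is a minimal sketch of k-nearest neighbors in plain Python: a new point gets the label that is most common among the k training points closest to it. In practice an engineer would normally reach for a library such as scikit-learn; this hand-written version, its data, and the `knn_predict` name are for illustration only.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # `train` is a list of (features, label) pairs; features are numeric tuples.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D training points belonging to two classes.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]
print(knn_predict(train, (1.1, 1.0)))  # → A
print(knn_predict(train, (5.1, 5.0)))  # → B
```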
3.3. Which certifications are available and respected? How useful are they in determining machine learning skills?
Much is said about certificates being of little importance to recruiters. Even so, a certification shows that the candidate knows the subject at a high level and is motivated to keep learning. Plus, engineers are able to add the project work to their portfolio. Some respected courses include:
- Machine Learning Certification by Stanford University (Coursera)
- Artificial Intelligence (Northwestern | Kellogg School of Management)
- Machine Learning with TensorFlow on Google Cloud Platform
- Artificial Intelligence: Business Strategies & Applications (Berkeley ExecEd)
- Deep Learning Certification by DeepLearning.ai – Andrew Ng (Coursera)
- Machine Learning Data Science Certification from Harvard University (edX)
- Machine Learning – Data Science Certification from IBM (Coursera)
- Professional Certificate Program in Machine Learning & Artificial Intelligence (MIT Professional Education)
- Machine Learning Certification (University of Washington)
3.4. What other lines on a CV can show machine learning skills?
Taking part in machine learning competitions can also be a great advantage. Platforms such as Kaggle.com, topcoder.com, crowdai.org, and knowledgepit.ml all offer the chance to compete for awards in the space.
Browsing a candidate’s LinkedIn and GitHub accounts can be useful to get an overall picture of the candidate, as well as to see their proficiency on open-source projects.
4. Technical screening of machine learning skills during a phone/video technical interview
Those applying for machine learning jobs can expect a number of different types of questions during an interview, says Colin Shaw, director of machine learning at RevUnit.
“Good machine learning engineers have a blend of a variety of skills and also know how to fuse this knowledge into code that can be taken to production. The general areas of interest that we look for include mathematics and statistics, machine learning and data science, deep learning, general knowledge and problem solving, and computer science and programming.”
4.1. Questions that you should ask about an MLE’s experience. Why should you ask each of these questions?
- Can you describe the kind of machine learning problems you have solved?
This is a warm-up, introductory question, but it also shows the extent of the candidate’s knowledge in the field. As there is a wide range of problems, it’s best to find people who have experience with the issues you’re recruiting for.
- What kind of machine learning models have you used in the past?
Aimed at finding out the extent of the engineer’s knowledge of specific ML techniques. There is a substantial difference between classical ML algorithms and deep learning algorithms, so knowledge of one doesn’t imply knowledge of the other.
- What’s the most interesting project you’ve ever worked on?
This is a good question because it gives candidates a chance to talk about something they are passionate about and show off their knowledge of something they know very well. Plus, it helps nervous candidates feel more comfortable and showcases their best qualities.
- How long was the project? Have you taken it into production and/or developed the model further?
Designed to check whether the engineer has previous experience with productionizing machine learning models, which brings a specific set of challenges that would otherwise be unknown to them.
4.2. Questions that you should ask about an MLE’s knowledge and opinions. Why should you ask each of these questions?
- How do you verify that a model works correctly?
The ideal methodology is to split the dataset into three parts: a training set, a validation set, and a test set. The training set is the only one the model sees during training and is the basis of the training process. The model’s hyperparameters are tuned using the validation set, and the model’s performance is measured on the test set.
- What are the differences between classical ML models and deep learning models?
Deep learning models always use neural networks and do not require as much feature engineering as classical models. However, they usually require larger training sets to learn patterns than classical models.
- What ML library/libraries would you use for a dataset consisting of images?
Currently, the best approach for image data is to use OpenCV, a library that allows for extensive image manipulation, as well as a deep learning library such as Keras, TensorFlow, PyTorch, or Caffe.
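The three-way split described in the first answer above can be sketched in plain Python as follows. The 70/15/15 proportions, the fixed seed, and the `three_way_split` name are assumptions made for illustration; the right proportions depend on the size of the dataset, and libraries such as scikit-learn provide their own splitting helpers.

```python
import random


def three_way_split(rows, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle rows and split them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


rows = list(range(100))  # stand-in for 100 labeled examples
train, val, test = three_way_split(rows)
print(len(train), len(val), len(test))  # → 70 15 15
```

A good answer from a candidate will stress that the test set is touched only once, at the very end, so that the reported performance is not inflated.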
4.3. Behavioral questions that you should ask an MLE. Why should you ask each of those questions?
- What kind of problems would you like to solve in the future? What kinds of ML models would you like to use?
A question to check the candidate’s preference for models or problems, or to see whether they have a specialization and in which area they might perform best. This question can also help reveal how a candidate plans to develop in the machine learning field.
- Where do you find information about new machine learning techniques?
This question is asked to find out how involved a candidate is in the technology community and in learning new skills in a constantly evolving field. Any of these sources is worthy: conference papers, workshop papers, MOOCs, Facebook or mail groups with a machine learning theme, or even learning from a mentor.
- What do you consider to be your greatest success and biggest failure in the machine learning field?
A pretty generic question, but it shows the candidate’s capacity for self-reflection. This is necessary in the learning process, which is a major part of being a great machine learning engineer.
5. Technical screening of an MLE’s skills using an online coding test
Hiring a good machine learning engineer remains a challenging task for recruiters – not only because of the scarcity of ML talent but also due to a lack of relevant experience among recruiting specialists. Machine learning remains a new and obscure field for most recruiters. We’re going to show you how best to screen for a machine learning engineer!
5.1. Which online test for machine learning skills should you choose?
When looking for the right machine learning skills test, make sure it meets the following criteria:
- The test reflects the quality of the professional work being done.
- The duration is not excessive: one to two hours at most.
- The test can be sent automatically and is straightforward in nature.
- The difficulty level matches the candidate’s abilities.
- The test goes beyond checking whether the solution works: it checks the quality of the code and how well it works in edge cases.
- It is as close as possible to the natural programming environment and allows the candidate to access relevant resources.
- It gives the candidate the opportunity to use all the libraries, frameworks, and other tools they regularly encounter.
5.2. DevSkiller ready-to-use online machine learning skills tests
DevSkiller coding tests use our RealLifeTesting™ methodology to mirror the actual coding environment that your candidate works in. Rather than using obscure academic algorithms, DevSkiller tests require candidates to build applications or features. They are graded completely automatically and can be taken anywhere in the world. At the same time, the candidate has access to all of the resources that they would normally use including libraries, frameworks, StackOverflow, and even Google.
Companies use DevSkiller to test candidates using their own codebase from anywhere in the world. To make things easier, DevSkiller also offers a number of ready-made data science skills tests, like the ones here:
- Duration: 104 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Keras, Machine Learning, Python
Programming task - Level: Medium
Python | NLP, Keras | Customer reviews sentiment analysis - Perform sentiment analysis and labeling of customer reviews of movies and airlines, using a multi-output neural network model.
- Duration: 72 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, Reinforcement Learning
Programming task - Level: Medium
Python | PyTorch | Reinforcement Learning | Deep Q-Network - Complete the implementation of the DQN algorithm.
- Duration: 63 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, PyTorch
Programming task - Level: Easy
Python | PyTorch, Computer Vision | Model Builder - Complete the implementation of a model training pipeline.
- Duration: 70 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, Python
Programming task - Level: Medium
Python | DNA Analyzer | Create and clean DNA strands - Implement 2 methods in Python to create and clean DNA strands.
- Duration: 49 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning
Programming task - Level: Easy
Python | DNA Analyzer - Implement a method in Python that generates statistical reports on DNA.
- Duration: 80 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, Python
Programming task - Level: Medium
Python | DNA Analyzer | Create and clean DNA strands - Implement 2 methods in Python to create and clean DNA strands.
- Duration: 80 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, Python
Programming task - Level: Medium
Python data extraction and processing - Complete and update the program code that extracts PDF files and converts them into a specific format for display/output.
- Duration: 102 minutes max.
- Evaluation: Automatic
- Test overview:
Choice questions assessing knowledge of Machine Learning, Android
Programming task - Level: Medium
Android | Social network login - Implement the missing sections of LoginActivity and MainActivity, LoginManager, and CredentialsStorage.