Advice for people wanting to learn ML

First things first: consider alternatives

Machine Learning is a tremendously exciting field, but at the moment there is a slight oversupply of junior-level Machine Learning Engineers and Data Scientists. There is, however, a huge shortage of Data Engineers and Python Software Developers. If you care more about your career path, opportunities, and salary, you can also choose NOT to work in ML. If what you care about most is data and statistics, you should still choose ML.

For a combo of both, I’d recommend becoming a Full Stack Developer/Data Engineer/Python Dev who has a good understanding of the basics of ML, can easily work with ML Engineers, and can talk about embeddings, data splitting, model versioning, training vs. inference, feature stores, …

Other people seem to agree:

Is it still a good time to get into ML? I believe that the AI hype is real and at some point, it has to calm down. That point might have already happened. However, I don’t believe that ML will disappear. There might be fewer companies that can afford to do ML research, but there will be no shortage of companies that need tooling to bring ML into their production.

If you have to choose between engineering and ML, choose engineering. It’s easier for great engineers to pick up ML knowledge, but it’s a lot harder for ML experts to become great engineers. If you become an engineer who builds great tools for ML, I’d forever be in your debt.

Source
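To make two of the terms mentioned above concrete, here is a minimal sketch of data splitting and of the difference between training and inference. It uses only the Python standard library. The `split_dataset` helper, the `MeanBaseline` toy model, the toy data, and the 70/15/15 split are all assumptions made for illustration, not a recommended setup.

```python
import random
from statistics import mean


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is the test split
    return train, val, test


class MeanBaseline:
    """Toy model (hypothetical): predicts the mean target seen during training."""

    def fit(self, targets):
        # Training: the only parameter is estimated from the training split.
        self.mean_ = mean(targets)
        return self

    def predict(self, n):
        # Inference: the parameter is frozen and simply reused.
        return [self.mean_] * n


rows = [(x, 2.0 * x) for x in range(20)]  # hypothetical (feature, target) pairs
train, val, test = split_dataset(rows)
model = MeanBaseline().fit([y for _, y in train])
predictions = model.predict(len(test))
```

In a real project you would reach for a library such as scikit-learn instead, but the separation stays the same: fit on the training split only, then predict on data the model has not seen.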

If you still want to proceed, start with getting general exposure

This section works as a stand-alone resource for business people or beginners. It is mostly non-technical, or at most intro-level technical.

I would recommend the following assets to people who want to gain a business-level understanding of Machine Learning. They provide enough exposure to recognize most trends and common terminology without diving into technical details or requiring any programming.

General introduction to Artificial Intelligence

100-slide self-explanatory presentation by Jason Mayes (2018)
MIT online course - Artificial Intelligence: Implications for Business Strategy
Coursera online course - AI for Everyone by Andrew Ng
Udemy online course - Machine Learning A-Z (Hands-on Python and R). This is obviously a bit technical.

General introduction to Natural Language Processing using Transformers

Financial Times’ intro to Generative AI
Visual Explainer: Attention
The Illustrated Transformer
Code explanations of multi-headed attention, positional encoding
Let’s build GPT: from scratch, in code, spelled out. by the great Andrej Karpathy
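As a taste of what the positional-encoding explanations above cover, here is a minimal sketch of the sinusoidal scheme from the Attention Is All You Need paper, in which each pair of dimensions (2k, 2k+1) holds the sine and cosine of pos / 10000^(2k/d_model). The function name and the toy sizes are assumptions for illustration, and real implementations are vectorized rather than written as nested loops.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


# Hypothetical toy sizes: 4 positions, 8-dimensional embeddings.
encoding = sinusoidal_positional_encoding(seq_len=4, d_model=8)
```

Each row is added to the token embedding at that position, which gives the otherwise order-blind attention layers a signal about where each token sits in the sequence.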

After that, become an ML Engineer

Overview

This 4-year course roadmap gives a good overview of all the fields you could learn. That doesn’t mean you have to learn all of them, but you should know the basics of all of them and dive deep into your favorites.
The ML Engineer 2020 Roadmap takes a more skill-tree-based approach.
Teach yourself programming in 10 years is widely acclaimed advice on how to become a software engineer.

Courses

The first thing you should do is learn Python or be proficient in another high-level programming language. Without that, I believe you can’t start.

Next, depending on your preferred learning style, I would recommend the following.

If you like to learn at your own pace:
- First, start with the DataCamp ML Scientist with Python and Data Engineering with Python tracks.
- Subsequently, I’d recommend Machine Learning and Deep Learning by Andrew Ng.
If you like to learn by doing:
- Get the fundamental theory in first.
- Kaggle competitions
- Reimplement papers
- Side projects (if you need ideas, ping me)
- Unpaid internships
If you like to learn with others:
- (Belgium only) I would recommend the BeCode AI Bootcamp, a great initiative that Faktion has supported financially and educationally since its inception. Faktion has hired BeCode graduates in the past too.

Note that I haven’t taken any of these courses myself; these recommendations are based on second-hand accounts.

Essential skills for a Junior ML Engineer

When we interview a Junior ML Engineer, here is what we expect them to know:

Strong and deep fundamentals of the theory around Machine Learning and Deep Learning
Programming in the Python stack (pandas, numpy, tensorflow, keras, scikit-learn, typing, …)
Software Engineering best practices like writing clean code, using git, documentation, object-oriented and functional programming, APIs, …
Standard software engineering tools such as an IDE (VSCode / PyCharm), git, bash, RegEx, …
Basics of working with the command line
- Linux CLI for Data Science
- Game format (fun for all levels!)

Some skills are not expected but could be a nice bonus. They can be learned on the job too.

Sensor data: AHRS, Kalman filters, particle filters, etc.
Cloud platforms like Microsoft Azure, Google Cloud Platform, and Amazon Web Services (AWS)
Docker, Kubernetes, TF Serving, continuous integration, microservices architecture, …
Open source contributions

Finally, specialize!

Once you understand the basics and have gained broad knowledge through exposure to multiple fields, you should naturally spend time gaining deep knowledge in one of them too.

Natural Language Processing

Personally, I find this the most interesting field in Artificial Intelligence at the moment.

Start by learning sequence-based deep learning for NLP:
- The Stanford NLP with Deep Learning YouTube course
- DeepLearning.ai NLP by Andrew Ng on Coursera
Next, state-of-the-art NLP typically relies very heavily on pre-training with Transformers. Transformers are crucial to understand deeply, and I’d recommend the following assets to learn about them:
- Deep dive into transformers is a very visual and detailed explanation
- The Attention Is All You Need paper is one of the most important papers in the field.
- Intro to BERT is a 25-minute lecture.
- Overview of current knowledge about Transformers (November 2020). This includes sections on compression (distillation), pretraining tasks, and tips for fine-tuning.
For a hands-on explanation of how to build state-of-the-art classification and Natural Language Generation models, I’d recommend the Hugging Face blog, and specifically Leveraging Pre-trained Language Model Checkpoints for Encoder-Decoder Models.
To specialize even further in the field, I would recommend
- ABigSurvey contains overview papers per application. It is not always up to date, and some of the survey papers are quite old.
- paperswithcode is similar but goes one level deeper: it contains individual papers per application and is up to date.
- NLP Progress compares benchmarks of many different tasks.
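
Since the Transformer resources above all revolve around attention, a tiny worked example may help before diving into them. The following is a minimal, standard-library-only sketch of single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, as defined in the Attention Is All You Need paper. The function names and the toy token vectors are assumptions for illustration. Real models add learned projections, multiple heads, and masking, and run on tensors rather than Python lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of float vectors (no masking)."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical toy embeddings
# Self-attention: the same vectors act as queries, keys, and values.
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each output row is a mixture of all value vectors, weighted by how strongly the corresponding query matches each key.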