Getting started in ML

I never got a CS degree or PhD, and I’ve worked as an ML applied scientist at a FAANG company for 6 years. I posted about that on X recently and a lot of people had questions about how to get started. This post is all my best advice for starting out in ML.

Classes/Resources to follow

Getting hired for an entry-level ML job only takes 2 graduate-level classes in addition to what a normal software engineering job requires:

  1. Intro to ML
  2. Deep Learning with focus on a specific area
    • For a focus in NLP, Stanford CS 224n. Lectures are on YouTube, and the class page has homeworks and previous years’ materials.
    • For a focus in computer vision, Stanford CS 231n. Lectures are on YouTube, and the class page has homeworks and previous years’ materials.

If you prefer textbooks, Deep Learning by Goodfellow, Bengio, and Courville is great (and it’s free online). It covers intro to ML and some deep learning, but doesn’t have anything after 2016, so you’ll need to learn modern architectures somewhere else.

You’ll also need the basics of linear algebra and calculus if you haven’t seen them already. You don’t need everything in those classes at first; being able to understand the multivariate chain rule will get you most of what you need. The classes listed above can be difficult at first. If you want an easier version, you can do the Coursera Machine Learning Specialization and Deep Learning Specialization, although I found those to be too easy: it was possible to pass the class without really getting it.
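If you want a quick self-check on the multivariate chain rule, here is a minimal sketch. The function f(u, v) = u * v^2 with u = x*y and v = x + y is just a toy I picked for illustration; it isn’t from any of the classes. The idea is to work out the gradient by hand with the chain rule, then confirm it against a finite-difference estimate.

```python
import math


def f(x, y):
    # Toy composite function: f = u * v**2 with u = x*y and v = x + y.
    u = x * y
    v = x + y
    return u * v ** 2


def chain_rule_grad(x, y):
    # Multivariate chain rule:
    #   df/dx = df/du * du/dx + df/dv * dv/dx
    #   df/dy = df/du * du/dy + df/dv * dv/dy
    u = x * y
    v = x + y
    df_du = v ** 2
    df_dv = 2 * u * v
    du_dx, du_dy = y, x
    dv_dx, dv_dy = 1.0, 1.0
    return (df_du * du_dx + df_dv * dv_dx,
            df_du * du_dy + df_dv * dv_dy)


def numeric_grad(x, y, eps=1e-6):
    # Central finite differences, used only to check the hand derivation.
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dfdx, dfdy


analytic = chain_rule_grad(2.0, 3.0)
numeric = numeric_grad(2.0, 3.0)
print(analytic)  # (135.0, 110.0)
print(all(math.isclose(a, n, rel_tol=1e-6) for a, n in zip(analytic, numeric)))
```

If you can do this by hand for a small function, backpropagation in the lectures is the same idea applied layer by layer.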

Don’t skip the basics! Machine learning is extremely easy to do wrong. Here’s a cautionary tale: I once saw someone build an ML model, test it, and get a 90%+ F1 score. They deployed the model to production, where it ran for months. They then transferred ownership of the model to my team, who investigated more deeply. It turned out the model was useless! It wasn’t doing any better than random guessing! The problem was train/test leakage, so the high test score was just from overfitting to examples in the test set. The people who originally deployed the model didn’t know the importance of auditing and monitoring a model in production, so they didn’t even know how useless it was.
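Leakage can happen in many ways, and I’m not describing how it happened in that particular model. As one illustration, here is a minimal sketch of a check for the simplest form: test examples that also show up in the training set. The function name and the sample data are made up for the example.

```python
def find_leaked_examples(train_texts, test_texts):
    """Return test examples that also appear in the training set.

    Comparison is done on a normalized form (lowercased, whitespace
    collapsed), so trivially reformatted duplicates are caught too.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    train_seen = {normalize(t) for t in train_texts}
    return [t for t in test_texts if normalize(t) in train_seen]


# Hypothetical data, for illustration only.
train = ["Great product, would buy again", "Terrible support", "Works as expected"]
test = ["great product,  would buy again", "Arrived late"]

leaked = find_leaked_examples(train, test)
print(f"{len(leaked)} of {len(test)} test examples also appear in train")
# 1 of 2 test examples also appear in train
```

A check like this only catches duplicates. Other kinds of leakage (for example, the same user or the same time period on both sides of the split) need a split designed around that grouping.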

Learn by building

I can’t count how many times I’ve watched a lecture or read a paper and thought I understood it, only to realize when I tried to implement it that I didn’t understand it as well as I thought I did.

Taking courses is important, but you’ll get so much more out of them if you start with a project idea in mind.

Don’t expect things to work perfectly on the first try. It usually takes me 50+ experiments before I have a model that I’m ready to release to production. Design your experiments so you can iterate quickly.

What projects to work on

Start off with small projects, things you can imagine building in a weekend. As you learn more, you’ll be able to build bigger and cooler projects in the same amount of time (also you’ll build a good library of reusable code that’s common between the projects).

One good thing to practice is reading a paper and implementing it. That helps you understand the paper better, and can be really useful to other people.

For building a portfolio to get hired, it’s important that there’s some way to easily verify that the project is good. For example, if I tell you I trained a model and got 99% accuracy, that doesn’t really tell you anything, because maybe I just had some train/test leakage, or the data was imbalanced with more positives than negatives, or it was just an easy problem. Try to have some easily understandable impact from your model. For example, in my initial portfolio, I did one image generation project, and during the interview I showed the interviewer pictures from my phone. For another project I made a model to play the 2048 game. The model was able to get the 4096 tile, which is better than I could do myself. My last portfolio project was some freelance work I had done, so I could demonstrate business impact.
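To make the imbalanced-data point concrete, here is a minimal sketch with made-up labels: a "model" that always predicts the positive class gets 99% accuracy on a dataset that is 99% positive, which is exactly what a majority-class baseline gets. The helper names are mine, not from any library.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_true):
    # Accuracy you would get by always predicting the most common label.
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels: 99 positives and 1 negative.
y_true = [1] * 99 + [0]
y_pred = [1] * 100  # a "model" that always predicts positive

print(accuracy(y_true, y_pred))            # 0.99
print(majority_baseline_accuracy(y_true))  # 0.99
```

Reporting the baseline next to your number is a cheap way to show a reader that the model learned something beyond the class balance.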

It’s even better if you show the quality of the project through a metric everyone recognizes. For example, if you can do well in a Kaggle competition, get a paper into a conference, or make a library or model that a lot of people use, then that will stand out.

Work with others

Post about your work online. Share what you’ve learned. You’ll meet cool people and learn a lot from them. Getting attention from posting your work is a good way to make opportunities for yourself, especially compared to applying for jobs where you don’t even know if the recruiter will look at your resume.
