ML Student Career Advice

If you’re currently in university and thinking about working in ML, it’s never too early to start thinking about your career. Should you get a PhD? What companies should you work for? What are the important skills to learn? I gave a talk on career advice to the Data Science club at my alma mater, the University of Toronto. This article summarizes my highly opinionated advice!

TL;DR: The best place to work after school is big tech, but it’s not easy to get in! To work in ML right out of school, you probably need to go to grad school. However, another path is joining an ML team as a backend engineer and transitioning from within the company. To land that first job, practice your coding interviews and accumulate internships during school.

Where Should You Work?

There are lots of different companies hiring ML engineers: big tech, startups, finance, and non-tech companies (telecoms, old-school retail, etc.). I highly recommend joining the best big tech company you can get into out of school. There are three reasons for this:

Startups

Note: this is a generalization; some startups are great and some are bad! A lot of early-stage startups are focused on product innovation over technical innovation, which can accelerate the careers of managers and product managers. You can get the chance to become a PM or manager much earlier at a growing startup than at an established big tech company.

On the other hand, a lack of focus on tech can translate into poor experiences for engineers. There can be a lot of technical thrashing while product-market fit is being discovered. This leads to engineers hacking together crude, low-scale prototypes instead of investing in high-scale, high-quality tech solutions using best practices.

Startups can have a technical edge on big tech if they’re applying new technologies and paradigms to a market. They’re the ones discovering the best practices. So it can make sense to join a startup if it’s focusing on a technical niche or application you’re interested in. Once you’re an experienced engineer, you can join a startup and accelerate your career by working on large-scoped projects.

There’s also pay, work-life balance and future stability to consider. These make it harder for startups to attract experienced engineers to learn from, since those engineers often want to work on interesting large-scale tech problems and get frustrated with instability and poor technical processes. You might be working with much lower-tier engineers than at a big tech company. If you join a startup and it’s not all that you imagined, start taking a look at big tech companies after a year. It never hurts to get a competing offer.

What type of big tech company to join?

In the past, there was usually some golden company that everyone knew was the best: IBM, Microsoft and then Google. Right now, there’s a lot more parity. The big companies have lowered their hiring standards and there are a lot of medium-sized companies with great practices and interesting problems. I’d say there’s a distinction between mega tech companies like Facebook, Google, Amazon and Microsoft, and medium-sized companies like Pinterest.

Larger companies have good reputations and the largest problems to work on. They have teams devoted to fundamental research or very deep technical niches far removed from product (e.g. improving OS kernels or building their own databases). However, it can be harder to have a real impact on the company and it’s not easy to get opportunities to build systems from scratch. Along with cutting-edge tech, there are lots of teams keeping the lights on for legacy business-critical systems. There’s not much glory, and there are few resume highlights, in that work.

The large scale of these companies usually means amazing central platforms to build systems on. The downside is that engineers can lose touch with the underlying technologies and concepts for how these large systems are built. It’s common to interview senior engineers at FANG companies who can’t explain the underlying abstractions of how their distributed systems are built.

There’s also geography. I think it’s a great experience to spend some time at headquarters. This is where guest speakers come to give live talks, where you bump into directors and high-level people in the cafeteria, and where you really feel the energy of the company. However, COVID may change this quite a bit. I would definitely recommend that new grads work in a physical office instead of going fully remote. This is the best way to learn from engineers, make friends outside the team and grow your network.

How do you get that job?

So you’ve decided it’s a good idea to work at a big tech company after school. Unfortunately, so has everyone else! How do you actually get in? It’s important to get some experience before you graduate. That means securing internships at big tech companies while you’re in school. It’s pretty hard to snag that Google internship in the summer after first year, so you have to work your way up with smaller companies. Good grades can also help (internships are probably the only time that grades impact hiring). One nice thing about being a new grad is that big tech companies will interview a lot of candidates to hire into a general pool, instead of hiring for individual roles. Once you’re selected for an interview, you really need to do well on coding. Practice medium questions on LeetCode and see my coding interview guide. Finally, apply early and apply often! Postings for summer internships and new grad hires open up in the fall, so get applying. Hopefully you can land your job in first semester.

What doesn’t really help?

In my experience, these types of things get emphasized on student resumes but they’re rarely taken into consideration for hiring. Real industry experience trumps these side projects because projects are hard to verify or quantify. You built a TensorFlow detector for course X? Does that just mean you spent an hour changing the parameters of a tutorial? Industry experience typically means developing something end to end and launching it into production.

Technical Skills

For fresh grads, big tech companies don’t really look for particular skills. For example, you won’t be tested on the details of the TensorFlow library. It’s expected that you can transfer your smarts and skills to learn new technologies. However, there are a few skills that will help you hit the ground running:

Should you go to grad school?

I’d say that grad school is critical to joining a company as an ML engineer straight from school. Another route is to join as a regular software engineer after undergrad, and transition to ML by joining an ML team. You won’t always get that opportunity, and there’s still a lot of theoretical knowledge you’ll have to study. This is a situation where joining a startup might give you the freedom to work on ML projects. Bottom line, if you’re convinced you want to work in ML, then I’d recommend at least getting your master’s. A PhD really isn’t necessary to work as an ML engineer on applied problems, but it becomes more important if you want to lead the research directions of a company. When you look at the top ML leaders at big companies, most of them have PhDs. However, keep in mind the opportunity cost. You’re trading 3-5 prime working years for that PhD, and it’s pretty likely you’ll end up doing a job similar to the one you’d get with a master’s. A master’s grad with 5 years of industry experience working in ML is worth a lot more than a fresh PhD grad.

What if you don’t get your dream job?

So your marks sucked, you didn’t get any internships and you bombed your interview. That’s okay; your first job doesn’t dictate your whole future! The demand for software and ML engineers has never been higher. You just need to start somewhere, gain some skills, practice your coding interviews and keep moving up the food chain. Start at a startup or anywhere that’s working on relevant technology. Learn what you can and move on to bigger opportunities as quickly as you can.

Hot take bonus advice

If everything has gone well, you landed that new grad job and now you’re just finishing off your final year of school. Before you head off to the data mine, take a little detour. Defer your start date as long as possible so that you can travel. Yes, the recruiter will tell you that they want you to start right away. It doesn’t matter. You’ll have the rest of your life to make money, get promoted and accumulate stock. What you won’t have is months of freedom while you’re in your 20s. Travel is an amazing teacher for subjects beyond the curriculum of engineering school (but don’t delay work to play video games for months in your parents’ basement). It’s also good to decompress from the stress of school before starting a new chapter in your life.