ML Student Career Advice
If you’re currently in university and thinking about working in ML, it’s never too early to start thinking about your career. Should you get a PhD? What companies should you work for? What are the important skills to learn? I gave a talk on career advice to the Data Science club at my alma mater, the University of Toronto. This article summarizes my highly opinionated advice!
TL;DR: The best place to work after school is big tech, but it’s not easy to get in! To work in ML right out of school, you probably need to go to grad school. However, another path is joining an ML team as a backend engineer and transitioning from within the company. To land that first job, practice your coding interviews and accumulate internships during school.
Where Should You Work?
Many kinds of companies hire ML engineers: big tech, startups, finance, and non-tech companies (telecoms, old-school retail, etc.). I highly recommend joining the best big tech company you can get into out of school. There are four reasons for this:
- Best Technical Learning Opportunities: Your highest priority out of school should be advancing your technical skills. The most exciting, groundbreaking technologies are being developed at big tech companies. For a lot of these companies, innovation and engineering are a business necessity (not a cost to cut). Big tech companies offer two advantages here:
  - Experienced Engineers: As a fresh grad, it’s important to get mentored by experienced engineers. School can give you the theoretical background to succeed, but you don’t know what you don’t know. Learning best practices from experienced engineers and management accelerates your learning far more than trial and error.
  - Large Scale Problems: Big tech operates at scale. These companies aren’t searching for business use cases by hacking prototypes together, which means they can invest in best-in-class solutions.
- Organizational Learning Opportunities: Big tech companies tend to be… big. In order to build large scale systems, they need lots of engineers, managers and product people. The cost is process overhead to coordinate everybody. To work on large scale systems, you need to learn how to navigate large organizations. This goes beyond technical competence to influence and communication. How do you get a big organization to decide what to focus its resources on and determine whether the product is heading in the right direction? When you spend time at a big company, you get to see the full yearly cycle: roadmapping, and how corporate strategy translates into OKRs, team goals and individual projects. You see how engineers get recognized and promoted not only for technical competence, but for their impact on the company. You see how wins, losses and strategy are communicated. As you become more senior, you’re measured by your ability to influence projects and outcomes, so it’s crucial to have experience working in a large organization.
- Reputation: Big tech companies are well known and hard to get into. So when you can put that on your resume, it makes you much more attractive to other companies. Once you work at Amazon, you’ll always be an ex-Amazon.
- High pay: Big tech companies pay really well, and you can really get lucky if the stock surges. The money you make early in your life can end up having a big impact on your net worth due to compounding. However, this shouldn’t be your most important consideration compared to learning opportunities. Starting salaries for fresh grads have the least variance, and the difference in offers between different companies is a rounding error compared to the luck of what happens to stocks and the salary increase you’ll get when you’re senior. Odds are, after a few years you’ll need to change companies to really raise your salary anyway. At that point your leverage will come from having multiple offers. So don’t optimize your first job based on salary alone. Compensation becomes a bigger factor a few years down the line when you’re more senior.
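To make the compounding point concrete, here is a rough sketch. The 7% annual return, the 10,000 amount and the ages are illustrative assumptions picked for the example, not figures from the talk, and real returns are anything but steady.

```python
def future_value(amount: float, annual_return: float, years: int) -> float:
    """Value of a one-time investment after compounding once per year."""
    return amount * (1 + annual_return) ** years


# Hypothetical scenario: the same 10,000 saved at age 22 vs. age 32,
# both left invested until age 62 at an assumed 7% annual return.
early = future_value(10_000, 0.07, 40)
late = future_value(10_000, 0.07, 30)

print(f"saved at 22: {early:,.0f}")
print(f"saved at 32: {late:,.0f}")
print(f"ratio: {early / late:.2f}x")
```

Under these assumptions the early money ends up worth roughly twice as much, purely because it had ten extra years to grow.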
Startups
Note: this is a generalization. Some startups are great and some are bad! A lot of early stage startups are focussed on product innovation over technical innovation, which can accelerate the careers of managers and product managers. You can get the chance to become a PM or manager much earlier at a growing startup than at an established big tech company. On the other hand, a lack of focus on tech can translate into poor experiences for engineers. There can be a lot of technical thrashing while product market fit is being discovered. This leads to engineers hacking together crude, low scale prototypes instead of investing in high scale, high quality tech solutions using best practices. Startups can have a technical edge on big tech if they’re applying new technologies and paradigms to a market. In that case, they’re the ones discovering the best practices. So it can make sense to join a startup if it’s focusing on a technical niche or application you’re interested in. Once you’re the experienced engineer, you can join a startup and accelerate your career by working on large scoped projects.
There’s also pay, work-life balance and future stability to consider. These make it harder for startups to attract experienced engineers to learn from, since those engineers often want to work on interesting large scale tech problems and get frustrated with instability and poor technical processes. You might be working with much lower tier engineers than at a big tech company. If you join a startup and it’s not all that you imagined, start taking a look at big tech companies after a year. It never hurts to get a competing offer.
What type of big tech company to join?
In the past, there was usually one golden company that everyone knew was the best: IBM, then Microsoft, and then Google. Right now, the field is much more even. The big companies have lowered their hiring standards and there are a lot of medium sized companies with great practices and interesting problems. I’d say there’s a distinction between mega tech companies like Facebook, Google, Amazon and Microsoft, and medium sized companies like Pinterest. Larger companies have good reputations and the largest problems to work on. They have teams devoted to fundamental research or very deep technical niches far removed from product (e.g. improving OS kernels or building their own databases). However, it can be harder to have a real impact on the company and it’s not easy to get opportunities to build systems from scratch. Along with cutting edge tech, there are lots of teams keeping the lights on for legacy business critical systems. There’s not much glory, and few resume highlights, there. Their large scale usually means amazing central platforms to build systems off of. The downside is that engineers can lose touch with the underlying technologies and concepts for how these large systems are built. It’s common to interview senior engineers at FANG companies who can’t explain the underlying abstractions of how their distributed systems are built.
There’s also geography. I think it’s a great experience to spend some time at headquarters. This is where guest speakers come to give live talks, where you bump into directors and high level people in the cafeteria, and where you really feel the energy of the company. However, COVID may change this quite a bit. I would definitely recommend new grads work in a physical office instead of going full remote. This is the best way to learn from engineers, make friends from outside the team and grow your network.
How do you get that job?
So you’ve decided it’s a good idea to work at a big tech company after school. Unfortunately, so has everyone else! How do you actually get in? It’s important to get some experience before you graduate. That means securing internships at big tech companies while you’re in school. It’s pretty hard to snag that Google internship in the summer after first year, so you have to work your way up with smaller companies. Good grades can also help (internships are probably the only time that grades impact hiring). One nice thing about being a new grad is that big tech companies will interview a lot of candidates to hire into a general pool, instead of hiring for individual roles. Once you’re selected for an interview, you really need to do well on the coding questions. Practice medium questions on LeetCode and see my coding interview guide. Finally, apply early and apply often! Postings for summer internships and new grad hires open up in the fall, so get applying. Hopefully you can land your job in first semester.
What doesn’t really help?
- side projects
- school projects
- GitHub repos of random private projects (if you’ve contributed to a popular open source project, that’s different)
In my experience, these types of things get emphasized on student resumes but they’re rarely taken into consideration for hiring. Real industry experience trumps these side projects because projects are hard to verify or quantify. You built a TensorFlow detector for course X? Does that just mean you spent an hour changing the parameters of a tutorial? Industry experience typically means developing something end to end and launching it into production.
Technical Skills
For fresh grads, big tech companies don’t really look for particular skills. For example, you won’t be tested on knowing details of the TensorFlow library. It’s expected that you can transfer your smarts and skills to learn new technologies. However, there are a few skills that will help you hit the ground running:
- Python: This is the lingua franca of ML. If you can write clean code in Python during the interview, that helps.
- SQL: In order to work with ML, you need data! SQL (and its extensions into Spark, Presto etc.) is the language for getting that data. You typically don’t learn this well in school. It comes from an internship.
- Java/Go: High scale backend systems are typically written in a statically typed language. It’s nice to have familiarity with one, since the skills are transferable to other statically typed languages. To be honest, this likely won’t be tested during the hiring process.
- Deep Learning: Deep learning is the best-in-class solution for many problems, so experience with PyTorch or TensorFlow is important.
- Non Deep Learning: Not all models are built with huge DNNs. You should know how to build a regular linear regression or gradient boosted tree.
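As a small taste of how the first two skills and the last one fit together, here is a minimal sketch that pulls rows with SQL and fits a plain linear regression in Python. The `campaigns` table, its columns and its values are all made up for illustration, and the hand-rolled `fit_line` helper is only there to show the idea; in practice you would more likely reach for a library.

```python
import sqlite3

# Hypothetical toy table (ad spend vs. clicks); names and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns (spend REAL, clicks REAL)")
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?)",
    [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
)

# SQL is how the data gets out of storage...
rows = conn.execute(
    "SELECT spend, clicks FROM campaigns ORDER BY spend"
).fetchall()
conn.close()


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# ...and Python is where the model gets fit.
slope, intercept = fit_line(rows)
print(f"clicks ~ {slope:.2f} * spend + {intercept:.2f}")
```

The toy data lies exactly on a line, so the fit recovers a slope of 2 and an intercept of 1. Real data will be messier, which is where the rest of the job comes in.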
Should you go to grad school?
I’d say that grad school is critical to joining a company as an ML engineer straight from school. Another route is to join as a regular software engineer after undergrad, and transition to ML by joining an ML team. You won’t always get that opportunity, and there’s still a lot of theoretical knowledge you’ll have to study. This is a situation where joining a startup might give you the freedom to work on ML projects. Bottom line, if you’re convinced you want to work in ML, then I’d recommend at least getting your master’s. A PhD really isn’t necessary to work as an ML engineer on applied problems, but it becomes more important if you want to lead the research directions of a company. When you look at the top ML leaders at big companies, most of them have PhDs. However, keep in mind the opportunity cost. You’re trading 3-5 prime working years for that PhD, and it’s pretty likely you’ll end up doing a similar job as you would with a master’s. A master’s grad with 5 years of industry experience working in ML is worth a lot more than a fresh PhD grad.
What if you don’t get your dream job?
So your marks sucked, you didn’t get any internships and you bombed your interview. That’s okay, your first job doesn’t dictate your whole future! The demand for software and ML engineers has never been higher. You just need to start somewhere, gain some skills, practice your coding interviews and keep moving up the food chain. Start at a startup or anywhere that’s working on relevant technology. Learn what you can and move on to bigger opportunities as quickly as you can.
Hot take bonus advice
If everything has gone well, you landed that new grad job and now you’re just finishing off your final year of school. Before you head off to the data mine, take a little detour. Defer your start date as long as possible so that you can travel. Yes, the recruiter will tell you that they want you to start right away. It doesn’t matter. You’ll have the rest of your life to make money, get promoted and accumulate stocks. What you won’t have is months of freedom while you’re in your 20s. Travel is an amazing teacher for subjects beyond the curriculums of engineering school (don’t delay work to play video games for months in your parents’ basement). It’s also good to decompress from the stress of school before starting a new chapter in your life.