If you’re currently in university and thinking about working in ML, it’s never too early to start thinking about your career. Should you get a PhD? What companies should you work for? What are the important skills to learn? I gave a talk on career advice to the Data Science club at my alma mater, the University of Toronto. This article summarizes my highly opinionated advice!
TL;DR: The best place to work after school is big tech, but it’s not easy to get in!
Advanced degree in computer science? Check.
Good marks in school? Check.
Years of professional experience at prestigious tech companies? Check.
Okay, just solve this programming riddle while an interviewer looks over your shoulder. By the way, the interviewer just added “it’s an easy question”, i.e. if you don’t solve it in 10 nanoseconds you’re an idiot.
Coding interviews are pretty hated, but you shouldn’t let them stop you from getting your dream job.
A lot of people ask me for advice about ML Engineering interviews, since each company does things differently and it really combines a lot of skills. I went through a round of interviews at big companies on my way to joining Pinterest as a Staff ML Engineer, so I’m going to put together a few articles with detailed advice. These guides offer strategic advice on how to deal with the interview process as well as an overview of important tech and ML concepts you should know.
One of the trickiest interview rounds for ML practitioners is ML systems design. If you’re applying to be a Data Scientist, ML Engineer or ML Manager at a big tech company, you’ll probably face an ML systems design question. I recently tackled this question at a few big tech companies on my way to becoming a Staff ML Engineer at Pinterest. In this article I’m going to talk about how to approach ML systems design interviews and the core concepts to know, and I’ll provide links to some of the resources I used.
This article gives tips and advice for systems design interviews, which are the hardest questions to prepare for in FAANG-style software engineering interviews. I recently joined Pinterest as a Staff ML Engineer, which meant interviewing with a bunch of big tech companies for the first time in 6 years. I did well enough to get some offers, so I’d like to share my approach (without breaching any NDAs and getting sued).
My talk from this year’s Spark Summit, about our experiences deploying Deep Reinforcement Learning in production, is now online. We also talk about our new open source Deep RL library, RL Bakery. I was looking forward to going to San Francisco to give the talk live, but of course Covid turned the summit virtual. Prerecording the talk does take a bit of pressure off, but you also lose some of the energy without a live audience.
There’s a big learning curve when you jump from studying statistics in school to programming statistical tools for Amazon-scale data. Big Data’s Swiss Army knife is MapReduce, so you often have to describe a data manipulation algorithm in the MapReduce paradigm. Since I’m new to MapReduce, I’ve been reading up on best practices. One book that I saw recommended on the web was Data-Intensive Text Processing with MapReduce, by Jimmy Lin and Chris Dyer.
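To make the MapReduce paradigm concrete, here is a minimal single-machine sketch of the classic word count in plain Python. It is only an illustration of the map, shuffle and reduce steps, not code from the book or from any real framework; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example, and a real cluster would run the mappers and reducers in parallel across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group every emitted value under its key.

    In a real MapReduce system the framework does this between
    the map and reduce steps; here it is a plain dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big data", "Big deal"])` counts "big" twice because the mapper lowercases each word before emitting it.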
Thesis: In The Thank You Economy, Gary Vaynerchuk says that in order for companies of any size to grow in the future, they will have to engage their customers at a personal level. His logic is that in the past, commerce was dominated by small mom-and-pop shops, and interacting with customers at a personal level was the norm. Then in the 1950s, people moved out to the suburbs, which led to increased isolation.
The Quest for The Cure is a very interesting book that gives an overview of past and present pharmaceutical breakthroughs. I don’t have much formal education in chemistry or biology, but I am drawn to medical and biotechnology breakthroughs. Medical issues touch everyone and in my view this field has the greatest impact on improving the human condition. That being said, this is a one-of-a-kind book that gives just the right amount of detail on the past, present and future of drug development.
One of the cool things I’ve learned about in grad school is stochastic calculus. I’m far from an expert on the subject but I’m going to share the basic idea of a Stochastic Integral.
Quick Review: Riemann Integral. The integral everybody learns about in high school and undergrad is a Riemann integral, where you’re finding the area under a function by partitioning the space into a sequence of rectangles.
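The rectangle idea can be sketched in a few lines of Python. This is a left-endpoint sum over equal-width rectangles, which is just one of several ways to pick the rectangle heights; the function name `riemann_sum` and its parameters are my own choices for illustration.

```python
def riemann_sum(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal-width rectangles.

    Each rectangle's height is f evaluated at the left edge of its
    subinterval (a left Riemann sum).
    """
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))


def square(x):
    """Example integrand: the area under x^2 on [0, 1] is exactly 1/3."""
    return x * x
```

As `n` grows, `riemann_sum(square, 0.0, 1.0, n)` gets closer to 1/3, which is the sense in which the integral is the limit of the rectangle areas.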
A good saying in software development is that “if it’s not tested, it doesn’t work”. Every programmer agrees it’s a good idea to test your code; the question is which methodology to use. One very useful tool is unit tests: code that performs a series of specific tests on a module to check its functionality. By using a common unit testing framework across a project, it’s easy to create new tests, run the entire suite of tests for a project, and check if any tests are failing.
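As a rough illustration of what this looks like in practice, here is a small sketch using Python’s built-in `unittest` module. The framework choice is mine; the post itself may cover a different one. The function under test (`mean`) and the helper `run_suite` are hypothetical names made up for this example.

```python
import unittest


def mean(values):
    """Hypothetical function under test: the arithmetic mean of a list."""
    if not values:
        raise ValueError("mean() of an empty sequence")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    """Each method checks one specific behaviour of mean()."""

    def test_average_of_integers(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            mean([])


def run_suite():
    """Load every test in MeanTest, run it, and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite().wasSuccessful()` tells you whether any test failed, which is the “check if any tests are failing” step; in a real project you would more likely run `python -m unittest` from the command line and let it discover the test classes.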
I found this book to be a great refresher on writing clean code. By clean code, I don’t mean code that has good “style”, such as indentations and comments. The essence of being a good programmer is splitting a problem into clean abstractions. This book advocates writing extremely short functions and classes, each with one specific purpose. While some may argue that increasing the number of functions or classes also increases the complexity of the program, Robert Martin simply views this plethora of functions as exposing the true complexity of the problem.
Code Complete 2 is a really useful book on good programming, software engineering and management principles. I read this several years ago and it’s really shaped the way I program. It’s a thick book, but it covers creating software at a lot of different levels. It starts at a really low level with coding style issues, like variable naming, spacing, and other low level details. It progresses to higher level software design issues: abstracting code into classes and functions, and design patterns.