Systems Design Interview Guide


This article gives tips and advice for systems design interviews, which are the hardest questions to prepare for in FAANG-style software engineering interviews. I recently joined Pinterest as a Staff ML Engineer, which meant interviewing with a bunch of big tech companies for the first time in 6 years. I did well enough to get some offers, so I’d like to share my approach (without breaching any NDAs and getting sued).

As an ML Engineer, I had to go through 2 types of design interviews: standard distributed systems and ML systems. In this article, I’ll discuss high level advice that can be applied to both types of interviews. I’m going to write specific articles for distributed system and ML system design interview questions.

Here’s a summary of my advice:

  1. Understand the signals that companies are looking for
  2. Think of systems design questions as an improv presentation
  3. You can prepare by thinking of your process for walking through a design and researching the building blocks you can use

Special thanks to my friends Masud Khan (Facebook), Michael Ritche (Google), Curren Pangler (Zynga) and Mehdi Ben Ayed (Spotify) for reviewing this and giving their input.

What Are Systems Design Interview Questions and Why Are They so Important?

If you’re interviewing to be a senior engineer, staff engineer or manager, you’re probably going to face a system design interview or 2. For software engineers going through a distributed system design, these are some example systems you may be asked to design in interviews:

ML Systems Design questions are typically about recommendation systems. For example, Facebook has been known to ask about creating a Facebook content feed.

These are the hardest questions to prepare for in the interview loop for a few reasons:

Systems design interviews signal your experience working on real world, high scale problems. It’s critically important to do well in these interviews for more senior roles (where you’re being hired for your experience.) As you become more senior, there’s less and less distinction in lower level skills like coding. That makes the behaviour and systems design questions key drivers for determining your level.

What Signals Are Companies Looking For?

Ultimately, Systems Design questions show you have a strong theoretical background, and you know how to apply it to solve real world problems. They also show your creative thinking and problem solving skills. There are a few layers to this.

Design

What does it mean to show design skills? These questions are almost always based on real world problems that have evolved many different state of the art solutions over the years, which means you’re not expected to invent a new state of the art standard during your 45 minute interview. Design in this sense means decomposing a system into different components. You then utilize standard building blocks for each component. This interview shows your ability to split a big problem into smaller conceptual pieces. It also shows what building blocks you have in your design toolbox, how well you know them, and how wise you are in putting them together.

There isn’t one right answer for these problems, and there are often lots of options for each component. It’s a good idea to name the possible options for each component and discuss their tradeoffs.

Process

How do you approach complex problem solving in general? These types of systems have a lot of breadth and depth, so it’s up to you to use your time to cover the important aspects. If the interviewer asks “Could you design a system to extract all the English words in dictionary.com”, don’t just dive into “Well I’d start with a MySQL database and then spin up five EC2 instances with a Python API…”. You’re jumping into detail too early. I discuss my approach later on in the article.

Background Theory

ML and distributed systems have a lot of theory that backs standard methods and solutions. Do you know how to dive a bit deeper into these solutions to explain what theoretical problem they solve, or how theoretical concepts like the CAP theorem limit what a system can do? One of the tricky things with working in ML is that there’s a lot more theory to cover for well known practices.

Real World Gotchas

This is a great chance to show your battle scars and hard won real world experience. These are the types of things that aren’t always called out in articles or documentation, but they affect production high scale systems. Some examples are:

Showing these signals is key for being ranked at more senior levels.

Technical Leadership

Can you organize and communicate your ideas well? Can you overcome challenges? Are you trustworthy and wise in your decision making? Can you split your design into subsystems that different teams can work on?

Product/User Empathy

Do you understand how your technical implementations affect the product and user experience? Can you design systems in a way to allow product leaders in the company to deliver better user experiences? Can you anticipate the stakeholder tensions for the system?

High Level Approach

Unlike every other interview, you should be driving the system design interview. Recruiters mentioned numbers like you should be doing 80% of the talking. This can feel unnatural, but if you do find yourself in a situation where it’s a comfortable loop of the interviewer asking you a question and you giving a quick response, things might not be going so well. This has led me to think of Systems Design interviews as improv presentations. The interviewer gives you a topic and some constraints, then you give an impromptu presentation on your solution (as you’re coming up with the solution.)

There are 3 stages to the interview.

If you’ve ever given a presentation, you know that they go a lot smoother when you’ve done some preparation. But how can you prepare for a presentation when you don’t know the solution or interview question? It’s not that bad. You might not know the details, but you can prepare the general flow of the presentation ahead of time. The topics you choose to cover also say a lot about your level.

One nice thing about systems design interviews is that there’s no one right solution. In fact, it’s important to be able to generate multiple solutions and discuss their tradeoffs. An interviewer may even take your perfectly good solution and throw a wrench in it, forcing you to adapt and think of new solutions. There are also going to be areas of the solution you won’t know much about. Don’t worry about calling that out; it’s better than bullshitting and being confidently wrong. Don’t let it throw you off either. Nobody is expected to be an expert on everything, but everybody needs to be trusted to know their own knowledge limits. So be confident about being unsure!

Design Approach

Imagine you’re in a small interview room in Facebook’s Menlo Park offices. You’ve had a couple of minutes of small talk with the smiling interviewer but now it’s time to get down to business. They ask you to design a service like TinyURL. How are we going to do this?

Clarification

Absolutely do not, under any circumstances, no matter how panicked you are, start discussing a solution right away. Instead, you need to ask clarifying questions about this vague request to understand the priorities, scale and restrictions of the system you’re designing. For both ML and distributed systems, you want to ask about scale: how many users, how much data, latency expectations etc. Sometimes you ask, sometimes you lead a discussion. For latency, for example, an experienced engineer will know that 2 seconds is too long for an API, so offer practical ranges where you can. For distributed system questions, you’ll want to take your requirements and translate these into rough resource estimates (e.g. network capacity, number of machines, data storage.)
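To make the resource estimate step concrete, here is a minimal Python sketch of the back-of-envelope math for a TinyURL-style service. Every input (10 million new URLs a day, 100 reads per write, 500 bytes per record, 5 years of retention) is a made-up assumption for illustration, not a number from any real system. In an interview you would get these from your clarifying questions.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class Estimate:
    writes_per_sec: float
    reads_per_sec: float
    storage_bytes: int


def estimate_capacity(new_urls_per_day: int, reads_per_write: int,
                      bytes_per_record: int, retention_years: int) -> Estimate:
    """Turn rough requirements into rough traffic and storage numbers."""
    writes_per_sec = new_urls_per_day / SECONDS_PER_DAY
    reads_per_sec = writes_per_sec * reads_per_write
    # Total rows kept over the retention window (ignoring leap years).
    total_records = new_urls_per_day * 365 * retention_years
    return Estimate(writes_per_sec, reads_per_sec,
                    total_records * bytes_per_record)


# Hypothetical inputs, chosen only to show the arithmetic.
est = estimate_capacity(new_urls_per_day=10_000_000, reads_per_write=100,
                        bytes_per_record=500, retention_years=5)
print(f"writes/s: {est.writes_per_sec:.0f}")
print(f"reads/s:  {est.reads_per_sec:.0f}")
print(f"storage:  {est.storage_bytes / 1e12:.1f} TB")
```

With these assumed inputs the sketch lands at roughly 116 writes per second, about 11,600 reads per second and around 9 TB of storage. The exact figures matter less than showing you can get to the right order of magnitude quickly.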

Once you understand the goal of the system and clarify requirements, it’s time to actually “present” a solution. I’ve provided some example conceptual diagrams that apply to distributed systems, but the general concepts work for ML designs too. Here’s how I approached this.

High Level Solution

Start abstract and simple to lay out an end to end solution. This is useful because it ensures you have a full understanding of the problem the interviewer is asking you to solve. Give the interviewer a chance to understand your approach and ask questions, since you want some feedback that your approach will work. You’ll also want to touch on how design decisions affect the business or product to make sure you’ve clarified priorities of the system. Another advantage of having a simple end to end solution is that it helps you identify what the real technical challenges are. There may be some hidden depth in how components interact that wasn’t obvious from the problem definition.

At this stage, the components of the initial high level system should be technology agnostic and abstract away scaling and concrete methods/technologies. For distributed systems, you’d be dealing with high level concepts like clients, servers and state storage. You’ll also call out how the system components interact with each other. This can include API definitions or data schemas. For ML Systems, you’ll be talking about the high level sources of data you’ll be working with, and how models interact with the product. You’ll call out the types of models (e.g. binary classifier) and how the system will be evaluated (e.g. offline evaluation, or measuring increased retention via an A/B test.)

High level design components
First pass of design showing high level components and how they interact.
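As a rough illustration of what a technology agnostic first pass can look like, here is a Python sketch of an API and data schema for the TinyURL example. All the names (UrlMapping, UrlStore, InMemoryStore, ShortenerService) are hypothetical, and the in-memory store only stands in for whatever state storage you would pick later.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class UrlMapping:
    """Data schema: one row per shortened URL."""
    short_code: str
    long_url: str
    created_at_epoch_s: int


class UrlStore(Protocol):
    """Abstract state storage; no database has been chosen yet."""
    def put(self, mapping: UrlMapping) -> None: ...
    def get(self, short_code: str) -> Optional[UrlMapping]: ...


class InMemoryStore:
    """Placeholder implementation so the end to end flow can be exercised."""
    def __init__(self) -> None:
        self._rows: Dict[str, UrlMapping] = {}

    def put(self, mapping: UrlMapping) -> None:
        self._rows[mapping.short_code] = mapping

    def get(self, short_code: str) -> Optional[UrlMapping]:
        return self._rows.get(short_code)


class ShortenerService:
    """The two API operations a client needs: shorten and resolve."""
    def __init__(self, store: UrlStore) -> None:
        self._store = store
        self._next_id = 0  # naive counter; code generation is refined later

    def shorten(self, long_url: str, now_epoch_s: int) -> str:
        code = f"u{self._next_id}"
        self._next_id += 1
        self._store.put(UrlMapping(code, long_url, now_epoch_s))
        return code

    def resolve(self, short_code: str) -> Optional[str]:
        mapping = self._store.get(short_code)
        return mapping.long_url if mapping else None


service = ShortenerService(InMemoryStore())
code = service.shorten("https://example.com/a/very/long/path", now_epoch_s=0)
print(code, "->", service.resolve(code))
```

Keeping storage behind an interface like this is one way to defer the database decision until the later passes, where the options and their tradeoffs get discussed.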

Architectural Components

Next, you can start filling in some details. One approach is to start with a simple design and then evolve it to increase performance. Your components become more specific, using well known architectural building blocks. For a distributed system design, these types of building blocks are things like queues and SQL/NoSQL databases. In ML designs, you can get more concrete about how models are delivered to the product, e.g. batch or real time, and what type of infrastructure is required. Make sure you describe the reasoning behind your design choices (e.g. how does using a queue fit into the system requirements.) You can also include some data models, if necessary.

Second level design components
Second pass of design with more details filled in like data models and component types.
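To show the kind of reasoning a queue supports, here is a small Python sketch. It assumes a hypothetical requirement that a URL redirect service records click counts, and that those counts do not need to be updated before the redirect returns. The standard library queue.Queue stands in for a real distributed queue, and count_clicks is an illustrative name.

```python
import queue
import threading
from collections import Counter
from typing import Iterable


def count_clicks(events: Iterable[str], num_workers: int = 2) -> Counter:
    """Producers enqueue click events; workers aggregate them off the hot path."""
    q: "queue.Queue" = queue.Queue(maxsize=100)  # bounded, so producers feel backpressure
    counts: Counter = Counter()
    lock = threading.Lock()

    def worker() -> None:
        while True:
            short_code = q.get()
            if short_code is None:  # sentinel: no more work for this worker
                break
            with lock:  # workers share one counter, so guard the update
                counts[short_code] += 1

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for event in events:  # the "request path" only pays for an enqueue
        q.put(event)
    for _ in workers:  # one sentinel per worker
        q.put(None)
    for w in workers:
        w.join()
    return counts


print(count_clicks(["abc", "xyz", "abc"]))
```

The point to make out loud is the tradeoff: the queue keeps the request path fast and absorbs bursts, at the cost of the counts being slightly behind and of having another component to operate.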

Concrete Component Implementations

Next you can start diving into choices for your components. You don’t just want to arbitrarily choose your solutions. You want to go through a list of options and discuss their trade offs. You can describe how you’d start using one choice and then consider upgrading to another under certain conditions. As you’re iterating through ideas, it’s great to bring up state of the art solutions you’ve researched that can be applied. Often you’ll choose (or the interviewer will ask) to dive into some details about some specific components. How does a distributed queue work? How will your system handle concurrency?

Implementation level design components
Lower level design with implementation details filled in.
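As one example of diving into a specific component, here is a Python sketch of short code generation for the TinyURL example: a counter hands out unique integer IDs and each ID is encoded in base 62. This is just one of several possible options (random codes or hashing the URL are others), and IdAllocator is a hypothetical single-process stand-in. A real deployment would need a coordination mechanism across machines.

```python
import string
import threading

# 0-9, a-z, A-Z: 62 URL-safe characters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base 62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


class IdAllocator:
    """Hands out unique IDs; the lock makes read-and-increment atomic."""
    def __init__(self, start: int = 0) -> None:
        self._next = start
        self._lock = threading.Lock()

    def allocate(self) -> int:
        with self._lock:
            value = self._next
            self._next += 1
            return value


allocator = IdAllocator(start=1_000_000)
print(encode_base62(allocator.allocate()))
```

A follow-up worth raising is capacity: 7 base 62 characters give 62^7, roughly 3.5 trillion, distinct codes, which you can compare against your earlier storage estimate.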

Finishing Touches

By this point, you probably have a nice solution and you’ve shown off your design chops. Hopefully you’ve saved a bit of time to really put yourself over the top. This is especially important if you’re going for more senior tech leadership positions like staff engineer. Here’s where you’ll want to cover some advanced topics: how the system development could be split between different teams, long term ways the system could evolve to meet future business needs, common components that could be reused by different products, and real world issues like regionalization or data schema migrations.

Tips

Start Simple and Add Complexity

Throughout the interview, you need to show breadth and depth. There isn’t enough time to get into depth on everything, so you want to be strategic about this to cover areas you’re good at. Start with breadth. Get an end to end working solution, ideally with high level components. You can then start diving into details for some of these components. The interviewer may actually guide you to areas that they care about (or are more fruitful for design discussions.)

Describe Your Thought Process

There’s no one right answer, but there needs to be a method to the madness. It’s a bad look to make arbitrary decisions without explaining them. “For the binary classifier we’ll use a TensorFlow deep neural network.” Why? Why did you make a specific choice to use TensorFlow over PyTorch? Why are we using a DNN instead of other modelling techniques?

Don’t Let the Interviewer Screw You Up

Remember, you’re driving the interview. So don’t expect the interviewer to necessarily ask you about all the areas you’re expected to discuss to pass the interview. It’s tricky though, because interviewers will guide you to areas to focus on, and some interviewers do ask lots of detailed questions. If you finish the interview and think “phew, they didn’t ask me about concurrency!”, you probably did poorly.

These aren’t just the hardest interviews to prepare for, they’re also hard to conduct. Some interviewers will get off track homing in on tiny details or seem satisfied with very shallow answers. Don’t forget, you’re not just convincing the interviewer, you’re convincing the interview panel. So don’t be afraid to shift the direction of the conversation to cover areas where you can add more signal. Offer to cover things: “I could make a diagram of this”. Obviously, you don’t want to be too pushy. If they’re asking you questions or directing you to a specific area, make sure you cover it. Avoiding answering a direct question is nearly always fatal in interviews.

Listen to the Interviewer

Even though you’re meant to drive the interview, you still need to listen. These problems have lots of areas to go into depth, and the interviewer may be interested in one particular aspect. Make sure you listen to cues and go with the flow. As I mentioned before, ignoring or deflecting a direct question or request is a quick way to fail.

Be Confident, But Don’t Bullshit

You’re not expected to be an expert in everything. There will be some aspects of the system you won’t know much about in depth. It’s much better to call these out than bullshit something that’s incorrect. Do still attempt partial answers, since that’s usually more than enough.

Q: “How do you ensure only authorized users can access their data in your API?”

A: “This isn’t my area of expertise, but I know OAuth is commonly used for authorization.”

Diagram By Hand

As I write this at the end of 2020, it’s about day 300 of every company working from home during Corona, which means all these interviews are being conducted through video conferencing. This may actually continue into the future, since interviewing remotely saves a lot of time, money and effort compared to flying people out to HQ for a few days. But it makes communicating designs harder. You’ll be given the option of creating digital diagrams through Google Docs or CoderPad, but I’d highly recommend drawing on paper and showing the diagrams to the camera. I’ve tried both and it’s so much easier and faster to draw diagrams by hand. It’s even better if you have a small whiteboard in your office.

I’ve Never Done This Before!

Odds are, you’ll be asked to design something you’ve never personally done before. This is where you need to recognize how your past experience can be applied to this new problem. You should also research and be aware of other published production solutions and be able to incorporate those ideas. As you become more experienced, you can fill in the blanks from, “I’ve read about this service” to “I’ve tried A and B, A had good performance for X but I’d start with B because Y”.

However, maybe you’re working somewhere where you haven’t dealt with large scale data or they aren’t using any type of sophisticated ML techniques. This is a great motivation to get a new job! Sometimes companies pay for potential and sometimes they pay for experience. If you don’t have the experience, then you may have to settle for a lower level position where you can gain experience. If you stay at a company for too long where you’re not being exposed to new challenges and techniques, it starts reflecting poorly on your own decision making and values.

Conclusion

Hopefully this article gives you an understanding of how to approach systems design questions. These are hard questions to prepare for, with few resources online, but there are a few things you can do:

  1. Think about the process you’ll go through to discuss your design
  2. Study and understand the building blocks you’ll use in your design
  3. Read up on state of the art designs

In my next articles I’ll write some more detailed steps for the ML Systems interview and Distributed Systems interview.