Interview Prep Guide

The document is a comprehensive interview guide for aspiring Data Analysts and Data Scientists, detailing the roles, expectations, and interview preparation strategies. It includes insights from industry professionals and outlines the necessary skills and project types associated with each role. The guide emphasizes the importance of understanding the interview process and provides resources for coding, statistics, and data analysis preparation.

Break Into Data

Interview
GUIDE
Written by Karun Thankachan, Sai Kumar Bysani, Meri Nova, Dawn Choo
About the Authors

Venkata Naga Sai Kumar Bysani - Lead Data Analyst

I am Sai Kumar. Three years ago I moved to the United States to pursue my master's in data science at the University of Connecticut, with no experience in the data field and no corporate work experience. I am currently a Lead Data Analyst at Blue Cross Blue Shield of South Carolina, and I previously worked as a Data Scientist intern at LEGO. I have been consistently creating data content and giving back to the data community ever since. My work in data science has been featured in Times Square, on Fox and NBC, and in 20+ other articles. Feel free to shoot me a message on LinkedIn. :)

Dawn Choo - Product Data Scientist

I am a Product Data Scientist at ClassDojo, an education-tech start-up. Prior to ClassDojo, I worked at Meta, Amazon, and Patreon. My career progression into Data Science was neither linear nor easy; I started as a Financial Analyst, then became a Business Analyst, then a Business Intelligence Engineer, and am now a Product Data Scientist.

I am also the founder of Ask Data Dawn – a career coaching service aimed at helping others
get their dream job in the Data field.
Karun Thankachan

Sr. Data Scientist @ Walmart E-Commerce. Specializing in NLP and Recommender Systems, with 6+ years of experience across SDE and Data Science.

I am also a mentor on Topmate! For free Data Science Coaching and Interview Preparation
Resources, check out - https://topmate.io/karun

Also, I recently started a community for folks who want to continue progressing in Data Science. Check it out here and follow for a chance to join this invite-only community.
https://www.linkedin.com/company/buildml

Meri Nova

Founder at Break Into Data community.

I specialize in Machine Learning and AI Engineering. Previously, I worked at a Computer Vision and Robotics startup in Palo Alto, California, where I helped automate hydroponic farming inside urban shipping containers.

Nowadays, I am building the largest Data Community called Break Into Data. I invite the
most influential experts in Data for weekly speaker sessions and host practical project-based
workshops and coding challenges. Join us to upskill and land your next job in data!
Let’s Get Started!

Welcome to your one-stop roadmap to cracking Data Analyst (DA) & Data Scientist
(DS) interviews.

Who Is This Roadmap For?

This guide is not for the novice. We assume you have experience coding (Python/SQL) and have done data-related projects. It is for those who are looking to land a DA/DS role in the next few months.

What Does this Roadmap Cover?

The guide begins with an introduction to three roles: Data Analyst, Applied Data Scientist, and Product Data Scientist. It's important to understand what projects in these roles look like so you know which role best fits your current experience. Applying to relevant roles will improve your chances of cracking the interview.

The guide then goes over the types of interviews you could face in each role and links to resources to prepare for them. The key benefit of the guide is that it's a minimal set of resources that you can finish in limited time.

How to use this roadmap

The best way to use this roadmap is to:

● identify the role you want to crack based on the Role Expectations section
● identify the type of interview you will face in that role
● prepare using the provided resources in the Interview Preparation section
Table of contents

Role Expectations
  Data Analyst
  Applied Data Scientist
  Product Data Scientist

Interview Preparation
  Coding
  Statistics & Experimentation
  Data Modelling and Visualization
  Machine Learning and Deep Learning
  Product Sense
  System Design
  Behavioral & Culture Fit
  Take-Home Assignment
Role Expectations
Every role in the data domain aims to use data to drive decisions that grow the business. While there are over a dozen roles (you can find more about a few of them here), in this roadmap we focus on the following three.

Data Analyst

A data analyst transforms raw data into meaningful insights that guide strategic
decisions and drive business growth. They identify inefficiencies and optimize
workflows, enhancing operational efficiency and reducing costs. By analyzing
customer data, they uncover trends and preferences that improve products and
services, boosting customer satisfaction. Data analysts also play a crucial role in risk
mitigation, detecting potential issues early to prevent problems. They support
business development by identifying new market opportunities and optimizing
resource allocation. Additionally, they ensure data integrity and compliance with
regulatory standards, maintaining organizational credibility and trust.

What does a typical project involve?

Business Problem

Defining the problem that needs to be solved. This is the foundational step where the
objectives and goals of the analysis are established.

Data Collection

Gathering data from various sources such as databases, spreadsheets, cloud, APIs,
or other systems. This step involves identifying and acquiring the relevant data
needed for analysis.

Data Exploration

Exploring the data that has been collected to understand its structure, content, and
initial patterns. This step helps in gaining a preliminary understanding of the data.
Data Cleaning and Preprocessing

Identifying and addressing issues like missing values, outliers, duplicates, and
inconsistencies in the data. This may involve using tools like Excel, Python, R, or
SQL. It also includes adding calculated fields if required to prepare the data for
analysis.

Data Analysis

Applying statistical and analytical techniques to understand patterns, trends, and relationships within the data. This could involve exploratory data analysis (EDA), hypothesis testing, regression analysis, clustering, classification, or other methods depending on the specific objectives.

Data Visualization

Creating visual representations of data using charts, graphs, and dashboards to communicate findings effectively to stakeholders. Tools like Tableau, Power BI, matplotlib, seaborn, or ggplot2 are commonly used for this purpose.

Reporting and Presentation

Summarizing analysis results into reports, presentations, or dashboards that are easy to understand for non-technical stakeholders. This often involves translating technical findings into actionable insights and telling a compelling data story.

Ad hoc Requests

Responding to unplanned inquiries or urgent needs for data analysis, often with quick turnaround times. Analysts prioritize, gather, analyze, and present data to address specific queries or issues raised by stakeholders, adjusting their workflow to accommodate these additional tasks while balancing ongoing projects.
What does the interview look like?

● Screening Round
○ Initial assessment of basic qualifications and fit for the role.
○ Often conducted via phone or video call.
● Hiring Manager Round
○ Interview with the hiring manager to discuss skills, experience, and role
expectations.
○ Opportunity to ask detailed questions about the team and projects.
● Coding Round (Python/SQL – Depends on Company, Team, Role)
○ Technical assessment focusing on coding skills in Python, SQL, or
both.
○ This may include solving data manipulation problems or writing
queries.
● Take-home Assessment (Depends on Company, Team, Role)
○ Extended assignment to be completed at home within a specified
timeframe.
○ Typically involves analyzing data and presenting findings or solving a
specific problem.
● Panel Interview and/or Executive Round
○ Final stage involves interviews with multiple team members or
executives. It assesses collaborative skills, cultural fit, and ability to
communicate effectively.
Applied Data Scientist

An applied data scientist works on projects where an ML/DL model is used to drive decisions that in turn drive business growth. This could be scientists at Amazon using demand forecasting models to determine how much product to stock for the upcoming holiday, or Netflix using personalization models on your watch history to help you figure out what you would enjoy next. So, what does a typical project involve?

Exploratory Data Analysis

To understand data quality and the relationships between variables, and to test the feasibility of ML solutions.

For example, when developing a forecasting model you want to check whether you have enough data to detect trends, whether there are trends or seasonality you can utilize, and whether the data has too many anomalies or null values that make prediction infeasible.

Feature Engineering

Utilizing statistical/ML techniques to assess which features to use for modelling.

For instance, when forecasting, apart from the historical values there could be regressors you can add, such as a weekend indicator or a holiday indicator.

ML Design and Experimentation

Selecting metrics tied to business goals, identifying baselines, and researching models to experiment with.

For the forecasting model above, you would weigh metrics such as MAPE, sMAPE, and MASE. Then you would set a mean forecaster as the baseline and experiment with models such as ARIMA, ETS, or DeepANT based on prior work on problems similar to yours.
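The baseline idea above can be made concrete in a few lines of Python. This is a minimal sketch with made-up demand numbers: a mean forecaster predicts the historical average, and MAPE scores it against a hold-out period.

```python
def mape(actual, forecast):
    # Mean Absolute Percentage Error, in percent (actuals must be non-zero).
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

history = [100, 110, 105, 120]   # hypothetical past demand
actual_next = [115, 118]         # hypothetical hold-out values

# Mean forecaster baseline: predict the historical average everywhere.
baseline = [sum(history) / len(history)] * len(actual_next)
baseline_mape = mape(actual_next, baseline)
print(round(baseline_mape, 2))
```

Any candidate model (ARIMA, ETS, etc.) then has to beat this baseline MAPE to justify its extra complexity.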
Productionalization and A/B Testing

The final model is validated, containerized, and deployed as part of batch/online training and inference pipelines. Monitoring dashboards are created, and A/B testing is run to assess the impact of the model.

For example, in the forecasting case above, you may finalize an ARIMA model by ensuring back-testing gives MAPE values that are likely to support business goals. Then you would Dockerize the model, create batch training and prediction pipelines on AWS SageMaker, and build a Tableau dashboard to track input drift, prediction drift, and the MAPE of predictions as live data comes in.

Note: For online inference pipelines (e.g., a recommender system model that suggests an item similar to the one you are currently viewing), productionalization may also require A/B testing, to see whether the ML solution creates a positive impact on user experience compared with the no-ML experience.

The above project requirements are reflected in the interview process. As such, a
typical interview process can involve

● Coding
● SQL
● Statistics & Experimentation
● Machine Learning and Deep Learning
● System Design
● Take-home Assignment
Product Data Scientist

Product Data Scientists typically sit on a Product team where they work with
Engineers, Product Managers, Designers and other cross-functional partners.

In the tech industry, a "product" can be thought of as anything that the company
makes. It could be as broad as an entire app (e.g. Instagram), or it could be a
specific feature in the app (e.g. Instagram Stories). Most Product Data Scientists
would be staffed on a product team, and the product team is responsible for
monitoring and improving the product that they own.

Successful Product Data Scientists are those who can use data to successfully
inform strategic and tactical decisions. They also invest in helping build out the
team’s data and experimentation infrastructure.

Some projects that a Product Data Scientist might work on:

Exploratory Data Analysis

Product Data Scientists dig into user behavior and product usage trends. They
uncover opportunities and risks for the business by analyzing patterns in the data.

The techniques involved could be as simple as a segmentation analysis or as complex as a clustering analysis. Typically the level of complexity depends on the maturity of the product and the problem at hand.

Experimentation

Product Data Scientists own experiments end-to-end, from experiment set-up and metric definition through analysis and recommendations.

They are often also responsible for building out the experimentation infrastructure to
allow the company to iterate on experiments and learn quickly.

Metric Reporting & Investigation

Product Data Scientists develop and maintain regular reports on key product metrics.
They are often the owners of metrics, which means they are responsible for checking
on the metrics, communicating them to the company, and investigating any
anomalies. Their goal is to help the company understand the high-level question of
“how are we doing?”

Data Modeling and Engineering

Product Data Scientists often collaborate with Data Engineers to design and
implement data models that support product analytics. They help create and
maintain data pipelines to ensure data quality and reliability. A successful
collaboration between Data Science and Data Engineering results in robust data
infrastructure that enables efficient analysis and decision-making.

The above project requirements are reflected in the interview process.

● Coding, typically SQL and Python
● Statistics, focused on the application of statistical concepts to Product questions
● Experimentation
● Product sense, including
○ Opportunity sizing
○ Metric definition
○ Metric investigation
● Behavioral
● Take-home Assignment
Interview Preparation
The following are the most common interview rounds across the three roles
mentioned above.

Note: Make sure you prepare for the interview rounds that correspond to your role.

Disclaimer: Interview processes are quite dynamic, so we cannot guarantee these are the only types of rounds you will face. However, being proficient in the following should help you crack any other type of interview you may encounter.

Coding

This interview aims to assess candidates’ problem-solving and coding proficiency. There are three types of questions you can face:

LeetCode Style Rounds

This is the most popular type of coding interview.

An example of this type of question is “Two-Sum” on LeetCode: “Given an array of integers and a target, return indices of two numbers such that they add up to the target.”
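As a sketch, the standard answer to Two-Sum is a single pass with a hash map, trading O(n) memory for O(n) time:

```python
def two_sum(nums, target):
    # Map each seen value to its index; for every number, check
    # whether its complement appeared earlier in the array.
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return []

print(two_sum([2, 7, 11, 15], 9))  # → [0, 1]
```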

The best mentor for LeetCode/DSA rounds: Navdeep Singh.

To prepare for these rounds:

● Work through questions on the LeetCode Blind 75. The solutions can be
found on NeetCode. Prioritize Arrays, String, and Dynamic Programming, as
these are the most commonly asked.
● Dynamic Programming is going to be the hardest to overcome. Check out
these patterns to make it slightly easier.
ML Coding

These are less popular. Instead of assessing general problem-solving, these rounds test understanding of ML algorithms.

For example: code the K-Means algorithm from scratch.

To prepare for these rounds, work through coding the most popular ML algorithms

● Linear Regression
● Logistic Regression
● Decision Trees
● kNN
● K-Means
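As an illustration of what “from scratch” means, here is a minimal K-Means sketch in pure Python. It uses random initialization and a fixed iteration count; a full interview answer would also discuss convergence checks and initialization strategy.

```python
import random

def k_means(points, k, iters=20):
    # Start from k distinct random points as the initial centroids.
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assignment step: nearest centroid by squared Euclidean distance.
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
    return centroids, clusters

centroids, clusters = k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```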
Emma Ding is a great content creator for everything Data Science.

Data Science Coding

These are focused on how you would use Python, R or SQL on a real Data project.
For example, they could provide you with a few datasets, a vague business question,
and ask you to uncover insights on the data.

The steps involved are typically:

● Data cleaning
● Data visualization and analysis
● Interpreting findings and making recommendations

Note: No machine learning modelling code is required.
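The steps above can be sketched with pandas. The dataset and column names below are hypothetical; the point is the clean → aggregate → interpret sequence:

```python
import pandas as pd

# Hypothetical orders data standing in for the interview dataset.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", None],
    "revenue": [120.0, 80.0, None, 200.0, 50.0],
})

# Cleaning: drop rows missing the grouping key, impute missing revenue.
df = df.dropna(subset=["region"])
df["revenue"] = df["revenue"].fillna(df["revenue"].median())

# Analysis: aggregate and rank regions to surface a recommendation.
summary = (df.groupby("region")["revenue"]
             .agg(["mean", "sum"])
             .sort_values("sum", ascending=False))
print(summary)
```

From there, the interpretation step is verbal: e.g., the top region drives the most revenue, so dig into what it is doing differently.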

To prepare for these rounds, work through end-to-end portfolio projects that require you to clean data and extract insights from it. As a start, you can use guided projects like these:

● Understanding COVID fatality
● Analyze movie reviews
● Analyzing Pixar movies

However, as you get familiar with the process of building DS solutions, make sure
you’re working on your own Data projects.

● Find a dataset on Kaggle
● Time yourself to the length of the interview (30 minutes to 1 hour typically)
● Answer a vague business question
SQL

This interview aims to assess candidates’:

1. Understanding of SQL syntax, functions, and advanced queries. (Advanced – depends on the role)
2. Ability to manipulate and transform data using SQL.
3. Capability to design efficient queries to solve complex/business problems.
4. Proficiency in using SQL for data analysis, deriving insights, and how these
results can be used to make business decisions or recommendations.
5. Understanding of database schema design and relationships. (Advanced
roles)

Examples of the types of questions asked:

● Write a query to find the second-highest salary in the employees table.
● How would you pivot a table in SQL without using the PIVOT function?
● Create a query to identify and remove duplicate records from a table without a
primary key.
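For the second-highest-salary question, one common answer is a subquery that excludes the maximum. A runnable sketch using Python's built-in sqlite3, with a made-up employees table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ana", 90000), ("Ben", 120000), ("Cy", 110000)])

# Second-highest salary: take the max of everything below the overall max.
second = conn.execute("""
    SELECT MAX(salary) FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
""").fetchone()[0]
print(second)  # → 110000
```

Interviewers also commonly accept `SELECT DISTINCT salary ... ORDER BY salary DESC LIMIT 1 OFFSET 1`, or a `DENSE_RANK()` window function.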

You can expect more difficult and varied questions depending on the company and the role. 😉
Resources to Learn:

1. W3Schools - https://www.w3schools.com/sql/ - Good for beginners to understand basic concepts and syntax
2. Alex the analyst - Useful for both beginners and intermediates, with practical
examples.
○ Basics - https://youtube.com/playlist?list=PLUaB-1hjhk8Fq6RBY-3MQ5MCXB5qxb8VA&si=u_EGlhFkTefXEVQY
○ Intermediate - https://youtube.com/playlist?list=PLUaB-1hjhk8G5zci4HA8E21x2BJS3jzNm&si=DKmmyR7KvzJI5dyh
○ Advanced - https://youtube.com/playlist?list=PLUaB-1hjhk8GjfgvWlreA6BvTvazz8RHG&si=zEh6bH7bmbI0Cn2W

3. Simplilearn - A mix of free and paid resources, suitable for structured learning.
○ Complete SQL Playlist - https://youtube.com/playlist?list=PLEiEAq2VkUUKL3yPbn8yWnatjUg0P0I-Z&si=rZXNcUNYgsYdLN34
4. SQL for Data Science - https://www.coursera.org/learn/sql-for-data-science -
Great for a structured course with certification

Then practice real-world SQL problems to improve critical thinking and problem-solving skills. Below are a few resources for practice; try to solve 2-3 queries every day on any of these platforms.

1. Leetcode - https://leetcode.com/
○ This list is for the top 50 SQL questions and is free -
https://leetcode.com/studyplan/top-sql-50/
○ This list is for the top 50 advanced SQL questions - https://leetcode.com/studyplan/premium-sql-50/ - solving these requires a subscription
2. Interview Query - It also has amazing case studies for Data Analytics and
Data Science - Use my code “VENKATA” to get 10% off if you plan on taking
the subscription - https://www.interviewquery.com/?via=venkata
3. Dataford - It has all the SQL interview questions asked in top companies - Use my code “SAI20” to get 20% off if you plan on taking the subscription - https://www.dataford.io/?via=VENKATA
4. Hackerrank - https://www.hackerrank.com/
Statistics & Experimentation

This interview aims to assess candidates’:

● Understanding of statistical concepts and methods
● How to apply statistical concepts in real-world problems
● Familiarity with A/B testing and other experimental designs

Examples of the types of questions asked:

● We want to predict which users are likely to upgrade to our premium service.
How would you build and validate a logistic regression model for this
purpose?
● Our e-commerce platform has implemented a new recommendation
algorithm. How would you design an experiment to test if it significantly
improves customer purchase rates?
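For the experiment-design question above, the analysis step often reduces to a two-proportion z-test on conversion rates. A minimal sketch using only the standard library; the traffic and conversion numbers are hypothetical:

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    # Pooled-proportion z statistic for comparing two conversion rates.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value via the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Control: 500/10,000 convert; treatment: 600/10,000 convert.
z, p = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A full interview answer would also cover sample-size/power calculations before launch and guardrail metrics during the test.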

Resources to Learn:

● "Practical Statistics for Data Scientists" by Peter Bruce and Andrew Bruce
○ Chapter 1: Exploratory Data Analysis
○ Chapter 2: Data and Sampling Distributions
○ Chapter 3: Statistical Experiments and Significance Testing
○ Chapter 4: Regression and Prediction
○ Chapter 5: Classification
● “Trustworthy Online Controlled Experiments” by Ron Kohavi
○ Read the entire book :)
● Familiarize yourself with A/B testing methodologies from experiment design to
analysis
○ Emma Ding’s AB testing cheat sheet
● Practice explaining complex statistical concepts in simple terms
● Use ChatGPT to come up with situations where you would use each statistical
concept in a real-world setting for that specific company.

You could use this prompt:

“I am studying for Data Science interviews and I’m currently learning about [INSERT CONCEPT]. Can you come up with some examples of how a Data Scientist might use these concepts at [INSERT COMPANY]? Also, can you come up with some interview questions to test my knowledge of this concept?”
Data Modelling and Visualization

What is the aim of the interview (what are we testing for?)

The data modeling and visualization interview aims to assess the candidate's
proficiency in designing data models, their ability to visualize data effectively, and
their understanding of the tools and techniques used in the industry.

1. Assess the candidate’s familiarity with data modeling principles, including entity-relationship diagrams (ERD), normalization, and schema design.
2. Evaluate the capability to design logical and physical data models that meet
business requirements.
3. Test knowledge and experience with data visualization tools such as Tableau,
Power BI, or similar.
4. Gauge the ability to transform data into actionable insights through effective
visualizations.
5. Assess problem-solving skills and the ability to address complex business
questions through data modeling and visualization.
6. Check understanding of integrating data from various sources and ensuring
data quality and consistency. (Advanced roles)

Sample Questions

● What visualization would you use to show sales trends over time?
● How would you visualize the relationship between multiple variables?
● How would you design a data model for a retail sales system?
● Describe a situation where you'd use a star schema vs. a snowflake schema
● What are the advantages and disadvantages of denormalization?
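For the retail-sales design question, a common answer is a star schema: one fact table of sales keyed into dimension tables. A sketch of the DDL using SQLite via Python's sqlite3 (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions describe the who/what/where/when of each sale.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, day TEXT);

    -- The fact table holds one row per sale, keyed into every dimension.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

A snowflake schema would further normalize the dimensions (e.g., split `category` into its own table), trading simpler storage for more joins.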

Tips to learn

● Review common data modeling concepts (e.g., ER diagrams, normalization)
● Study different chart types and when to use them
● Familiarize yourself with SQL and database design principles
● Practice creating visualizations using tools like Tableau, Power BI, or Python
libraries
Resources to Learn:

● Coursera’s Data Modeling and Database Design - https://www.coursera.org/learn/database-design-postgresql
○ Structured course covering the fundamentals of data modeling and database design.
● Khan Academy’s Intro to SQL - https://www.khanacademy.org/computing/computer-programming/sql
○ Free resource to learn data modeling concepts.
● DataCamp’s Data Visualization Courses - https://app.datacamp.com/learn/courses/?technologies=9
○ Comprehensive courses on data visualization using various tools.
● Tableau’s Free Training Videos - https://www.tableau.com/learn/training
○ Free tutorials and training videos for learning Tableau.
● Power BI Guided Learning - https://learn.microsoft.com/en-us/training/browse/?products=power-bi
○ Guided learning paths for learning Power BI.

Practice Platforms:

● Kaggle - https://www.kaggle.com/
Offers datasets and competitions to practice data modeling and visualization.
● Leetcode - https://leetcode.com/
While primarily for coding, also includes SQL and database design problems.
● DataCamp - https://www.datacamp.com/
Interactive courses and projects to practice data visualization and modeling.
Machine Learning and Deep Learning

These rounds test your ability to fit and tune models and to make informed decisions when selecting models and metrics. Some of the types of questions to expect are:

● Why is a log-loss function used in logistic regression instead of RMSE?
● How does randomization help Random Forests?
● How do you decide if linear regression is a good model to solve your
regression use case?
● Why is Adam an effective Optimizer?
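The first question above can be explored numerically: log-loss punishes a confident wrong prediction far more heavily than squared error does, and (unlike squared error on sigmoid outputs) it keeps the logistic regression loss convex. A quick illustration:

```python
from math import log

def log_loss(y, p):
    # Cross-entropy for one example: -[y*log(p) + (1-y)*log(1-p)].
    return -(y * log(p) + (1 - y) * log(1 - p))

def squared_error(y, p):
    return (y - p) ** 2

# True label 1, model confidently wrong at p = 0.01:
print(log_loss(1, 0.01), squared_error(1, 0.01))  # ~4.61 vs ~0.98
```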

To prepare for these rounds, focus on the following as these are concepts that are
tested irrespective of the projects you have worked on -

● Read ‘Introduction to Statistical Learning in Python’. The following are the main concepts to learn:
○ Chapter 2 - Understand the bias-variance tradeoff
○ Chapter 3 - When is linear regression a good model to use? Pros and
Cons of Linear Regression
○ Chapter 4 - When is logistic regression a good model to use? Pros and
Cons of Logistic Regression
○ Chapter 6 - Understanding Regularization
○ Chapter 8 - When are tree-based models a good model to use? Pros
and Cons of tree-based models
○ Chapter 9 - When are SVMs a good model to use? Pros and Cons of
SVM
○ Chapter 12 - When is K-Means a good model to use? Pros and Cons
of K-Means
○ Chapter 13 - Understand Hypothesis Testing
Trust us, ISLP is the only resource you need for Machine Learning basics in 80% of interviews.

● Read ‘Deep Learning’ by Ian Goodfellow. The following are the main concepts
to learn
○ Chapter 6 - Understanding Basics of Neural Network
○ Chapter 7 - Regularization of Neural Networks
○ Chapter 8 - Optimizing Neural Networks
○ Chapter 9 - CNNs, their variants, and optimizing them
○ Chapter 10 - RNNs, their variants, and optimizing them
In addition to the above, you will be asked questions based on the projects you have worked on. For your own projects, make sure you can explain the reasoning behind your ‘design decisions’, i.e.:

● Why did you choose this metric?
● Why did you choose this model? (If it’s because this model performed the best, why do you think that is?)
● If you have done feature engineering/selection - What features worked? and
Why?
Product Sense

What is the aim of the interview (what are we testing for?)

These questions test how you would approach a real-world problem as a Data Scientist on a product team. They typically come in the form of a case question.

The 3 general categories of these Product Sense questions are:

● Opportunity sizing
● Metric definition
● Metric investigation

Sample Questions

● What are some ways to increase user retention on an online grocery shopping
app?
● You're part of the search team at an e-commerce company. The marketing
team has noticed that the search results for certain product categories are not
as relevant as they could be, leading to lower conversion rates. How would
you approach this problem and improve the search relevance?

Links to resources and how to use them

● Dawn’s Product Data Science interview guide
○ Understand what interviewers are looking for in these types of
questions
○ Memorize frameworks that you can use to tackle these questions
○ Identify ways to make yourself stand out as a candidate

Send Dawn a DM on LinkedIn if you want a discount code for this book.

● Study for Product Manager interviews by reading “Cracking the PM Career” by Jackie Bavaro and Gayle Laakmann McDowell

● Emma Ding’s Product Case interview cheat sheet:
○ Print this out and use it as a high-level cheat sheet during your interviews
System Design

This interview tests how you approach a problem that can be solved using machine learning. A few examples of the types of questions are:

● Design a ranking model for the Instagram feed
● Design a system to predict Netflix watch times

To prepare for these rounds

● Read Chip Huyen’s ML System Design book - read everything, but give
special focus to the case studies.
● Check out Jay Feng’s Mock Interview series
● Rarely, companies ask you to code during system design rounds (you can confirm this with the recruiter before interviews). In case they do, check out these videos from Exponent:
○ Predict App Deletion
○ Instagram Ranking
○ Predict Netflix Watch Times
○ Fake News Detection System
Behavioral & Culture Fit

What is the aim of the interview (what are we testing for?)

These are aimed at assessing:

● Your interpersonal skills
● Problem-solving abilities
● How you handle real-world situations
● Whether your values and goals match the organization's culture

Sample Questions

● Tell me about a conflict you had with your co-worker and how you handled it.
● Tell me about a time when you had to convince a partner to do something that
they were initially resistant to

Links to resources and how to use them

● Make sure you know every experience and bullet point on your resume, and
are able to talk about it in detail.
● Use this 3-step framework to prepare before every interview:
1. Prepare 5-10 stories using the STAR framework.
2. Ask ChatGPT to come up with 20 behavioral questions for the specific job description, using this prompt:

“I am interviewing for [role] at [company]. Come up with 20 behavioral / hiring manager interview questions to test the skills required in the job description. Here is the job description: [insert job description]”

3. Map the 20 questions from #2 to your stories from #1.
Take-Home Assignment

This is the most comprehensive assessment of your skills as a data scientist, as it tests for everything: coding, problem-solving, analytical thinking, the ability to handle data, fit and tune models, and, most importantly, communicate actionable insights.

For a few sample questions, check out these competitions on Kaggle

● Walmart Sales Forecasting Accuracy - What will sales be for each store-item pair over the next X days?
● Santander Customer Transaction Prediction - Which customers will make transactions?

To prepare for these rounds,

● First, understand the Data Science LifeCycle - use this to structure your
analysis and communicate how you approach a problem

● Next, practice solving problems that require a Data Science solution. You would typically already have this from your projects/work experience. However, if you are rusty, brush up on your technical domain of choice using the ‘Getting Started’ projects on Kaggle.

Note: These are not examples of projects you want to put on your resume. Rather, they can be used to brush up on different technical domains. You can find more on Kaggle, based on your area of expertise.

● Most importantly, research the company and business domain the team is
working in. Align the direction of solution, business metrics you choose, and
insights you communicate to what would be of interest to the team.

That’s it! Try to relax before your interviews, listen for cues from your interviewers, and go in with a positive mindset!

You got this!


Need More Help?

Feel free to connect with us on LinkedIn and in the Break Into Data PRO Community!

As part of the BID PRO membership, you will gain access to:

- Private Technical Workshops to build your Portfolio
- Private Library of resources for Data & AI roles
- Daily Focus sessions and Accountability calls
- Premium access to the Interview Query platform (eligible only for 6- and 12-month memberships)

and so much more!
