Data Science Career
Launch Guide
Preparing for a Data Science job
Data Science Career Launch Guide
Preface
Why read this book
Job titles and descriptions are often confusing: they vary from
company to company because the industry itself is still figuring
out how to use Data Science effectively. For an aspirant in the
field, this can lead to uncertainty.
In the following chapters, we will outline exactly how to
determine where you are in your Data Science journey, and
then define a sequential, step-by-step path to getting the career
of your dreams. You will also find knowledge about the industry,
which will help you understand how it works and how to
prepare for a career as a Data Scientist.
How we built this book
Over the span of 3 years, GreyAtom has been the bridge
between Data Science aspirants and the industry in many ways:
● Brought industry experts on board to mentor our learners,
and give them exposure to what the life of a
Data Scientist entails
● Built our curriculum with what the industry needs firmly
in mind, and often in collaboration with our hiring
partners
● Coached thousands of learners during their job hunting
phase, through mock interviews, professional profile
preparation, and much more
Who this book is for
If you are interested in actionable insights on how to get a job as
a Data Scientist, but don’t know where to start, this book is for
you.
All rights reserved
Table of contents
Chapter 1: Things to know before you begin
Overview
Learning outcome
What you should know?
Business understanding
Technical understanding
Skill improvement
Communication skills
What to expect ahead
Summary
Chapter 2: Relevant roles in the Data Science industry
Overview
Learning outcome
Roles in the Data Science industry
What employers expect apart from your technical skills
What you should be doing?
Stretch goals
Chapter 3: Who is hiring for Data Science roles?
Overview
Learning outcome
Industries hiring for Data Science roles
Major hirers
Minor hirers
Conclusion
References and further reading
Chapter 4: Day in the life of a Data Scientist
Overview
Learning outcome
Let’s read more about Shantanu
Additional reading
Chapter 5: How to build a portfolio project and blog about it
Overview
Learning outcome
Portfolio projects
How to choose your portfolio project
Ideal qualities of a good portfolio project
How to showcase your portfolio projects to the world
References
Chapter 6: Data Science must-have skills: what is the industry looking for
Overview
Learning outcome
What is the industry looking for?
Key pointers to keep in mind
Additional resources
About GreyAtom
Chapter 1: Things to know before you begin
Overview
Data Science is often mistaken for a kind of magic, and Data
Scientists for magicians who instantly resolve major business
problems with their ninja data knowledge and skills.
Obviously, this is far from the truth, and it is important to set
the right expectations from the start.
Data Science is not magic. The power of Data Science comes from a
combination of three skill sets: deep understanding of statistics
and algorithms; programming and hacking; and communication. More
importantly, it is vital to apply these skills in a disciplined
and systematic way.
Additionally, one has to be able to multitask effectively to be a
good Data Scientist.
Expectations and stakes are high right from the beginning and,
after a while, they can start to seem overwhelming, especially
if you were not aware of them when starting the journey.
In this section, let’s talk about these aspects. As they say,
forewarned is forearmed.
Learning outcome
● What are the problems that a working Data Scientist has
to face?
● How to overcome the problems faced by a Data
Scientist?
● What are the skills that matter the most?
What you should know?
Here are a few things you should know before you embark on a
journey in Data Science.
Business understanding
You are at the point where you are about to start your Data
Science journey. At this stage, most Data Science aspirants
think that, either by being a super coder, or having a deep
understanding of mathematics, or by doing both, they will
become industry-ready and will make a major impact in
whatever business domain they are working in. Sure, these skills
will definitely be helpful, but are they sufficient?
NO! At the end of the day, you are being hired to help solve a
business problem so that the organization benefits. So one of
the most important skills that recruiters are looking for is a
business understanding of Data Science.
Once you have grasped the importance of business
understanding, the next step is to understand that your role will
be solving business problems using Data Science.
Technical understanding
What are the top technical skills you think are necessary for you
to be a better Data Scientist? If you google this, you will find a
long list of suggestions: Python, SQL, R, and so on. These are not
skills, but tools that the industry currently uses and that happen
to be trending. New and better tools may well replace them over
time. A tool can only be leveraged effectively if you understand
what it is used for. Therefore the target skill is not the tool
itself, but the technical understanding of the need that the tool
serves.
Skill improvement
There is a reason the industry faces a shortage of capable Data
Scientists, in spite of the field having a huge number of
aspirants. Let's face it: it is a challenging field, as it lies at
the intersection of two of the most dreaded technical subjects:
'Programming' and 'Mathematics'.
Aspirants start their journey with high hopes, as the internet
makes it sound as if almost anyone can become a Data Scientist
in a day or so. Obviously, this is not true; it takes time and
continuous effort to become one. The trick is not to give up and
to keep concentrating on improving your skills.
Communication skills
Communication is a crucial, often underappreciated, soft skill for
a Data Scientist. You will have to communicate effectively with
your own team members, members of other teams, and stakeholders of
the company. Everyone listens to you carefully, as you are expected
to give not just information, but business insights that have been
discovered through analysis. Communication is one of the most
important skills that you have to develop to be a successful Data
Scientist.
What to expect ahead
Truth be told, there are a lot of challenges that one needs to sail
through in this journey. Becoming aware of the challenges will
help you be prepared for them. Your biggest challenge will likely
be learning programming and mathematics. Cultivating the right
mindset here will go a long way in smoothing the journey.
Here are some great resources on how to acquire these skills,
and cultivate the mindset needed while learning them.
● Things to know while learning programming
● Learning maths for Machine Learning
● Things to know before you start learning Data Science
Summary
Having a good grasp of Mathematics and Programming will
help you become a good Data Scientist, but there are other,
often ignored, aspects that will differentiate you from the horde
of Data Science enthusiasts.
Chapter 2: Relevant roles in the Data Science industry
Overview
As you are on your journey to transition and upskill, the very first
step should be to have a good understanding of what it is you
are getting into in concrete terms.
For instance, which roles and what kind of work will you end up
doing? Data Science is still a fairly new field and companies are
still figuring out the right team structure and defining roles, so
as to be able to make informed decisions using data.
Designations and roles will vary in different companies, so the
focus should be the responsibilities, as opposed to dwelling on
the designation.
Technology keeps changing, and as a result companies now have
massive amounts of data at their disposal to help them make better
business decisions. But to make good choices with all of that
data, they need workers who are skilled in Data Science.
Businesses rely on Data Science professionals to make critical
decisions about creating products, expanding into other markets,
and even acquiring other companies.
Remember: No one knows everything! If you feel confident in even
one skill, you are ready to begin your job search journey.
Let's review some of the typical roles involved in the end-to-end
delivery of a Data Science project, along with their respective
responsibilities.
Learning outcome
● What jobs can you get in the Data Science industry?
● What are the various skills which a Data Science
professional possesses?
● What does the employer expect apart from technical
skills?
Roles in the Data Science industry
The different roles you can get in the Data Science industry are:
● Machine Learning engineer
● Data Analyst
● Data Scientist
● Business Analyst
● Data Science Manager
For each role, the key mandate, responsibilities and mindset are:
Machine Learning engineer
● Key mandate: Create and deploy solutions using ML and DL
algorithms to solve various problems.
● Responsibilities: Work closely with Data Scientists to
transform what they wrote as a Jupyter Notebook or a Python
script into software that can be deployed. Design and
implement ML applications to address business challenges,
benchmark infrastructure, and do A/B testing. Work with
product and engineering teams to improve data quality via
tooling, optimization, and testing. Monitor the model
performance and fine-tune the model if required.
● Mindset: Coding ninja with an application mindset towards
problem solving.
Data Analyst
● Key mandate: Acquire, process and summarize different
insights from data.
● Responsibilities: Scrape and query data while bringing it to
a form that is suited for stakeholders. Manage the quality of
data and, if needed, acquire additional data to augment
existing data. Perform extensive EDA on the data and check
the different hypotheses on data. Interpret data properly and
effectively communicate the findings through visualizations.
● Mindset: Full-fledged data junkie with lightning-fast ability
to summarize insights in data.
Data Scientist
● Key mandate: Solve critical business problems using data to
propose solutions for effective decision making.
● Responsibilities: Understand business problems or
market-required capabilities that need a solution, and
implement an analytics framework to solve them. Acquire,
clean, process and manage data from various sources and break
the overall business problem into manageable chunks. Create
valuable and actionable insights from data by conducting a
predictive and prescriptive analysis of data. Enable
data-driven decision-making by building models and
communicating their findings to the business. Solve multiple
business-relevant questions at every stage of analysis and
modelling.
● Mindset: Data wizard with a relentless drive to find answers
in data.
Business Analyst
● Key mandate: Bridge the gap between business and IT by
providing technology-based solutions to enhance business
processes.
● Responsibilities: Identify business needs and process data
for easy analysis and understanding. Use extensive domain
knowledge to identify key gaps, challenges, and potential
impacts of a solution or strategy. Use storytelling and
effective communication techniques to translate technical or
statistical analysis into business intelligence. Concentrate
on retrospective and descriptive analysis of data and give
business insights.
● Mindset: Integrate business understanding with a sharp eye
for data trends.
Data Science Manager
● Key mandate: Align the team of Data Scientists to long-term
organizational goals and ensure that the goals are achieved.
● Responsibilities: Manage Data Science and analyst teams and
ensure the Data Science projects are aligned to long-term
organizational goals. Execute and manage Data Science
projects end-to-end, and ensure timely deliverables to the
stakeholders. Ask the right Data Science questions that need
answering and ensure that the right expert is mapped to solve
the right problem. Plan and execute the Data Science roadmap
for the organization and keep the leadership in the loop.
● Mindset: Data Science cheerleader with a long-term vision and
goals for the team.
What employers expect apart from your technical skills
While your technical skills comprise your main skill set, there are
some equally important skills that cannot be ignored.
● Communication skills: A sizeable weightage of 30-40% is
given to your communication skills. These are not only
important when it comes to showcasing your skills to
stakeholders, but also when you are communicating with
stakeholders when capturing business requirements.
Imagine capturing the wrong requirement and building
something which was not expected or required. This
leads to not just time and bandwidth loss but monetary
loss as well. So, understanding requirements, asking the
right questions at each stage, and communicating what
you have planned and built is very important.
● Collaboration and working as a team: Whether it is
about solving business problems or building great
models, it is never a single person's contribution. It is
always a team effort. In an interview, you will be
evaluated on whether you are a team player and how
open you are to working with a big team. The projects
that you have built in teams come in handy here. You
can talk about how you divided the roles and
responsibilities between the members, and how you
were able to deliver successfully.
● Ability to learn quickly, openness to new ideas and
flexibility: Once you join an organisation, each day will
bring in new learning. You may end up working on
different projects, different teams, and different
technology. The need of the hour will be to pick up new
things, as and when required, for timely and successful
delivery. How open and flexible you are to learning new
things and taking one for the team really matters.
● Understanding the domain: It is very important to
understand the following factors about any organisation
you interview for:
○ Is it a services company (e.g. TCS, Accenture, Wipro)
or a product company (e.g. Uber, BookmyShow,
Flipkart)?
○ What is the domain (e.g. e-commerce, finance,
retail)?
○ How does the domain work at a high level? You
will definitely learn a lot about the domain once
you start working, but a high-level understanding
is important. Do go through the website of the
company before the interview.
● Awareness about recent innovation/changes in your
technology: Demonstrating your awareness about the
latest trends in the industry and technology can help you
stand apart from the others. This can be a real value-add
and can impress the employer.
What you should be doing?
● Look at the roles and responsibilities instead of job titles
before applying. Even if you have only some of the key skills,
don’t feel overwhelmed. Go ahead and apply.
● Know that, since Data Science is an emerging field, there
is not much clarity with respect to roles. Seek more clarity
on the role during your interview process. Ask specific
questions like: "Please describe a typical workday for this
role."
● Look out for red flags in the job description. If someone is
asking for 6+ years of experience in a 5-year-old field,
then that is definitely not the right company.
Stretch goals
● Read more about Data Science roles
● Read more about the structure of Data Science teams
● How to transition to Data Science
● What skills you require to transition into Data Science
roles
Chapter 3: Who is hiring for Data Science roles?
Overview
Data-driven decision-making is the way forward for many
industries globally and Data Science is one of the key tools for
that decision-making process. Hence it is a good idea to
understand the different industries that are hiring for Data
Science and the general trends in hiring.
Learning outcome
● What are the different industries hiring for Data Science
roles?
● What is the distinction between different types of
companies that are hiring for Data Science roles?
Industries hiring for Data Science roles
Major hirers
● Banking, Financial Services and Insurance (BFSI): These
sectors leverage a lot of data and apply analytics to derive
actionable insights. Read this article on how the banking
industry uses analytics. Another good resource is the
Economic Times article about the trends in the BFSI sector
with respect to analytics. Top companies in the BFSI sector
hiring for Data Science: Barclays, Bharti AXA, HDFC Life,
DSP Blackrock, Aditya Birla Capital, CRED
● Healthcare: Healthcare is a huge sector for Data Science.
There are multiple use cases for Data Science especially
with respect to patient data analytics and hospital
management. Read more about the various use cases of
predictive analytics in healthcare. Top companies in
Healthcare hiring for Data Science: Cerner, Citius Tech,
AllScript, Episource
● E-commerce: Another booming sector for Data Science
adoption, the e-commerce sector has exciting problems
like market basket analysis, inventory management and
price management. These are a few examples that use
Data Science techniques to arrive at a solution. Read more
about the different use cases of Data Science in e-commerce. Top
companies in E-commerce hiring for Data Science:
Amazon, Flipkart, BigBasket, Grofers
● Telecom: The telecom sector definitely needs Data
Science to understand things like customer churn trends
and fraud detection, as well as problems such as the
efficient allocation of bandwidth. Read about the exciting
use cases that can be solved using Data Science in
telecom. Top companies in the telecom sector hiring:
Airtel, Vodafone, Jio
● Media and entertainment: The last few years have seen
the exponential rise of video streaming platforms.
Recommendation engines that suggest what users should
watch next and real-time video analytics are just a few of the
exciting use cases in this industry. Read about the
different use cases of Data Science in media and
entertainment. Top companies in the media industry
that are hiring: Hotstar, Zee5, MX Player
Minor hirers
There are other industries too which are looking for Data
Scientists. Some examples are:
● Pharma: The pharmaceutical industry is booming and
they too are adopting Data Science techniques to inform
their decision processes. Read about some of the exciting
use cases solved in the pharma industry.
● FMCG: Predictive analytics is also entering FMCG
industries and there is definitely quite a bit of hiring
happening there. Read this report on the adoption of
Data Science in the FMCG domain.
Other sectors like Oil and Energy and Heavy Manufacturing
are also adopting Data Science. Of course, internet businesses
in every vertical, such as edtech (online learning), food tech
(restaurant delivery), logistics (managing movement of goods),
recruit tech (hiring driven by data), and online events and
management, are all looking to hire Data Scientists to derive
valuable insights from data and scale their business.
Conclusion
Data Scientists are hired across the board by different-sized
companies. Each type of company has its pros and cons.
Product companies are focused on building Data Science into
the product they are offering. The tech stack and the skills they
are looking for are very specific. On the other hand, service
companies, which provide Data Science as a service, may require
a broader skill set, and an effective communicator to deal with
clients.
An early-stage startup could be exciting in terms of the
challenges involved in setting up the Data Science tech stack,
but the lack of a support community in terms of fellow Data
Scientists could be quite intimidating. Before any ML algorithms
can be applied, the initial days would be just about setting up
basic analytics and providing dashboards to the right stakeholders.
A company on the trajectory of rapid growth would have a Data
Science team already in place. There would be good peer
support, and typically in such organizations, a lot of innovation
takes place. But sometimes, the pace of work at the mid-sized
startup can get overwhelming.
Huge companies that are just starting to adopt Data Science
would have the Data Scientist embedded into the engineering
teams. In such organizations, a lot of emphasis is placed on
security and hence access to data could be difficult.
Read this comprehensive report prepared by the Analytics India
Magazine on the state of Data Science jobs in India.
References and further reading
● Read the article to get an understanding of the Data
Science hiring trends in America
● Read the article to understand how to succeed as a Data
Scientist in small organizations/startups.
● Chapter 2 of the book Build a Career in Data Science.
Chapter 4: Day in the life of a Data Scientist
Overview
In this section, we are going to learn what a Data Scientist’s
day looks like at work. We will also read about the key
performance areas of a Data Scientist, the other professionals
he works with on projects, and the skills he uses. We will be
reading more about Shantanu Kumar.
Learning outcome
● What does a Data Scientist do in his day-to-day work life?
● How does a Data Scientist build his own capabilities and
those of his team members? How does he learn newer
technologies?
● What are the typical behavioural and technical traits
needed in a Data Scientist?
Let’s read more about Shantanu
Passionate about education, AI and all permutations of the two,
Shantanu is an MIT New Ventures Leadership (Class of 2019)
graduate and sits on the Alumni Advisory Board. He co-created
MiMent, a finalist of the program, structured around
personalised learning plans for Indian university students
looking out for jobs and places to apply their skills. He is the
author of the LexScore algorithm and an internationally published
researcher in Natural Language Processing, Domain-Specific
Sentiment Analysis, and Analytics, with equally strong
extracurricular accolades at the national level. He is a 2-time
winner of the world’s largest annual code competition (AI Theme)
and an active contributor to Stanford Scholar, where one of his
papers received the highest votes on the platform.
A global speaker and guest lecturer at premier B-schools in
India, Shantanu is an advisor to multiple AI and Ed-Tech
startups. He is a course instructor and Machine Learning
instructor at GreyAtom and a member of the Association for
Computational Linguistics (ACL). He also loves storyboarding
data, and mentoring and reviewing projects for students
pursuing the Udacity Nanodegree in Data Analysis, Machine
Learning, Business Analytics, and Artificial Intelligence.
Can you tell us in brief what business problems you solve
with the help of Data Science on a day-to-day basis?
Some business problems we're solving using data science:
● Actionable insights for Chief Human Resources Officers
using textual analytics.
● How can we predict attrition in various organisations?
● How can we find areas of disengagement within an
organisation?
What does your typical workday look like? How do you
spend your time across different kinds of activities?
I lead the end-to-end development of all AI-related features that
are embedded into inFeedo’s products. We’ve recently
developed a state-of-the-art NLP engine in the AI/HR space that
boasts domain-specific textual analytics, powered by a
proprietary algorithm called LexScore. LexScore helps drive
effective action within the organisation. This takes up a huge
chunk of my bandwidth in terms of:
● Gathering market intelligence and research and
development of new features
● Ideating with my team to develop GTM strategies
● Data analytics and engineering to find metrics that back
our hypothesis
● Developing an extensive prototype to test out with
customers
● Deploying and tracking success metrics
Which teams and stakeholders do you work with?
Since product ideation is an integral part of our work, working
with the product and design teams along with tech teams for
integrations is something we do.
How do you build your own capabilities and those of your
team? How do you learn newer technologies?
We’re primarily learning by iterations — the more the merrier!
Picking up use cases to solve for and exploring multiple
solutions around it, as we go through each one of them, helps
us learn a lot! On the side, we contribute to open-source
projects and pick up stuff from there. Plus, following
educational and futuristic opinion blogs and tech leaders.
What are the most rewarding/frustrating moments in your
journey as a Data Scientist?
One of the most rewarding moments in Data Science for me
remains how we can add immense value to businesses by
crunching large volumes of data — drawing insights that were
not visible before. What remains frustrating is the number of
solutions we need to explore and check the viability of before
we can say that it is something that truly does “work”.
What according to you are typical behavioural and
technical traits needed in a Data Scientist?
The primary behavioural trait I’ve noticed in successful data
scientists is perseverance and the commitment to keep
exploring and moving forward. Since this role involves trashing
a ton of your own work before you are satisfied with the results,
it does need the effort to throw away things personal to you and
that have involved lots of work, while you start over.
Technically, data scientists should be good with data. Period.
Regardless of what tool they’re using — be it Excel, R, Python or
whatever. Knowing how to present data and why they’re
selecting a particular sample over the other is very important.
Being good with data essentially means having strong domain
knowledge of what data is important and what isn’t, since most
code is openly available these days.
Data Scientists in their job roles are required to understand the
statistical and mathematical models in order to apply them to
the data. They apply their theoretical knowledge in the domains
of statistics and algorithms to find the best way to solve a
certain problem.
Some Data Scientists also fine-tune the statistical and
mathematical models that are applied to data. More broadly, a
Data Scientist is someone who can frame a data question as a
business proposition, solve the business problem, create
predictive models, answer the pressing questions that the
business is facing, and do a little bit of storytelling when
presenting the findings.
A Data Scientist’s job is to analyze data for actionable insights by
doing the following tasks:
● Identifying data analytics problems that offer the greatest
value to the organization
● Getting to know the most appropriate datasets and
variables
● Working with unstructured data like video, images, etc.
● Discovering new solutions and opportunities by analyzing
data
● Collecting large sets of structured and unstructured data
from disparate sources
● Cleaning and validating data to ensure accuracy,
completeness, and uniformity
● Devising and applying models and algorithms for mining
big data
● Analyzing the data to identify patterns and trends
● Communicating findings to stakeholders using
visualization and other means
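The "cleaning and validating" task in the list above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the records, the field names (employee_id, department, tenure_years) and the rule of filling gaps with the median are all made up for the example, and real projects will need rules that fit their own data.

```python
from statistics import median

# Hypothetical raw records; field names and values are invented for illustration.
raw_records = [
    {"employee_id": "E1", "department": " Sales ", "tenure_years": "3"},
    {"employee_id": "E2", "department": "sales", "tenure_years": ""},
    {"employee_id": "E2", "department": "sales", "tenure_years": ""},   # duplicate row
    {"employee_id": "E3", "department": "HR", "tenure_years": "7"},
    {"employee_id": "E4", "department": "hr", "tenure_years": "-2"},    # impossible value
]


def clean_records(records):
    """Return de-duplicated records with uniform text and validated numbers."""
    seen_ids = set()
    cleaned = []
    for record in records:
        # Uniqueness: keep only the first record for each ID.
        if record["employee_id"] in seen_ids:
            continue
        seen_ids.add(record["employee_id"])

        # Uniformity: normalise free-text categories.
        department = record["department"].strip().lower()

        # Accuracy: parse the number and treat impossible values as missing.
        text = record["tenure_years"].strip()
        tenure = float(text) if text else None
        if tenure is not None and tenure < 0:
            tenure = None

        cleaned.append({
            "employee_id": record["employee_id"],
            "department": department,
            "tenure_years": tenure,
        })

    # Completeness: fill missing tenure with the median of the known values
    # (an assumed rule; dropping the rows is an equally valid choice).
    known = [r["tenure_years"] for r in cleaned if r["tenure_years"] is not None]
    fill_value = median(known) if known else None
    for r in cleaned:
        if r["tenure_years"] is None:
            r["tenure_years"] = fill_value
    return cleaned


cleaned_records = clean_records(raw_records)
for row in cleaned_records:
    print(row)
```

Each step maps to one of the three goals named in the list (accuracy, completeness, uniformity), which is often a useful way to explain cleaning decisions to stakeholders.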
Some of the technologies and skills that a Data Scientist works
with:
● Programming skills in Java, Python, R, and SQL
● Reporting and data visualization techniques
● Big Data tools such as Hadoop and its ecosystem
● Communication and interpersonal skills
The day-to-day activities of a Data Scientist can sometimes be
predictable, and sometimes not. There are many requirements
for becoming a Data Scientist. If you are interested in becoming
one, you should have the skills to crunch data, make new
inferences, look at the same problem from a different angle,
and so on.
Additional reading
● What does a Data Scientist do on a daily basis? Top 5
Quora answers
Chapter 5: How to build a portfolio project and blog about it
Overview
Once you have acquired core Data Science skills, the next step is
to show your skills to the community (and the potential
recruiters, of course). The best way to do this is to build a
portfolio project where you showcase most of the Data Science
skills that you have.
A good portfolio project is extremely important since it is public
and demonstrated evidence of your Data Science skills and can
set you apart from other job seekers. The stronger your portfolio,
the better your chances of getting hired.
Let's deep dive and understand how to build an effective
portfolio project that can improve your chances of getting hired.
Learning outcome
● How to choose and build your portfolio project
● What are the ideal qualities of a portfolio project
● How to showcase your portfolio projects to the world (and
potential recruiters)
Portfolio projects
How to choose your portfolio project
First, let's answer the question: which projects should you never
use as portfolio projects? NEVER include trivial learning
projects, like the ones on Titanic, MNIST, Iris, etc. Since
these are simple problems to solve, there are not many ways for
you to distinguish your solution from those of others. Instead of
boosting your profile, they would be detrimental to it.
Another way to narrow down the list of portfolio projects is to
choose the domains in which you are interested to work, or a
domain where you are an expert. For example, if you are an
insurance person looking to switch to Data Science, a FinTech
project would be right up your alley. If you do not have a
preferential domain, then pick 1 or 2 domains and get started.
Don't overthink it.
Build a Career in Data Science has very good advice on
choosing your portfolio projects, which we are summarizing
here. There could be two possible workflows for choosing your
portfolio project:
● Start with an interesting question and find the data to
answer the question.
○ Don't wait for the perfect question or inspiration.
Start somewhere but start soon!
○ Improve as you go along, don't wait for the perfect
resources.
○ Explore open APIs first to see if you can quickly
obtain data and then opt for scraping data.
○ If you are unable to find good data sources for your
problem, then pivot into a different question that
can be solved with the data available to you.
○ This approach is harder than others and therefore
makes your portfolio project stand out. Use it to
further sharpen your skills.
● Start with an interesting dataset and find interesting
questions to answer.
○ Browse through the various open data sources
available: Kaggle, DrivenData, open government
datasets, or Google Dataset Search.
○ Once you find the dataset, think of all the
interesting business questions that you can answer
with the dataset.
○ Try to augment the dataset with data from other
relevant sources. This will make your project stand
out further from others.
If you do two good portfolio projects, one for each of the two
workflows above, it would make a very good showcase for
recruiters. The best portfolio projects come from picking a
problem that personally excites you and applying your
knowledge to solve it.
Ideal qualities of a good portfolio project
Now that you have chosen the project, it is a good idea to keep
in mind the ideal qualities of a portfolio project. Read the blog
on how not to get hired as a Data Scientist. It gives you a good
idea on the importance of the portfolio project. Here are the
ideal qualities of a good portfolio project:
● Data collection: If the projects are made by collecting
the data yourself (completely or partially) through
scraping or APIs instead of readymade clean datasets, it
would set your project apart. Data collection is an
important part of Data Science and this would showcase
that skill to prospective employers. Read this article on
web scraping with Scrapy, one of the ways to build your own
dataset.
● Feature engineering: Showcase your feature
engineering skills by deriving new features from
existing ones or augmenting the data with features from
other sources. This is an important skill that the industry
is looking for.
● Business insights: In every portfolio project, along with
the ML metric you are optimizing for, do keep an eye on
the business problem you are solving, and what business
insights you are recommending as part of the problem
you are solving. In the project documentation, do talk
about the stakeholder you are working with and how the
stakeholder will benefit from the analysis.
● Deployment: Once the project is done, host the project
live using tools like Flask (to build a simple web
interface for your Data Science project) and Heroku (to
host the project). These are definite brownie points that
help you get noticed.
In addition, keep your code well documented and readable by
following the latest code documentation standards.
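As an illustration of the feature engineering point above, here is a minimal sketch of deriving new features from existing ones. The records and field names (order_date, price, quantity) are hypothetical and chosen only for this example; a real project would likely use a library such as pandas on its own dataset.

```python
from datetime import date

# Hypothetical raw records; the field names are illustrative only.
raw_orders = [
    {"order_date": date(2021, 3, 6), "price": 120.0, "quantity": 3},
    {"order_date": date(2021, 3, 8), "price": 45.5, "quantity": 1},
]


def add_features(order):
    """Return a copy of the record with features derived from existing fields."""
    enriched = dict(order)  # copy, so the raw record is left untouched
    # Combine two existing features into a new one.
    enriched["order_value"] = order["price"] * order["quantity"]
    # Extract calendar signals from the date (Monday is 0, Sunday is 6).
    weekday = order["order_date"].weekday()
    enriched["day_of_week"] = weekday
    enriched["is_weekend"] = weekday >= 5
    return enriched


features = [add_features(order) for order in raw_orders]
```

In the project documentation, it helps to explain why each derived feature might matter to the business question, not only how it was computed.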
How to showcase your portfolio projects to the
world
Having done a good portfolio project, it is equally important to
showcase the portfolio project to the world properly, so that the
right audience (read: recruiters) notice your work and your Data
Science skills.
● GitHub: Your portfolio project must be hosted on GitHub
so that this project can be showcased to recruiters. A few
points to remember:
○ Have a well-defined README file as part of the
project. It must contain the basic details of the
project, along with information on how the project
repo is structured.
○ Ensure that your code is well-documented and
follows naming and documentation norms.
○ If you are including a Jupyter notebook, write good
markdown explaining the code as well as the
different visualizations.
○ Here are two examples of excellent portfolio
projects which have been showcased to the world
via GitHub.
■ Market Basket Analysis: The author has
documented every aspect of the solution
neatly in separate files and has a clearly
defined proposal of what he wishes to
achieve in the project.
■ Predicting Jockey Race Wins: This too is a
really well-documented Jupyter notebook,
clearly explaining the steps involved in the
project. More importantly, the goals of the
project are clearly identified.
○ You could use either of the projects as inspiration
to model your own portfolio projects. Remember,
showcasing your project in a neatly documented
manner is as important as solving a good problem.
● Blogging: Having portfolio projects hosted on GitHub is
good, but it is also important to communicate the
workflow and the thought process behind your project in
non-technical terms. Writing a blog improves the
visibility of your work and also helps to showcase the
project to non-technical recruiters.
○ Here is a quick template on how to write your blog
along with the various questions you need to
answer. (This is just a reference and you can write
your blog in a different way if you so choose.)
■ Problem definition
● What is the business problem you are
solving?
● Who is the stakeholder for whom you
are solving?
● What is the impact you wish to create
with your solution?
■ Dataset exploration
● How did you collect the data?
● Basic statistical summary of the data -
size of train/test, size and number of
classes, etc
■ Data cleaning and preprocessing
● Which data cleaning techniques were
used?
● Exploratory visualizations of the
dataset
● Describe the preprocessing
techniques used
● Why were these techniques chosen?
● Which feature engineering
techniques were used?
■ Model
● What type of model was used and
why?
● What were the model's
hyperparameters?
● How was the hyperparameter search
done?
● How were hyperparameters finally
selected?
● Model-specific challenges (for
example, for ANN/Deep Learning,
details about what optimizer was
used, batch size, number of epochs
and values for hyperparameters will
be required)
● What was the model's performance
on various samples/datasets?
■ Model evaluation
● Evaluation metric chosen, and
performance on the training set,
validation set, test set, etc.
● Does the model performance on the
validation set meet pre-specified
criteria using the evaluation metric of
the task? (For example: Is the Kappa
Score at least 0.5?)
● Describe the misclassified examples,
the part of the dataset (for example,
classes) where the model struggled
■ Visualizations
● Visualizations of the model
● Visualizations of feature importance
■ Business decisions
● What are the overall
recommendations to the business
based on your analysis?
○ Ensure that your language is clear and cogent, and
there are no glaring spelling/grammar/factual
errors in the blog. Read this article on the best
practices for blogging about Data Science.
■ Here are two great examples of well-written
blogs:
● The most in-demand tech skills for
Data Scientists: The author has clearly
explained how he collected data and
presented insights and visualizations,
interleaved with explanations.
● Introduction to interactive Time
Series Visualizations: Here, the author
takes a more tutorial-style approach
while explaining the project with
code-snippets in between, along with
explanations and visualizations.
Remember that the idea here is to explain your
project in a visual and intuitive way, so that even
non-techies can understand. This would be a huge
boost in terms of showcasing your communication
skills.
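The "Model evaluation" item in the template above asks whether the model meets a pre-specified criterion, such as a Kappa Score of at least 0.5. As a rough sketch of how that check could be reported in a blog, the snippet below computes Cohen's kappa from scratch on made-up labels. The function name, the sample labels, and the choice to raise an error when kappa is undefined are assumptions made for this example; only the 0.5 threshold comes from the template.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of examples where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if labels were assigned independently,
    # based on how often each class appears in each list.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(
        true_counts[label] * pred_counts[label] for label in true_counts
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is present")
    return (observed - expected) / (1 - expected)


# Made-up validation labels, for illustration only.
y_true = ["cat"] * 4 + ["dog"] * 6
y_pred = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "dog", "dog", "cat"]

KAPPA_THRESHOLD = 0.5  # the pre-specified criterion from the template
kappa = cohen_kappa(y_true, y_pred)
meets_criterion = kappa >= KAPPA_THRESHOLD
print(f"kappa = {kappa:.2f}, meets criterion: {meets_criterion}")
```

Here the observed agreement is 0.8 and the chance agreement is 0.52, so kappa comes out at roughly 0.58 and the criterion is met. Stating the threshold before showing the result makes the evaluation easier for readers to trust.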
References
● Read this great article on how to build a Data Science
portfolio. It contains excellent advice and further
references on the subject.
● Build a Career in Data Science, Chapter 4, talks a lot
about portfolio projects and blogging.
● Build a Data Science portfolio is another excellent article
on building your Data Science portfolio projects.
Chapter 6: Data Science must-have skills: what is the industry looking for
Overview
Today Data Science is at the heart of nearly every business and
organization. The growing need to not only gather data, but sift
through it and analyze it to direct decisions, has prompted a
huge demand for qualified Data Science professionals.
Obviously, strong technical skills are essential. But the question
is: which specific skills does one have to master to set out on
this particular career path? What are the industry and the
recruiter looking for in a candidate?
Learning outcome
In this section, we will deep dive into:
● What is the industry looking for while hiring for a Data
Science role
● What will make you an ideal candidate for your role
What is the industry looking for?
● Understand what a company looks for in the process of
hiring and interviewing candidates for an open Data
Science position
● Complement that by understanding what your options are
when starting a career in Data Science
Key pointers to keep in mind
● One crucial person who can do the job: Any company
that is hiring for a Data Science position will look for a
person who can do the job, and not the person who gets
the most interview questions right, or who has the most
degrees or years of experience. Companies want to hire
someone who will do the job and help the team achieve
its business goals.
● Have the necessary skills: Necessary skills will not only
cover technical skills but also necessary non-technical
skills.
On the technical side, you need to have some
combination of math and statistics, as well as databases
and programming. On the nontechnical side, you need
general business understanding, as well as skills such as
project management, people management, visual
design, and any number of other skills that are relevant to
the role.
● Ability to understand and judge business problems:
Many Data Science aspirants may feel that just the
knowledge of technical aspects and building algorithms
is enough to be successful. But this is far removed from
the truth.
For a Data Science professional to be successful, it is not
just the knowledge of a technique that matters but also the
ability to understand which algorithm or technique is the right
one to solve a particular business problem. For this, you must
be able to showcase your logical abilities and reasoning skills
during the recruitment process. During the selection
process, you should be able to apply your theoretical
knowledge to real-life situations.
● Data visualization and presentation: Sometimes, a good
data presentation is as important as an effective
algorithm. Data Visualization is an important skill
because it helps to identify patterns, correlations and
trends which are not easily seen in text-based data. Not
just this, Data Visualization also helps to simplify complex
problems through images, charts and graphs. Therefore,
recruiters look for professionals with a good grasp of Data
Visualization tools.
● Showcase your skills in the best possible way: Whether
the recruitment process is full of innovative rounds, a live
project or a standardised interview, the objective is to
select a Data Science professional who can display
exceptional skills. Recruiters use these skills to gauge
how complex a project the candidate can
handle in the future. As a potential Data Science
candidate, you need to demonstrate your knowledge
and skills in the most effective yet subtle way.
● Be someone who gets things done: Having the right
skills is of no use unless you know where and how to use
them! You need to be able to find solutions to problems
on the job and implement those solutions. Data Science
has lots of places where a person can get stuck; such as
figuring out messy data, thinking through the problem,
trying different models, and tidying up a result. A person
who can overcome each of those challenges will be much
better at doing the job than someone who sits around
waiting for help without asking for it.
● Be a team player: If you say something offensive, act
defensive, or have character traits that would make
interacting or collaborating with you difficult, a company
won't want to hire you. During the interview, aim to be
agreeable, compassionate, and positive. In short, the
people on your future
team need to see you as being someone they want to
work with.
● Strong communication skills: In addition to strong
technical and quantitative skills, a candidate for Data
Science jobs must possess strong communication skills.
Your ability to communicate should be strong enough to
influence the decision-makers. The recruiter may design
an employment test to judge whether the potential
candidate is a good listener, understands business needs
and articulates a point of view well.
Before you appear for an interview, make sure that you are
confident about your skills, have a good understanding of
the business problem that the company is solving, are well
prepared, and have read up about the company and its
people.
Read a case example about the hiring process for Data
Scientists at IBM.
Additional resources
● What recruiters and hiring managers are looking for in a
Data Scientist
● How to get a hiring manager to take you seriously
● What it takes to grab a Data Science job
About GreyAtom
GreyAtom is a bootcamp-style immersive learning programme
for emerging tech, in the fields of data science, machine
learning, artificial intelligence, and more. Aspirants learn
technologies by working on real problem statements, and
datasets from GreyAtom’s industry partners. The learning
platform, GLabs, orchestrates an end-to-end learner experience,
where people code, follow a structured learning path, and get
an instant assessment of their performance.
GreyAtom is on a quest to transform the career trajectories for
learners in the emerging tech space.
● 3+ years of boot camp-style programs
● 175+ industry partners
● 140,000+ people upskilled
● 1,200+ careers transformed
● 250+ world-class mentors and curriculum
● Winner of YourStory SheSparks EdTech startup of the
year 2018
● Finalist at TechEdXEurope 2018
Our mission
● Change lives with meaningful career transitions
● Engage with best-in-class industry practitioners
● Highly relevant and advanced curriculum and programs
Our core learning philosophy
We founded GreyAtom to bridge the gap between the skills
learners have and what the industry needs by bringing them
together on our learning platform, GLabs, which provides
hands-on coding experience, mentorship, and access to a huge
peer community.
Our programs
We have Data Science programs based on your learning goals.
Whether you want to learn Data Science, transition horizontally
into Data Science, or to launch a career in Data Science — we
have the program for you.
Each program comes with live mentor sessions, portfolio
projects, and hackathons, designed to make sure you get a solid
foundation in concepts and develop strong application skills.
Take a definitive step towards a career in Data Science, learn
more about our programs.