Data Science from Scratch: First Principles with Python

4.4 out of 5 stars (775)

To really learn data science, you should not only master the tools―data science libraries, frameworks, modules, and toolkits―but also understand the ideas and principles underlying them. Updated for Python 3.6, this second edition of Data Science from Scratch shows you how these tools and algorithms work by implementing them from scratch.

If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing, this updated book shows you how to find the gems in today’s messy glut of data.

  • Get a crash course in Python
  • Learn the basics of linear algebra, statistics, and probability―and how and when they’re used in data science
  • Collect, explore, clean, munge, and manipulate data
  • Dive into the fundamentals of machine learning
  • Implement models such as k-nearest neighbors, Naïve Bayes, linear and logistic regression, decision trees, neural networks, and clustering
  • Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Editorial Reviews

About the Author

Joel Grus is a research engineer at the Allen Institute for Artificial Intelligence. Previously he worked as a software engineer at Google and a data scientist at several startups. He lives in Seattle, where he regularly attends data science happy hours.

Product details

  • Publisher: O'Reilly Media
  • Publication date: June 11, 2019
  • Edition: 2nd
  • Language: English
  • Print length: 403 pages
  • ISBN-10: 1492041130
  • ISBN-13: 978-1492041139
  • Item weight: 1.57 pounds
  • Dimensions: 6.9 x 0.9 x 9.1 inches
  • Best Sellers Rank: #73,679 in Books
  • Customer reviews: 4.4 out of 5 stars (775)

About the author

Joel Grus

Joel Grus is Principal Engineer at Capital Group, where he leads a small team that designs and implements machine learning and data products. Before that he was a software engineer at the Allen Institute for AI and Google, and a data scientist at a variety of startups.

He's the author of the beloved "Data Science from Scratch", the quirky "Ten Essays on Fizz Buzz", and the polarizing JupyterCon talk "I Don't Like Notebooks".

He lives in Seattle, where he regularly attends data science happy hours. He blogs infrequently at joelgrus.com.

Customer reviews

4.4 out of 5 stars
775 global ratings

Customers say

Customers find this data science book excellent for beginners, with one review noting it covers a range of topics. Moreover, the content receives positive feedback for its clarity, with one customer describing it as a concise introduction to the field. However, customers disagree on the readability, with some finding it frustrating to read. Additionally, the code examples receive mixed reactions, with one customer appreciating the type annotations while another finds them unhelpful.
AI Generated from the text of customer reviews

40 customers mention content, 28 positive, 12 negative
Customers praise the book's content, describing it as a great fundamental text that is excellent for anyone interested in data science, particularly beginners.
...Above all, it is a good book if used as an index on where to start to understand Data Science, but it definitely doesn't fulfill the promise of...
...Great book!
This book is not good, it should be first principles of data science and not programming
It's a great book for beginners. Everything is explained short and sweet.
10 customers mention clarity, 9 positive, 1 negative
Customers find the book's explanations clear, with one customer noting it provides a concise introduction to data science, while another mentions it is very straight to the point.
...It covers a broad range of topics, has very clear explanations, and examples.
The content is very good and the examples are very clear, but once you see the code, your mind aches!...
...and the way he writes, sometimes it's funny, but it's also very straight to the point and dry when it comes to a "quick overview" on concepts and...
...It surveys the material broadly and in detail. It really helps the non-Python coders.
8 customers mention code, 4 positive, 4 negative
Customers have mixed opinions about the code in the book, with some finding it clear and well-structured, while others find it too technical and difficult to follow.
...It described the methods, code, and math in an accessible way. It spent 5-10 pages per technique (and included a primer on Python and stats)....
...This one is by far the worst of them all. The code examples are not helpful and the explanations of the algorithms are (often) not helpful....
...While I can appreciate the minimalist approach with more code than wordy explanation, I actually felt more lost and discouraged the more I read....
...for someone in a discipline applying data science, as it is too much technical coding....
6 customers mention monochrome, 2 positive, 4 negative
Customers have mixed opinions about the monochrome format of the book, with one customer noting that color is important for understanding the material.
...Please keep in mind that the book is monochrome. If that bothers you, consider viewing the electronic version....
...Color is important to understanding the material....
It's frustrating to see that the print is black & white but not colorful....
Bought book today to realize that it's all grey scaled and not even a single colored page. Besides, print quality is poor and text reads blurry....
6 customers mention readability, 2 positive, 4 negative
Customers have mixed opinions about the book's readability, with some finding it frustrating to read.
...It is frustrating to read and follow the author implement mathematical formulas without explaining why....
...Excellent use is made of clear, concise verbiage to make things "black and white"....
...It is hard to read and the examples don't have impact.
...code than wordy explanation, I actually felt more lost and discouraged the more I read....

Top reviews from the United States

  • 5 out of 5 stars
    The BEST book for learning how many data science functions work under the hood - START HERE!
    Reviewed in the United States on April 17, 2023

    Did you see something on the news about ChatGPT, Stable Diffusion, or some other big development that made you want to look into machine learning?

    Maybe you truly plan on entering data science as a field but don't know where to start?

    Or perhaps you've seen one of the author's brilliant/hilarious talks about why he doesn't like Jupyter Notebooks or how to answer the infamous "FizzBuzz" programming interview question using TensorFlow neural networks (seriously, look up Joel Grus on YouTube).

    If you know a little bit of Python, a little bit of relevant math, and want to go into any data science or machine learning path, then this book is a must-have. It certainly won't be the only resource you'll need, but it helps you get the most out of other content you'll likely look into later (like how to code up a machine learning pipeline, or maybe a large language model if you're really adventurous).

    Far too many machine learning lessons out there just tell you to import certain Python libraries (scikit-learn for example) and start using them without giving you any basic understanding of how those imported functions even work to begin with. Even to this day there are still college courses and coding bootcamps that ask you to download a Jupyter Notebook file and just hit "Shift + Enter" and look at the output.

    You're not going to learn how to code that way!!!

    Joel Grus does an excellent job of filling in this gap by teaching you more Python than what a statistics professional would usually know and more math than what a typical software developer would know. And that's key if you want to go into a field that relies on both.

    All the information for Python and math that you need to get started is here. It's 27 chapters that get you familiar with Python and how to use it, as well as the math used in data science and ML (linear algebra, probability and statistics, algorithms, etc.).

    You eventually learn enough of both as you go through the chapters to start applying what you learn for some real-world usage.

    I've had this book for years and it's still as useful as when it first came out. The only exception I've seen is the Twitter API tutorial, which no longer applies now that Twitter charges for access to that feature. The tutorial is still good for learning how APIs get put to use.

    Once you've read this book and have gotten familiar with all it has to offer, your next step will probably involve looking into a book about how to actually use pre-built data science libraries (like what you find in the Anaconda distribution of Python).

    This book may turn out to be heavily responsible for my first startup, but that's a story for later.

    22 people found this helpful
  • 5 out of 5 stars
    Amazing introduction to Data Science
    Reviewed in the United States on May 15, 2020

    Let me start this review by explaining clearly who this book is for: anyone who has had some form of introduction (even if concise) to programming in Python, algebra, statistics, and probability will find this book a great introduction to Data Science. While the author does a great job at having a crash course on these topics (and I even learned a thing or two here and there), I can see the contents being a bit overwhelming if this is your first point of contact with these subjects. However, should you meet the requirements I mentioned above, you'll find this book a breeze! Joel does a good job at explaining the topics using his signature brand of humor, keeping the read entertaining even in the most advanced areas. I'd even say that this is a must read if you are considering going into machine learning, since it teaches you a thing or two in the topic as well. Please keep in mind that the book is monochrome. If that bothers you, consider viewing the electronic version.

    TLDR: If you're looking for a concise introduction to data science and have a bit of knowledge of basic Python, algebra, statistics and probability, look no further than this book! Otherwise, come back once you've picked up those tools and you'll feel right at home :)

    30 people found this helpful
  • 4 out of 5 stars
    Good book for starters on AI/ML
    Reviewed in the United States on December 26, 2020

    Good book for someone starting to learn the basics of AI/ML

    2 people found this helpful
  • 5 out of 5 stars
    Very very good book!
    Reviewed in the United States on March 17, 2020

    This book is suitable for people with basic Python programming skills. It is very good for beginners and advanced users alike. The code is very clear and without errors. This book teaches you the basics and introduces some expert-level topics for you to explore further if you are keen. If you are a novice data analyst and some harder topics throw you off, you should probably revisit them after you have gained more knowledge of data science.

    I highly recommend this as your first data science book because the code and thought processes are very clear. 70-80% of the book covers data science foundations and basics that prepare you to tackle harder topics later.

    8 people found this helpful
  • 5 out of 5 stars
    Amazing book on Data Science
    Reviewed in the United States on May 6, 2025

    Great book when you want to get into the field of Data Science

    One person found this helpful
  • 3 out of 5 stars
    Great book about the how of Data Science, but not the why.
    Reviewed in the United States on June 30, 2020

    In my personal opinion, this is a book to bridge the gap between an experienced mathematician/statistician and practical machine learning. The book is more focused on describing how to implement mathematical formulas in Python than on actually explaining the math behind them.

    I am a proficient Python engineer and I can read the code and understand what is being done, but the author makes no effort to explain how he reached a given conclusion, or why it matters. The author does implement the mathematical formulas in Python skillfully, but that misses the point of the book.

    Another big problem with this book is that it assumes you can learn mathematics by just doing mathematics without understanding the why. It is frustrating to read and follow the author implement mathematical formulas without explaining why. I believe this is the case because the author DOES assume you have the required math to follow the book. I believe the author should add a section in the preface that lists the prerequisites for this book:

    - Linear Algebra
    - Statistics
    - Probability
    - Vector Calculus
    - Continuous Optimization

    Above all, it is a good book if used as an index on where to start to understand Data Science, but it definitely doesn't fulfill the promise of being "from scratch". From scratch IMHO means you dive into the internals of Data Science algorithms. I had the expectation that this book was going to be more like "Designing Data-Intensive Applications" for Data Science where the "why" is as important as the "how". Data Science from scratch is a book about the how, with no effort to dive into the why. The book does provide the vocabulary for me to discuss Data Science with practitioners, but I didn't feel it got me any closer to becoming a practitioner myself.

    BTW, the fact that the book is monochrome doesn't matter; the font and figures are very clear and readable.

    73 people found this helpful
  • 5 out of 5 stars
    Great book, really means from scratch
    Reviewed in the United States on July 28, 2021

    This is a great book. Doing everything from scratch and not just using numpy, sklearn, etc is a great way to learn what's really going on underneath. I'm surprised how far he gets along this path. By the end, you will have implemented a keras-like deep learning setup. It won't be fast enough for production use since it's all using Lists underneath, but you'll be able to see how it all fits together. Also, coming from a more typed language background, I loved the type annotations.

    7 people found this helpful
  • 5 out of 5 stars
    Good Coverage of the "Bare Metal" of basic Data Science
    Reviewed in the United States on August 18, 2019

    If you need a good broad brush to learn from, the second revision (in monochrome) is the book for you!

    Yes, there is numpy, pandas, and a host of other packages and frameworks available to perform many of the examples of what is explained in the book. But you need to broaden your knowledge with this material that touches the "bare metal" of Data Science.

    Excellent use is made of clear, concise verbiage to make things "black and white". (Save the color images and other crutches for the boardroom stakeholders!)

    32 people found this helpful

Top reviews from other countries

  • 5 out of 5 stars
    Good product, arrived in good condition; it just isn't very shiny
    Reviewed in Mexico on December 13, 2025

    Good product, arrived in good condition

  • 5 out of 5 stars
    Highly recommended
    Reviewed in Germany on April 6, 2026

    A must-read in this era.

  • 5 out of 5 stars
    Start with this book right now!
    Reviewed in Canada on August 13, 2019

    Joel's method of explaining is both entertaining and very useful

  • 2 out of 5 stars
    Not bad, but not good either
    Reviewed in Italy on October 29, 2021

    The book is useful for grasping the basic concepts behind data science. However, it gets pretty messy as the topics become more complex, especially when the Python code is shown without much explanation. If you need a book to learn Python for data science, there are many other alternatives.

  • 5 out of 5 stars
    Very good ground up approach to the subject
    Reviewed in the United Kingdom on January 25, 2020

    It’s definitely from the ground up - I found it useful to revisit the maths as well as seeing the code - well worth the price of the book
