Practical Statistics for Data Scientists: 50 Essential Concepts
Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you’ll learn:
- Why exploratory data analysis is a key preliminary step in data science
- How random sampling can reduce bias and yield a higher quality dataset, even with big data
- How the principles of experimental design yield definitive answers to questions
- How to use regression to estimate outcomes and detect anomalies
- Key classification techniques for predicting which categories a record belongs to
- Statistical machine learning methods that "learn" from data
- Unsupervised learning methods for extracting meaning from unlabeled data
- ISBN-10: 1491952962
- ISBN-13: 978-1491952962
- Edition: 1st
- Publisher: O'Reilly Media
- Publication date: June 27, 2017
- Language: English
- Dimensions: 6.75 x 0.5 x 9 inches
- Print length: 315 pages
From the Publisher
About this Book
Data science is a fusion of multiple disciplines, including statistics, computer science, information technology and domain-specific fields. As a result, several different terms could be used to reference a given concept. Key terms and their synonyms will be highlighted throughout the book in a sidebar within the text.
This book is aimed at the data scientist with some familiarity with the R programming language, and with some prior (perhaps spotty or ephemeral) exposure to statistics. Both of us came to the world of data science from the world of statistics, and have some appreciation of the contribution that statistics can make to the art of data science. At the same time, we are well aware of the limitations of traditional statistics instruction: statistics as a discipline is a century and a half old, and most statistics textbooks and courses are laden with the momentum and inertia worthy of an ocean liner.
Two goals underlie this book:
- To lay out, in digestible, navigable and easily referenced form, key concepts from statistics that are relevant to data science.
- To explain which concepts are important and useful from a data science perspective, which are less so, and why.
Editorial Reviews
About the Author
Andrew Bruce has over 30 years of experience in statistics and data science in academia, government and business. He has a Ph.D. in statistics from the University of Washington and has published numerous papers in refereed journals. He has developed statistics-based solutions to a wide range of problems faced by a variety of industries, from established financial firms to internet startups, and offers a deep understanding of the practice of data science.
Product details
- Publisher : O'Reilly Media
- Publication date : June 27, 2017
- Edition : 1st
- Language : English
- Print length : 315 pages
- ISBN-10 : 1491952962
- ISBN-13 : 978-1491952962
- Item Weight : 1.12 pounds
- Dimensions : 6.75 x 0.5 x 9 inches
- Best Sellers Rank: #271,202 in Books (See Top 100 in Books)
- #13 in Database Storage & Design
- #91 in Data Processing
- #1,041 in Computer Programming (Books)
About the authors

Peter Bruce is the Founder of the Institute for Statistics Education, a privately owned online educational institution. Since its creation in 2002, the Institute has specialized in introductory and graduate-level online education in statistics, machine learning, data science, optimization, and other subjects in quantitative analytics.
Prior to founding the Institute, in partnership with the noted economist Julian Simon, Peter continued and commercialized the development of Simon's Resampling Stats, a tool for bootstrapping and resampling. In his work at Cytel Software Corp., he developed Box Sampler along similar lines, and helped bring XLMiner, a machine learning add-in for Excel, to market. He has authored a number of journal articles in the area of resampling, and is a co-author of "Practical Statistics for Data Scientists" (O'Reilly, multiple editions and translations) and of "Machine Learning for Business Analytics" (Wiley, multiple editions and translations). He is also the author of "Introductory Statistics and Analytics" (Wiley, 2014). Early in his career, he co-authored (with D. Traynham) a noted review of airline deregulation in the National Review (May 1980).
Prior to his retirement in 2024, Peter's role at the Institute centered on course development and faculty recruitment - there are over 60 faculty members from around the world who are published experts in their fields; most teach from their own texts. He also teaches a course on resampling methods.
Peter has degrees in Russian from Princeton and Harvard, and an MBA from the University of Maryland; he is an autodidact in the area of statistics. Prior to his work in statistics, Peter worked in the US diplomatic corps as a Foreign Service Officer.
Customer reviews
Top reviews from the United States
- 5 out of 5 stars
A good start for those who are iffy with stats but don't want to dive too deep yet.
Reviewed in the United States on June 20, 2017
There's always that one person who is unsatisfied, but it sure as hell isn't me, because I knew what this book was going to be like the moment I saw how many pages it was going to have & how the early release version looked. I still preordered a hard copy (for sharing) & a digital copy (for carrying), because I knew this was the kind of book I was looking for & then some.
The concepts are not explained in astronomical depth, but with just enough depth that I can explain each of them to other people. What really stands out for me so far is that after nearly every concept there is a section labeled further reading (well, in the digital copy), which is usually found only at the end of a book. I found myself realizing I have a lot of those books, so the authors really know where to look & how to guide those who want more depth.
Yeah yeah yeah, the code is missing (as of mid-June 2017), but if you really understand / know which packages to use, you won't need the code. The first half of the book is two- or three-liners of code anyway; it's the explanations that matter the most. The second half of the book is the good part, which separates a white hat statistician from a grey hat data scientist, which is exactly what I wanted in a <300 page book.
Thanks for keeping me waiting since November though, thought it would never come! The O'Reilly books always keep me in awe at how they always know what topic I want a brief book on (probably data collecting on me :P) & simultaneously leave me in suspense because I never notice I am preordering the books! Sigh. My only request is to be able to preorder the Kindle editions rather than the physical editions; my data science book cubby is starting to overwhelm my statistics cubby (NOT FOR LONG MASTERS PROGRAM ~).
27 people found this helpful
- 5 out of 5 stars
Excellent, straightforward review of key concepts
Reviewed in the United States on September 7, 2018
This book gets right to the point, and explains relevant statistical concepts in straightforward, easy to follow language. It is exactly what I was looking for. I was glad that the authors started at the literal beginning, but the pace and efficiency of writing allow the reader to come up to speed quickly. Well done.
- 4 out of 5 stars
A modern and very readable book that nicely explains high-level concepts.
Reviewed in the United States on November 13, 2018
First of all, this book is not for you if you want a deep and thorough explanation of statistical concepts. It serves a completely different purpose: to familiarize a reader with high-level concepts; to enable them to continue their statistics education elsewhere.
I found this book a very engaging read: it sets itself apart from other books on statistics in clearly telling which concepts are not-so-relevant for the modern computerized explorative analysis toolset. Many concepts that are presented in classic books on the subject are rooted in the 1920s and 1930s, when computing power wasn't available and researchers resorted to various pre-calculated distributions and formulas to do their work. A modern data scientist's approach would eschew some of the old ways and instead rely on randomization, resampling and computing power.
This book not only tells what something is, but also why it is that way and if a concept is still relevant today.
I can recommend this book if your statistics knowledge is spotty or ephemeral; it serves its purpose well and doesn't bog down the reader with (sometimes) unnecessary mathematical concepts to demonstrate an idea.
Why the four stars:
1. Lack of examples in programming languages.
2. Complete lack of exercises (at least 1-2 exercises are necessary).
3. All scarce examples that are available are in R. No Python. :(
26 people found this helpful
- 5 out of 5 stars
Interesting info, no practical applications without datasets
Reviewed in the United States on June 11, 2017
Information seems plainly written and relevant. No link to datasets makes the "practical" code portion of the book unusable. Will happily update my review when the datasets are released.
EDIT:
OK, the datasets are up. There is a short R script to run to download the data; it requires some small modifications to work correctly.
You need to create a folder named "data", and I changed the second line in the script from:
PSDS_PATH <- file.path('~', 'statistics-for-data-scientists')
to this:
PSDS_PATH <- file.path('.')
This will download the data into a folder named "data" in whatever directory you run the script from. The script runs with no real feedback and some of the data sets are large, so just be patient. Once these were downloaded, the examples in the book run great.
15 people found this helpful
- 5 out of 5 stars
I love this book as a reference
Reviewed in the United States on January 5, 2018
I love this book as a reference. Clear, efficient but detailed explanations. It is not designed as a textbook but as a reference. When I wonder "what is that test used for again?" or "what was that formula?" this is the first thing I reach for. Sure, Google has become universal for that too, but I like having a single hard copy reference that I can get to know and that becomes a trustworthy old friend. This book is taking on that role for me.
19 people found this helpful
- 5 out of 5 stars
Excellent introductory textbook for data scientists (and students)
Reviewed in the United States on July 14, 2017
Excellent introductory text for a comprehensive overview of statistics! The GitHub repository augments the content very well and provides added value for the statistical topics covered in the book. Both of the Bruce brothers are statistical gurus and this fact is evident in the writing, which is both informative and witty. Peter is the president of Statistics.com and is well-versed in providing statistical instruction to students of all ages and levels. He is also a proponent of resampling and one of the developers of the excellent Resampling Stats software package for Excel.
It is true that the textbook does not provide in-depth coverage for all topics, but I don't think that was the intent of the authors. However, the text DOES provide an excellent introduction to topics relevant to students and data scientists. After reading the text and working through the examples, you will be equipped to further your knowledge in whichever topic you require for your data analysis task.
Highly recommended!
21 people found this helpful
- 3 out of 5 stars
Good Topics, Incomplete Explanations
Reviewed in the United States on July 1, 2019
I think this book covers a great and useful set of topics for a data scientist to know. The problem is that the author does a poor job explaining many of the topics clearly. So, I often feel the need to read about every explained concept online or watch a YouTube clip to understand it better. The definitions, explanations, and examples are sometimes good but often rushed. You need this book to know what you need to learn, but then you end up learning those things elsewhere, and not from this book.
- 5 out of 5 stars
Excellent Pocket reference for Aspiring Data Scientists
Reviewed in the United States on March 11, 2019
I bought this book for $13 and it has been a great read. Numerous major concepts required for a data scientist interview have been covered in this book. If you ask me, it's worth every cent spent on it. I gifted a second one to my friend who is in a Data Science program.
Top reviews from other countries
- L.: 2 out of 5 stars
Not what one expects from the title
Reviewed in Germany on August 18, 2019
A superficial, brief presentation of various statistical methods that does not go into much detail on the formulas, where any are given at all. Each topic does contain references to further books and sources, but that makes this book anything but practical. The R code is merely perfunctory and not even cleanly formatted.
In short: the book is no more than a glossary that touches on the methods and hardly compares them with one another. Google will inform you far better.
Above all, this book is aimed at data scientists and people who have already worked with R. Anyone who has done that does not need this book!
Writing style: dry, repetitive, with many forward and backward references.
Therefore, no recommendation!
- Deep Shah: 4 out of 5 stars
A good book to start the journey of data science
Reviewed in the United Kingdom on March 10, 2020
The book covers the basics very well, with a special focus on data science. It also demonstrates the concepts using R code.
- MOISES SILVA: 1 out of 5 stars
Poor print quality
Reviewed in Mexico on March 3, 2020
It looks like a photocopy of another book. It does not look like an imported book.
- G. Horne: 5 out of 5 stars
Statistical Handbook for Students and Practitioners
Reviewed in Canada on March 23, 2018
Practical Statistics for Data Scientists presents all of the statistical analysis techniques that students and practitioners of data analytics and data science would benefit from reading. From school to workplace, this book will earn its place on your bookshelf.
- Sunder Athreya: 5 out of 5 stars
Gives readers a very good perspective of traditional statistics and how data science differs ...
Reviewed in India on September 26, 2017
Well organised and really lucid text! Gives readers a very good perspective of traditional statistics and how data science differs from that. Very learnable. It would have been very useful if they added some problem sets at the end of the chapters.