

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython 2nd Edition

4.6 out of 5 stars (1,820)

Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You’ll learn the latest versions of pandas, NumPy, IPython, and Jupyter in the process.

Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It’s ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.

  • Use the IPython shell and Jupyter notebook for exploratory computing
  • Learn basic and advanced features in NumPy (Numerical Python)
  • Get started with data analysis tools in the pandas library
  • Use flexible tools to load, clean, transform, merge, and reshape data
  • Create informative visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Analyze and manipulate regular and irregular time series data
  • Learn how to solve real-world data analysis problems with thorough, detailed examples
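
As a rough illustration of the "slice, dice, and summarize" workflow mentioned in the list above, here is a minimal sketch of a pandas groupby. The DataFrame, its column names, and the values are hypothetical and are not taken from the book.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "units": [10, 15, 7, 3, 5],
})

# Split the rows by region, then compute the total and the mean of "units"
# within each group.
summary = sales.groupby("region")["units"].agg(["sum", "mean"])
print(summary)
```

The result has one row per region and one column per aggregate, which is the general shape most groupby summaries take.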

From the Publisher

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

What Is This Book About?

This book is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. My goal is to offer a guide to the parts of the Python programming language and its data-oriented library ecosystem and tools that will equip you to become an effective data analyst. While 'data analysis' is in the title of the book, the focus is specifically on Python programming, libraries, and tools as opposed to data analysis methodology. This is the Python programming you need for data analysis.

New for the Second Edition

The first edition of this book was published in 2012, during a time when open source data analysis libraries for Python (such as pandas) were very new and developing rapidly. In this updated and expanded second edition, I have overhauled the chapters to account both for incompatible changes and deprecations as well as new features that have occurred in the last five years.

I’ve also added fresh content to introduce tools that either did not exist in 2012 or had not matured enough to make the first cut. Finally, I have tried to avoid writing about new or cutting-edge open source projects that may not have had a chance to mature. I would like readers of this edition to find that the content is still almost as relevant in 2020 or 2021 as it is in 2017.

The major updates in this second edition include:

  • All code, including the Python tutorial, updated for Python 3.6 (the first edition used Python 2.7)
  • Updated Python install instructions for the Anaconda Python Distribution & other Python packages
  • Updates for the latest versions of the pandas library in 2017
  • A new chapter on some more advanced pandas tools, and some other usage tips
  • A brief introduction to using statsmodels and scikit-learn
  • Material reorganized from the first edition to make the book more accessible to newcomers

Editorial Reviews

About the Author

Wes McKinney is a New York-based software developer and entrepreneur. After finishing his undergraduate degree in mathematics at MIT in 2007, he went on to do quantitative finance work at AQR Capital Management in Greenwich, CT. Frustrated by cumbersome data analysis tools, he learned Python and started building what would later become the pandas project. He's now an active member of the Python data community and is an advocate for the use of Python in data analysis, finance, and statistical computing applications.



Wes was later the co-founder and CEO of DataPad, whose technology assets and team were acquired by Cloudera in 2014. He has since become involved in big data technology, joining the Project Management Committees for the Apache Arrow and Apache Parquet projects in the Apache Software Foundation. In 2016, he joined Two Sigma Investments in New York City, where he continues working to make data analysis faster and easier through open source software.

Product details

  • Publisher: O'Reilly Media
  • Publication date: November 14, 2017
  • Edition: 2nd
  • Language: English
  • Print length: 550 pages
  • ISBN-10: 1491957662
  • ISBN-13: 978-1491957660
  • Item Weight: 1.85 pounds
  • Dimensions: 7.25 x 1 x 9.5 inches
  • Best Sellers Rank: #769,232 in Books
  • Customer Reviews: 4.6 out of 5 stars (1,820)

About the author

Wes McKinney

Since 2007, I have been creating fast, easy-to-use data wrangling and statistical computing tools, mostly in the Python programming language. I am best known for creating the pandas project and writing the book Python for Data Analysis. I am also a contributor to the Apache Arrow, Kudu, and Parquet projects within the Apache Software Foundation. I am currently the CTO and Co-founder of Voltron Data, which builds accelerated computing technologies powered by Apache Arrow. I previously worked for Ursa Labs (within RStudio / Posit), Two Sigma, Cloudera, DataPad, and AQR Capital Management.

Customer reviews

4.6 out of 5 stars
1,820 global ratings

Customers say

Customers find the book excellent for data analysis, particularly as an introduction to Python, and appreciate its comprehensive coverage of complex topics. The writing style receives positive feedback, and customers consider it useful. However, the detailed instructions and readability receive mixed reviews, with several customers finding the explanations poor and the content difficult to follow.
AI Generated from the text of customer reviews


73 customers mention content, 62 positive, 11 negative
Customers find the book comprehensive and instructive, particularly as an introduction to Python and data science, with clear explanations of basics.
  • Great book for someone familiar with other programming language and getting into Python....
  • Excellent book, helps to understand better the logic of Pandas.
  • Great content. Five star content. But, pages started coming off the binding one day after I got this in the mail....
  • ...This is every bit as detailed as I hoped it would be with a great introduction, great examples, and great coverage of fundamental, basic, and...
18 customers mention difficulty level, 13 positive, 5 negative
Customers find the book suitable for beginners, with one mentioning it serves as a great introduction to pandas.
  • This is a great book for starting out. If you have a basic knowledge of Python, I would highly recommend this....
  • ...I find it very easy to learn and it is much easier to set up R and RStudio than it is to set up Python, even though I love Python and Pandas.
  • ...This is not a beginner book, but it's exactly what I needed to learn the details for translating equations to code.
  • This book gave me my first job. And I am still learning it. It is simple, gives a general idea of why functions are designed the way they are, and introduces...
9 customers mention writing style, 7 positive, 2 negative
Customers appreciate the writing style of the book, with one mentioning it helps develop Python scripting skills.
  • Well written by the creator of Pandas. The author's copious use of code snippets to illustrate his points makes the material very usable....
  • Great writing.
  • ...Well written and generally doesn’t get into minutiae. Very useful.
  • Wes is the creator of Pandas but he is not an effective writer. This has left a bad taste of pandas in my mind....
7 customers mention comprehensiveness, 5 positive, 2 negative
Customers appreciate the book's comprehensive approach, covering complex topics and focusing on the pandas Python library.
  • Good, relevant, practical content. Bad book quality. Pages started coming loose after a few days....
  • ...The topics are very well explained and it covers most topics which is essential for analyzing the data using Python.
  • ...The topics are just very random.
  • ...This book primarily focuses on the pandas Python library, which is awesome at processing and organizing data...
7 customers mention usefulness, 5 positive, 2 negative
Customers find the book useful.
  • Useful, especially in terms of getting data into and out of Python from other data sources (databases, spreadsheets, etc.)
  • ...Well written and generally doesn’t get into minutiae. Very useful.
  • ...the capabilities of pandas as well as its strengths, but it wasn't terribly useful in even basic data science workflow and concepts....
  • ...a general idea of why functions are designed the way they are, and introduces some practical functions....
10 customers mention detailed instructions, 5 positive, 5 negative
Customers have mixed opinions about the book's instructions, with some finding them easy to follow while others report poor explanations and note that the book reads more like documentation than instructional material.
  • ...the use of random data throughout the book, I found the examples easy to follow and useful....
  • ...random numbers and this is a poor way of teaching someone as it's too abstract....
  • Excellent step-by-step instructions. Interesting examples.
  • ...Operations/syntax/methodology are presented without a logical hierarchical structure, e.g. when introducing a certain object type, only the trivial...
8 customers mention readability, 3 positive, 5 negative
Customers have mixed opinions about the book's readability, with several finding it difficult to follow.
  • Reads like a manual. Difficult to learn unless you’re looking for something specific.
  • ...Everything is easy to understand. Able to develop your python scripting skills from beginner to pro
  • Hard to follow, problematic explanations, and one needs simply a manual for this book. What a waste of time!
  • ...The small pictures and poor resolution make it difficult to read. The topics are just very random.
Poor quality binding but great content.
3 out of 5 stars
Great content. Five-star content. But pages started coming off the binding one day after I got this in the mail. Loved the first edition and bought this just for the updates, so I can definitely recommend the content. However, I can’t recommend the book itself.

Top reviews from the United States

  • 5 out of 5 stars
    Awesome Book to Gain Practical Data Skills with Python
    Reviewed in the United States on April 5, 2019

    This book has been my foundation for using Python as a data analyst.

    This book primarily focuses on the pandas Python library, which is awesome at processing and organizing data (Python pandas is like MS Excel times 100. This is not an exaggeration). It also introduces the reader to NumPy (lower-level number crunching and arrays), matplotlib (data visualizations), scikit-learn (machine learning), and other useful data science libraries. The book contains other book recommendations for continuing education.

    Although this would be a challenging book for a brand new Python user, I would still recommend it, especially if you are currently doing a lot of work in MS Excel and/or exporting data from databases. I had a few false starts learning Python, and my biggest stumbling block was lack of application in what I was learning. This book puts practical tools in the reader's hands very quickly. I personally don't have time to make goofy games etc. that other books have used as practice examples. Despite other reviews criticizing the use of random data throughout the book, I found the examples easy to follow and useful. I would also argue that learning how to generate random data is useful in itself (thus the purpose of the numpy random library), and that there are practical examples throughout the book. Chapter 14 is devoted to real-world data analysis examples.

    I am almost finished with my second time through the book, this time working through every example. This book has been well worth the hours spent in it. For context, I previously relied on Excel, SQL, and some AutoHotKey. This book has significantly improved how I work.

    Thanks, Wes and team.

    36 people found this helpful
  • 5 out of 5 stars
    A slog, but well worth it
    Reviewed in the United States on November 24, 2021

    I got this book when I was transitioning to doing data science with Python and was struggling to become familiar with standard tools. It's written by the creator of Pandas, and follows the style of the Pandas documentation: dense, telegraphic, peppered with examples.

    It's hard work because Wes McKinney often does not articulate why you would need to do something (assuming you are already knowledgeable on the underlying process), and writes like an impatient person who would rather be doing something else. Additionally, examples often suffer from being both too long and too short: too long in that almost every example is on a toy dataset created from scratch, too short in that most of those datasets have only 5 or 10 elements and do not always showcase complex operations. Other examples (particularly involving time series) have an overabundance of data that makes the critical results hard to spot. Frankly, my first month with Pandas was a miserable one.

    But I give the book 5 stars both because I came to love Pandas as I got more familiar with it, and because while McKinney is not fun to read, he does pack the book with useful information and it is (mostly) well organized. If anything it would benefit from being longer and with a more patient treatment of larger and more concrete datasets (e.g., the Titanic passenger dataset used in the Pandas documentation). The initial chapter on the basics of using Python could go. If you need this book, then you don't want to be trying to learn the rudiments of Python from it. If you can accept that you'll need a lot of bookmarks or margin notes to get through a rather steep learning curve, it will reward your persistence.

    11 people found this helpful
  • 4 out of 5 stars
    Examples could be improved.
    Reviewed in the United States on January 26, 2019

    This book covers all of the basics that you would want to know to get started in programming in Python for data analysis, as the title implies, but it doesn't really offer compelling real-world examples. The data seem to be made up and the analyses don't go into enough detail to help you really learn how pandas and NumPy work. Overall this is a decent starter book, but you will have to bookmark the Python and pandas documentation online if you want to have a reference to all of the functionality those tools have, and there are many places online where you can get better examples to learn from. If you haven't made your mind up about which tool to use for data analysis, I highly recommend checking out dplyr in R, which has an excellent free book online (R for Data Science, Hadley Wickham). I find it very easy to learn and it is much easier to set up R and RStudio than it is to set up Python, even though I love Python and Pandas.

    13 people found this helpful
  • 5 out of 5 stars
    Very simple, well-designed practical book. Recommended for beginners and intermediates
    Reviewed in the United States on December 6, 2017

    This book gave me my first job. And I am still learning it. It is simple, gives a general idea of why functions are designed the way they are, and introduces some practical functions. Because in a real job you always need to look up documentation or google certain functions, I think understanding why Wes designs functions and variables this way, and what he wants to develop in the future, is very important. Anyway, I think this book is for data analysis beginners and some intermediate users. I learned Python first, so I recommend that beginners who want to use Python as a data analyst or scientist learn Python programming first or simultaneously. At least understand lambda and Python expressions; otherwise, you can't feel the full magic.

    30 people found this helpful
  • 5 out of 5 stars
    Practical CS Classics for Data Science Age
    Reviewed in the United States on July 21, 2019

    So far, this book has been an inspiring read. It contains a huge number of code snippets for data cleansing, transformation, analysis, etc. The code is very clean and, for the most part, self-explanatory (at least for a seasoned software developer). Step by step, the book shows the motivations behind the design and functionality of centerpiece Python modules, and you would not expect anything less from the original designer of Pandas. I feel this wonderful book is a natural extension of the ageless practical CS classics by Niklaus Wirth, Kernighan-Ritchie, and B. Stroustrup for the data science age.

  • 5 out of 5 stars
    Great book to master Pandas
    Reviewed in the United States on August 15, 2018

    I was looking for a book that could help me to learn python. I gave this book a try and I realized that the data analysis that I learnt from this book is pretty good from a pandas viewpoint (mostly).

    It does explain the NumPy, matplotlib and seaborn libraries, but most of the time it does so from the pandas perspective.

    Nevertheless, if you want to learn machine learning, NumPy and other libraries, consider buying another book.

    All in all, I liked the book because it teaches you really well how to wrangle data. I only wish it had more on NumPy and other libraries.

    One person found this helpful
  • 5 out of 5 stars
    Written by the creator of Pandas
    Reviewed in the United States on February 14, 2022

    Well written by the creator of Pandas. The author's copious use of code snippets to illustrate his points makes the material very usable. The snippets are short enough to type by hand so you get the frequent opportunity to play with the code and really understand the tools being presented. And Pandas is awesome!

    One person found this helpful
  • 5 out of 5 stars
    Great introduction
    Reviewed in the United States on May 16, 2019

    I am not a programmer, but have been trying to learn python for data analysis for a while. This book does a great job of explaining some basics that other books/programs tend to skip over. Also seems like python is even easier to work with now than it was just a few years ago. If you have tried to pick up these skills without success before, this book might be a good way to re-enter.

    2 people found this helpful

Top reviews from other countries

  • 5 out of 5 stars
    Quality of the product is awesome (as described)
    Reviewed in India on August 13, 2021

    The quality of the book is awesome (as described), and so is the quality of the packaging.

    Nice book, covers all the topics gradually and thoroughly. Just started and liking it already. Will post another review after having read a couple of chapters.

  • 5 out of 5 stars
    Fast and reliable
    Reviewed in France on May 20, 2022

    Purchased to improve my skills as a Data Analyst
  • 5 out of 5 stars
    Best pandas reference book
    Reviewed in the United Arab Emirates on December 18, 2019

    This is the best reference I use for dealing with Python, NumPy and mainly pandas. A must-have for anyone learning or using pandas. The style of the author (who actually wrote pandas) is to the point and clear, with simple examples that demonstrate real-world usage.

    This book also has all the info to help you prepare data for scikit-learn and tf.

  • 5 out of 5 stars
    Must have
    Reviewed in Spain on January 8, 2021

    You must have this book if you want to learn Pandas and Data Science.

  • 5 out of 5 stars
    Excellent book
    Reviewed in Germany on October 12, 2019

    1. Logically structured. It is still possible to skip around while reading.

    2. Besides the libraries covered, Python itself is taught in such a way that beginners pick up the essentials and advanced users learn something new.

    3. I don't find it dry, and it doesn't go into too much depth. The necessary material is presented as compactly as possible.