This document provides an overview of data analysis and visualization techniques using Python. It begins with an introduction to NumPy, the fundamental package for numerical computing in Python. NumPy stores data efficiently in arrays and allows for fast operations on entire arrays. The document then covers Pandas, which builds on NumPy and provides data structures like Series and DataFrames for working with structured and labeled data. It demonstrates how to load data, select subsets of data, and perform operations like filtering and aggregations. Finally, it discusses various data visualization techniques using Matplotlib and Seaborn like histograms, scatter plots, box plots, and heatmaps that can be used for exploratory data analysis to gain insights from data.
2. Usage of NumPy for Numerical Data
NumPy (Numerical Python) is a fundamental library for
numerical computing in Python. It provides support for arrays,
matrices, and a wide range of mathematical functions.
Key Features
• Efficient storage and manipulation of numerical data.
• Functions for array operations, linear algebra, and random
number generation.
3. NumPy
• Stands for Numerical Python
• Is the fundamental package required for high performance
computing and data analysis
• NumPy is so important for numerical computations in Python
because it is designed for efficiency on large arrays of data.
• It provides
• ndarray for creating multiple dimensional arrays
• Internally stores data in a contiguous block of memory, independent of other
built-in Python objects, and uses much less memory than built-in Python sequences.
• Standard math functions for fast operations on entire arrays of data without
having to write loops
• NumPy Arrays are important because they enable you to express batch
operations on data without writing any for loops. We call this vectorization.
4. NumPy ndarray vs list
• One of the key features of NumPy is its N-dimensional array
object, or ndarray, which is a fast, flexible container for large
datasets in Python.
• Whenever you see “array,” “NumPy array,” or “ndarray” in
the text, with few exceptions they all refer to the same thing:
the ndarray object.
• NumPy-based algorithms are generally 10 to 100 times faster
(or more) than their pure Python counterparts and use
significantly less memory.
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))
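A minimal sketch of how the speed difference between the two containers might be measured. It assumes the standard-library timer `time.perf_counter`; the names `doubled_arr` and `doubled_list` are illustrative, and the actual timings depend on the machine.

```python
import time
import numpy as np

my_arr = np.arange(1000000)
my_list = list(range(1000000))

# Vectorized: a single multiplication applied to the whole array
start = time.perf_counter()
doubled_arr = my_arr * 2
numpy_seconds = time.perf_counter() - start

# Pure Python: every element is visited one by one
start = time.perf_counter()
doubled_list = [x * 2 for x in my_list]
list_seconds = time.perf_counter() - start

print(f"NumPy: {numpy_seconds:.4f} s, list: {list_seconds:.4f} s")
```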
5. ndarray
• ndarray is used for storage of homogeneous data
• i.e., all elements the same type
• Every array must have a shape and a dtype
• Supports convenient slicing, indexing and efficient vectorized
computation
import numpy as np
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
print(arr1)
print(arr1.dtype)
print(arr1.shape)
print(arr1.ndim)
6. Creating ndarrays
Using list of lists
import numpy as np
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]] #list of lists
arr2 = np.array(data2)
print(arr2.ndim) #2
print(arr2.shape) # (2,4)
7. Create a 2d array from a list of list
• You can pass a list of lists to create a matrix-like 2D array.
In:
Out:
8. The dtype argument
• You can specify the data type by setting the dtype argument.
• Some of the most commonly used NumPy dtypes are: float, int, bool,
str, and object.
In:
Out:
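As a small illustration of the dtype argument (the array contents and the name `arr_float` are made up for this sketch):

```python
import numpy as np

# Ask for floating-point storage even though the inputs are integers
arr_float = np.array([[1, 2, 3], [4, 5, 6]], dtype='float')
print(arr_float.dtype)  # float64
```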
9. The astype method
• You can also convert an array to a different data type using the astype method.
In: Out:
• Remember that, unlike lists, all items in an array have to be of the same
type.
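The conversion with astype could look like the following sketch. The values are illustrative; note that astype returns a new array and leaves the original unchanged.

```python
import numpy as np

arr_float = np.array([1.7, 2.2, 3.9])

# Float to integer conversion truncates the fractional part
arr_int = arr_float.astype('int')
# Conversion to strings gives a Unicode string dtype
arr_str = arr_float.astype('str')

print(arr_int)          # [1 2 3]
print(arr_float.dtype)  # float64 (the original array is untouched)
```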
10. dtype=‘object’
• However, if you are uncertain about what data type your
array will hold, or if you want to hold characters and
numbers in the same array, you can set the dtype as 'object'.
In: Out:
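A short sketch of an object array holding mixed values (the contents are illustrative). Each element keeps its original Python type, at the cost of the speed and memory benefits of a numeric dtype.

```python
import numpy as np

mixed = np.array([1, 'a', 2.5], dtype='object')
print(mixed.dtype)     # object
print(type(mixed[0]))  # <class 'int'>
print(type(mixed[1]))  # <class 'str'>
```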
11. The tolist() method
• You can always convert an array into a list using the tolist() method.
In: Out:
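For example, with an illustrative 2D array, tolist() returns nested Python lists:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
as_list = arr.tolist()
print(as_list)        # [[1, 2], [3, 4]]
print(type(as_list))  # <class 'list'>
```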
12. Inspecting a NumPy array
• A range of attributes and functions built into NumPy allows you to
inspect different aspects of an array:
In:
Out:
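The original slide's example is not preserved; the following sketch shows a few commonly used inspection attributes on an illustrative array.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
print(arr.shape)  # (2, 3)  rows and columns
print(arr.ndim)   # 2       number of dimensions
print(arr.size)   # 6       total number of elements
print(arr.dtype)  # float64 element type
```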
15. Arithmetic with NumPy Arrays
• Arithmetic operations with scalars propagate the scalar argument
to each element in the array:
• Comparisons between arrays of the same size yield boolean
arrays:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
print(arr)
# [[1. 2. 3.]
#  [4. 5. 6.]]
print(arr ** 2)
# [[ 1.  4.  9.]
#  [16. 25. 36.]]
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])
print(arr2)
# [[ 0.  4.  1.]
#  [ 7.  2. 12.]]
print(arr2 > arr)
# [[False  True False]
#  [ True False  True]]
16. Extracting specific items from an array
• You can extract portions of the array using indices, much like when
you’re working with lists.
• Unlike lists, however, arrays can optionally accept as many
indices in the square brackets as there are dimensions.
In: Out:
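A sketch of extracting items with one index per dimension, using an illustrative 3x4 array:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# First two rows and first two columns
first_two = arr[:2, :2]
print(first_two)
# [[1 2]
#  [5 6]]

# Row 1, column 2 (both counted from zero)
single = arr[1, 2]
print(single)  # 7
```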
17. Indexing and Slicing
• One-dimensional arrays are simple; on the surface they act
similarly to Python lists:
arr = np.arange(10)
print(arr) # [0 1 2 3 4 5 6 7 8 9]
print(arr[5]) #5
print(arr[5:8]) #[5 6 7]
arr[5:8] = 12
print(arr) #[ 0 1 2 3 4 12 12 12 8 9]
18. Indexing and Slicing
• As you can see, if you assign a scalar value to a slice, as in
arr[5:8] = 12, the value is propagated (or broadcast) to the
entire selection.
• An important first distinction from Python’s built-in lists is that
array slices are views on the original array.
• This means that the data is not copied, and any modifications to the view will be
reflected in the source array.
arr = np.arange(10)
print(arr) # [0 1 2 3 4 5 6 7 8 9]
arr_slice = arr[5:8]
print(arr_slice) # [5 6 7]
arr_slice[1] = 12345
print(arr) # [ 0 1 2 3 4 5 12345 7 8 9]
arr_slice[:] = 64
print(arr) # [ 0 1 2 3 4 64 64 64 8 9]
19. Indexing
• In a two-dimensional array, the elements at each index are no
longer scalars but rather one-dimensional arrays:
• Thus, individual elements can be accessed recursively. But that is
a bit too much work, so you can pass a comma-separated list of
indices to select individual elements.
• So these are equivalent:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[2]) # [7 8 9]
print(arr2d[0][2]) # 3
print(arr2d[0, 2]) #3
20. Activity 3
• Consider the two-dimensional array, arr2d.
• Write a code to slice this array to display the last column,
[[3] [6] [9]]
• Write a code to slice this array to display the last 2 elements of
middle array,
[5 6]
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
21. Boolean indexing
• A boolean index array has the same shape as the array to be
filtered, but it contains only True and False values.
In: Out:
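A minimal sketch of boolean indexing on an illustrative array. The mask is built with a comparison and then used to select the matching elements.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

mask = arr > 3
print(mask)
# [[False False False]
#  [ True  True  True]]

# Indexing with the mask returns a 1-D array of the selected elements
selected = arr[mask]
print(selected)  # [4 5 6]
```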
22. Pandas
• Pandas, like NumPy, is one of the most popular Python
libraries for data analysis.
• It is a high-level abstraction over the lower-level NumPy, whose core is
implemented largely in C.
• Pandas provides high-performance, easy-to-use data
structures and data analysis tools.
• There are two main structures used by pandas: data frames
and series.
23. Indices in a pandas series
• A pandas series is similar to a list, but differs in the fact that a series associates a label with
each element. This makes it look like a dictionary.
• If an index is not explicitly provided by the user, pandas creates a RangeIndex ranging from 0
to N-1.
• Each series object also has a data type.
In: Out:
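A short sketch of a series created without an explicit index (the values are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30])
print(s.index)  # RangeIndex(start=0, stop=3, step=1)
print(s.dtype)  # int64
```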
24. • As you may suspect by this point, a series has ways to extract all of
the values in the series, as well as individual elements by index.
In: Out:
• You can also provide an index manually.
In:
Out:
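The points on this slide might be sketched as follows, with illustrative values and labels: a manually supplied index, all values at once, and a single element by label.

```python
import pandas as pd

# Provide the index labels manually
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

print(s.values)  # [10 20 30]  all values as a NumPy array
print(s['b'])    # 20          a single element by its label
```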
25. • It is easy to retrieve several elements of a series by their indices or make
group assignments.
In:
Out:
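A sketch of selecting several elements by label and assigning to a group of labels at once (labels and values are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Retrieve several elements with a list of labels
subset = s[['a', 'c']]
print(subset.tolist())  # [10, 30]

# Group assignment: set the same value for several labels
s[['a', 'c']] = 0
print(s.tolist())  # [0, 20, 0, 40]
```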
26. Filtering and maths operations
• Filtering and maths operations are easy with Pandas as well.
In: Out:
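Filtering and arithmetic on a series could look like this sketch (values are illustrative). Both operations work on the whole series without an explicit loop.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40])

# Filtering with a boolean condition keeps the original labels
big = s[s > 20]
print(big.tolist())      # [30, 40]
print(list(big.index))   # [2, 3]

# Arithmetic is applied element-wise
doubled = s * 2
print(doubled.tolist())  # [20, 40, 60, 80]
```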
27. Pandas data frame
• Simplistically, a data frame is a table, with rows and columns.
• Each column in a data frame is a series object.
• Rows consist of elements inside series.
Case ID | Variable one | Variable two | Variable 3
1       | 123          | ABC          | 10
2       | 456          | DEF          | 20
3       | 789          | XYZ          | 30
28. Creating a Pandas data frame
• Pandas data frames can be constructed using Python dictionaries.
In:
Out:
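A minimal sketch of building a data frame from a dictionary. It reuses the values from the example table on the previous slide; the column names are assumptions chosen for this illustration.

```python
import pandas as pd

# Each dictionary key becomes a column name, each list a column
df = pd.DataFrame({
    'variable_one': [123, 456, 789],
    'variable_two': ['ABC', 'DEF', 'XYZ'],
    'variable_three': [10, 20, 30],
})
print(df)
```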
29. • You can also create a data frame from a list.
In: Out:
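One way to do this, sketched with illustrative data: pass a list of rows and name the columns with the `columns` argument.

```python
import pandas as pd

# Each inner list is one row of the table
rows = [[123, 'ABC', 10], [456, 'DEF', 20], [789, 'XYZ', 30]]
df = pd.DataFrame(rows, columns=['variable_one', 'variable_two', 'variable_three'])
print(df)
```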
30. • You can ascertain the type of a column with the type() function.
In:
Out:
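The original slide's example is not preserved. As a sketch with illustrative data: type() shows that a column is a series, while the column's dtype attribute reports the type of the values it stores.

```python
import pandas as pd

df = pd.DataFrame({'variable_one': [123, 456, 789]})

column = df['variable_one']
print(type(column))   # the pandas Series class
print(column.dtype)   # int64
```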
31. • A Pandas data frame object has two indices: a column index and a row
index.
• Again, if you do not provide one, Pandas will create a RangeIndex from
0 to N-1.
In:
Out:
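Both indices can be inspected directly, as in this sketch with illustrative data:

```python
import pandas as pd

df = pd.DataFrame({
    'variable_one': [123, 456, 789],
    'variable_three': [10, 20, 30],
})

print(df.columns)  # the column index
print(df.index)    # RangeIndex(start=0, stop=3, step=1)
```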
32. • There are numerous ways to provide row indices explicitly.
• For example, you could provide an index when creating a data frame:
In: Out:
• or do it during runtime.
• Here, I also named the index ‘country code’.
In:
Out:
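Both approaches might be sketched as follows. The data and the codes 'AA', 'BB' and 'CC' are hypothetical placeholders, not taken from the original slides.

```python
import pandas as pd

# Option 1: provide the row index when creating the data frame
df = pd.DataFrame({'value': [10, 20, 30]}, index=['AA', 'BB', 'CC'])

# Option 2: assign the row index afterwards, and give it a name
df2 = pd.DataFrame({'value': [10, 20, 30]})
df2.index = ['AA', 'BB', 'CC']
df2.index.name = 'country code'
print(df2)
```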
33. • Row access using index can be performed in several ways.
• First, you could use .loc[] and provide an index label.
• Second, you could use .iloc[] and provide an integer position.
In: Out:
In: Out:
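A sketch of both kinds of row access on a small data frame with hypothetical labels. Note that both accessors are used with square brackets.

```python
import pandas as pd

df = pd.DataFrame({'value': [10, 20, 30]}, index=['AA', 'BB', 'CC'])

by_label = df.loc['BB']    # row with the index label 'BB'
by_position = df.iloc[1]   # second row, counted from zero

print(by_label['value'])     # 20
print(by_position['value'])  # 20
```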
34. • Particular rows and columns can be selected this way.
In: Out:
• You can pass .loc[] two arguments, an index list and a column list; slicing
is supported as well:
In: Out:
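A sketch of selecting rows and columns together, using hypothetical labels and column names. Unlike ordinary Python slices, label slices with .loc[] include the end label.

```python
import pandas as pd

df = pd.DataFrame(
    {'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [7, 8, 9]},
    index=['AA', 'BB', 'CC'],
)

# A list of row labels and a list of column labels
picked = df.loc[['AA', 'CC'], ['x', 'z']]
print(picked)

# Label slices: rows 'AA' through 'BB', columns 'x' through 'y' (both ends included)
sliced = df.loc['AA':'BB', 'x':'y']
print(sliced)
```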
37. Reading from and writing to a file
• Pandas supports many popular file formats including CSV, XML,
HTML, Excel, SQL, JSON, etc.
• Out of all of these, CSV is the file format that you will work with the
most.
• You can read in the data from a CSV file using the read_csv() function.
• Similarly, you can write a data frame to a csv file with the to_csv()
function.
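A minimal round-trip sketch with to_csv() and read_csv(). To stay self-contained it writes to an in-memory text buffer (`io.StringIO`) rather than a file on disk; with a real file you would pass a path instead. The data is illustrative.

```python
import io
import pandas as pd

df = pd.DataFrame({
    'variable_one': [123, 456, 789],
    'variable_two': ['ABC', 'DEF', 'XYZ'],
})

# Write the data frame as CSV text, without the row index
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV back into a new data frame
buffer.seek(0)
df_back = pd.read_csv(buffer)
print(df_back)
```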
38. • Pandas has the capacity to do much more than what we have
covered here, such as grouping data and even data
visualisation.
• However, as with NumPy, we don’t have enough time to cover
every aspect of pandas here.
39. Exploratory data analysis (EDA)
Exploring your data is a crucial step in data analysis. It involves:
• Organising the data set
• Plotting aspects of the data set
• Maybe producing some numerical summaries; central tendency
and spread, etc.
“Exploratory data analysis can never be the whole story, but
nothing else can serve as the foundation stone.”
- John Tukey.
40. Reading in the data
• First we import the Python packages we are going to use.
• Then we use Pandas to load in the dataset as a data frame.
NOTE: The index_col argument states that we'll treat the first
column of the dataset as the ID column.
NOTE: The encoding argument allows us to bypass an input error created
by special characters in the data set.
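The original loading code is not preserved, so the following is only a sketch of how the two arguments behave. The data, the column names and the 'latin-1' encoding are assumptions made for illustration; an in-memory byte buffer stands in for the dataset file.

```python
import io
import pandas as pd

# Hypothetical CSV content with special characters, encoded as Latin-1
raw_bytes = "id,name\n1,Zo\u00eb\n2,Jos\u00e9\n".encode('latin-1')

# index_col=0 uses the first column as the row index;
# encoding tells pandas how to decode the bytes
df = pd.read_csv(io.BytesIO(raw_bytes), index_col=0, encoding='latin-1')
print(df)
```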
#18: As NumPy has been designed to be able to work with very large arrays, you could imagine performance
and memory problems if NumPy insisted on always copying data.
If you want a copy of a slice of an ndarray instead of a view, you
will need to explicitly copy the array—for example,
arr[5:8].copy().
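A short sketch of the difference an explicit copy makes. The name `arr_copy` is illustrative, and `np.shares_memory` is used only to confirm that the copy is independent of the original.

```python
import numpy as np

arr = np.arange(10)

# An explicit copy owns its own data
arr_copy = arr[5:8].copy()
arr_copy[:] = 64

print(arr_copy)  # [64 64 64]
print(arr)       # [0 1 2 3 4 5 6 7 8 9]  (the original is unchanged)
print(np.shares_memory(arr, arr_copy))  # False
```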