I. NumPy basics
1. Basics
>>>import numpy as np
>>>arr=np.array([1,2,3,4])
>>>arr
array([1, 2, 3, 4])
>>>type(arr)
numpy.ndarray
>>>np.array([[1,2,3],[5,6,7]])
array([[1, 2, 3],
       [5, 6, 7]])
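2. Indexing and slicing work as they do for lists; arithmetic (+, -, *, /), unlike the list operators, is applied element by element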
>>>b=np.array([[1,2,3,],[5,6,7]])
>>>b[1][0]
5
>>>b.dtype
dtype('int32')
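The contrast between array arithmetic and list operators can be sketched as follows. This is a minimal illustration; the variable names (`lst`, `doubled`, `repeated`, `head`, `same`) are made up for the example and do not appear in the notes above.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
lst = [1, 2, 3, 4]

# Arithmetic on an ndarray is applied element by element.
doubled = arr * 2        # array([2, 4, 6, 8])
summed = arr + arr       # array([2, 4, 6, 8])

# The same operator on a list repeats the list instead.
repeated = lst * 2       # [1, 2, 3, 4, 1, 2, 3, 4]

# Slicing uses the same start:stop syntax as a list.
head = arr[0:2]          # array([1, 2])

# A 2-D array also accepts a single [row, column] index.
b = np.array([[1, 2, 3], [5, 6, 7]])
same = b[1][0] == b[1, 0]   # both expressions give 5
```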
II. pandas basics
1. Series
1) Basics
>>>import pandas as pd
>>>s=pd.Series([1,2,3,4],index=['|','||','|||','||||'])
>>>s
| 1
|| 2
||| 3
|||| 4
dtype: int64
>>>s['V']='apple'
>>>s
| 1
|| 2
||| 3
|||| 4
V apple
dtype: object
>>>s0=pd.Series([1,2,3,4],index=['|','||','|||','||||'])
>>>s0.astype('str')
| 1
|| 2
||| 3
|||| 4
dtype: object
>>>a={'a':'jack','b':'may','c':'lucy'}
>>>s1=pd.Series(a)
>>>s1
a jack
b may
c lucy
dtype: object
>>>s2=pd.Series(a,index =['a','b','c','4'])
>>>s2
a jack
b may
c lucy
4 NaN
dtype: object
Adding and deleting entries works the same way as for a dict
>>>del s2['a']
>>>s2
b may
c lucy
4 NaN
dtype: object
>>>s0[s0>2]
||| 3
|||| 4
dtype: int64
2) Slicing (shown on s0, whose values are all integers)
>>>s0['|']
1
>>>s0[['|','||']]
| 1
|| 2
dtype: int64
>>>s0[0:2]
| 1
|| 2
dtype: int64
>>>s0['|':'|||']
| 1
|| 2
||| 3
dtype: int64
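The two slices above behave differently at the end point: a positional slice excludes it, a label slice includes it. A small sketch that makes the choice explicit with `iloc` and `loc`; the names `by_position` and `by_label` are only for illustration.

```python
import pandas as pd

s0 = pd.Series([1, 2, 3, 4], index=['|', '||', '|||', '||||'])

# Positional slice: the end position is excluded, as with a list.
by_position = s0.iloc[0:2]       # rows '|' and '||'

# Label slice: the end label is included.
by_label = s0.loc['|':'|||']     # rows '|', '||' and '|||'
```

Spelling out `iloc` or `loc` avoids having to remember which rule plain `[]` applies to a given slice.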
2. DataFrame
1) Construction
>>>import pandas as pd
>>>d={
'name':['jack','nacy','betty'],
'sex':['male','male','femal'],
'age':[15,25,36]
}
>>>df=pd.DataFrame(d)
>>>df
    name    sex  age
0   jack   male   15
1   nacy   male   25
2  betty  femal   36
>>>df2=pd.DataFrame([[1,2,3,4],[5,6,7,8]],index=list('ab'),columns=list("甲乙丙丁"))
>>>df2
>>>df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
name 3 non-null object
sex 3 non-null object
age 3 non-null int64
dtypes: int64(1), object(2)
memory usage: 152.0+ bytes
Adding and deleting columns works the same way as for a dict
del df['name']
df
     sex  age
0   male   15
1   male   25
2  femal   36
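A short sketch of the dict-like operations on columns. The `city` column and its values are hypothetical and only serve to show an addition followed by a removal.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [15, 25, 36],
})

# Add a column by assigning to a new key (hypothetical data).
df['city'] = ['Paris', 'Rome', 'Oslo']
has_city = 'city' in df            # membership tests the column labels

# pop removes a column and returns it, like dict.pop.
city = df.pop('city')

# del removes a column in place.
del df['name']
remaining = list(df.columns)       # ['sex', 'age']
```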
2) Slicing
import pandas as pd
d={
'name':[
'jack','nacy','betty'],
'sex':['male','male','femal'],
'age':[15,25,36]
}
df=pd.DataFrame(d,index=['a','b','c'])
df
    name    sex  age
a   jack   male   15
b   nacy   male   25
c  betty  femal   36
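Selecting rows
Selecting a single row
df.iloc[2]
df.loc['c']
name betty
sex femal
age 36
Name: c, dtype: object
Note: df[2] and df['c'] raise a KeyError, because plain [] with a single key looks up a column, not a row.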
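The note above can be checked directly. The sketch below catches the `KeyError` and then fetches the same row with the two indexers; `raised`, `row_by_position` and `row_by_label` are illustrative names.

```python
import pandas as pd

d = {
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [15, 25, 36],
}
df = pd.DataFrame(d, index=['a', 'b', 'c'])

# Plain [] with a single key is a column lookup, so a row label fails.
try:
    df['c']
    raised = False
except KeyError:
    raised = True

# Rows are reached through iloc (position) or loc (label).
row_by_position = df.iloc[2]
row_by_label = df.loc['c']
```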
Selecting several non-adjacent rows
df.iloc[[0,2]]
df.loc[['a','c']]
    name    sex  age
a   jack   male   15
c  betty  femal   36
Note: df[[0,2]] and df[['a','c']] raise a KeyError, because plain [] with a list looks up columns, not rows.
Selecting adjacent rows
Note: a positional slice includes the start and excludes the end; a label slice includes both ends.
df[0:2]
    name    sex  age
a   jack   male   15
b   nacy   male   25
df['a':'c']
    name    sex  age
a   jack   male   15
b   nacy   male   25
c  betty  femal   36
df.iloc[0:2]
    name    sex  age
a   jack   male   15
b   nacy   male   25
df.loc['a':'c']
    name    sex  age
a   jack   male   15
b   nacy   male   25
c  betty  femal   36
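The end-point rule is easiest to trip over when the index labels are themselves integers, because the same slice then means different things to `iloc` and `loc`. A minimal sketch, assuming a frame with the default integer index (`df_int` is a name invented for this example):

```python
import pandas as pd

df_int = pd.DataFrame({'age': [15, 25, 36]})   # default index 0, 1, 2

# iloc treats 0:2 as positions, so the end is excluded.
n_iloc = len(df_int.iloc[0:2])    # 2 rows

# loc treats 0:2 as labels, so the end is included.
n_loc = len(df_int.loc[0:2])      # 3 rows
```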
Selecting columns
Selecting a single column
df.name
a jack
b nacy
c betty
Name: name, dtype: object
df['name']
a jack
b nacy
c betty
Name: name, dtype: object
df.iloc[:,0]
a jack
b nacy
c betty
Name: name, dtype: object
df.loc[:,'name']
a jack
b nacy
c betty
Name: name, dtype: object
Selecting several non-adjacent columns
df[['name','age']]
df.iloc[:,[0,2]]
df.loc[:,['name','age']]
    name  age
a   jack   15
b   nacy   25
c  betty   36
Selecting adjacent columns
df.iloc[:,0:2]
    name    sex
a   jack   male
b   nacy   male
c  betty  femal
df.loc[:,'name':'age']
    name    sex  age
a   jack   male   15
b   nacy   male   25
c  betty  femal   36
Note: df['name':'age'] does not select columns. Plain [] treats a slice as a row slice, and since the row labels here are 'a', 'b' and 'c', the result contains the column headers but no rows.
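A sketch of the column-slice behaviour described above, using the same frame. The variable names are illustrative; the empty result of the last line relies on the row index being sorted, as it is here.

```python
import pandas as pd

d = {
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [15, 25, 36],
}
df = pd.DataFrame(d, index=['a', 'b', 'c'])

# Column slice by label: the end label is included.
cols_by_label = df.loc[:, 'name':'sex']

# Column slice by position: the end position is excluded.
cols_by_position = df.iloc[:, 0:2]

# Plain [] with a slice is applied to the rows, not the columns.
rows_attempt = df['name':'age']
```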
Combining rows and columns
With plain [], a slice selects rows, while a single label or a list of labels selects columns.
df[0:1].name
df[0:1]['name']
df['name'][0:1]
df.name[0:1]
a jack
Name: name, dtype: object
df['a':'b'].name
df['a':'b']['name']
df['name']['a':'b']
df.name['a':'b']
a jack
b nacy
Name: name, dtype: object
df[0:2][['name','age']]
df[['name','age']][0:2]
df[1:3][0:1]
df.iloc[row, column]
df.iloc[1,1]
'male'
df.iloc[1:2,0:2]
df.iloc[1:2,[0,2]]
df.loc[row, column]
df.loc['a','name']
'jack'
df.loc['b':'c','name':'age']
    name    sex  age
b   nacy   male   25
c  betty  femal   36
df.loc['b':'c',['name','age']]
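The `[row, column]` forms of `iloc` and `loc` can address the same cells, one by position and one by label. A small sketch with illustrative variable names:

```python
import pandas as pd

d = {
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [15, 25, 36],
}
df = pd.DataFrame(d, index=['a', 'b', 'c'])

# A single cell, by position and by label.
cell_by_position = df.iloc[1, 1]       # 'male'
cell_by_label = df.loc['b', 'sex']     # 'male'

# The same block of rows b..c and columns name, age.
block_by_label = df.loc['b':'c', ['name', 'age']]
block_by_position = df.iloc[1:3, [0, 2]]   # end position 3 is excluded
```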
Appendix:
The ix indexer (deprecated in pandas 0.20 and removed in pandas 1.0; use loc or iloc instead)
>>>df.ix[0]
name jack
sex male
age 15
Name: 0, dtype: object

>>>df.ix[0:1]
    name    sex  age
0   jack   male   15
1   nacy   male   25
df.ix[2:3,['age','name']]
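Since `ix` is no longer available in current pandas, the three calls above can be rewritten with `iloc` and `loc`. This is one possible translation, assuming the frame has the default integer index as in the outputs shown; the variable names are illustrative.

```python
import pandas as pd

d = {
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [15, 25, 36],
}
df = pd.DataFrame(d)   # default integer index 0, 1, 2

# df.ix[0] -> one row
first_row = df.iloc[0]

# df.ix[0:1] -> label slice, end included
rows_0_to_1 = df.loc[0:1]

# df.ix[2:3, ['age', 'name']] -> label slice on rows, list of columns
subset = df.loc[2:3, ['age', 'name']]
```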
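3) Modifying values
The examples in this part start again from df=pd.DataFrame(d), with the default integer index.
>>>df=pd.DataFrame(d)
>>>df.loc[1,'age']=255
>>>df
    name    sex  age
0   jack   male   15
1   nacy   male  255
2  betty  femal   36
Note: the chained form df['age'][1]=255 triggers a SettingWithCopyWarning in many pandas versions and does not update df when copy-on-write is enabled, so df.loc is used here.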
>>>df.age=[14,26,36]
>>>df
    name    sex  age
0   jack   male   14
1   nacy   male   26
2  betty  femal   36
>>>df.age=df.age+1
>>>df
    name    sex  age
0   jack   male   15
1   nacy   male   27
2  betty  femal   37
>>>df.index=list('abc')
>>>df
    name    sex  age
a   jack   male   15
b   nacy   male   27
c  betty  femal   37
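df[df.age==27].name='seven'
df
Note: the line above assigns to the temporary copy returned by the boolean filter, so df itself is unchanged. To modify df, pass the mask and the column to df.loc, as in the next line.
>>>df.loc[df.age==37,'age']=38
>>>df
    name    sex  age
a   jack   male   15
b   nacy   male   27
c  betty  femal   38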
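The difference between assigning through a filtered copy and assigning through `df.loc` can be sketched as below. The frame reproduces the state reached above (ages 15, 27, 37); `before` and `after` are illustrative names, and the warning filter is only there because some pandas versions warn on the first assignment.

```python
import warnings
import pandas as pd

df = pd.DataFrame(
    {
        'name': ['jack', 'nacy', 'betty'],
        'sex': ['male', 'male', 'femal'],
        'age': [15, 27, 37],
    },
    index=list('abc'),
)

# The boolean filter returns a copy, so this assignment is lost.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df[df.age == 27].name = 'seven'
before = df.loc['b', 'name']     # still 'nacy'

# Passing the mask and the column to loc writes into df itself.
df.loc[df.age == 27, 'name'] = 'seven'
after = df.loc['b', 'name']      # 'seven'
```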
4) Finding data
The outputs in this part assume df has the default integer index and ages 14, 26, 36.
df[df.age == 26]
df[df.age > 26]
df[~(df.age > 26)]
    name   sex  age
0   jack  male   14
1   nacy  male   26
df[(df.age > 20)&(df.name=='nacy')]
df[(df.age > 30)|(df.name=='jack')]
    name    sex  age
0   jack   male   14
2  betty  femal   36
df.query("(age > 20)&(name=='nacy')")
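A boolean mask and `query` can express the same filter. The sketch below compares the two and also shows `query` reading a local variable through the `@` prefix; `min_age` is a hypothetical threshold introduced for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['jack', 'nacy', 'betty'],
    'sex': ['male', 'male', 'femal'],
    'age': [14, 26, 36],
})

min_age = 20   # hypothetical threshold

# Each comparison needs its own parentheses, because & binds
# more tightly than > and ==.
by_mask = df[(df.age > min_age) & (df.name == 'nacy')]

# The same filter as a query string; @min_age refers to the local variable.
by_query = df.query("age > @min_age and name == 'nacy'")
```

The query form reads more like a sentence for long conditions, while the mask form keeps everything in ordinary Python expressions.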