06 Seaborn
March 7, 2024
[27]: import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("dm_office_sales.csv")
df.head()
salary sales
0 91684 372302
1 119679 495660
2 82045 320453
3 92949 377148
4 71280 312802
[28]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 division 1000 non-null object
1 level of education 1000 non-null object
2 training level 1000 non-null int64
3 work experience 1000 non-null int64
4 salary 1000 non-null int64
5 sales 1000 non-null int64
dtypes: int64(4), object(2)
memory usage: 47.0+ KB
[29]: df.shape
[29]: (1000, 6)
plt.legend(loc = (1.1,0.5))
[32]: # distribution plot
sns.displot(data=df,x='salary')
C:\Users\chand\anaconda3\Lib\site-packages\seaborn\axisgrid.py:118: UserWarning:
The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
[33]: #histogram
sns.set(style='darkgrid')
sns.histplot(data=df, x='sales', bins=10)  # bins sets the number of bars; more bins gives a finer-grained view
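To make the effect of `bins` concrete, here is a minimal sketch that does the binning with `np.histogram` instead of drawing it. The `sales` array is a randomly generated stand-in (an assumption, not the real `df['sales']` column), so only the pattern matters, not the numbers.

```python
import numpy as np

# Hypothetical stand-in for df['sales']; the real column is not reproduced here.
rng = np.random.default_rng(0)
sales = rng.integers(100_000, 600_000, size=1000)

# Bin the same data coarsely and finely.
counts_10, edges_10 = np.histogram(sales, bins=10)
counts_50, edges_50 = np.histogram(sales, bins=50)

# Every observation lands in exactly one bin, so both totals equal len(sales).
print(counts_10.sum(), counts_50.sum())

# More bins over the same range means narrower bars.
width_10 = edges_10[1] - edges_10[0]
width_50 = edges_50[1] - edges_50[0]
print(width_10, width_50)
```

Changing `bins` never changes the data, only how finely the value range is sliced: few bins smooth over detail, many bins expose it but also show more noise.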
0.1 KDE
[34]: # KDE (kernel density estimation) produces a smooth estimate of the
# probability density function of a random variable.
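As a rough illustration of what a KDE computes, the sketch below builds one by hand with NumPy: a Gaussian bump is placed on every observation and the bumps are averaged. The helper `gaussian_kde_1d` and the bandwidth of 5.0 are assumptions made for this example; seaborn selects a bandwidth automatically and its curve will generally differ.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    # z[i, j] = scaled distance from grid point i to sample j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
ages = rng.integers(0, 80, size=200)      # random integer ages, 200 values
grid = np.linspace(-40, 120, 801)         # wide enough to cover the tails
density = gaussian_kde_1d(ages, grid, bandwidth=5.0)  # bandwidth picked by hand

# A density is non-negative and its area is approximately 1.
area = density.sum() * (grid[1] - grid[0])
print(density.min() >= 0, round(area, 2))
```

A smaller bandwidth makes the curve follow individual points more closely, while a larger one smooths them out. Note that the curve extends slightly below 0 and above 80 even though no sample lies there, which is a known side effect of smoothing near the edges of the data.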
[36]: import numpy as np
sample = np.random.randint(0,80,200)
sample_age_df = pd.DataFrame(sample,columns=['age'])
sample_age_df.head()
[36]: age
0 75
1 5
2 31
3 26
4 66
[38]: sns.kdeplot(data=sample_age_df,x='age')
[39]: sample_age_df.to_csv('eg.csv')  # save the DataFrame to a CSV file
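One detail worth knowing when saving: by default `to_csv` also writes the row index, which comes back as an extra unnamed column on reload. The sketch below shows the difference using an in-memory buffer instead of a file; the small `sample_df` is a hypothetical frame built for this example.

```python
import io
import pandas as pd

# Small hypothetical frame with a single 'age' column.
sample_df = pd.DataFrame({"age": [75, 5, 31, 26, 66]})

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
sample_df.to_csv(with_index)
with_index.seek(0)
cols_with = pd.read_csv(with_index).columns.tolist()

# index=False writes only the data columns.
without_index = io.StringIO()
sample_df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without = pd.read_csv(without_index).columns.tolist()

print(cols_with)     # ['Unnamed: 0', 'age']
print(cols_without)  # ['age']
```

If the index carries no information, as with a default `RangeIndex`, passing `index=False` keeps the saved file clean.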
[40]: #countplot
plt.figure(figsize = (12,8))
sns.countplot(x='division',data=df,hue= 'level of education')
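The bar heights in a count plot with `hue` are simply the number of rows in each (x, hue) combination. A minimal sketch of the same tally as a table, using `pd.crosstab`; the rows in `toy` are hypothetical, and only the column names are taken from the dataset.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the dataset.
toy = pd.DataFrame({
    "division": ["A", "A", "B", "B", "B"],
    "level of education": ["college", "high school",
                           "college", "college", "high school"],
})

# Rows: division, columns: level of education, cells: number of rows.
counts = pd.crosstab(toy["division"], toy["level of education"])
print(counts)
```

Running the same `pd.crosstab` call on `df` should give the numbers behind each bar, which is handy when exact values are hard to read off the plot.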
[41]: #barplot()
sns.barplot(x='division', y='sales',data=df, estimator = np.min)
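The `estimator` argument decides which summary of `y` each bar shows; when it is omitted, `barplot` uses the mean. The sketch below reproduces the bar heights with a plain `groupby` on a hypothetical toy frame (the values are invented for illustration).

```python
import pandas as pd

# Hypothetical values; only the column names come from the dataset.
toy = pd.DataFrame({
    "division": ["A", "A", "B", "B", "B"],
    "sales": [300, 250, 410, 390, 520],
})

# Bar heights with estimator=np.min: the smallest sale per division.
lowest = toy.groupby("division")["sales"].min()

# Bar heights with the default estimator: the mean sale per division.
average = toy.groupby("division")["sales"].mean()

print(lowest.to_dict())   # {'A': 250, 'B': 390}
print(average.to_dict())  # {'A': 275.0, 'B': 440.0}
```

Because the estimator changes what the bars mean, it helps to state it in the plot title or axis label whenever it is not the default.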
[42]: #help(sns.barplot)
[44]: sns.pairplot(df, hue = 'division')
[45]: sns.pairplot(df, hue = 'division', diag_kind='hist', corner = True)
# corner=True draws only the lower triangle of the grid, so mirrored duplicate panels are omitted
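To see how much `corner=True` saves, the sketch below counts the panels in the grid. It assumes the pair plot is built from the four `int64` columns listed by `df.info()`, with `division` used only for colour.

```python
from itertools import product

# The four int64 columns reported by df.info().
numeric_cols = ["training level", "work experience", "salary", "sales"]
n = len(numeric_cols)

# Full grid: every (row, column) pairing, including mirrored ones.
full_grid = list(product(range(n), repeat=2))

# Corner grid: keep the diagonal and the panels below it.
corner_grid = [(row, col) for row, col in full_grid if col <= row]

print(len(full_grid), len(corner_grid))  # 16 10
```

The six dropped panels are the upper-triangle scatter plots, each of which repeats a lower-triangle panel with its axes swapped.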