ids-assignment

The document contains Python code that processes employee and department data using pandas. It includes various queries to filter and display employee information based on different criteria, such as department number and salary. The code also demonstrates merging data from employee and department datasets for comprehensive analysis.

May 20, 2025

[7]: import pandas as pd

[128]: emp=pd.read_excel(r'C:\Users\avach\Downloads\emp.xls')
emp.head()

[128]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


0 7782 CLARK MANAGER 7839.0 1981-06-09 2450 NaN 10
1 7839 KING PRESIDENT NaN 1981-11-17 5000 NaN 10
2 7934 MILLER CLERK 7782.0 1982-01-23 1300 NaN 10
3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
4 7566 JONEES MANAGER 7839.0 1981-04-02 2975 NaN 20

[11]: dep=pd.read_excel(r'C:\Users\avach\Downloads\dep.xls')
dep.head()

[11]: DEPTNO DNAME LOC


0 NaN NaN NaN
1 10.0 ACCOUNTING NEW YORK
2 20.0 RESEARCH DALLAS
3 30.0 SALES CHICAGO
4 40.0 OPERATIONS BOSTON

0.0.1 1) List all the information about all the employees.

[14]: print(emp)

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


0 7782 CLARK MANAGER 7839.0 1981-06-09 2450 NaN 10
1 7839 KING PRESIDENT NaN 1981-11-17 5000 NaN 10
2 7934 MILLER CLERK 7782.0 1982-01-23 1300 NaN 10
3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
4 7566 JONEES MANAGER 7839.0 1981-04-02 2975 NaN 20
5 7788 SCOTT ANALYST 7566.0 1982-12-09 3000 NaN 20
6 7876 ADAMS CLERK 7788.0 1983-01-12 1100 NaN 20
7 7902 FORD ANALYST 7566.0 1981-12-04 3000 NaN 20
8 7499 ALLEN SALESMAN 7698.0 1981-02-20 1600 300.0 30
9 7521 WARD SALESMAN 7698.0 1981-02-22 1250 500.0 30

10 7654 MARTIN SALESMAN 7698.0 1981-09-28 1250 1400.0 30
11 7698 BLAKE MANAGER 7839.0 1981-05-01 2850 NaN 30
12 7844 TURNER SALESMAN 7698.0 1981-09-08 1500 0.0 30
13 7900 JAMES CLERK 7698.0 1981-12-03 950 NaN 30
14 7370 Vishnu House Wife 7698.0 1981-06-29 400 NaN 90
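
The two Excel files read above are not part of this document, so the cells cannot be rerun as they stand. The following is a minimal sketch that rebuilds both frames inline from the printed rows. It assumes the printed listing is the complete `emp` table, that `dep` holds only the four departments shown, and that the blank first row of `dep` can be dropped; the resulting dtypes may differ slightly from what `read_excel` produced.

```python
import pandas as pd

NA = float("nan")  # stands in for the blank cells printed as NaN

# Rows transcribed from the printed emp listing (assumed to be complete).
emp = pd.DataFrame(
    [
        (7782, "CLARK", "MANAGER", 7839, "1981-06-09", 2450, NA, 10),
        (7839, "KING", "PRESIDENT", NA, "1981-11-17", 5000, NA, 10),
        (7934, "MILLER", "CLERK", 7782, "1982-01-23", 1300, NA, 10),
        (7369, "SMITH", "CLERK", 7902, "1980-12-17", 800, NA, 20),
        (7566, "JONEES", "MANAGER", 7839, "1981-04-02", 2975, NA, 20),
        (7788, "SCOTT", "ANALYST", 7566, "1982-12-09", 3000, NA, 20),
        (7876, "ADAMS", "CLERK", 7788, "1983-01-12", 1100, NA, 20),
        (7902, "FORD", "ANALYST", 7566, "1981-12-04", 3000, NA, 20),
        (7499, "ALLEN", "SALESMAN", 7698, "1981-02-20", 1600, 300, 30),
        (7521, "WARD", "SALESMAN", 7698, "1981-02-22", 1250, 500, 30),
        (7654, "MARTIN", "SALESMAN", 7698, "1981-09-28", 1250, 1400, 30),
        (7698, "BLAKE", "MANAGER", 7839, "1981-05-01", 2850, NA, 30),
        (7844, "TURNER", "SALESMAN", 7698, "1981-09-08", 1500, 0, 30),
        (7900, "JAMES", "CLERK", 7698, "1981-12-03", 950, NA, 30),
        (7370, "Vishnu", "House Wife", 7698, "1981-06-29", 400, NA, 90),
    ],
    columns=["EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO"],
)
emp["HIREDATE"] = pd.to_datetime(emp["HIREDATE"])  # match the datetime column in the notebook

# Department rows from dep.head(), without the blank first row (an assumption).
dep = pd.DataFrame(
    {
        "DEPTNO": [10, 20, 30, 40],
        "DNAME": ["ACCOUNTING", "RESEARCH", "SALES", "OPERATIONS"],
        "LOC": ["NEW YORK", "DALLAS", "CHICAGO", "BOSTON"],
    }
)

print(emp.shape)  # (15, 8)
```

With these two frames defined, the query cells in the rest of the notebook can be tried without access to the original files.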

0.0.2 2) List the employees belonging to department number 20.

[17]: emp[(emp.DEPTNO==20)]

[17]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
4 7566 JONEES MANAGER 7839.0 1981-04-02 2975 NaN 20
5 7788 SCOTT ANALYST 7566.0 1982-12-09 3000 NaN 20
6 7876 ADAMS CLERK 7788.0 1983-01-12 1100 NaN 20
7 7902 FORD ANALYST 7566.0 1981-12-04 3000 NaN 20

0.0.3 3) List the name and salary of the employees whose salary is more than 1000.

[20]: df1=emp[emp.SAL>1000][['ENAME','SAL']]
df1

[20]: ENAME SAL


0 CLARK 2450
1 KING 5000
2 MILLER 1300
4 JONEES 2975
5 SCOTT 3000
6 ADAMS 1100
7 FORD 3000
8 ALLEN 1600
9 WARD 1250
10 MARTIN 1250
11 BLAKE 2850
12 TURNER 1500

[22]: emp.size

[22]: 120

[24]: emp.shape

[24]: (15, 8)

0.1 4) List the employees working in department number 30 whose salary is greater than 2000.
[27]: df2=emp[(emp.SAL>2000)&(emp.DEPTNO== 30)]['ENAME']
df2

[27]: 11 BLAKE
Name: ENAME, dtype: object

0.2 5) List the names of the clerks working in department 20.


[30]: df3=emp[(emp.JOB=='CLERK')&(emp.DEPTNO== 20)]['ENAME']
df3

[30]: 3 SMITH
6 ADAMS
Name: ENAME, dtype: object

0.3 6) List the details of clerks in descending order of department number.


[33]: df4=emp[emp.JOB== 'CLERK'].sort_values('DEPTNO',ascending= False)
df4

[33]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


13 7900 JAMES CLERK 7698.0 1981-12-03 950 NaN 30
3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
6 7876 ADAMS CLERK 7788.0 1983-01-12 1100 NaN 20
2 7934 MILLER CLERK 7782.0 1982-01-23 1300 NaN 10

0.4 7) List the EMPNO and DEPTNO of employees working in department 20 whose salary is greater than 2000, sorted by DEPTNO.

[36]: df5=emp[(emp.DEPTNO==20)&(emp.SAL>2000)].sort_values('DEPTNO')[['EMPNO','DEPTNO']]
df5

[36]: EMPNO DEPTNO


4 7566 20
5 7788 20
7 7902 20

0.5 8) List the employee number and name of clerks in ascending order of salary.
[39]: df6=emp[emp.JOB=='CLERK'].sort_values('SAL',ascending=True)[['EMPNO','ENAME','SAL']]
df6

[39]: EMPNO ENAME SAL


3 7369 SMITH 800
13 7900 JAMES 950
6 7876 ADAMS 1100
2 7934 MILLER 1300

0.6 9) List the EMPNO and DEPTNO of clerks working in department 20 and having a salary of more than 2000. Output should be in descending order by DEPTNO.
[42]: df7=emp[(emp.JOB == 'CLERK') & (emp.DEPTNO == 20) & (emp.SAL > 2000)].sort_values('DEPTNO', ascending=False)[['EMPNO', 'DEPTNO','ENAME','SAL']]

df7

[42]: Empty DataFrame
Columns: [EMPNO, DEPTNO, ENAME, SAL]
Index: []

0.7 10) List the common jobs in departments 20 and 30.


[45]: df8=pd.merge(emp[emp.DEPTNO==20],emp[emp.DEPTNO==30],on='JOB',how='inner')[['JOB']].drop_duplicates()

df8

[45]: JOB
0 CLERK
1 MANAGER
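
As an alternative sketch, the jobs shared by departments 20 and 30 can also be found with a plain set intersection, which avoids joining `emp` to itself. The small frame below is a stand-in holding only the JOB and DEPTNO values of those two departments, transcribed from the printed table.

```python
import pandas as pd

# JOB and DEPTNO values for departments 20 and 30, copied from the emp listing.
emp = pd.DataFrame(
    {
        "JOB": ["CLERK", "MANAGER", "ANALYST", "CLERK", "ANALYST",
                "SALESMAN", "SALESMAN", "SALESMAN", "MANAGER", "SALESMAN", "CLERK"],
        "DEPTNO": [20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30],
    }
)

jobs_20 = set(emp.loc[emp.DEPTNO == 20, "JOB"])
jobs_30 = set(emp.loc[emp.DEPTNO == 30, "JOB"])

# Jobs present in both departments, sorted for a stable display order.
common_jobs = sorted(jobs_20 & jobs_30)
print(common_jobs)  # ['CLERK', 'MANAGER']
```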

0.8 12) List the EMPNO, employee name, DEPTNO and department name of employees whose salary is greater than 3000.
[71]: df9 = pd.merge(emp, dep, on='DEPTNO', how='inner').query('SAL > 3000')[['EMPNO', 'ENAME', 'DEPTNO', 'DNAME']]

df9

[71]: EMPNO ENAME DEPTNO DNAME


1 7839 KING 10 ACCOUNTING

0.9 13) Display the list of employees working in each department. Display the department information even if no employee belongs to that department.
[75]: df10=pd.merge(emp,dep,on='DEPTNO',how='right')
df10

[75]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO \


0 NaN NaN NaN NaN NaT NaN NaN NaN
1 7782.0 CLARK MANAGER 7839.0 1981-06-09 2450.0 NaN 10.0
2 7839.0 KING PRESIDENT NaN 1981-11-17 5000.0 NaN 10.0
3 7934.0 MILLER CLERK 7782.0 1982-01-23 1300.0 NaN 10.0
4 7369.0 SMITH CLERK 7902.0 1980-12-17 800.0 NaN 20.0
5 7566.0 JONEES MANAGER 7839.0 1981-04-02 2975.0 NaN 20.0
6 7788.0 SCOTT ANALYST 7566.0 1982-12-09 3000.0 NaN 20.0
7 7876.0 ADAMS CLERK 7788.0 1983-01-12 1100.0 NaN 20.0
8 7902.0 FORD ANALYST 7566.0 1981-12-04 3000.0 NaN 20.0
9 7499.0 ALLEN SALESMAN 7698.0 1981-02-20 1600.0 300.0 30.0
10 7521.0 WARD SALESMAN 7698.0 1981-02-22 1250.0 500.0 30.0
11 7654.0 MARTIN SALESMAN 7698.0 1981-09-28 1250.0 1400.0 30.0
12 7698.0 BLAKE MANAGER 7839.0 1981-05-01 2850.0 NaN 30.0
13 7844.0 TURNER SALESMAN 7698.0 1981-09-08 1500.0 0.0 30.0
14 7900.0 JAMES CLERK 7698.0 1981-12-03 950.0 NaN 30.0
15 NaN NaN NaN NaN NaT NaN NaN 40.0

DNAME LOC
0 NaN NaN
1 ACCOUNTING NEW YORK
2 ACCOUNTING NEW YORK
3 ACCOUNTING NEW YORK
4 RESEARCH DALLAS
5 RESEARCH DALLAS
6 RESEARCH DALLAS
7 RESEARCH DALLAS
8 RESEARCH DALLAS
9 SALES CHICAGO

10 SALES CHICAGO
11 SALES CHICAGO
12 SALES CHICAGO
13 SALES CHICAGO
14 SALES CHICAGO
15 OPERATIONS BOSTON
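
A right merge keeps every department but silently drops employees whose DEPTNO has no match in `dep`. One possible way to see both kinds of mismatch at once is an outer merge with `indicator=True`, sketched below on cut-down stand-in frames (one employee per department number that appears in `emp`; the frames are illustrative, not the full data).

```python
import pandas as pd

# Stand-in frames: one employee per DEPTNO seen in emp, and the four departments in dep.
emp = pd.DataFrame({"ENAME": ["CLARK", "SMITH", "ALLEN", "Vishnu"],
                    "DEPTNO": [10, 20, 30, 90]})
dep = pd.DataFrame({"DEPTNO": [10, 20, 30, 40],
                    "DNAME": ["ACCOUNTING", "RESEARCH", "SALES", "OPERATIONS"]})

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'.
both = pd.merge(emp, dep, on="DEPTNO", how="outer", indicator=True)

depts_without_staff = both.loc[both["_merge"] == "right_only", "DEPTNO"].tolist()
staff_without_dept = both.loc[both["_merge"] == "left_only", "ENAME"].tolist()

print(depts_without_staff)  # [40]
print(staff_without_dept)   # ['Vishnu']
```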

0.10 14) List the jobs unique to department number 20.


[78]: jobs_20 = set(emp[emp.DEPTNO == 20]['JOB'])
jobs_other = set(emp[emp.DEPTNO != 20]['JOB'])
jobs_20 - jobs_other

[78]: {'ANALYST'}

0.11 15) List the names of employees whose employee numbers are 7369, 7521, 7839, 7934, 7788.
[85]: df12=emp[emp.EMPNO.isin([7369,7521,7839,7934,7788])]['ENAME']
df12

[85]: 1 KING
2 MILLER
3 SMITH
5 SCOTT
9 WARD
Name: ENAME, dtype: object

0.12 16) List the employee details not belonging to departments 10, 30, 40.
[93]: df13=emp[~emp.DEPTNO.isin([10,30,40])]
df13

[93]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
4 7566 JONEES MANAGER 7839.0 1981-04-02 2975 NaN 20
5 7788 SCOTT ANALYST 7566.0 1982-12-09 3000 NaN 20
6 7876 ADAMS CLERK 7788.0 1983-01-12 1100 NaN 20
7 7902 FORD ANALYST 7566.0 1981-12-04 3000 NaN 20
14 7370 Vishnu House Wife 7698.0 1981-06-29 400 NaN 90

0.13 17) List the names of employees whose employee numbers are 7369, 7521, 7839, 7934, 7788 and whose salary is more than 1500.
[113]: df14 = emp[(emp.EMPNO.isin([7369, 7521, 7839, 7934, 7788])) & (emp.SAL > 1500)][['ENAME']]

df14

[113]: ENAME
1 KING
5 SCOTT

0.14 18) List the name and salary of employees whose salary is between 1000 and 2000.
[120]: df15=emp[emp.SAL.between(1000,2000)][['ENAME','SAL']]
df15

[120]: ENAME SAL


2 MILLER 1300
6 ADAMS 1100
8 ALLEN 1600
9 WARD 1250
10 MARTIN 1250
12 TURNER 1500

0.15 19) List the name and salary of employees whose salary is not between 1000 and 2000.
[123]: df16=emp[~emp.SAL.between(1000,2000)][['ENAME','SAL']]
df16

[123]: ENAME SAL


0 CLARK 2450
1 KING 5000
3 SMITH 800
4 JONEES 2975
5 SCOTT 3000
7 FORD 3000
11 BLAKE 2850
13 JAMES 950
14 Vishnu 400
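
Note that `Series.between` includes both endpoints by default, so a salary of exactly 1000 or 2000 would count as "between". The sketch below uses made-up salary values (not taken from `emp`) to show the difference, and assumes a pandas version recent enough (1.3 or later) to accept string values for `inclusive`.

```python
import pandas as pd

# Hypothetical salaries chosen to sit on and around the boundaries.
sal = pd.Series([800, 1000, 1500, 2000, 2450], name="SAL")

# Default behaviour: both endpoints are included.
inclusive_hits = sal[sal.between(1000, 2000)].tolist()

# Strict comparison: neither endpoint is included.
strict_hits = sal[sal.between(1000, 2000, inclusive="neither")].tolist()

print(inclusive_hits)  # [1000, 1500, 2000]
print(strict_hits)     # [1500]
```

In the `emp` data no salary falls exactly on 1000 or 2000, so both readings give the same rows here.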

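0.16 20) List the employees who joined before 30 June 1981 or after December 1981.
[142]: df17 = emp[(emp.HIREDATE < '1981-06-30') | (emp.HIREDATE > '1981-12-31')]
df17

[142]: EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO


0 7782 CLARK MANAGER 7839.0 1981-06-09 2450 NaN 10
2 7934 MILLER CLERK 7782.0 1982-01-23 1300 NaN 10
3 7369 SMITH CLERK 7902.0 1980-12-17 800 NaN 20
4 7566 JONEES MANAGER 7839.0 1981-04-02 2975 NaN 20
5 7788 SCOTT ANALYST 7566.0 1982-12-09 3000 NaN 20
6 7876 ADAMS CLERK 7788.0 1983-01-12 1100 NaN 20
8 7499 ALLEN SALESMAN 7698.0 1981-02-20 1600 300.0 30
9 7521 WARD SALESMAN 7698.0 1981-02-22 1250 500.0 30
11 7698 BLAKE MANAGER 7839.0 1981-05-01 2850 NaN 30
14 7370 Vishnu House Wife 7698.0 1981-06-29 400 NaN 90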

0.17 21) List each department and the number of employees in it.


[155]: df18=emp.groupby('DEPTNO').size().to_frame("No of employee")
df18

[155]: No of employee
DEPTNO
10 3
20 5
30 6
90 1

0.18 22) List the number of jobs department-wise.


[162]: df19=emp.groupby(['DEPTNO','JOB']).size().to_frame("NO of employee")
df19

[162]: NO of employee
DEPTNO JOB
10 CLERK 1
MANAGER 1
PRESIDENT 1
20 ANALYST 2
CLERK 2
MANAGER 1
30 CLERK 1
MANAGER 1
SALESMAN 4
90 House Wife 1
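
The grouping above counts employees per (DEPTNO, JOB) pair. If the question is instead read as "how many distinct jobs does each department have", one possible sketch uses `nunique`. The frame below is a stand-in holding only the DEPTNO and JOB columns transcribed from the printed `emp` table.

```python
import pandas as pd

# DEPTNO and JOB columns copied from the emp listing, in row order.
emp = pd.DataFrame(
    {
        "DEPTNO": [10, 10, 10, 20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30, 90],
        "JOB": ["MANAGER", "PRESIDENT", "CLERK",
                "CLERK", "MANAGER", "ANALYST", "CLERK", "ANALYST",
                "SALESMAN", "SALESMAN", "SALESMAN", "MANAGER", "SALESMAN", "CLERK",
                "House Wife"],
    }
)

# Number of distinct job titles within each department.
jobs_per_dept = emp.groupby("DEPTNO")["JOB"].nunique()
print(jobs_per_dept.to_dict())  # {10: 3, 20: 3, 30: 3, 90: 1}
```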

0.19 23) List the number of employees working with the company.
[170]: df20=emp['EMPNO'].count()
print("Number of employees working:")
df20

Number of employees working:

[170]: 15

0.20 24) List the minimum, maximum and average SAL.


[181]: df21=pd.DataFrame({"minimum salary":[emp.SAL.min()],"maximum salary":[emp.SAL.max()],"Average salary":[emp.SAL.mean()]})

df21

[181]: minimum salary maximum salary Average salary


0 400 5000 1961.666667

0.21 25) List the average, minimum and maximum SAL of employees working in DEPTNO=20.
[190]: df22 = emp[emp['DEPTNO'] == 20]['SAL'].agg(Minimum_Sal='min',Maximum_Sal='max',Average_Sal='mean')

df22

[190]: Minimum_Sal 800.0


Maximum_Sal 3000.0
Average_Sal 2175.0
Name: SAL, dtype: float64

0.22 26) List the total salary, minimum salary and maximum salary job-wise.
[197]: df24 = emp.groupby('JOB')['SAL'].agg(Total_Salary='sum',Minimum_Salary='min',Maximum_Salary='max')

df24

[197]: Total_Salary Minimum_Salary Maximum_Salary


JOB
ANALYST 6000 3000 3000
CLERK 4150 800 1300
House Wife 400 400 400
MANAGER 8275 2450 2975
PRESIDENT 5000 5000 5000
SALESMAN 5600 1250 1600

0.23 27) List the total SAL, max SAL, min SAL and average SAL of employees job-wise for department 20 only.
[206]: df25=emp[emp.DEPTNO==20].groupby('JOB')['SAL'].agg(Total_Salary='sum',Minimum_Salary='min',Maximum_Salary='max',Average_Salary='mean')

df25

[206]: Total_Salary Minimum_Salary Maximum_Salary Average_Salary


JOB
ANALYST 6000 3000 3000 3000.0
CLERK 1900 800 1100 950.0
MANAGER 2975 2975 2975 2975.0

0.24 28) List the average SAL of every department that has more than 5 employees.
[223]: df26=emp.groupby('DEPTNO').filter(lambda x: len(x)>5).groupby('DEPTNO',as_index=False)['SAL'].mean()

df26

[223]: DEPTNO SAL


0 30 1566.666667

[227]: emp.groupby('DEPTNO').size()

[227]: DEPTNO
10 3
20 5
30 6
90 1
dtype: int64

[229]: df27=emp.groupby('DEPTNO').filter(lambda x: len(x)>=5).groupby('DEPTNO',as_index=False)['SAL'].mean()

df27

[229]: DEPTNO SAL


0 20 2175.000000
1 30 1566.666667
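
The same result can be reached in a single grouping pass, roughly the pandas counterpart of SQL's `GROUP BY ... HAVING`. The following is a sketch only: the column names `headcount` and `avg_sal` are illustrative choices, and the frame holds just the DEPTNO and SAL columns transcribed from the printed `emp` table.

```python
import pandas as pd

# DEPTNO and SAL columns copied from the emp listing, in row order.
emp = pd.DataFrame(
    {
        "DEPTNO": [10, 10, 10, 20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30, 90],
        "SAL": [2450, 5000, 1300, 800, 2975, 3000, 1100, 3000,
                1600, 1250, 1250, 2850, 1500, 950, 400],
    }
)

# Compute the group size and the mean together, then keep only the large groups.
summary = emp.groupby("DEPTNO").agg(headcount=("SAL", "count"), avg_sal=("SAL", "mean"))
large_depts = summary[summary["headcount"] > 5]
print(large_depts)
```

Changing the condition to `>= 5` would also keep department 20, matching the second cell above.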

0.25 29) List the average SAL for each job, excluding managers.
[240]: df28=emp[emp.JOB!='MANAGER'].groupby('JOB')['SAL'].mean()
df28

[240]: JOB
ANALYST 3000.0
CLERK 1037.5
House Wife 400.0
PRESIDENT 5000.0
SALESMAN 1400.0
Name: SAL, dtype: float64

0.26 30) List the total SAL, max SAL and min SAL of employees job-wise for DEPTNO=20, display only those rows having an average SAL greater than 1000, and arrange the output in ascending order of total SAL.
[265]: df29=emp[emp.DEPTNO==20].groupby('JOB',as_index=False).agg(Total_salary=('SAL','sum'),Maximum_sal=('SAL','max'),Minimum_sal=('SAL','min'),Avg_sal=('SAL','mean')).query('Avg_sal>1000').sort_values('Total_salary')

df29

[265]: JOB Total_salary Maximum_sal Minimum_sal Avg_sal


2 MANAGER 2975 2975 2975 2975.0
0 ANALYST 6000 3000 3000 3000.0
