"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 16, 2019

Day#306 - Express the SQL in pandas, TSQL in Pandas

I wanted to mimic joins, aggregation, sum whatever we do in Database with pandas. A simple storyline of Data Analysis between Employee, Department and Salary using pandas dataframes.


import pandas as pd
#Define Data Frames
Employee = {'name': ['Raj', 'Siva', 'Mike', 'Gopi','New_Joinee'],
'age': [22,38,26,35,22]}
dfEmployee = pd.DataFrame(Employee)
print(dfEmployee)
#Salary
Salary = {'name': ['Raj', 'Siva', 'Mike', 'Gopi','Raj', 'Siva'],
'salary': [2200,3800,2600,3500,7000,5000],
'Month': ['Jan','Feb','Jan','Jan','Feb','Mar']
}
dfSalary = pd.DataFrame(Salary)
print(dfSalary)
#Department
Department = {'name': ['Raj', 'Siva', 'Mike', 'Gopi','NOCODE'],
'dept': ['IT','AI','HR','DB','NOCODE']}
dfDepartment = pd.DataFrame(Department)
print(dfDepartment)
#Inner Join, Employee and Dept
print('Outer Join')
print(pd.merge(dfEmployee, dfDepartment, on='name', how='outer'))
#Left Join
print('Left Join')
print(pd.merge(dfEmployee, dfDepartment, on='name', how='left'))
#Right Join
print('Right Join')
print(pd.merge(dfEmployee, dfDepartment, on='name', how='right'))
#Inner Join
print('Inner Join')
print(pd.merge(dfEmployee, dfDepartment, on='name', how='inner'))
#Group by
#Total Salary Group by Employee
#Do a join
salarydata = pd.merge(dfEmployee, dfSalary, on='name', how='inner')
print(salarydata)
print('Total Paid by Employee')
#Perform Group By
print(salarydata.groupby(['name']).sum())
#Sum
#Total Salary Paid
salarydata = pd.merge(dfEmployee, dfSalary, on='name', how='inner')
Total = salarydata['salary'].sum()
print('Total Salary Paid')
print(Total)
#Min
#Min Salary paid by employee
MinimumSalary = salarydata['salary'].min()
print('MinimumSalary')
print(MinimumSalary)
#Max
#Max Salary paid by employee
MaxSalary = salarydata['salary'].max()
print('MaxSalary')
print(MaxSalary)
view raw Pandas_Joins.py hosted with ❤ by GitHub
Everything can be done in SQL. This is a different approach to it using pandas.

Happy Learning!!!

No comments: