HR Data Analysis
1. Introduction
This project analyzes an HR dataset to uncover insights about employee demographics,
performance, engagement, and other key HR metrics. Using Python libraries like pandas and
matplotlib, the analysis offers actionable insights for better workforce management and decision-
making.
2. Objectives
Analyze gender, race, and marital status distributions.
Investigate salary trends across different attributes such as gender, department, and
employment status.
Explore employee performance, satisfaction, tenure, and recruitment effectiveness.
Examine termination trends and their reasons.
3. Dataset Overview
The dataset contains HR-related attributes, including:
Demographics: Gender, race, marital status.
Job Details: Department, salary, hire date, termination date.
Performance Metrics: Engagement survey, satisfaction score, performance score.
Other Metrics: Absences, special project counts, recruitment source.
4. Analysis and Insights
4.1. Data Exploration
python
# df is the HR dataset loaded as a pandas DataFrame
print(df.head()) # Display first few rows
print(df.isnull().sum()) # Check for missing values
print(df.describe()) # Summary statistics
Insights:
Some columns have missing values, such as termination date, which is empty for active employees.
Basic summary statistics highlight salary ranges and other key metrics.
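The missing-value check can be taken one step further by confirming that the gaps in the termination date really do belong to active employees. The following is a minimal sketch on a small made-up sample; the column name `DateofTermination` and the sample values are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample standing in for the HR dataset.
# 'DateofTermination' is an assumed column name.
df = pd.DataFrame({
    "Salary": [62000, 58000, 71000, 64000],
    "EmploymentStatus": ["Active", "Terminated", "Active", "Active"],
    "DateofTermination": [None, "2019-06-30", None, None],
})

# Parse the date column; missing entries become NaT.
df["DateofTermination"] = pd.to_datetime(df["DateofTermination"])

# Count missing values per column.
missing_counts = df.isnull().sum()

# Check that every missing termination date belongs to an active employee.
missing_term = df["DateofTermination"].isnull()
missing_only_for_active = (
    df.loc[missing_term, "EmploymentStatus"] == "Active"
).all()

print(missing_counts)
print("Missing termination dates only for active employees:",
      missing_only_for_active)
```

If the final check is false, the missing dates are more likely a data quality problem than an expected gap.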
4.2. Demographic Analysis
Gender Distribution:
python
gender_distribution = df['Sex'].value_counts()
Identifies the male-to-female ratio in the organization.
Race Distribution:
python
race_distribution = df['RaceDesc'].value_counts()
Examines diversity in the workforce.
Marital Status Distribution:
python
marital_status_distribution = df['MaritalDesc'].value_counts()
Highlights proportions of married, single, and other marital statuses.
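Raw counts are harder to compare than shares, so the distributions above could also be expressed as proportions. This is a sketch on made-up data that uses the `normalize=True` option of `value_counts`; the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample; only the column names come from the analysis above.
df = pd.DataFrame({
    "Sex": ["F", "M", "F", "F", "M"],
    "MaritalDesc": ["Single", "Married", "Married", "Single", "Divorced"],
})

# Absolute counts per category.
gender_counts = df["Sex"].value_counts()

# Share of each category (sums to 1.0).
gender_share = df["Sex"].value_counts(normalize=True)

# Counts and shares side by side, aligned on the category label.
gender_summary = pd.DataFrame({
    "count": gender_counts,
    "share": gender_share.round(2),
})
print(gender_summary)
```

The same pattern applies to `RaceDesc` and `MaritalDesc`.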
4.3. Salary Analysis
Salary by Gender:
python
salary_by_gender = df.groupby('Sex')['Salary'].mean()
Average salary comparison between genders.
Salary by Department:
python
salary_by_department = df.groupby('Department')['Salary'].mean()
Shows salary trends across various departments.
Salary by Race:
python
salary_by_race = df.groupby('RaceDesc')['Salary'].mean()
Reveals potential disparities or trends in salaries by race.
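A mean alone can be skewed by a few high earners or by very small groups. One possible refinement, sketched here on made-up numbers, is to report the median and the group size next to the mean.

```python
import pandas as pd

# Hypothetical sample; salaries and departments are illustrative only.
df = pd.DataFrame({
    "Department": ["IT", "IT", "Sales", "Sales", "Sales"],
    "Salary": [60000, 80000, 50000, 52000, 90000],
})

# Mean, median, and headcount per department in one table.
salary_summary = (
    df.groupby("Department")["Salary"]
      .agg(["mean", "median", "count"])
)
print(salary_summary)
```

In this sample the Sales mean (64000) sits well above its median (52000), which shows how a single high salary can pull the average up.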
4.4. Performance and Satisfaction
Engagement-Satisfaction Correlation:
python
performance_satisfaction_correlation = df[['EngagementSurvey', 'EmpSatisfaction']].corr()
Explores whether higher engagement correlates with better satisfaction.
Absences vs. Performance:
python
absences_by_performance = df.groupby('PerformanceScore')['Absences'].mean()
Compares average absences across performance score groups to see whether the two are related.
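The correlation matrix can be reduced to a single coefficient, and the absence averages can be sorted for easier reading. This is a minimal sketch on made-up values; the performance labels used here are assumptions about how `PerformanceScore` is encoded.

```python
import pandas as pd

# Hypothetical sample; the PerformanceScore labels are assumed.
df = pd.DataFrame({
    "EngagementSurvey": [4.6, 3.0, 4.2, 2.1, 5.0],
    "EmpSatisfaction": [5, 3, 4, 2, 5],
    "PerformanceScore": ["Exceeds", "Fully Meets", "Fully Meets",
                         "Needs Improvement", "Exceeds"],
    "Absences": [2, 8, 6, 15, 4],
})

# Pearson correlation between engagement and satisfaction (-1 to 1).
pearson_r = df["EngagementSurvey"].corr(df["EmpSatisfaction"])

# Average absences per performance group, lowest first.
absences_by_performance = (
    df.groupby("PerformanceScore")["Absences"].mean().sort_values()
)

print(f"Engagement vs. satisfaction: r = {pearson_r:.2f}")
print(absences_by_performance)
```

A correlation coefficient describes association only; it does not show that engagement causes satisfaction or that absences cause lower scores.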
4.5. Termination Analysis
Termination Rates:
python
termination_rate = df['EmploymentStatus'].value_counts(normalize=True) # Shares rather than raw counts
Shows the proportion of active versus terminated employees.
Termination Reasons:
python
terminated_employees = df[df['EmploymentStatus'] != 'Active']
termination_reasons = terminated_employees['TermReason'].value_counts()
Analyzes reasons for employee termination.
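The two steps above can be combined into a single termination rate plus a ranked list of reasons. The sketch below uses made-up rows; the status and reason labels are assumptions, and anyone whose status is not 'Active' is treated as terminated, as in the filter above.

```python
import pandas as pd

# Hypothetical sample; status and reason labels are assumed.
df = pd.DataFrame({
    "EmploymentStatus": ["Active", "Active", "Voluntarily Terminated",
                         "Terminated for Cause", "Active"],
    "TermReason": ["N/A-StillEmployed", "N/A-StillEmployed",
                   "Another position", "performance",
                   "N/A-StillEmployed"],
})

# Boolean mask: True for every employee who is no longer active.
is_terminated = df["EmploymentStatus"] != "Active"

# The mean of a boolean mask is the share of True values.
termination_rate = is_terminated.mean()

# Reasons counted for terminated employees only.
termination_reasons = df.loc[is_terminated, "TermReason"].value_counts()

print(f"Termination rate: {termination_rate:.0%}")
print(termination_reasons)
```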
4.6. Recruitment Effectiveness
Recruitment Source Distribution:
python
recruitment_source_dist = df['RecruitmentSource'].value_counts()
Highlights the most common recruitment sources.
Recruitment Source vs Performance:
python
recruitment_performance = df.groupby('RecruitmentSource')['PerformanceScore'].value_counts()
Assesses whether certain recruitment sources yield better-performing employees.
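Because recruitment sources differ in size, raw counts can favor the largest channel. One way to compare sources fairly, sketched here on made-up data, is a row-normalized cross-tabulation with `pd.crosstab`; the source names and performance labels are hypothetical.

```python
import pandas as pd

# Hypothetical sample; source names and score labels are assumed.
df = pd.DataFrame({
    "RecruitmentSource": ["LinkedIn", "LinkedIn", "Referral",
                          "Referral", "Referral", "Job Board"],
    "PerformanceScore": ["Exceeds", "Fully Meets", "Exceeds",
                         "Exceeds", "Fully Meets", "Fully Meets"],
})

# Each row sums to 1.0: the share of each performance level within a source.
source_performance = pd.crosstab(
    df["RecruitmentSource"],
    df["PerformanceScore"],
    normalize="index",
)
print(source_performance.round(2))
```

Shares based on only a handful of hires per source are unstable, so the headcount per source is worth checking before drawing conclusions.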
4.7. Special Projects
python
projects_by_department = df.groupby('Department')['SpecialProjectsCount'].mean()
Average number of special projects handled by employees in different departments.
5. Visualizations
The following charts were created to enhance the analysis:
1. Bar Chart: Gender distribution and salary by gender.
2. Pie Chart: Salary by department, race, and employment status.
3. Line Chart: Termination rates over time.
4. Histogram: Salary and satisfaction distributions.
5. Scatter Plot: Performance vs. engagement and satisfaction.
6. Bar Chart: Recruitment source performance.
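As an illustration of how one of these charts might be produced, the sketch below draws the salary-by-gender bar chart with matplotlib. The data is made up, and the labels, figure size, and non-interactive backend are assumptions rather than the settings used in the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample standing in for the HR dataset.
df = pd.DataFrame({
    "Sex": ["F", "M", "F", "M"],
    "Salary": [60000, 62000, 70000, 66000],
})

# Average salary per gender, as in the salary analysis.
salary_by_gender = df.groupby("Sex")["Salary"].mean()

# One bar per gender.
fig, ax = plt.subplots(figsize=(5, 4))
ax.bar(salary_by_gender.index, salary_by_gender.values)
ax.set_xlabel("Sex")
ax.set_ylabel("Average salary")
ax.set_title("Average salary by gender")
fig.tight_layout()
```

Calling `fig.savefig` with a file name writes the chart to disk; in a notebook the figure can be displayed directly instead.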
6. Insights and Recommendations
Demographics:
Male and female employees have similar average salaries, but deeper analysis may be
needed to uncover inequities.
Race and marital status distributions suggest potential diversity and inclusion
opportunities.
Salaries:
Salary differences between departments may reflect priority or high-revenue roles.
A clear salary disparity exists across employment statuses and races, which should be
investigated further.
Performance and Satisfaction:
A positive correlation between engagement and satisfaction emphasizes the importance
of employee engagement initiatives.
Employees with higher performance scores tend to have fewer absences.
Terminations:
The majority of terminated employees cite specific reasons, such as performance issues or
restructuring.
Insights into tenure by department suggest opportunities to increase retention rates.
Recruitment:
Recruitment source effectiveness is evident, with some channels yielding better-
performing employees. Focusing on high-performing sources could improve hiring
quality.
Projects:
Departments handling fewer special projects may indicate under-utilization or varying
job scopes.
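The tenure-by-department insight mentioned under Terminations depends on deriving tenure from the hire and termination dates. The following is a minimal sketch of one way to do that; the column names `DateofHire` and `DateofTermination`, the fixed reference date, and the sample rows are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample; the date column names are assumed.
df = pd.DataFrame({
    "Department": ["IT", "IT", "Sales", "Sales"],
    "DateofHire": ["2015-01-01", "2018-01-01", "2016-01-01", "2017-01-01"],
    "DateofTermination": ["2019-01-01", None, "2017-01-01", None],
})

df["DateofHire"] = pd.to_datetime(df["DateofHire"])
df["DateofTermination"] = pd.to_datetime(df["DateofTermination"])

# Fixed reference date so the result is reproducible; active employees
# (no termination date) are measured up to this date.
as_of = pd.Timestamp("2020-01-01")
end_date = df["DateofTermination"].fillna(as_of)

# Tenure in years, using 365.25 days per year as an approximation.
df["TenureYears"] = (end_date - df["DateofHire"]).dt.days / 365.25

tenure_by_department = df.groupby("Department")["TenureYears"].mean()
print(tenure_by_department.round(2))
```

Using a fixed `as_of` date rather than the current date keeps the numbers stable between runs.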
7. Conclusion
The analysis provides valuable insights into HR data, addressing key areas like salaries,
performance, terminations, and recruitment. By implementing the recommendations,
organizations can enhance employee satisfaction, reduce turnover, and optimize hiring strategies.