Determinants of Human Development
Index (HDI): A Comprehensive Analysis
Executive Summary
This report presents a comprehensive analysis of the determinants of the Human
Development Index (HDI) using data from the INDODAPOER dataset. The analysis
employs various econometric methods, including Ordinary Least Squares (OLS)
regression and Instrumental Variables (IV) regression, to identify the key factors that
drive HDI levels. The findings reveal that poverty rates, literacy rates, and access to
electricity are significant determinants of HDI, with poverty having a particularly strong
negative impact. The IV regression results suggest that the effect of poverty on HDI is
even stronger than indicated by OLS estimates, highlighting the importance of
addressing endogeneity concerns in this analysis.
1. Introduction
The Human Development Index (HDI) is a composite measure that assesses
development not only through economic growth but also through improvements in
human well-being. Understanding the determinants of HDI is crucial for policymakers
seeking to improve human development outcomes. This analysis aims to identify the
key variables that drive HDI levels and quantify their impacts using advanced
econometric methods.
2. Data and Methodology
2.1 Data Source and Description
The analysis uses data from the INDODAPOER Excel dataset, which contains various
socioeconomic indicators for regions in Indonesia. The dataset includes two HDI
indicators: "Human Development Index" and "Human Development Index, revised
method." After data cleaning and preparation, the analysis focuses on the most recent
available HDI data for each region.
2.2 Variable Selection
Based on theoretical considerations and data availability, the following variables were
selected for analysis:
1. Human Development Index (HDI): The dependent variable measuring overall
human development
2. Poverty Rate: Percentage of population living below the poverty line
3. Literacy Rate: Literacy rate for population aged 15 and over
4. Household Access to Electricity: Percentage of households with access to
electricity
2.3 Methodology
The analysis employs several econometric approaches:
1. Correlation Analysis: To identify the strength and direction of relationships
between HDI and potential determinants
2. OLS Regression: To estimate the basic relationships between HDI and its
determinants
3. Instrumental Variables (IV) Regression: To address potential endogeneity issues,
using literacy rate as an instrument for poverty rate
4. Robustness Checks: Including subsample analysis and alternative specifications
3. Results and Analysis
3.1 Descriptive Statistics
The analysis dataset includes 544 regions with complete data for all variables. The
summary statistics reveal:
• HDI values range from 49.29 to 80.51, with a mean of 72.09
• Poverty rates range from 2.02% to 41.76%, with a mean of 11.87%
• Literacy rates range from 30.03% to 113.87%, with a mean of 97.03%
• Household access to electricity ranges from 12.99% to 100%, with a mean of
96.76%
3.2 Correlation Analysis
The correlation analysis shows strong relationships between HDI and its potential
determinants:
• Poverty Rate: -0.725 (strong negative correlation)
• Literacy Rate: 0.703 (strong positive correlation)
• Household Access to Electricity: 0.654 (strong positive correlation)
These correlations suggest that regions with lower poverty rates, higher literacy rates,
and better access to electricity tend to have higher HDI values.
3.3 OLS Regression Results
The OLS regression model explains approximately 67.2% of the variation in HDI (R-
squared = 0.672). All variables are statistically significant at the 1% level:
• Poverty Rate: -0.284 (p < 0.001)
• Literacy Rate: 0.240 (p < 0.001)
• Household Access to Electricity: 0.104 (p < 0.001)
The results indicate that a one percentage point increase in poverty rate is associated
with a 0.284 point decrease in HDI, holding other factors constant. Similarly, one
percentage point increases in literacy rate and household access to electricity are
associated with 0.240 and 0.104 point increases in HDI, respectively.
3.4 Diagnostic Tests for OLS
Several diagnostic tests were conducted to assess the validity of the OLS model:
1. Multicollinearity: The Variance Inflation Factors (VIFs) for the explanatory
variables range from 1.71 to 1.92, indicating moderate but acceptable levels of
multicollinearity.
2. Heteroscedasticity: The Breusch-Pagan test indicates the presence of
heteroscedasticity (p < 0.001), suggesting that the error variance is not constant.
3. Autocorrelation: The Durbin-Watson statistic of 1.51 suggests some positive
autocorrelation in the residuals.
4. Normality of Residuals: The Jarque-Bera test indicates that the residuals are not
normally distributed (p < 0.001).
These diagnostic results suggest that while the OLS model provides valuable insights,
there are some violations of classical assumptions that warrant further investigation and
potentially more advanced methods.
3.5 Instrumental Variables (IV) Regression
To address potential endogeneity concerns, particularly regarding the poverty rate
variable, an IV regression was conducted using literacy rate as an instrument for poverty
rate. The first-stage regression shows that literacy rate is a strong predictor of poverty
rate (F-statistic = 214.54, p < 0.001), satisfying the relevance condition for a valid
instrument.
The IV regression results show:
• Poverty Rate: -1.158 (p < 0.001)
• Household Access to Electricity: -0.219 (p = 0.039)
• Constant: 107.05 (p < 0.001)
Notably, the coefficient on poverty rate is substantially larger in magnitude in the IV
regression (-1.158) compared to the OLS regression (-0.284), suggesting that the OLS
estimates may understate the true negative impact of poverty on HDI.
3.6 Robustness Checks
Several robustness checks were conducted to validate the findings:
1. Subsample Analysis: The data was split into high-HDI and low-HDI subsamples
based on the median HDI value. The IV regression results for both subsamples
confirm the negative relationship between poverty and HDI, though the magnitude
is larger for the low-HDI subsample (-1.302) compared to the high-HDI subsample
(-0.570).
2. Alternative Specification: An alternative specification including squared terms for
household access to electricity was estimated. The results show a significant non-
linear relationship, with the coefficient on the squared term being negative
(-0.0057, p < 0.001), suggesting diminishing returns to improvements in electricity
access.
4. Discussion and Implications
4.1 Key Findings
1. Poverty as a Critical Determinant: Both OLS and IV results identify poverty rate as
the strongest determinant of HDI, with a substantial negative impact. The IV
estimates suggest that the true effect of poverty on HDI may be even stronger than
indicated by OLS.
2. Importance of Education: Literacy rate shows a strong positive correlation with
HDI and serves as an effective instrument for poverty reduction, highlighting the
dual role of education in human development.
3. Infrastructure Access: Access to electricity has a significant positive relationship
with HDI, though with potential diminishing returns at very high levels of access.
4. Non-Linear Relationships: The alternative specification reveals non-linear effects,
particularly for electricity access, suggesting that the relationship between
infrastructure and human development is complex.
4.2 Policy Implications
1. Poverty Reduction Strategies: Given the strong negative impact of poverty on
HDI, targeted poverty reduction programs should be a priority for improving
human development outcomes.
2. Educational Investments: The strong relationship between literacy and HDI, as
well as literacy's role in reducing poverty, suggests that investments in education
can have multiplier effects on human development.
3. Infrastructure Development: While improving access to electricity is important,
the diminishing returns suggest that in regions with already high access rates, other
factors may become more critical for further HDI improvements.
4. Tailored Regional Approaches: The subsample analysis indicates that the
determinants of HDI may have different magnitudes of impact in regions with
different development levels, suggesting the need for tailored policy approaches.
4.3 Limitations and Future Research
1. Data Limitations: The analysis is based on the most recent available data for each
region, which may not capture the most current situation or temporal dynamics.
2. Omitted Variables: While the model includes key determinants, other factors not
included due to data limitations may also influence HDI.
3. Instrument Validity: While literacy rate appears to be a strong instrument for
poverty, the exclusion restriction (that literacy affects HDI only through poverty)
may not hold perfectly.
4. Causal Interpretation: Despite the use of IV methods, causal interpretations
should be made with caution due to potential remaining endogeneity concerns.
Future research could address these limitations by: - Incorporating panel data to analyze
changes over time - Including additional determinants as data becomes available -
Exploring alternative instruments and identification strategies - Conducting more
detailed regional analyses to account for local contexts
5. Conclusion
This analysis provides robust evidence that poverty rates, literacy rates, and access to
electricity are significant determinants of HDI levels. The IV regression results highlight
the particularly strong negative impact of poverty on human development, suggesting
that poverty reduction should be a central focus of development policies. The findings
also emphasize the importance of education and basic infrastructure in improving
human development outcomes, while acknowledging the complex and potentially non-
linear nature of these relationships.
The methodological approach, combining OLS and IV regression with comprehensive
diagnostic testing and robustness checks, provides a rigorous framework for analyzing
the determinants of human development. While acknowledging certain limitations, this
analysis offers valuable insights for policymakers and researchers interested in
understanding and improving human development outcomes.
Appendix: Technical Details
A.1 First-Stage Regression Results
The first-stage regression of poverty rate on literacy rate and household access to
electricity shows:
OLS Regression
Results
==========================================================================
Dep. Variable: Poverty Rate R-
squared: 0.442
Model: OLS Adj. R-
squared: 0.440
Method: Least Squares F-
statistic: 214.5
No. Observations: 544 Prob (F-
statistic): 2.50e-69
==========================================================================
coef std err t P>|t|
[0.025 0.975]
--------------------------------------------------------------------------
const 74.3742 3.107 23.934 0.000
68.270 80.478
Household -0.3703 0.033 -11.146 0.000
-0.436 -0.305
Literacy -0.2749 0.038 -7.310 0.000
-0.349 -0.201
==========================================================================
A.2 IV Regression Full Results
IV-2SLS Estimation
Summary
==========================================================================
Dep. Variable: HDI R-
squared: -0.2035
Estimator: IV-2SLS Adj. R-
squared: -0.2079
No. Observations: 544 F-
statistic: 90.224
==========================================================================
Parameter Std. Err. T-stat P-value
Lower CI Upper CI
--------------------------------------------------------------------------
const 107.05 12.381 8.6467
0.0000 82.788 131.32
Household_Access -0.2192 0.1064 -2.0595 0.0394
-0.4279 -0.0106
Poverty_Rate -1.1578 0.1884 -6.1452 0.0000
-1.5271 -0.7886
==========================================================================
A.3 Comparison of OLS and IV Coefficients
OLS IV
const 42.0702 107.053937
Household_Access 0.1909 -0.219233
Poverty_Rate -0.2841 -1.157837
A.4 Alternative Specification Results
IV-2SLS Estimation
Summary
==========================================================================
Dep. Variable: HDI R-
squared: 0.0258
Estimator: IV-2SLS Adj. R-
squared: 0.0204
No. Observations: 544 F-
statistic: 295.50
==========================================================================
Parameter Std. Err. T-stat P-
value Lower CI Upper CI
--------------------------------------------------------------------------
const 73.171 4.0975 17.857
0.0000 65.140 81.202
Household_Access 0.6764 0.1858 3.6403
0.0003 0.3122 1.0406
Household_Access_Squared -0.0057 0.0016 -3.4929
0.0005 -0.0089 -0.0025
Poverty_Rate -1.0654 0.1212 -8.7880
0.0000 -1.3030 -0.8278
==========================================================================