Invented Dataset Descriptions (Set 2)
1. WATERRESOURCES.DOC
- Source: Generated dataset 2023
- Coverage: 150 river basins worldwide observed across 1980–2015.
- Description: This dataset describes freshwater availability and use at the river-basin
level. It brings together hydrological data, water extraction rates, and population
pressures. The dataset allows researchers to study patterns of water scarcity, identify
vulnerable regions, and examine the consequences of overuse. It is valuable for
policymakers designing sustainable water management strategies and for assessing the
risks of conflict over shared resources.
- Variables:
1. BASIN – Basin identifier
2. YR – Year
3. FLOW – Average annual river flow (cubic meters per second)
4. WITHDRAW – Water withdrawn for human use (billion cubic meters per year)
5. POP_DEP – Population dependent on basin (millions)
6. IRRIG – Share of water used for irrigation (%)
7. STRESS – Water stress index (0–100 scale)
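FLOW and WITHDRAW are reported in different units, so comparing them needs a conversion first. The sketch below is one possible way to do it: it converts FLOW from cubic meters per second to billion cubic meters per year and then computes the share of annual flow that is withdrawn. The function names and the sample record are hypothetical, and the code assumes that WITHDRAW is an annual total and that a year has 365 days. The resulting ratio is not the dataset's STRESS index, whose construction is not documented here.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year


def annual_flow_bcm(flow_m3_per_s: float) -> float:
    """Convert an average flow in m^3/s to billion m^3 per year."""
    return flow_m3_per_s * SECONDS_PER_YEAR / 1e9


def withdrawal_ratio(flow_m3_per_s: float, withdraw_bcm: float) -> float:
    """Share of annual flow withdrawn (WITHDRAW assumed to be an annual total)."""
    flow_bcm = annual_flow_bcm(flow_m3_per_s)
    if flow_bcm <= 0:
        raise ValueError("FLOW must be positive")
    return withdraw_bcm / flow_bcm


# Hypothetical record, not taken from the dataset
record = {"BASIN": "B001", "YR": 2000, "FLOW": 1000.0, "WITHDRAW": 6.3072}
ratio = withdrawal_ratio(record["FLOW"], record["WITHDRAW"])
print(f"{record['BASIN']} {record['YR']}: {ratio:.1%} of annual flow withdrawn")
```

A ratio above 1 would mean that withdrawals exceed the recorded river flow, which may point to groundwater use, reuse of return flows, or a data error.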
2. MIGRATIONPATTERNS.XLS
- Source: Generated dataset 2012
- Coverage: 100 sending and receiving countries observed across 1990–2020.
- Description: This dataset documents global migration trends, recording flows of people
across countries and the socioeconomic consequences of migration. It includes data on
remittances, labor market integration, and demographic shifts. The dataset is highly
relevant for understanding the drivers of migration and for evaluating immigration policies.
It also sheds light on how migration affects both the origin and destination countries in
terms of skills, employment, and income distribution.
- Variables:
1. ORIG – Country of origin code
2. DEST – Country of destination code
3. YR – Year
4. MIGR_FLOW – Number of migrants (thousands)
5. REMIT – Remittances sent back (millions USD)
6. EMP_RATE – Employment rate of migrants (%)
7. DEM_IMP – Demographic impact index
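Because flows are recorded by origin and destination, a net migration balance per country and year can be derived from them. The following is a minimal sketch, assuming each row holds one ORIG-DEST-YR flow with MIGR_FLOW in thousands. The function name, country codes, and values are made up for illustration.

```python
from collections import defaultdict


def net_migration(rows):
    """Return {(country, year): inflow - outflow}, in thousands of migrants."""
    net = defaultdict(float)
    for row in rows:
        net[(row["DEST"], row["YR"])] += row["MIGR_FLOW"]  # arrivals
        net[(row["ORIG"], row["YR"])] -= row["MIGR_FLOW"]  # departures
    return dict(net)


# Hypothetical rows with made-up country codes and values
rows = [
    {"ORIG": "AAA", "DEST": "BBB", "YR": 2000, "MIGR_FLOW": 50.0},
    {"ORIG": "BBB", "DEST": "AAA", "YR": 2000, "MIGR_FLOW": 20.0},
    {"ORIG": "CCC", "DEST": "BBB", "YR": 2000, "MIGR_FLOW": 10.0},
]
balance = net_migration(rows)
for (country, year), value in sorted(balance.items()):
    print(country, year, value)
```

The balances only reflect flows between countries that appear in the dataset, so they sum to zero within each year and should not be read as total net migration.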
3. ENERGYTRANSITION.DOC
- Source: Generated dataset 2023
- Coverage: 80 countries observed across 2000–2022.
- Description: This dataset focuses on the global energy transition, recording the balance
between fossil fuels and renewable sources. It tracks how investments, policies, and
technological changes drive the shift towards cleaner energy. By including indicators on
carbon intensity and subsidies, it allows analysis of policy effectiveness. Researchers and
policymakers can use it to project future energy needs and to monitor progress toward
climate targets such as net-zero emissions.
- Variables:
1. CTRY – Country code
2. YR – Year
3. FOSSIL_SHARE – Share of energy from fossil fuels (%)
4. RENEW_SHARE – Share of energy from renewables (%)
5. SUBSIDY – Fossil fuel subsidies (millions USD)
6. INV_RENEW – Investments in renewable energy (millions USD)
7. CARBON_INT – Carbon intensity (kg CO₂ per unit of GDP)
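Since FOSSIL_SHARE and RENEW_SHARE are both percentages of the same total, a simple plausibility check is possible before analysis. The sketch below flags records where either share falls outside 0 to 100 or where the two together exceed 100. The function name and the tolerance are hypothetical choices. The shares are not required to sum to exactly 100, because sources such as nuclear power are neither fossil nor renewable.

```python
def check_shares(row, tol=0.5):
    """Return a list of problems found in one country-year record."""
    problems = []
    for name in ("FOSSIL_SHARE", "RENEW_SHARE"):
        if not 0.0 <= row[name] <= 100.0:
            problems.append(f"{name} outside 0-100")
    # tol (in percentage points) allows for rounding in the source data
    if row["FOSSIL_SHARE"] + row["RENEW_SHARE"] > 100.0 + tol:
        problems.append("shares sum to more than 100")
    return problems


# Hypothetical records, not taken from the dataset
ok_row = {"CTRY": "AAA", "YR": 2010, "FOSSIL_SHARE": 70.0, "RENEW_SHARE": 20.0}
bad_row = {"CTRY": "BBB", "YR": 2010, "FOSSIL_SHARE": 80.0, "RENEW_SHARE": 30.0}

print(check_shares(ok_row))
print(check_shares(bad_row))
```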
4. CULTURALHERITAGE.XLS
- Source: Generated dataset 2022
- Coverage: 200 UNESCO-listed sites observed across 1975–2015.
- Description: This dataset records the preservation status, visitor statistics, and economic
impact of cultural heritage sites. It provides insights into how tourism, conservation
policies, and urbanization affect heritage preservation. The dataset is suitable for studies in
cultural economics, sustainable tourism, and heritage management. It highlights the tension
between conservation and commercialization, offering evidence for balancing cultural value
with economic development.
- Variables:
1. SITE_ID – Site code
2. YR – Year
3. VISITORS – Annual visitors (thousands)
4. REV_TOUR – Tourism revenue linked to site (millions USD)
5. CONSERV – Conservation spending (millions USD)
6. STATUS – Preservation status index (0–100)
7. LOC_PRESS – Local development pressure index
5. INCOMEDISTRIB.DOC
- Source: Generated dataset 2019
- Coverage: 70 countries observed across 1985–2018.
- Description: This dataset provides detailed measures of income distribution and
inequality. It covers multiple dimensions, including income shares by decile, Gini
coefficients, and poverty rates. It enables in-depth exploration of how economic growth
interacts with inequality. Policymakers can use it to assess redistribution policies and to
evaluate progress toward social equity goals. It also supports comparative studies across
regions and time.
- Variables:
1. CTRY – Country code
2. YR – Year
3. GINI – Gini coefficient
4. TOP10 – Income share of top 10% (%)
5. BOT40 – Income share of bottom 40% (%)
6. POV_RATE – Poverty headcount ratio (%)
7. WELF_IDX – Composite welfare index
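TOP10 and BOT40 are the two inputs of the Palma ratio, which divides the income share of the top 10% by the income share of the bottom 40%. The dataset does not list this ratio as a variable, so the sketch below shows one way it could be derived. The function name and the sample values are hypothetical.

```python
def palma_ratio(top10: float, bot40: float) -> float:
    """Palma ratio: income share of the top 10% over that of the bottom 40%."""
    if bot40 <= 0:
        raise ValueError("BOT40 must be positive")
    if top10 + bot40 > 100:
        raise ValueError("TOP10 and BOT40 cannot exceed 100% together")
    return top10 / bot40


# Hypothetical record, not taken from the dataset
record = {"CTRY": "AAA", "YR": 2000, "TOP10": 30.0, "BOT40": 15.0}
palma = palma_ratio(record["TOP10"], record["BOT40"])
print(f"Palma ratio for {record['CTRY']} in {record['YR']}: {palma:.2f}")
```

A higher value indicates a larger gap between the top and the bottom of the distribution, which makes the ratio a useful companion to GINI when comparing countries.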
6. TECHADOPTION.DOC
- Source: Generated dataset 2023
- Coverage: 90 countries observed across 1995–2020.
- Description: This dataset documents the adoption of key technologies over time, including
computing, mobile phones, and AI. It reveals the speed of diffusion, differences across
income groups, and productivity effects. The dataset is well-suited for studying technology
gaps, leapfrogging in developing economies, and the role of innovation in growth. It can also
be used for forecasting the spread of emerging technologies and understanding their labor
market impact.
- Variables:
1. CTRY – Country identifier
2. YR – Year
3. PC_PER – Personal computers per 100 people
4. MOBILE_PER – Mobile phone subscriptions per 100 people
5. AI_USE – Share of firms using AI tools (%)
6. PROD_GROWTH – Labor productivity growth (%)
7. DIG_GAP – Digital divide index
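One simple way to compare diffusion speed across countries is to find the first year in which an adoption variable crosses a chosen threshold. The sketch below does this for MOBILE_PER. The function name, the threshold of 50 subscriptions per 100 people, and the sample rows are hypothetical choices for illustration.

```python
def first_year_reaching(rows, column, threshold):
    """Return {country: first year in which `column` >= threshold}."""
    first = {}
    for row in sorted(rows, key=lambda r: r["YR"]):
        if row[column] >= threshold and row["CTRY"] not in first:
            first[row["CTRY"]] = row["YR"]
    return first


# Hypothetical rows with made-up country codes and values
rows = [
    {"CTRY": "AAA", "YR": 2008, "MOBILE_PER": 90.0},
    {"CTRY": "AAA", "YR": 1998, "MOBILE_PER": 5.0},
    {"CTRY": "AAA", "YR": 2003, "MOBILE_PER": 55.0},
    {"CTRY": "BBB", "YR": 2005, "MOBILE_PER": 20.0},
    {"CTRY": "BBB", "YR": 2010, "MOBILE_PER": 60.0},
    {"CTRY": "CCC", "YR": 2010, "MOBILE_PER": 30.0},
]
crossing = first_year_reaching(rows, "MOBILE_PER", 50.0)
print(crossing)
```

Countries that never reach the threshold within the observed years are left out of the result rather than assigned a placeholder year.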