v2.1 release #455

Merged
catherinebirney merged 154 commits into v2.1-release from develop
Sep 3, 2025

@catherinebirney
Contributor

@catherinebirney catherinebirney commented Jul 2, 2025

Major changes:

  • Data quality scoring implemented for FBS
    • New adjust_dqi_reliability_collection_scores() to modify data reliability and data collection based on source and target sector levels
    • assign_temporal_correlation() assigns temporal DQ based on difference between year of data and target year of FBS
    • assign_geographical_correlation() assigns DQ for geoscale based on data geoscale vs target FBS geoscale
    • assign_technological_correlation() assigns DQ scores based on difference between source and target sectors
  • Modified how data are merged on location so we can correctly merge state with county data
  • Modified how activities are mapped to sectors
    • Changed how activities are mapped to properly account for data quality scores
      - Technological scores
      - Modify data reliability and data collection scores after mapping
    • First map to the sector year identified in the data crosswalk, then later convert to the target sector year. Previously, we immediately converted the crosswalk to the target sector year
    • Modified NAICS year conversion method
      - Pull all NAICS6 and determine mapping changes for child NAICS to parent NAICS in generate_naics_crosswalk_conversion_ratios()
      - For example, if we are converting NAICS4 across years, we identify all child NAICS6 and determine how those NAICS6 map between years. If there are 4 child NAICS6 and one child NAICS6 maps to a different parent NAICS4 in the target year, then ¼ of the original NAICS4 parent value is mapped to a different NAICS4 in the target year
      • Conversion is not based on numeric values within the FBS because we might only have NAICS4 values, not NAICS6, and therefore do not have the data to create proportional conversions
      - We previously mapped all activities to NAICS6+, then converted, then aggregated - problematic when assigning DQ scores
    • New subset_sector_key()
      - Subsets sector key to return industry that most closely maps activity/source sectors to target sectors – drops parent sectors within crosswalk and assigns tech corr scoring, modifies datareliability and datacollection scores based on mapping
  • Modified how NAICS are converted to target NAICS years
    • A data check previously tested whether a sector-like activity was found in any NAICS year outside of the target year and, if so, mapped it to the target year. This did not always map correctly because a sector can be found in multiple NAICS years, and those NAICS years map differently to the target year
      - Revised this function to check for the closest NAICS year to the target year and use that year to map to target NAICS
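
To illustrate the NAICS4 example above, here is a minimal sketch of count-based conversion ratios. It is not the code in generate_naics_crosswalk_conversion_ratios(); the function name `parent_conversion_ratios`, the pair-list input, and the sector codes are assumptions made for illustration, and the mapping shown is not an actual NAICS revision. Each child NAICS6 is weighted equally, so no FlowAmount values are needed.

```python
from collections import Counter, defaultdict


def parent_conversion_ratios(naics6_concordance, level):
    """Return {source_parent: {target_parent: ratio}} at the given digit level.

    naics6_concordance is an iterable of (source_naics6, target_naics6) pairs.
    Every pair counts equally; ratios are shares of child NAICS6 codes,
    not shares of any numeric flow value.
    """
    counts = defaultdict(Counter)
    for src6, tgt6 in naics6_concordance:
        # Truncate both codes to the parent level being converted
        counts[src6[:level]][tgt6[:level]] += 1

    ratios = {}
    for src_parent, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        ratios[src_parent] = {tgt: n / total for tgt, n in tgt_counts.items()}
    return ratios


# Illustrative concordance: four children of parent "1111", one of which
# moves to parent "1112" in the target year.
concordance = [
    ("111110", "111110"),
    ("111120", "111120"),
    ("111130", "111130"),
    ("111140", "111210"),
]
ratios = parent_conversion_ratios(concordance, level=4)
```

Here `ratios["1111"]` assigns ¾ of the source NAICS4 value to "1111" and ¼ to "1112". A child NAICS6 that splits into several target codes would contribute one count per pair in this sketch, which may not match how the released function handles splits.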

Minor changes:

  • Correct error in attribute_flows_to_sectors()
    • Original group_total assignment was based on original df FlowAmount values, but we reset the index, so needed to base group_total on new index of the df
  • Adds FIPS scale (1,3,5) to FIPS_Crosswalk
  • Add NAICS 2002, 2007, 2022 crosswalks
  • Expand NAICS_Crosswalk_TimeSeries to include NAICS 2022
  • New NAICS_Year_Concordance which maps published 6-digit sectors across years
  • New Sector_Levels csv which labels sector level and sector length for all sectors
  • In source_catalog.yaml
    • Correct BLS_QCEW NAICS years for 2011, 2022, and 2023
  • BLS QCEW estimate_suppressed_qcew()
    • Update the function to only estimate suppressed data up to the max sector level. We no longer estimate suppressed 6-digit sectors when our target is 3-digit
  • Data Quality scores
    • Update GHGI scores
  • Consistent fips scale assignments. National = 5, state = 2, county = 1
  • URL updates to government FBA links
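
As a rough sketch of the estimate_suppressed_qcew() change listed above, the filter below keeps only suppressed rows whose sector is no more detailed than the target level. The helper name `rows_to_estimate`, the row dictionaries, and the `suppressed` flag are hypothetical, and the sketch assumes sector level equals the number of digits in the code, which would not hold for special household or government codes.

```python
def rows_to_estimate(rows, target_sector_length):
    """Keep suppressed rows whose sector is no more detailed than the target."""
    return [
        row for row in rows
        if row["suppressed"] and len(row["sector"]) <= target_sector_length
    ]


# Hypothetical QCEW-like rows; only the sector code and a suppression flag
rows = [
    {"sector": "311", "suppressed": True},
    {"sector": "3111", "suppressed": True},
    {"sector": "311111", "suppressed": True},
    {"sector": "312", "suppressed": False},
]
selected = rows_to_estimate(rows, target_sector_length=3)
```

With a 3-digit target, only the suppressed "311" row is selected; the 4- and 6-digit suppressed rows are skipped rather than estimated.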

FBA changes

  • BLS_QCEW: expand to include 2000 – 2023, add county FBS, some changes to target_naics_year to match those of the FBA

FBS changes

  • Retain NAICS 2017 Schema for Employment 2022 and 2023 FBS to enable comparison to prior years

Includes PR:

#441
#456

  • Generates new FBAs for EPA GHGI for 2019-2023
  • Updates to GHG FBS national (m1 and m2) for 2019 - 2023; drops 2012 - 2018 FBS which no longer will work with the latest FBAs
  • Updates Use and Supply tables in SUT format (see Replaces derived use tables with benchmark year #453)
  • Adds Wages_national FBS for 2017
  • Update activity to sector mapping to modify how parent sectors are dropped to work for non-NAICS-like activities
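
For readers unfamiliar with what dropping parent sectors means, the following is a minimal sketch under the assumption that a parent sector is any code that is a strict prefix of another code mapped to the same activity. The helper `drop_parent_sectors` is hypothetical and is not the logic in subset_sector_key(), which also adjusts data quality scores.

```python
def drop_parent_sectors(sectors):
    """Return sectors that are not a strict prefix of another sector in the set."""
    unique = set(sectors)
    return sorted(
        s for s in unique
        # A sector is a parent if some other, longer code starts with it
        if not any(other != s and other.startswith(s) for other in unique)
    )


# Hypothetical sectors mapped to a single activity
mapped = ["31", "311", "3111", "312"]
kept = drop_parent_sectors(mapped)
```

Here "31" and "311" are dropped because "3111" is more specific, while "312" is kept because none of its children appear in the list.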

bl-young and others added 30 commits February 16, 2024 14:12
…uppressed data up to the target sector level in FBS method
# Conflicts:
#	flowsa/data_source_scripts/EPA_GHGI.py
#	flowsa/methods/flowbyactivitymethods/EPA_GHGI.yaml
…e sectors that exist in the flowbyactivity and those that most closely map to the target sectors
…th rather than string length for household and gov codes
@catherinebirney catherinebirney added this to the v2.1.0 milestone Jul 2, 2025
@WesIngwersen

The ghg_2025 branch has been updated. Somehow that will need to be merged in.

@catherinebirney
Contributor Author

The ghg_2025 branch has been updated. Somehow that will need to be merged in.

will touch base with @bl-young and pull in via PR #456

@catherinebirney catherinebirney marked this pull request as ready for review September 3, 2025 18:10
@catherinebirney catherinebirney merged commit bc44b48 into v2.1-release Sep 3, 2025
10 checks passed
@catherinebirney catherinebirney mentioned this pull request Sep 3, 2025
4 participants