…sector level in target naics
…ate/county data in prop attribution
…uppressed data up to the target sector level in FBS method
…ing suppressed data
# Conflicts:
#	flowsa/data_source_scripts/EPA_GHGI.py
#	flowsa/methods/flowbyactivitymethods/EPA_GHGI.yaml
…e sectors that exist in the flowbyactivity and those that most closely map to the target sectors
…ivities are sector-like
…activities to sectors
…th rather than string length for hosuehold and gov codes
…ctor-like to use the Activity column data
…gned (stewi data) and convert to target sector year
…eo dq column already with all 0 values
I reviewed the FBS generation in the action at 59d24a9. For the CRHW national FBS, the facilities that come in as 5-digit NAICS instead of 6-digit are getting dropped. I think this happens only when there is a single 6-digit child for that 5-digit code. I am also seeing that 5-digit NAICS with multiple children are not being handled correctly:
Old (correct): 21222 split evenly between 212221 and 212222
New (incorrect): all of 21222 is assigned to 212222
This was resolved by c09f6f3
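For reference, the expected behavior described in the comment above (a parent value split evenly across its 6-digit children, and passed through whole when there is a single child) can be sketched as follows. This is only an illustration under stated assumptions: `split_parent_evenly` and the `children_by_parent` lookup are hypothetical names, not flowsa functions, and the actual fix in c09f6f3 may be implemented differently.

```python
from collections import defaultdict


def split_parent_evenly(flows, children_by_parent):
    """Allocate each parent-level value equally across its known children.

    flows: dict mapping a sector code to a flow amount.
    children_by_parent: dict mapping a parent code to its list of child codes
    (hypothetical lookup, assumed to come from the target NAICS crosswalk).
    """
    out = defaultdict(float)
    for sector, value in flows.items():
        children = children_by_parent.get(sector)
        if not children:
            # No known children: keep the value at its original sector
            # rather than dropping it.
            out[sector] += value
            continue
        share = value / len(children)
        for child in children:
            out[child] += share
    return dict(out)


# Multiple children: 21222 is split evenly between 212221 and 212222.
multi = split_parent_evenly(
    {"21222": 100.0}, {"21222": ["212221", "212222"]}
)

# Single child: the full value moves to the only child instead of being lost.
single = split_parent_evenly({"11111": 10.0}, {"11111": ["111110"]})
```

The flow amounts here are made up for illustration; only the 21222 to 212221/212222 relationship comes from the comment.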
In the revised |
…don't want to reset group_id here); no need to check_if_sectors_are_naics twice if 0 the first time
I believe that ebe8ae6 addresses this, though I need to confirm it doesn't negatively impact other methods. I was reviewing this in the context of GHG_national, which was showing major diffs (and duplicate values). It now looks correct and shows no change from remote.
We decided to drop the county employment FBS (or perhaps all but one example), as well as the interim national and state employment FBS files (like 2000-2012), right?
…or waste modeling work
…esulting in the dropping of child naics in the target mapping
using |
fix mapping subset when applied to parent-incompleteChild




Major changes:
- adjust_dqi_reliability_collection_scores() modifies data reliability and data collection scores based on source and target sector levels
- assign_temporal_correlation() assigns temporal DQ based on the difference between the year of the data and the target year of the FBS
- assign_geographical_correlation() assigns DQ for geoscale based on the data geoscale vs. the target FBS geoscale
- assign_technological_correlation() assigns DQ scores based on the difference between source and target sectors
- Technological scores
- Modify data reliability and data collection scores after mapping
- Pull all NAICS6 and determine mapping changes for child naics to parent naics in generate_naics_crosswalk_conversion_ratios()
- For example, if we are converting NAICS4 across years, we identify all child NAICS6 and determine how those NAICS6 map between years. If there are 4 child NAICS6 and one of them maps to a different parent NAICS4 in the target year, then ¼ of the original NAICS4 parent value is mapped to that different NAICS4 in the target year
• Conversion is not based on numeric values within the FBS because we might only have NAICS4 values, not NAICS6 and therefore do not have the data to create proportional conversions
- We previously mapped all activities to NAICS6+, then converted, then aggregated. This is not a good method for a multitude of reasons, but especially problematic when assigning DQ scores
- Subsets sector key to return industry that most closely maps activity/source sectors to target sectors – drops parent sectors within crosswalk and assigns tech corr scoring, modifies datareliability and datacollection scores based on mapping
- Revised this function to check for the closest NAICS year to the target year and use that year to map to target NAICS
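The count-based conversion described for generate_naics_crosswalk_conversion_ratios() can be illustrated with a minimal sketch. This is not the flowsa implementation: the function name `conversion_ratios`, its parameters, and the sector codes below are all hypothetical, and the sketch assumes every (source NAICS6, target NAICS6) pair in the crosswalk carries equal weight.

```python
from collections import Counter, defaultdict


def conversion_ratios(naics6_crosswalk, source_length=4, target_length=4):
    """Derive parent-level conversion ratios by counting child NAICS6 mappings.

    naics6_crosswalk: iterable of (source_year_naics6, target_year_naics6)
    pairs (hypothetical input shape). Ratios are based on counts of child
    codes, not on flow values, because the data may only exist at the
    parent level.
    """
    counts = defaultdict(Counter)
    for src6, tgt6 in naics6_crosswalk:
        # Truncate each 6-digit code to the parent level being converted.
        counts[src6[:source_length]][tgt6[:target_length]] += 1

    ratios = {}
    for src_parent, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        ratios[src_parent] = {t: n / total for t, n in tgt_counts.items()}
    return ratios


# Hypothetical codes: four NAICS6 children of "1111" in the source year,
# one of which falls under a different NAICS4 ("2222") in the target year.
crosswalk = [
    ("111101", "111101"),
    ("111102", "111102"),
    ("111103", "111103"),
    ("111104", "222201"),
]
ratios = conversion_ratios(crosswalk)

# Apply the ratios to a parent-level value of 200.
converted = {t: 200.0 * r for t, r in ratios["1111"].items()}
```

With these made-up codes, three quarters of the "1111" value stays under "1111" and one quarter moves to "2222", matching the ¼ example in the list above.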
Minor changes:
FBA changes