Write binary files directly, bypass CSV intermediary #1876
Merged
Eliminate the CSV intermediary step for GUL, IL/FM, and RI input files. Previously the pipeline wrote DataFrames to CSV, then converted CSV to binary at runtime. Now binary files are written directly from DataFrames during preparation, with CSV output only when intermediary_csv=True (for debugging).

Preparation changes:
- Refactor csvtobin converters to decouple CSV reading from binary writing: extract df_to_ndarray(), amplifications_write_bin(), and complex_items_write_bin() as reusable functions.
- write_gul_input_files(): replace per-column pop/prepare dicts with a unified files_write_info dict that drives binary+CSV output.
- get_il_input_items(): write FM binary files directly, rename pandas dtype dicts to avoid shadowing numpy dtypes, return .bin paths.
- write_files_for_reinsurance(): write RI binary files directly using df_to_ndarray().tofile().
- Thread intermediary_csv parameter through GenerateFiles, RunExposure, and all preparation functions.

Execution changes:
- Replace csv_to_bin/_csv_to_bin with move_bin(), which relocates pre-built .bin files to the run output directory.
- Update IL/RI detection in GenerateLossesDir and GenerateLossesDeterministic to check for .bin files (not just .csv).
- Update _check_each_inputs_directory to accept either .csv or .bin.
- Remove unused step_flag from deterministic loss commands.
- Use np.memmap(mode='r') instead of np.fromfile in load_as_ndarray and load_as_array for demand-paged binary reads.

Test updates:
- Remove CsvToBin test class and associated test helpers.
- Update test_generate_files expected paths from .csv to .bin.
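The direct-write and memory-mapped read paths described above can be illustrated with a small round trip. This is a minimal sketch, not the code in this PR: the record layout `RECORD_DTYPE` and the helpers `frame_to_records` and `round_trip` are hypothetical names chosen for illustration, and the real `df_to_ndarray()` may have a different signature.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical record layout, for illustration only (not a real Oasis file format).
RECORD_DTYPE = np.dtype([("item_id", np.int32), ("tiv", np.float64)])


def frame_to_records(df, dtype):
    """Copy the DataFrame columns named in `dtype` into a structured array."""
    records = np.empty(len(df), dtype=dtype)
    for name in dtype.names:
        records[name] = df[name].to_numpy()
    return records


def round_trip(df, directory):
    """Write `df` as a flat binary file, then read it back through a memory map."""
    path = Path(directory) / "example.bin"
    # Direct write: no CSV file is produced along the way.
    frame_to_records(df, RECORD_DTYPE).tofile(path)
    # Read-only memory map: pages are loaded from disk only when accessed.
    mapped = np.memmap(path, dtype=RECORD_DTYPE, mode="r")
    loaded = np.array(mapped)  # copy out so the result outlives the file
    del mapped  # release the mapping before the directory is cleaned up
    return loaded


df = pd.DataFrame({"item_id": [1, 2, 3], "tiv": [100.0, 250.5, 75.25]})
with tempfile.TemporaryDirectory() as tmp:
    loaded = round_trip(df, tmp)
```

Because the file is a plain sequence of fixed-size records, the reader needs only the dtype to interpret it, and `mode="r"` prevents accidental writes back to the input file.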
The PiWind tests are failing because they check for the presence of CSV files in the created run directory.
SkylordA reviewed Feb 18, 2026
SkylordA approved these changes Feb 25, 2026
benhayes21 approved these changes Feb 26, 2026
Summary
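- Write binary files directly from DataFrames during preparation, with CSV output only when `intermediary_csv=True` (for debugging).
- Refactor the csvtobin converters into reusable functions (`df_to_ndarray`, `amplifications_write_bin`, `complex_items_write_bin`).
- Replace `csv_to_bin` runtime conversion with `move_bin()`, which relocates pre-built `.bin` files to the run output directory.
- Update IL/RI detection to check for `.bin` files (not just `.csv`), and update `_check_each_inputs_directory` to accept either format.
- Use `np.memmap(mode='r')` instead of `np.fromfile` for demand-paged binary reads.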
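The execution-side behaviour summarised here (relocating a pre-built `.bin` file and accepting an input in either format) might look roughly like the sketch below. The helper names `has_input` and `relocate_bin`, the file name `items`, and the directory layout are assumptions made for illustration; they are not the signatures of `move_bin()` or `_check_each_inputs_directory` in this PR.

```python
import shutil
import tempfile
from pathlib import Path


def has_input(directory, name):
    """Return True if `name` exists in `directory` as either .bin or .csv."""
    return any(
        (Path(directory) / f"{name}{ext}").is_file() for ext in (".bin", ".csv")
    )


def relocate_bin(src_dir, dst_dir, name):
    """Move a pre-built `<name>.bin` from `src_dir` into `dst_dir`."""
    src = Path(src_dir) / f"{name}.bin"
    dst = Path(dst_dir) / src.name
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst


with tempfile.TemporaryDirectory() as tmp:
    inputs = Path(tmp) / "input"  # hypothetical preparation output directory
    run_dir = Path(tmp) / "run"   # hypothetical run output directory
    inputs.mkdir()
    (inputs / "items.bin").write_bytes(b"\x01\x00\x00\x00")

    found_before = has_input(inputs, "items")
    moved = relocate_bin(inputs, run_dir, "items")
    moved_exists = moved.is_file()
    moved_name = moved.name
    found_after = has_input(inputs, "items")
```

Since the file is already in its final binary form, the run step reduces to a filesystem move with no parsing or conversion at runtime.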