Create function to trim output json file from not necessary data by Bachibouzouk · Pull Request #625 · rl-institut/multi-vector-simulator

Bachibouzouk · 2020-10-29T11:30:06Z

Address part of #606

Changes proposed in this pull request:

Move retrieve_date_time_info from C0 to B0

Tasks from #606 (comment):

2. After closely inspecting the JSON file (~29MB), we observed timestamps are the same for all timeseries. We can significantly reduce the size and redundancy of the file by removing all timestamps vectors from the file. The period can be deduced by simulation settings.
3. We also spotted attributes which do not exist in the EPA side (e.g. costs_investment_over_lifetime). These can be left-out at first and added by the end user.
4. “EnergyBusses" field should be provided directly by EPA as MVS input, it can therefore be dropped
5. Data which are NOT modified by MVS (e.g. economic_data, EnergyConsumption{development_costs, age_installed, optimizedCap, input timeSeries, etc.}) shall not be send back to EPA to prevent redundancy.
7. “path_to_plots” seem to be referencing plotly dash generated figures. We can omit it.
8. cost_matrix, scalar_matrix
not sure if needed
9. optimizedFlows is a duplicate of the flows of each asset, it can be omitted in the returned json

Use in-line comments to explain your code
Write docstrings to your code (example docstring)
For new functionalities: Explain in readthedocs
Write test(s) for your new patch of code (pytests, assertion debug messages)
Update the CHANGELOG.md
Apply black (black . --exclude docs/)
Check if benchmark tests pass locally (EXECUTE_TESTS_ON=master pytest)

_{For more information on how to contribute check the CONTRIBUTING.md.}

smartie2076 · 2020-11-04T13:09:48Z

* [x]  2. After closely inspecting the JSON file (~29MB), we observed `timestamps `are the same for all timeseries. We can significantly reduce the size and redundancy of the file by removing all timestamps vectors from the file. The period can be deduced by simulation settings.

Nice

* [ ]  3. We also spotted attributes which do not exist in the EPA side (e.g. costs_investment_over_lifetime). These can be left-out at first and added by the end user.

Do you want to address this as well in this PR? It will be huge that way. Also, if I understood correctly, ICOM will ignore/drop unknown parameters/entries, so we would not have to put work into that right now (especially as we get feedback from the end users next friday, whether they would like to keep the parameters you would need to drop).

* [x]  4.  “EnergyBusses" field should be provided directly by EPA as MVS input, it can therefore be dropped

I should create another issue for this, and we should remove it seperately, from my point of view. If we drop it from our outputs, eventhough it is still part of our processing in C, this may be misleading. If we drop it, we should replace it for our manual/local use with a new csv input file energy_busses.csv. This is also connected to #559.

* [ ]  5. Data which are NOT modified by MVS (e.g. economic_data, EnergyConsumption{`development_costs, age_installed, optimizedCap, input timeSeries`, etc.}) shall not be send back to EPA to prevent redundancy.

Makes sense, should not be too hard to do? However, do we want to have this for our general MVS output as well (local execution) or only for the json file sent to the EPA? I think it would be good to keep inputs and outputs together for local execution.

* [x]  9. `optimizedFlows ` is a duplicate of the flows of each asset, it can be omitted in the returned json

Yeah, if they use the assets individually to plot this is true. Not sure why they do that, though.

smartie2076

Notes:

You also created the function C0.compute_timeseries_properties(), which is not in the changelog. Have you written a test for that? f0cad40
In 61555b8 you wrote a lot of tests, but can you add debug assertion messages?
I need to remember c1b4908 for the energyBus issue
In 60dcc9d you have left a number of parameters as strings instead of constant variables.
Is there an overview on the helper functions that you create in the utils? data_parser.py does not have extensive docstrings that could explain what you can do with the function.
In 82a0868 you replace == etc. with is. Can you add this to the changelog?

smartie2076 · 2020-11-04T13:35:06Z

src/multi_vector_simulator/C0_data_processing.py

+            logging.debug(f"Not loading {group} asset {asset} from file")
+            compute_timeseries_properties(dict_values[group][asset])


I am actually not sure what that debug message tells me.

* You also created the function `C0.compute_timeseries_properties()`, which is not in the changelog. Have you written a test for that? [f0cad40](https://github.com/rl-institut/multi-vector-simulator/commit/f0cad40d86fa47d45e66895820b8d771a12021db)

I move this code into a function because I wanted to be able to execute this also when not loading timeseries from csv, I did not write tests for it (here the function applies standard sum and max to a timeseries. There was already a kind of local test of the normalized timeseries which I kept. testing that sum does the sum is already done by the builtin test suite on the sum function in that case.

smartie2076 · 2020-11-04T13:35:13Z

src/multi_vector_simulator/C0_data_processing.py

                # If Filename defines the generation timeseries, then we have an asset with a lack of dispatchability
                dict_values[group][asset].update({DISPATCHABILITY: False})
+        else:
+            logging.debug(f"Not loading {group} asset {asset} from file")


I am actually not sure what that debug message tells me.

smartie2076 · 2020-11-04T13:35:19Z

src/multi_vector_simulator/C0_data_processing.py

            )
-    return
+        else:
+            logging.debug(f"Not loading {group} asset {asset} from file")


I am actually not sure what that debug message tells me.

smartie2076 · 2020-11-04T13:37:19Z

src/multi_vector_simulator/F0_output.py

    return answer
+
+
+def select_essential_results(dict_values):


Docstrings can be improved

smartie2076 · 2020-11-04T13:38:05Z

src/multi_vector_simulator/F0_output.py

+    for asset_group in [
+        ECONOMIC_DATA,
+        SIMULATION_SETTINGS,
+        FIX_COST,
+        ENERGY_BUSSES,
+        PATHS_TO_PLOTS,
+        OPTIMIZED_FLOWS,


This may work for now, but in a different pr we should find a way to define energy busses manually

Not sure if we will run into issues this way, as energyBusses bay be overwritten with C

That is possible, I did not check how the output looked, I only checked that the simulation was running through with the provided energyBusses, thinking we will then iterate on EPA outputs which are then input of MVS etc...

smartie2076 · 2020-11-04T13:41:44Z

src/multi_vector_simulator/utils/data_parser.py

+    ENERGY_PROVIDERS: [
+        "asset_type",
+        LABEL,
+        OEMOF_ASSET_TYPE,
+        "energy_vector",
+        INFLOW_DIRECTION,
+        OUTFLOW_DIRECTION,
+        CONNECTED_CONSUMPTION_SOURCE,
+        CONNECTED_FEEDIN_SINK,
+        DEVELOPMENT_COSTS,
+        DISPATCH_PRICE,
+        ENERGY_PRICE,
+        FEEDIN_TARIFF,
+        "installed_capacity",
+        LIFETIME,
+        "optimize_capacity",


Arent energy_vector, installed_capacity, optimize_capacity also constant variables? Or are you using EPA names here?

These are EPA names, I wanted to specifically not have constant variables for EPA names, so that we see the ones that are different and we are aware of them. It is also helpful in the future to know which constant variable values need to be adapted as they differ than the EPA ones

smartie2076 · 2020-11-04T13:47:27Z

tests/test_F0_output.py

+    # TODO fix input from time parameters for simulation settings
+    # def test_load_json_parse_pandas_series(self):
+    #     """ """
+    #     k = TYPE_SERIES
+    #     assert self.value_dict[k].equals(JSON_TEST_DICTIONARY[k])
+    #
+    # def test_load_json_parse_pandas_series_tuple_name(self):
+    #     """ """
+    #     k = "pandas_series_tuple_name"
+    #     assert self.value_dict[k].equals(JSON_TEST_DICTIONARY[k])


Have you addressed this in a later commit?

It requires a definition of simulation time parameters prior to running these tests #626 (comment)

smartie2076 · 2020-11-04T13:51:51Z

src/multi_vector_simulator/server.py

+
+    epa_dict_values = data_parser.convert_mvs_params_to_epa(dict_values)
+
+    output_processing.select_essential_results(epa_dict_values)
+
+    json_values = output_processing.store_as_json(epa_dict_values)
+
+    return json.loads(json_values)


This means that there is only the option to store the data in EPA-format. Would it not make sense to have the option to keep the json output as it is, without dropping duplicates? This would keep all data intact and in one place for future consideration/comparision. Does ´mvs_report.py` also work with the lightweight json output?

There are three things:

dropping indexes of timeseries to reduce size is done anyway (store_as_json)

convert from EPA to MVS

drop some fields that might be redundant (project_data, energy_busses etc)

We can add an option for the server enable or not the last two. I just considered that server.py is primarily meant for MVS/EPA communication so I made this the default.

python mvs_tool.py -i ... (which btw you can replace now by mvs_tool -i ....) calls main function of cli.py, so it is only concerned by the changes which are dropping indexes of timeseries to reduce size (store_as_json).

Bachibouzouk force-pushed the feature/reduce-output branch 2 times, most recently from e42d019 to 6c65ce0 Compare October 30, 2020 18:23

Bachibouzouk mentioned this pull request Nov 2, 2020

Time parameter input should be rethought #626

Open

Bachibouzouk force-pushed the feature/reduce-output branch from 110a923 to fcc5f6c Compare November 2, 2020 22:33

Bachibouzouk requested a review from smartie2076 November 2, 2020 22:51

smartie2076 approved these changes Nov 4, 2020

View reviewed changes

smartie2076 mentioned this pull request Nov 5, 2020

[Decision wanted] Definition of energy busses via csv #635

Closed

2 tasks

Bachibouzouk added 17 commits November 5, 2020 22:23

Create function to compute timeseries properties

0fb21f9

Move retrieve_date_time_info from C0 to B0

122b5f9

Update the conversion from dict to json format

3bb2d8f

Update the conversion from json format to dict

fa9fa02

Compute the time_index before loading the data from json file

2530472

Remove unused argument from function

4bb5aef

Compute timeseries properties also when not loading from a file

3a1513c

Write tests for addition of timeindex upon json conversion

b2c67c9

Create function to select main results out of dict_values

425e4b2

Create mapping between EPA and MVS parameter names

0ca5eee

Create parameter parser from EPA to MVS

8823b93

Create parameter parser from MVS to EPA

e85cb78

Write new tests

a04769e

Fix tests after change of structure of series in json file

cb22fa9

Replace == by is for booleans and None

d420706

Remove useless return statements

5a6f853

Sort and parse the mvs results to send to epa

0b05313

Bachibouzouk force-pushed the feature/reduce-output branch from fcc5f6c to b3eabdc Compare November 5, 2020 21:36

Bachibouzouk added 2 commits November 5, 2020 23:37

Update changelog

cdb9042

Improve docstrings

66e2b7e

Bachibouzouk marked this pull request as ready for review November 5, 2020 22:39

Bachibouzouk force-pushed the feature/reduce-output branch from b3eabdc to 66e2b7e Compare November 5, 2020 22:47

Bachibouzouk merged commit 7649baf into dev Nov 5, 2020

Bachibouzouk deleted the feature/reduce-output branch November 5, 2020 23:03

smartie2076 mentioned this pull request Nov 10, 2020

Release/v0.5.1 #648

Merged

		logging.debug(f"Not loading {group} asset {asset} from file")
		compute_timeseries_properties(dict_values[group][asset])

Conversation

Bachibouzouk commented Oct 29, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smartie2076 commented Nov 4, 2020

Uh oh!

smartie2076 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Bachibouzouk Nov 5, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Bachibouzouk commented Oct 29, 2020 •

edited

Loading

Bachibouzouk Nov 5, 2020 •

edited

Loading