Skip to content

feature/tc_forecast_cxml#36

Merged
emanuel-schmid merged 39 commits intodevelopfrom
feature/tc_forecast_cxml
Aug 15, 2022
Merged

feature/tc_forecast_cxml#36
emanuel-schmid merged 39 commits intodevelopfrom
feature/tc_forecast_cxml

Conversation

@mmyrte
Copy link
Copy Markdown
Collaborator

@mmyrte mmyrte commented Mar 11, 2022

This is a work in progress for a Cyclone XML (cxml) reader for TC forecast products. I'm currently testing it with data from ECMWF, see https://confluence.ecmwf.int/display/TIGGE/Tools#Tools-tc, but due to the clear XML specs, it should work with all providers. General structure:

  • Transforms an xml file into csv using xsl, which is read into a pandas DF
  • Groups the pandas DF into chunks that constitute tracks.
  • Transforms each chunk into the accustomed xr.Dataset structure.

I've tried making it robust against duplicate/false labels in the input data, such as I encountered here (wrong basin designations), but cannot be sure yet.

Missing: Tests, gracious handling of missing dependencies, documentation.

@mmyrte mmyrte mentioned this pull request Mar 11, 2022
@emanuel-schmid emanuel-schmid changed the title Feature/tc forecast cxml WIP: feature/tc_forecast_cxml Apr 7, 2022
@mmyrte mmyrte requested a review from aleeciu April 28, 2022 08:29
@mmyrte
Copy link
Copy Markdown
Collaborator Author

mmyrte commented Apr 28, 2022

Hi @aleeciu - I didn't get around to this recently; still working on a test. Would be happy if you could already have a first look, but totally cool if you don't run anything yet.

@mmyrte mmyrte marked this pull request as ready for review April 29, 2022 12:54
@mmyrte
Copy link
Copy Markdown
Collaborator Author

mmyrte commented Apr 29, 2022

@aleeciu I've now committed unit tests, but here comes the hard part - I'm so out of touch with you guyses CI pipeline that I don't know how to add the lxml dependency or to graciously catch it if it's not already present. I guess this is also for @emanuel-schmid : I don't want to burden climada_petals with additional dependencies, since this feature only going to be used by very few people. How would you handle the situation?

@aleeciu
Copy link
Copy Markdown
Collaborator

aleeciu commented Apr 29, 2022

since this feature only going to be used by very few people.

not sure about this, I think there could be some interest. Anyway it's on @emanuel-schmid to decide whether we are fine with adding lxml.

"""Missing double and integers in ecCodes """

CXML2CSV_XSL = Path(__file__).parent / "tc_tracks_foreast_cxml2csv.xsl"
"""Path at which an xsl is found for transforming CXML to CSV format."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want this file here or in the climada/data folder? @emanuel-schmid

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @emanuel-schmid, here is one issue.

Copy link
Copy Markdown
Contributor

@emanuel-schmid emanuel-schmid May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We certainly don't want it here. climada_petals/hazard/data folder maybe. Or climada data api. Depends on what is that file. What's inside, who wrote it and how, will it ever change, can we have it as csv file instead?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that file is required to translate the downloaded file from cxml into csv. It was written by @mmyrte, I'd say we can consider it as the standard CLIMADA file when working with these .cxml files. The module was written such that a user can potentially define locally his/her own conversion file and, if so, this standard file won't be used. On whether it can also be written as .csv it's for @mmyrte but I doubt.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emanuel-schmid No, it's not a data file; it's XSL, so it contains transformation logic.

Copy link
Copy Markdown
Collaborator Author

@mmyrte mmyrte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy now. @aleeciu I don't know whether there are still things that need to be approved by @emanuel-schmid, but if it's really just the question of where in the file tree the xsl is positioned, that can easily be adjusted later on. I say we merge, unless there are new policies in place that I don't know of.

@aleeciu
Copy link
Copy Markdown
Collaborator

aleeciu commented May 23, 2022

@mmyrte thanks for this new feature. To me it looks quite fine. I will make a tutorial and test it with different providers. Then I think we can merge ? Do you anything else on your TODO list ?

@mmyrte
Copy link
Copy Markdown
Collaborator Author

mmyrte commented May 23, 2022

To summarise our call:

  • You discuss with @emanuel-schmid whether you want to pin the lxml version or not (thereby removing the need for the string conversion or not).
  • You rename both or either of the cxml_sample_transformation.xsl files to something more appropriate, like cxml_ecmwf_transformation.xsl and cxml_60-72h_filter.xsl.
  • Plus you take care of a tutorial and such.

Cheers

Jan

@aleeciu
Copy link
Copy Markdown
Collaborator

aleeciu commented May 23, 2022

To summarise our call: You discuss with @emanuel-schmid whether you want to pin the lxml version or not (thereby removing the need for the string conversion or not).
You rename both or either of the cxml_sample_transformation.xsl files to something more appropriate, like cxml_ecmwf_transformation.xsl and cxml_60-72h_filter.xsl.
Plus you take care of a tutorial and such.

Sounds like a plan

"""Missing double and integers in ecCodes """

CXML2CSV_XSL = Path(__file__).parent / "data/cxml_ecmwf_transformation.xsl"
"""Xsl file for transforming CXML to CSV format."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emanuel-schmid I have now created a data folder in hazard and moved this transformation file therein. I am not sure whether this is the best place or if it should go somewhere else. I have tried to move it in my climada/data folder but then I get an error from jenkins that it does not find the file (I think I should have also changed something in the CONFIG file but I am not sure what).

@aleeciu
Copy link
Copy Markdown
Collaborator

aleeciu commented Jun 24, 2022

Hi @emanuel-schmid,

to me this this now works (many thanks @mmyrte) and it's ready to be merged. Open issue is where to locate the data:

  • there is a file we need for the new feature to run. I have now moved it into a newly created folder hazard/data as you suggested above. But maybe we want them in climada/data ? I have tried doing so but then I get an error from jenkins saying it does not find the data. I think I should manipulate the CONFIG file if I want to fix that but I would not know exactly how.

  • I have started a tutorial and wanted my trial data to be in climada/data/demo. However I am not sure how to access them from the tutorial folder since when I call from climada.util.constants import DEMO_DIR it creates /climada_petals/doc/tutorial/data/demo instead of pointing to /climada_petals/data/demo. Again it's me not being able to properly handle the CONFIG file. We can however merge the branch without a tutorial and then add it later.

Thanks.

@mmyrte
Copy link
Copy Markdown
Collaborator Author

mmyrte commented Jun 29, 2022

One last comment - I still don't think that application logic such as that contained in the xsl file should be in a "data" directory, since I find the name misleading. It's up to you guys though.

@mmyrte mmyrte closed this Jun 29, 2022
@mmyrte mmyrte reopened this Jun 29, 2022
@emanuel-schmid emanuel-schmid merged commit f5a696e into develop Aug 15, 2022
@emanuel-schmid emanuel-schmid deleted the feature/tc_forecast_cxml branch August 15, 2022 08:52
@emanuel-schmid emanuel-schmid changed the title WIP: feature/tc_forecast_cxml feature/tc_forecast_cxml Aug 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants