Skip to content

[DF] Change the signature of RInterface::Describe (Fix #8893)#9551

Merged
eguiraud merged 1 commit intoroot-project:masterfrom
ikabadzhov:DF_Describe_Improvements
Jan 17, 2022
Merged

[DF] Change the signature of RInterface::Describe (Fix #8893)#9551
eguiraud merged 1 commit intoroot-project:masterfrom
ikabadzhov:DF_Describe_Improvements

Conversation

@ikabadzhov
Copy link
Copy Markdown
Contributor

@ikabadzhov ikabadzhov commented Jan 12, 2022

This Pull request:

[DF] Change the signature of RInterface::Describe

Changes or fixes:

A new structure DFDescription is introduced.
It has 2 member strings, corresponding to the brief and the full description.
It allows more interactive output of these strings.

RInterface::Describe now returns a DFDescription.
As brief description is the output from RInterface::DescribeDataset.
As full description is the remaining code from RInterface::Describe.
Moreover, RInterface::DescribeDataset is now a private method.

RDFDescription has the following methods:

  • AsString(bool) -> returning brief/full description as a string
  • Print(bool) -> printing the content of AsString(bool)
  • overloaded << -> returns ostream corresponding to AsString(shortFormat=false)
  • printValue -> returns string corresponding to AsString(shortFomat=false)
  • __repr__ pythonization -> assigning __repr__ to AsString(shortFormat=false)

Tests and Tutorials were adapted correspondingly.

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #8893

@ikabadzhov ikabadzhov requested a review from eguiraud January 12, 2022 21:04
@ikabadzhov ikabadzhov requested a review from couet as a code owner January 12, 2022 21:05
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from 712c06e to 2027f91 Compare January 12, 2022 21:11
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

Copy link
Copy Markdown
Contributor

@eguiraud eguiraud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ikabadzhov ! The design is what we want, some notes on the implementation below.

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

1 similar comment
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from bb9bbb7 to 2083f3b Compare January 13, 2022 16:09
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@vepadulano
Copy link
Copy Markdown
Member

Thanks @ikabadzhov for this very nice improvements! +1 for the usage of const whenever needed. Do we also want to include the overload of the operator<< and the pythonization for the __repr__ and __str__ dunder methods in this release or we leave them for a next time?

@ikabadzhov
Copy link
Copy Markdown
Contributor Author

Thank you @vepadulano. As a current solution, we decided that Describe() returns a RDFDescribe object which has 2 methods Print(bool) and AsString(bool), in stead of overloading operator<< and its std::string().
I see now that alternatively, we could overload these operators of the RDFDescribe class.

Reason to think that current solution with separate methods is better is the design that Print and AsString methods take one input argument (bool to determine if we want short or full description).

Given your comment, I see it as a valid argument to have: RDFDescribe Describe(bool), and then RDFDescribe has overloaded << and std::string(). What do you think about that?

@eguiraud
Copy link
Copy Markdown
Contributor

Good point, yes I think we want __repr__ for a nicer experience in notebooks and REPLs (#8893), as well as std::string printValue(RDFDescription *tdf) for the cling REPL -- for the RDFDescription class, indeed.

@ikabadzhov can you take care of it? For the printValue function you can see how it's done e.g. in RDataFrame.hxx, for the __repr__ pythonization I think Vincenzo can give you some pointers.

@ikabadzhov
Copy link
Copy Markdown
Contributor Author

@eguiraud @vepadulano if we have overloaded <<, string, __repr__, do we then need to keep Print(bool) and AsString(bool)? I think no, or?

@eguiraud
Copy link
Copy Markdown
Contributor

eguiraud commented Jan 14, 2022

Yes we do, e.g. there is no way to pick a short or long format using << (and I prefer the more explicit AsString rather than a cast operator).

@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from 2083f3b to 1ae539b Compare January 14, 2022 13:44
@ikabadzhov ikabadzhov requested a review from etejedor as a code owner January 14, 2022 13:44
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

endif()

# RDFDescription pythonization
if (rdfdescription)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be under the if (dataframe) condition block above?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes indeed there is no rdfdescription cmake option so the test should go with the other rdf tests

from . import pythonization

@pythonization("RDFDescription", ns="ROOT::RDF")
def pythonize_rdataframe(klass):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats for your first pythonization!

I would call this pythonize_rdfdescription to be consistent in the naming.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected in the latest commit (and also the other typo)

@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from 1ae539b to 1837e17 Compare January 14, 2022 14:01
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

Test supported __repr__
"""
df1 = ROOT.RDataFrame(1);
ref1 = ("Empty dataframe filling 1 row\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that there are tests in RDataFrame for AsString that check it does the right thing, this test could be simply self.assertEqual(df1.Describe().AsString(), repr(df1.Describe()));, i.e. no need to test again what is tested in the AsString tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am applying this change and creating a new commit now. Everything else I keep unchanged (so far).

/// ~~~
///
std::string Describe()
RDFDescription Describe()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why RDFDescription and not RDescription as in RDisplay? The namespace ROOT::RDF already qualifies it as an RDF thing no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the name of the class is RDFDescription, i.e. ROOT::RDF::RDFDescription. I can change it to ROOT::RDF::RDescription. Is that what you mean?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm asking if there's a reason to name it RDFDescription and not RDescription.

Copy link
Copy Markdown
Contributor

@eguiraud eguiraud Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we might have other descriptions of things still under the ROOT::RDF namespace, and in Doxygen (or when searching the code in general) it's better to search for the RDFDescription class than for a generic RDescription class across all ROOT.

I would have done the same for RDisplay but it's too late.

Copy link
Copy Markdown
Contributor

@eguiraud eguiraud Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmh on second thought I don't know, Enric is right that we have some seriously generic names in the ROOT::RDF namespace, like RDisplay and RInterface, so maybe that ship has sailed.

I guess the question is just whether we'll ever have another description of something else, like a RDF::RFilterDescription 🤔

Copy link
Copy Markdown
Contributor

@etejedor etejedor Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is just whether we'll ever have another description of something else, like a RDF::RFilterDescription

In that case, for our peace of mind, we can argue that ROOT::RDF::RDescription means the description of the RDF (the RDF namespace qualifies it as such), if the description is of an RDF filter, it is fair that it's called ROOT::RDF::RFilterDescription

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peace of mind would be an argument for leaving the more specific name 😄

Ivan, Enric, pick the one you like best, I have no preference.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also think if later on, we will have other RDF "instances" which are used to "present" some specific part of RDF itself. As an example potentially RDFMetaData.
I believe answer should converge to yes, and hence I would go for RDFDescription. Moreover, I just saw that there is a RDFTypeNameGetter class, so keeping RDF seem consistent.

Remark, no matter which version we choose, very likely the other version is never going to be used (mainly for consistency and avoiding confusion - might be obvious but I want to make sure it is considered).

@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from 1837e17 to b815ae5 Compare January 14, 2022 14:35
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu16/nortcxxmod.
Running on sft-ubuntu-1604-3.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2022-01-14T15:03:35.573Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T15:03:35.573Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T15:08:35.914Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T15:08:35.914Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu2004/soversion.
Running on root-ubuntu-2004-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

)

A new class DFDescription is introduced.
It has 2 member strings, corresponding to the brief and the full description.
It allows more interactive output of these strings.

RInterface::Describe now returns a DFDescription object.
As brief description is the output from RInterface::DescribeDataset.
As full description is the remaining code from RInterface::Describe.
Moreover, RInterface::DescribeDataset is now a private method.

RDFDescription has the following methods:
* AsString(bool) -> returning brief/full description as a string
* Print(bool) -> printing the content of AsString(bool)
* overloaded<< -> returns ostream corresponding to AsString(shortFormat=false)
* printValue -> returns string corresponding to AsString(shortFomat=false)
* __repr__ pythonization -> assigning __repr__ to AsString(shortFormat=false)

Tests and Tutorials were adapted correspondingly.
@ikabadzhov ikabadzhov force-pushed the DF_Describe_Improvements branch from b815ae5 to 6dc7a26 Compare January 14, 2022 15:42
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2022-01-14T15:57:20.263Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T15:57:20.263Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T16:02:40.408Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T16:02:40.408Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

Copy link
Copy Markdown
Contributor

@eguiraud eguiraud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@vepadulano vepadulano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@phsft-bot
Copy link
Copy Markdown

Build failed on mac1015/python3.
Running on macitois21.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@eguiraud eguiraud merged commit e0644d8 into root-project:master Jan 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DF] Describe output does not display correctly in Jupyter notebooks

5 participants