Skip to content

Conversation

@Erotemic
Copy link
Contributor

@Erotemic Erotemic commented Nov 13, 2020

This PR is a small self-contained reorganization to reduce the diff size in #4327. Hopefully it will be easy to review. It simply refactors the minors submodule into a subpackage, which will allow for easier integration of new algorithms related to graph minors.

Specifically the PR creates the subpackage algorithms/minors and moves the existing algorithms/minors.py inside this subpackage (renaming it to algorithms/minors/contraction.py). The tests were also moved accordingly.

This should not cause any change in the API because all components of the algorithms/minors/contraction.py package are exposed in algorithms/minors/__init__.py.

This PR does not have any dependencies

This PR is depended on by #4350, and #4327

@Erotemic Erotemic changed the title Reorganize minor submodule as subpackge Reorganize minor submodule as subpackage Nov 15, 2020
@Erotemic Erotemic force-pushed the dev/make_minor_subpackage branch from b6e6c46 to b0d5193 Compare December 18, 2020 15:54
@Erotemic
Copy link
Contributor Author

Erotemic commented Dec 18, 2020

This has been rebased on master. I think the main point of discussion is what I've done to populate the __init__ file. I left in a private variable containing some developer notes:

__devnotes__ = """
CommandLine
-----------
# Run all tests in this subpackage
pytest networkx/algorithms/minors --doctest-modules
# Autogenerate the `__init__.py` file for this subpackage with `mkinit`.
mkinit ~/code/networkx/networkx/algorithms/minors/__init__.py -w
"""

This demonstrates how I used the mkinit package to autogenerate the explicit import statements instead of relying on import *. I imagine any points of contention in this PR would be

  1. if that __devnotes__ comment should stay in the file or not,

  2. the special __submodules__ attribute used by mkinit should be included or not

  3. the fact that the comment is using a path on my system (it could be changed to point at a module-like path -- e.g. mkinit networkx.algorithms.minors instead -- which would be more universal), and

  4. if I should just use import * anyway to be consistent with the current state of the package, even though there is discussion about moving away from that style.

That's all assuming there is support to move minors.py into a subpackage at all. (which I think there is a strong argument for)

@rossbar @dschult @jarrodmillman

@jarrodmillman
Copy link
Member

@Erotemic Thanks for pushing this. (I haven't had time to think about making this a subpackage, but if @dschult agrees I am +1.)

I would like the __devnotes__ removed from this PR and I would like to settle on a consistent import pattern for NX (possibly using mkinit everywhere). We have a few different styles I suspect (though import * is probably the main style). It would be great if you could provide feedback on this PR:

You may also want to quickly scan:

Now that most of the major scientific Python core packages have dropped Python 3.6, we would like to implement lazy importing across the ecosystem (numpy, scipy, scikit-image, networkx, etc.). @stefanv created the PR on the scikit-image repo, since we were hoping to get more community feedback (but the idea started in #4401). Stefan is working on a PEP-like proposal to go with this PR, but it isn't ready yet.

I haven't looked at mkinit, but it looks like it may be helpful for moving in the lazy import direction. So it may make sense for us to use mkinit for the whole NX project and suggest using it in other projects if they are implementing the lazy import stuff. Maybe we could even add it to our CI system's style and formatting checker to make sure new contributions are being added to the _init__.py modules correctly.

@Erotemic
Copy link
Contributor Author

The implementation of lazy initialization in scikit-image is interesting. Its likely that I could add a formatter to mkinit such that it generates a similar pattern based on its static analysis. I'll start thinking about that, see if I can put together a proof-of-concept implementation, and at that point I'll likely have delved into the issue enough to provide useful feedback.

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve making minors a subpackage instead of a module within algorithms.

I haven't quite understood yet the devnotes, submodules and all. It seems more straightforward to use __all__ within the modules and import * to take advantage of that feature. But I'm learning more about the import options and how lazy import would be helpful. So I'm open to exploration.

@Erotemic
Copy link
Contributor Author

My latest commit removes __devnotes__ (I did leave on small non-string comment at the bottom with one user-independent mkinit command, as it is very useful to have if you want to use mkinit to regenerate the file). This includes the new lazy import work discussed on scikit-image/scikit-image#5101. Of course we can switch back to eager with mkinit networkx.algorithms.minors --black -w or to lazy via mkinit networkx.algorithms.minors --black -w --lazy without much hassle. (We could also do a conditional check on some NETWORKX_EAGER_IMPORT environment variable so people have the option of front-loading all imports, which can be useful for long running processes).

I could also add a mode to mkinit to force the exiting import * method, so we can merge something consistent and then make the swap to lazy / explicit-eager in a separate branch? Or we could just merge this in lazy or explicit-eager mode and modify the rest of the library later (might be easier dev-time wise).

@rossbar
Copy link
Contributor

rossbar commented Jan 21, 2021

My preference would be to split the package reorganization and the import stuff into separate PRs. The proposed import system changes (lazy imports, mkinit, the ability to toggle, etc.) are interesting and (for me at least) it would be helpful to have them self-contained. It also would help un-stick the pkg/module reorg as I think the import changes need more attention & discussion whereas the module reorganization (coupled with following the existing import pattern with __all__) is a straightforward change that it seems everyone agrees with.

@Erotemic
Copy link
Contributor Author

That makes sense, I'll revert this to the minimal changes.

@Erotemic Erotemic force-pushed the dev/make_minor_subpackage branch from d4accac to 08e1673 Compare March 2, 2021 15:20
@Erotemic
Copy link
Contributor Author

Erotemic commented Mar 2, 2021

@rossbar It only took me a month to get around to, but I made the small changes. This PR should be good to go.

Base automatically changed from master to main March 4, 2021 18:20
Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - one final question about whether we need to expose the contraction module, but this is the type of question we've gone around on before and shouldn't be a blocker. I'm happy with it as-is, just curious what others think.

Thanks @Erotemic !

.. [1] https://en.wikipedia.org/wiki/Graph_minor
"""

from networkx.algorithms.minors import contraction
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my only remaining question is: do we want to add this module to the top-level namespace, or only the functions within it? I.e. do we ever want users to do from networkx import contraction or nx.contraction.contracted_edge instead of calling the functions from directly from nx.

It seems superfluous to expose the module from nx when all of the relevant functions are already available from there, but I don't feel strong about it either way - just curious what others think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing that up. We really need to make a consistent system and shift to that.
I don't care much which way we go so long as most functions are available easily (usually that means at the top level of the namespace). I'm fine to follow protocols from other Scientific Python packages too.

What makes it easy for developers down the road. If someone later adds a function that uses a private function, will we remember to put the module into the namespace so users can get at the private function? I'm hoping lazy imports will make it all much cleaner -- but I haven't thought enough about how that would impact cases like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to like having the module available in the namespace. It at least discourages people from giving the module and one of its attributes the same name (e.g. datetime.datetime). I also sometimes like being able to write the explicit long name (without having to do the explicit import beforehand). I don't have a strong opinion though.

I'd vote to leave it as-is.

Lazy imports doesn't solve the problem, both conventions (expose submodule names / only expose attributes) work just fine with lazy imports. (Using mkinit can control the convention by specifying specifying --noattrs or --nomods, also --lazy argument should exist in the pypi version)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A counterargument to adding an extra namespace for functions that are already in the main namespace is (to paraphrase PEP20) "there should be one (and preferably only one) way to do things". From this perspective, it's suboptimal to have two different (public) interfaces to identical functionality (e.g. nx.contracted_edges and nx.contraction.contracted_edges).

In practice I don't think this would matter to users much as most are likely to use the name that requires the least amount of typing. However, I do think it is confusing in that there are namespaces in networkx (e.g. bipartite) that contain functions with the same names as those in the main namespace, but with different behavior - e.g. nx.betweenness_centrality vs. nx.bipartite.betweenness_centrality.

Anyways this is all part of a wider discussion that IMO probably needs an NXEP. For the particular instance in this PR, I would vote to not make nx.contraction publicly available until we have the wider policy on namespaces settled - it's much easier to add a namespace later on than to take one away!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion, so I'll make that change. As you said, its easy to add it later.

@rossbar rossbar merged commit 4a5bcf5 into networkx:main Mar 23, 2021
@rossbar
Copy link
Contributor

rossbar commented Mar 23, 2021

Thanks @Erotemic !

@jarrodmillman jarrodmillman added this to the networkx-2.6 milestone May 20, 2021
MridulS pushed a commit to MridulS/networkx that referenced this pull request Feb 4, 2023
Reorganize minor submodule as subpackge with contraction submodule
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
Reorganize minor submodule as subpackge with contraction submodule
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants