Skip to content

refguide-check with pytest as a runner#26604

Merged
mattip merged 21 commits intonumpy:mainfrom
ev-br:smoke-docs
Jun 13, 2024
Merged

refguide-check with pytest as a runner#26604
mattip merged 21 commits intonumpy:mainfrom
ev-br:smoke-docs

Conversation

@ev-br
Copy link
Contributor

@ev-br ev-br commented Jun 3, 2024

SciPy has transitioned to running refguide-check modified doctesting via pytest, and here's a matching proposal for NumPy.

The target invocation would be

$ pip install scipy-doctest    # once
$ pytest --pyargs numpy --doctest-collect=api

Currently one needs to manually pytest-ignore several dusty corners so the precise incantation for me locally is

$ spin run bash
$ cd ..
$ pytest --pyargs numpy --doctest-modules -v --doctest-collect=api --ignore=/home/br/repos/numpy/build-install/usr/lib/python3.11/site-packages/numpy/distutils --ignore=/home/br/repos/numpy/build-install/usr/lib/python3.11/site-packages/numpy/_core/cversions.py --ignore=/home/br/repos/numpy/build-install/usr/lib/python3.11/site-packages/numpy/_pyinstaller --ignore=/home/br/repos/numpy/build-install/usr/lib/python3.11/site-packages/numpy/random/_examples

Of course, all these ignores could/should go into a spin command, which is on my TODO list.

For comparison:

msg = "|".join(msgs)

with warnings.catch_warnings():
warnings.filterwarnings('ignore', category=DeprecationWarning, message=msg)
Copy link
Member

@ngoldbaum ngoldbaum Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full tests have a check that doesn't like this, although the error mesage is bad, I think it's supposed to say something like we don't want to let code in that might filter a real warning. Obviously you need to ignore warnings to implement this context manager. Maybe there's a way to dynamically add pytest.mark.filterwarnings marks to a test? Another option is to manually add an exception for this use-case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me do the simplest thing and skip the top level conftest.py in that test.

ev-br added 5 commits June 6, 2024 11:19
The conftest is needed to get the smoke-docs configuration from
the main conftest.

To run doctesting, use (this is scipy's `smoke-tutorial`)

$ pytest --doctest-glob=*rst ~/repos/numpy/doc/source/user/ -v
@ev-br
Copy link
Contributor Author

ev-br commented Jun 6, 2024

Okay, the last commit adds a spin smoke-docs command --- the name follows the scipy's dev.py smoke-docs. The user API is exactly the same as spin test, only it only runs doctests:

$ spin smoke-docs      # == python runtests.py refguide-check
$ spin smoke-docs numpy/lib
$ spin smoke-docs -t numpy/linalg/_linalg.py -v
# ....

Curiously, somewhere in the spin machinery warnings get turned into errors by default. I wonder where this switch is.
Other than that, the direct pytest --doctest-modules invocation passes locally for me.

EDIT: this is fixed, all warnings are filtered out.

@ngoldbaum
Copy link
Member

Thanks so much for working on this.

I know at some point @seberg was looking at adding docstring checking with pytest-doctest-plus. Just a head's up in case that intersects with this change.

Can we get a mention of this in the developer guide too?

@ev-br ev-br changed the title WIP: refguide-check with pytest as a runner refguide-check with pytest as a runner Jun 8, 2024
@ev-br
Copy link
Contributor Author

ev-br commented Jun 8, 2024

Thanks Nathan!
It does intersect indeed :-).
The demo is here; from my side what's left are a few minor tweaks to

  • properly run it on CI;
  • port smoke-testing the *rst tutorials (scipy's dev.py smoke-tutorials);
  • rip out the doctesting part of tools/refguide-check.py;
  • update the devguide

I think at this stage I'm going to pause working on this, and ask for an exec decision.

@ev-br ev-br mentioned this pull request Jun 9, 2024
@seberg
Copy link
Member

seberg commented Jun 9, 2024

I know at some point @seberg was looking at adding docstring checking with pytest-doctest-plus. Just a head's up in case that intersects with this change.

Yeah, it would be nice to consolidate the ecosystem more and doctestplus seemed/seems like the more likely candiate for that, but I don't think that should stop improvements here either way.

ev-br added 5 commits June 9, 2024 17:09
Follow what pytest --doctest-modules allows:
- directories
- -k patterns

`pytest --doctest-modules path/to/py/file` I'd expect to work,
but apparently it fails to collect anything --- and this is
not related to scipy-doctests.
@ev-br
Copy link
Contributor Author

ev-br commented Jun 9, 2024

Okay, the PR is ready from my side.

  • CI is hooked up to the benchmark run to save a numpy build; can move it to a more appropriate place if wanted;
  • some skips can be fixed in the source; I only fixed the "obvious" ones, the rest are enhancements (anything that worked under refguide-check keeps working, so any failures are "new")
  • tutorials can be done in a follow-up
  • spin integration is
$ spin smoke-docs
$ spin smoke-docs numpy/linalg
$ spin smoke-docs -- -k"det and not slogdet"

@mattip
Copy link
Member

mattip commented Jun 10, 2024

Re: this or pytest-doctestplus: ...

Makes sense, perhaps we should make a decision which way we wish to go. #21070 seems to have a few parallel discussions:

  • how to run doctests
  • how to get feedback which doctests have run
  • whether to pivot from refguide-check to doctest-plus or some other scientific-python wide solution

Issue #21069 also discussed transparency of which doctests are running.

Issue #15845 suggested moving to doctest-plus, but also did not reach a conclusion.

@seberg
Copy link
Member

seberg commented Jun 10, 2024

Well, we never used doctests to check the code for correctness only to check the docs for correctness. That is right IMO since docs should change freely without worrying about code test coverage.

OTOH, that argument doesn't mean you can't just run the doctests always: docs getting out-dated isn't much better than code tests starting to fail.

However, I suspect doctests are more flaky (e.g. we never run them on 32bit platforms, I bet and I am not sure if they might be python version sensitive more than other tests).
So, I am fine either way, but someone might have to be a bit dedicated to ensure it's not flaky?

The name [smoke-docs] of the spin command follows earlier discussion of the scipy version

Thanks, but I suppose I don't see that spin smoke-docs is better than spin doctest.


I am OK with this largely as it is, it isn't like the spin invocation would change if we would put in doctestplus.

@ev-br
Copy link
Contributor Author

ev-br commented Jun 10, 2024

perhaps we should make a decision which way we wish to go. #21070 seems to have a few parallel discussions:

  • how to run doctests
  • how to get feedback which doctests have run
  • whether to pivot from refguide-check to doctest-plus or some other scientific-python wide solution

To summarize main high-level difference between scipy-doctest and doctestplus (+opinionated comments):

Architecture:

UI:

  • human-readable markers, # may vary, are superior to # doctest: +FLOAT_CMP, IMO
  • Flexibility in doctest finding (e.g. only doctest public objects) is handy and has long history in both numpy and scipy
  • Flexibility in skipping things is essential (consider plt.xlim([1, 2]) # doctest: +SKIP)
  • Having to tweak whitespace to be exactly right is annoying (doctests's NORMALIZE_WHITESPACE is not sufficient).

Overall, all features of scipy-doctest can be added to doctestplus. I'm on record suggesting an ecosystem wide solution, #21070 (comment). That discussion went nowhere though, which is why there is now a scipy-doctest plugin :-).

@ev-br
Copy link
Contributor Author

ev-br commented Jun 10, 2024

I still think that mixing doctests with unit tests is a bad idea.

Is there another discussion of this position somewhere? If not could you elaborate? In my mind, doctests are just another form of unit test, and I often peruse the unit tests as extended examples beyond the documentation

There is some discussion in scipy/scipy#20127.

Perusing unit tests as extended examples --- definitely yes, it's likely not just you and me doing this regularly :-).
Going the other way --- using doctests to test functionality instead of unit tests --- is what I think is a bad idea. The reason is basically that documentation is for a reader, so the main goal of docstring examples are to demonstrate things to a reader. Doctesting them is just a technical implement to keep examples from bitrotting, no more.
IOW, docstring examples are for the users; unit tests-as-documentation is for developers.

@mattip
Copy link
Member

mattip commented Jun 10, 2024

I think scipy-doctest should be added to requirements/test_requirements.txt, otherwise running this locally one gets a pytest error about a missing --doctest-collect command line argument

Edit: maybe build_requirements.txt (where spin is installed)? The other errors, about missing pytest, or missing matplotlib,scipy are clear, only the weird --doctest-collect error is more difficult to untangle.

@ev-br
Copy link
Contributor Author

ev-br commented Jun 10, 2024

Added to test_requirements.txt (before seeing the edit). test_requirements make sense to me more than build_requirements.txt (next to pytest-xdist, also a pytest plugin?), but am happy to move if wanted.

Copy link
Member

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this at the triage meeting, and this got a thumbs-up. The only nit is if we could change the name to check-docs. I took the liberty of changing the name.

@mattip
Copy link
Member

mattip commented Jun 12, 2024

Just to be clear: this does not completely replace refguide-check since it does not check the rst documentation, correct? So the circle ci job running refguide-check should still be running as previously.

@ev-br
Copy link
Contributor Author

ev-br commented Jun 12, 2024

Indeed. This is on my to do list. Scipy's ui is dev.py smoke-tutorials, so you want spin check-tutorials here then?
EDIT: also, this PR or a follow-up?

@ngoldbaum
Copy link
Member

so you want spin check-tutorials here then?

Sounds fine to me, good to keep it consistent.

also, this PR or a follow-up?

A followup is fine if you're ok with that. I'd like to be able to have the new spin subcommand ASAP! Every once in a while I need to figure out how to run the doctests and it's not very fun.

@asmeurer
Copy link
Member

asmeurer commented Jun 12, 2024

I agree with Sebastian that doctests are not tests. They are examples that happen to be tested. A good documentation example is not a good test and a good test is not a good documentation example. They shouldn't be confused with each other. The point of doctests is just to ensure documentation examples actually work the way they say they do.

But that seems irrelevant to the question of whether something like spin test should automatically run the doctests. In my view, that's the one simple command that contributors will run on their machine to ensure all their changes are working before they commit. It shouldn't require ten different commands with obscure options to do this (c.f. #26588).

@ev-br
Copy link
Contributor Author

ev-br commented Jun 12, 2024

So how about landing it as is, and I'll work on a follow-up to

  • add check-tutorials
  • remove doctesting from the refguide-check.py script

Not sure how to make spin test run multiple commands. Surely it's doable.

@seberg
Copy link
Member

seberg commented Jun 12, 2024

But that seems irrelevant to the question of whether something like spin test should automatically run the doctests

I honestly don't care for that and if you want something that checks everything, it may have to also include linting? And that would make it a new invocation, which is probably fine if we can think of it, like comprehensive-check ;).
Also, I don't trust that doc tests are not brittle, bunching it up might just make someone run them on niche platforms where it is a lost cause.

it does not check the rst documentation, correct?

Ah, I somewhat expected it would. FWIW, I might be happy to include the rst docs into check-docs, but either way works and maybe it's slow.

@mattip
Copy link
Member

mattip commented Jun 13, 2024

So how about landing it as is, and I'll work on a follow-up to

  • add check-tutorials
  • remove doctesting from the refguide-check.py script

Sounds like a plan, thanks!

@mattip mattip merged commit fb980ec into numpy:main Jun 13, 2024
@mattip
Copy link
Member

mattip commented Jun 13, 2024

Thanks @ev-br

@mattip
Copy link
Member

mattip commented Jun 13, 2024

FWIW: I view doctest as a way to test the documentation for correctness. So it is a "test". Documentation can have bugs just like code can have bugs, and testing is just one tool to expose those bugs. Relegating documentation tests to a category that is "not testing" does a disservice to the hard work put into making the documentation.

That said, it is true the examples in the documentation are meant to be "happy path" demonstrations of correct use of the functionality, and should not be taken as comprehensive tests to prove correctness of the code, so the goal of examples is not the same as the goal of unit tests.

@asmeurer
Copy link
Member

I view doctest as a way to test the documentation for correctness.

Yes, I agree completely, and that's part of my argument that this should be easier to invoke. I just want to push back from the idea that a doctest is a test, because that leads to people writing bad documentation examples (not to accuse anyone here of saying that, but I've seen it happen before).


>>> np.array([1.2, object(), "hello world"],
dtype=StringDType(coerce=True))
... dtype=StringDType(coerce=True))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is taken over from the initial version, but should this not be coerce=False? – see also

>>> np.array([1, object(), 3.4], dtype=StringDType(coerce=False))
Traceback (most recent call last):
...
ValueError: StringDType only allows string data when string coercion is disabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it seems already fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Sorry, must have looked at the wrong branch then – in main and 2.2.x it's still True.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, confusing that this is twice in the docs, and only once is wrong. Yeah, should just fix it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit puzzled that it did not raise in doctests, but apparently the exception was quoted incompletely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the likely author of the typo (didn't check) thanks for noticing and fixing 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants