FEA Add examples recommender system #1125
This is exciting. It works: https://output.circle-artifacts.com/output/job/24f2114f-3815-45f3-8e2e-aec20d01b25a/artifacts/0/rtd_html/auto_examples/plot_0_sin.html#sphx-glr-auto-examples-plot-0-sin-py . If at some point you can test it on documentation with more examples, such as matplotlib's or scikit-learn's, it would be very useful to get a feeling for how well it works.
I'll take the bait :) Tested for MNE-Python (192 examples) in mne-tools/mne-python#11619. Spot checking I see that:
So it seems to work well! FYI I do get some warnings like:
Thanks a lot, @larsoner ! I don't know MNE-Python that well. At a quick glance it seems that the recommendations are imperfect but definitely useful! @ArturoAmorQ a couple of quick comments:
glemaitre
left a comment
A couple of comments regarding some clean up
Co-authored-by: Guillaume Lemaitre <[email protected]>
@ArturoAmorQ let me know when this is ready to go in!
@lucyleeow On my side it's good to go :D
lucyleeow
left a comment
Thank you for all your work! Sorry, last nitpicks I think.
sphinx_gallery/gen_rst.py
Outdated
if gallery_conf["recommender"]["enable"]:
    # extract the filename without the extension
    recommend_fname = os.path.splitext(os.path.split(example_fname)[1])[0]
Would this be nicer if we use Path instead?
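For reference, a minimal sketch of what the `Path` variant could look like. The value of `example_fname` is made up for illustration; `Path.stem` returns the final path component without its last suffix, which matches the `os.path` chain above for ordinary file names.

```python
import os
from pathlib import Path

# Hypothetical example path, for illustration only
example_fname = os.path.join("examples", "plot_0_sin.py")

# Current approach: drop the directory, then drop the extension
via_os = os.path.splitext(os.path.split(example_fname)[1])[0]

# pathlib equivalent: .stem is the last path component minus its final suffix
via_path = Path(example_fname).stem

print(via_os, via_path)
```

Both expressions yield the bare file name, so the swap should not change which page a recommendation is attached to.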
    return app


def test_recommend_n_examples(sphinx_app):
So sorry, I should have been clearer: this test can stay in test_full.py. Since it uses sphinx_app, it may as well stay there. I meant the other 2 tests, which don't need sphinx_app.
Also, thank you for separating test_example_recommender_methods into its own unit test.
I hope that all the comments have been addressed. Please let me know otherwise.
lucyleeow
left a comment
So sorry, just thought of 2 more things. Thanks for your patience. This is the last thing!
the very rare/very common words. This may improve the recommendations quality,
but more importantly, it spares some computation resources that would be wasted
on non-informative tokens.
I know this is already implied above, but could we explicitly say that only the examples within a single gallery (and its sub-galleries) are used for computing the closest examples?
Also, this is probably obvious, but can we add that recommendations will only be generated for .py files?
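To make the quoted doc passage concrete, here is a rough, standard-library-only sketch of dropping very rare and very common tokens by document frequency. This is not the code from this PR: the function name and the `min_df`/`max_df` parameter names are assumptions borrowed from the usual scikit-learn convention, and the token counts are invented.

```python
from collections import Counter


def filter_by_document_frequency(token_counts, min_df=2, max_df=0.9):
    """Drop tokens that appear in too few or too many documents.

    token_counts: list of dicts mapping token -> count, one dict per example.
    min_df: minimum number of documents a token must appear in.
    max_df: maximum fraction of documents a token may appear in.
    """
    n_docs = len(token_counts)
    doc_freq = Counter()
    for counts in token_counts:
        # Each document contributes at most 1 to a token's document frequency
        doc_freq.update(counts.keys())
    keep = {
        tok
        for tok, df in doc_freq.items()
        if df >= min_df and df / n_docs <= max_df
    }
    return [
        {tok: c for tok, c in counts.items() if tok in keep}
        for counts in token_counts
    ]


# Invented token counts for three examples from a single gallery
docs = [
    {"import": 1, "plot": 2, "sin": 1},
    {"import": 1, "plot": 1, "cos": 3},
    {"import": 2, "hist": 1},
]
filtered = filter_by_document_frequency(docs, min_df=2, max_df=0.9)
print(filtered)
```

Here `import` appears in every document and is dropped as too common, while `sin`, `cos` and `hist` each appear only once and are dropped as too rare, leaving fewer tokens to weigh when comparing examples.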
n_examples = sphinx_app.config.sphinx_gallery_conf["recommender"]["n_examples"]

assert '<p class="rubric">Related examples</p>' in html
assert count == n_examples
What about adding a check that the same 3 related examples are found?
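One possible shape for such a check, sketched on a simplified, made-up HTML string. The helper name, the file names and the markup after the rubric are all hypothetical; only the `Related examples` rubric line is taken from the test above, and a real test would read the page built by `sphinx_app` instead.

```python
import re


def related_example_links(html):
    """Return the hrefs that follow the 'Related examples' rubric, in order."""
    marker = '<p class="rubric">Related examples</p>'
    _, found, tail = html.partition(marker)
    if not found:
        return []
    # Simplification: treats every link after the rubric as a recommendation
    return re.findall(r'href="([^"]+)"', tail)


# Hypothetical, heavily simplified page; real Sphinx output has more markup
html = (
    '<a href="index.html">Home</a>'
    '<p class="rubric">Related examples</p>'
    '<a href="plot_a.html">A</a>'
    '<a href="plot_b.html">B</a>'
    '<a href="plot_c.html">C</a>'
)

expected = ["plot_a.html", "plot_b.html", "plot_c.html"]  # made-up names
links = related_example_links(html)
assert len(links) == 3
assert links == expected
```

Comparing against a fixed list pins both the identity and the order of the recommendations, so the check would also catch a change in ranking, not just a change in count.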
Note the CI failure is due to scipy/docs.scipy.org#80 (we use 'http://docs.scipy.org/doc/scipy/wrong_url' in our tinybuild). Will merge once green, just in case. @ArturoAmorQ I've made the last small changes.
Thanks @lucyleeow :)
Woohoo!! Thanks everybody involved, in particular Arturo. I'm really looking forward to this feature being used in the downstream projects!
Would be good to get this out there -- @lucyleeow do you have time to cut a release next week? If not, I can do it.
Tentative yes. I'll ping you if not.
… to version 0.15.0

v0.15.0
-------

Support for Python 3.7 dropped in this release. Requirement is now Python >=3.8. Pillow added as a dependency.

**Implemented enhancements:**

- ENH: Improve logging visibility of errors and filenames `#1225 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1225>`__ (`larsoner <https://github.com/larsoner>`__)
- ENH: Improve API usage graph `#1203 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1203>`__ (`larsoner <https://github.com/larsoner>`__)
- ENH: Always write sg_execution_times and make DataTable `#1198 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1198>`__ (`larsoner <https://github.com/larsoner>`__)
- ENH: Write all computation times `#1197 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1197>`__ (`larsoner <https://github.com/larsoner>`__)
- ENH: Support source files in any language `#1192 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1192>`__ (`speth <https://github.com/speth>`__)
- FEA Add examples recommender system `#1125 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1125>`__ (`ArturoAmorQ <https://github.com/ArturoAmorQ>`__)

(NEWS truncated at 15 lines)
Sorry to post about it here, but I don't have folks' emails. For GSoC this year, one of the proposed projects is an image-based search. I think it could work as an extension of this feature or be built on top of it, so I was wondering if any of the folks here had time/interest in being subject area mentors? Thanks!
Hmm... I'm not totally convinced it would be useful to other projects that use SG like scikit-learn, MNE, etc. Would it? If it's mostly (or only) useful for |
The reason I was thinking of a custom solution is primarily b/c my end goal is a detexify-type interface & also possibly a component-based search. Granted, a detexify interface could hook into an image-based search, and yeah, that might be generally useful b/c I don't think my motivation of "folks may not know what a chart is called" is necessarily specific to matplotlib.
@lucyleeow would you be interested in mentoring this potentially? You might be a better fit.
I was thinking the scikit-learn folks would be more knowledgeable about machine learning algorithms, but @story645 email me, we might be able to work something out.
Actually do you have my email? I'll make it visible on my github account. |
Fixes #1081
For a proof of concept using the scikit-learn library, navigate through:
https://output.circle-artifacts.com/output/job/22589aa6-cfe8-4daf-a9c0-b5abf94098b6/artifacts/0/doc/auto_examples/index.html
This PR is ready for review.