Skip to content

Conversation

@lucyleeow
Copy link
Contributor

@lucyleeow lucyleeow commented Jan 3, 2020

closes #583

If an example uses a function like this:

import h.i

h.i.j()

get_mapping will check if local_name, in this case h, is in the self.imported_names dict. The key for this import would be h.i so it doesn't match.

I've added a for loop which will run for 'n' times, where 'n' is the max number of splits (split('.')) of all keys in self.imported_names. In each loop another split of the locally used name is added onto local_name which is checked to see if it exists in self.imported_names. e.g., for the above example the loop would run twice, first time local_name would be h and second time local_name would be h.i.

Not very elegant - open to changes @larsoner

To remind myself, classes worked (as noted in #583) because when creating a class instance local_name existed in self.global_variables (this line)

(thanks to @glemaitre for your help)

@lucyleeow
Copy link
Contributor Author

lucyleeow commented Jan 3, 2020

Do the failures have something to do with pillow?

@larsoner
Copy link
Contributor

larsoner commented Jan 5, 2020

This is going to conflict with #581. Can you see if #581 fixes the problem? If it doesn't, would it be difficult to make a PR after #581 is merged?

@lucyleeow
Copy link
Contributor Author

@larsoner #581 doesn't fix the problem but yes I can do the PR post #581

Would the get_mapping function become too unwieldy? Do you suggest refactoring or to just add my for loop in?

@larsoner
Copy link
Contributor

larsoner commented Jan 6, 2020

Would the get_mapping function become too unwieldy? Do you suggest refactoring or to just add my for loop in?

Either way is fine with me. Whatever you think helps keep the code readable. I trust the regression tests to make sure nothing is broken, master already covers a few good cases and with #581 we cover even more.

And if you can fix it so that np.random.RandomState actually links properly, that would be amazing! (See here we still have some deficiencies)

@codecov-io
Copy link

codecov-io commented Jan 6, 2020

Codecov Report

Merging #584 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #584      +/-   ##
==========================================
+ Coverage   97.55%   97.56%   +0.01%     
==========================================
  Files          29       29              
  Lines        2988     3001      +13     
==========================================
+ Hits         2915     2928      +13     
  Misses         73       73
Impacted Files Coverage Δ
sphinx_gallery/tests/test_full.py 100% <100%> (ø) ⬆️
sphinx_gallery/tests/test_backreferences.py 100% <100%> (ø) ⬆️
sphinx_gallery/backreferences.py 97.2% <100%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 775ca82...d411486. Read the comment docs.

'''
""" + code_str.split(b"'''")[-1]
expected['h.i'] = {u'module': u'h', u'module_short': u'h', u'name': u'i',
expected['k.l'] = {u'module': u'k', u'module_short': u'k', u'name': u'l',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check the tinybuild and add tests that fail in master and work now? It ensures that the real-world / end use cases improve in addition to our lower-level functions.

There might already be examples that didn't work before but now work (i.e., just change test_full.py but not anything in the tinybuild/ directory itself), which would be great. But if you need to also change things in tinybuild/ to get something that fails in master but works here it's fine, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't an appropriate import example in tinybuild/ - I can add a import a.b to /plot_numpy_matplotlib ('Link to other packages') example and add an assertion somewhere in test_full. Does that work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that works, but in addition to what you've done you should add something like the following after line ~198 in test_full.py as that's where we test that the code blocks have the appropriate outgoing links:

assert 'sphinx_gallery.backreferences.identify_names.html' in lines

It's fine to have all this in plot_numpy_matplotlib.py of tinybuild, no need for a separate file.

@lucyleeow
Copy link
Contributor Author

I have added a import case to plot_numpy_matplotlib.py and checked that it generates the appropriate backreferences mini-gallery rst (and that it doesn't work on master).

I know it doesn't really fit in with the example plot_numpy_matplotlib.py so I am happy to put this in it's own example .py file.

I also noticed that incorrect mini-gallery rst files are generated, which reference incorrect example .py files. Neither of these functions are used at all in the tinybuild examples:

  • sphinx_gallery.backreferences.NameFinder.examples - references plot_future_imports.py
  • sphinx_gallery.sorting.FileNameSortKey.examples - references plot_numpy_matplotlib.py

I am not sure if this error is related to the np.random.RandomState(0) problem and the inconsistent variable name linking mentioned here: mne-tools/mne-python#7174 (comment). I can investigate these in a separate issue/PR.

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that incorrect mini-gallery rst files are generated, which reference incorrect example .py files... I am not sure if this error is related to the np.random.RandomState(0) problem and the inconsistent variable name linking mentioned here: mne-tools/mne-python#7174 (comment). I can investigate these in a separate issue/PR.

That would be great. I would say either open an issue or WIP PR with a list of these three problems (RandomState, extra / incorrect backrefs, incomplete method parsing) and we can tackle them not in this PR

full_name = '.'.join(
module[:depth] + [class_name] + method)
yield name, full_name, class_attr, is_class
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This break will always occur in the first iteration of the for loop, so do we even need the loop? It seems likely that this loop is now equivalent to just doing:

if len(module) >= 1:  # probably don't even need this conditional
    full_name = ...
    yield ...

Is the loop no longer necessary because somewhere else we're already traversing the full_name.split('.') levels backward?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right. The loop is needed for the conditional if local_name in self.imported_names: but is not required for the condition elif local_name in self.global_variables:.

In the case where you are calling an indirectly imported function (or class), e.g.,

import h.i
h.i.j()

the local_name, h, will not be in self.imported_names or self.global_variables. It needs to traverse the loop for the first conditional if local_name in self.imported_names:.

Not sure how to fix...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to traverse the loop for the first conditional if local_name in self.imported_names:

I'm confused :( Which loop are you talking about with this statement? The one up on line 69, or 80? Those are the two that satisfy if local_name in self.imported_names. Maybe the best thing to do is to add a commit with some code comments to help explain why some of the conditionals / loops are needed in the places that you know they are (and even # XXX don't know why this is needed in some places).

I'm just saying that the change you made in the code is equivalent to something simpler. So we should either simplify the loop starting on line 108 accordingly or revert the addition of the break.

Are you saying that (regardless of this equivalence) the code as it's written does not work properly?

Copy link
Contributor Author

@lucyleeow lucyleeow Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment but not sure if it helps.

I was talking about the 69 for loop for split_level in range(max_import_splits):

And yes I realise now that the break was on the wrong level. It was meant to be similar to/on the same level as the break here:

            for split_level in range(max_import_splits):
                local_name_split = name.split('.')
                local_name = '.'.join(local_name_split[:split_level+1])
                remainder = name[len(local_name):]
                class_attr = False
                if local_name in self.imported_names:
                    # Join import path to relative path
                    full_name = self.imported_names[local_name] + remainder
                    if local_name in self.global_variables:
                        obj = self.global_variables[local_name]
                        if remainder:
                            for level in remainder[1:].split('.'):
                                obj = getattr(obj, level)
                        is_class = inspect.isclass(obj)
                    else:
                        is_class = False
                    yield name, full_name, class_attr, is_class
                    break   # <----this break!!

but you made me realise that the for split_level in range(max_import_splits): loop is only required for the first condition. i.e.

for split_level in range(max_import_splits):
    if local_name in self.imported_names:
    <the for loop is needed in here>
    elif local_name in self.global_variables:
    <the for loop isn't needed here>

Copy link
Contributor

@larsoner larsoner Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggests that the code should be refactored to:

if local_name in self.imported_names:
    <the for loop is needed in here>
    for split_level in range(max_import_splits):
        ...
elif local_name in self.global_variables:
    <the for loop isn't needed here>

Right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, this wouldn't work as local_name is formed using the split_level. So maybe you were talking about a different for loop...

Copy link
Contributor Author

@lucyleeow lucyleeow Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different loop?

For case of import h.i; h.i.j():

for split_level in range(max_import_splits):
    if local_name in self.imported_names:
        # the for loop is needed in here because self.imported_names is `h.i`
        # so the loop would check for `h` in self.imported_names first 
        # time round,  `h.i` in imported_names
    elif local_name in self.global_variables:
        # you don't need the loop here as you are checking for 
        # global variable names, which
        # won't have a '.' (i think?)

But I don't have a great understanding of what goes on after the elif local_name in self.global_variables: so please correct me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put the break at the correct level now - which may or may not make things clearer.

Copy link
Contributor Author

@lucyleeow lucyleeow Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The break broke things because it seems that you need to traverse the loop again if during the first time round the loop you have this case:

for split_level in range(max_import_splits):
    if local_name in self.imported_names:
        # condition not met on first loop
    elif local_name in self.global_variables:
        # condition met on first loop

Because otherwise example_code_obj returned from identify_names is incorrect. I don't really understand why.
e.g., for tinybuild sphinx_gallery.backreferences.identify_names([('text', 'Text block', 1)]), example_code_obj is:

# with break
('sphinx_gallery.backreferences.identify_names', {'name': 'backreferences.identify_names', 'module': 'builtins.module', 'module_short': 'builtins.module', 'is_class': False})])
# without break
('sphinx_gallery.backreferences.identify_names', {'name': 'identify_names', 'module': 'sphinx_gallery.backreferences', 'module_short': 'sphinx_gallery.backreferences', 'is_class': False}),

I think that without the break both are yielded by get_mapping but somehow identify_names knows that 'name': 'backreferences.identify_names' is the good one so only returns this? I'm a bit confused by how identify_names works but does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but somehow identify_names knows that 'name': 'backreferences.identify_names' is the good one

It looks for valid intersphinx references, so yeah it should subselect the correct one (wherever it's documented in the docs)

'''
""" + code_str.split(b"'''")[-1]
expected['h.i'] = {u'module': u'h', u'module_short': u'h', u'name': u'i',
expected['k.l'] = {u'module': u'k', u'module_short': u'k', u'name': u'l',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that works, but in addition to what you've done you should add something like the following after line ~198 in test_full.py as that's where we test that the code blocks have the appropriate outgoing links:

assert 'sphinx_gallery.backreferences.identify_names.html' in lines

It's fine to have all this in plot_numpy_matplotlib.py of tinybuild, no need for a separate file.

@larsoner
Copy link
Contributor

larsoner commented Jan 7, 2020

That break appears to have broken things

@lucyleeow
Copy link
Contributor Author

This failure seems to be a problem with installation with Python Nightly

@lucyleeow
Copy link
Contributor Author

So this works and fixes this specific bug. Going forward and fixing the other 3 problems (RandomState, extra / incorrect backrefs, incomplete method parsing) we might want to think about refactoring the get_mapping function.

I will tackle this when I feel brave enough (also have some other work to do), unless someone beats me to it.

@larsoner larsoner merged commit da4b0cb into sphinx-gallery:master Jan 7, 2020
@larsoner
Copy link
Contributor

larsoner commented Jan 7, 2020

Thanks @lucyleeow !

@larsoner larsoner added the bug label Jan 7, 2020
@larsoner larsoner mentioned this pull request Jan 7, 2020
4 tasks
@lucyleeow lucyleeow deleted the IS/583 branch April 6, 2020 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG backreferences not working for functions if not imported directly

3 participants