MNT: remove redundant normalization from set by rcomer · Pull Request #30979 · matplotlib/matplotlib

rcomer · 2026-01-16T22:05:42Z

PR summary

Inspired by #30957

The normalization is not needed here because the alias forms of the set_* methods can be called. So we can save ourselves microseconds 🚀

Before:

In [5]: %timeit line.set(lw=3, color='r', ls='--')
12.3 μs ± 116 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

After:

In [8]: %timeit line.set(lw=3, color='r', ls='--')
10.9 μs ± 76.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

timhoffm · 2026-01-17T10:59:17Z

From a quick look, I haven't seen a test for this. Please add one. I think Artist itself does not have aliases, so either you would have to create a sublcass with aliases for the test or just use one of the subclasses like

line = Line2D([0, 1], [0, 1])
line.set(lw=5)
assert line.get_linewidth() == 5

timhoffm

For context, the _api.define_aliases decorator automatically creates the alias getters and setters. So we are safe that all the alias setters exist and normalization is indeed not necessary.

Btw. do we test set() at all? If not, you may just add the long form here as well

line.set(linewidth=6)
assert line.get_linewidth() == 6

rcomer · 2026-01-17T13:46:52Z

Btw. do we test set() at all?

Not that I can see, though it is of course tested indirectly many times.

anntzer · 2026-01-17T19:18:24Z

A difference is that if going through normalize_kwargs, passing {"linewidth":, 1, "lw": 2} will raise an error because it sees that two aliases of the same property are passed (and it explicitly checks for that), whereas directly calling the setters will just let the second value win. I don't have a strong opinion right now as to whether this actually matters, or whether it's even worth worrying about this.

rcomer · 2026-01-17T19:55:52Z

Thanks @anntzer I think it should error for that case. I will wait to see if anyone thinks differently and, if, not, reinstate normalize_kwargs here and add a check for the error to the new test.

timhoffm · 2026-01-18T00:12:20Z

Note that we are not consistent with duplicating through aliases

In [33]: plt.plot([0, 1], lw=2, linewidth=3)

TypeError: Got both 'lw' and 'linewidth', which are aliases of one another

but

In [29]: line = Line2D([], [], lw=2, linewidth=3)

In [30]: line.get_linewidth()
Out[30]: 2.0

In [31]: line = Line2D([], [], linewidth=2, lw=3)

In [32]: line.get_linewidth()
Out[32]: 3.0

So, plot() does a normalization and warns, but Line2D.__init__ does not warn. Surprisingly, lw always wins - That's an implementation detail as linewidth is an explicit parameter, whereas lw comes in via kwargs. __init__ first sets all explicit parameters and then does self._internal_update(kwargs), with which lw overwrites the previously set linewidth. I'm quite sure this is accidental behavior and not intentionally designed.

I would have said that it is acceptable for set() to not warn on the basis that set(a=A, b=B) is just set_a(A); set_b(B) and if these parameters interact (e.g. because they overwrite each other, but could also be other interactions), that is your responsibility. Warning may be nice, but I'm not sure we need to jump extra hoops for that.

But this all shows that we should have a consistent concept for aliases:

Should they exist as dedicated methods? i.e. do we want set_lw()?

I don't think we necessarily need them as a public interface. If somebody bothers to resort to set_* methods, they could write out the full name. The abbreviations are primarily useful for kwargs.

The benefit of the alias functions is that we can pass aliases all the way through to their function. This feels cheaper because no remapping is needed. But cost would really need a dedicated investigation. If we just do a remapping (no duplicacy check), that's basically a dict-lookup, which is extremely fast. It's likely even faster than calling the alias setter, because that in turn has to lookup and call the actual property setter. But there could be a caveat that if we have to do normalization at every step of a call hierarchy, that could still become expensive.

The downside of not normalizing and falling through to the respective setter is that unnormalized state can have surprising effects. Which parameter eventually wins can depend on implementation details like in the Line2D.__init__.

When do we want to do normalization
Should this eventually happen for every API call? If so, we must normalize on the most inner level, i.e. in the public API of the Artists. We could but don't have to normalize in outer API layers like plot(). Some of these still need/use normalization to be able to conflate or sanity check parameters, e.g. mashing up c, color, facecolor, fc state in scatter().

Do we want duplicacy checks?
i.e. warn when both lw and linewidth are supplied. That's an additional cost and more costly than normalization, because you have to do comparison with all parameters. If we want to check for duplicacy, we obviously need to do this before each normalization. Alternative to duplicacy checking would be putting the inputs in the responsibility of the user. One can argue, as above, that we just apply all parameters - either with defined priorities (e.g. original/alias name wins, first/last name wins) - or stating that priority is undefined.

Overall, I'm not clear what is best. I think we should defer this to 3.12. It's nice to clean this mess up, but it has been as is for decades, so no need to hurry.

rcomer · 2026-01-18T10:08:31Z

😱

I should have learned by now that nothing is as simple as it first appears!

Should they exist as dedicated methods? i.e. do we want set_lw()?

I don't think we necessarily need them as a public interface. If somebody bothers to resort to set_* methods, they could write out the full name. The abbreviations are primarily useful for kwargs.

I agree. I actually hadn’t realised that they exist until this PR. I don’t think I would have otherwise gone looking for them. These days most people use tab-completion so they won’t save many keystrokes.

anntzer · 2026-01-18T11:17:13Z

That's an implementation detail as linewidth is an explicit parameter, whereas lw comes in via kwargs. init first sets all explicit parameters and then does self._internal_update(kwargs)

I believe that the technically correct solution is to not include any property in the actual constructor signature (the documented and displayed signature can and should likely be different, but that can be done via the usual signature-rewriting tricks that we already use for .set()); we should just capture all kwargs in a kwargs dict and forward them to set().

rcomer · 2026-01-18T11:31:01Z

Do we want duplicacy checks?
i.e. warn when both lw and linewidth are supplied. That's an additional cost and more costly than normalization, because you have to do comparison with all parameters. If we want to check for duplicacy, we obviously need to do this before each normalization.

We can skip that check if the normalized dictionary is the same size as the original dictionary

--- a/lib/matplotlib/cbook.py
+++ b/lib/matplotlib/cbook.py
@@ -1862,18 +1862,19 @@ def normalize_kwargs(kw, alias_mapping=None):
     to_canonical = {alias: canonical
                     for canonical, alias_list in alias_mapping.items()
                     for alias in alias_list}
-    canonical_to_seen = {}
-    ret = {}  # output dictionary
 
+    ret = {to_canonical.get(k, k): v for k, v in kw.items()}  # output dictionary
+    if len(ret) == len(kw):
+        return ret
+
+    # look for duplicates
+    canonical_to_seen = {}
     for k, v in kw.items():
         canonical = to_canonical.get(k, k)
         if canonical in canonical_to_seen:
             raise TypeError(f"Got both {canonical_to_seen[canonical]!r} and "
                             f"{k!r}, which are aliases of one another")
         canonical_to_seen[canonical] = k
-        ret[canonical] = v
-
-    return ret

However, with kwargs_good = {'lw': 3, 'color': 'r', 'linestyle': 'dashed'}, that only saved me 0.1 of a microsecond

main

In [8]: %timeit normalize_kwargs(kwargs_good, Line2D)
1.51 μs ± 8.46 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

branch

In [4]: %timeit normalize_kwargs(kwargs_good, Line2D)
1.39 μs ± 9.61 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

anntzer · 2026-01-18T12:01:29Z

If we care about performance then the biggest bottleneck actually appears to be the regeneration of to_canonical again and again, ie. caching that gives me a >2x speedup on @rcomer's example:

diff --git i/lib/matplotlib/cbook.py w/lib/matplotlib/cbook.py
index 2e416486ba..34bc774aa7 100644
--- i/lib/matplotlib/cbook.py
+++ w/lib/matplotlib/cbook.py
@@ -1820,6 +1820,13 @@ def _resize_sequence(seq, N):
         return list(itertools.islice(itertools.cycle(seq), N))
 
 
+@functools.cache
+def _fix_alias_map(cls):
+    return {alias: canonical
+            for canonical, alias_list in getattr(cls, "_alias_map", {}).items()
+            for alias in alias_list}
+
+
 def normalize_kwargs(kw, alias_mapping=None):
     """
     Helper function to normalize kwarg inputs.
@@ -1852,28 +1859,32 @@ def normalize_kwargs(kw, alias_mapping=None):
     if kw is None:
         return {}
 
-    # deal with default value of alias_mapping
+    # deal with default value of alias_mapping -- could be deprecated away.
     if alias_mapping is None:
-        alias_mapping = {}
-    elif (isinstance(alias_mapping, type) and issubclass(alias_mapping, Artist)
-          or isinstance(alias_mapping, Artist)):
-        alias_mapping = getattr(alias_mapping, "_alias_map", {})
-
-    to_canonical = {alias: canonical
-                    for canonical, alias_list in alias_mapping.items()
-                    for alias in alias_list}
+        return kw
+
+    if isinstance(alias_mapping, Artist):
+        alias_mapping = type(alias_mapping)
+
+    if isinstance(alias_mapping, type) and issubclass(alias_mapping, Artist):
+        to_canonical = _fix_alias_map(alias_mapping)
+    else:
+        to_canonical = {alias: canonical
+                        for canonical, alias_list in alias_mapping.items()
+                        for alias in alias_list}
+
+    ret = {to_canonical.get(k, k): v for k, v in kw.items()}  # output dictionary
+    if len(ret) == len(kw):
+        return ret
+
+    # look for duplicates
     canonical_to_seen = {}
-    ret = {}  # output dictionary
-
     for k, v in kw.items():
         canonical = to_canonical.get(k, k)
         if canonical in canonical_to_seen:
             raise TypeError(f"Got both {canonical_to_seen[canonical]!r} and "
                             f"{k!r}, which are aliases of one another")
         canonical_to_seen[canonical] = k
-        ret[canonical] = v
-
-    return ret
 
 
 @contextlib.contextmanager

The patch above is just a quick try but I suspect the "correct" approach is to change the format of Artist._alias_map (as set up by define_aliases) to directly be in the "efficient" format ({alias: canonical, alias: canonical, ...}) (with whatever deprecation dance is necessary, although most of the relevant stuff is private).

timhoffm · 2026-01-18T12:07:01Z

I believe that the technically correct solution is to not include any property in the actual constructor signature (the documented and displayed signature can and should likely be different, but that can be done via the usual signature-rewriting tricks that we already use for .set()); we should just capture all kwargs in a kwargs dict and forward them to set().

I fundamentally agree. Two additional thoughts to consider:

How does completion work nowadays? At what exactly are they looking? It would be nice to support that. OTOH constructors are rarely called directy. But possibly the same argument will hold for our plotting functions?
How costly is signature rewriting? It's an import-time cost and a few won't matter, but if we do this large-scale on the library we should at least have a guess whether that's significant.

rcomer · 2026-01-18T13:05:00Z

How does completion work nowadays? At what exactly are they looking? It would be nice to support that. OTOH constructors are rarely called directy. But possibly the same argument will hold for our plotting functions?

A quick check with ipython and vscode show they give me no help for passing linewidth to plot, but for Line2D tab-completion on linewidth works nicely. This is probably the wrong way around!

I am not too experienced with decorators, but I wonder if it's possible to make one that injects the canonical properties of an artist into a call sequence, and passes them as **kwargs to the inner function.

timhoffm · 2026-01-18T14:51:07Z

A quick chat with AI reveals the following - to be checked, but plausible behavior:

Aspect	IPython	VS Code
Execution model	Runtime	Static
Parameter source	`inspect.signature` , `__text_signature__`	AST + .pyi
Works without running code	No	Yes

So for static analyzers, we likely have to tune our .pyi files while for runtime inspection, we need to adapt the signature via decorators. Both approaches should be compatible with each other and with still actually using **kwargs in the atcual code logic.

scottshambaugh · 2026-01-20T03:58:15Z

I'm seeing big performance gains in #31000 from skipping the heaviest use of this code path. So if the discussion here is borderline on keeping/changing the behavior then it's worth flagging that the performance hit from keeping it is going to be pretty minimal after that merges. It sounds mostly philosophical though. :)

timhoffm · 2026-01-20T15:47:05Z

My primary intention here is to come up with a concept: How do we get from short and long kwargs (in Artist.init, set, plotting functions) to having the right state set. Aspects to that are

performance: This is a quite common operation (though possibly often with few or even 0 parameters).
clarity: We should have a clear mental model how we handle this, ideally with not too much code
usability: In particular support of completion.

Performance is still a design criteria. If possible, the architecture should be efficient. So that we are not too dependent on special shortcuts. #31000 brings benefits on _set_cm, which is great and good to know, but doesn't alleviate the need for a fundamental concept.

Seems to need more discussion.

timhoffm · 2026-01-23T14:29:08Z

The patch above is just a quick try but I suspect the "correct" approach is to change the format of Artist._alias_map (as set up by define_aliases) to directly be in the "efficient" format ({alias: canonical, alias: canonical, ...}) (with whatever deprecation dance is necessary, although most of the relevant stuff is private).

If _alias_map is still needed somewhere, it's also not the end of the world to additionally store a _reverse_alias_map. That can be generated right away in define_aliases. It's a little bit additional import-time effort, but we can easily affort a dict comprehension compared to all the other stuff we do.

timhoffm · 2026-01-23T14:40:32Z

Overall the strategy here should be:

Short-term:
make normalize_kwargs faster through a pre-built reverse_alias_map #30979 (comment) #30979 (comment)
Done in #31023.

Mid-term: (larger project)
Revisit the architecture and handling of artist properties, including

fundamental design decisions MNT: remove redundant normalization from set #30979 (comment)
usage in signatures MNT: remove redundant normalization from set #30979 (comment)
need for normalization in set, (including whether it needs to detect duplicates) MNT: remove redundant normalization from set #30979 (comment)
completion support MNT: remove redundant normalization from set #30979 (comment)

timhoffm · 2026-02-02T09:04:11Z

@rcomer I propose to close this. #31023 has made the normalization faster and any further changes are likely better addessed as part of a systematic reevaluation of the property handling, which we'll track in #31055.

rcomer added the Performance label Jan 16, 2026

QuLogic previously approved these changes Jan 17, 2026

View reviewed changes

rcomer force-pushed the set-normalize branch from 9d71800 to e26eeb1 Compare January 17, 2026 12:06

timhoffm approved these changes Jan 17, 2026

View reviewed changes

MNT: remove redundant normalization from set

335b2a6

rcomer force-pushed the set-normalize branch from e26eeb1 to 335b2a6 Compare January 17, 2026 13:46

timhoffm mentioned this pull request Jan 17, 2026

MNT: Discourage Artist.update #30978

Merged

5 tasks

rcomer marked this pull request as draft January 17, 2026 19:20

timhoffm mentioned this pull request Jan 18, 2026

[MNT]: Clarify the difference between Artist.set and Artist.update #30957

Closed

rcomer added this to the v3.12.0 milestone Jan 18, 2026

timhoffm mentioned this pull request Jan 19, 2026

[ENH]: utilize the auto completion. ( let me show what Im working on) #30990

Open

QuLogic mentioned this pull request Jan 20, 2026

PERF: Skip kwargs normalization in Artist._cm_set #31000

Merged

1 task

anntzer mentioned this pull request Jan 23, 2026

Speedup normalize_kwargs by storing aliases in a more practical format. #31023

Merged

5 tasks

rcomer mentioned this pull request Jan 28, 2026

Add tests for invalid properties and duplicate aliases in Artist.set #31041

Merged

1 task

timhoffm mentioned this pull request Feb 2, 2026

[MNT]: Revisit architecture and handling of Artist properties #31055

Open

rcomer closed this Feb 2, 2026

rcomer deleted the set-normalize branch February 2, 2026 09:43

timhoffm mentioned this pull request Feb 2, 2026

DOC: Clarify Artist.set() behavior #31057

Merged

rcomer mentioned this pull request Feb 2, 2026

TST: add basic test for set #31058

Merged

5 tasks

QuLogic removed this from the v3.12.0 milestone Feb 2, 2026

Uh oh!

Comments

Conversation

rcomer commented Jan 16, 2026

PR summary

Uh oh!

timhoffm commented Jan 17, 2026

Uh oh!

timhoffm left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rcomer commented Jan 17, 2026

Uh oh!

anntzer commented Jan 17, 2026

Uh oh!

rcomer commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm commented Jan 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rcomer commented Jan 18, 2026

Uh oh!

anntzer commented Jan 18, 2026

Uh oh!

rcomer commented Jan 18, 2026

Uh oh!

anntzer commented Jan 18, 2026

Uh oh!

timhoffm commented Jan 18, 2026

Uh oh!

rcomer commented Jan 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm commented Jan 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

scottshambaugh commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm commented Jan 20, 2026

Uh oh!

timhoffm commented Jan 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm commented Jan 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

timhoffm left a comment •

edited

Loading

rcomer commented Jan 17, 2026 •

edited

Loading

timhoffm commented Jan 18, 2026 •

edited

Loading

rcomer commented Jan 18, 2026 •

edited

Loading

timhoffm commented Jan 18, 2026 •

edited

Loading

scottshambaugh commented Jan 20, 2026 •

edited

Loading

timhoffm commented Jan 23, 2026 •

edited

Loading

timhoffm commented Jan 23, 2026 •

edited

Loading