Conversation


@LysandreJik LysandreJik commented Mar 19, 2025

This PR updates the dummy-creation process that had been in place up to now. Instead of defining dummies in the lib, they're now dynamically created when the required dependencies for an object aren't found.

See below for an extract from the documentation explaining the updated approach.


While we strive for minimal dependencies, some models have specific dependency requirements that cannot be
worked around. We don't want all users of transformers to have to install those dependencies just to use other
models, so we mark them as soft dependencies rather than hard dependencies.

The transformers toolkit is not made to error out on import of a model that has a specific dependency; instead, an
object for which you are lacking a dependency will only error out when calling any method on it. As an example, if
torchvision isn't installed, the fast image processors will not be available.

This object is still importable:

>>> from transformers import DetrImageProcessorFast
>>> print(DetrImageProcessorFast)
<class 'DetrImageProcessorFast'>

However, no method can be called on that object:

>>> DetrImageProcessorFast.from_pretrained()
ImportError: 
DetrImageProcessorFast requires the Torchvision library but it was not found in your environment. Checkout the instructions on the
installation page: https://pytorch.org/get-started/locally/ and follow the ones that match your environment.
Please note that you may need to restart your runtime after installation.
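A minimal sketch of such a dynamically created dummy (the helper name `make_dummy` and the metaclass are illustrative only; the actual transformers implementation differs):

```python
# Minimal sketch of a dynamically created "dummy" class: it stays importable
# and printable, but any public attribute access raises an informative
# ImportError. `make_dummy` is a hypothetical helper, not the real API.

def make_dummy(name: str, backend: str):
    message = (
        f"{name} requires the {backend} library but it was not found in "
        f"your environment."
    )

    class DummyMeta(type):
        def __getattribute__(cls, key):
            # Let private/dunder attributes through so introspection works.
            if key.startswith("_"):
                return type.__getattribute__(cls, key)
            raise ImportError(message)

    return DummyMeta(name, (), {})


DetrImageProcessorFast = make_dummy("DetrImageProcessorFast", "Torchvision")
print(DetrImageProcessorFast)  # importable and printable
# DetrImageProcessorFast.from_pretrained(...)  # would raise ImportError
```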

Let's see how to specify object dependencies.

Specifying Object Dependencies

Filename-based

All objects under a given filename have an automatic dependency on the backend linked to that filename:

TensorFlow: All files starting with modeling_tf_ have an automatic TensorFlow dependency.

Flax: All files starting with modeling_flax_ have an automatic Flax dependency.

PyTorch: All files starting with modeling_ that do not match the two rules above (TensorFlow and Flax) have an
automatic PyTorch dependency.

Tokenizers: All files starting with tokenization_ and ending with _fast have an automatic tokenizers dependency.

Vision: All files starting with image_processing_ have an automatic dependency on the vision dependency group;
at the time of writing, this group only contains the pillow dependency.

Vision + Torch + Torchvision: All files starting with image_processing_ and ending with _fast have an automatic
dependency on vision, torch, and torchvision.

All of these automatic dependencies are added on top of the explicit dependencies that are detailed below.
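The rules above amount to a small prefix/suffix table. A hypothetical helper (not the actual transformers function) makes the mapping concrete:

```python
# Sketch of the filename-based dependency rules listed above.
# `filename_to_backends` is an illustrative helper, not the real implementation.

def filename_to_backends(filename: str) -> list[str]:
    name = filename.removesuffix(".py")
    if name.startswith("modeling_tf_"):
        return ["tf"]
    if name.startswith("modeling_flax_"):
        return ["flax"]
    if name.startswith("modeling_"):
        return ["torch"]
    if name.startswith("tokenization_") and name.endswith("_fast"):
        return ["tokenizers"]
    # The more specific `_fast` rule must be checked before the generic one.
    if name.startswith("image_processing_") and name.endswith("_fast"):
        return ["vision", "torch", "torchvision"]
    if name.startswith("image_processing_"):
        return ["vision"]
    return []


print(filename_to_backends("modeling_tf_bert.py"))           # ['tf']
print(filename_to_backends("image_processing_detr_fast.py")) # ['vision', 'torch', 'torchvision']
```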

Explicit Object Dependencies

We add a decorator called requires that is used to explicitly specify the dependencies of a given object. As an
example, the Trainer class has two hard dependencies: torch and accelerate. Here is how we specify these
required dependencies:

from .utils.import_utils import requires

@requires(backends=("torch", "accelerate"))
class Trainer:
    ...

Any backend available in the import_utils.py module can be specified here.
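A minimal sketch of what such a decorator can look like (the `_backends` attribute name is illustrative; the real implementation in transformers.utils.import_utils differs):

```python
# Sketch of a `requires`-style decorator: it only records the declared
# backends on the object so that import machinery can check them later.

def requires(*, backends=()):
    def decorator(obj):
        obj._backends = tuple(backends)  # attribute name is illustrative
        return obj
    return decorator


@requires(backends=("torch", "accelerate"))
class Trainer:
    ...


print(Trainer._backends)  # ('torch', 'accelerate')
```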

@github-actions github-actions bot marked this pull request as draft March 19, 2025 14:59

@LysandreJik LysandreJik force-pushed the dummy-removal branch 2 times, most recently from 907994d to 82203a3 Compare March 26, 2025 16:56
@LysandreJik LysandreJik marked this pull request as ready for review March 26, 2025 16:57

@ArthurZucker ArthurZucker left a comment


Thanks to the Great Cleaner @LysandreJik

@LysandreJik LysandreJik force-pushed the dummy-removal branch 3 times, most recently from eafece1 to 86d19fb Compare March 28, 2025 10:15
@LysandreJik LysandreJik changed the base branch from main to tests-fetcher-test-all March 28, 2025 10:24
@LysandreJik LysandreJik force-pushed the dummy-removal branch 5 times, most recently from c2d9ac6 to a7d9a91 Compare March 28, 2025 14:56
@LysandreJik
Member Author

@molbap will review this sometime early next week and then we can merge 🤗


molbap commented Apr 7, 2025

Will finish this review tomorrow morning!

@molbap molbap self-requested a review April 7, 2025 16:16

@molbap molbap left a comment


Of course, I read all modifications 👀
I tested:

  • Creating a new model (llava-like) that requires torch/torchvision. No issues there: inits are created correctly and imports work fine.
  • Imports of specific models do not seem to be slowed down. Specifically, the time to import transformers on main (stats over 20 runs):
real: mean=1.985  std=0.126
user: mean=2.179  std=0.082
sys:  mean=0.326  std=0.020

on the branch:

real: mean=2.231  std=0.128
user: mean=2.259  std=0.097
sys:  mean=0.390  std=0.033

which is a very minor (but measurable) slowdown. For a specific model that requires backends, the time to import LlavaForConditionalGeneration:

main:

real: mean=3.731  std=0.200
user: mean=3.259  std=0.174
sys:  mean=0.642  std=0.035

on the branch:

real: mean=3.740  std=0.184
user: mean=3.367  std=0.154
sys:  mean=0.587  std=0.039

No particular difference here.

New-model-addition correctly picks up on the init format (makes sense since for now it copies it; modular will have to follow through).

So the minor slowdowns may be due to __all__ being longer than it was: more modules to list, even if they are not individually imported?

I've left one very minor comment or two but seems to hold up really well 🙌

Contributor

note, if torchvision is present but not torch (which may happen, but isn't a runnable config), then

    if is_torchvision_available():
        from torchvision import io as torchvision_io

and likely a few more things will fail at import. Just flagging, as torch is not a required dependency

if __name__ == "__main__":
    check_all_inits()
    check_submodules()
    # This entire file needs an overhaul
Contributor

😁

continue
if d not in direct_deps:
raise ValueError(f"KeyError:{d}. From {m}")
module_prefix = d.rsplit(".", 1)[0].replace(".", "/")
Contributor
The reason will be displayed to describe this comment to others. Learn more.

Not familiar - so it was common to get transformers.modeling_utils when we wanted e.g. transformers/modeling_utils.py? Good fallback then

Comment on lines +1899 to +1879
module_keys = set(
chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)
Contributor
The reason will be displayed to describe this comment to others. Learn more.

just added a comment, if I get it right:

Suggested change
module_keys = set(
chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)
# collect all intermediate module paths from keys like "transformers.models.bart.modeling_bart", counting dots
module_keys = set(
chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)

Member Author
The reason will be displayed to describe this comment to others. Learn more.

haha it is indeed unclear, it's so that when you import a module such as transformers.models.bart.modeling_bart, you can also import

transformers.models.bart.modeling_bart
transformers.models.bart
transformers.models
transformers

I'll add a comment
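Concretely, the expression under discussion expands each dotted key into all of its ancestor module paths:

```python
# Demonstration of the prefix expansion discussed above: each dotted module
# key expands into every ancestor module path, so importing the deepest
# module also makes all parent packages importable.
from itertools import chain

module = {"transformers.models.bart.modeling_bart": object()}
module_keys = set(
    chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in module])
)
print(sorted(module_keys))
# ['transformers', 'transformers.models', 'transformers.models.bart',
#  'transformers.models.bart.modeling_bart']
```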


@molbap molbap left a comment


Approved after discussion. Importing a model before:
[screenshot]
Importing a model now:
[screenshot]
Same time - the listing of all models is now explicit (all the tiny bars on the right).

Import * with torch, before:
[screenshot]

Import * with torch, now:
[screenshot]
Same measurement as before - roughly 200ms more for the import. Without torch, it takes 400ms.
LGTM!

@LysandreJik LysandreJik changed the base branch from tests-fetcher-test-all to main April 8, 2025 11:26
@LysandreJik LysandreJik force-pushed the dummy-removal branch 2 times, most recently from 4d04a39 to f6f1133 Compare April 9, 2025 14:55
@LysandreJik LysandreJik force-pushed the dummy-removal branch 3 times, most recently from 9ff165a to 45b8eb4 Compare April 11, 2025 07:11

@LysandreJik LysandreJik merged commit 54a123f into main Apr 11, 2025
19 of 21 checks passed
@LysandreJik LysandreJik deleted the dummy-removal branch April 11, 2025 09:08
@LysandreJik LysandreJik changed the title Dummies Simplify soft dependencies and update the dummy-creation process Apr 11, 2025
cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025
…gingface#36827)

* Reverse dependency map shouldn't be created when test_all is set

* [test_all] Remove dummies

* Modular fixes

* Update utils/check_repo.py

Co-authored-by: Pablo Montalvo <[email protected]>

* [test_all] Better docs

* [test_all] Update src/transformers/commands/chat.py

Co-authored-by: Joao Gante <[email protected]>

* [test_all] Remove deprecated AdaptiveEmbeddings from the tests

* [test_all] Doc builder

* [test_all] is_dummy

* [test_all] Import utils

* [test_all] Doc building should not require all deps

---------

Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
(same commit message as above)