Simplify soft dependencies and update the dummy-creation process #36827
Conversation
ArthurZucker
left a comment
Thanks to the Great Cleaner @LysandreJik
@molbap will review this sometime early next week and then we can merge 🤗

Will finish this review tomorrow morning!
molbap
left a comment
Of course, I read all modifications 👀
I tested:
- Creating a new model (llava-like) that requires torch/torchvision. No issues there, inits are created correctly and imports work fine.
- Imports of specific models do not seem to be slowed down. Specifically:

Time to `import transformers` on `main` (stats on 20 runs):

```
real: mean=1.985 std=0.126
user: mean=2.179 std=0.082
sys:  mean=0.326 std=0.020
```

On the branch:

```
real: mean=2.231 std=0.128
user: mean=2.259 std=0.097
sys:  mean=0.390 std=0.033
```

which is a very minor (but measurable) slowdown. For a specific model that requires backends, time to import `LlavaForConditionalGeneration`:

On `main`:

```
real: mean=3.731 std=0.200
user: mean=3.259 std=0.174
sys:  mean=0.642 std=0.035
```

On the branch:

```
real: mean=3.740 std=0.184
user: mean=3.367 std=0.154
sys:  mean=0.587 std=0.039
```

No particular difference here.
- New-model-addition correctly picks up on the init format (makes sense since for now it's copying it; modular will have to follow through).

So the minor slowdown may be due to `__all__` being longer than it was: more modules to simply list, even if they are not individually imported?
I've left one very minor comment or two but seems to hold up really well 🙌
note, if torchvision is present but not torch (which may happen but isn't a runnable config), then

```python
if is_torchvision_available():
    from torchvision import io as torchvision_io
```

and likely a few more things will fail at the import. Just flagging, as torch is not a required dep.
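A defensive guard for that case might look like the following sketch; the availability helpers are stubbed here with `importlib.util.find_spec` rather than transformers' own `import_utils` checks, so this is an illustration of the idea, not the library's actual code:

```python
import importlib.util

# Stubbed availability checks in the spirit of the is_*_available helpers;
# transformers' real ones live in import_utils and are more elaborate.
def is_torch_available() -> bool:
    return importlib.util.find_spec("torch") is not None

def is_torchvision_available() -> bool:
    return importlib.util.find_spec("torchvision") is not None

# Guarding on both backends avoids the failure mode flagged above,
# where torchvision is installed without torch.
if is_torch_available() and is_torchvision_available():
    from torchvision import io as torchvision_io
```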
```python
if __name__ == "__main__":
    check_all_inits()
    check_submodules()
    # This entire file needs an overhaul
```
😁
utils/tests_fetcher.py (outdated)

```python
    continue
if d not in direct_deps:
    raise ValueError(f"KeyError:{d}. From {m}")
module_prefix = d.rsplit(".", 1)[0].replace(".", "/")
```
Not familiar - so it was common to get `transformers.modeling_utils` when we wanted e.g. `transformers/modeling_utils.py`? Good fallback then.
```python
module_keys = set(
    chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)
```
Just added a comment, if I get it right:
```suggestion
# collect all intermediate module paths from keys like "transformers.models.bart.modeling_bart", counting dots
module_keys = set(
    chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)
```
haha it is indeed unclear, it's so that when you import a module such as `transformers.models.bart.modeling_bart`, you can also import:

```
transformers.models.bart.modeling_bart
transformers.models.bart
transformers.models
transformers
```

I'll add a comment
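As a quick standalone illustration of what that comprehension produces (with a made-up one-entry `module` dict):

```python
from itertools import chain

# One dotted key expands to itself plus every ancestor package path:
# rsplit(".", i)[0] peels off i trailing components.
module = {"transformers.models.bart.modeling_bart": None}

module_keys = set(
    chain(*[[k.rsplit(".", i)[0] for i in range(k.count(".") + 1)] for k in list(module.keys())])
)
print(sorted(module_keys))
# → ['transformers', 'transformers.models', 'transformers.models.bart',
#    'transformers.models.bart.modeling_bart']
```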
Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
…gingface#36827)

* Reverse dependency map shouldn't be created when test_all is set
* [test_all] Remove dummies
* Modular fixes
* Update utils/check_repo.py

Co-authored-by: Pablo Montalvo <[email protected]>

* [test_all] Better docs
* [test_all] Update src/transformers/commands/chat.py

Co-authored-by: Joao Gante <[email protected]>

* [test_all] Remove deprecated AdaptiveEmbeddings from the tests
* [test_all] Doc builder
* [test_all] is_dummy
* [test_all] Import utils
* [test_all] Doc building should not require all deps

---------

Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: Joao Gante <[email protected]>




This PR updates the dummy-creation process that had been in place up to now. Instead of defining dummies in the lib, they're now dynamically created when the required dependencies for an object aren't found.
See below for an extract from the documentation explaining the updated approach.

While we strive for minimal dependencies, some models have specific dependency requirements that cannot be worked around. We don't want all users of `transformers` to have to install those dependencies in order to use other models, so we mark those as soft dependencies rather than hard dependencies.

The transformers toolkit is not made to error out on import of a model that has a specific dependency; instead, an object for which you are lacking a dependency will error out when calling any method on it. As an example, if `torchvision` isn't installed, the fast image processors will not be available.

This object is still importable:

However, no method can be called on that object:
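The code examples were stripped from this extract; the pattern they describe can be approximated with the following self-contained sketch. `FastImageProcessor`, the metaclass, and the error message are illustrative stand-ins, not transformers' actual implementation:

```python
class DummyObject(type):
    # Metaclass for a dynamically created dummy: the class can be imported
    # and referenced freely, but any public attribute access raises.
    def __getattribute__(cls, key):
        if key.startswith("_"):
            return super().__getattribute__(key)
        raise ImportError(
            f"{cls.__name__} requires the {cls._backends} backend(s), which are not installed."
        )

class FastImageProcessor(metaclass=DummyObject):
    _backends = ["torchvision"]

# Importing/referencing the class is fine...
print(FastImageProcessor.__name__)  # → FastImageProcessor

# ...but touching any method raises:
try:
    FastImageProcessor.from_pretrained
except ImportError as e:
    print(e)
```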
Let's see how to specify specific object dependencies.
Specifying Object Dependencies
Filename-based
All objects under a given filename have an automatic dependency on the tool linked to the filename:

- TensorFlow: all files starting with `modeling_tf_` have an automatic TensorFlow dependency.
- Flax: all files starting with `modeling_flax_` have an automatic Flax dependency.
- PyTorch: all files starting with `modeling_` and not matching the above (TensorFlow and Flax) have an automatic PyTorch dependency.
- Tokenizers: all files starting with `tokenization_` and ending with `_fast` have an automatic `tokenizers` dependency.
- Vision: all files starting with `image_processing_` have an automatic dependency on the `vision` dependency group; at the time of writing, this only contains the `pillow` dependency.
- Vision + Torch + Torchvision: all files starting with `image_processing_` and ending with `_fast` have an automatic dependency on `vision`, `torch`, and `torchvision`.

All of these automatic dependencies are added on top of the explicit dependencies that are detailed below.
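The rules above could be sketched as a small lookup. This is a hypothetical helper: the prefix/suffix strings come from the rules above, but the function and the exact backend-group names are assumptions, not transformers' code:

```python
def backends_for_filename(name: str) -> list[str]:
    """Map a transformers source filename to its automatic soft dependencies."""
    stem = name.removesuffix(".py")
    if stem.startswith("modeling_tf_"):
        return ["tf"]
    if stem.startswith("modeling_flax_"):
        return ["flax"]
    if stem.startswith("modeling_"):
        return ["torch"]
    if stem.startswith("tokenization_") and stem.endswith("_fast"):
        return ["tokenizers"]
    # The _fast check must come before the plain image_processing_ check.
    if stem.startswith("image_processing_") and stem.endswith("_fast"):
        return ["vision", "torch", "torchvision"]
    if stem.startswith("image_processing_"):
        return ["vision"]  # the vision group; currently just pillow
    return []

print(backends_for_filename("image_processing_llava_fast.py"))
# → ['vision', 'torch', 'torchvision']
```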
Explicit Object Dependencies
We add a method called `requires` that is used to explicitly specify the dependencies of a given object. As an example, the `Trainer` class has two hard dependencies: `torch` and `accelerate`. Here is how we specify these required dependencies:
Backends that can be added here are all the backends that are available in the `import_utils.py` module.