Are generated subtargets a misfeature? #10423
Description
What are generated subtargets?
When possible, Pants will generate multiple smaller targets from one original target. Each generated target will have exactly one source file.
For example, if a target owns 5 files, Pants would generate 5 new targets. All of the metadata gets copied from the original, except for the address and the sources field.
If a file has multiple owning targets, then we will never attempt to generate a subtarget because it is ambiguous where to copy the metadata from.
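The generation rule above can be sketched in a few lines. This is only an illustrative model, not Pants's actual implementation: the `Target` class, the `generate_subtargets` and `subtarget_for_file` helpers, and the choice to use the file path as the generated address are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Target:
    # Hypothetical stand-in for a target; not the real Pants class.
    address: str
    sources: Tuple[str, ...]
    metadata: Tuple[Tuple[str, str], ...] = ()


def generate_subtargets(target: Target) -> List[Target]:
    """Return one target per source file.

    Everything is copied from the original except the address and the
    sources field. Using the file path as the address is an assumption.
    """
    return [
        replace(target, address=source, sources=(source,))
        for source in target.sources
    ]


def subtarget_for_file(path: str, targets: List[Target]) -> Optional[Target]:
    """Generate a subtarget for `path` only if exactly one target owns it.

    With zero or multiple owners there is no unambiguous place to copy
    metadata from, so this sketch returns None.
    """
    owners = [t for t in targets if path in t.sources]
    if len(owners) != 1:
        return None
    return replace(owners[0], address=path, sources=(path,))


utils = Target(
    address="src/util:utils",
    sources=("src/util/strutil.py", "src/util/dirutil.py"),
    metadata=(("tags", "lib"),),
)
subtargets = generate_subtargets(utils)
```

Here `subtargets` holds two targets, one per file, each carrying the same `metadata` as `utils`.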
When are they used?
- File arguments: when safe, we will generate a subtarget for each file specified. Otherwise, we use the original.
- `--changed-since`: this is essentially sugar for file arguments. Same semantics as file arguments.
- Explicit file dependencies in BUILD files.
- We error if there are multiple owners.
- Dependency inference: when safe, we generate a subtarget for the specific file that's imported.
Open questions
Dependencies on files in the same owner
Imagine you have the target :utils, which owns strutil.py and dirutil.py, and strutil.py depends on dirutil.py. In the BUILD file, :utils does not declare a dependency on dirutil.py because that file is already in its sources field, and dependency inference is disabled. Then ./pants dependencies strutil.py will include neither dirutil.py nor :utils, so dirutil.py will not appear anywhere in the closure.
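A small model makes the gap concrete. The graph layout, the addresses, and the `closure_files` helper below are hypothetical and only mimic the scenario described here (no dependency inference, and generated subtargets copying the empty dependencies field of :utils); this is not how Pants computes closures internally.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: address -> (owned source files, declared dependencies).
Graph = Dict[str, Tuple[List[str], List[str]]]


def closure_files(root: str, graph: Graph) -> Set[str]:
    """Collect every source file reachable from `root` via declared deps."""
    seen: Set[str] = set()
    files: Set[str] = set()
    stack = [root]
    while stack:
        address = stack.pop()
        if address in seen:
            continue
        seen.add(address)
        sources, dependencies = graph[address]
        files.update(sources)
        stack.extend(dependencies)
    return files


graph: Graph = {
    # The original target owns both files and declares no dependencies.
    "util:utils": (["util/strutil.py", "util/dirutil.py"], []),
    # Generated subtargets copy the (empty) dependencies field from :utils.
    "util/strutil.py": (["util/strutil.py"], []),
    "util/dirutil.py": (["util/dirutil.py"], []),
}

target_level = closure_files("util:utils", graph)
file_level = closure_files("util/strutil.py", graph)
```

In this model `target_level` contains both files, while `file_level` contains only `util/strutil.py`, even though strutil.py imports dirutil.py.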
No mechanism to disable generated subtargets
The only way to keep Pants from generating subtargets is for a file to have multiple owning targets.
We could possibly add a global flag.
Pros
- If you were not already using one target per file, then you will get finer-grained caching without any new BUILD file boilerplate.
- Finer-grained caching means that we invalidate less frequently, i.e. that we get more cache hits.
- Finer-grained caching also results in slightly smaller chroots, as we don't copy over as many files.
- Is this actually noticeable?
- File arguments (almost always) operate at the file level.
- Previously, file arguments were really sugar to operate on targets. We only operated at the file level with `lint`, `fmt`, and `test` due to special logic.
- Arguably, this makes file arguments more consistent and more useful, as something like `./pants dependencies foo.py` will only show the deps of that file (if dep inference is enabled).
- Artificial cycles at the target level are less common.
- This requires that you use dependency inference, however.
- We agreed that we must tolerate cycles, at least upon request, so this is no longer a major win.
- Starts decoupling describing metadata from the dependency graph.
- There is a tension between wanting to describe metadata coarsely but also wanting fine-grained dependencies. This feature allows you to get both.
Cons
- File addresses are deceptive.
- We output a file name for generated subtargets, but it's not really a file. It's a target that owns exactly one file and was generated from some other target.
- This nuance risks confusion; it hides key information about how Pants works, i.e. that Pants cares about the metadata for that target, not only the source file.
- We now must teach two concepts, rather than one.
- If you have any `files()`, `resources()`, or `python_requirement_library()` targets, you will always have conventional targets.
- Because it's so common to use those target types, we will still need to teach conventional targets.
- A user's experience varies substantially depending on whether they use file arguments or address arguments, and whether dep inference is enabled.
- This risks confusion about why something works one way but not another.
- The tool feels less unified and less consistent.
- See "Stop eagerly validating `!` ignores in the `dependencies` field" #10420 (comment) for a concrete issue we had in Toolchain because of this. If we use file args, Pants works; otherwise, it errors.
- No mechanism to determine the owner of a file.
- Before, you would use `./pants list foo.py`. Now this gives you back the file name `foo.py`.
- It's not clear what metadata is being applied for a file, and where to go to edit that metadata.
- Substantially increases our code complexity and maintenance burden.
- This code is tricky to reason about, particularly because there are so many edge cases.
- Example: this has interfered with adding `--query`, where it's unclear how we should handle generated subtargets.
- Example: confusing how to integrate this with `dependees`. See "Teach `dependees` how to work with generated subtargets" #10354.