remove categories metadata from (OneHot)Label datapoint #7171
pmeier wants to merge 1 commit into pytorch:main
Example:
    >>> class BatchMultiCrop(transforms.Transform):
    ...     def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.Label]):
    ...     def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], datapoints.LabelWithCategories]):
Oops, auto refactoring was a little too eager. Need to revert this.
    num_categories = len(inpt.categories)
    output = one_hot(inpt.as_subclass(torch.Tensor), num_classes=num_categories)
    return datapoints.OneHotLabel(output, categories=inpt.categories)
    return datapoints.OneHotLabel(output)
This does change the functionality, but not by much. Previously, users could use this transform and it would do the right thing by default if the label had some categories attached to it. Now they have to pass the number of categories to the constructor explicitly.
Note that the -1 default is somewhat weird. It is a sentinel that tells the torch kernel to "figure it out": the kernel looks at the values present in the input and infers the number of categories as one more than the largest value. Obviously, if the input doesn't contain the largest category index, this will give wrong results.
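To make the pitfall with the -1 sentinel concrete, here is a minimal sketch that calls `torch.nn.functional.one_hot` directly rather than going through the prototype transform. The label values and the 5-category dataset are made up for illustration.

```python
import torch
from torch.nn.functional import one_hot

# Hypothetical batch of labels from a 5-category dataset (indices 0..4).
# The largest category index, 4, happens to be absent from this batch.
labels = torch.tensor([0, 2, 1])

# num_classes=-1: the width is inferred from the values in the input.
inferred = one_hot(labels, num_classes=-1)

# Explicit num_classes: the width matches the dataset regardless of the batch.
explicit = one_hot(labels, num_classes=5)

print(inferred.shape)  # torch.Size([3, 3])
print(explicit.shape)  # torch.Size([3, 5])
```

With the sentinel, two batches with different maximum labels produce one-hot tensors of different widths, which is why passing the number of categories explicitly is the safer choice.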
We (@NicolasHug, @vfdev-5, and I) had a longer offline discussion about this and decided to not include the Closing this PR as we don't need to remove the |
We had an offline discussion about this and the consensus is that right now there is no critical need for the labels to have a `categories` field. It is nice to have for visualization, and if there is user demand for it later we can re-add it. Still, as is, it is a liability since we haven't done any perf tests with it and there might be other implications from sharing the same list of categories between all label instances coming from a dataset.

This PR removes the attribute and all functionality that comes with it. The diff is so large because I ported the "old" behavior over to `torchvision.prototype.datasets` since they use the attribute quite a bit.

cc @bjuncek