System Info
This is environment agnostic. (Also, transformers env throws an error.)
Who can help?
No response
Information
Tasks
Reproduction
- Try to run charsui with a large audio file. I tried with a 25-minute one (I know I'm supposed to use smaller inputs).
- Notice it takes several minutes in SequenceFeatureExtractor.pad converting a numpy array into a list of numpy arrays.
You could also just run SequenceFeatureExtractor.pad with a very large numpy array as input; I haven't spent time making an MRE.
Expected behavior
Transformers shouldn't spend time on this conversion; it's pointless. It could be avoided by doing no work when the input is already a numpy array, by changing
    for key, value in processed_features.items():
        if isinstance(value[0], (int, float)):
            processed_features[key] = to_numpy(value)
        else:
            processed_features[key] = [to_numpy(v) for v in value]
to
    for key, value in processed_features.items():
        if isinstance(value[0], (int, float)):
            processed_features[key] = to_numpy(value)
        elif not isinstance(value, np.ndarray):
            processed_features[key] = [to_numpy(v) for v in value]
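For illustration, here is a minimal, self-contained sketch of the two loops side by side. It is not the library code: to_numpy here is a hypothetical stand-in built on np.asarray, the function names convert_original and convert_patched are made up, the audio is dummy data, and both functions return a new dict instead of mutating the input. It only shows that a 1-D float32 array falls into the list-comprehension branch today, while the proposed check leaves it untouched.

```python
import numpy as np


def to_numpy(obj):
    # Hypothetical stand-in for the library's to_numpy helper.
    return np.asarray(obj)


def convert_original(processed_features):
    # Mirrors the current loop: any value whose first element is not a
    # Python int/float is rebuilt element by element.
    out = dict(processed_features)
    for key, value in processed_features.items():
        if isinstance(value[0], (int, float)):
            out[key] = to_numpy(value)
        else:
            out[key] = [to_numpy(v) for v in value]
    return out


def convert_patched(processed_features):
    # Mirrors the proposed loop: values that are already numpy arrays
    # are left alone.
    out = dict(processed_features)
    for key, value in processed_features.items():
        if isinstance(value[0], (int, float)):
            out[key] = to_numpy(value)
        elif not isinstance(value, np.ndarray):
            out[key] = [to_numpy(v) for v in value]
    return out


# One second of dummy float32 audio at 16 kHz (made-up data).
# np.float32 scalars are not instances of Python float, so the first
# branch is skipped for this array.
audio = np.zeros(16000, dtype=np.float32)
features = {"input_values": audio}

original = convert_original(features)
patched = convert_patched(features)

# The original loop builds one small array per sample.
print(type(original["input_values"]).__name__, len(original["input_values"]))
# The patched loop returns the very same array object.
print(patched["input_values"] is audio)
```

The per-sample list comprehension is what scales badly with long audio; the patched version does no per-element work for array inputs.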
Source of the original snippet: transformers/src/transformers/feature_extraction_sequence_utils.py, lines 169 to 173 at commit 0d48241.