What audio input files are supported by the ASR model "openai/whisper-large-v2" on Hugging Face? #26075
Closed
Description
Hi,
I am trying to use the Hugging Face pipeline to run the model "openai/whisper-large-v2" for an ASR task. I then wrap the model in Gradio as shown below, but when I submit, I get an error.
Model card of this model says "To transcribe audio samples, the model has to be used alongside a [WhisperProcessor]"
How do I know which audio input formats this model supports, and how can I convert my audio sample (an .m4a file, the default format produced by the MacBook Voice Memos app) to a format the model supports?
My code is below:
from transformers import pipeline
import gradio as gr

# model = pipeline(model="facebook/wav2vec2-base-960h")
model = pipeline(model="openai/whisper-large-v2")

def transcribe_audio(mic=None, file=None):
    if mic is not None:
        audio = mic
    elif file is not None:
        audio = file
    else:
        return "You must either provide a mic recording or a file"
    transcription = model(audio)["text"]
    return transcription

gr.Interface(
    fn=transcribe_audio,
    inputs=[
        gr.Audio(source="microphone", type="filepath", optional=True),
        gr.Audio(source="upload", type="filepath", optional=True),
    ],
    outputs="text",
).launch(share=True)
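For the conversion part of the question, one possible approach is to re-encode the .m4a file as a 16 kHz mono WAV file before passing its path to the pipeline. The sketch below is only an illustration: it assumes the `ffmpeg` command-line tool is installed and on PATH, and the helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, not part of transformers or gradio. The 16 kHz default reflects the sampling rate the Whisper feature extractor expects.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return the ffmpeg argument list that converts src to a mono WAV file.

    Hypothetical helper for illustration; not part of any library.
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),            # input file, e.g. a Voice Memos .m4a
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample to the requested rate
        str(dst),
    ]


def convert_to_wav(src, dst=None, sample_rate=16000):
    """Convert src to WAV next to the original (or at dst) and return the new path.

    Assumes ffmpeg is installed; raises RuntimeError if it is not on PATH.
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix(".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return dst
```

With a helper like this, `transcribe_audio` could call `model(str(convert_to_wav(audio)))["text"]` instead of `model(audio)["text"]`. Whether this resolves the error depends on what the error actually is, since the pipeline also relies on ffmpeg when it is given a file path.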