Gradient Flow

Speech Data Processing Takes Flight

Unlocking speech and audio data with new open source tools

Interest in neural networks and deep learning can be traced back to groundbreaking results in speech recognition (2011) and computer vision (2012). The number of companies building computer vision applications keeps growing, but far fewer companies work with audio data, even though many speech models and services are available.

A major reason is that audio data has historically been difficult to work with. There are many different formats for storing and compressing audio. Compression may be lossless or lossy, different formats may require different codecs to read, and a single recording can contain multiple channels.
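To make those properties concrete, here is a minimal sketch that uses only Python's standard-library wave module. It writes a short stereo file and reads back its channel count, sample width, and sample rate. The helper names (write_stereo_tone, describe_wav, demo) are hypothetical and chosen for illustration. The wave module only covers uncompressed PCM WAV files, so compressed formats would need a separate decoder.

```python
import math
import os
import struct
import tempfile
import wave


def write_stereo_tone(path, n_frames=1600, rate=16000, freq=440.0):
    """Write a 16-bit stereo PCM WAV file (left: sine tone, right: silence)."""
    frames = bytearray()
    for i in range(n_frames):
        left = int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate))
        # Stereo PCM interleaves samples: left, right, left, right, ...
        frames += struct.pack("<hh", left, 0)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(2)    # two channels (stereo)
        wf.setsampwidth(2)    # 2 bytes per sample (16-bit)
        wf.setframerate(rate)
        wf.writeframes(bytes(frames))


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": wf.getframerate(),
            "frames": wf.getnframes(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }


def demo():
    """Create a temporary WAV file, describe it, then clean up."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        write_stereo_tone(path)
        return describe_wav(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Even in this simplest case, a reader has to know the channel layout and sample width before the raw bytes mean anything, which hints at why handling many formats by hand has been tedious.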

Unbeknownst to many data and AI teams, working with audio is much simpler today. In a new post with researchers from Meaning, we describe an ecosystem of open source projects that vastly simplify audio data processing and pipelining. These projects allow data scientists, developers, and machine learning engineers who are comfortable with Python to start incorporating audio data into their models.

Data Exchange podcast


2022 NLP Summit

The NLP Summit is the world’s largest applied NLP community. As co-chair, I’m excited to announce that we have another outstanding slate of presentations that includes many real-world use cases and applications, updates on major open source projects, and cutting-edge research being conducted at Google Brain, Hugging Face, and OpenAI. If you work with NLP and text, you need to attend this FREE online conference.


What does it mean to build trust into AI?

In my recent Twitter Spaces conversation with Andrew Burt (Managing Partner at BNH.ai) and Bob Friday (Chief AI Officer at Juniper), we dig into that question and more.


If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.
