Speech2Text with Sentence Similarity
Understanding speech well enough to convert it to text has always been a tough task, but the scenario has changed a lot. With the advent of deep learning, transcription has become not only easier but also remarkably accurate. In this article I will share how I built a web app that answers a few spoken questions about me.
One fine evening, I decided to build something that replies on my behalf when somebody asks about me. I wanted to use state-of-the-art models for better and more robust performance, so I landed on the HuggingFace Model Hub. I decided to use Wav2Vec2 for speech recognition and Sentence-Transformers for embedding sentences, and to deploy everything in a Gradio app to make inference quick and easy.
Initially, I collected a few question-answer pairs that could be asked. I embedded the questions using sentence-transformers and stored the embeddings. Now, when an audio file is passed to Wav2Vec2, it transcribes it. The transcribed text is then embedded using sentence-transformers, and the answer corresponding to the nearest question embedding is returned (in the case above, the answer to ‘Where do you study?’).
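To make the "nearest question" step concrete, here is a minimal sketch of the lookup in plain Python. It assumes cosine similarity as the distance measure (the article does not say which metric was used), and the tiny 3-dimensional vectors and placeholder answers are hypothetical stand-ins for real sentence embeddings and real answers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_answer(query_embedding, question_embeddings, answers):
    """Return the answer whose question embedding is closest to the query."""
    scores = [cosine_similarity(query_embedding, q) for q in question_embeddings]
    best = max(range(len(scores)), key=scores.__getitem__)
    return answers[best], scores[best]


# Toy vectors standing in for real embeddings (assumption, for illustration).
question_embeddings = [
    [0.9, 0.1, 0.0],  # "Where do you study?"
    [0.0, 0.2, 0.9],  # a hypothetical second question
]
answers = ["<answer about studies>", "<answer to the second question>"]

# The query vector plays the role of the embedded transcript.
answer, score = nearest_answer([0.8, 0.2, 0.1], question_embeddings, answers)
print(answer)
```

In the real app the vectors would come from the sentence-transformers model, but the selection logic stays the same: score the query against every stored question and return the answer at the best-scoring index.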
Let's now dive into the code section. The code is pretty easy. You first need to install a few libraries.
$ pip install transformers sentence-transformers gradio
After installing the libraries, let's write a script to load the models we want.
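A loading script along these lines might look like the sketch below. The two checkpoint names are assumptions, since the article does not say which Wav2Vec2 or sentence-transformers checkpoints were used, and `load_models` is a hypothetical helper name. The imports sit inside the function so that the heavy libraries are only loaded when the models are actually needed.

```python
# Checkpoint names are assumptions; swap in whichever models you prefer.
ASR_CHECKPOINT = "facebook/wav2vec2-base-960h"
EMBEDDER_CHECKPOINT = "all-MiniLM-L6-v2"


def load_models(asr_checkpoint=ASR_CHECKPOINT,
                embedder_checkpoint=EMBEDDER_CHECKPOINT):
    """Download (on first use) and return the processor, ASR model, and embedder."""
    from sentence_transformers import SentenceTransformer
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # The processor bundles the feature extractor and the CTC tokenizer.
    processor = Wav2Vec2Processor.from_pretrained(asr_checkpoint)
    asr_model = Wav2Vec2ForCTC.from_pretrained(asr_checkpoint)
    asr_model.eval()  # inference only, so disable dropout

    embedder = SentenceTransformer(embedder_checkpoint)
    return processor, asr_model, embedder
```

Calling `load_models()` once at startup and reusing the returned objects avoids reloading the weights on every request.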
Now, we create a utility script that contains a function to parse the audio file. It also contains the question-answer pairs.
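As a rough sketch of what such a utility script could contain: the version below assumes `librosa` for loading and resampling audio (it is not in the install line above and would need `pip install librosa`), a 16 kHz Wav2Vec2 checkpoint, and greedy CTC decoding. Only the first question comes from this article; the second question, the answer strings, and all function names are hypothetical placeholders.

```python
# Question-answer pairs. Only the first question appears in the article;
# the second question and every answer string are placeholders.
QA_PAIRS = {
    "Where do you study?": "<your answer here>",
    "What are your hobbies?": "<your answer here>",
}
QUESTIONS = list(QA_PAIRS)
ANSWERS = list(QA_PAIRS.values())

SAMPLE_RATE = 16_000  # assumption: the checkpoint expects 16 kHz audio


def normalize_transcript(text):
    """Lowercase and collapse whitespace (some checkpoints emit upper-case text)."""
    return " ".join(text.lower().split())


def transcribe(audio_path, processor, asr_model):
    """Turn an audio file into text with a Wav2Vec2 CTC model."""
    import librosa  # assumption: used only to load and resample the audio
    import torch

    speech, _ = librosa.load(audio_path, sr=SAMPLE_RATE)
    inputs = processor(speech, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = asr_model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
    return normalize_transcript(processor.batch_decode(predicted_ids)[0])


def answer_for(text, embedder, question_embeddings):
    """Embed the transcript and return the answer to the closest question."""
    from sentence_transformers import util

    query_embedding = embedder.encode(text, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, question_embeddings)[0]
    return ANSWERS[int(scores.argmax())]
```

Here `question_embeddings` would be computed once, for example with `embedder.encode(QUESTIONS, convert_to_tensor=True)`, and then reused for every incoming audio file.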
Let’s create a Gradio app now to run inference with our work. Gradio is an easy yet powerful library for serving your ML models, so if you are finding it hard to put together a quick demo of your models, I highly recommend checking out Gradio. With a few lines of code shown below, our project is live.
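A Gradio app for this pipeline can be sketched as follows. It assumes a Gradio version that provides `gr.Audio(type="filepath")` (3.x or later); `reply` is a hypothetical stub standing in for the real transcribe-embed-lookup function, and the title string is illustrative.

```python
def reply(audio_path):
    """Stub: the real version would transcribe, embed, and look up the answer."""
    if audio_path is None:
        return "Please record or upload a question first."
    return f"(the answer for {audio_path} would be looked up here)"


def build_demo(answer_fn):
    """Wrap a function that maps an audio file path to text in a Gradio UI."""
    import gradio as gr

    return gr.Interface(
        fn=answer_fn,
        inputs=gr.Audio(type="filepath"),  # hands the function a path on disk
        outputs="text",
        title="Ask me a question",
    )
```

Running `build_demo(reply).launch()` starts a local web server and prints its URL; swapping `reply` for the real pipeline function is the only change needed to go from stub to working demo.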
If you want to see a demo video of this project, please visit the link below, where I have shared the code and the embeddings for a few questions. And please star the project if you like the idea.