IPA output? #1019

gtoal · 2023-06-16T00:56:42Z

I'm assuming Whisper goes from speech directly to text? Is there any option to generate a textual representation of the phonemes or allophones first, i.e. in IPA, similar to the Allosaurus project (https://github.com/xinjli/allosaurus)? I've been working on spelling correction from phonetic representations of misspelled words, and I'd like to try adapting it to convert transcribed speech to text (though only for personal interest at this point, as your speech to text is incredibly good compared to anything I've seen in the past, so I'm not sure there's much left to do!).

Graham
PS Later note: I was incredibly lucky with the first speech sample I tested when I wrote the note above. It was close to 100% accurate; it just missed a proper name (Dustin Sekula Library was recorded as "Dustin's secular library" :-) ). Subsequent transcriptions of real data (i.e. my own speech, or listening to the television) have been significantly worse. Perhaps that first conversion was near perfect because it was an answering machine recording of an automated alert system read out by a synthesized voice. Anyway, I mention this just to say that there clearly is some point to my doing some experiments with my IPA-to-text code, if an IPA transcription of the speech is any good, that is. The one in Allosaurus was not, even allowing for a phonetic similarity distance metric in the word reconstruction.
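To make "phonetic similarity distance metric in the word reconstruction" concrete, here is a minimal Python sketch of the general idea: a weighted edit distance over phoneme symbols, where swapping similar-sounding phonemes costs less than swapping unrelated ones. This is not my actual code and not anything from Allosaurus or Whisper. The `SIMILAR` groups, the 0.5 cost, the function names and the toy lexicon transcriptions are all hypothetical, chosen only for illustration.

```python
# Hypothetical groups of similar-sounding phonemes (illustrative, not a real feature table).
SIMILAR = [set("pb"), set("td"), set("kg"), set("sz"), set("fv"),
           set("iɪ"), set("eɛ"), set("uʊ")]


def sub_cost(a, b):
    """Substitution cost: free if identical, cheap if in the same group, else full."""
    if a == b:
        return 0.0
    if any(a in group and b in group for group in SIMILAR):
        return 0.5  # assumed cost for a "near miss"
    return 1.0


def phonetic_distance(x, y):
    """Weighted Levenshtein distance between two phoneme sequences."""
    prev = [float(j) for j in range(len(y) + 1)]  # distance from empty prefix of x
    for i, a in enumerate(x, 1):
        cur = [float(i)]
        for j, b in enumerate(y, 1):
            cur.append(min(prev[j] + 1.0,               # delete a
                           cur[j - 1] + 1.0,            # insert b
                           prev[j - 1] + sub_cost(a, b)))  # substitute a -> b
        prev = cur
    return prev[-1]


def best_match(ipa, lexicon):
    """Return the lexicon word whose pronunciation is closest to `ipa`."""
    return min(lexicon, key=lambda word: phonetic_distance(ipa, lexicon[word]))


# Toy lexicon with made-up, simplified transcriptions.
LEXICON = {"secular": "sɛkjʊlə", "sekula": "sɛkulə"}
```

Calling `best_match("zɛkulə", LEXICON)` picks the entry with the lowest weighted distance to the recognised phoneme string. A real version would need a proper phoneme tokenizer (so diacritics and length marks stay attached to their base symbol) and a similarity table derived from articulatory features rather than hand-picked pairs.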

MaciejKucia · 2023-12-06T08:25:21Z

Neural IPA would be cool, but AFAIK nobody has done it.
Relevant conversation: openai/whisper#318 (comment)

1 reply

ggerganov Dec 6, 2023
Maintainer

Maybe the new grammar support can be utilized: #1229
