Publications
2024
Generating diverse synthetic data for ASR training data augmentation.
Ogun, S. (2024). PhD thesis.
Performant ASR Models for Medical Entities in Accented Speech
Afonja, T., Olatunji, T., Ogun, S., Etori, N. A., Owodunni, A., & Yekini, M. (2024). Performant ASR Models for Medical Entities in Accented Speech. Interspeech 2024.
paper | model link
1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis
Ogun, S., Owodunni, A. T., Olatunji, T., Alese, E., Oladimeji, B., Afonja, T., ... & Adewumi, T. (2024). 1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis. Interspeech 2024.
paper | model link
STATE OF THE ART. THINK BEFORE LOADING.
Guillaume Coiffier, Sewade Ogun, Leo Valque, Priyansh Trivedi. STATE OF THE ART. THINK BEFORE LOADING, 2024, 978-2-9591975-0-5. hal-04509255
paper
2023
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS.
Ogun, S., Colotte, V., & Vincent, E. (2023). Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS. Proc. INTERSPEECH 2023, 4878-4882, doi: 10.21437/Interspeech.2023-1673
paper | source code
Can we use Common Voice to train a Multi-Speaker TTS system?
Ogun, S., Colotte, V., & Vincent, E. (2023, January). Can we use Common Voice to train a Multi-Speaker TTS system? In 2022 IEEE Spoken Language Technology Workshop (SLT) (pp. 900-905). IEEE.
paper | dataset link
2021
Towards a weakly-supervised learning paradigm for speech recognition
Ndoye, A., Ogun, S., Adi, Y., & Cisse, M. (2021, March). Towards a weakly-supervised learning paradigm for speech recognition. Master's thesis.
paper | source code