Publications - Sewade Ogun's Website

2024

Generating diverse synthetic data for ASR training data augmentation

Ogun, S. (2024). PhD thesis.

Performant ASR Models for Medical Entities in Accented Speech

Afonja, T., Olatunji, T., Ogun, S., Etori, N. A., Owodunni, A., & Yekini, M. (2024). Performant ASR Models for Medical Entities in Accented Speech. Interspeech 2024.

paper | model link

1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis

Ogun, S., Owodunni, A. T., Olatunji, T., Alese, E., Oladimeji, B., Afonja, T., ... & Adewumi, T. (2024). 1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis. Interspeech 2024.

paper | model link

STATE OF THE ART. THINK BEFORE LOADING.

Guillaume Coiffier, Sewade Ogun, Leo Valque, Priyansh Trivedi. STATE OF THE ART. THINK BEFORE LOADING, 2024, ISBN 978-2-9591975-0-5. hal-04509255

paper

Pre-print

An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR

Ogun, S., Colotte, V., Vincent, E. Preprint

2023

Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS

Ogun, S., Colotte, V., & Vincent, E. (2023). Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS. Proc. INTERSPEECH 2023, 4878-4882, doi: 10.21437/Interspeech.2023-1673

paper | source code

Can we use Common Voice to train a Multi-Speaker TTS system?

Ogun, S., Colotte, V., & Vincent, E. (2023, January). Can we use Common Voice to train a Multi-Speaker TTS system? In 2022 IEEE Spoken Language Technology Workshop (SLT) (pp. 900-905). IEEE.

paper | dataset link

2021

Towards a weakly-supervised learning paradigm for speech recognition

Ndoye, A., Ogun, S., Adi, Y., & Cisse, M. (2021, March). Towards a weakly-supervised learning paradigm for speech recognition. Master's thesis.

paper | source code