Exploring Transfer of Self-supervised English Pretrained Model to Yoruba and French Languages in Very Low-Resource Settings
- We use a self-supervised pretrained model trained on a large amount of English speech
- The model is finetuned on 1 hour of labeled Yoruba and French audio
- We compare this to training from scratch on the same 1 hour of audio in the target language
- All audio samples come from a single speaker
- Surprisingly, finetuning on a single speaker gave better results for Yoruba. Many factors could have contributed to this, including speaker gender (male for Yoruba, female for French), quality of the audio recordings, etc.
- Details of the project can be found in my GitHub repo
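The transfer setup in the bullets above (a frozen encoder pretrained with self-supervision, then a small head finetuned on a little labeled target-language audio) can be sketched in miniature. Everything here is a toy stand-in, not the actual model: the "pretrained encoder" is a fixed random projection, and the "1 hour of labeled audio" is synthetic feature frames with random labels; only the linear head is updated during finetuning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a self-supervised encoder pretrained on English
# speech: a fixed random projection with ReLU, frozen during finetuning.
D_IN, D_FEAT, N_CLASSES = 40, 64, 10  # e.g. 40 mel bins -> 64-dim features

W_enc = rng.normal(0, 1.0 / np.sqrt(D_IN), (D_IN, D_FEAT))  # frozen weights

def encode(x):
    """Frozen 'pretrained' encoder: its weights are never updated."""
    return np.maximum(x @ W_enc, 0.0)

# Toy stand-in for the small labeled target-language set: synthetic frames.
X = rng.normal(size=(256, D_IN))
y = rng.integers(0, N_CLASSES, size=256)

# Finetune only a linear softmax head on top of the frozen encoder.
W_head = np.zeros((D_FEAT, N_CLASSES))

def loss_and_probs(W):
    logits = encode(X) @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(len(y)), y]).mean()
    return loss, p

losses = []
for _ in range(200):  # plain gradient descent on the cross-entropy loss
    loss, p = loss_and_probs(W_head)
    losses.append(loss)
    grad = encode(X).T @ (p - np.eye(N_CLASSES)[y]) / len(y)
    W_head -= 0.5 * grad

print(f"head finetuning loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Training from scratch, by contrast, would also update `W_enc`; with only an hour of data that usually overfits or underperforms, which is what the comparison in the project probes.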