A data-centric approach for Portuguese speech recognition: language model and its implications
Date
2023
Abstract
Recent advances in Automatic Speech Recognition
have made it possible to achieve a quality never seen before
in the literature, both for languages with abundant data, such
as English, which has a large number of studies, and for
the Portuguese language, which has a more limited amount
of resources and studies. The most recent advances address
speech recognition with Transformer-based models,
which can perform the speech recognition
task directly from the raw signal, without the need for manual
feature extraction. Some studies have already shown that
the transcription quality of these models can be further improved
by using language models in the decoding stage.
However, the real impact of such language models is still not
clear, especially for Brazilian Portuguese. The quality
of the data used to train the models is also known to be
of paramount importance, yet few works in the
literature address this issue. This work explores the impact of
language models applied to Portuguese speech recognition both
in terms of data quality and computational performance, with
a data-centric approach. We propose an approach to measure
similarity between datasets and, thus, assist in decision-making
during training. The approach indicates paths for advancing
the state of the art in Portuguese speech recognition,
showing that it is possible to reduce the size of the language
model by 80% and still achieve error rates around 7.17%
for the Common Voice dataset. The source code is available at
https://github.com/joaoalvarenga/language-model-evaluation.
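
To make the reported error rate concrete, here is a minimal sketch of how such a figure is commonly computed. It assumes the metric is word error rate (WER), which the abstract does not state explicitly, and it is not taken from the authors' repository. The function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and no text normalization.
    Real evaluations usually lowercase and strip punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcription with one substituted word out of three.
    print(word_error_rate("o gato preto", "o rato preto"))
```

Under this definition, a corpus-level figure is usually obtained by summing edit distances over all utterances and dividing by the total number of reference words, rather than averaging per-utterance rates.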
Keywords
Brazilian Portuguese
Citation
ALVARENGA, J. P. R.; MERSCHMANN, L. H. de C.; LUZ, E. J. da S. A data-centric approach for Portuguese speech recognition: language model and its implications. IEEE Latin America Transactions, v. 21, n. 4, Apr. 2023. Available at: <https://latamt.ieeer9.org/index.php/transactions/article/view/7464>. Accessed: 6 Jul. 2023.