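Comparative study of transcription methods for spoken corpora: the case of Spanish (original title: Estudio comparativo de métodos de transcripción para corpus orales: el caso del español)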

Marimar Rufino Morales

doi:10.26378/rnlael1429406

Authors

Marimar Rufino Morales Université de Montréal

DOI:

https://doi.org/10.26378/rnlael1429406

Keywords:

respeaking, transcription, automatic speech recognition, speech-to-text software, spoken corpus

Abstract

Technological advances have driven changes in transcription research methodology. Language corpus tools based on statistical models and deep learning have improved the alignment and annotation phases. However, when it comes to transcribing the material itself, the interpretive load and the very nature of conversation hinder automation of the process. For that reason, interviews used for studying spoken language are still transcribed with a media player and keyboard, which can be one of the most time-consuming stages of data processing. In other professional contexts, automatic speech recognition is used to transcribe effectively through human-computer collaboration. The techniques and strategies may differ, but they all compensate for the fluctuating performance of the computing tools and are faster than other methods. In this study, the off-line respeaking method was used to transcribe the interviews of the Spoken Corpus of the Spanish Language in Montreal. Transcription times and accuracy were measured and compared with those of automatic speech recognition and typing. Off-line respeaking, using automatic speech-to-text software in its current state, proved to be the fastest method for transcribing interviews and the one that produced the fewest errors.


Author Biography

Marimar Rufino Morales, Université de Montréal

She is a PhD student in Hispanic Studies at the Université de Montréal and teaches Spanish-French translation. After graduating from the EUTI in Granada, she specialized in audiovisual translation in Canada, where she has been subtitling pre-recorded television for over 25 years and live television for 15. Within her line of research on variation in Spanish, she is interested in optimizing the transcription of spoken language with the help of intelligent technology.
