End-to-end Speech Translation "NeurST"

3 main points 
✔️ An open-source toolkit for neural speech translation.
✔️ Easy-to-use and flexible end-to-end speech translation system.
✔️ Setups for benchmarking, feature extraction, data preprocessing, distributed training, and more.

NeurST: Neural Speech Translation Toolkit
written by Chengqi Zhao, Mingxuan Wang, Lei Li
(Submitted on 18 Dec 2020 (v1))
Comments: arXiv:2012.10018 [cs.CL]
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)


Neural speech translation (NST) is an important application of deep learning. The common approach to NST is a cascade of separate Automatic Speech Recognition (ASR) and Neural Machine Translation (NMT) models. This approach is prone to error propagation, i.e., an error in the ASR output is passed on to the NMT model and usually degrades the translation. A more recent end-to-end approach transforms speech directly into translated text, which mitigates error propagation. It also reduces the overall model size, which makes deployment easier. Despite the impressive performance of end-to-end models, benchmark results are inconsistent across research works. This is due to the complexity of preprocessing audio data, which involves tricky data augmentation and pre-training. The NeurST toolkit aims to solve these problems.
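The difference between the two approaches can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `toy_asr`, `toy_nmt`, `cascade`, and `end_to_end` are hypothetical stand-ins built on dictionary lookups, not real models and not part of NeurST, and "audio" is simply a list of word-like tokens. The sketch only shows where a recognition error enters a cascade; it says nothing about the accuracy of real systems.

```python
from typing import Dict, List

# Toy stand-ins, NOT real models: "audio" is a list of word-like tokens.
Audio = List[str]


def toy_asr(audio: Audio, confusions: Dict[str, str]) -> List[str]:
    """Hypothetical ASR: copies each token, applying any recognition error."""
    return [confusions.get(token, token) for token in audio]


def toy_nmt(words: List[str], lexicon: Dict[str, str]) -> List[str]:
    """Hypothetical NMT: word-by-word lookup; unknown words become <unk>."""
    return [lexicon.get(word, "<unk>") for word in words]


def cascade(audio: Audio, confusions: Dict[str, str],
            lexicon: Dict[str, str]) -> List[str]:
    """Cascaded pipeline: the NMT stage only ever sees the ASR transcript."""
    transcript = toy_asr(audio, confusions)
    return toy_nmt(transcript, lexicon)


def end_to_end(audio: Audio, speech_lexicon: Dict[str, str]) -> List[str]:
    """Hypothetical direct model: no intermediate transcript is produced."""
    return [speech_lexicon.get(token, "<unk>") for token in audio]


if __name__ == "__main__":
    audio = ["i", "like", "tea"]
    lexicon = {"i": "ich", "like": "mag", "tea": "Tee", "tree": "Baum"}

    # The ASR stage mishears "tea" as "tree"; the NMT stage translates the
    # wrong word faithfully, so the error propagates to the final output.
    print(cascade(audio, {"tea": "tree"}, lexicon))  # ['ich', 'mag', 'Baum']

    # The direct mapping has no transcript stage where this error could occur.
    print(end_to_end(audio, lexicon))  # ['ich', 'mag', 'Tee']
```

In the cascade, the translation stage has no access to the original audio, so it cannot recover from the mistaken transcript. A real end-to-end model can still make its own errors; it simply avoids this particular hand-off.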

NeurST provides implementations of state-of-the-art Transformer-based models and includes feature extraction, data preprocessing, training, and inference modules, enabling researchers to reproduce the benchmark results. It is implemented in TensorFlow 2.


Thapa Samrat
I am a second-year international student from Nepal, currently studying at the Department of Electronic and Information Engineering at Osaka University. I am interested in machine learning and deep learning, and I write articles about them in my spare time.
