[Google × Meta] XLS-R Large-scale Model For Speech Recognition And Speech Translation 21/09/2024 Speech Recognition For The Dysarthric
Fusion Of Speech And Image! Does The Multimodal Method "AV-HuBERT" Shine In Speech Recognition For The Dysarthric? 31/08/2024 Speech Recognition For The Dysarthric
Artificial Intelligence Developed By Meta! How Well Does The "HuBERT" Model, Which Is Different From Conventional Self-supervised ... 29/08/2024 AI For Science
[BitNet B1.58] Achieved Accuracy Better Than Llama By Expressing Model Parameters In Three Values! 27/08/2024 Large Language Models
"AVI-Talking" Generates Natural 3D Talking Faces From Audio 17/08/2024 Face Recognition
[Zero-Shot Transition Learning] Innovative Technology For Speech Recognition Of Unlearned Languages From Multilingual Corpus Data! 07/08/2024 Speech Recognition For The Dysarthric
Generating Dysarthric Speech! What Is The Magic Data Extension Technology To Solve The Shortage Of Training Data? 26/07/2024 Sound
[Unit-DSR] Normalization Of Disabled Speech To Normal Speech By HuBERT 26/07/2024 Self-supervised Learning
[HiFi-GAN] GAN-based Vocoder Capable Of Generating 22 kHz Audio On A Single GPU 10/07/2024 Speech Synthesis
[Mustango] Music Generation Model Utilizing Domain Knowledge Of Music 01/07/2024 Audio And Speech Processing
[VoiceCraft] A Language Model That Synthesizes Natural Speech At The Highest Level In The Industry 01/07/2024 Speech Synthesis
The Secrets Of Speech Recognition Technology 24/04/2024 Voice Recognition
AI's Cambrian Explosion: The Key To The Era Of Finding And Utilizing Useful AI Creators 18/03/2024 Video Generation
[MusicLDM] Text-to-Music Model With Low Risk Of Plagiarism 22/01/2024 Diffusion Model
[AudioLDM] Text-to-Audio Generation Model Using Latent Diffusion 16/01/2024 Diffusion Model
[CoDi] Any-to-any Diffusion Model That Can Handle Almost Any Modality 12/01/2024 Diffusion Model
[CLAP] Contrastive Learning Model Of Speech And Text 21/12/2023 Contrastive Learning
[Brain2Music] Automatic Music Generation Based On Brain Information 06/12/2023 Large Language Models