The Secrets Of Speech Recognition Technology 24/04/2024 Voice Recognition
AI's Cambrian Explosion: The Key To The Era Of Finding And Utilizing Useful AI Creators 18/03/2024 Video Generation
[MusicLDM] Text-to-Music Model With Low Risk Of Plagiarism 22/01/2024 Diffusion Model
[AudioLDM] Text-to-Audio Generation Model Using Latent Diffusion 16/01/2024 Diffusion Model
[CoDi] Any-to-any Diffusion Model That Can Handle Almost Any Modality 12/01/2024 Diffusion Model
[CLAP] Contrastive Learning Model Of Speech And Text 21/12/2023 Contrastive Learning
[Brain2Music] Automatic Music Generation Based On Brain Information 06/12/2023 Large Language Models
[LP-MusicCaps] Automatic Generation Of Music Captions Using LLM 20/11/2023 Contrastive Learning
[MuLan] Multimodal Music-Text Using Contrastive Learning 24/10/2023 Contrastive Learning
[MusicLM] Text-to-Music Generation Model Developed By Google. 18/10/2023 Transformer
[Make-An-Audio] Prompt-enhanced Diffusion Model For Speech Generation. 16/10/2023 Diffusion Model
Multimodal Emotion Recognition From Text, Voice And Vision: Sony's Proposed M2FNet! 31/01/2023 Emotion Recognition
How Should We Link Different Resolution Features? : D3Net Proposed By Sony 30/01/2023 CVPR
Text To Speech Methods That Run On Fewer Computational Resources 05/10/2022 NAS
A 3D Mesh Of A Face Resembling The Speaker Can Be Generated From Speech Alone 19/08/2022 3D
Now There's A Technique For Editing The Facial Movements Of Characters In A Video To Match Any Emotion! 05/08/2022 CVPR
More Realistic Facial 3D Animations Can Be Generated From Audio! 01/08/2022 3D
FreeMo, A Model That Automatically Generates Upper Body Gestures In Response To Speech, Is Here! 19/07/2022 Speech Synthesis