AI's Cambrian Explosion: The Key To The Era Of Finding And Utilizing Useful AI Creators 18/03/2024 Video Generation
[MusicLDM] Text-to-Music Model With Low Risk Of Plagiarism 22/01/2024 Diffusion Model
[AudioLDM] Text-to-Audio Generation Model Using Latent Diffusion 16/01/2024 Diffusion Model
[CoDi] Any-to-any Diffusion Model That Can Handle Almost Any Modality 12/01/2024 Diffusion Model
[CLAP] Contrastive Learning Model Of Speech And Text 21/12/2023 Contrastive Learning
[Brain2Music] Automatic Music Generation Based On Brain Information 06/12/2023 Large Language Models
[LP-MusicCaps] Automatic Generation Of Music Captions Using LLM 20/11/2023 Contrastive Learning
[MuLan] Multimodal Music-Text Using Contrastive Learning 24/10/2023 Contrastive Learning
[MusicLM] Text-to-Music Generation Model Developed By Google 18/10/2023 Transformer
[Make-An-Audio] Prompt-enhanced Diffusion Model For Speech Generation 16/10/2023 Diffusion Model
Multimodal Emotion Recognition From Text, Voice And Vision: Sony's Proposed M2FNet! 31/01/2023 Emotion Recognition
How Should We Link Different Resolution Features?: D3Net Proposed By Sony 30/01/2023 CVPR
Text To Speech Methods That Run On Fewer Computational Resources 05/10/2022 NAS
A 3D Mesh Of A Face Resembling The Speaker Can Be Generated From Speech Alone 19/08/2022 3D
Now There's A Technique For Editing The Facial Movements Of Characters In A Video To Match Any Emotion! 05/08/2022 CVPR
More Realistic Facial 3D Animations Can Be Generated From Audio! 01/08/2022 3D
FreeMo, A Model That Automatically Generates Upper Body Gestures In Response To Speech, Is Here! 19/07/2022 Speech Synthesis
Can You Do Deep Learning, Graph Search, And Conditional Optimization With Explosive Speed And Low Power Consumption? Quantitative ... 08/07/2022 Survey