Article
Open Vocabulary Object Detection Enabled By OWL-ViT
Neural Network
[Libra] A New Multimodal Design Of Large Language Models Using Separate Vision Systems
Large Language Models
[DrHouse] Diagnostic System Using Sensor Information And Expertise
Medical
A Comprehensive Survey Of The Current Status And Challenges Of AI-Based Predictive Maintenance In The Steel Industry
Prediction Model
Proposal Of An Optimization Method For Activation Functions And CRReLU Using Information Entropy
Loss Function
[For Everyone To Enjoy The Convenience...] Speaker Adaptation Of Dysarthric Speech Using Whisper
Speech Recognition For The Dysarthric
[You're Using Wav2vec2 For This?] It Makes Feature Extraction Of Dysarthric Speech More Efficient!
Speech Recognition For The Dysarthric
A Paper That Overturns Conventional Wisdom! The Classification Of Dysarthria Was Based On Noise, Not Characteristics!
Speech Recognition For The Dysarthric
Equal Access To Convenience! "EasyCall Corpus", A Speech Corpus For The Dysarthric
Speech Recognition For The Dysarthric
Question The "norm"! Noise Suppression Using Ultra-low Complexity DNN
Noise Suppression
The Time Has Come For Everyone To Speak English! Zero-shot Text-to-speech Technology For Multiple Languages Makes It Easy For Anyo ...
Speech Recognition For The Dysarthric
MVANet: The Most Powerful Model For Background Removal
Neural Network
[Zero-shot Learning] AI Voice Cloning And Lip-syncing Verification And Explanation
Neural Network
PosterLlama: Ability To Design Language Models And Generate Content-aware Layouts
Layout-gen
MaskDiT: Low Learning Cost Diffusion Model For Image Generation
Image Generation
Roadmap For Learning From Demonstrations Of Robot Operations For The Manufacturing Industry
Robot
E-commerce Background Image Generation Based On Product Category And Brand Style
Image Generation