Image Recognition
Toward AI That Doesn't Forget Images, CoMemo Pioneers Next-generation Vision And Language Models
PictSure: A New Method To Challenge Few-Shot Classification With The Power Of Visual Embedding
UnifiedCrawl: A New Approach To Low-Resource Language Data Collection And Efficient LLM Adaptation
Other
Insight-V: A New Strategy For Multimodal Reasoning Connecting Vision And Thought
Stable Flow: Visualization Of The "Really Important Layers" Behind Image Generation
Open Vocabulary Object Detection Enabled By OWL-ViT
Neural Network
[Libra] A New Multimodal Design Of Large Language Models Using Separate Vision Systems
Large Language Models
MVANet: The Most Powerful Model For Background Removal
Neural Network
[Zero-shot Learning] AI Voice Cloning And Lip-syncing Verification And Explanation
Neural Network
MaskDiT: Low Learning Cost Diffusion Model For Image Generation
Image Generation
E-commerce Background Image Generation Based On Product Category And Brand Style
Image Generation
MimicBrush: "Imitative Editing," A Newly Proposed Image Editing Method
Image Editing
Object Background Generation Using Text-2-Image Diffusion Model
Image Generation
Giving LLMs A Whiteboard To Write Down Their Reasoning Process Greatly Improves Their Visual Reasoning Ability!
Prompting Method
MicroDiffusion: A Thousand-dollar Generative Image Quality Model That Outperforms Multi-million-dollar Models
Image Generation
Human-robot Cooperative Assembly Realized By Large Language Models
Robot
[GenAI-Arena] New Platform To Evaluate Generative Models By User Votes
Large Language Models