Image Caption
Vript-Hard, A New Benchmark For Testing Comprehension Of Long-form Video
Large Language Models
LAVE, An Agent-assisted Video Editing Tool That Utilizes LLM
Large Language Models
YesBut: The Emergence Of A Dataset That Makes The VLM Understand Irony And Caricature!
Dataset
From Face Recognition To Age Estimation, Potential Biometric Technologies Using ChatGPT-4
Large Language Models
[Set-of-Mark Visual Prompting] Prompting Technology To Enhance GPT-4V's Image Recognition Capability
Prompting Method
[CoDi] Any-to-any Diffusion Model That Can Handle Almost Any Modality
Diffusion Model
Generating 3D Objects From Text - DreamFusion
3D
Summary Of Image Caption Generation Techniques From Attention To GAN-based Methods
Image Caption