Catch up on the latest AI articles

LongVie: A New Era Of 1-minute Ultra-High Quality Video Generation Realized By Multimodal Control

Skywork UniPic: Next-generation Multimodal Model That Integrates Image Understanding, Generation, And Editing With High Efficiency

Democratizing GPT-4o Level Image Generation: The Janus-4o And ShareGPT-4o-Image Challenge

Stable Flow: Visualization Of The "really Important Layers" Behind Image Generation

[Zero-shot Learning] AI Voice Cloning And Lip-syncing Verification And Explanation

29/01/2025 Neural Network

MaskDiT: Low Learning Cost Diffusion Model For Image Generation

27/01/2025 Image Generation

E-commerce Background Image Generation Based On Product Category And Brand Style

17/01/2025 Image Generation

MimicBrush, A New Image Editing Method "Imitative Editing" Is Proposed

16/01/2025 Image Editing

Object Background Generation Using Text-2-Image Diffusion Model

10/01/2025 Image Generation

MicroDiffusion: A Thousand-dollar Generative Image Quality Model That Outperforms Multi-million-dollar Models

25/12/2024 Image Generation

[SKETCHPAD] Enhanced Inference Of Multimodal Language Models With Intermediate Sketches

18/12/2024 Large Language Models

[Plot2Code] Benchmark For Testing Multimodal LLM Code Generation

17/12/2024 Large Language Models

[LDDGAN] Diffusion Model With The Highest Speed Inference

29/09/2024 Diffusion Model

GenTron: Diffusion Transformers For Image And Video Generation

26/08/2024 Image Generation

How Frame Interpolation AI Technologies RIFE & IFNet Work And How To Use Them

20/08/2024 Image Generation

"AVI-Talking" Generates Natural 3D Talking Faces From Audio

17/08/2024 Face Recognition

Disentangled Diffusion: T2I Model To Extract Multiple Concepts From A Single Image

26/05/2024 Image Generation

U-ViT: ViT Backbone For Diffusion Models

23/05/2024 Image Generation