Transformer
Hymba, A New Architecture That Pushes The Limits Of Small LLMs
Insight-V: A New Strategy For Multimodal Reasoning Connecting Vision And Thought
Stable Flow: Visualization Of The "Really Important Layers" Behind Image Generation
Open Vocabulary Object Detection Enabled By OWL-ViT
Neural Network
Classification Tasks - Extremely Difficult! Use The WHFEMD Algorithm To Accurately And Efficiently Capture And Classify Features O ...
Speech Recognition For The Dysarthric
Giving LLMs A Whiteboard To Write Down Their Reasoning Process Greatly Improves Their Visual Reasoning Ability!
Prompting Method
Cross-Layer Attention Significantly Reduces Transformer Memory
Transformer
YesBut: The Emergence Of A Dataset That Makes The VLM Understand Irony And Caricature!
Dataset
[SCoRe] Reinforcement Learning To Enhance LLM's Ability To Self-correct! Identify And Correct Errors In A Multi-step Process
Large Language Models
AI To Transform Mathematics Education; Possibilities And Challenges Of Solving Mathematical Problems Using Large-Scale Language Mo ...
Large Language Models
A Better Attention Mechanism Will Improve The Performance Of LLM's Long-text Processing!
Large Language Models
[OmniGen] All Image-related Tasks Can Be Performed With Only One Generation Model!
Image Generation
SkySense: Multimodal Remote Sensing Foundation Model
CVPR
Google's High-performance LLM That Compresses Very Long Prompt Sentences To Save Memory
Large Language Models
GenTron: Diffusion Transformers For Image And Video Generation
Image Generation
[BitNet] Large-scale Language Model With 1-bit Inference
BitNet
Self-supervised ViT With Deep Fake Detection
Self-supervised Learning