Model Compression
Ultra-Sparse Memory Network: A New Method To Improve Transformer Memory Efficiency
Hymba, A New Architecture That Pushes The Limits Of Small LLMs
Stable Flow: Visualization Of The "Really Important Layers" Behind Image Generation
Cross-Layer Attention Significantly Reduces Transformer Memory
Transformer
[SA-FedLoRA] Communication Cost Reduction Methods For Federated Learning
Medical
The Compressed Sensing Revolution: Automatic Validation Algorithms Prove Accuracy Of Neural Networks
Neural Network
[BitNet B1.58] Achieved Accuracy Better Than Llama By Expressing Model Parameters In Three Values!
Large Language Models
Apple's Efficient Inference Of Large Language Models On Devices With Limited Memory Capacity
Large Language Models
I-ViT: Compute ViT In Integer Arithmetic!? Shiftmax And ShiftGELU, Which Evolved From I-BERT Technology, Are Also Available!
Transformer
How Does Pruning An ImageNet Pre-trained Model Affect Downstream Tasks?
Pruning
Architectural Exploration Method For Neural Nets Running On IoT Devices
NAS
Model Compression For Unconditional-GAN
GAN (Generative Adversarial Network)
Drop Out Layers, Not Weights Or Nodes! The "LayerDrop" Proposal
Dropout
Run GANs On Your Phone! 'GAN Slimming', A Combination Of Compression Techniques To Reduce Model Size
GAN (Generative Adversarial Network)
BERT For The Poor: Compressing Complex Models With Simple Techniques To Maximize Performance With Limited ...
Pruning