Atlas — Universal Lossless Model Compressor

Feature	Atlas	llama.cpp	AutoAWQ	AutoGPTQ
Auto bit allocation	Per-layer adaptive	Manual	Uniform	Uniform
Quality recovery	LoRA distillation	None	None	None
Quality guarantee	<1% verified	No	No	No
Hardware-aware	Auto-detect + fit	No	No	No
PPL drop (70B, 3.5bit)	~0.7%	~2-3%	~1-2%	~1-2.5%
End-to-end	One command	Multi-step	Script	Script

Compress Model

Select model, set quality target, and let Atlas handle everything.

Model

Size: 140 GB (FP16); Layers: 80; Params: 70.6B

Hardware Target

Quality Target: 99% (slider from Aggressive, smaller, to Lossless, bigger)

Output Format

Compression Progress

Status: Idle
Overall: 0%
Stages: Profile, Plan, Recover, Verify

Layer quantization plan

Layers	Bits
L0-7	5-bit
L8-31	4-bit
L32-59	3-bit
L60-79	2-bit
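
The layer plan above can be checked with a little arithmetic. The following is a minimal sketch, not Atlas code: the names `LAYER_PLAN`, `layers_in`, and `average_bits` are hypothetical, and it assumes every layer holds the same number of parameters. It ignores embeddings, the output head, and any per-group scale metadata, so its result need not match the average bitrate reported on the page.

```python
# Layer plan copied from the text: (first_layer, last_layer, bits), ranges inclusive.
LAYER_PLAN = [(0, 7, 5), (8, 31, 4), (32, 59, 3), (60, 79, 2)]


def layers_in(plan):
    """Count the layers covered by the plan."""
    return sum(last - first + 1 for first, last, _ in plan)


def average_bits(plan):
    """Layer-count-weighted mean bit width.

    Assumption: all layers have the same parameter count.
    """
    total_bits = sum((last - first + 1) * bits for first, last, bits in plan)
    return total_bits / layers_in(plan)


if __name__ == "__main__":
    print("layers covered:", layers_in(LAYER_PLAN))
    print("layer-weighted average bits:", round(average_bits(LAYER_PLAN), 2))
```

Weighting by actual per-layer parameter counts instead of layer counts would be the natural refinement if those counts are available.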

Metric	Value	Note
Estimated Size	9.2 GB	-67.7% from FP16
Avg Bitrate	3.4 bit	Mixed AWQ+HQQ+AQLM
Est. Throughput	8.4 tok/s	on MacBook Air M1
ETA	--:--	Not started

Compression Results

Llama-3.3-70B-Instruct — compressed 2024-06-28

Metric	Value	Note
Quality	99.3%	PPL drop: 0.7%
File Size	9.2 GB	from 28.5 GB
Avg Bitrate	3.4 bit	mixed precision
Speed	8.4 tok/s	MacBook Air M1, 16 GB

Benchmark vs FP16 Baseline

Benchmark	FP16	Atlas	Drop
WikiText-2 Perplexity	5.42	5.46	0.7%
MMLU (5-shot)	79.2%	78.9%	0.3%
ARC-Easy	85.1%	84.7%	0.5%
HellaSwag	82.6%	82.1%	0.6%
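
The page does not say whether each drop is a relative change or an absolute difference in points. The sketch below computes both from the FP16 and Atlas scores listed above, so either reading can be checked. The function and variable names are illustrative assumptions, not part of Atlas.

```python
# (FP16 baseline, Atlas) scores copied from the benchmark figures above.
BENCHMARKS = {
    "WikiText-2 Perplexity": (5.42, 5.46),  # lower is better
    "MMLU (5-shot)": (79.2, 78.9),          # higher is better
    "ARC-Easy": (85.1, 84.7),
    "HellaSwag": (82.6, 82.1),
}


def absolute_change(baseline, compressed):
    """Difference in the benchmark's own units (points, or perplexity)."""
    return compressed - baseline


def relative_change_pct(baseline, compressed):
    """Change as a percentage of the baseline value."""
    return (compressed - baseline) / baseline * 100.0


if __name__ == "__main__":
    for name, (fp16, atlas) in BENCHMARKS.items():
        print(
            f"{name}: absolute {absolute_change(fp16, atlas):+.2f}, "
            f"relative {relative_change_pct(fp16, atlas):+.2f}%"
        )
```

Perplexity rises when quality falls, so its sign is the opposite of the accuracy benchmarks.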

Layer-by-Layer Bit Allocation

80 layers, color = quantization method

Legend: AWQ (5-6 bit), HQQ (3-4 bit), AQLM (2-3 bit), Recovered

Recovery Iterations

Iteration	Weak Layers	Method	PPL Before	PPL After	Delta	Time
#1	L2, L5, L8, L11, L14	LoRA r=8	5.82	5.61	-3.6%	8m 42s
#2	L19, L23, L31, L44	LoRA r=8	5.61	5.51	-1.8%	6m 18s
#3	L62, L71, L78	Bit promote	5.51	5.46	-0.9%	3m 05s
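
The Delta column in the table above is consistent with the relative perplexity change within each iteration. The sketch below recomputes it and adds the cumulative change across all three iterations. The names are hypothetical and the code only reproduces the arithmetic. It does not model the recovery procedure itself.

```python
# (iteration, PPL before, PPL after) copied from the table above.
ITERATIONS = [("#1", 5.82, 5.61), ("#2", 5.61, 5.51), ("#3", 5.51, 5.46)]


def delta_pct(before, after):
    """Relative perplexity change in percent (negative means improvement)."""
    return (after - before) / before * 100.0


def cumulative_delta_pct(iterations):
    """Relative change from the first 'before' to the last 'after'."""
    first_before = iterations[0][1]
    last_after = iterations[-1][2]
    return delta_pct(first_before, last_after)


if __name__ == "__main__":
    for label, before, after in ITERATIONS:
        print(f"{label}: {before} -> {after} ({delta_pct(before, after):+.1f}%)")
    print(f"cumulative: {cumulative_delta_pct(ITERATIONS):+.1f}%")
```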

Run 70B models on your MacBook.

How Atlas Works

Profile

Plan

Recover

Verify

Why Atlas

Fits Your Machine

Smart Bit Allocation

Zero Quality Loss

Guaranteed Quality

End-to-End

MLX + GGUF

Atlas vs Everything Else

Stop guessing bit widths.

Compress Model

Compression Progress

Compression Results

Benchmark vs FP16 Baseline

Layer-by-Layer Bit Allocation

Recovery Iterations

Compression History
