2-Bit VPTQ: 6.5x Smaller LLMs While Preserving 95% Accuracy
Very accurate 2-bit quantization for running 70B LLMs on a 24 GB GPU
![2-Bit VPTQ: 6.5x Smaller LLMs While Preserving 95% Accuracy](https://miro.medium.com/v2/resize:fit:1143/0*MkC-gnCl8vmVVs9x.png)
Feb 5, 2025
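As a sketch of what running a 2-bit VPTQ model looks like in practice: pre-quantized VPTQ checkpoints published on the Hugging Face Hub can be loaded through `transformers` once the `vptq` package is installed (`pip install vptq`). The model ID below is illustrative, not taken from the article; substitute whichever 2-bit VPTQ checkpoint you actually use.

```python
# Minimal sketch, assuming `vptq` and a recent `transformers` are installed,
# and that a pre-quantized 2-bit VPTQ checkpoint exists on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository name for a 2-bit VPTQ Llama 3.1 70B checkpoint.
model_id = "VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-0-woft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # compute dtype; the weights stay ~2-bit
    device_map="auto",          # fit the layers onto the 24 GB GPU
)

prompt = "Explain vector quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The arithmetic behind the headline claim: at roughly 2 bits per parameter, a 70B model's weights take about 70e9 × 2 / 8 ≈ 17.5 GB, versus ~140 GB in 16-bit, which is how it fits on a single 24 GB card with room left for activations and the KV cache.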