
Qwen 3.5 FP8 Weights Drop - How to Actually Deploy a 397B Model on 8 GPUs
Alibaba releases official FP8-quantized weights for the Qwen 3.5 flagship and the 27B dense model, roughly halving memory requirements relative to BF16 and enabling single-node deployment on 8x H100 GPUs with native vLLM and SGLang support.
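The headline arithmetic can be sketched as follows. This is a weights-only estimate assuming 80 GB H100s; the `PARAMS` and `GPU_MEM_GB` constants and the `weight_gb` helper are illustrative, and a real deployment also needs headroom for KV cache, activations, and framework overhead:

```python
# Back-of-the-envelope check of the headline claim: why FP8 lets a
# 397B-parameter model fit on 8x H100 when BF16 does not.
# Assumptions (not from the source): 80 GB H100s, weights only.

PARAMS = 397e9        # total parameters
GPUS = 8
GPU_MEM_GB = 80       # H100 80 GB variant (assumption)

def weight_gb(bytes_per_param: float) -> float:
    """Weight memory in GB at a given precision."""
    return PARAMS * bytes_per_param / 1e9

bf16 = weight_gb(2.0)  # 16-bit: 2 bytes per parameter
fp8 = weight_gb(1.0)   # FP8: 1 byte per parameter
budget = GPUS * GPU_MEM_GB

print(f"BF16 weights: {bf16:.0f} GB vs {budget} GB budget -> fits: {bf16 < budget}")
print(f"FP8  weights: {fp8:.0f} GB vs {budget} GB budget -> fits: {fp8 < budget}")
```

The numbers explain the "roughly in half" framing: 794 GB of BF16 weights overflow the 640 GB aggregate of an 8-GPU node, while the same model at FP8 needs about 397 GB and leaves the remainder for KV cache and batching.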