Figure 1

Visual comparisons between FLUX and 1.58-bit FLUX. 1.58-bit FLUX demonstrates comparable generation quality to FLUX while employing 1.58-bit quantization, where 99.5% of the 11.9B parameters in the vision transformer are constrained to the values +1, -1, or 0. For consistency, all images in each comparison are generated using the same latent noise input. 1.58-bit FLUX utilizes a custom 1.58-bit kernel.
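A side-by-side comparison of this kind can be reproduced by giving both models the same seeded noise. The sketch below uses the Hugging Face diffusers FluxPipeline and is illustrative only; it is not the paper's code, and the 1.58-bit checkpoint is not publicly identified in this excerpt, so only the FP baseline is shown.

```python
# Minimal sketch: fixing the latent noise so two models can be compared fairly.
# Assumes diffusers with FluxPipeline support; not the authors' implementation.
import torch
from diffusers import FluxPipeline

prompt = "a photo of a red panda reading a book"
seed = 0  # the same seed yields the same initial latent noise for every model

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt,
    height=1024,
    width=1024,
    generator=torch.Generator("cpu").manual_seed(seed),  # fixed latent noise
).images[0]

# A 1.58-bit variant would swap in the quantized transformer (and its custom
# kernel) while reusing the identical seed, so any visual difference comes
# from the weights rather than from the sampled noise.
```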

Abstract

We present 1.58-bit FLUX, the first successful approach to quantizing the state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance for generating 1024 × 1024 images. Notably, our quantization method operates without access to image data, relying solely on self-supervision from the FLUX.1-dev model. Additionally, we develop a custom kernel optimized for 1.58-bit operations, achieving a 7.7× reduction in model storage, a 5.1× reduction in inference memory, and improved inference latency. Extensive evaluations on the GenEval and T2I CompBench benchmarks demonstrate the effectiveness of 1.58-bit FLUX in maintaining generation quality while significantly enhancing computational efficiency.
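The abstract does not spell out the quantization rule itself. For reference, the sketch below shows one standard way to map weights to {-1, 0, +1}: the absmean rounding used by BitNet b1.58. It is a minimal illustration under that assumption, not the paper's method, and the helper names (quantize_ternary, pack_2bit, linear_ternary) are hypothetical.

```python
# Illustrative ternary (1.58-bit) weight quantization in the BitNet b1.58
# absmean style; the paper's actual quantization function and kernel are not
# specified in this excerpt.
import torch

def quantize_ternary(w: torch.Tensor, eps: float = 1e-8):
    """Map a float weight tensor to codes in {-1, 0, +1} plus a per-tensor scale."""
    scale = w.abs().mean().clamp(min=eps)     # absmean scaling factor
    q = (w / scale).round().clamp(-1, 1)      # ternary codes
    return q.to(torch.int8), scale

def pack_2bit(q: torch.Tensor) -> torch.Tensor:
    """Pack ternary codes at 2 bits each (4 codes per byte) to illustrate
    why storage drops by roughly 8x versus 16-bit weights."""
    u = (q + 1).to(torch.uint8).flatten()     # {-1, 0, +1} -> {0, 1, 2}
    pad = (-u.numel()) % 4
    u = torch.cat([u, u.new_zeros(pad)]).view(-1, 4)
    return u[:, 0] | (u[:, 1] << 2) | (u[:, 2] << 4) | (u[:, 3] << 6)

def linear_ternary(x: torch.Tensor, q: torch.Tensor, scale: torch.Tensor):
    """Dequantized matmul y = x @ (scale * q)^T; a real 1.58-bit kernel would
    instead operate on the packed codes directly."""
    return x @ (scale * q.to(x.dtype)).t()
```

At 2 bits per ternary code, a 16-bit checkpoint shrinks by about 8×; the reported 7.7× is broadly consistent with such packing once the remaining ~0.5% of parameters and the scales stay in higher precision, though the paper's actual storage format is not given here.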

Figure 2

Efficiency measurements on the transformer part.

Table 3

Latency measurements on the transformer part.

Figure 3

Visual comparisons on the GenEval dataset.

Figure 4

Visual comparisons on the val split of T2I CompBench.

Tables 1 and 2

Performance measurements.