Abstract

We present 1.58-bit FLUX, the first successful approach to quantizing the state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance for generating 1024 x 1024 images. Notably, our quantization method operates without access to image data, relying solely on self-supervision from the FLUX.1-dev model. Additionally, we develop a custom kernel optimized for 1.58-bit operations, achieving a 7.7x reduction in model storage, a 5.1x reduction in inference memory, and improved inference latency. Extensive evaluations on the GenEval and T2I Compbench benchmarks demonstrate the effectiveness of 1.58-bit FLUX in maintaining generation quality while significantly enhancing computational efficiency.
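For readers wondering where "1.58-bit" comes from: a weight restricted to {-1, 0, +1} carries log2(3) ≈ 1.585 bits of information. The sketch below is only an illustration of that idea, not the authors' method. It assumes an absmean-style rounding rule (the one described for BitNet b1.58) and a simple 2-bits-per-weight packing; the function names `ternary_quantize`, `pack_2bit`, and `unpack_2bit` are hypothetical, and the abstract does not specify the actual quantizer or storage layout.

```python
import math

def ternary_quantize(weights):
    """Map floats to {-1, 0, +1} using an absmean scale.

    Assumption: this rounding rule is borrowed from BitNet b1.58 for
    illustration; the abstract does not say which rule the paper uses.
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def pack_2bit(q):
    """Pack ternary values four per byte, storing -1/0/+1 as 0/1/2."""
    out = bytearray()
    for i in range(0, len(q), 4):
        byte = 0
        for j, v in enumerate(q[i:i + 4]):
            byte |= (v + 1) << (2 * j)
        out.append(byte)
    return bytes(out)

def unpack_2bit(data, n):
    """Recover the first n ternary values from the packed bytes."""
    return [((data[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(n)]

weights = [0.9, -0.05, -1.3, 0.4, 0.02, -0.7, 1.1, 0.0]
q, scale = ternary_quantize(weights)
packed = pack_2bit(q)

print(q)                        # ternary weights
print(len(packed))              # bytes used for 8 weights
print(round(math.log2(3), 3))   # information content per ternary weight
```

Under this assumed packing, going from 16-bit to 2-bit weights would shrink storage by at most 8x, which is in the same range as the 7.7x reported in the abstract.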

Paper: https://arxiv.org/abs/2412.18653

Code: https://github.com/Chenglin-Yang/1.58bit.flux (coming soon)

    • db0M
      21 days ago

      None of them are truly open source, but many are free to use without many restrictions. Flux dev unfortunately isn’t one of them; its license doesn’t even allow open source services like the Horde to use it.

      • Rikudou_Sage
        21 days ago

        Hmm, just reading the license, which part forbids the use by free services like the Horde? It seems like it should be allowed.

        • db0M
          21 days ago

          It’s written in a way which forbids people serving it as a service.

          • Rikudou_Sage
            21 days ago

            Well, that’s what I’m saying, it didn’t feel like it to me.

            • db0M
              21 days ago

              We’ve gone through it with a fine-tooth comb. There are statements in there that don’t give any exceptions for free services.

    • @[email protected]OP
      20 days ago

      It’s just an example in a paper to show it can still follow prompts. The images aren’t going to be flawless.

  • @kwilson
    20 days ago

    I’m not an expert on AI, but I’m surprised the comparison photos are so similar. I was expecting the models to come up with completely different images each time. The sky in the dragon pics especially looks copied and pasted.

    • @[email protected]OP
      20 days ago

      The hope is that they’re similar. The pictures on the right come from a smaller version of the model that produced the pictures on the left. This shows that even though they shrank the model, it still understands the same prompt.

      • @kwilson
        20 days ago

        Yeah, I get that, I’m just surprised that both times the image is so similar. Both times the dragon looks right, both times the sky looks mostly the same, stuff that isn’t part of the prompt, you know.

        The same prompt can sometimes give you completely different pictures that still comply with the prompt, on the same model.

        • @[email protected]
          20 days ago

          The text is only part of the initial info the model uses to create the image - the settings and the random number seed are other parts that are relevant here because they’d be the same for both images. The seed in particular is why these images look so similar - normally when you give the same prompt twice, the seed is randomized so the model starts from two different points. Here, each model starts from the same point and works the same way, just with different amounts of data, so a lot of the details are shared.
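The role of the seed can be shown with a toy example. This is a minimal sketch using only Python's standard `random` module, not FLUX or any real diffusion pipeline: `initial_noise` stands in for the starting latent, and `toy_model` is a hypothetical placeholder for a model whose weights may have been slightly perturbed (as quantization would do).

```python
import random

def initial_noise(seed, n=8):
    """Draw the starting 'latent' from a seeded generator (toy stand-in)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def toy_model(noise, weight):
    """Hypothetical placeholder for a generator: output depends on the
    starting noise and on the model's (possibly quantized) weight."""
    return [x * weight for x in noise]

# Same seed: both "models" start from exactly the same noise.
noise_a = initial_noise(seed=42)
noise_b = initial_noise(seed=42)

# Different seed: a different starting point, hence a different image.
noise_c = initial_noise(seed=43)

# Slightly different weights stand in for the full and the smaller model.
full_out = toy_model(noise_a, weight=1.00)
small_out = toy_model(noise_b, weight=0.97)

# Largest per-element difference between the two outputs.
max_gap = max(abs(f - s) for f, s in zip(full_out, small_out))

print(noise_a == noise_b)   # identical starting noise for the same seed
print(noise_a == noise_c)   # a new seed gives a different starting point
print(max_gap)              # small gap: shared start, similar weights
```

With the seed fixed, the only difference between the two outputs comes from the weights, which is why details outside the prompt (like the sky) can match so closely.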

  • @SirHery
    20 days ago

    I love the bird with bunny ears