HN Reader

251

Updating the GenAI comparison website is starting to feel a bit Sisyphean with all the new models coming out lately, but the results are in for the Flux 2 Pro Editing model!

https://genai-showdown.specr.net/image-editing

It scored slightly higher than BFL's Kontext model, coming in around the middle of the pack at 6 / 12 points.

I’ll also be introducing an additional numerical metric soon, so we can add more nuance to how we evaluate model quality as they continue to improve.

If you're solely interested in seeing how Flux 2 Pro stacks up against the Nano Banana Pro, and another Black Forest model (Kontext), see here:

https://genai-showdown.specr.net/image-editing?models=km,nbp...

Note: It should be called out that BFL seems to support a more formalized JSON structure for more granular edits so I'm wondering if accuracy would improve using it.

9 hours agoby vunderba

Great, especially that they still have an open-weight variant of this new model too. But what happened to their work on their unreleased SOTA video model? did it stop being SOTA, others got ahead, and they folded the project, or what? YT video about it: https://youtu.be/svIHNnM1Pa0?t=208 They even removed the page of that: https://bfl.ai/up-next/

11 hours agoby spyder

FLUX.1 Pro Kontext was one of the best artistic model, still great at instruction following comparing to MidJourney V7.

See my third comparison in Nano Banana blog post: https://quesma.com/blog/nano-banana-pro-intelligence-with-to...

10 hours agoby jakozaur

I just finished my Flux 2 testing (focusing on the Pro variant here: https://replicate.com/black-forest-labs/flux-2-pro). Overall, it's a tough sell to use Flux 2 over Nano Banana for the same use cases, but even if Nano Banana didn't exist it's only an iterative improvement over Flux 1.1 Pro.

Some notes:

- Running my nuanced Nano Banana prompts though Flux 2, Flux 2 definitely has better prompt adherence than Flux 1.1, but in all cases the image quality was worse/more obviously AI generated.

- The prompting guide for Flux 2 (https://docs.bfl.ai/guides/prompting_guide_flux2) encourages JSON prompting by default, which is new for an image generation model that has the text encoder to support it. It also encourages hex color prompting, which I've verified works.

- Prompt upsampling is an option, but it's one that's pushed in the documentation (https://github.com/black-forest-labs/flux2/blob/main/docs/fl...). This does allow the model to deductively reason, e.g. if asked to generate an image of a Fibonacci implementation in Python it will fail hilariously if prompt sampling is disabled, but get somewhere if it's enabled: https://x.com/minimaxir/status/1993361220595044793

- The Flux 2 API will flag anything tangently related to IP as sensentive even at its lowest sensitivity level, which is different from Flux 1.1 API. If you enable prompt upsampling, it won't get flagged, but the results are...unexpected. https://x.com/minimaxir/status/1993365968605864010

- Costwise and generation-speed-wise, Flux 2 Pro is on par with Nano Banana, and adding an image as an input pushes the cost of Flux 2 Pro higher than Nano Banana. The cost discrepancy increases if you try to utilize the advertised multi-image reference feature.

- Testing Flux 1.1 vs. Flux 2 generations does not result in objective winners, particularly around more abstract generations.

9 hours agoby minimaxir

> Run FLUX.2 [dev] on GeForce RTX GPUs for local experimentation with an optimized fp8 reference implementation of FLUX.2 [dev], created in collaboration with NVIDIA and ComfyUI.

Glad to see that they're sticking with open weights.

That said, Flux 1.x was 12B params, right? So this is about 3x as large plus a 24B text encoder (unless I'm misunderstanding), so it might be a significant challenge for local use. I'll be looking forward to the distill version.

11 hours agoby 542458

Good to see there's some competition to Nano Banana Pro. Other players are important for keeping the price of the leaders in check.

11 hours agoby xnx

Text encoder is Mistral-Small-3.2-24B-Instruct-2506 (which is multimodal) as opposed to the weird choice to use CLIP and T5 in the original FLUX, so that's a good start albeit kinda big for a model intended to be open weight. BFL likely should have held off the release until their Apache 2.0 distilled model was released in order to better differentiate from Nano Banana/Nano Banana Pro.

The pricing structure on the Pro variant is...weird:

> Input: We charge $0.015 for each megapixel on the input (i.e. reference images for editing)

> Output: The first megapixel is charged $0.03 and then each subsequent MP will be charged $0.015

11 hours agoby minimaxir

The model looks good for an open source model. I want to see how these models are trained. may be they have a base model from academic datasets and quickly fine-tune with models like nano banana pro or something? That could be the game for such models. But great to see an open source model competing with the big players.

10 hours agoby visioninmyblood

> The FLUX.2 - VAE is available on HF under an Apache 2.0 license.

anyone found this? To me the link doesn't lead to the model

10 hours agoby notrealyme123

I ran "family guy themed cyberpunk 2077 ingame screenshot, peter griffin as main character, third person view, view of character from the back" on both nano banana pro and bfl flux 2 pro. The results were staggering. The google model aligned better with the cyberpunk ingame scene, flux was too "realistic"

11 hours agoby AmazingTurtle

Their published benchmarks leave a lot to be desired. I would be interested in seeing their multi-image performance vs. Nano Banana. I just finished up benchmarking Image Editing models and while Nano Banana is the clear winner for one-shot editing its not great at few-shot.

10 hours agoby geooff_

18gb 4 bit quant via diffusers. "low vram setup" :)

10 hours agoby Yokohiii

Genuine question, does anyone use any of these text to image models regularly for non trivial tasks? I am curious to know how they get used. It literally seems like there is a new model reaching the top 3 every week

7 hours agoby bossyTeacher

We probably won't be able to run it on regular PCs, even with a 5090. So I am curious how good the results will be using a quntized version.

10 hours agoby DeathArrow

> Launch Partners

Wow, the Krea relationship soured? These are both a16z companies and they've worked on private model development before. Krea.1 was supposed to be something to compete with Midjourney aesthetics and get away from the plastic-y Flux models with artificial skin tones, weird chins, etc.

This list of partners includes all of Krea's competitors: HiggsField (current aggregator leader), Freepik, "Open"Art, ElevenLabs (which now has an aggregator product), Leonardo.ai, Lightricks, etc. but Krea is absent. Really strange omission.

I wonder what happened.

11 hours agoby echelon

If this is still a diffusion model, I wonder how well does it compare with NanoBanana.

10 hours agoby DeathArrow

Yes yes very impressive.

But can it still turn my screen orange?

11 hours agoby eric-p7

Oh, looks like someone had to release something very quickly after Google came for their lunch. Their little 15 mins is over already for BFL as it seems.

11 hours agoby beernet