AIBooru

Tagging or flagging images with bad VAE

Posted under General

A small fraction of approved posts use the SD 1.0 VAE, which is by now well known for producing desaturated images and/or artifacts when converting from latent space to a bitmap. This is the VAE included in the leaked NovelAI model. Model authors/mergers have recently gotten better about replacing it in the checkpoints they distribute, but it still crops up quite a bit. I could link many of these, but here's a small selection of recent ones.

post #35284
post #35191
post #34478
post #32981

Should anything be done about these? I was thinking of adding a metatag like bad_vae, but I also question whether these should have been approved in the first place.

It's situational, I guess? In terms of approval, I don't consider it much of a problem if the image looks fine aside from that specific issue.

As for the tag idea, I'm indifferent to it. You would have to make sure that any image tagged with it actually looks that way because of a bad VAE and not because of an intentional choice.

The 'Anything' titled vae(s) are typically a mix of a half-muted look (no vae saturation) and a full, bold-color vae such as the newer ones. That could be something to keep in mind for any posts that may still have used some type of vae. I don't see anything wrong with having a tag for certain vae/no vae, but the result might not necessarily be 'bad' per se.

Edit: the post above me brings up intentional use. limited_palette and similar tokens can, to a degree, have effects on an image similar to no/limited vae, and some may confuse the two at times, but the difference is generally pretty distinctive imo. Just thought I'd mention things like this.

The tag could be useful, I guess, but deciding what qualifies might be problematic. Just from the given examples, post #34478 and post #35191 are pretty obvious, but with post #35284 and especially post #32981 I wouldn't be able to tell with 100% certainty.

As for whether they should be approved, if the result looks good it is good. Out of the four I personally only have a slight problem with 34478 but not enough to say that it's bad.

For post #32981 I can tell you they were unintentionally using the incorrect VAE. Someone pointed it out to them on 4chan when they wondered why their images were coming out desaturated, and they later fixed it, as can be seen from the rest of their work afterwards. I'm willing to bet it's almost always unintentional, but you're right that this isn't always the case and that it is situational.

The 'Anything' titled vae(s)

These are almost always the NAI VAE renamed (not the same as the SD 1.0 VAE). You can verify the hash of these yourself. This is technically the "correct" VAE to use for anime artwork, since it was presumably trained on it, though we don't really have a way to verify this as it's from a leaked source. Users often use the newer SD VAEs or Waifu Diffusion VAEs instead, because those produce more saturated colors.
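
For anyone who wants to do that hash check, here is a minimal sketch, assuming you have both VAE files on disk. The function names and the file names in the usage note are just placeholders I made up, and the hash a UI displays may be shortened or computed differently, so comparing full digests of the two files is the safer test.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files have identical bytes, regardless of their names."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Usage would be something like same_file_contents("anything.vae.pt", "nai.vae.pt") with whatever your files are actually called. A match means one file is just a renamed copy of the other.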

vae/no vae

Also want to point out there is no such thing as "no vae", as the image would be stuck in latent space otherwise. It has become the colloquial term because it's simpler and easier to understand than "SD 1.0 VAE". Most users don't even know the purpose, or possibly the existence, of a VAE, which is another reason I believe this is almost always unintentional.

I will see if I can do some analysis to find more images based on their color range. My personal assumptions are: if the image is desaturated, and metadata is fully intact, it's almost always a case where the user is unaware of the problem. If the image comes from a source where metadata is kept (such as Pixiv) but there is no metadata, and it is desaturated, then it may be intentional.
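
A rough sketch of what that analysis could look like, using only the Python standard library. The 0.25 threshold, the function names and the labels are all my own placeholders rather than anything agreed on, and low saturation by itself does not prove which VAE was used.

```python
import colorsys


def mean_saturation(pixels):
    """Average HSV saturation (0.0 to 1.0) over an iterable of 8-bit (r, g, b) tuples."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        _, saturation, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += saturation
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return total / count


def guess_cause(pixels, has_metadata, source_keeps_metadata, threshold=0.25):
    """Apply the assumptions from the post above; threshold is a hypothetical cutoff."""
    if mean_saturation(pixels) >= threshold:
        return "not desaturated"
    if has_metadata:
        # Desaturated with metadata fully intact: user probably unaware of the problem.
        return "probably unaware"
    if source_keeps_metadata:
        # Source keeps metadata (e.g. Pixiv) but there is none: may be intentional.
        return "possibly intentional"
    return "unclear"
```

The pixels could come from any image library that gives you RGB tuples. Anything flagged this way would still need a manual look before tagging.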

If the image looks good, it looks good. Desaturated images can obviously still look good, e.g. on sad-themed artworks. Also, some images might look good enough on their own that the desaturation is not a big enough issue to keep an approver from enjoying them.

It is unfortunate that a lot of people don't know about different VAE options and therefore don't have more saturated colours on those images where those are a necessity.

As for the tag, muted color might be close enough to what you are looking for.

Yes, 'no vae/vae: none' is just the term used in the A1111 UI etc. when none is selected.
As for the assumption that users are unaware that nothing is selected for a VAE, that may be the case most of the time, but you can never be sure. Personally, I'm always open to the possibility of things being intentional or not. One of the artists in the images you linked uses a VAE, as far as I know, but also seems not to sometimes, so idk.

Which part of that post are you referring to? As far as I know, for the 'fixed' VAE StabilityAI officially baked the 0.9 VAE in as a fix for fp16 and NaN issues. That shouldn't have removed any watermarking, given their purpose for having it. However, I have seen a model whose notes mention removing the watermark by changing the VAE even further, so I'm not sure to what extent that will carry over into other models.

As far as that post goes, I'm not really sure what you're talking about or how that would be viewed as an image with 'bad vae'.

I see what you mean now. It's very hard to tell without zooming in and knowing what to look for. I suppose that may be considered a 'bad vae', since StabilityAI fixed it by reverting VAEs. However, I'm not too sure how useful tagging it here would be, or whether we would be able to find the posts that use that version.
