AIBooru

Civitai importer is buggy, importing compressed images

Posted under Bugs & Features

TL;DR: the Civitai importer should discard everything after the image UUID and append /original=true in its place.

Please look at this.

post #73824

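$ url="$(tg)"  # tg is not defined in this thread; judging by the output, it prints the URL under test
$ printf 'URL:\t\t%s\n' "$url"
$ printf 'Content-Type:\t'; wget --spider "$url" 2>&1 | ag len
$ printf 'Format:\t\t'; wget -qO- "$url" | gm identify -
$ printf 'MD5:\t\t'; wget -qO- "$url" | md5sum -
$ echo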

URL:		https://cdn.aibooru.download/original/fc/d7/__original_generated_by_sulph_using_0002__fcd7e623328716a56ee367a834e7bc64.jpg
Content-Type:	Length: 228818 (223K) [image/jpeg]
Format:		/tmp/gmBMO2cB JPEG 1200x1754+0+0 DirectClass 8-bit 223.5Ki 0.000u 0:01
MD5:		fcd7e623328716a56ee367a834e7bc64  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/56026606-035d-41dd-89a5-64c128850c8b/width=832/00701-886454019.jpeg
Content-Type:	Length: 228818 (223K) [image/jpeg]
Format:		/tmp/gm388rii JPEG 1200x1754+0+0 DirectClass 8-bit 223.5Ki 0.000u 0:01
MD5:		fcd7e623328716a56ee367a834e7bc64  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/56026606-035d-41dd-89a5-64c128850c8b/00701-886454019.jpeg
Content-Type:	Length: 140370 (137K) [image/jpeg]
Format:		/dev/shm/gmc1b43y JPEG 832x1216+0+0 DirectClass 8-bit 137.1Ki 0.000u 0:01
MD5:		c3c1bbbb822ae6b2167beea547915f52  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/56026606-035d-41dd-89a5-64c128850c8b/original=true
Content-Type:	Length: 1532035 (1,5M) [image/png]
Format:		/tmp/gmvgBem1 PNG 832x1216+0+0 DirectClass 8-bit 1.5Mi 0.000u 0:01
MD5:		48e469f21640a2e32d2ebec2cd194026  -

post #71842

URL:		https://cdn.aibooru.download/original/d8/a4/__original_generated_by_kozue77991268_using_pony_diffusion_xl__d8a424545b0f6648e76824feb777384e.jpg
Content-Type:	Length: 96822 (95K) [image/jpeg]
Format:		/tmp/gmBOb64p JPEG 896x1200+0+0 DirectClass 8-bit 94.6Ki 0.000u 0:01
MD5:		d8a424545b0f6648e76824feb777384e  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/181846df-d452-4542-b1fd-86a95f6564da/height=1200/52684-1883514548-score_9_up,score_8_up,score_7_up, _lora_comiclo-xl-pony_0.7_,1girl, long hair, blonde hair, animal on head, wavy mouth, blue eye.jpeg
Content-Type:	Length: 96822 (95K) [image/jpeg]
Format:		/tmp/gm2dzEcs JPEG 896x1200+0+0 DirectClass 8-bit 94.6Ki 0.000u 0:01
MD5:		d8a424545b0f6648e76824feb777384e  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/181846df-d452-4542-b1fd-86a95f6564da/original=true
Content-Type:	Length: 1392476 (1,3M) [image/png]
Format:		/dev/shm/gmpueS3m PNG 896x1200+0+0 DirectClass 8-bit 1.3Mi 0.000u 0:01
MD5:		aefb12c2e0310d801182377826284c90  -

More recent post: post #74058

URL:		https://cdn.aibooru.download/original/82/3a/__original_generated_by_temmienoa_using_t_ponynaiv3__823a849150cc80aa8a65bae4d5135b88.jpg
Content-Type:	Length: 176063 (172K) [image/jpeg]
Format:		/dev/shm/gmx3k12n JPEG 1200x2100+0+0 DirectClass 8-bit 171.9Ki 0.000u 0:01
MD5:		823a849150cc80aa8a65bae4d5135b88  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/b79fb8d1-8f01-4060-b2ed-bc9b8edaf39c/width=1200/ComfyUI_00139_.jpeg
Content-Type:	Length: 176063 (172K) [image/jpeg]
Format:		/dev/shm/gmtirLoA JPEG 1200x2100+0+0 DirectClass 8-bit 171.9Ki 0.000u 0:01
MD5:		823a849150cc80aa8a65bae4d5135b88  -

URL:		https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/b79fb8d1-8f01-4060-b2ed-bc9b8edaf39c/original=true
Content-Type:	Length: 1658041 (1,6M) [image/png]
Format:		/dev/shm/gmVCaGOI PNG 960x1680+0+0 DirectClass 8-bit 1.6Mi 0.000u 0:01
MD5:		6e5ee631e4d2b1e442583c9d96369131  -

See https://github.com/qsniyg/maxurl/issues/1265.

Civitai compresses and scales images (as most cloud CDNs do nowadays...), but obtaining the original image is easy.
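
The rewrite from the TL;DR could look roughly like this in Ruby. This is only a sketch, not the importer's actual code: the method name civitai_original_url and the regular expression are made up for illustration, and it assumes every image URL has the shape https://image.civitai.com/<key>/<uuid>/..., as in the examples above.

```ruby
# Hypothetical helper, not part of the real importer.
# Assumes URLs look like https://image.civitai.com/<key>/<uuid>/<anything>.
# Group 1 captures everything up to and including the image UUID;
# whatever follows (width=832/, the file name, ...) is discarded.
CIVITAI_IMAGE_URL = %r{\A(https://image\.civitai\.com/[^/]+/\h{8}-\h{4}-\h{4}-\h{4}-\h{12})(?:/.*)?\z}

def civitai_original_url(url)
  match = CIVITAI_IMAGE_URL.match(url)
  return url unless match # leave unrecognized URLs untouched

  "#{match[1]}/original=true"
end

puts civitai_original_url("https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/56026606-035d-41dd-89a5-64c128850c8b/width=832/00701-886454019.jpeg")
# prints https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/56026606-035d-41dd-89a5-64c128850c8b/original=true
```

Matching on the raw string rather than parsing it as a URI is deliberate here: some of the file names above contain spaces and commas, which a strict URI parser may reject.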

The importer code needs to be fixed. Once it is, all existing images imported from Civitai should be replaced...

Thanks to @sulph for clearly pointing this out in comment #4006. I have uploaded from Civitai myself and didn't notice...

@maimaimai said:

@fredgido

Nice!

Are you sure the /xG1nkqKTMzGDvpLrqFT7WA/ part will always be the same?
Are the old compressed ones going to be replaced?

I didn't want to automatically replace them because some of them are MD5 mismatches. I'm not sure if Civitai offers a feature to replace an image in-place (i.e. without deleting the old one, keeping the same ID), or if people just enter wrong sources after uploading from disk.
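
One possible guard against the mismatch problem, as a rough sketch only: replace a post automatically only when the file currently served at its recorded source still hashes to the post's stored MD5. The method safe_to_replace? and its arguments are hypothetical names, not taken from the site's code, and downloading the bytes is left out.

```ruby
require "digest/md5"

# Hypothetical check, not taken from the site's code.
# stored_md5:   MD5 recorded for the post (the hex string in its file name)
# source_bytes: the file as currently served at the post's recorded source URL
def safe_to_replace?(stored_md5, source_bytes)
  Digest::MD5.hexdigest(source_bytes) == stored_md5.downcase
end
```

Posts that fail such a check could be left for manual handling rather than replaced blindly.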

If you find any that need to be replaced you can post in topic #172.

I have no idea what happened in that case. The upload source is https://civitai.com/images/24358171, but both links you provided give the original image.

[2] pry(main)> Source::Extractor.find("https://civitai.com/images/24358171").image_urls
=> ["https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/1d365646-7417-43c0-9b97-31ebb76c7f5b/original=true"]
[3] pry(main)> Source::Extractor.find("https://civitai.com/posts/5440141").image_urls
=> ["https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/1d365646-7417-43c0-9b97-31ebb76c7f5b/original=true"]

I replaced the image without changing the URL and it gave the original image.

So why was it a 187 KB .jpg at first? This is why it may be "a subtle bug".

Penance said:

the upload source is https://civitai.com/images/24358171

As far as I understand from my own experience, users can change the upload source to anything they want. I uploaded asset #310460 from disk, then changed the source, so that information (the REAL upload source) is now lost.
Maybe it gets overwritten automatically?


maimaimai said:

So why was it a 187 KB .jpg at first? This is why it may be "a subtle bug".

As far as I understand from my own experience, users can change the upload source to anything they want. I uploaded asset #310460 from disk, then changed the source, so that information (the REAL upload source) is now lost.
Maybe it gets overwritten automatically?

No, admins can see the original URL that was pasted into the upload box (the original uploader should be able to see it as well). I can see that asset was uploaded from file://113370157_p0.jpg. By the way, you're supposed to use the direct image URL for pixiv links, not the post URL. Danbooru will convert it automatically.

As to why it gave a jpg, I have no idea.
