Zerochan.net is one of the well-known anime/game/CG imageboards, with a strong community and only modest crossposting with other imageboards.
It has its own tagging system, close to that of e-shuushuu.net but unlike the mainstream danbooru / safebooru / yande-re / konachan / sankaku.
That is why Zerochan is a good, distinct source for studying non-photographic images and their metadata.
This release is devoted to dates between 01.01.2015 (ID=1820240) and 31.12.2016 (ID=2064142), the period right before the zerochan-2017 release on the Russian tracker (https://rutracker.org/forum/viewtopic.php?t=5478026), where you can get the magnet link.
The giant zerochan-2018-2020 release is here (https://nyaa.si/view/1304539); most of the description and processing details can be found there.
Release contains:
• 61348 images in 270 zipped folders (1820xxx-2064xxx and several add-ons), partitioned mostly per 1,000 IDs
• filtered by size
~ least(image_height, image_width) >= 1080 -- Full HD wallpapers as a minimum
~ image_height * image_width >= 1200000 -- 1100x1100 included
~ image_width / image_height between 0.25 and 4 -- not too disproportionate
• renamed to "zerochan - id - upto3sources ~ upto5characters (upto2artists).ext"
~ tags concatenated via "+", spaces replaced with underscores
~ maximum file name length is 220 characters; character tags may be truncated if too long
• image format: JPG
• some gentle deduplication was done (only visually identical images were dropped)
• some metadata for every image: "ZERO_POSTS_2016.TSV" in the root folder, 61348 rows
• tag info for Copyright / Characters / Artists: "ZERO_TAGS_2016.TSV", 1498790 rows
and also additional cross-release metadata:
• for every image: "ZERO_POSTS_TORR_2017.TSV", 172360 rows for zerochan-2017
• tag info: "ZERO_TAGS_2017.TSV", 1028999 rows
• rename script from the zerochan-2017 naming to the one used here: zerorename2017.bat
• you cannot run it in one go because of the Windows limit of 16k commands per batch file
• ZERORAW2015-2020.JSON: initial data for all 3 releases, 979103 rows
• NOTE: post references in the JSON URL may differ from the ID of the page where they were found; unfortunately I didn't save the page ID
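For anyone who wants to reproduce the selection and naming rules from the list above on their own files, here is a minimal Python sketch. It is not the script used for this release. It assumes the three size rules are combined with AND, and that "truncated" means dropping trailing character tags until the name fits; both are guesses. The function names and the placeholder tags in the usage line are hypothetical.

```python
MAX_NAME_LEN = 220  # maximum file name length stated in the description


def passes_size_filter(width: int, height: int) -> bool:
    """Return True if an image meets all three size rules (assumed to be ANDed)."""
    return (
        min(width, height) >= 1080            # Full HD as a minimum
        and width * height >= 1_200_000       # 1100x1100 included
        and 0.25 <= width / height <= 4       # not too disproportionate
    )


def tag_group(tags, limit):
    """Keep the first `limit` tags, replace spaces with underscores, join with '+'."""
    return "+".join(t.replace(" ", "_") for t in tags[:limit])


def build_name(post_id, sources, characters, artists, ext="jpg"):
    """Build 'zerochan - id - sources ~ characters (artists).ext'."""
    src = tag_group(sources, 3)
    art = tag_group(artists, 2)
    chars = [c.replace(" ", "_") for c in characters[:5]]
    while True:
        name = f"zerochan - {post_id} - {src} ~ {'+'.join(chars)} ({art}).{ext}"
        if len(name) <= MAX_NAME_LEN or not chars:
            return name
        # Assumed truncation strategy: drop the last character tag and retry.
        chars.pop()
```

Usage with placeholder tags: `build_name(1820240, ["Some Series"], ["Some Character"], ["Some Artist"])` gives `zerochan - 1820240 - Some_Series ~ Some_Character (Some_Artist).jpg`.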
This torrent is not as large as 2018-2020 because it has fewer images and Zerochan had no PNG files at that time.
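Since the rename script listed among the release contents cannot be run in one go, one workaround is to cut it into smaller batch files and run them one after another. The Python sketch below does that under some assumptions: every command sits on its own line, the script has no multi-line constructs, and the part size of 10,000 lines is just a hypothetical value chosen to stay below the 16k limit mentioned above. The function name and the `_partNNN` naming are made up for illustration.

```python
from __future__ import annotations

from pathlib import Path


def split_batch(src: Path, out_dir: Path, max_lines: int = 10_000) -> list[Path]:
    """Split `src` into numbered parts of at most `max_lines` lines each."""
    # Work on raw bytes so the original encoding and line endings are preserved.
    lines = src.read_bytes().splitlines(keepends=True)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    for start in range(0, len(lines), max_lines):
        number = start // max_lines + 1
        part = out_dir / f"{src.stem}_part{number:03d}{src.suffix}"
        part.write_bytes(b"".join(lines[start:start + max_lines]))
        parts.append(part)
    return parts
```

For example, `split_batch(Path("zerorename2017.bat"), Path("parts"))` would write `parts/zerorename2017_part001.bat`, `parts/zerorename2017_part002.bat` and so on; run them in order from the folder where the original script was meant to run.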
The earlier interval (2014 and before) of Zerochan has a big (80%+ ?) overlap with Sankaku (https://nyaa.si/view/750972), Safebooru (https://nyaa.si/view/719463) and e-shuushuu (https://nyaa.si/view/513582, https://nyaa.si/view/771715), so I see no sense in going deeper. Not yet. Oops, I did it again!