tldr: a huge torrent of books straight from the academic publisher De Gruyter

cross-posted from: https://teddit.net/r/DataHoarder/comments/1463ah3/degruyter_collection/

Trying again because reddit filters, let’s see if I get it right this time.

After months of scraping, I finally finished downloading almost every single degruyter book I have access to, and there are many. So I created a torrent.

magnet:?xt=urn:btih:76f573241a0126fb1ab0aa5540cc7493c045ae74&dn=Degruyter%20Imprints%20v2%20%5b09-06-23%5d&tr=http%3a%2f%2fatrack.pow7.com%2fannounce&tr=udp%3a%2f%2fopen.stealth.si%3a80%2fannounce&tr=udp%3a%2f%2ftracker.cyberia.is%3a6969%2fannounce&tr=udp%3a%2f%2fretracker.lanta-net.ru%3a2710%2fannounce&tr=udp%3a%2f%2ftracker.moeking.me%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.tiny-vps.com%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80%2fannounce&tr=http%3a%2f%2ftracker.openbittorrent.com%3a80%2fannounce&tr=udp%3a%2f%2fopentra

The content of the torrent is pretty much these https://www.degruyter.com/search?query=*&startItem=0&pageSize=10&sortBy=mostrecent&documentVisibility=all&documentTypeFacet=book&publisherFacet=De+Gruyter~De+Gruyter+Oldenbourg~De+Gruyter+Saur~De+Gruyter+Mouton

Neither the files nor the series are renamed. They all keep their original filenames, which is much better since each one can easily be keyed to the book’s unique page.

Note: this torrent includes only the above-mentioned degruyter imprints. In the coming weeks I will create a second torrent with the degruyter partner publishers: about 100k books from these publishers: https://i.imgur.com/mSKrLto.png

And in a few months, the last torrent, which will include the degruyter journals.

This endeavour would not have been possible were it not for all the people who granted me academic access, wrote scripts for me, helped me ensure the integrity of the files, and so on. Sadly many files, especially epubs, are corrupted or downright missing at the source. There are also some dupes that, being part of multiple series and subseries, were downloaded twice. Total torrent size is about 2 TB.

https://i.imgur.com/BjqsUqJ.png

Of course everything is actively being shared with nexus/annas/libgen as well as private groups and friends and the classicist discord channel. I did not waste months so I could jerkoff on the big numbah, therefore I want the files to be shared and reshared as much as possible in order to grant indirect academic access to all those students (me being one) whose universities cannot afford degruyter subscriptions.

Last note: all the files are retail untouched (according to BIB standards, if anyone here is a member). So if some epub has shitty formatting blame the publisher.

Edit: fixed magnet url


Legend. But I have no idea what I’ll be downloading when looking through it. No idea how to use this.


Yeah, it’s not clear. The following explanation goes into a bit more detail on finding what you want to download, for anyone else who has trouble identifying books in the torrent.

Try browsing through this link on the publisher’s site for any subjects or books you’d be interested in. If you find something, copy the ISBN (listed in the work’s direct URL, or you can open the work’s page, which lists out the eBook and hardcover ISBNs—you need the eBook ISBN as far as I understand). If you open the torrent and let the file contents load, depending on your client, you can look through the structure, search through all files using the ISBN, and locate the book that way. Files all seem to be named by their DOI, with the / replaced with _. I guess the ISBN is part of the DOI naming convention.
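As a rough sketch of the lookup described above: if you export the torrent's file list to plain text, a few lines of Python can do the ISBN search. This assumes the naming really is the DOI with `/` swapped for `_`, as it appears to be. The function names, the sample DOI, and the sample filenames below are made up for illustration and are not taken from the torrent.

```python
import re


def doi_to_filename_stem(doi: str) -> str:
    """Apply the observed naming rule: '/' in the DOI becomes '_'."""
    return doi.replace("/", "_")


def normalize_isbn(isbn: str) -> str:
    """Drop hyphens and whitespace so the ISBN matches how it appears in a DOI."""
    return re.sub(r"[-\s]", "", isbn)


def find_by_isbn(isbn: str, filenames: list[str]) -> list[str]:
    """Return every filename that contains the (normalized) ISBN."""
    needle = normalize_isbn(isbn)
    return [name for name in filenames if needle in name]


# Hypothetical file list, e.g. copied out of a torrent client.
sample_files = [
    "10.1515_9783110000000.pdf",
    "10.1515_9783110000000.epub",
    "10.1515_9783119999999.pdf",
]

matches = find_by_isbn("978-3-11-000000-0", sample_files)
```

Here `matches` would hold the two files for the first placeholder ISBN. If the guess about the naming convention is wrong for some series, a plain substring search on part of the title is the fallback.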

It doesn’t look like any work tagged as Ahead of Publication will be in this torrent, but works from 2023 and prior appear to be.


thanks for the response, I will try another torrent client


Anna’s archive would love to hear about this!

