Quite frequently I come across scanned books that are viewable for free online. For example, a publisher put them there (such as preview chapters), or a library did (old books from its collection that are in the public domain). Since I like hoarding data, and the online viewers used to present the books are often not very practical, I frequently try to download the books one way or another. This requires toying with the “inspect element” tool and various other methods of getting at the images/PDF. Now, all that I access is what is, well, accessible; I don’t hack into the servers or anything. But the stuff is meant to be hidden from the normal user. Does that act of hiding the material, no matter how primitive and easily circumvented, mean that I’m not allowed to access it at all?

I suppose ripping a public domain book is no big deal, but would books under copyright fare differently?

Mainly I’m asking out of curiosity, I don’t expect the police to come visit me for ripping a 16th century dictionary.

Note: I live in the EU, but I’d be curious to hear how this is treated elsewhere too.

Edit: I also remembered a funny trick I noticed on one site - it allows viewing PDFs on their website, but not downloading, unless you pay for the PDF. But when you load the page, even without paying, the PDF is already downloaded onto your computer and can be found in the browser cache. Is it legal to simply save the file that is already on your computer?

56 points

AFAIK web scraping (the act of grabbing and downloading any data you see available on the internet) isn’t illegal, and I would assume downloading PDFs provided to you online would fall under that. Since it is copyrighted it would probably be illegal to share it, though.

25 points

This. In a case involving LinkedIn, US courts ruled that it’s legal to scrape publicly available data. The company doing the scraping was selling that data to corporate customers, but ultimately what you can do with it might depend on the information you’re accessing and under what permissions. (Not a lawyer)

53 points

According to big tech, it’s OK if you’re training a large language model with it.

14 points

You’re confusing the law that applies to the ruling class with the one that applies to common people.

9 points

My brain is essentially an enormous language model.

4 points

Unironically yes. You would not know who Spider-Man was without viewing a copyrighted work demonstrating what he looks like, and now you understand why generative AI fundamentally has to ingest copyrighted works.

43 points

If you can see it, you’ve already downloaded it. You’re just choosing to retain it.

39 points

As with everything with the law, it depends.

In Australia, distribution is the illegal part; seeding/sharing is where they get you, not the actual download itself.

6 points

It’s usually not a question of legality, but efficiency.

It’s easy and efficient to bust someone for seeding, but busting hundreds for the odd file you can prove they downloaded is expensive and takes forever.

5 points

“busting hundreds for the odd file you can prove they downloaded is expensive and takes forever.”

And might well not be legally possible if all you have is an IP address, because lest we forget:

An IP is not an ID

26 points

“viewable for free online”

If you are viewing it on your computer, you have already downloaded it.
Don’t let anyone tell you otherwise.

“already downloaded onto your computer and can be found in the browser cache”

Exactly.


Ask Lemmy

!asklemmy@lemmy.world
