10 points
*

They’re the same issue tho. Piracy and using books for corporate AI training both should be fine. The same people going after data freedom are pushing this AI drama too. There’s too much money in copyright holding and it’s not being held by your favorite deviantart artists.

48 points

It’s not the same issue at all.

Piracy distributes power. It allows disenfranchised or marginalized people to access information and participate in culture, no matter where they live or how much money they have. It subverts a top-down read-only culture by enabling read-write access for anyone.

Large-scale computing services like these so-called AIs consolidate power. They displace access to the original information and the headwaters of culture. They are for-profit services, tuned to the interests of specific American companies. They suppress read-write channels between author and audience.

One gives power to the people. One gives power to 5 massive corporations.

22 points
*

Extremely well-said.

Also, it’s important to point out that the one that empowers people is the one that is consistently punished far more egregiously.

We have governments blocking the likes of Sci-Hub, Libgen, and Annas-Archive, but nobody is blocking Meta’s LLMs for the same.

If they were treated similarly, I would be far less upset about Meta’s arguments. However, it’s clear that governments prioritize the success of business over the success of humanity.

9 points
*

It’s the opposite. Closing down public resources would be regulatory capture and that would be consolidation of power.

Who do you think can afford to pay billions in copyright to produce models? Only mega corporations and pirates. No more small AI companies. No more open source models.

7 points

I wish we could be talking about the power imbalances of corporate bodies exercised through the use of capital ownership, instead of squabbling about how that differential is manifested through a specific act of piracy.

The reason we view acts of piracy differently when they are committed by corporate bodies is because of the power of their capital, not because the act itself is any different. The issue with Meta and OpenAI using pirated data in the production of LLMs is that they maintain ownership of the final product to be profited from, not that the LLM comes to exist in the first place (even if it is through questionable means). Had they come to create these models from data that they already owned (I need not remind you that they have already claimed their right to a truly sickening amount of it, without having paid a cent), their profiting from it wouldn’t be any less problematic - LLMs would still undermine the security of the working class and consolidate wealth into fewer and fewer hands. If we were to apply copyright here as it’s being advocated, nothing fundamental would change in that dynamic; in fact, it would only reinforce the basis of that power imbalance (ownership over capital being the primary vehicle) and delay the inevitable (continued consolidation).

If you’re really concerned with these corporations growing larger and their influence spreading further, then you should be directing your efforts at disrupting that vehicle of influence, not legitimizing it. I understand there’s an enraging double-standard at play here, but the solution isn’t to double down on private ownership, it should be to undermine and seize it for common ownership so that everyone benefits from the advancement.

1 point

I wonder if piracy could even benefit these corporations in the long term? Do people who pirate games and movies in their teens and twenties frequently go on to purchase such things when they’re older? I honestly don’t know, but I would love to see a study. I certainly have seen people make that claim.

2 points

Microsoft famously never went after pirates in Asian countries because, despite the piracy, it made Windows the default operating system.

They wanted people to be so used to Windows that they would be willing to pirate it just to use a computer.

It worked and their OS dominance for consumer OSes continues.

14 points
*

So why are Meta and, say, Sci-Hub treated so differently? I don’t necessarily disagree, but it’s interesting that we legally attack people who are sharing data altruistically (Sci-Hub gives research away for free so more research can be done; scientific research should be free to the world because it benefits all of mankind), but when it comes to companies who break the same laws to just make more money, that’s fine somehow.

It’s like trying to improve the world is punished, and being a selfish greedy fucking pig is celebrated and rewarded.

Sci-Hub is so vilified that it can be blocked at the ISP level (depending on where you live), and politicians are pushing for DNS-level blocking. The same can be said for Libgen or Annas-Archive. Is anything like that happening to Meta? No? Huh, interesting. I wonder why Meta gets different treatment for similar behavior.

I am willing to defend Meta’s use of this kind of data after the world has changed how they treat entities like Sci-Hub. Until that changes, all you are advocating for is for corporations to be able to break the law and for altruistic people to be punished. I agree they’re the same, but until the law treats them the same, you’re just giving freebies to giant corporations while fucking yourself in the ass.

13 points

To me it always seems to come back to nobility. Big corpo is the new nobility and they have certain privileges not available to the common folk. In theory it shouldn’t exist but in practice it most certainly does.

13 points
*

The aristocracy never died, it just got a new name.

I mean the US is literally built on the fact that the aristocracy in the US didn’t actually want to lose station, so they built a democracy that included many anti-democratic measures from the Senate to the Electoral College to only allowing land-owning white men to vote. The US was purpose built to serve the rich while paying lip-service to the poor.

“Conservatives” were literally always those who wanted to conserve the monarchy and aristocracy. Those were the things they originally wanted to conserve, and plainly still fucking do.

How people do not see this is a complete farce.

2 points

So why are Meta, and say, Sci-Hub treated so differently?

They are not. Meta is being sued, just like Sci-Hub was sued. So, one difference is that the suit involving Meta is still ongoing.

In any case, Meta did not create the dataset. IDK if they even shared it. The researcher who did is also being sued. The dataset has been taken down in response to a copyright complaint. IDK if it is available anywhere anymore. So the dataset was treated just like Sci-Hub. The sharing of the copyrighted material was stopped.

Meta downloading these books for AI training seems fairly straight-forward fair use to me. I don’t see how what Meta did is anything like what Sci-Hub did.

8 points
*

So ISPs are blocking Meta for their breaking of copyright?

Because ISPs block Sci-Hub.

No, one of them has governments trying to kick it off the internet, and the other is allowed to continue doing what they’re doing and the worst they’ll face is a fine. Not even close to the same, completely disproportionate. If they were blocking all Meta LLMs until they had removed all copyrighted material, maybe we could say the same.

3 points

Meta downloading these books for AI training seems fairly straight-forward fair use to me.

They pirated the books. Is that not legally relevant?


Technology

!technology@lemmy.world
