Reddit says it's made $203M so far licensing its data

well, not mine. i used a script to replace all of my comments with gibberish before i deleted them and then my account. if they went back and restored my comments, then all they’ll get is comments full of gibberish, especially since i overwrote them 3 times before deleting them, just in case they tried to roll back to the previous version.

have fun with that!
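For anyone wondering what that kind of script boils down to, here is a minimal sketch of the overwrite-then-delete loop described above. It is not the actual tool: `CommentClient`, `scrub_comments`, `gibberish` and `InMemoryClient` are hypothetical names made up for illustration, and a real script would have to wrap whatever API access the site still allows.

```python
import random
import string
from typing import Dict, Iterable, List, Optional, Protocol, Set


class CommentClient(Protocol):
    """Hypothetical interface; a real script would wrap the site's API."""

    def edit(self, comment_id: str, text: str) -> None: ...

    def delete(self, comment_id: str) -> None: ...


def gibberish(rng: random.Random, words: int = 12) -> str:
    """Return a line of random lowercase 'words' of 3 to 9 letters each."""
    return " ".join(
        "".join(rng.choices(string.ascii_lowercase, k=rng.randint(3, 9)))
        for _ in range(words)
    )


def scrub_comments(
    client: CommentClient,
    comment_ids: Iterable[str],
    passes: int = 3,
    seed: Optional[int] = None,
) -> int:
    """Overwrite each comment `passes` times, then delete it.

    Returns the number of comments processed.
    """
    rng = random.Random(seed)
    scrubbed = 0
    for comment_id in comment_ids:
        for _ in range(passes):
            client.edit(comment_id, gibberish(rng))
        client.delete(comment_id)
        scrubbed += 1
    return scrubbed


class InMemoryClient:
    """Stand-in used only to exercise the sketch; it keeps every revision."""

    def __init__(self, comments: Dict[str, str]) -> None:
        self.revisions: Dict[str, List[str]] = {
            cid: [text] for cid, text in comments.items()
        }
        self.deleted: Set[str] = set()

    def edit(self, comment_id: str, text: str) -> None:
        self.revisions[comment_id].append(text)

    def delete(self, comment_id: str) -> None:
        self.deleted.add(comment_id)
```

The stand-in client deliberately keeps every revision: if the server stores edit history the same way, the original text is still sitting in the first slot no matter how many overwrite passes are run.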

I like your style, but honestly I wouldn’t be surprised if they keep every single version.

Here’s the thing: Nothing in Reddit’s history indicates that they are that competent.

i bet they do now, but i’ve checked back now and then, and all of my comments and posts are most assuredly gone.

edit: i’ve gone back to check some old haunts, places i know i’ve commented, and i did some searching with google using my old usernames, as google uses its cache to match to the posts/comments, even though they’re not there any more.

i see old posts that are graveyards of deleted comments, some with simply deleted accounts, and many others where both the account and comment are deleted. i don’t see any gibberish comments. the ones i know are mine (because replies quote the comment above, which i recognize as mine), are all just deleted in their entirety, so it seems they didn’t do comment versioning, at least not past the first edit. i see no posts under any former username of mine.

the efforts to scrub my content from reddit last May appear to have worked. sadly, since the API lockdown, those tools no longer work.

Reddit used to be open source and the source is still on GitHub as a read-only archive.

AFAIK back then edit history was only kept briefly. Enough to roll back an accidental edit (if you have admin privileges anyway) but not far enough back to view old versions of posts.

Of course, they would have backups, and maybe the code has changed, but I wouldn’t be surprised if it hasn’t changed and those backups are impractical (slow/expensive) to access.

Keeping old revisions is a common practice but it’s also expensive and in reddit’s case totally unnecessary.

you can request your reddit data, and they provide every comment along with edits as far as I remember, it was uncomfortable but i’d never posted anything regrettable at least

imagine getting your hands on u/spez’s reddit data

Yeah…all that comment data isn’t really that large. They’ll have backups captured for likely several years back. All you can view is the info on the current live servers. You might have kept them from getting like 3 months worth of your comments at best.

I did the same, but we’re both fools if we think reddit didn’t keep every character we typed (let alone submitted) in a private, proprietary database.

We weren’t paid for our data. We were given access to a website free of charge. The consent we gave was supposed to be for the operation of the website, not for training AI.

They should fucking pay us.

LOL. I did the same. And I confirmed many months later that the comments were not restored.

Now I hear that Google wants to train their AI on reddit content. Haha. Good luck with that, Lorem Ipsum! 😁

If you actually replaced with “Lorem ipsum” texts, it would probably be easy to filter the garbage from the dataset.

Also, they probably have copies of the comments before the edits that are just not presented in the frontend.
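To illustrate how cheap that filtering could be, here is a rough sketch that drops comments made up mostly of stock “Lorem ipsum” words. The word list, the 0.6 threshold, and the names `lorem_ratio` and `drop_placeholder_comments` are assumptions made up for this example, not anything a real data pipeline is known to use.

```python
import re
from typing import Iterable, List

# Illustrative vocabulary taken from the opening of the usual placeholder text.
LOREM_WORDS = frozenset(
    "lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod "
    "tempor incididunt ut labore et dolore magna aliqua".split()
)


def lorem_ratio(text: str) -> float:
    """Fraction of alphabetic tokens that belong to the placeholder vocabulary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in LOREM_WORDS for token in tokens) / len(tokens)


def drop_placeholder_comments(
    comments: Iterable[str], threshold: float = 0.6
) -> List[str]:
    """Keep only comments whose placeholder ratio is below `threshold`."""
    return [c for c in comments if lorem_ratio(c) < threshold]
```

A check like this only catches known placeholder text. Randomly generated gibberish would need a different test, for example comparing tokens against a dictionary.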

Great script. Made sure to run it the day before the new API rules went into effect

Me too. Feelin’ mighty fine about that decision now. Long Live Lemmy

Correct, our data.

Spez

In other news, spez’s compensation from reddit last year was $193 million, and its COO got a cool $93 million.

C’mon, spez, tell us again how horrible it’s been that reddit’s never made a profit.

Just saw this on yours truly. Fucking hilarious considering they had the balls to IPO with that sack of rocks weighing down the entire company.

So Reddit charges users to create content (paid premium or by showing ads). And then it sells that content.

Making money both going and coming.

And it also asks reddit users to invest in reddit

loool

I wonder if there is any legal standing for users to sue Reddit for a fair share of those profits. That’d be nice if it could happen. But i suspect, probably not.

Their TOS says they own your content in any current or future formats or derivative works.

I’d say Reddit would win.

Their TOS says they own your content in any current or future formats or derivative works.

Their ToS could say they own you and your children and grandchildren, but that doesn’t make it enforceable.

If I post a frame from the movie Akira on Reddit would any reasonable person suggest that they own not only that frame, but also the entire movie that it came from as a derivative work? There is a glut of second-hand data just like that all over Reddit, Twitter, and every other social media network, and I’m willing to bet that’s also part of what’s being sold.

But hey… I’m not saying you’re wrong, just that the idea that they automatically “own” the things that people post on their website is ridiculous. It’s a bit like UPS or FedEx saying they own the contents of your package while delivering it.

It is true that Reddit does not hold a valid license to content that is:

1. sufficiently long-form, unique, etc. to be copyrightable, and
2. posted by someone other than the copyright holder or someone with a sufficient license.

However, as far as I understand it, the extent to which Reddit—a content provider and social network—is legally required to remedy this is to comply with DMCA requests and review reported content. Perhaps there is a higher standard that I am not aware of?

The TOS shouldn’t hold up in court. A contract must be an exchange of two things, eg money for a product or service. You can’t say “Our service is free of charge!!!” And then in the fine print “(((But also you agree to give us everything we can take free of charge)))”.

The issue is how everyone does it. Facebook and Google started when data had no value, now they’re amongst the wealthiest businesses in the world. Now, Microsoft have joined in, even though you already pay for their products and services anyway!

However, the other aspect is that everyone is a victim. Lawmakers are the victim. They still haven’t quite yet realised how much is being taken from them (at least $50 per year, probably more like $1,000 per year if not more for prominent figures) but they are still being abused.

It’s like that form of bank fraud, where the criminal takes pennies from accounts, hoping the user won’t notice and the bank will write it off. Do it to enough people and enough times and you can make millions. They do this to everyone and they make billions.

Either the data is public domain and they don’t have to pay for it, but also cannot charge others for it, or the data is private and they must pay the author a fair share.

The exchange is you getting to be on reddit.

Yeah, probably not. When you signed up and agreed to their ToS, they don’t “own” your content, but you grant them a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use it without compensation.

From their ToS:

Any ideas, suggestions, and feedback about Reddit or our Services that you provide to us are entirely voluntary, and you agree that Reddit may use such ideas, suggestions, and feedback without compensation or obligation to you

Source: A pretty good post on r/HFY, though it is on Reddit, so don’t click it if you don’t want to :P

But how many TOS have been shot down because they overreach? I don’t know. You’re probably right. It’s still fun to imagine.

There is legal standing, IMO. You can’t take something without consideration, and access to the website was granted free of charge while the data collection was squirrelled away in the fine print. That isn’t a lawful contract, the fine print is for technicalities about the main transaction of X in exchange for Y. You can’t say “we’ll give you X for free!!!” then sneak into the fine print “(((you also give us Y for free)))”. The structure is clearly deceptive in a manner that is designed to prevent a fair assessment of the value being exchanged.

Insurers have to provide a “key facts page” where they summarise in plain English what you’re paying for. The fine print gives the detail, but the front page is still “we give you X in exchange for Y”.

You can’t build a car without paying for the nuts and bolts. Tech companies have placed themselves amongst the wealthiest businesses in the world without paying for the nuts and bolts we provide.

Hell, even Microsoft is in on it now, even though you pay for Windows and Office 365!

Next question then: how do we mobilize into a class action against Reddit and google and Microsoft and whomever else?

… and that’s why Wikipedia is non-profit.

Seeing human (even shitpost) achievements get monetized (in the most sucky manner) one by one is sad af.

Reddit says it's made $203M so far licensing its data(techcrunch.com)
