I’m wondering how all those different Lemmy instances are financed. I know some rely on donations, but is that all, and is it sustainable?
Here’s what interests me: obviously on a single instance you need to scale the infrastructure as more users join. But how much extra do you need to scale to account for activity coming from federated instances?
I ask because the answer in this thread is broadly “it’s sustainable because each instance is low cost, and you can always add more instances”. But that doesn’t hold if, after the number of instances grows 100x, every existing instance faces higher costs even without gaining many users of its own, because it’s now receiving more messages from federated instances and has to download and store content from those other instances for its local users.
That’s a very interesting question and I’m not sure of the answer.
Obviously, on some level, the cost of the infrastructure scales with the number of people using it. But so does the ability to crowdfund: if there are 100x more instances, then theoretically there are 100x more potential donors to meet the cost.
One clear way to tilt the scaling in our favor would be to use instances with clear themes and purposes. Federation and storage costs are driven mostly by the number of distinct remote communities an instance subscribes to, not by how many local users read each one. So if everybody on a particular instance is interested in the same content, far less computation is wasted than on an instance where the users are interested in different topics and are therefore subscribed to a much wider variety of communities.
My intuition is that as long as the platform only hosts text and images, the costs should be manageable, especially with the improvements in computational efficiency that are likely to come as Lemmy matures. For instance, I believe there is a patch that reduces storage utilization that should be shipping with the next version (0.19).
What I’m really wondering about is which optimisations are necessary to let Lemmy scale in this way. Large social media sites have very interesting designs for dealing with the huge amount of data flowing through them, caching as much as possible close to where it will be needed. I don’t know Lemmy’s design well though, so I don’t have a good idea of how it would affect these optimisations.
To take an example I remember from reddit (actually I had to re-read about it because I didn’t really remember it…): reddit caches ordered listings of things, for example the list of posts on the homepage. The problem is that the ordering has to be re-evaluated all the time, because it can change whenever someone votes (let’s assume we’re looking at a listing that incorporates voting). To make that efficient, the reddit programmers made vote processing update not just the backing store but also the cached listing itself, rather than simply invalidating it, so the “cache” became more like a denormalised view of the backing data. The catch was that later, when the rate of votes increased, there were problems again because all of that vote processing was contending on these denormalised views. I suspect this gets more complicated with federation, because it adds a separate source of updates on top of the local ones.
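Very roughly, the trick (as I understand it) looks something like this toy Python sketch. The names and structure are invented purely to illustrate “patch the cached listing in place instead of invalidating it” — this isn’t how reddit actually implemented it:

```python
# Toy sketch of a denormalised "cached listing": instead of invalidating the
# cache whenever a vote comes in, we update the cached, pre-sorted view directly.

class CachedListing:
    def __init__(self):
        self.scores = {}        # post_id -> score (denormalised copy of the backing store)
        self.ordered = []       # post_ids kept sorted by score (the cached view)

    def add_post(self, post_id, score=0):
        self.scores[post_id] = score
        self._reposition(post_id)

    def apply_vote(self, post_id, delta):
        # Vote processing writes the score *and* patches the cached view,
        # rather than throwing the whole listing away and rebuilding it.
        self.scores[post_id] = self.scores.get(post_id, 0) + delta
        self._reposition(post_id)

    def _reposition(self, post_id):
        if post_id in self.ordered:
            self.ordered.remove(post_id)
        # Re-insert in score order (highest first); a real system would use
        # something smarter than a linear scan, this is just the idea.
        i = 0
        while i < len(self.ordered) and self.scores[self.ordered[i]] >= self.scores[post_id]:
            i += 1
        self.ordered.insert(i, post_id)

    def front_page(self, n=25):
        return self.ordered[:n]

listing = CachedListing()
listing.add_post("a", 10)
listing.add_post("b", 3)
listing.apply_vote("b", 12)   # "b" jumps above "a" without rebuilding the listing
print(listing.front_page())   # ['b', 'a']
```

The contention problem then comes from every vote wanting to write to that one shared, ordered structure, and federated votes would be yet another stream of writers hitting it.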
I would also say that you can’t expect linear scaling for donations: early adopters are going to be enthusiasts, and correspondingly more enthusiastic with their money!
You mean, say I have an instance with 100 users.
If they all (more or less) hang out in communities on my server, no problemo at all.
But if they all roam around and subscribe to thousands of active communities on other servers, my server will be underwater.
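Back-of-the-envelope (all numbers are made up, just to illustrate the asymmetry):

```python
# Toy model of inbound federation load for a small instance. The instance
# receives each post/comment/vote once per remote community it follows,
# no matter how few local users it has. All numbers are invented.

local_users = 100
activities_per_community_per_day = 50   # posts + comments + votes, guessed
activity_size_bytes = 2_000             # rough size of one federated activity

def daily_load(remote_communities):
    activities = remote_communities * activities_per_community_per_day
    return activities, activities * activity_size_bytes

# Case 1: users mostly stay on local communities, only ~50 remote ones followed.
# Case 2: users roam and subscribe to thousands of active remote communities.
for label, communities in [("homebodies", 50), ("roamers", 3_000)]:
    acts, size = daily_load(communities)
    print(f"{label}: {acts:,} activities/day, ~{size / 1e6:.0f} MB/day, "
          f"for the same {local_users} local users")
```

Same user count, roughly 60x the inbound traffic and storage churn, purely because of what those users subscribe to.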
I love thinking about stuff like this (P=NP, complexity, etc.) and I don’t see much of it discussed concerning the lemmyverse, which is IMO a shame.
I’m planning on setting up a Lemmy build so I can tinker around with it, but you know, time and stuff. I already spent a lot of time just setting up the docker version, so maybe it’s quite a job :-)