https://github.com/LemmyNet/lemmy/issues/3245

I posted far more details on the issue than I am putting here.

But just to bring some math in: with the current full-mesh federation model, assuming 10,000 instances, that will require nearly 50 million connections (n*(n-1)/2 = 49,995,000).

Each comment. Each vote. Each post, will have to be sent 50 million separate times.

In the proposed hub-spoke model, we can reduce that by over 99%, so that each post/vote/comment/etc. only has to be sent 10,000 times (plus n*(n-1)/2 times, where n = number of hub servers).
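A rough sketch of that arithmetic, using only the two formulas quoted above. The hub count of 10 is a made-up value for illustration, since no figure is given for it, and the function names are hypothetical.

```python
def full_mesh_links(n: int) -> int:
    """Distinct instance-to-instance links in a full mesh of n instances."""
    return n * (n - 1) // 2


def hub_spoke_sends(n: int, hubs: int) -> int:
    """The estimate above: one send per instance plus a full mesh among the hubs."""
    return n + full_mesh_links(hubs)


instances = 10_000
hubs = 10  # hypothetical hub count; the post does not give one

mesh = full_mesh_links(instances)
hub = hub_spoke_sends(instances, hubs)
reduction = 1 - hub / mesh

print(mesh)  # 49995000
print(hub)   # 10045
print(f"{reduction:.2%}")
```

Note that the first number counts links between instances, while the second counts sends per action, so the percentage only means something if those two quantities are treated as comparable.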

The current full-mesh architecture will not scale, and I predict that exponential growth will continue.

Let’s work on a solution to this problem together.

21 points

No - you said:

“Each comment. Each vote. Each post, will have to be sent 50 million separate times.”

That won’t ever happen, unless there are 50 million instances. That’s not a worst case; it’s just not a case.

There is no case in the current implementation where any one action is replicated more times than there are total instances.

And it doesn’t matter what “model” you assume: each action will have to federate to each instance eventually. That count is, at minimum, the total number of instances.
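To make that bound concrete, here is a minimal sketch of the delivery count for a single action. It assumes the model described in this thread: the author's instance sends the action to the community's host, and the host sends one copy to each subscribed instance, with no rebroadcast. The function and instance names are hypothetical, and this is not Lemmy's actual code.

```python
def deliveries_for_action(origin: str, host: str, subscribers: set[str]) -> int:
    """Count the copies sent for one comment, vote, or post."""
    inbound = 0 if origin == host else 1   # author's instance -> host instance
    fan_out = len(subscribers - {host})    # host -> each subscribed remote instance
    return inbound + fan_out


# Worst case for one community: all 10,000 instances subscribe to it.
instances = {f"instance-{i}" for i in range(10_000)}
sends = deliveries_for_action("instance-42", "instance-0", instances)

print(sends)  # 10000
```

Under these assumptions the count tops out at the number of instances, not at the number of links in the mesh.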

“Let’s say those 10,000 instances all decide to look at a community on another server. Now you have 20,000 connections.”

Looking does nothing: each instance essentially hosts its own copy of each community from the “host instance”. Only interactions (comments, likes, posts, etc.) are federated.

4 points

For fuck’s sake, dude, be collaborative, not defensive. This isn’t Reddit; I am not out to attack your karma.

If every instance hosts a community, and every other instance subscribes to every one of those communities, that would lead to a full mesh between all instances, resulting in the worst-case scenario, i.e., following the formula I provided for a full-mesh topology.

That is indeed the worst-case scenario I have provided, explained, and documented in my examples.

If my example is too hard to understand, let’s use an easier example:

Count the number of instances on https://lemmy.ml/instances

Assume every one of those instances subscribes to !asklemmy.

Now, count the number of instances on https://lemmy.world/instances

Assume every one of those instances subscribes to !lemmyworld.

Now, count the number of instances on https://beehaw.org/instances

Assume every one of those instances subscribes to !technology.

It does. not. scale.

26 points

In no way is the person you’re responding to speaking defensively. They’ve discussed the reason why your extrapolation to a full-mesh connective worst-case scenario isn’t based in the reality of how ActivityPub functions. But you don’t seem to be willing to entertain the notion that the federation of any given action never exceeds the number of instances subscribed to the community that generated it.

Even should every instance subscribe to every community on every other instance, the recipient of a federated action doesn’t turn around and rebroadcast that action back on to the network because it is not the authoritative host of that community. Therefore what this discussion is lacking is proof of where this exponential broadcast storm of federated actions comes from in your assertion.

14 points

Apologies if I came off as hostile.

I mean I get what you’re saying - I just don’t see the practical use. The centralized hub replication servers would have to basically foot a huge bill for the fediverse, and do so silently and invisibly to the end user. As it is, most instances run on goodwill or donations. A silent, invisible server is hard to gather donations for. Who would run them?

Furthermore, the topology you propose is essentially what we already have. A few large instances hold most of the largest communities, and I don’t see that changing. This brings a fairly good balance: smaller instances only have to listen for updates from a few other instances, while only the big instances do the hard work of notifying hundreds of others. They are already our “hubs”. Small instances do hardly any hard work. The one I run, for example, just listens to maybe a dozen instances send updates, and occasionally sends out an update when one of my users interacts.

I suppose I just don’t understand how this could be implemented in practice, or rather how it could be useful to do so. It would strictly enforce a sort of centralization that right now is only a natural consequence of user behavior, while seemingly only bringing theoretical benefits unlikely to be realized.

2 points

“The centralized hub replication servers would have to basically foot a huge bill for the fediverse, and do so silently and invisibly to the end user.”

One consideration: since they only have to do basic pub/sub, the load might actually be drastically lower than expected.

“Furthermore the topology you propose is essentially what we already have. A few large instances hold most of the largest communities. I don’t see that changing.”

I suppose that is a valid point. The issue, though, is that those large instances are unable to keep up with demand and load, causing lots of federation issues.

Perhaps my idea actually wouldn’t help that at all, but, using lemmy.ml as an example:

Instead of it having to send all of its updates out to every subscribed server, it can delegate that to a hub server. The hub server can run a very minimal set of instructions, with just enough intelligence to handle pub/sub.
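As a very rough sketch of that delegation idea (not an existing Lemmy or ActivityPub feature): the `Relay` class, its methods, and the subscriber count of 500 are all hypothetical, and the sketch ignores request signing, retries, and trust.

```python
class Relay:
    """Hypothetical pub/sub relay: keeps subscriber lists and forwards activities."""

    def __init__(self) -> None:
        self.subscribers: dict[str, set[str]] = {}

    def subscribe(self, community: str, instance: str) -> None:
        self.subscribers.setdefault(community, set()).add(instance)

    def publish(self, community: str, activity: dict) -> list[tuple[str, dict]]:
        """Return one (recipient, activity) delivery per subscribed instance."""
        recipients = sorted(self.subscribers.get(community, ()))
        return [(instance, activity) for instance in recipients]


relay = Relay()
for i in range(500):
    relay.subscribe("asklemmy", f"instance-{i}")

# The origin server sends a single copy to the relay...
origin_sends = 1
# ...and the relay performs the fan-out on its behalf.
deliveries = relay.publish("asklemmy", {"type": "Create", "object": "post"})
relay_sends = len(deliveries)

print(origin_sends, relay_sends)  # 1 500
```

In this sketch the total number of sends goes up by one rather than down; what changes is which machine pays for the fan-out.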

Perhaps one idea is, instead of thinking of it as a hub server, to think of it as a proxy server: you delegate your instance’s actions to the proxy to take that load off the main server.

And instead of the hubs/proxies being more centralized, perhaps it’s just an optional thing which you CAN do.

My line of thinking is about ways to reduce load on the main servers. This might be an idea that only benefits the handful of big servers.

To clarify further: I DON’T have a solution to the problem. I am only intending to establish a forum to discuss whether this is even a viable option, or perhaps to think of other ways to spread the load around.

14 points

Yes, it is a “full mesh” diagram. But for each specific “federated” action, it is a simple hub and spoke distribution. The hosting server will send the federated action to each subscribed node. The nodes don’t need to check in with each other for that specific action.
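A small sketch of that distinction, assuming the worst case discussed in this thread (every instance hosts one community and every other instance subscribes to it) and assuming recipients never rebroadcast. The function name and the size of 100 instances are made up for illustration.

```python
from itertools import combinations


def worst_case(n: int) -> tuple[int, int]:
    """Return (distinct links in the mesh, sends for one action on instance 0)."""
    instances = range(n)
    # Every pair of instances exchanges traffic at some point: a full mesh.
    links = set(combinations(instances, 2))
    # One action in the community hosted on instance 0: the host sends one copy
    # to each subscriber, and the subscribers do not forward it.
    sends = [(0, peer) for peer in instances if peer != 0]
    return len(links), len(sends)


links, sends = worst_case(100)
print(links, sends)  # 4950 99
```

The link count grows with the square of the instance count, while the cost of any one action grows linearly.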

I too believe that federation is going to have scaling issues, but not due to full mesh.

3 points

I am on board with you there.

But would you not agree that delegating and offloading those federation actions to a dedicated pool of servers would assist scalability?

That way, each instance doesn’t need to maintain all of the connections?


Technology

!technology@beehaw.org


A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

This community’s icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
