Are there any Discord servers or Matrix rooms for chatting about hosting a Lemmy instance? I’ve got Lemmy running, but I think several of us are in the same boat, struggling with federation performance issues, and it might be good to have some place to chat in real time.

3 points

My server is struggling with federation. Pretty much everything I see in the logs with debug turned on is this:

2023-06-20T01:55:28.018419Z WARN Error encountered while processing the incoming HTTP request: lemmy_server::root_span_builder: Header is expired

2 points

This is exactly what I am seeing. I just tried raising federation_worker_count in the Postgres database, which someone in another thread mentioned trying, so we’ll see.

1 point

There is an nginx setting you can tune as well. I believe it was worker threads? Can’t remember the exact one and too tired to ssh into my instance to check.

1 point

This post says that the worker threads only affect outbound federation. I’m struggling with my instance not receiving anything inbound.

1 point

Raising the worker count significantly reduced those warnings in my case. If Lemmy is maxing out your CPU, though, you may need to upgrade.

1 point

I would be up for something like this. I host my own instance as well. I’m having issues updating communities, though: every time I try, I get the button spinner of death. I think that, in the end, the software is buggy and needs some time to get the bugs worked out, but it is frustrating.

1 point

The Matrix space has multiple rooms, one of them specifically for instance admins:

https://matrix.to/#/#lemmy-space:matrix.org

1 point

This is perfect. Thank you!

1 point

Honestly, a lot of the performance issues aren’t due to OUR servers but to the upstream servers.

Take beehaw.org and lemmy.world, for example: I think their servers are completely overloaded and are having trouble keeping up.

I don’t have sync issues with the smaller servers at all. Just the big ones.

I have 128G of RAM and 32 dedicated cores, with the federation worker count set to 256. There is NO shortage of resources, and my server sits more or less idle.

Since this only really impacts those larger instances, I believe the blame may lie there.

2 points

I think it is less about pointing fingers over who is to blame and more about seeing whether there are things we can do to resolve or alleviate the problem.

I recall reading somewhere that @Ruud@lemmy.world mentioned that the server has already been scaled all the way up to a fairly beefy dedicated machine, so perhaps it will soon be time to scale the service horizontally across multiple servers. If nothing else, I think a lot of value could be gained by moving the database to a separate server from the UI/backend as a first step. That shouldn’t take too much effort (other than recurring $ and a bit of admin time), even with the current Lemmy code base and deployment workflow…

1 point

Well, I do know that most of the components scale.

The UI/frontend, for example, can easily be run as multiple instances.

The API/middle tier: I don’t know if it supports horizontal scaling, though. But a beefy server can push a TON of traffic.

The database/backend, being Postgres, does support some horizontal scaling.

As for the app itself, it would scale much better if EVERYONE didn’t just flock to lemmy.ml, lemmy.world, and beehaw.org. I think that is one of the huge issues… everyone wanted to join the “big” instance.

3 points

If you look here: https://lemmy.world/comment/65982

At least in terms of specs and capacity, it doesn’t look like the server is hitting a wall.

The more I dug into things, the more I think the limitation comes from an age-old issue: if your service is expected to connect to a lot of flaky destinations, you’re not going to be in for a good time. I think the big instances’ backends are trying to send federation event messages, and a bunch of smaller federated destinations have shuttered (because they weren’t getting all the messages, so their users just signed up on the big instances to see everything). The big instances’ outgoing connections then have to wait for a timeout and/or discover that the recipient is no longer available, which results in a backed-up queue of messages to send out.
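To put rough numbers on that backlog effect, here is a toy model in Python. It is only a sketch, not how Lemmy actually schedules deliveries: the worker count, timeout, latencies and the share of dead destinations are all made-up values, and `drain_time` is a hypothetical helper.

```python
import heapq

def drain_time(deliveries, workers, timeout):
    """Seconds until the last delivery finishes with `workers` parallel senders.

    `deliveries` holds one response time (in seconds) per destination;
    None marks a destination that never answers and costs a full `timeout`.
    """
    # Each heap entry is the time at which one worker becomes free again.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    finished = 0.0
    for latency in deliveries:
        cost = timeout if latency is None else min(latency, timeout)
        start = heapq.heappop(free_at)  # earliest available worker takes the job
        end = start + cost
        heapq.heappush(free_at, end)
        finished = max(finished, end)
    return finished

# Hypothetical numbers: 1000 subscribers answering in 0.2 s, 4 workers, 10 s timeout.
healthy = [0.2] * 1000
# Same queue, but 300 of the destinations have shut down and sit at the front.
mixed = [None] * 300 + [0.2] * 700

all_up = drain_time(healthy, workers=4, timeout=10.0)
some_dead = drain_time(mixed, workers=4, timeout=10.0)
print(f"all destinations up: {all_up:.0f} s, 300 dead: {some_dead:.0f} s")
```

With these made-up values the queue drains in about 50 seconds when everyone answers, and about 785 seconds once 300 destinations have to time out, even though fewer messages are actually delivered.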

When I posted a reply to myself on lemmy.world, it took 17 seconds to reach my instance (hosted in a data centre with sub-200ms ping to lemmy.world itself, so this is not a network latency issue), which exceeds the 10-second limit defined by Lemmy. Raising that limit at the application protocol level won’t help, because as more small instances come up, they too will want to subscribe to the big hubs, which will only exacerbate the lag.
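As an illustration of why a 17-second delivery gets dropped, the check on the receiving side can be thought of along these lines. This is a Python sketch rather than Lemmy’s actual code: `is_expired` and `MAX_AGE` are hypothetical names, and the timestamps are invented. It may also be what the “Header is expired” warning earlier in the thread reflects, though that link is my assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumed value, taken from the 10-second limit mentioned above.
MAX_AGE = timedelta(seconds=10)

def is_expired(sent_at, received_at, max_age=MAX_AGE):
    """True if the message spent longer in transit than the receiver allows."""
    return received_at - sent_at > max_age

# Invented timestamp standing in for the moment the sender signed the request.
sent = datetime(2023, 6, 20, 1, 55, 11, tzinfo=timezone.utc)

on_time = is_expired(sent, sent + timedelta(seconds=3))    # arrives after 3 s
too_late = is_expired(sent, sent + timedelta(seconds=17))  # arrives after 17 s
print(on_time, too_late)
```

The point is that the receiver can be completely idle and still reject the message, because the time was lost in the sender’s queue before the request ever left.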

I think the current implementation is very naive. It can scale a bit, but it will likely be insufficient as the fediverse grows, as opposed to as an individual instance’s user count grows. That is, the bottleneck will not so much be “this can support instances of up to 100K users” but rather “now that there are 100K users, we also have 50K servers trying to federate with us”. To work around that, you’re going to need a lot more than Postgres horizontal scaling… you’d need message buses and workers that can ensure jobs (i.e. outward federation) are sent effectively.
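A full message bus is beyond a forum comment, but the core idea can be sketched with something much simpler: one queue per destination, plus exponential backoff so that a dead host stops eating worker time. Everything below is hypothetical (the class, the `dispatch` function, the delays and the host names) and it is not how Lemmy is implemented.

```python
class DestinationState:
    """Tracks failures for one remote instance and decides when to retry."""

    def __init__(self, base_delay=60.0, max_delay=86400.0):
        self.base_delay = base_delay  # first retry delay, in seconds (assumed)
        self.max_delay = max_delay    # cap on the retry delay (assumed)
        self.failures = 0
        self.retry_at = 0.0

    def can_send(self, now):
        return now >= self.retry_at

    def record_success(self):
        self.failures = 0
        self.retry_at = 0.0

    def record_failure(self, now):
        # Double the wait after every consecutive failure, up to max_delay.
        self.failures += 1
        delay = min(self.base_delay * 2 ** (self.failures - 1), self.max_delay)
        self.retry_at = now + delay


def dispatch(queues, states, send, now):
    """Deliver pending messages per host, skipping hosts that are in backoff.

    `queues` maps host -> list of messages; `send(host, message)` returns
    True on success. Messages for a failing host stay queued for later.
    """
    for host, pending in queues.items():
        state = states.setdefault(host, DestinationState())
        if not state.can_send(now):
            continue  # do not spend a timeout on a host we expect to be down
        while pending:
            if send(host, pending[0]):
                pending.pop(0)
                state.record_success()
            else:
                state.record_failure(now)
                break


attempts = []

def fake_send(host, message):
    """Stand-in for an HTTP delivery; 'gone.example' never answers."""
    attempts.append(host)
    return host != "gone.example"

queues = {"gone.example": ["a", "b"], "alive.example": ["a", "b"]}
states = {}
dispatch(queues, states, fake_send, now=0.0)   # one failed attempt, then backoff
dispatch(queues, states, fake_send, now=30.0)  # dead host is skipped entirely
```

The dead host costs one attempt and is then left alone until its retry time, while the healthy host’s queue drains normally. A real deployment would still need persistence and parallel workers, which is where a proper message bus comes in.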


Selfhosted

!selfhosted@lemmy.world

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don’t control.
