Does Lemmy really benefit from Rust? Is code execution speed the bottleneck?

dudeami0@lemmy.dudeami.win

82 points

1 year ago

The numbers are a little higher than you mention (currently ~3.2k active users). The server isn’t very powerful either: it’s now running on a dedicated server with 6 cores/12 threads and 32 GB of RAM. Other public instances are using larger servers, such as lemmy.world running on an AMD EPYC 7502P 32-core “Rome” CPU with 128 GB of RAM, or sh.itjust.works running on 24 cores and 64 GB of RAM. Without running one of these larger instances, I cannot tell what the bottleneck is.

The federation issues I’ve heard about come down to how ActivityPub is currently implemented, and possibly the fact that all upvotes are federated individually. This means every upvote causes a federation queue to build up, and with a ton of users this piles up fast. Multiply this by all the instances an instance is connected to and the number of requests balloons (it grows with votes times connected instances). ActivityPub is the same protocol used by other federated servers, including Mastodon, which had growing pains but appears to be running large instances smoothly now.
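A rough back-of-the-envelope sketch of that fan-out, assuming (hypothetically) one outbound delivery per vote per connected instance. The function name and the numbers are made up for illustration and are not taken from Lemmy's code:

```python
def outbound_deliveries(votes: int, connected_instances: int) -> int:
    """Deliveries needed if every vote is sent individually to every connected instance."""
    return votes * connected_instances


# Hypothetical load: 10,000 votes in some time window.
for instances in (10, 100, 1000):
    print(instances, "instances ->", outbound_deliveries(10_000, instances), "deliveries")
```

Under that assumption the delivery count scales with the product of votes and connected instances, so a busy instance with many peers has to push a lot of small requests.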

Other than that, websockets seem to be a big issue, but that is being resolved in 0.18. It also appears every connected user gets all the information being federated, which is the cause of the spam of posts being prepended to the top of the feed. I wouldn’t be surprised if people are already running content-scraping and posting bots as well, which might cause a flood of new content that has to be federated and backs up the queues; this is mostly speculation though.

As it goes with development, generally you focus on feature sets first. Optimization comes once you reach a point where a code freeze makes sense; then you can work on speeding things up without new features breaking stuff. This might be an issue for new users temporarily, but this project wasn’t expecting a sudden increase in demand. It is a great way to show where the inefficiencies are and to improve performance, though. I have no doubt these will be resolved in a timely manner.

My personal node seems to use minimal resources, not even registering compared to my other services. Looking at the process manager, the postgres/lemmy backend/frontend use ~250 MB of RAM.

For now, staying off lemmy.ml and moving communities to other instances is probably best. The use case of large instances anywhere near the scale of reddit wasn’t the goal of the project until reddit users sought alternatives. We can’t expect to show up here and demand it work how we want without a little patience and contributing.

KindaABigDyl@programming.dev

64 points

1 year ago

It could be the devs just like programming in Rust. It’s a nice language lol

dcormier@beehaw.org

6 points

1 year ago

I know I do. ¯\_(ツ)_/¯

snowe@programming.dev (mod)

63 points

1 year ago

Hi, programming.dev owner here. From what I’ve been seeing, it’s a lot of memory issues. We were hitting swap, which was causing massive disk I/O. You can see what happened with the disk I/O immediately after the upgrade to more memory. I know at least one cause is being resolved in this PR.

We were also having issues with the nginx config. There were some really weird settings that I don’t think were necessary. Finally, the federation is quite busy. So if someone subscribes to events from 10 different servers, we pull in every single event, even upvotes. There’s currently a lot of work being done around this stuff.

I don’t think Rust is the problem. I think it’s just a growth thing. Every platform has growth challenges; things grow in ways that you never expect. You might have thought that it was going to be IO constrained due to the federation, but in reality it’s memory constrained, because memory is actually the most expensive thing to have on a server.

argv_minus_one@beehaw.org

12 points

1 year ago

“So if someone subscribes to events from 10 different servers, we pull in every single event, even upvotes. There’s currently a lot of work being done around this stuff.”

You mean like coalescing multiple events into a single message, or…? (I don’t know anything about ActivityPub, so apologies if this is a stupid question!)

snowe@programming.dev (mod)

11 points

1 year ago

Correct. I’ve been looking for the thread to link it for you, but haven’t had any luck. People have been discussing exactly that, though it seems like it could cause some problems with vote faking. Anyway, it is being worked on!
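For illustration only, here is a minimal sketch of what coalescing could look like: buffer individual votes, then flush them as one message per post. This is not how Lemmy or ActivityPub does it; `Vote`, `coalesce_votes`, the actor ids, and the batch shape are hypothetical names invented for this example, and it does nothing about the vote-faking concern.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str    # hypothetical actor id, e.g. "alice@a.example"
    post_id: int
    score: int    # +1 for an upvote, -1 for a downvote


def coalesce_votes(votes):
    """Group buffered votes by post so each post yields one outbound message.

    A later vote by the same voter on the same post replaces the earlier one.
    """
    latest = {}
    for v in votes:
        latest[(v.post_id, v.voter)] = v.score

    batches = defaultdict(dict)
    for (post_id, voter), score in latest.items():
        batches[post_id][voter] = score
    return dict(batches)


buffered = [
    Vote("alice@a.example", 1, 1),
    Vote("bob@b.example", 1, 1),
    Vote("alice@a.example", 1, -1),  # alice changes her vote
    Vote("carol@c.example", 2, 1),
]
messages = coalesce_votes(buffered)
print(len(buffered), "votes ->", len(messages), "messages")
```

Each batch still lists the individual voters, so a receiving server could in principle inspect votes one by one; whether that is enough to deter faked votes is an open question here.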

binwiederhier@discuss.ntfy.sh

2 points

1 year ago

Thank you for the insight. Fascinating. Also insane that every upvote causes a flood of messages being distributed…

Badabinski@kbin.social

1 point

1 year ago

Any reason to use nginx versus something like Envoy? Like, I really like nginx, but Envoy’s xDS API is really great for on-the-fly changes. I also think it might scale better and have more relevant default values. I’m just not sure if Lemmy ties into nginx in some way, or if you’re purely using it as a reverse proxy.

I’ll note that most of my Envoy experience is from using it with k8s and a custom ingress controller, where my org handles millions of requests per second (across many Envoy pods). Deploying it standalone might make it less fun.

snowe@programming.dev (mod)

1 point

1 year ago

Nginx is part of the lemmy-ansible install. I’ve never heard of Envoy though. If you’re interested in helping out you can always join the Discord. We also set up a Matrix room, but it doesn’t have as much discussion in it yet. https://app.element.io/#/room/!hmRRJzTsXkNAGIDXNu:matrix.org

24Vindustrialdildo@sh.itjust.works

41 points

1 year ago

I think the devs openly stated they aren’t backend bods and asked for help optimising the database as a priority. There’s a bit of work going on over on GitHub to sort that out, I think. Anyone reading this who can optimise PostgreSQL or contribute to a database-agnostic retool should probably speak to the devs, as I imagine you’d be welcome.

I wish I could help so much, but I doubt they’re going to retool into .NET haha.

Buttons@programming.dev (OP)

14 points

1 year ago

Which is fine. If they wanted to learn Rust and wrote inefficient code, good for them. I appreciate their efforts. Rust can certainly be beaten into shape and perform well enough in the end.

21trillionsats@infosec.pub

6 points

1 year ago

Neither Rust itself nor the way the Rust logic is implemented is the bottleneck. As with most decent web applications, the bottleneck is the database and how the decentralized protocols themselves are reconciled there.

Scaling massive amounts of records like Lemmy has been forced to is almost always IO bound at the database level even when a web service is centralized; this is much more difficult in federated architectures. This is why “NoSQL” databases have increased in popularity, but they are also not a magic bullet, as there are major ACID trade-offs one needs to consider.

jeltz@programming.dev

6 points

1 year ago

NoSQL databases are no silver bullet and the costs of ACID are usually exaggerated (plus most NoSQL databases actually implement ACID anyway). NoSQL databases and SQL databases often have similar performance characteristics since most of the technology is typically the same under the hood.

Plus, from my experience as a database consultant: databases are rarely IO bound, NoSQL or SQL, unless you have a strange workload. Most time for query execution is usually spent waiting on loads or executing CPU instructions, not waiting on disk IO.

Programming

!programming@programming.dev
