Hello internet users. I have tried GPT4All and like it, but it is very slow on my laptop. Does anyone here know of a solution I could run on my server (Debian 12, AMD CPU, Intel Arc A380 GPU) and use through a web interface? Has anyone found a good way to do this?
KoboldCpp is easy to use and fast, and I like it.
If you’re interested in more relevant Lemmy communities:
(Another option: text-generation-webui comes with several backends bundled. Maybe one of those works for you.)
From what I've seen, text-generation-webui is kind of the standard for running a model behind a web UI, but the VRAM comments here are accurate. Text LLMs require an insane amount of VRAM to keep a conversation going.
Ollama and LocalAI can both be run on a server with no GPU. Neither is a web UI by itself, though, so if you want one you'd need to point a separate web UI at them.
Ollama is a nice server base; there are lots of projects that plug in on top of it.
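For anyone wondering what "plugging in on top" of Ollama looks like in practice: front ends talk to it over its HTTP API. Here is a minimal Python sketch of that, not a full client. It assumes Ollama is listening on its default port (11434) on the same machine and that a model has already been pulled; the model name `llama3` and the helper names `build_request` and `generate` are just placeholders for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is reachable on its default port on this host.
# Change the host part if you are calling the server from another machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint.

    "llama3" is a placeholder; use whatever model you have pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3", url=OLLAMA_URL, timeout=300):
    """Send the prompt to the server and return the generated text."""
    request = build_request(prompt, model, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

With the server running, `print(generate("Why is the sky blue?"))` should print the model's answer. As far as I know Ollama only listens on localhost by default, so to reach it from your laptop you would likely need to set the `OLLAMA_HOST` environment variable on the server; check the Ollama docs for your version.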