Have you tried LocalGPT, PrivateGPT, or other similar alternatives to ChatGPT?

posted 10 months ago

Moshpirit@lemmy.world

selfhosted@lemmy.world

33 comments

I’m interested in hosting something like this, and I’d like to know experiences regarding this topic.

My main reasons for hosting this are privacy and being able to integrate my own PKM data (mainly markdown files).

Feel free to recommend me videos, articles, other Lemmy communities, etc.


woodgen@lemm.ee

28 points

10 months ago

I tried a bunch, but the current state of the art is text-generation-webui, which can load multiple models and has a workflow similar to stable-diffusion-webui.

https://github.com/oobabooga/text-generation-webui


CumBroth@discuss.tchncs.de

2 points

10 months ago

I’ve tried both this and https://github.com/jmorganca/ollama. I liked the latter a lot more; just can’t remember why.

GUI for ollama is a separate project: https://github.com/ollama-webui/ollama-webui


The Cooking Senpai@lemme.discus.sh

16 points

10 months ago

Absolutely yes. You can try GPT4All, which works on any decent CPU (the minimum I managed to run it on is a 2018 6-core 2.0 GHz ARM64 processor) and has a lot of built-in models. You can also import uncensored models (like TheBloke's on Hugging Face).

I also tried AutoGPT some time ago, which is quite complex and cool.


CubitOom@infosec.pub

12 points

10 months ago

Check out ollama.

There are a lot of models you can pull from the official library.

Using ollama, you can also run external GGUF models from places like Hugging Face if you point a modelfile at them with something as simple as:

echo "FROM ~/Documents/ollama/models/$model_filepath" >| ~/Documents/ollama/modelfiles/$model_name.modelfile
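For anyone who would rather script this, here is a minimal Python sketch of the same step. It only writes a one-line modelfile containing a `FROM` line; the function name `write_modelfile` and the example paths are made up for illustration and are not part of ollama.

```python
from pathlib import Path


def write_modelfile(gguf_path: str, modelfile_path: str) -> Path:
    """Write a one-line modelfile whose FROM line points at a local GGUF file.

    Both arguments are hypothetical example paths chosen by the caller.
    """
    target = Path(modelfile_path).expanduser()
    # Create the modelfile directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Expand ~ here so the modelfile contains a full path.
    source = Path(gguf_path).expanduser()
    target.write_text(f"FROM {source}\n", encoding="utf-8")
    return target
```

After that, something like `ollama create my-model -f <path to the modelfile>` followed by `ollama run my-model` should register and start the model, where `my-model` is a placeholder name.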


The Assman@sh.itjust.works

11 points

10 months ago

Deleted by creator


SoleInvictus@lemmy.world

6 points

10 months ago

It’s good for me because I’m piss poor at programming. In my defense, I’m not a programmer or even programmer adjacent. I do see how it wouldn’t be useful to a pro. It has also occasionally given me garbage advice that an expert would spot right away, while I had to figure out on my own that it was ‘hallucinating’ again. There’s nothing better for learning than troubleshooting, though!


bogo@sh.itjust.works

3 points

10 months ago

I can absolutely see it getting useful for a pro. It’s already a better version of IDE templates. If you have to write boilerplate code this can already do that. It’s a huge time saver for the things you’d have to go look up to remember how to do and piece together yourself.

Example: today I wanted a quick way to serve my current working directory over HTTP so I could do some quick web work. I asked ChatGPT to write me a bash function I could stick in my profile to do this, and I told it to pick a random unused port. That would have taken me much longer had I gone to look up how to do all of that. The only hint I gave it was to use the Python builtin module for serving HTTP.
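This is not the function ChatGPT produced, just a rough Python sketch of the same idea using the builtin `http.server` module. Binding to port 0 lets the operating system pick an unused port. The names `make_server` and `serve` are hypothetical.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str = ".", host: str = "127.0.0.1") -> ThreadingHTTPServer:
    """Create a static-file server for `directory` on an OS-chosen free port."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the operating system to assign any unused port.
    return ThreadingHTTPServer((host, 0), handler)


def serve(directory: str = ".") -> None:
    """Serve `directory` until interrupted with Ctrl+C."""
    server = make_server(directory)
    host, port = server.server_address[:2]
    print(f"Serving {directory!r} at http://{host}:{port}/")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling `serve()` from the directory you want to share prints the URL to open. It listens on 127.0.0.1 only, so other machines on the network cannot reach it unless you pass a different host to `make_server`.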


The Assman@sh.itjust.works

1 point

10 months ago

Deleted by creator


scarilog@lemmy.world

2 points

10 months ago

There’s a project called Tabby that you can host as a server on a machine that has a GPU, and it has a VSCode extension that connects to the server.

The default model is called starcoder, and it’s the small version, 1B parameters. The downside is that it’s not super smart (but still an improvement over built-in tools). On the other hand, since it’s such a small model, I’m getting sub-second processing times.


The Assman@sh.itjust.works

1 point

10 months ago

Deleted by creator


exu@feditown.com

1 point

10 months ago

I’ve found it’s pretty good for translating between steps so to speak.

I converted some Bash to Python relatively quickly by giving it snippets and fixing errors as it made them.

I also had success generating an Ansible playbook based on my own previously written install instructions for SillyTavern and llama.cpp.

I could do both of those tasks myself, but that would be more difficult than starting from a mostly correct translation and fixing some errors.


amzd@kbin.social

0 points

10 months ago

You should make sure you are running a model that fits in your VRAM. When it does, for me it runs faster than any online LLM I’ve tried.


sj_zero@lotide.fbxl.net

11 points

10 months ago

I’ve been using a number of different tools which I interface with my Nextcloud.

My main Nextcloud has an LLM plugin which was really easy to install: you install the plug-in, make sure Python is configured properly in your path, and then run an OCC command to download one of a few models.

https://localai.io/

I also hosted localAI, which was a little bit more involved, but the website did a decent enough job of explaining exactly all the things that you needed to do in order to get all the different types of AI model working. Besides LLMs, it also supports text to speech, speech to text, and image generation.

Two things are important. First, if your server doesn’t have a pretty advanced video card, then you’re going to be using the CPU exclusively for AI, and that’ll be pretty slow. Second, I found out very quickly that the amount of RAM you have is critical. My main server is a 4th-gen Core i5, so I put the AI software on another one of my servers, which is a 7th-gen Core i5. You would think that the latter would work a lot better, but it had half the RAM, and it basically wasn’t even able to get started.

Besides hosting AI on a server, if you have a desktop computer or gaming laptop you can run local AI models. There’s a fantastic piece of software called Faraday that works pretty well on my laptop. You can run more and more sophisticated models depending on how much memory you have.

https://youtu.be/aLy_vVLUHZk

Krita has AI dal-e support for image generation available as a plug-in. I haven’t used it yet because I only started the download last night before I went to bed, but the installation process as described in the video seems accurate and was extremely easy and mostly automated.

https://youtu.be/AU8NDSBIS1U


PipedLinkBot@feddit.rocks

1 point

10 months ago

Here are alternative Piped links:

https://piped.video/aLy_vVLUHZk

https://piped.video/AU8NDSBIS1U

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.


ReallyActuallyFrankenstein@lemmynsfw.com

0 points

10 months ago

“Second, I found out very quickly that the amount of RAM you have is critical. My main server is a 4th-gen Core i5, so I put the AI software on another one of my servers, which is a 7th-gen Core i5. You would think that the latter would work a lot better, but it had half the RAM, and it basically wasn’t even able to get started.”

Is there an amount of RAM that’s currently considered the bare minimum for CPU-only self-hosting?


exu@feditown.com

3 points

10 months ago

If you’re using llama.cpp, have a look at the GGUF models by TheBloke on Hugging Face. He puts the approximate RAM required in the readme, based on the quantisation level.

From personal experience I’d estimate 12 GB for 7B models, based on how full RAM was on a machine with 16 GB. For Mixtral, at least 32 GB.
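As a back-of-the-envelope check on why the quantisation level matters so much, here is a small Python sketch. It only counts the model weights (parameters times bits per weight, divided by 8 bits per byte) and ignores the context cache, runtime overhead, and whatever else the machine is running, so real usage will be higher. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    # billions of parameters * bits per weight / 8 bits per byte = gigabytes
    return params_billion * bits_per_weight / 8


# Compare a 7B model at a few common bit widths.
for bits in (4, 8, 16):
    size = weight_memory_gb(7, bits)
    print(f"7B model at {bits}-bit: ~{size:.1f} GB of weights")
```

By this estimate a 7B model needs about 3.5 GB for weights at 4-bit and about 14 GB at 16-bit, which is why the readme figures vary so much between quantisation levels.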


ReallyActuallyFrankenstein@lemmynsfw.com

1 point

10 months ago

Thanks, appreciate it (I’m new to local text CPU models, I know it was a stupid question).
