Best GPUs for self-hosted AI?

posted 11 months ago

Hello friends,

I’m pretty deep into self-hosting, especially on the home automation side. I’ve got a couple of options for self-hosted AI, but I don’t think they’ll meet my long-term goals:

Coral TPUs: I have 2x processing my Frigate data. These seem fine for that purpose, but not useful for generative AI?
Jetson Nano: As near as I can tell, nothing supports these things except DeepStack, which appears to be abandoned. Bummed these haven’t gotten broader support in the community.

I’ve got plenty of rack space and my day job is managing thousands of machines, so I’m not afraid of a more technical setup.

The used NVIDIA rack-mounted Tesla GPU servers look interesting. What are y’all using?

Requirements:

Rack mounted
Supports local LLM and GenAI
Linux-based
Works with Docker

maggio@discuss.tchncs.de

2 points

11 months ago

My friend did this with an RTX 3060 12GB, and documented the process in this Octopusx blog post.

If you have any questions we’d be happy to help

Aw3som3Guy@alien.topB

2 points

11 months ago

Don’t have direct experience with either, but:

It’s my understanding that a Coral TPU is exclusively an inference accelerator: no training or more generative applications. Also, Coral TPUs are a little bit unobtainium, with the only options I’ve seen being scalped about as much as a Pi, to basically the same result.

I think you’re overthinking the Nano a bit. I’m not sure that you’d need explicit support for the Nano, because it’s just a CUDA GPU and so it should™ just run anything CUDA, as long as the ARM CPU doesn’t trip the software up. For example, I’ve seen people running Blender renders across a cluster of Jetsons, just because, and I doubt that Blender has any explicit support for Jetsons.

If you’re coming at it from the angle that you have rack space to spare, a used Tesla / Quadro GPU would probably be better value than an original Jetson Nano, because those were, I think, 2GB/4GB with 256 Kepler-era CUDA cores. You’d almost have to go out of your way to find a worse PCIe card, and a normal PCIe card in a normal x86 server wouldn’t have the ARM software restrictions. Although, as the other commenter mentioned, cooling and power draw are a more serious consideration for a PCIe card, plus there are the risks of buying used.

Trustworthy_Fartzzz@alien.topOPB

1 point

11 months ago

I totally agree on the Coral TPUs. Great for Frigate, but not much else. I’ve got 2x of the USB ones cranking on half a dozen 4K streams, and they work wonderfully.

And I agree that in theory these Nanos should be great for all sorts of stuff, but nothing supports them. Everything I’ve seen is custom one-offs outside of DeepStack (though CodeProject.AI purports there’s someone now working on a Nano port).

Sounding like a decent gaming GPU and a 2-3U box is the ticket here.

seanpmassey@alien.topB

1 point

11 months ago

Point of pedantry: the Nano uses a Tegra X1 as its SoC. It has a Maxwell-generation GPU, not Kepler.

The new Jetson Orin Nano uses an Ampere GPU.

tehnomad@alien.topB

2 points

11 months ago

The best consumer NVIDIA card is the 3090ti because of its 24GB of memory, which lets you run bigger LLMs. I have a 3060ti 12GB which works pretty well with 7B and 13B models.
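
As a rough illustration of why memory size matters so much here, the sketch below estimates how much memory the weights alone of a 7B or 13B model would take at a few precisions. The function name `weight_gib` and the chosen bit widths are assumptions for illustration, and the figure ignores the KV cache, activations, and framework overhead, so real usage will be higher.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Hypothetical helper: ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # 16-bit is unquantized half precision; 8 and 4 are common quantized widths.
    for size in (7, 13):
        for bits in (16, 8, 4):
            print(f"{size}B @ {bits}-bit: ~{weight_gib(size, bits):.1f} GiB")
```

By this estimate a 13B model at 16-bit needs more than 24 GiB for weights alone, while the same model at 4-bit comes in well under 12 GiB, which is consistent with 7B and 13B models being workable on a 12GB card when quantized.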

s3r3ng@alien.topB

1 point

11 months ago

A 4090 is good enough for running many models. You probably want an A6000 for larger ones. But many models that don’t fit in your VRAM can be scaled down without much loss of effectiveness.
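
Assuming "scaled down" here means quantizing the weights to fewer bits, a minimal sketch of the decision might look like the following. The helper `widest_fit`, the 2 GiB headroom reserved for cache and overhead, and the candidate bit widths are all assumptions for illustration, not anything a specific runtime prescribes.

```python
from typing import Optional, Sequence


def widest_fit(
    params_billion: float,
    vram_gib: float,
    headroom_gib: float = 2.0,          # assumed reserve for KV cache and overhead
    candidates: Sequence[int] = (16, 8, 4),
) -> Optional[int]:
    """Return the widest bit width whose weights fit in VRAM, or None if none do."""
    budget_bytes = (vram_gib - headroom_gib) * 2**30
    for bits in sorted(candidates, reverse=True):
        weight_bytes = params_billion * 1e9 * bits / 8
        if weight_bytes <= budget_bytes:
            return bits
    return None


if __name__ == "__main__":
    # 24 GiB matches a 3090/4090-class card, 48 GiB an A6000-class card.
    for size, vram in ((7, 24), (13, 24), (70, 24), (70, 48)):
        print(f"{size}B on {vram} GiB -> {widest_fit(size, vram)}")
```

Under these assumptions a 13B model drops to 8-bit on a 24 GiB card, and a 70B model does not fit on 24 GiB at any of the listed widths but fits at 4-bit on 48 GiB.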

flossraptor@alien.topB

1 point

11 months ago

Nvidia is the only game in town right now. I decided on a 3090 for the time being, with the option of adding another one later. I think in two years we will have 100x better options specifically tailored for AI.

Self-Hosted Main

!main@selfhosted.forum
