I do image generation for RPGs, so AZovya’s RPG v3 model is easily my favorite. It handles a wide range of styles very well and understands a lot of RPG-specific tokens. I’m really hoping they update it for SDXL, because all of the SDXL models I’ve seen so far are disappointing compared to what’s available with SD 1.5.
I don’t have an answer for LLMs, but I’m curious what others will reply with. Aren’t there only like… 3 or 4 models in common use for LLMs? I’m used to having hundreds to pick from with Stable Diffusion; I don’t think I understand how LLMs are different.
There are only a few popular LLMs, plus a few more if you count variants such as “uncensored” fine-tunes. Most of the others either don’t perform well or don’t differ much from the popular ones.
I would think the difference comes down to two things:
- LLMs require more effort in curating the dataset for training. Whereas a Stable Diffusion model can be trained by grabbing a bunch of pictures of a particular subject or style and throwing them in a directory, an LLM requires careful gathering and reformatting of text. If you want an LLM to write dialog for a particular character, for example, you need to find or write a lot of existing dialog for that character and reshape it into training examples (see the first sketch below), which is generally harder than just searching for images on the internet.
- LLMs are already more versatile. Most of the popular LLMs will already write dialog for a particular character (or at least attempt to) just from a description of the character and maybe a short snippet of sample dialog, so fine-tuning doesn’t buy you much there. If you want the LLM to write in a specific style, such as Old English, it’s usually enough to instruct it to do so and perhaps prime the conversation with a sentence or two in that style (see the second sketch below).
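To make the first point concrete, here’s roughly what that curation step looks like in practice. This is just a minimal sketch: the transcript file name, the character name, and the prompt/response JSONL shape are all placeholders I made up, not any particular trainer’s required format.

```python
import json

# Hypothetical raw transcript: alternating "Speaker: line" entries,
# scraped or hand-collected. File name and format are assumptions.
RAW_TRANSCRIPT = "transcripts.txt"
CHARACTER = "Varis"  # the character whose voice we want to capture

pairs = []
prev_speaker, prev_line = None, None

with open(RAW_TRANSCRIPT, encoding="utf-8") as f:
    for raw in f:
        raw = raw.strip()
        if ":" not in raw:
            continue  # skip blank lines, stage directions, etc.
        speaker, line = (part.strip() for part in raw.split(":", 1))
        # Keep only exchanges where someone speaks TO our character and
        # the character replies; those become prompt/response pairs.
        if speaker == CHARACTER and prev_speaker and prev_speaker != CHARACTER:
            pairs.append({"prompt": prev_line, "response": line})
        prev_speaker, prev_line = speaker, line

# One JSON object per line -- a common fine-tuning dataset shape,
# though the exact keys depend on whichever training script you use.
with open("varis_dialog.jsonl", "w", encoding="utf-8") as out:
    for pair in pairs:
        out.write(json.dumps(pair) + "\n")

print(f"Collected {len(pairs)} prompt/response pairs")
```

And even this assumes you already have clean transcripts; for most characters, collecting that text is the hard part.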
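For the second point, “priming the conversation” is really just putting the description and a sample exchange into the prompt. Here’s a sketch using the chat-message shape most backends accept; the character details are invented for illustration, and you’d swap in whatever client and model you actually run:

```python
# Build a chat-style prompt that primes the model with a character
# description and one sample exchange -- no fine-tuning involved.
# The character and sample dialog here are made up for illustration.
messages = [
    {
        "role": "system",
        "content": (
            "You are Varis, a weary dwarven innkeeper. Gruff but kind, "
            "speaks in short sentences, never uses modern slang."
        ),
    },
    # A one-shot example to anchor the voice and style.
    {"role": "user", "content": "Got any rooms free tonight?"},
    {"role": "assistant", "content": "One left. Top of the stairs. Mind the third step."},
    # The actual request.
    {"role": "user", "content": "What's the word around town lately?"},
]

# Whatever backend you use (llama.cpp, an OpenAI-compatible server, etc.)
# takes a list like this; with the openai client pointed at a local
# server it would be roughly:
#   client.chat.completions.create(model="local-model", messages=messages)
```

In my experience that one sample line does more for staying in character than the description alone, which is exactly why fine-tuning feels unnecessary for this use case.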