
ralakus

ralakus@lemmy.world
2 posts • 118 comments

If you’re using an LLM, you should constrain the output with a grammar to something like JSON, JSONL, or CSV so you can load it into scripts and validate that the generated data matches the source data. Though at that point you might as well just parse the raw data yourself. If I were you, I’d honestly use something like pandas/polars or even Excel to get it done reliably, without people bashing you for using the forbidden technology even when you can 100% confirm that the data is real and not hallucinated.
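As a rough sketch of that validation step: assuming the source data is a CSV with an `id` column and the LLM was constrained to emit JSONL with the same fields, something like this flags rows that are invalid, invented, altered, or missing. The function name, column names, and sample data are all made up for illustration, and it only uses the standard library (pandas/polars would work just as well).

```python
import csv
import io
import json


def validate_llm_rows(source_csv: str, llm_jsonl: str, key: str = "id") -> list[str]:
    """Compare JSONL produced by an LLM against the source CSV it was derived from.

    Returns a list of human-readable problems; an empty list means every
    output row matches a source row field for field.
    """
    # Index the source rows by their key column (all values are strings).
    source = {row[key]: row for row in csv.DictReader(io.StringIO(source_csv))}
    problems = []
    seen = set()

    for lineno, line in enumerate(llm_jsonl.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict) or key not in record:
            problems.append(f"line {lineno}: missing key {key!r}")
            continue

        record_id = str(record[key])
        if record_id not in source:
            # A row the source never contained: likely hallucinated.
            problems.append(f"line {lineno}: {key} {record_id} not in source")
            continue

        seen.add(record_id)
        for field, expected in source[record_id].items():
            actual = record.get(field)
            if str(actual) != expected:
                problems.append(
                    f"line {lineno}: field {field!r} is {actual!r}, expected {expected!r}"
                )

    # Source rows the model silently dropped.
    for missing in sorted(set(source) - seen):
        problems.append(f"source {key} {missing} missing from output")
    return problems


# Hypothetical sample data for illustration only.
SOURCE_CSV = "id,name,amount\n1,Alice,10\n2,Bob,20\n"
GOOD_JSONL = (
    '{"id": "1", "name": "Alice", "amount": "10"}\n'
    '{"id": "2", "name": "Bob", "amount": 20}\n'
)
BAD_JSONL = (
    '{"id": "1", "name": "Alice", "amount": "10"}\n'
    '{"id": "3", "name": "Carol", "amount": "5"}\n'
)

good_problems = validate_llm_rows(SOURCE_CSV, GOOD_JSONL)
bad_problems = validate_llm_rows(SOURCE_CSV, BAD_JSONL)
```

With the sample data, `good_problems` comes back empty, while `bad_problems` reports the invented row and the dropped one.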

I also wouldn’t use any cloud LLM solution like OpenAI, Gemini, Grok, etc., since those can change, are really hard to validate, and give you little to no control over the model. I’d recommend running an open-weight model like Mistral Nemo 2407 Instruct locally with llama.cpp or vLLM, since the entire setup will not change unless you manually go in and change something. We use a custom finetuned version of Mixtral 8x7B Instruct at work in a research setting and it works very well for our purposes (translation and summarization) despite what critics think.

TL;DR: Use pandas/polars if you want 100% reliable results (human error not accounted for). LLMs require a lot of work to get reliable output from.

Edit: There’s a lot of misunderstanding about LLMs. You’re not supposed to use the bare LLM for any tasks except extremely basic ones that could be done better by hand. You need to build a system around it for your specific purpose. Using a raw LLM without a Retrieval Augmented Generation (RAG) system and complaining about hallucinations is like using the bare-ass Linux kernel and complaining that you can’t use it as a desktop OS. Of course an LLM will hallucinate like crazy if you give it no data. If someone told you to write a 3-page paper on the eating habits of 14th century monarchs in Europe and locked you in a room with absolutely nothing except 3 sheets of paper and a pencil, you’d probably write something not completely accurate. However, if you got access to the internet and a few databases, you could write something really good and accurate. LLMs are exceptionally good at summarization and translation. You have to give them data to work with first.
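To make the "give them data first" point concrete, here is a toy sketch of the retrieval half of a RAG system. It assumes a simple keyword-overlap score as a stand-in for the embedding search a real setup would use, and it stops at building the prompt; there is no model call. All names and documents are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query.

    Keyword overlap is only a placeholder for real vector similarity.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Put the retrieved documents in front of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base for illustration only.
DOCUMENTS = [
    "Medieval monarchs ate venison and swan at feasts.",
    "The Linux kernel schedules processes.",
    "Monarchs in the 14th century drank spiced wine.",
]
QUESTION = "What did 14th century monarchs eat?"

top_documents = retrieve(QUESTION, DOCUMENTS)
prompt = build_prompt(QUESTION, DOCUMENTS)
```

The prompt that reaches the model now contains the two monarch-related notes and leaves out the unrelated one, so the model summarizes supplied text instead of guessing.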


Women are very nice to see you soon as you scramble around in the dark hoping to pull the way out of work and get a little more of your time to think and there’s a lot more non nazis in your heart

I don’t know what this says about me lol


In small datasets, the speed difference is minimal, but once you get to large datasets with hundreds of thousands to millions of entries, sorting algorithms do make quite a difference. For example, say you’re a large bank with millions of clients, and you want to get a list of the people with the most money in an account. Depending on the sorting algorithm used, the processing time could range from seconds to days. That’s also only one operation; there’s so much other useful information that could be derived from a database like that using sorting.
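A small sketch of the bank example, assuming made-up random balances: a hand-written quadratic insertion sort, the built-in sort, and `heapq.nlargest` all give the same top 10, but the rough step counts diverge fast as the dataset grows. The step estimates are back-of-the-envelope figures, not measurements.

```python
import heapq
import math
import random


def insertion_sort_desc(values: list[int]) -> list[int]:
    """Return a descending copy using insertion sort, which is O(n^2)."""
    out = list(values)
    for i in range(1, len(out)):
        item = out[i]
        j = i - 1
        # Shift smaller elements right until item's slot is found.
        while j >= 0 and out[j] < item:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = item
    return out


def estimated_steps(n: int) -> dict[str, int]:
    """Rough operation counts for a quadratic sort vs. an n*log2(n) sort."""
    return {"quadratic": n * n, "n_log_n": round(n * math.log2(n))}


random.seed(0)  # fixed seed so the made-up balances are reproducible
balances = [random.randint(0, 1_000_000) for _ in range(2_000)]

top_slow = insertion_sort_desc(balances)[:10]    # O(n^2)
top_fast = sorted(balances, reverse=True)[:10]   # O(n log n)
top_heap = heapq.nlargest(10, balances)          # O(n log k), no full sort

steps_for_ten_million = estimated_steps(10_000_000)
```

For ten million accounts the quadratic estimate is 10^14 steps against roughly 2.3 × 10^8 for an n log n sort, which is where the gap between seconds and days comes from.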


I like how you sent a screenshot of a failed attempt


So what you’re saying is we destroy the unclean power sources


I’m pretty sure it’s because of the use of absurd amounts of high fructose corn syrup. There’s 39g (can’t confirm, I got it from Google) of sugar in a 12oz (355ml) can. US soda is pretty much just carbonated high fructose corn syrup water with a bit of flavoring. There are probably other significant differences too, since the US has only the bare minimum of food safety laws.


I couldn’t paste the whole article since it’s too long for a single post, so I pulled all the main points word for word from the article, without the paragraphs explaining them. You can view the full article with the noscript and ublock extensions on Firefox to bypass the paywall.

  • He will indict Biden and his other political enemies
  • He will round up, intern, and deport undocumented immigrants
  • He will send the military to the border
  • He will invade Mexico
  • He will round up the homeless and send the National Guard into cities to fight crime
  • He will ban abortion nationwide
  • He will bring back the death penalty in a big way
  • He will make stuff more expensive by taxing all imported goods
  • He will reevaluate America’s participation in NATO
  • He will roll back all of Biden’s climate progress and reinvest in fossil fuels
  • He will construct “freedom cities” filled with flying cars
  • He will do what he can to flood the nation with guns
  • He will torch the First Amendment by going after non-MAGA media
  • He will legally delegitimize trans Americans
  • He will pardon the Jan. 6 rioters
  • He will gut the federal government and take unprecedented control of what’s left