7 points

Kind of looks like the writing system of Georgian language but I’m not sure

6 points

Well, then I was wrong

2 points

I don’t think so:

(ქართული)	გამარჯობა
21 points

No, this is Glagolitic script, an alternative to Cyrillic. Mostly used in old Slavic scriptures, was later replaced by Cyrillic and Latin.

Most Slavs themselves don’t know how to read this

4 points

It’s a dead script that was not that common in the first place; in Kievan Rus’ it was so little known that it was even used as a form of encryption in the XI-XVI centuries. It is also very different from modern Cyrillic. So saying “most Slavs don’t know how to read it” is a bit of an understatement. No one knows how to read it, apart from some linguists and overzealous Witcher fans.

3 points

It was widespread in Croatia until the late Middle Ages, around the XIV-XV centuries.

“No one knows how to read it, apart from some linguists and overzealous Witcher fans.”

I could fluently read and write it in high school. Was bored.

18 points

Nah, Georgian is arcs and circles everywhere, like this: ეს ქართული დამწერლობაა.

69 points

This might be happening because of the ‘elegant’ (incredibly hacky) way openai encodes multiple languages into their models. Instead of using all character sets, they use a modulo operator on each character, to make all Unicode characters represented by a small range of values. On the back end, it somehow detects which language is being spoken, and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.

16 points

I suppose it’s conceivable that there’s a bug in converting between different representations of Unicode, but I’m not buying any of this “detected which language is being spoken” nonsense or the use of character sets. It would just use Unicode.

The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there’s soooooo many tokens. It would make no sense to make those tokens ambiguous.
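
To make the ambiguity point concrete, here is a minimal sketch of what a per-character modulo would do. The modulus of 256 is purely hypothetical, picked for illustration; nothing in this thread shows that any model works this way.

```python
# Hypothetical modulus, chosen only to illustrate the collision problem.
MODULUS = 256

def fold(ch):
    """Fold a single character's code point into the range 0..MODULUS-1."""
    return ord(ch) % MODULUS

latin = "A"            # U+0041
glagolitic = "\u2c41"  # a letter from the Glagolitic block (U+2C00-U+2C5F)

# Two different characters end up with the same folded value,
# so the original text cannot be recovered from the folded values alone.
collides = fold(latin) == fold(glagolitic)
print(latin, glagolitic, fold(latin), fold(glagolitic), collides)
```

Any scheme like this loses information, which is the reason it would make the model's inputs ambiguous.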

7 points

I completely agree that it’s a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 & GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here’s the code for GPT-2: https://github.com/openai/gpt-2/blob/master/src/encoder.py. I did apparently make a mistake, as the vocab reduction is done through a LUT instead of a simple mod.
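
For anyone who doesn't want to open the link, here is a rough sketch of the kind of byte-to-Unicode lookup table that file builds. It is written from memory and the variable names are my own, so treat it as an approximation rather than the exact upstream code. The key property is that the mapping is one-to-one over all 256 byte values, so nothing is lost.

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a distinct printable character."""
    # Bytes that are already printable, non-whitespace characters keep their value.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Control, whitespace and other awkward bytes are moved above U+00FF.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}

table = bytes_to_unicode()
inverse = {ch: b for b, ch in table.items()}

text = "\u2c33\u2c3e\u2c30"  # three Glagolitic letters, used only as sample input
raw = text.encode("utf-8")   # each of these letters takes three bytes in UTF-8
mapped = "".join(table[b] for b in raw)
restored = bytes(inverse[ch] for ch in mapped).decode("utf-8")
print(len(raw), mapped, restored == text)
```

Because the table is reversible, the original text always comes back out, unlike with a modulo.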

32 points

Do you have a source for that? Seems like an internal detail a corpo wouldn’t publish

20 points

Can’t find the exact source (I’m on mobile right now), but the code for the GPT-2 encoder uses a UTF-8 to Unicode lookup table to shrink the vocab size: https://github.com/openai/gpt-2/blob/master/src/encoder.py

3 points

Seriously? Python for massive amounts of data? It’s a nice scripting language, but it’s excruciatingly slow

36 points

It looks so badass. I could have been using that script now because I’m Ukrainian, but instead I have Cyrillic script, which is so boring.

5 points

Rebel against Russian imperialism, return to Glagolitic

2 points

Cyrillic is literally Greek + Glagolitic, and it was partly a diplomatic creation of the Eastern Roman Empire (aka the Byzantine Empire), in order to bring the Slavs culturally closer to them.

Russians have nothing to do with it, other than claiming they are the continuation of the Eastern Roman Empire, something which is kinda laughable, but whatever, don’t let your dreams be dreams.

4 points

It’s not Russian. If my Bulgarian friend is right, then it was created by a Bulgarian guy.

4 points

There is no single person responsible for Cyrillic script. It is mostly believed to have been created by the scholars of the Preslav Literary School, which was indeed in Bulgaria, by mixing and changing the Greek and Glagolitic scripts. After a while, Peter the Great changed it a lot. And then Stalin stomped out almost all the deviations in the usage of the script.

The last part is mostly why it is considered Russian. A lot of languages suffered because of Moscow just forcing them to use the version of Cyrillic that Russians were using.

243 points

The thing that I find the most funny about this post, is the fact that you call this Italian

210 points

how am i supposed to know how italians speak. i’ve never seen one

5 points

It’s a me, Mario!

47 points

From my experience, they speak mostly with their hands

14 points

🫰🤙🫵👌✊🫳🫸🤲🤌

23 points

They’re not real, but they can hurt you.

4 points

like reverse vampires ?

5 points

Ne sei sicuro?

2 points

Typical 'muricans being unable to comprehend anything besides English.

/s i don't mean to be racist

yes i was a r/2we4u user, how’d you know?

66 points

Blud could’ve chosen Runic, Egyptian, Ancient Romanian used by Vlad the Impaler, Mesopotamian or even Harappan Indic. But Italian is it.

11 points

Blud I’m gonna be fr no cap rn but wtf does blud mean I’ve been meaning to ask for months and I still don’t get it

10 points

It’s Jamaican slang for ‘friend’ or ‘brother’.

1 point

Needs more fam

125 points

Let me simplify it: proceeds to print the same expression

55 points

Typical AI behavior

Edit: and then it will gaslight you if you say the answer is the same.

18 points

Fucking hate when they do that.

You are repeating the same mistake.

I’m sorry for repeating the same mistake, here’s a new solution with corrections *proceeds to write the exact thing I already told it was wrong*

4 points

Gotta remember they were trained off of the internet. Which is to say, the largest body of people loudly professing that their opinions are fact and refusing to say otherwise.

13 points

Nope, they replaced an asterisk with an arrow!

4 points

Oh, right, now I get it!

