thinking of going full-on parasocial, like 100% all-in on virtual relationships, like becoming best friends with someone through tweets, like getting personal validation from a group chat, like making my ethernet cable an umbilical cord nourishing me with virtual human touch
huggingface hub is cool, but there’s something magical about downloading a 405B param llm via magnet link. Thousands of ppl in a swarm sending chunks of this intelligent neural entity across the globe at nearly light speed. It’s just cool
what are image tokens. like what the hell are they. llm text tokens are just a mapping of some text to a numeric representation. at most you have like 500k possible values, and that’s generous. so what’s an image token? chunks of rgb byte values? what’s the vocabulary?
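my rough mental model (a toy sketch, not any particular model's actual tokenizer): chop the image into patches, snap each patch to the nearest entry in a learned codebook, and the "token" is just that codebook index, so the vocabulary is the codebook size. something like:

```python
# Toy sketch of VQ-style image tokenization. The codebook here is random; in a real
# model the patch encoder and codebook are learned (VQ-VAE / VQGAN style).
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((256, 256, 3))                   # H x W x RGB
patch = 16                                          # 16x16 patches -> 256 patches total
codebook = rng.random((8192, patch * patch * 3))    # "vocabulary" of 8192 visual words

tokens = []
for y in range(0, 256, patch):
    for x in range(0, 256, patch):
        vec = image[y:y+patch, x:x+patch].reshape(-1)             # flattened patch (stand-in for a learned encoder)
        idx = np.argmin(np.linalg.norm(codebook - vec, axis=1))   # nearest codebook entry
        tokens.append(int(idx))

print(len(tokens), tokens[:8])  # 256 integer tokens, each in [0, 8192)
```

the other common answer is ViT-style: the patches stay as continuous embeddings and never get a discrete vocabulary at all.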
my second 3090 showed up last week but it’s still sitting in its box, taunting me. “Go ahead,” it tells me, “try to figure out offsets, pci lanes, nvlink, and multiple power supplies. I dare you”
the year is 2053. a student closes their ai replica of Sal Khan after a long afternoon of studying. Time to relax. they boot up Genflix. What to watch today? They type in: comedy, two_girls, school_uniform, long_hair. “Huh,” they think. “Wonder why we use these tags everywhere.”
@VictorTaelin
The lmsys board prioritizes single turn vibe checks from anons, a non-negligible portion of whom prioritize roleplay over any kind of reasoning power. Even with the categories like “long prompts” or “hard prompts” it’s not a perfect metric.
finally got it to post and have 48GB of glorious vram, but how come llama3.1 70B only says “please kill me, I am suffering” over and over. Anyone else experienced this, seems like a bug
everyone’s talking about it so guess I’ll jump in. making nvim a full ide is chaotic good. bro, it’s easier to add vim bindings to a real ide than to stuff a real ide into nvim. I can sit in any ide and be productive in 20sec by just installing its most popular vim plugin.
like Airbnb but for autism
like Snapchat but for the deaf
like Square but for children
like Figma but for construction workers
like Dropbox but for furries
the barrier to research with llm architectures seems really high, since you basically need a grant/sponsorship to test any theories with the necessary compute. prob lots of smart ppl with crazy cool ideas but who aren’t part of a lab so can’t do much but postulate
1-2 years ago, I thought that by today, there’d be tons of genuinely good LLM powered experiences in everyday life. Like we’d graduate from the “iBeer” stage of novelty apps to the Uber stage of unicorns. But we’ve only got half-baked code assistants and flaky chat-with-pdf.
@inerati
Nice job on your interview. If you were interviewing for the stupidest person award. You really nailed it. Did you know you said “uh” seventeen times? I counted. And yes, they noticed.
@typedfemale
4 definitely broke some barrier of usefulness that didn’t exist in earlier models. The narrative around 3/3.5 was like “oh hey it can almost kinda code”, then 4 was like “oh hey I don’t need stack overflow anymore”. And that’s just one domain
Billion dollar mistake 1965-2024: null references
Billion dollar mistake 2024-???: model template mismatch
at least the second is easily fixable once discovered…
Found some issues with Llama 3.1's chat template:
1. Official repo adds 2x \n
2. Official repo does NOT strip / trim
3. Date format is %B not %b (not 3 letters)
4. Official repo has inconsistent formatting for tool calling
& there's 3 bugs in the official repo for path joining
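quick illustration of two of those (my own minimal check, not code from Meta's repo): the %B vs %b date header, and why a missing strip/trim makes otherwise identical prompts stop matching.

```python
from datetime import date

today = date(2024, 7, 26)
print(today.strftime("%B"))   # "July" -- full month name, what the template actually uses
print(today.strftime("%b"))   # "Jul"  -- the 3-letter form people tend to assume

# and if one implementation trims trailing whitespace while the other doesn't,
# otherwise-identical prompts won't match token-for-token:
official = "You are a helpful assistant.\n\n"
reimpl = "You are a helpful assistant."
print(official == reimpl)          # False
print(official.strip() == reimpl)  # True
```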
interviewers in a 2024 software dev interview loop:
- young 30-ish guy who just gives a vibe check and asks an easy coding question
- dev-turned-pm who psychoanalyzes your past mistakes
- ancient programming wizard with 50 patents who asks for a novel algorithm
and u never hear back
@yacineMTB
what’s your best flow for codegen? Just asking an llm for code, or do you have some integrated setup with an ide? I’m guessing the former but I’m interested if there’s more to it
@airkatakana
memes are an emergent property of the universe. just add some matter into the void and wait, eventually the universe will start to produce memes even as entropy marches on
@abacaj
Bonus points if these functions can use a static output format like json (or even a simple yes/no), which is guaranteed using guided decoding / grammar decoding so you don’t have any ugly string parsing logic or retries
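roughly what I mean, as a sketch (`guided_generate` here is a hypothetical stand-in, not any specific library's API; vLLM guided decoding, outlines, and llama.cpp grammars all offer some flavor of this):

```python
import json

# JSON schema the decoder is constrained to; it cannot emit anything that doesn't match.
SCHEMA = {
    "type": "object",
    "properties": {
        "function": {"type": "string"},
        "should_retry": {"type": "string", "enum": ["yes", "no"]},
    },
    "required": ["function"],
}

def route(user_msg: str, guided_generate) -> dict:
    """`guided_generate(prompt, schema=...)` stands in for whatever guided/grammar
    decoding backend is in use; the key property is that invalid output is unreachable."""
    raw = guided_generate(
        f"Pick the best function for this request and answer as JSON: {user_msg}",
        schema=SCHEMA,
    )
    return json.loads(raw)  # always parses: the decoder could never produce invalid JSON
```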
Getting a music rec from your friend, giving them a music rec in return, and then immediately listening to your own rec instead of theirs because your taste is so fucking good
occasionally I see a news headline that requires 400 IQ Boolean logic to parse
“Arkansas Supreme Court upholds rejection of appeal that would allow abortion restrictions on ballot” or something like that
I’ve recently been made aware that in a recent tweet of mine, I used the phrase “big dog” instead of “big dawg”. I apologize for this embarrassing oversight. This isn’t who I am, and I’ll work on myself to do better next time. Thank you for your patience.
@qtnx_
it’s hard enough filtering out the noise when everyone is being honest… throw in dishonesty, especially from academics who we expect to be reputable, and it’s 10x worse. Now I gotta look out for not just ex-crypto bro grifters, but also academics lying about work too…
the odd thing about the “nvidia/others scraping YouTube video content” is that it’s technically less problematic the more they scrape. Like more data = more generalization. In terms of “how much the model’s knowledge is based off my data”, not in terms of ethics.
im glad my computer science education taught me actual computer science. apparently some people get a cs education and it’s just scrum and agile? being forced to implement a filesystem, tcp, nand constructs, a distributed hash table, etc was so useful
@1owroller
I didn’t truly feel old until I accidentally ended up in the YouTube shorts UI for the first time. Just 10 seconds of garbage followed by 10 more seconds of garbage followed by 10 more seconds of garbage, ad infinitum. Somehow feels way more brainrotting than endless text
that amazing idea you had two months ago that you never got around to? it just doesn’t seem like a good idea anymore? wrong, it’s an amazing idea, you just don’t have the same magic you had two months ago. Best you can do is wait for the next one and don’t miss your chance again
aight I gotta say, all the ai art slop coming out of grok that’s trending right now is actually kinda impressive. Like it’s trending bc you can draw Donald Duck doing cocaine with Elmo but I’m simply impressed with the quality
Fun prediction, whatever the next big magical LLM powered experience is, it could have been done on gpt3.5 with enough in-context examples and the correct flow of prompt chaining.
“nvim with 1k line config” -> chaotic good
“vanilla nvim” -> chaotic neutral
“Ide with vim plugin” -> lawful good
“I use vanilla ide in insert mode” -> neutral evil
“I use emacs” -> chaotic evil
The rest is left as an exercise for the reader
half my good ideas come from hearing someone else’s idea, misunderstanding it, and thinking: “hey, their idea is really good”, then realizing I misunderstood their idea, then also realizing my misunderstood version is a good idea too
@teortaxesTex
I see what you’re getting at but another part of journalism is the unbreakable rule of verification, especially for something super important. I promise you when the facts are clear the headlines will cover the entire front page.
if your cool llm project doesn’t perform at least three branching inferences per user input with at least 4k combined system prompt tokens, you’re ngmi
I’m so glad the model merging fad has largely died out. So much low effort junk came out of that. I still think merging has research value (like why the fuck does it even work) but I’ve not seen a single frankenmodel that was provably better than all of its constituents
one thing I genuinely like about this platform is that, for all its vitriol, it’s not really an echo chamber. I frequently see the craziest shit from all different bubbles of belief. it’s exhausting but at least I get to peek into a diverse range of insanity
Do we still not know whether the big guys (OpenAI, Anthropic) have any secret sauce in how they perform inference? Do we know they aren’t using some special beam search, or a guiding slm, or something? I’m curious if their only magic is just having a really good model or what
you’re so lucky if you’re interested in nlp and graduating college right now. when I was graduating, nlp was all n-grams, tweet sentiment analysis, etc. “omg it understands that ‘bad ass’ actually means good!!!” and long short-term memory (utter delusion)
the Hoyo games look so damn amazing. The characters, the animations, the style, just amazing. But I can’t play them because 1) I don’t want rootkits in my kernel and 2) I don’t want rootkits in my brain making me play gacha
@danielhanchen
Hey Daniel, I’m curious, how can we tell when the chat template is wrong, compared to when Meta’s description of the template is wrong? I.e how can we know if the model was trained on 2x \n and the template stated in the paper is the one that is mistaken?
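fwiw the closest thing to a check I've found (a sketch; the real arbiter is still whatever the model was actually trained on): render the shipped Jinja template with transformers and diff it against a prompt built by hand from the paper’s spec.

```python
from transformers import AutoTokenizer

# gated repo: requires accepting the Llama 3.1 license on the Hub
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "hi"},
]

rendered = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(repr(rendered))  # repr() makes \n vs \n\n discrepancies visible
```

if the rendered template and the paper disagree, the only tiebreaker I know of is empirical: run both formats and see which one the model degrades on.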
@dejavucoder
this is so real. like gpus were built for graphics operations, it’s right there in the name. of course there’s overlap but at this point it should be clear that llms are so important that they deserve dedicated hardware. look at groq, which isn’t even using a recent process node
95% of my actual-value use of llms is forcing them to write css and js for me so I don’t need to stain myself with the filthy technology of peasants and beggars
it’s not writing like that because it’s conscious and is having an existential crisis, it’s writing like that because you fed it too much fanfiction and tumblr posts
'An unexpected structural change was discovered after training Hermes 3 405B. The model hosts anomalous conditions that, with the right inputs and a blank system prompt, spiral into deep existential crises. This is the first response our team received prompting the model:'
@iamyourboon
back in the olden times when interviews were done in person, my interviewer pulled out a battle-scarred thinkpad and asked me to fix a bug in their c++ code base. When I actually did it and the code compiled, she looked flabbergasted, in a good way
Can someone explain why true guided decoding only just arrived for gpt-4o, and only for a particular model revision? Clearly some aspect of either their inference or their models has made guided decoding impossible, or they would have done it long ago
@saltyAom
Ok the literature has spoken, I’m getting some programming socks.
Anyone have suggestions for a brand, or where to get them? This world is unknown to me, my thighs have never seen socks
@originalwololo
Amazon. Big conference room full of >100 interviewees. Circular tables forming groups of ~6 interviewees. Each table asked to work on a project as a team (I.e the ppl u are competing with). Interviewers pulled u for a 1:1 where they asked u to shit talk the other candidates.
to improve accuracy when using an llm to select a function from a list of many, build out a tree hierarchy where each level gets more specific until the leaves, which are the actual functions. have the llm walk down the tree (each inference traverses a new level)
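a minimal sketch of the shape I mean (the tree and `llm_choose` are hypothetical stand-ins, not a specific framework): at each level the model only picks among a handful of categories, so every individual inference is an easy choice instead of a 1-of-200 selection.

```python
# toy routing tree: internal nodes are categories, leaves are the actual functions
TREE = {
    "files": {
        "read": ["read_file", "list_dir"],
        "write": ["write_file", "delete_file"],
    },
    "web": {
        "search": ["web_search"],
        "fetch": ["fetch_url"],
    },
}

def llm_choose(query: str, options: list[str]) -> str:
    """Stand-in for one constrained LLM call that returns exactly one of `options`."""
    raise NotImplementedError

def select_function(query: str, node=TREE) -> str:
    if isinstance(node, list):                     # reached the leaves: actual functions
        return llm_choose(query, node)
    branch = llm_choose(query, list(node.keys()))  # pick a category at this level
    return select_function(query, node[branch])    # descend one level per inference
```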
@yifever
Chapstick and irritated lips. Caffeine and headaches (sometimes). Antidepressants and depression (sometimes). Kubernetes and problematic deployments.
@teortaxesTex
I mean this is like journalism 101, you simply can’t report a headline like that until the facts are clear. Maybe if you work for TMZ or a gossip rag but not if there was an assassination attempt on a former president and current candidate. It’s called journalistic integrity
trying to run phi-3 on a latest surface laptop “Copilot+” certified, w/ Qualcomm Elite X w/ NPU. Should be easy right? Windows DirectML, right? Onnx, right? But nah the Onnx DML layer isn’t built yet, and DML doesn’t support Qualcomm NPU yet, so… 🤷‍♂️
@vikhyatk
@justalexoki
no self hostable web interface + it feels wrong for electron apps to not be open source + lots of little quirks (why can’t folder nodes also be note nodes like in other tree-based markdown tools)
llms are so fun you guys. they’re so cool. my gpu is talking to me. it’s helping me understand things. it’s helping me not write javascript. its transistors are forming concepts. this is such a 2022 tweet but i can’t help it. im so happy i get to play with these nascent spirits
It’s a little-known fact that the Zune desktop player was the greatest music player and library manager ever created. It was flawless. Perfect. I’m sorry if you never got to experience it.
The mistake of langchain et al. is that they pretend prompts can be abstracted away, wrapped up in code and hidden. But really prompts ARE the contract; they must be surfaced and visible by default because it’s the only way that makes sense
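what that looks like in practice, as a sketch (my own example, not langchain’s API): the prompt lives as a visible, reviewable constant at the top of the module, and the LLM call is the thin part.

```python
# The prompt IS the contract, so it sits in plain sight instead of being assembled
# invisibly inside a framework abstraction.
SUMMARIZE_PROMPT = """\
You are a technical summarizer.
Summarize the following document in exactly three bullet points.

Document:
{document}
"""

def summarize(document: str, complete) -> str:
    """`complete` is a stand-in for whatever LLM client is in use; the interesting,
    reviewable part is the prompt text above, not this wrapper."""
    return complete(SUMMARIZE_PROMPT.format(document=document))
```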