Honestly, the thing that will probably kill LLMs the hardest is someone writing a small language model that fits in JavaScript in a browser and hits comparable benchmarks.
Why bother with all those GPUs and all that energy usage if your Raspberry Pi could get comparable results?
Is this possible? I dunno. I'm not specialized in this.
But if I wanted to fuck the GenAI bubble over and had the relevant background experience? This is what I'd explore.
@soatok this is a real “who would win” meme idea. And honestly, I don’t care for AI but in general I wish there was more interest in doing things efficiently instead of just throwing more and more resources at things.
I think about it every time I see posts about the average size of a webpage, or user testing on cheaper/older mobile devices.
@soatok about a year ago, a bunch of friends were trying to do this. Various Chinese companies and universities had just released a bunch of relatively efficient models, and my friends ran them on phones and Pis with a wait of 1-5 minutes for each response. IMO, that's too long to be really competitive, but it's real close. Idk where things are now, but I'd guess it's only a matter of time until someone makes a decent model that can run entirely on the GPU of a phone nice and fast.
@soatok AI is a cancer. Killing one kind of cancer isn't gonna make much of a difference. Sure, you can kill LLMs, but that just stops text slop. It doesn't really stop video slop or audio slop.
@snow You gotta make the whole cancer impossible to ever profit from, so The Money will criminalize the whole thing.
@TommyTorty10 @soatok Chinese models are nearly there. DeepSeek R1 and Kimi K2 can both run on not much more than a Pi and get extremely decent results for the power needed.
@soatok
Not sure it's possible, but I believe that as soon as TPU access is exposed to WebAssembly, or WebGPU shaders stop being hindered by the literal garbage hardware in consumer laptops... it's very possible we'll see a decent model distilled into 1 GiB or so, roughly three quarters of a Chromium tab.
@soatok If you want it just to be able to use language, sure. But they want a vastly overfitted model that lossily compresses the volume of human writing and can spit back out obfuscated plagiarism of arbitrary parts.
@soatok ollama allows you to run models locally, and others have run AI on phones, so I wouldn't be surprised if someone has already done this as well
but currently the quality of the responses suffers. I'm excited about the future though, because hopefully in 10 years a small local model will match the quality of today's best models (Claude, ChatGPT, Gemini)
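for the curious, "run models locally" really is about this simple once Ollama is installed and serving; a rough sketch (the model tag and prompt are just placeholders, and it assumes Ollama's default local API on port 11434):

```python
# Minimal sketch: query a locally running Ollama server (default port 11434).
# Assumes something like `ollama pull llama3.2:1b` was already run; the tag is illustrative.
import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3.2:1b") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Explain what a Raspberry Pi is in one sentence."))
```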
@dalias One model per language.
Want it to generate C? Download the C model.
Want it to write bad poetry? Download the ~Vogon~ I mean English model.
@soatok Maybe if they throw linear algebra at the wall for long enough, they'll find themselves the right basis. :P
@soatok I might have something that could take a shot at it - a v2 of something I first wrote in 2008...
@soatok I should clarify: I am working on two models, one of which takes an input and tries to spit out structured data,
and another which takes structured data and outputs prose
@soatok in an assistant scenario, this allows the assistant to ascertain what the user wants, and then allows the assistant to report back with the results.
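purely as an illustration of the shape of that split (the function names and data below are made up, not the actual models), it might look something like:

```python
# Illustrative sketch only: the two-stage split described above, with made-up
# stand-in functions. The real models are whatever is actually being trained.
from dataclasses import dataclass

@dataclass
class Intent:
    action: str   # e.g. "weather_lookup"
    params: dict  # e.g. {"city": "Portland", "when": "tomorrow"}

def parse_request(user_text: str) -> Intent:
    """Model 1 (hypothetical): free-form input -> structured data."""
    # A real implementation would run a small intent/slot-filling model here.
    return Intent(action="weather_lookup", params={"city": "Portland", "when": "tomorrow"})

def render_reply(intent: Intent, result: dict) -> str:
    """Model 2 (hypothetical): structured data -> prose."""
    # A real implementation would run a small text-generation model here.
    return f"Tomorrow in {intent.params['city']}: {result['summary']}."

intent = parse_request("what's the weather like in portland tomorrow?")
result = {"summary": "light rain, high of 12 C"}  # whatever the assistant fetched
print(render_reply(intent, result))
```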
@soatok all of this will be AGPL because fuck Big Tech
@soatok Right, but that's not all they want. They want it to generate obfuscated plagiarism of poetry. They want it to generate "copyright-free" copies of arbitrary FOSS programs, songs, etc. This inherently requires the largeness of the model because the plagiarism is buried in the overfitting.
@varx @Logical_Error @soatok fwiw, during a hackathon an Ente employee made a local LLM app that runs in your browser. Haven't tried it, but seems neat
@soatok Maybe, but I wouldn't bet on that. They would try to extrapolate whatever method you used to make it run on a Raspberry Pi so that it scales up to data-center level again. If it can't run better that way because of diminishing returns or whatever, it has to run more often instead. The large energy-chugging data centers are the point, not the performance of the AI. Same as how more energy-efficient LEDs didn't lead to less power consumption but to more lamps in use.
Maybe that won’t happen here, but like I said, I’m not sure.
@muellermeier Right. This would need to be something that "satisfies" while obviating datacenters to be a death knell.
@soatok With more and more new personal compute platforms featuring an NPU, a local SLM should absolutely be the outcome to strive for. Local processing of streaming text-to-speech voices. Local uncensored image descriptions. Something useful like that, which a user might actually want a system with an NPU for. But that doesn't sell token subscriptions or gatekeep access.
There's a lot of interesting discussion in the replies.
My idea is to fight fire with fire. Not everyone has the stomach for that. That's okay. You don't gotta use those tools.
@soatok the thing that will kill LLMs the hardest is the fact that you need to charge like $1k a month to make it profitable once investors stop dumping money in, and who tf would pay that much?
@soatok i think if you were able to do this, you might have also come up with the best compression algorithm ever designed
@soatok I don't think many people would be able to tell the difference between Cleverbot & ChatGPT. I'm sure they'll get away with something as light as a Markov chain.
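for a sense of how light "as light as a Markov chain" really is, here's a toy word-level version; the tiny corpus is obviously just for illustration:

```python
# Toy word-level Markov chain text generator: a lookup table plus random.choice.
import random
from collections import defaultdict

def train(text: str) -> dict:
    """Map each word to the list of words that followed it in the training text."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain: dict, start: str, length: int = 20) -> str:
    word, out = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        out.append(word)
    return " ".join(out)

corpus = "the pi runs the model and the model runs on the pi in the closet"
print(generate(train(corpus), "the"))
```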
@lxo Killing it as in making any hope of a return on investment in all these datacenters impossible
@lxo You're completely misunderstanding what I'm suggesting, to the point that I question whether further discussion is even worthwhile
@lxo The way things have been going for years is this:
I'm focused on killing 1, which directly affects 4 and 5 in some way. I'm not offering a silver bullet for 2, 3, 6, or 7.
But if people with AI expertise were to choke out the centralization of this tech by obviating the big data center investments through "can run on a low-power device in your home", that wouldn't be without impact.
@lxo In short, I'm suggesting that people who have the expertise I lack fight fire with fire.
@soatok I've been quietly beating the drum for a while now that a lot of the anti-AI rhetoric is medium- and long-term moot because the cat. Is. Out. Of. The. Bag.
Yes, OpenAI are fascists and a problem and down with them.
Smaller models already run on a Raspberry Pi, and there's no particular reason to believe at this time that the next iteration of the raw research won't make training or cross-training them better / faster / cheaper. Most of the anti-AI arguments I see don't stay relevant when it becomes "A thing you slap on a shelf PC and have running in your own closet," and I don't think a lot of people are talking about what that world looks like.
@mark A lot of the value of slop comes in three buckets:
The actual value of language models that can run on a Raspberry Pi and produce useful results, without being one or more of those three buckets, is something that gets left off the table in these discussions because of how egregious those three are.
Whether that's an error or a strategic decision to focus on the societal-scale impact of those three is not my place to say.
@soatok Incidentally, catching up on your blog:
If that doesn’t make you feel all warm and fuzzy, remember that many industries still use FTP to transfer encrypted ZIP files back and forth in 2026.
I worked as liaison between my company and a company they'd acquired for about nine months. The acquiree's entire business niche was something that really reframed my understanding of where I set the bar for technical literacy and competency in general industry; I just flat-out thought we were training more software engineers than we were, and I was wrong.
This company's business model was that they did high-touch data massaging between:
- Big online retail search engines like Amazon, Baidu, and Google Shopping
- Manufacturers who (a) made really high-quality products in their niche and (b) had an IT team that would celebrate with a pizza party if they could successfully implement one new data pipeline end-to-end. Per quarter.
The job was literally "You FTP us your inventory in whatever format you were able to conduct enough black-arts rituals to get the internal tracking system that someone else built for you to spit out, and we'll turn it into the right formats for these online retail markets." The mechanics of the job were a lot of "Take this not-actually-compliant CSV file and build heuristics to guess which commas were column separators and which were a place where their system had left out quotation marks."
... the company grossed over $1 million annually.
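For anyone wondering what the comma-guessing heuristic above boils down to, a toy sketch along these lines (the column layout, and the assumption that surplus commas all come from one unquoted free-text field, are made up for illustration):

```python
# Toy heuristic for a non-compliant CSV: if a row has too many fields because a
# free-text column was never quoted, fold the surplus back into that column.
import csv
import io

EXPECTED_COLUMNS = ["sku", "title", "description", "price"]  # hypothetical layout
FREE_TEXT_INDEX = 2  # assume surplus commas come from the unquoted description field

def repair_row(fields, expected=len(EXPECTED_COLUMNS), free_text=FREE_TEXT_INDEX):
    """Merge surplus fields back into the free-text column when a row is too wide."""
    surplus = len(fields) - expected
    if surplus <= 0:
        return fields
    merged = ",".join(fields[free_text:free_text + surplus + 1])
    return fields[:free_text] + [merged] + fields[free_text + surplus + 1:]

raw = "A123,Widget,Small, red, unquoted widget,9.99\n"
for row in csv.reader(io.StringIO(raw)):
    print(repair_row(row))
# -> ['A123', 'Widget', 'Small, red, unquoted widget', '9.99']
```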
@mark @soatok Used to work for a company in the armpit of healthcare, but one of the products we offered accepted input in one of a few formats with published standards.
_Industry leader_ software wasn't properly compliant with any of the standards, and I dreaded every time I saw a call with a particular caller ID, because I knew who'd be on the other end by first name and how she'd be saying we didn't process a file correctly... until we pointed out syntax errors, every time.
It wasn't even consistent, and I was just on support, but it was wild to see _smaller_ vendors than a company whose name rhymes with Mick Esson get it right every time, while they simply threw their name around as if that settled it.
Also their software couldn't do public key auth for SFTP, only password, which we didn't permit for SFTP upload.
I'm still trying to figure out whether they called all the other vendors out there offering services similar to ours, or whether those vendors just rolled their eyes and eventually coded around this vendor's lack of compliance, or said "fuck it" and started just taking CSV files.