Honestly, the thing that will probably kill LLMs the hardest is someone writing a small language model that fits in JavaScript in a browser and hits comparable benchmarks.
Why bother with all those GPUs and all that energy usage if your Raspberry Pi could get comparable results?
Is this possible? I dunno. I'm not specialized in this.
But if I wanted to fuck the GenAI bubble over and had the relevant background experience? This is what I'd explore.
@soatok this is a real “who would win” meme idea. And honestly, I don’t care for AI but in general I wish there was more interest in doing things efficiently instead of just throwing more and more resources at things.
I think about it every time I see posts about the average size of a webpage, or user testing on cheaper/older mobile devices.
@soatok about a year ago, a bunch of friends were trying to do this. Various Chinese companies and universities had just released a bunch of relatively efficient models, and my friends ran them on phones and Pis with a wait of 1-5 minutes for each response. IMO, that's too long to be really competitive, but it's real close. Idk where things are now, but I'd guess it's only a matter of time until someone makes a decent model that can run entirely on the GPU of a phone nice and fast.
@soatok AI is a cancer. Killing one kind of cancer isn't gonna make much of a difference. Sure, you can kill LLMs, but that just stops text slop. It doesn't really stop video slop or audio slop.
@snow You gotta make the whole cancer impossible to ever profit from, so The Money will criminalize the whole thing.
@TommyTorty10 @soatok Chinese models are nearly there. DeepSeek R1 and Kimi K2 can both run on not much more than a Pi and get extremely decent results for the power needed.
@soatok
Not sure it's possible, but I believe that as soon as TPU access is exposed to WebAssembly, or WebGPU shaders stop being hindered by the literal garbage hardware in consumer laptops... it's very possible we'll see a decent model distilled into 1 GiB or so, roughly three quarters of a Chromium tab.
@soatok If you want it just to be able to use language, sure. But they want a vastly overfitted model that lossily compresses the volume of human writing and can spit back out obfuscated plagiarism of arbitrary parts.
@soatok ollama allows you to run models locally, and others have run AI on phones, so I wouldn't be surprised if someone has already done this as well
but currently the quality of the responses suffers. I'm excited about the future though, because hopefully in 10 years a small local model will match the quality of today's best models (Claude, ChatGPT, Gemini)
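for the curious, "run models locally" really is about this simple once Ollama is installed and serving; a rough sketch (the model tag and prompt are just placeholders, and it assumes Ollama's default local API on port 11434):

```python
# Minimal sketch: query a locally running Ollama server (default port 11434).
# Assumes something like `ollama pull llama3.2:1b` was already run; the tag is illustrative.
import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3.2:1b") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Explain what a Raspberry Pi is in one sentence."))
```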
@dalias One model per language.
Want it to generate C? Download the C model.
Want it to write bad poetry? Download the ~Vogon~ I mean English model.
@soatok Maybe if they throw linear algebra at the wall for long enough, they'll find themselves the right basis. :P
@soatok I might have something that could take a shot at it - a v2 of something I first wrote in 2008...
@soatok I should clarify: I am working on two models, one of which takes an input and tries to spit out structured data,
and another which takes structured data and outputs prose
@soatok in an assistant scenario, this allows the assistant to ascertain what the user wants, and then allows the assistant to report back with the results.
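purely as an illustration of the shape of that split (the function names and data below are made up, not the actual models), it might look something like:

```python
# Illustrative sketch only: the two-stage split described above, with made-up
# stand-in functions. The real models are whatever is actually being trained.
from dataclasses import dataclass

@dataclass
class Intent:
    action: str   # e.g. "weather_lookup"
    params: dict  # e.g. {"city": "Portland", "when": "tomorrow"}

def parse_request(user_text: str) -> Intent:
    """Model 1 (hypothetical): free-form input -> structured data."""
    # A real implementation would run a small intent/slot-filling model here.
    return Intent(action="weather_lookup", params={"city": "Portland", "when": "tomorrow"})

def render_reply(intent: Intent, result: dict) -> str:
    """Model 2 (hypothetical): structured data -> prose."""
    # A real implementation would run a small text-generation model here.
    return f"Tomorrow in {intent.params['city']}: {result['summary']}."

intent = parse_request("what's the weather like in portland tomorrow?")
result = {"summary": "light rain, high of 12 C"}  # whatever the assistant fetched
print(render_reply(intent, result))
```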
@soatok all of this will be AGPL because fuck Big Tech
@soatok Right, but that's not all they want. They want it to generate obfuscated plagiarism of poetry. They want it to generate "copyright-free" copies of arbitrary FOSS programs, songs, etc. This inherently requires the largeness of the model because the plagiarism is buried in the overfitting.
@varx @Logical_Error @soatok fwiw, during a hackathon an Ente employee made a local LLM app that runs in your browser. Haven't tried it, but seems neat
@soatok Maybe, but I wouldn't bet on that. They would try to extrapolate whatever method you used to make it run on a Raspberry Pi so that it scales up to data-center level again. If it can't run better that way because of diminishing returns or whatever, it has to run more often instead. The large energy-chugging data centers are the point, not the performance of the AI. Same as how more energy-efficient LEDs didn't lead to less power consumption but to more lamps in use.
Maybe that won’t happen here, but like I said, I’m not sure.
@muellermeier Right. This would need to be something that "satisfies" while obviating datacenters to be a death knell.
@soatok With more and more new personal compute platforms featuring an NPU, a local SLM should absolutely be the outcome to strive for. Local processing of streaming text-to-speech voices. Local uncensored image descriptions. Something useful like that, which a user might actually want a system with an NPU for. But that doesn't sell token subscriptions or gatekeep access.
There's a lot of interesting discussion in the replies.
My idea is to fight fire with fire. Not everyone has the stomach for that. That's okay. You don't gotta use those tools.
@soatok the thing that will kill LLMs the hardest is the fact that you need to charge like $1k a month to make it profitable once investors stop dumping money in, and who tf would pay that much?
@soatok i think if you were able to do this, you might have also come up with the best compression algorithm ever designed
@soatok I don't think many people would be able to tell the difference between Cleverbot & ChatGPT. I'm sure they'll get away with something as light as a Markov chain.
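for a sense of how light "as light as a Markov chain" really is, here's a toy word-level version; the tiny corpus is obviously just for illustration:

```python
# Toy word-level Markov chain text generator: a lookup table plus random.choice.
import random
from collections import defaultdict

def train(text: str) -> dict:
    """Map each word to the list of words that followed it in the training text."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain: dict, start: str, length: int = 20) -> str:
    word, out = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        out.append(word)
    return " ".join(out)

corpus = "the pi runs the model and the model runs on the pi in the closet"
print(generate(train(corpus), "the"))
```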
@lxo Killing it as in making any hope of a return on investment in all these datacenters impossible
@lxo You're completely misunderstanding what I'm suggesting, to the point that I question whether further discussion is even worthwhile
@lxo The way things have been going for years is this:
I'm focused on killing 1, which directly affects 4 and 5 in some way. I'm not offering a silver bullet for 2, 3, 6, or 7.
But if people with AI expertise were to choke out the centralization of this tech by obviating the big data center investments through "can run on a low-power device in your home", that wouldn't be without impact.
@lxo In short, I'm suggesting that people who have the expertise I lack fight fire with fire.
@soatok I've been quietly beating the drum for a while now that a lot of the anti-AI rhetoric is medium- and long-term moot because the cat. Is. Out. Of. The. Bag.
Yes, OpenAI are fascists and a problem and down with them.
Smaller models already run on a Raspberry Pi, and there's no particular reason to believe at this time that the next iteration of the raw research won't make training or cross-training them better / faster / cheaper. Most of the anti-AI arguments I see don't stay relevant when it becomes "A thing you slap on a shelf PC and have running in your own closet," and I don't think a lot of people are talking about what that world looks like.
@mark A lot of the value of slop comes in three buckets:
The actual value of language models that can run on a Raspberry Pi and produce useful results, without being one or more of those three buckets, is something that gets left off the table in these discussions because of how egregious those three are.
Whether that's an error or a strategic decision to focus on the societal-scale impact of those three is not my place to say.
@soatok Incidentally, catching up on your blog:
If that doesn’t make you feel all warm and fuzzy, remember that many industries still use FTP to transfer encrypted ZIP files back and forth in 2026.
I worked as liaison between my company and a company they'd acquired for about nine months. The acquiree's entire business niche was something that really reframed my understanding of where I set the bar for technical literacy and competency in general industry; I just flat-out thought we were training more software engineers than we were, and I was wrong.
This company's business model was that they did high-touch data massaging between:
- Big online retail search engines like Amazon, Baidu, and Google Shopping
- Manufacturers who (a) made really high-quality products in their niche and (b) had an IT team that would celebrate with a pizza party if they could successfully implement one new data pipeline end-to-end. Per quarter.
The job was literally "You FTP us your inventory in whatever format you were able to conduct enough black-arts rituals to get the internal tracking system that someone else built for you to spit out, and we'll turn it into the right formats for these online retail markets." The mechanics of the job were a lot of "Take this not-actually-compliant CSV file and build heuristics to guess which commas were column separators and which were a place where their system had left out quotation marks."
... the company grossed over $1 million annually.
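For anyone wondering what the comma-guessing heuristic above boils down to, a toy sketch along these lines (the column layout, and the assumption that surplus commas all come from one unquoted free-text field, are made up for illustration):

```python
# Toy heuristic for a non-compliant CSV: if a row has too many fields because a
# free-text column was never quoted, fold the surplus back into that column.
import csv
import io

EXPECTED_COLUMNS = ["sku", "title", "description", "price"]  # hypothetical layout
FREE_TEXT_INDEX = 2  # assume surplus commas come from the unquoted description field

def repair_row(fields, expected=len(EXPECTED_COLUMNS), free_text=FREE_TEXT_INDEX):
    """Merge surplus fields back into the free-text column when a row is too wide."""
    surplus = len(fields) - expected
    if surplus <= 0:
        return fields
    merged = ",".join(fields[free_text:free_text + surplus + 1])
    return fields[:free_text] + [merged] + fields[free_text + surplus + 1:]

raw = "A123,Widget,Small, red, unquoted widget,9.99\n"
for row in csv.reader(io.StringIO(raw)):
    print(repair_row(row))
# -> ['A123', 'Widget', 'Small, red, unquoted widget', '9.99']
```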
@mark @soatok Used to work for a company in the armpit of healthcare, but one of the products we offered accepted input in one of a few formats with published standards.
_Industry leader_ software wasn't properly compliant with any of the standards, and I dreaded every time I saw a call with a particular caller ID, because I knew who'd be on the other end by first name and how she'd be saying we didn't process a file correctly... until we pointed out syntax errors, every time.
It wasn't even consistent, and I was just on support, but it was wild to see _smaller_ vendors than a company whose name rhymes with Mick Esson get it right every time, while they simply threw their name around as if that settled it.
Also their software couldn't do public key auth for SFTP, only password, which we didn't permit for SFTP upload.
I'm still trying to figure out whether they called all the other vendors out there offering services similar to ours, or whether those vendors just rolled their eyes and eventually coded around this vendor's lack of compliance, or said "fuck it" and started just taking CSV files.