OpenAI have just released gpt-oss, an AI large language model (LLM) available for local download and offline use licensed under Apache 2.0, and optimized for efficiency on a variety of platforms without compromising performance. This is their first such “open” release, and it’s with a model whose features and capabilities compare favorably to some of their hosted services.
OpenAI have partnered with ollama for the launch, which makes onboarding ridiculously easy. ollama is an open source, MIT-licensed project for installing and running local LLMs, but there's no real tie-in to that platform; the models are available separately. gpt-oss-20b can run within 16 GB of memory, while the larger and more capable gpt-oss-120b requires 80 GB. OpenAI claims the smaller model is comparable to their own hosted o3-mini “reasoning” model, and that the larger model outperforms it. Both support tool use (such as web browsing), among other features.
LLMs that can be downloaded and used offline are nothing new, but a couple of things make this model release a bit different from others. One is that while OpenAI have released open models before, such as Whisper (a highly capable speech-to-text model), this is actually the first LLM they have released in such a way.
The other notable thing is that this release coincides with a bounty challenge for finding novel flaws and vulnerabilities in gpt-oss-20b. Does ruining such a model hold more appeal to you than running it? If so, good news: there's a total of $500,000 to be disbursed. But there's no time to waste; submissions need to be in by August 26th, 2025.
lmao. “Open”AI continuing the misleading name trend.
This is not Open Source Software. This is an open weights model. Huge difference. You have no idea what’s in that blob.
Why is nobody calling this shit out?
Because sneakily, nobody claimed it was Open Source Software, they just called it something-OSS. There is no trademark on ‘OSS’ so they can. It is marketing, so you can be sure you will be tricked one way or another. Now you have blown their cover so they will ignore it and sneakily do their next trick.
I have a 12GB GPU. Drats! Just falling short
I came here to note that the term GPT-OSS is never explained. Apparently that is par for the Hackaday course. Just because it is a common term, I think it is still good journalistic practice to explain what is actually meant by an acronym title like that?
generative pre-trained transformer
At this point it’s not really much of an acronym as much as it’s just a name.
(The meaning is irrelevant, slight differentiator at best).
Regardless, I was playing with the model yesterday (the 20b variant) on my MBP. I asked it for a project plan for a React based Dashboard project with Redis data store, and using WebSockets.
It gave me a nice breakdown with some suggested libraries and implementation strategy.
I picked this request as I’ve actually built such a project recently and wanted to do a quick comparison of the features list.
The response was well structured with justifications for the suggested modules and features.
It’s a bit slow on my M1, but I didn’t expect it to be all that fast. It gave a full breakdown that was quite extensive, but it took around 15 minutes to complete.
There are dozens of ~20b models available, does this actually outperform any?
Cynical though I am of all things AI, I think this is a positive step. I really don’t want to integrate AI into my workflow, precisely because it’s a cloud service subject to change at any time. Even if this is a black box, it’s something you can use forever, for free. That’s an improvement.