I’ve never really followed X (or Twitter before it), Bluesky, Instagram, TikTok, etc., so I basically live under a rock. Sometimes I ask dumb questions to try to understand people a little better. Apologies if my questions inadvertently offend anyone; I mean no harm.

  • 0 Posts
  • 89 Comments
Joined 1 year ago
Cake day: May 3rd, 2025

  • Gemma4:26B is also worth trying. I find it runs much faster on my hardware.

    Edit: Qwen3.6:35B might be the sweet spot. It’s bigger than the 27B, but actually lighter to run. TIL the 27B is not a MoE model; it’s a dense model, so every parameter is active for each token. The 35B is a MoE model with only 3B active params, so it does far less compute per token even though it takes more memory to load.

    So far, Qwen3.6:35B seems to be giving me better results than Gemma4:26B. It’s a bit slower than Gemma4:26B, but definitely faster than Qwen3.6:27B.
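
    If you want to compare speeds yourself, here’s a rough sketch (it assumes you’re serving the models through Ollama on its default port, which the model:tag names suggest, and the prompt is just a placeholder; adjust for your setup):

    ```python
    # Rough tokens/sec comparison via Ollama's HTTP API (assumed setup).
    import requests

    MODELS = ["gemma4:26b", "qwen3.6:27b", "qwen3.6:35b"]
    PROMPT = "Explain dense vs. MoE models in one paragraph."

    for model in MODELS:
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": PROMPT, "stream": False},
            timeout=600,
        )
        resp.raise_for_status()
        data = resp.json()
        # eval_count = tokens generated; eval_duration = generation time in ns
        tps = data["eval_count"] / (data["eval_duration"] / 1e9)
        print(f"{model}: {tps:.1f} tokens/s")
    ```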



  • Do you not use it much because you get bad results? I’ve found that, no matter how smart the LLM is, its first attempt is never its best work. Tell it to review its work (or its plan, if you’re using planning mode). If it makes any changes, tell it to review its work again. Repeat until there are no more changes.

    (You don’t actually have to do this repetition manually; just tell the AI to run the loop itself. I recommend turning it into a SKILL.md so you don’t have to explain the loop every time; there’s a sketch of one below.)

    With these loops, I get better results AND burn lots of tokens. (Yes, it feels strange that heavy token consumption is actually a good thing here.)
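
    Roughly what such a SKILL.md might look like (just a sketch; I’m assuming Claude Code-style skills with YAML frontmatter, and `review-loop` is a name I made up):

    ```markdown
    ---
    name: review-loop
    description: Iteratively self-review finished work or plans until a review pass produces no changes.
    ---

    After completing a task (or a plan, in planning mode):

    1. Review the output for bugs, gaps, and missed requirements.
    2. If the review produced any changes, apply them and go back to step 1.
    3. Stop only when a full review pass produces zero changes.
    ```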