Apple Just Laid the Groundwork to Change the Public Perception of AI // Evan Coleman

If you’re a software engineer who hasn’t been living under a rock for the last six months, you already know AI is real. Not hype-real. Actually-real. It writes code, it refactors code, it explains code you’ve never seen, and it does all of it well enough that a lot of us have quietly reorganized how we work.

And yet, talk to anyone who doesn’t write software, and you’ll hear the opposite. AI is a joke. AI is a scam. AI is the thing that ruined their search results and took their job. The gap between how engineers talk about this technology and how everyone else does is enormous.

I think that gap is about to close. And I think Apple just lit the fuse at WWDC last week.

Two small features with a big implication

There were the usual headline announcements this year, but the two that stuck with me were small, almost throwaway:

Safari can now generate extensions from a prompt. You describe what you want the browser to do, and it builds you an extension to do it.
Shortcuts can now be created from a prompt. Same idea. Describe the automation, get the automation.

That’s it. No keynote fireworks. But what Apple actually did here is ship vibe-coding to a few hundred million people who have never opened a terminal and never will.

These aren’t AI features in the way people have been trained to roll their eyes at. There’s no chatbot bolted onto a fridge. There’s no summary nobody asked for. The user states intent, and a working artifact comes out the other side. That distinction is the whole ballgame.

Why AI is genuinely good at code

Here’s the thing most of the public hasn’t been told, because it doesn’t fit either the hype or the backlash: AI is good at writing code for a specific, structural reason. And that reason doesn’t transfer to most of the places AI has been marketed.

Code is verifiable. You can run it. You can test it. You can review it. You can write a unit test that passes or fails with no opinion about how it feels. The output is reproducible — same input, same behavior, every time. When a model writes code, it can check its own work against a ground truth that actually exists: does it compile, do the tests pass, does it do what it’s supposed to.
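To make that concrete, here’s a minimal sketch in Python. Everything in it is invented for illustration: the two `slugify` attempts and the `passes` helper are hypothetical, not from any real tool. The only point is that the check answers pass or fail, and it gives the same answer every time you run it.

```python
import re


def slugify_buggy(title: str) -> str:
    # Hypothetical first attempt: only swaps spaces for hyphens, so
    # punctuation and repeated separators leak into the result.
    return title.lower().replace(" ", "-")


def slugify_fixed(title: str) -> str:
    # Hypothetical second attempt: lowercase, then collapse every run of
    # non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def passes(slugify) -> bool:
    """Return True only if the candidate meets every expectation."""
    cases = {
        "Hello World": "hello-world",
        "Hello,  World!": "hello-world",
    }
    return all(slugify(title) == want for title, want in cases.items())
```

Calling `passes(slugify_buggy)` returns `False` and `passes(slugify_fixed)` returns `True`. Nobody has to have an opinion about either one.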

That feedback loop is everything. It’s why an AI can write a function, run it, see it fail, and fix it — converging on something correct without a human in the loop for every step. The model isn’t trusted to be right. It actually checks that what it did works.
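Here’s roughly what that loop looks like, as a toy sketch rather than a description of how any real coding agent is built. I’m assuming a hypothetical `propose` function standing in for the model (it just replays three canned attempts at a "largest number in a list" function) and a hypothetical `run_checks` playing the role of the test suite.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[List[int]], int]

# Canned attempts standing in for model output (assumption: a real model
# would generate these, ideally using the failure message it was given).
ATTEMPTS: List[Candidate] = [
    lambda xs: xs[0],          # wrong: returns the first item
    lambda xs: max(xs + [0]),  # wrong when every number is negative
    lambda xs: max(xs),        # correct
]


def propose(attempt: int, feedback: Optional[str]) -> Candidate:
    # Hypothetical model call. This stub ignores `feedback` and replays
    # the canned attempts in order.
    return ATTEMPTS[attempt]


def run_checks(candidate: Candidate) -> Optional[str]:
    """Return None if every check passes, else a message for the first failure."""
    cases = [([1, 2, 3], 3), ([-5, -2], -2), ([7], 7)]
    for xs, want in cases:
        try:
            got = candidate(xs)
        except Exception as exc:
            return f"{xs!r} raised {type(exc).__name__}"
        if got != want:
            return f"{xs!r}: expected {want}, got {got}"
    return None


def converge(max_attempts: int = 3) -> Tuple[Optional[Candidate], List[Optional[str]]]:
    """Propose, check, and retry until the checks pass or attempts run out."""
    feedback: Optional[str] = None
    history: List[Optional[str]] = []
    for attempt in range(max_attempts):
        candidate = propose(attempt, feedback)
        feedback = run_checks(candidate)
        history.append(feedback)
        if feedback is None:
            return candidate, history
    return None, history
```

The first two attempts fail on specific inputs, the third passes, and the loop stops. The part that matters is `run_checks`: without something like it, there is nothing to converge toward.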

Now look at where AI has actually been sold to consumers. Summarizing your email. Answering questions about the news. Giving medical-ish advice. Chatbots that send you in loops. None of these have a compiler. None of them have a test suite. There’s no assert that the summary is faithful, no green checkmark that the answer is true. The output is plausible, and that’s just about it.

Hallucination is the wrong word

We call it hallucination, and that word does a lot of quiet damage. Hallucination implies a malfunction: the system glitched, made a mistake, will be patched.

It’s not a malfunction. A language model making something up because it sounds right is the model functioning exactly as designed. These things are built to produce the most probable continuation of text. Truth is not a parameter. When the most plausible-sounding answer happens to be true, great. When it isn’t, you get the same confident sentence, equally fluent and completely wrong. The model has no idea which one it just did. It can’t, because nothing in it is checking against reality.
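A toy sketch of what I mean, with loud caveats: the probabilities below are numbers I made up, the lookup table is a stand-in for a model, and real systems usually sample from the distribution instead of always taking the top word. What it shows is that the same code path produces the true sentence and the false one.

```python
# Made-up next-word probabilities for two prompts (illustrative only,
# not measured from any real model).
NEXT_WORD = {
    "The capital of France is": {"Paris": 0.95, "Lyon": 0.05},
    "The capital of Australia is": {
        "Sydney": 0.55,
        "Canberra": 0.40,
        "Melbourne": 0.05,
    },
}


def continue_text(prompt: str) -> str:
    """Pick the single most probable next word for the prompt."""
    probs = NEXT_WORD[prompt]
    # Nothing here asks whether the word is true, only whether it is likely.
    return max(probs, key=probs.get)
```

With these invented numbers, the France prompt yields "Paris" and the Australia prompt yields "Sydney", which is wrong (the capital is Canberra). Nothing in the function distinguishes the two cases.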

This is why AI feels like a scam to so many people. They’ve only ever met it in the contexts where it can’t verify itself, where “sounds right” is the ceiling. They’ve been shown the one version of this technology that has no ground truth to stand on, and told it’s the future.

Code is the opposite. Code has ground truth baked in. That’s not a small advantage. That’s the entire reason it works.

What Apple actually changed

So back to those two boring little features.

When a non-technical person asks Safari to build an extension that hides every “people also viewed” section on a shopping site, something different happens than when they ask a chatbot to summarize an article. The extension either works or it doesn’t. They watch it run. They see the sections disappear, or they don’t, and they tweak the prompt and try again. There’s a verifiable artifact sitting right in front of them.

For the first time, regular people are going to experience AI in the mode where it’s actually good: you state your intent, it works the problem out, and you get a result you can check with your own eyes. Not a summary they have to trust. Not an answer they have to fact-check. A thing that does what they asked, or visibly doesn’t.

That’s the moment the public perception flips. Not because the marketing got better, but because the use case finally matches what the technology is good at.

Bespoke software, on demand

I think this is the front edge of something much bigger.

For the entire history of software, the economics have forced one program to serve millions of people. The interface is a committee compromise. Every button is there because someone needed it, which means it’s cluttered for everyone. We’ve all learned to mold ourselves around software built for an average user who doesn’t exist.

LLMs break that constraint. When generating a working interface costs almost nothing, there’s no reason the software has to be generic anymore. The natural endpoint is bespoke software for one person, for one task, generated on the spot.

You open your laptop. You say what you’re trying to do. It designs and builds an interface optimized for that — your data, your workflow, your one weird task — and throws it away when you’re done. No download. No settings menu with four hundred options you’ll never touch. No learning someone else’s idea of how the task should go. The software conforms to you instead of the other way around.

Apple didn’t ship that last week. They shipped a prompt box in Safari and a prompt box in Shortcuts. But that’s the same idea in its smallest possible form, handed to the largest possible audience. And once people feel what it’s like to say a thing and get a working thing, they’re not going back to molding themselves around someone else’s software.

The engineers already know AI works. The rest of the world is about to find out, not from a keynote, but from the first time the browser does exactly what they asked.

I’d love to be wrong about the timeline, but I don’t think I am. Reach out on Mastodon at @[email protected] or Bluesky at @edc.me if you want to argue about it.