The rise of LLMs created a lot of value in our world. It also generated a lot of bias.
Arguably, the single most famous AI implementation ever was a chat, later known as ChatGPT. That chat was so good that people were hooked on it as if by magic, invisible strings. They could not believe that all they had to do was ask, and something at the end of the stream would bring them, almost instantly, everything they wanted to know.
So, everything became a conversation. Coding transitioned from JavaScript to “plain English”.
The User Interface That You Cannot Talk To
The initial success of the streaming interface was so big that now the overwhelming trend is to make everything conversational. The answer to all our struggles, and the future of AGI, is “agents, more agents”.
Don’t get me wrong, I like the idea of agents. I also built my own, called AIGernon, so I know what I’m talking about. The fact that you can just lay out a set of instructions and another entity is then in charge of implementing them is extremely compelling. It really makes you want to polish all those soft skills you never thought you’d need again, like how to open, sustain and manage a conversation.
Except not everything is actually “conversational”. There is a large part of our life spent talking, but there’s another part spent interacting in a spatial field with our – let’s call them that, for now – limbs. We point with our fingers, we touch, we step, we turn to change our field of view. An entire part of our experience unfolds without talking.
And, in my opinion, this part is wrongfully ignored right now.
Our Whole Body Talks – Just Not With Words
Cognitive science has known this for a while: thinking is not just verbal. When you gesture while explaining something, you’re not merely decorating your speech — you’re actually thinking with your hands. When you navigate a room, or reach for a tool, something real is happening in your mind that has nothing to do with language. They call this “embodied cognition”. The idea that the body is not a dumb vessel carrying a brain around, but part of the thinking apparatus itself.
This is the layer that LLM-centric AI largely skips now. It just happened – because chat was so successful, so fast, that everything else got absorbed into it. The real gap in AGI might not be deeper reasoning or more memory or larger context windows, all stacked up into bigger and bigger LLMs, in an endless scaling race. It might be the ability to navigate and interact with physical space.
A Step Backward Masked as a Step Forward
I think it’s worth remembering what we had before. Barely 40 years ago, the GUI was a real revolution — Xerox PARC, then the Mac, then Windows. Instead of typing commands, you pointed and clicked. Then came touch. The iPhone removed even the click. Then spatial computing started exploring what it means to interact in three dimensions, with your eyes, your hands, the orientation of your body. Each of these was an expansion. A new vocabulary added on top of the old one.
Chat collapsed all of that back into text. Which is fine, for what it does. But the framing that followed — that this is the final interface, the one that subsumes all others — is where things went a little bit too far. LLMs didn’t evolve us past pointing and touching. We just got excited and forgot about them for a while.
What a Non-Conversational AI Interface Actually Looks Like
I’ve been thinking about this concretely, because I’m building things. One of them is called Flight Lens. It has a feature called The Pulse — a single index number that aggregates delays, cancellations, weather disruptions and anomalies across global air traffic at any given moment. You open the app, you see a number on a simple scale, from green to red. You don’t ask anything. You don’t type a query. The number tells you something rich and complex, instantly, without a conversation ever starting. That’s not a worse experience than chat. For that specific job, it’s a better one.
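The internals of The Pulse aren’t the point here, but the shape of the idea is simple enough to sketch: normalize a few disruption signals, weight them, clamp the result, map it to a color. Here is a minimal TypeScript sketch of that shape; the field names, weights and thresholds are purely illustrative assumptions, not Flight Lens internals.

```typescript
// Hypothetical sketch of a "Pulse"-style index: a weighted aggregate of
// disruption signals, normalized to 0..1 and mapped to a green-to-red scale.
// Field names and weights are illustrative, not the actual Flight Lens logic.

interface TrafficSnapshot {
  delayRate: number;        // share of flights delayed, 0..1
  cancellationRate: number; // share of flights cancelled, 0..1
  weatherSeverity: number;  // normalized weather disruption, 0..1
  anomalyScore: number;     // normalized anomaly signal, 0..1
}

const WEIGHTS = {
  delayRate: 0.4,
  cancellationRate: 0.3,
  weatherSeverity: 0.2,
  anomalyScore: 0.1,
};

function pulse(s: TrafficSnapshot): number {
  const raw =
    s.delayRate * WEIGHTS.delayRate +
    s.cancellationRate * WEIGHTS.cancellationRate +
    s.weatherSeverity * WEIGHTS.weatherSeverity +
    s.anomalyScore * WEIGHTS.anomalyScore;
  return Math.min(1, Math.max(0, raw)); // clamp to the 0..1 scale
}

function pulseColor(value: number): "green" | "yellow" | "red" {
  if (value < 0.33) return "green";
  if (value < 0.66) return "yellow";
  return "red";
}

// One glance, no question asked:
console.log(
  pulseColor(
    pulse({ delayRate: 0.18, cancellationRate: 0.04, weatherSeverity: 0.3, anomalyScore: 0.1 })
  )
);
```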
Or take Light Njoy, an ambient lamp app that generates light sequences and packs them together in a shareable format. You can create a sequence of light, like a slow breathing rhythm, followed by a color mood, then a light bulb animation — and then share it as a .njoy file. Someone else opens it. The light in their room changes. Nothing was said. No prompt was typed, no response was read. There was a communication, fully formed, that traveled from one person to another entirely outside language. That’s an AI-powered interaction. It just doesn’t look like a chatbox.
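The real .njoy format isn’t shown here, but the idea of a light sequence as shareable data is easy to sketch. A minimal TypeScript model follows; the step kinds, fields and durations are assumptions for illustration, not the actual file format.

```typescript
// Illustrative model of a shareable light sequence, loosely inspired by the
// .njoy example above. Step kinds, fields, and durations are assumptions,
// not the actual Light Njoy format.

type LightStep =
  | { kind: "breathing"; periodMs: number; durationMs: number }
  | { kind: "colorMood"; color: string; durationMs: number }
  | { kind: "animation"; name: string; durationMs: number };

interface LightSequence {
  title: string;
  steps: LightStep[];
}

const calmEvening: LightSequence = {
  title: "Calm Evening",
  steps: [
    { kind: "breathing", periodMs: 6000, durationMs: 120_000 },
    { kind: "colorMood", color: "#ff9d5c", durationMs: 300_000 },
    { kind: "animation", name: "slow-flicker", durationMs: 60_000 },
  ],
};

// Serialize to something that could travel as a file: the receiver's app
// reads it and the room changes, without a word being exchanged.
const payload = JSON.stringify(calmEvening, null, 2);
console.log(payload);
```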
Peripheral Vision, Not a Stare
There’s one more thing worth naming. Conversation is expensive, cognitively. When you chat with something, you have to be present, to give it all your focus. You have to formulate, wait, read, respond. That’s fine when you need depth. But a lot of what we actually want from intelligent systems is more like peripheral vision: just a signal, something sitting in the corner of your eye. Something that informs without demanding your full attention.
The interfaces that will matter most, I think, are the ones that will get this right. Not everything is found by asking directly, in plain English. Some things should just be there — a number on a screen, a color in a room, a sound that changes when something changes — carrying meaning without requiring a dialogue to unlock it.
That’s not a limitation of AI. That might actually be where it starts getting interesting.