
PARADOX: ENRICHED CONTEXT

December 6, 2025

Lately I’ve been completely tangled in this question: How do you give an AI more of the right context without drowning it in more tokens? I thought I had a clever direction at first. The idea was simple. Before sending a user’s message to the model, I would tack on a bunch of helpful surrounding information: summaries, related content, extra reminders, structure… basically anything to help the model “understand” better.
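For the curious, the shape of it was roughly this. The helper functions and section labels below are stand-ins I'm using to illustrate the idea, not my actual code:

```python
def fetch_summary() -> str:
    # Stand-in: a real version would summarize earlier turns of the conversation.
    return "The user is building a prompt-enrichment pipeline."

def fetch_related(message: str) -> list[str]:
    # Stand-in: a real version would look up related notes or documents.
    return ["Note on context windows.", "Note on token limits."]

def enrich_prompt(user_message: str) -> str:
    """Wrap the user's message in labeled 'helpful' sections before sending it."""
    sections = [
        "## Conversation summary\n" + fetch_summary(),
        "## Related content\n" + "\n".join(fetch_related(user_message)),
        "## User message\n" + user_message,
    ]
    return "\n\n".join(sections)

print(enrich_prompt("What about this part?"))
```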

On paper, that sounds like the whole point of contextual AI: make the prompt richer so the output is smarter. But that experiment has turned into one of the biggest contradictions I’ve run into so far.

The more I enriched the prompt, the less of the context window I had left to work with.

I went in assuming enriched context would save tokens over time by reducing ambiguity. Instead, it consumed them upfront and exhausted the space before the model even got to think. Even the simplest follow-ups, like “what about this part?”, became impossible because there was no room left. All the “helpful” scaffolding suffocated the reasoning we were trying to enable.
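Rough numbers make the problem obvious. These are made-up figures, but they're the kind of budget I was looking at:

```python
# Made-up but representative token counts for one "enriched" request.
context_window = 8_192   # total tokens the model can attend to at once

system_prompt = 600
summary       = 1_500
related_docs  = 4_000
reminders     = 800
user_message  = 40

used = system_prompt + summary + related_docs + reminders + user_message
remaining = context_window - used
print(f"scaffolding + question: {used} tokens, "
      f"leaving {remaining} for the answer and any follow-ups")
# scaffolding + question: 6940 tokens, leaving 1252 for the answer and any follow-ups
```

And since the scaffolding rode along on every turn, that leftover room only shrank as the conversation went on.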

It hit me that I was treating the model like it was forgetful, when really the issue is that it is literal. It doesn’t care which parts are backstory and which parts are urgent. All tokens cost the same.

Enrichment became bloat.

It also made me rethink what context really is. I used to think of it as additional text: “Here’s everything related,” like the way we help a friend recall a story. But models don’t reason like humans; they don’t selectively compress history into what's meaningful right now. They just ingest whatever sits in front of them.

That means the question isn’t “How do I give the model more?” It’s “How do I give the model less, but smarter?”

I still want systems that understand long-term relationships, not just what the user asked this moment. But now I’m convinced that the answer can’t be just richer prompts. If the intelligence is always bottlenecked by a finite context window, then forcing all relevant knowledge through that same choke point will always break down.

So this week has been a bit humbling. I wasn’t discovering “better context.”
I discovered the cost of context.

There’s something kind of ironic in how it played out. I spent all this time building logic to fetch supporting info, stitch everything together, annotate it, and wrap it in nicely labeled sections… only to realize that the model would’ve performed better if I’d said nothing at all.

I’m not discouraged though. If anything, the clearer the contradiction, the closer I feel to solving it. Maybe the real breakthrough isn’t in enriching the prompt, but in reducing what needs to be inside it. Offloading memory, storing structure externally, building context that doesn’t live inside the query: that’s probably where the future is.
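I don’t have that built, but the rough sketch in my head looks like this: keep the knowledge in an external store, score it against the question, and only inject the little that clears the bar and fits a budget. The store, the word-overlap scoring, and the numbers here are all toy placeholders for the idea, nothing more:

```python
# Toy sketch: context lives outside the prompt; only the best matches get injected.
# Word overlap stands in for real retrieval (embeddings, a proper memory index, etc.).
MEMORY = [
    "Last week we refactored the enrichment pipeline.",
    "The project targets an 8k-token context window.",
    "User prefers short answers.",
    "The cat's name is Miso.",
]

def overlap(snippet: str, query: str) -> int:
    return len(set(snippet.lower().split()) & set(query.lower().split()))

def select_context(query: str, budget_tokens: int = 100) -> list[str]:
    ranked = sorted(MEMORY, key=lambda s: overlap(s, query), reverse=True)
    picked, used = [], 0
    for snippet in ranked:
        cost = len(snippet.split())          # crude token estimate
        if overlap(snippet, query) < 2 or used + cost > budget_tokens:
            continue
        picked.append(snippet)
        used += cost
    return picked

print(select_context("what did we change in the enrichment pipeline"))
# ['Last week we refactored the enrichment pipeline.']
```

The point isn’t the scoring; it’s the inversion: the prompt asks for context instead of carrying all of it.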

So yeah… lesson learned: If you keep expanding the prompt to solve the context problem,
you’re creating the very problem you’re trying to solve.

And now that I know that, the next phase is figuring out: How do I give the model deeper understanding without handing it every detail in every prompt?

I don’t have the solution yet.
But at least now I finally have the right question.

- Sam