
    Text generation is a lot more important than people realise.

    February 5, 2026

    Lots of people are excited about AI-generated images and video, and I think they're cool.

    But the most important developments in AI are happening in writing, and most people are completely missing it.

    When people think of AI's ability to write, they think "cool, it can turn my bullets into an email, or draft a LinkedIn post."

    But the real power of writing is how fundamental it is as a way of communicating thought.

    The ability of generative AI models to solve problems through text is insane, because an enormous amount of knowledge work is just processing text: scientific research, engineering specs, product design docs, sales strategies, marketing briefs.

    Even video editors know this. 90% of a video's performance is in the story and script. The edit is just finding the best way to convey what's already been written.

    The performance isn't about how it sounds

    These text models aren't competing on tone anymore. They're competing on how effectively they can think and solve problems.

    From GPT-4 to GPT-5, or from Claude 3 Sonnet to Claude Sonnet 4, the difference isn't in writing ability. Most people can't tell the outputs apart, and many prefer the older versions for their tone and style.

    The biggest difference shows up when you throw a complex problem or high-stakes situation with tons of context at them. Can the model understand the situation, identify which context is most relevant, consider the options, and make good decisions?

    Here's what this looks like: You dump 50 pages of customer feedback, your product roadmap, competitive analysis, and sales data into the model. You ask it to recommend which feature to build next and why.

    GPT-4 will pick a random feature and give you generic reasoning. GPT-5 will reference specific feedback patterns, connect them to revenue data, and explain the tradeoffs. It's better, but it still misses nuances a human would catch, such as the fact that your biggest customer's complaint is actually about the implementation, not the feature itself.
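
    For readers who do want to script this kind of side-by-side test, here is a minimal sketch. It only shows the shape of the experiment: label each document, join them into one prompt, and send the identical prompt to every model. The names `build_prompt`, `compare_models`, and `fake_model` are hypothetical, the document text is placeholder filler, and `fake_model` stands in for whatever real model client you would plug in.

```python
from typing import Callable, Dict


def build_prompt(documents: Dict[str, str], question: str) -> str:
    """Join labelled documents and a question into a single prompt string."""
    sections = [f"## {title}\n{text.strip()}" for title, text in documents.items()]
    return "\n\n".join(sections) + f"\n\nQuestion: {question}"


def compare_models(models: Dict[str, Callable[[str], str]], prompt: str) -> Dict[str, str]:
    """Send the same prompt to every model and collect the answers by model name."""
    return {name: ask(prompt) for name, ask in models.items()}


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only reports the prompt size."""
    return f"(reply to a prompt of {len(prompt)} characters)"


# Placeholder content: in a real test these would be your actual documents.
documents = {
    "Customer feedback": "placeholder feedback text",
    "Product roadmap": "placeholder roadmap text",
    "Competitive analysis": "placeholder analysis text",
    "Sales data": "placeholder sales text",
}

prompt = build_prompt(documents, "What should we build next and why?")
answers = compare_models({"model_a": fake_model, "model_b": fake_model}, prompt)

for name, answer in answers.items():
    print(name, "->", answer)
```

    Keeping the prompt identical across models is the point of the design: any difference in the answers then comes from the model, not from how the context was packaged.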

    How to actually learn what models can do

    The best way to learn the extent of these models' problem-solving capabilities: give them everything in your day-to-day work and see how they perform.

    You'll be surprised by what they can do and frustrated by what still falls flat. Over time you'll build an intuition for their capabilities and limits, but revisit those assumptions every 2-3 months. That's the pace this industry moves at.

    When testing, give the model all the context you as a human would have access to: multiple documents, books, resources, and previous discussions.

    This is where tools built for context actually matter. The easier it is to throw different knowledge sources at a model and iterate on them, the faster you'll learn what's possible. You don't need to be a developer to run these experiments.

    Why this actually matters

    If you still think AI is mainly about generating pretty images or saving time on emails, you're optimizing for the wrong thing. The real shift is happening in how these models handle the messy, high-context problem solving that makes up most valuable knowledge work.

    The question isn't whether AI can write. It's whether it can think through your specific problem with your specific context well enough to actually help you make better decisions.

    We're getting close. But we're not there yet.


    LinkedIn Post

    I tested GPT-5 against GPT-4 yesterday.

    Same prompt. Same task.

    Most people wouldn't be able to tell the difference just from reading the output.

    Some would probably even prefer GPT-4's tone.

    But here's where it got interesting.

    I dumped 50 pages of customer feedback into both models. Added our roadmap, competitive analysis, sales data.

    Then asked: "What should we build next and why?"

    GPT-4 picked a feature. Gave me generic reasoning that sounded smart but could apply to anything.

    GPT-5 referenced specific patterns in the feedback. Connected them to actual revenue data. Explained the tradeoffs between speed and impact.

    Still not perfect. It missed that our biggest customer's complaint wasn't actually about the feature. It was about how we implemented it.

    But the gap was obvious.

    Everyone's excited about AI-generated images and video right now.

    And yeah, they're cool.

    But the most important shift is happening in text, and most people are completely missing it.

    Writing isn't just about drafting emails or LinkedIn posts.

    It's how we communicate thought.

    Most knowledge work is just processing text. Research papers. Engineering specs. Product docs. Sales strategies. Marketing briefs.

    Even video editors know 90% of a video's performance is in the story and script.

    These models aren't competing on how they sound anymore.

    They're competing on how well they can think through messy, high-context problems.

    Best way to learn what they can actually do?

    Give them your real work. Not toy prompts. All the context you'd have as a human.

    You'll be surprised by what works. Frustrated by what doesn't.

    And you'll build intuition fast.

    Just don't get too attached to it. Re-test every 2-3 months.

    That's how fast this space moves.

    If you still think AI is mainly about saving time on emails, you're optimizing for the wrong thing.

    The question isn't whether AI can write.

    It's whether it can think through your specific problem with your specific context well enough to help you make better decisions.

    We're getting close.

    But we're not there yet.


    X Thread

    1/ Everyone's excited about AI images and video.

    But the most important AI developments are happening in text, and most people are completely missing it.

    2/ When people think of AI writing, they think "cool, it can draft my email or LinkedIn post."

    But writing is how we communicate thought.

    Most knowledge work is just processing text.

    3/ Research papers. Engineering specs. Product docs. Sales strategies. Marketing briefs.

    Even video editors know 90% of performance is in the story and script.

    The edit just finds the best way to convey what's already been written.

    4/ Here's what people don't realize:

    These text models aren't competing on tone anymore.

    Most people can't tell the difference between versions. Some prefer older ones.

    5/ The real difference shows up when you throw a complex problem at them.

    50 pages of customer feedback + roadmap + competitive analysis + sales data.

    "What should we build next and why?"

    6/ Weak model: picks a random feature, generic reasoning.

    Strong model: references specific feedback patterns, connects to revenue data, explains tradeoffs.

    Still not perfect. Misses nuances like when a "feature request" is actually an implementation problem.

    7/ Best way to learn what models can do:

    Give them your actual work. All the context you'd have as a human.

    You'll be surprised by what works. Frustrated by what doesn't.

    8/ Don't get attached to your intuition though.

    Re-test every 2-3 months.

    That's how fast this space moves.

    9/ This is where tools built for context actually matter.

    The easier it is to throw different knowledge sources at a model and iterate, the faster you learn what's possible.

    10/ If you still think AI is mainly about generating pretty images or saving time on emails, you're optimizing for the wrong thing.

    The real shift is in high-context problem solving.

    11/ The question isn't whether AI can write.

    It's whether it can think through your specific problem with your specific context well enough to help you make better decisions.

    We're getting close. But we're not there yet.