Oct 2025

Designing AI features for real people (so far)

A living list of my observations/discoveries from designing AI features for real people.

For reflective uses (like “analyse my journal”), small errors kill trust. It’s like your doctor getting your name wrong.
For task-based uses, people are more forgiving of minor errors and hallucinations.
The more the AI has to infer, the noisier the results. The converse also holds: constrain what the AI infers and you’ll get better output.
Users aren’t great at specifying context for AIs. The whole concept is confusing. UX techniques that help users define context and limit AI inference are critical to output quality. Prompt hinting and auto-complete work well.
LLMs are great at semantic search, less so at exact, exhaustive lookups. For example, “find my notes on ...” works well, whereas “summarise all of my completed tasks from last week” is unreliable, because it depends on finding every item that matches a status and a date range.
Scoped tasks make AI more reliable. For example, “summarise all of these selected notes”. Here, the human sets the context and input, leaving the AI to do what it does best.
People already have mental slots for their preferred AI tool. Changing behavioural habits is still very hard. For example, users will copy/paste back and forth between their notes and ChatGPT rather than use an AI chat built on top of their notes application.
Asking an AI to take the ‘Next Best Action’ based on user-specified context is a bit of a mixed bag. Sometimes it feels like magic, sometimes like rubbish.
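
One way to picture the scoped-task idea, using the “completed tasks from last week” example: do the exact filtering in ordinary code, then hand the model only the selected items. This is a rough sketch and not how any particular product works. The `Task` shape, the function names, and the choice of a Monday-to-Sunday "last week" are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Task:
    # Hypothetical task record; real apps will have their own schema.
    title: str
    completed_on: Optional[date]  # None means the task is still open


def last_week_range(today: date) -> tuple[date, date]:
    """Return the Monday and Sunday of the calendar week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)
    return start, start + timedelta(days=6)


def completed_last_week(tasks: list[Task], today: date) -> list[Task]:
    """Select tasks deterministically, so the model does not have to find them."""
    start, end = last_week_range(today)
    return [
        t for t in tasks
        if t.completed_on is not None and start <= t.completed_on <= end
    ]


def build_summary_prompt(tasks: list[Task]) -> str:
    """Build a prompt over an explicit, already-scoped list of tasks."""
    lines = "\n".join(
        f"- {t.title} (completed {t.completed_on.isoformat()})" for t in tasks
    )
    return "Summarise these completed tasks:\n" + lines
```

The string from `build_summary_prompt(completed_last_week(tasks, date.today()))` would then go to whichever model you use. The model only has to summarise; it never has to decide which tasks count.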

(Updated: October 2025)