It’s tempting to treat AI as the clever part and data as a detail to tidy up later. In practice it’s the other way around. The capability of any AI system is capped by the data beneath it — and that foundation is where most of the real work, and the real risk, sits.
Why data is the real constraint
A model is only as good as what it learns from. Feed it incomplete, inconsistent or poorly governed data and it will produce confident, plausible, wrong answers — at scale. The organisations getting genuine value from AI aren’t necessarily using cleverer models; they’ve usually done the unglamorous work of getting their data in order first.
What “in order” actually means
You don’t need a perfect data estate to begin, but you do need a few things you can rely on:
- Quality — data that’s accurate, reasonably complete and consistent.
- Access — the ability to get the right data to the right place without a three-week request.
- Context — knowing what a field means, where it came from, and whether you’re allowed to use it.
- Governance — clear ownership, privacy controls, and a record of how data is used.
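A few of these checks can be made concrete with very little code. The sketch below is a minimal illustration, not a prescribed method: it assumes records arrive as plain dictionaries and that you keep some kind of field catalogue. `FieldInfo`, `completeness`, `readiness_report` and the 90% threshold are all hypothetical names and values chosen for the example. It covers quality (completeness), context (description and source) and governance (owner and permission); access is an organisational matter that a script like this cannot measure.

```python
from dataclasses import dataclass


@dataclass
class FieldInfo:
    """Hypothetical catalogue entry for one field (illustrative only)."""
    description: str = ""
    source: str = ""
    owner: str = ""
    permitted_for_ai: bool = False


def completeness(records, field):
    """Share of records where `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def readiness_report(records, catalogue, min_completeness=0.9):
    """Return a list of plain-language issues; an empty list means none found."""
    issues = []
    fields = sorted({key for r in records for key in r})
    for field in fields:
        # Quality: is the field filled in often enough to rely on?
        share = completeness(records, field)
        if share < min_completeness:
            issues.append(f"{field}: only {share:.0%} complete")
        # Context and governance: is the field documented, owned and cleared?
        info = catalogue.get(field)
        if info is None:
            issues.append(f"{field}: not documented")
            continue
        if not info.description or not info.source:
            issues.append(f"{field}: missing description or source")
        if not info.owner:
            issues.append(f"{field}: no owner")
        if not info.permitted_for_ai:
            issues.append(f"{field}: not cleared for this use")
    return issues


# Made-up sample data, purely for illustration.
records = [
    {"customer_id": 1, "email": "a@example.com", "region": "EU"},
    {"customer_id": 2, "email": "", "region": "EU"},
    {"customer_id": 3, "email": None, "region": "US"},
    {"customer_id": 4, "email": "d@example.com", "region": "US"},
]
catalogue = {
    "customer_id": FieldInfo("Internal customer key", "CRM", "Sales Ops", True),
    "email": FieldInfo("Contact email", "CRM", "Sales Ops", False),
}

for issue in readiness_report(records, catalogue):
    print(issue)
```

On this sample the report flags `email` as half empty and not cleared for use, and `region` as undocumented. A short list like that is usually a more useful starting point than a general sense that the data "needs work".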
The pragmatic takeaway
Treat data as the product, not the by-product. Before the next AI initiative, ask two simpler questions: do we trust the data this will rely on, and do we know where it came from? If the honest answer is “not really”, then that’s the project, and it will pay off well beyond AI.
