When I started Emet, I envisioned myself as a white-hat hacker on a mission to DDoS-attack AI with high-quality data: flooding models with truth until they were grounded in it. As I've come to realize, that's easier said than done.
Having recently attended the second annual World of DaaS conference, I have been reflecting on the stark contrast between the way corporate and financial firms approach and value data, and the prevailing norms amongst most in the AI industry today.
For financial and corporate firms, data isn't just an input. Rather, it's the very knowledge foundation upon which trillions in market value and critical decisions rest. There is an almost reverential respect in these industries for the accuracy, completeness, and truth that high-quality data can provide. Respect for accurate data is not a luxury; it is a necessity.
Banks, hedge funds, and Fortune 500 companies invest heavily in data procurement and quality because the cost of not knowing, or of being wrong, is astronomical. A flawed market signal can set off a cascade of bad trades. An error in a credit scoring model can result in billions in bad loans. In these environments, data is stress-tested, verified, and audited with an intensity that borders on paranoia, and quality data providers are rewarded handsomely as a result.
The AI industry, by contrast, operates in a looser, more probabilistic world. Until recently (more on the tides of change later), data was less a foundation and more a means to an end. Datasets are scraped from the web and shared freely, with quality often treated as a secondary consideration, easily corrected by scale (if I had a nickel for every time I heard "but I can just take it online…"). The focus is on what is "good enough" for the task, even if that means tolerating a significant margin of error.
Of course, this attitude comes at a cost. As AI systems move past images and video and become more deeply embedded in critical decision-making, from loan approvals to medical diagnoses, the margin for error will shrink. The stakes will rise. And as we have seen in finance, the market for truth is unforgiving. To that end, AI companies would do well to adopt some of the rigor that financial firms have developed over decades. They should treat data as a first-class asset, worthy of the same scrutiny, validation, and investment as any other critical input.
If we want AI to truly extend human capabilities rather than amplify our blind spots, we must demand more from these systems. That means less forgiveness for hallucinations, more rigorous validation, and a deeper commitment to truth. It means moving beyond the convenience of large, noisy datasets and embracing the hard work of precision and verification.
AI is on the cusp of transforming industries at a scale we can barely imagine. But to do so responsibly, the industry must first learn a lesson that others have understood for decades: that truth, however difficult to achieve, is the only reliable foundation for decision-making.