Why general-purpose AI fails on pet health records

· Emily Ikeda
  • pet tech
  • ai
  • pet data
  • infrastructure

The first time I fed a vaccination record into a general AI model, the output looked great. It came back fast, every field was filled in, and every vaccination had a date next to it. Then I noticed one of the dates was wrong. It wasn't just misread; it was completely hallucinated.

There had been a gap in the record, and the model filled it in by guessing. And that one record is basically the whole problem in miniature.

Pet records are genuinely a mess. They come in as faxes, phone photos, handwritten notes, and PDFs exported from a dozen different systems with a dozen different formats. Pulling the text off the page is the easy part, and general models do handle that well. Where they fall apart is the processing and proper structuring afterwards. That's the gap Pawssier sits in: turning whatever shows up into clean, structured data so the platforms using it don't have to build that messy layer themselves.

After enough of these records, the failure patterns get really predictable. Here are the five we run into most:

1. Unit ambiguity

Take something as basic as weight. A record says 8.2 and never says whether that's kilograms or pounds. A human glancing at it can usually sort that out from context. A general model just commits to one, and sometimes it's the wrong one. Same story with medication doses, temperatures, lab values. The number comes through fine, but what it actually means contextually is anybody's guess.

A general AI model guessing kilograms for an unlabeled weight of 8.2, compared to PSRF output flagging the unit as ambiguous instead of guessing.
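
To make the "flag it, don't guess" idea concrete, here is a minimal Python sketch. It is not Pawssier's actual code: `parse_weight`, the `Weight` type, and the small unit table are hypothetical names invented for illustration, and the sketch assumes the raw field arrives as a short string.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Weight:
    value: float
    unit: Optional[str]  # "kg", "lb", or None when the record never says
    ambiguous: bool      # True means a human or a later step must decide


# Hypothetical unit spellings; a real table would be much longer.
_UNITS = {
    "kg": "kg", "kgs": "kg", "kilograms": "kg",
    "lb": "lb", "lbs": "lb", "pounds": "lb",
}


def parse_weight(raw: str) -> Weight:
    """Parse a weight field without ever inventing a unit."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\.?\s*", raw)
    if match is None:
        raise ValueError(f"not a weight: {raw!r}")
    value = float(match.group(1))
    # Missing or unrecognized unit text both end up as None, never a guess.
    unit = _UNITS.get(match.group(2).lower())
    return Weight(value=value, unit=unit, ambiguous=unit is None)
```

With this shape, `parse_weight("8.2")` keeps the number but marks it ambiguous, while `parse_weight("8.2 kg")` comes back with a unit and no flag. The number survives either way; only the claim about what it means is withheld.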

2. Field confusion

A single vaccination certificate might list the date a shot was given, the date the next one is due, and the date the document itself got printed, all sitting right next to each other. A general model tends to grab whichever one is most visually prominent, which is very often not the one you needed. Mix up "administered" and "due" and suddenly a perfectly compliant pet looks overdue, or an overdue one looks cleared, which is a real problem if that record is what's getting a dog onto a flight. The same thing happens between a product's brand name and the disease it actually protects against, or between the clinic and the vaccine manufacturer.

A vaccination certificate showing three dates — printed, administered, and due — with a general AI model incorrectly selecting the printed date instead of the administered date.
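
One way to avoid grabbing the most prominent date is to assign each date a role from its printed label and leave it unassigned when the label doesn't settle the question. The Python sketch below assumes an earlier step has already paired each date with its nearby label text; `classify_dates` and the keyword table are hypothetical, for illustration only.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical label keywords; real certificates use many more variants.
ROLE_KEYWORDS = {
    "administered": ("administered", "given", "date of vaccination"),
    "due": ("due", "next", "booster"),
    "printed": ("printed", "issued"),
}

Labelled = Tuple[str, date]


def classify_dates(labelled: List[Labelled]) -> Tuple[Dict[str, date], List[Labelled]]:
    """Map dates to roles by label text, never by position or prominence."""
    roles: Dict[str, date] = {}
    unassigned: List[Labelled] = []
    for label, value in labelled:
        text = label.lower()
        matches = [
            role for role, keys in ROLE_KEYWORDS.items()
            if any(key in text for key in keys)
        ]
        # Only accept a label that points at exactly one role not yet taken.
        if len(matches) == 1 and matches[0] not in roles:
            roles[matches[0]] = value
        else:
            unassigned.append((label, value))
    return roles, unassigned
```

A bare "Date" label with no qualifier lands in `unassigned` rather than being promoted to "administered", which is the point: the ambiguity is handed on instead of resolved by a guess.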

3. Silent assumptions

Records are full of holes: a lot number that got cut off at the edge of a scan, a year that's smudged, a signature line left blank. The instinct of a general model is to treat a hole as something to patch rather than something to point at, so it produces a value that looks completely reasonable and hands it over with the exact same confidence as the values it genuinely read off the page. With health data that's the dangerous move, because a confident guess is so much worse than an honest "I don't know": nobody downstream has any reason to question the guess.
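
In code, "pointing at the hole" can be as simple as giving every field an explicit status. This is a sketch under assumptions, not a description of any real pipeline: `Field`, `Status`, and `to_field` are made-up names, and treating `?` or the Unicode replacement character as an "unreadable" marker is an assumption about what the upstream text extraction emits.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    READ = "read"            # value was actually present on the page
    MISSING = "missing"      # nothing there to read
    ILLEGIBLE = "illegible"  # something there, but not readable


@dataclass(frozen=True)
class Field:
    name: str
    value: Optional[str]
    status: Status


def to_field(name: str, raw: Optional[str]) -> Field:
    """Wrap raw extracted text without ever inventing a value."""
    if raw is None or not raw.strip():
        return Field(name, None, Status.MISSING)
    # Assumed markers for characters the extraction step could not read.
    if "?" in raw or "\ufffd" in raw:
        return Field(name, None, Status.ILLEGIBLE)
    return Field(name, raw.strip(), Status.READ)
```

A smudged year such as `"20?4"` comes back with no value and an `ILLEGIBLE` status, so anything downstream can tell a real reading from a gap.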

4. No validation logic

Reading a field correctly isn't the same as knowing whether it makes sense. A three-year rabies vaccine logged for an eight-week-old puppy can't be right. An expiration date that falls before the date the shot was actually given can't be right either. A general model will report both without blinking, because it has no internal sense of what a valid record is even supposed to look like. It's reading, not checking, and those turn out to be very different jobs.
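
The two examples above translate into checks that run after extraction. The sketch below is illustrative: the `Vaccination` type and `validate` function are hypothetical, and `MIN_RABIES_AGE_WEEKS` is a placeholder threshold picked for the example, not veterinary guidance.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Placeholder threshold, assumed for illustration only.
MIN_RABIES_AGE_WEEKS = 12


@dataclass
class Vaccination:
    product: str
    birth_date: date
    administered: date
    expires: date


def validate(v: Vaccination) -> List[str]:
    """Return a list of reasons the record cannot be right (empty if none)."""
    problems: List[str] = []
    if v.expires < v.administered:
        problems.append("expiration date precedes administration date")
    if v.administered < v.birth_date:
        problems.append("administered before birth")
    age_weeks = (v.administered - v.birth_date).days // 7
    if "rabies" in v.product.lower() and 0 <= age_weeks < MIN_RABIES_AGE_WEEKS:
        problems.append(f"rabies vaccine at {age_weeks} weeks old")
    return problems
```

The checks return reasons rather than raising, so a record with a problem can still be passed along with its flags attached instead of being silently dropped or silently accepted.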

5. No field-level provenance

Even when a general model gets every field right, you've got no way to see how it got there. There's no pointer back to the exact spot on the page each value came from, and no honest read on how sure it was about any given one. When that record is about to feed a travel approval or an insurance claim, an answer you can't trace back and audit isn't really usable, no matter how right it happens to be.

A general AI output with no confidence score or source location, compared to PSRF output showing a confidence score and page location for each field, including a low-confidence flag on one field.
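
Field-level provenance mostly comes down to what each extracted value carries with it. Here is one possible shape, sketched in Python. The `TracedField` layout, the `needs_review` helper, and the 0.8 threshold are all assumptions for illustration, not the actual PSRF format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TracedField:
    value: str
    confidence: float                        # 0.0 to 1.0, as reported by the extractor
    page: int                                # 1-based page number in the source document
    bbox: Tuple[float, float, float, float]  # x0, y0, x1, y1 in page coordinates


def needs_review(record: Dict[str, TracedField], threshold: float = 0.8) -> List[str]:
    """Names of fields whose confidence falls below the (arbitrary) threshold."""
    return sorted(name for name, field in record.items() if field.confidence < threshold)
```

Because every value keeps its page and bounding box, a reviewer handed the output of `needs_review` can go straight to the spot on the page rather than rereading the whole document.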

The pattern underneath

None of this is edge-case weirdness. It's just what a general model does by default when you point it at a document this specialized. The fix isn't a cleverer prompt, it's structure that actually understands what a pet record is, validation that knows what can't possibly be true, and provenance that's willing to show its work. That's worth building properly one time, so everyone downstream can stop rebuilding a worse version of it. That's why we've built Pawssier.