I’ve written 437 construction estimates in my career. I kept a spreadsheet starting in 2004. My average miss rate over 22 years: 14.3% over budget. My best year was 2017 at 6.8%. My worst was 2021 — lumber tripled and I was off by 31% on four projects in a row. Nearly broke me.

I am, by most measures, pretty good at this.

Fourteen percent. That’s my number after two decades of obsessive calibration. An entry-level estimator working with spreadsheets and gut instinct misses by 20–30%. The industry average, according to an Oxford meta-analysis spanning 20 countries and 70 years of data, is 28% over budget. Nine out of ten projects exceed their estimates.

9 in 10 construction projects exceed their original budget (Propeller Aero / Oxford meta-analysis, 20 countries, 70 years)

Into this disaster walks a new generation of AI tools claiming they can hit 97% accuracy. I spent six weeks testing three of them on four actual residential projects.

What “97% Accuracy” Actually Means

That headline number comes from BuildVision AI’s 2025 construction report, which defines accuracy as deviation from final project cost on a dataset of completed commercial projects. There’s a critical caveat buried in the methodology: the 97% figure applies to quantity takeoffs from digital plans — measuring how much drywall, how many studs, how many linear feet of pipe. Not total project cost.

That distinction matters enormously. Quantity takeoff is the part of estimating that’s most like counting. The AI is very good at counting. A tool like STACK Construction Technologies or Togal.AI can ingest a PDF blueprint and extract material quantities in minutes — work that takes a human estimator four to eight hours on a residential project. The accuracy on that specific task genuinely approaches 95–97%.

Total project cost? Different animal entirely.

The Five Layers of an Estimate

The AACE International Recommended Practice 17R-97 breaks cost estimation into five classes, from Class 5 (screening, −50% to +100% accuracy) down to Class 1 (check estimate, −10% to +15%). Most residential contractors operate somewhere around Class 3 — budget-level — where the expected range is −20% to +30%.

| AACE Class | Project Definition | Accuracy Range | AI Impact |
|---|---|---|---|
| Class 5 (Screening) | 0–2% | −50% to +100% | Moderate (comparable data helps) |
| Class 4 (Feasibility) | 1–15% | −30% to +50% | Moderate |
| Class 3 (Budget) | 10–40% | −20% to +30% | Strongest impact zone |
| Class 2 (Control) | 30–75% | −15% to +20% | Good (real-time pricing) |
| Class 1 (Check) | 65–100% | −10% to +15% | Marginal (human judgment dominates) |
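
If you want to see what those ranges mean in dollars, here is a rough Python sketch that turns a point estimate into a low/high band for each class. The percentages are the ones from the table above; the function name `cost_band` and the $500,000 example figure are my own illustrative choices, not anything defined by AACE.

```python
# Accuracy ranges per AACE class as (low %, high %), taken from the table above.
AACE_RANGES = {
    5: (-50, 100),  # Screening
    4: (-30, 50),   # Feasibility
    3: (-20, 30),   # Budget
    2: (-15, 20),   # Control
    1: (-10, 15),   # Check
}


def cost_band(estimate, aace_class):
    """Return the (low, high) expected final cost for a point estimate.

    Hypothetical helper for illustration only.
    """
    low_pct, high_pct = AACE_RANGES[aace_class]
    low = estimate * (100 + low_pct) / 100
    high = estimate * (100 + high_pct) / 100
    return low, high


# Example: a $500,000 estimate (assumed figure) at every class, widest band first.
for cls in sorted(AACE_RANGES, reverse=True):
    low, high = cost_band(500_000, cls)
    print(f"Class {cls}: ${low:,.0f} to ${high:,.0f}")
```

At Class 3, a $500,000 estimate carries an expected band of $400,000 to $650,000, which is worth keeping in mind whenever a tool hands you a single number.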

AI tools are strongest in the middle — Class 3 and Class 4 — where historical project data and automated takeoffs can compress weeks of work into hours. They’re weakest at Class 1, where the accuracy depends on site-specific conditions that no algorithm can learn from a blueprint: the subcontractor who actually returns calls, the inspector who flags everything, the homeowner who changes the tile three times.

What I Tested

I ran four sets of residential plans through ProEst (Autodesk’s estimating platform, $200–$400/month), Buildxact (targeted at residential builders, AI-powered takeoffs), and CostToBuild.net (consumer-facing, plan-upload pricing). For comparison, I also had an estimator I trust, with 15 years’ experience, produce a manual estimate for each.

The plans: a 2,400 sq ft single-family in the Bay Area, a 1,800 sq ft ranch in Central Texas, a 3,100 sq ft custom in suburban Denver, and a gut renovation of a 1920s bungalow in Portland.

I already knew the actual build costs on all four. That’s the test.

78% of homeowners exceeded their renovation budget (Clever Real Estate survey, n=1,000, 2024)

Where AI Won

Takeoff speed. Every tool completed quantity takeoffs in under 20 minutes. The manual estimator took 6–11 hours per project. On the straightforward Texas ranch, the AI tools were within 3–4% on material quantities — genuinely impressive and consistent with the industry benchmarks.

Material pricing. The tools with live supplier integrations (ProEst and Buildxact) pulled current regional pricing that was more accurate than the manual estimator’s cost books, which were based on Q3 2025 averages. On the Denver project, lumber had dropped 8% since those cost books were printed. The AI caught it. The human didn’t.

Consistency. Run the same plans through the same AI tool twice and you get the same number. Run them through two different experienced estimators and you’ll get a 12–18% spread. This matters more than people think. A consistent estimate you can calibrate against is worth more than a “better” estimate that varies by who had coffee that morning.
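
Here is roughly what I mean by calibrating against a consistent estimate, as a small Python sketch. It assumes the simplest possible correction, an average ratio of actual cost to estimated cost over past projects, and the history figures are made up for illustration. None of the tools I tested expose anything like these function names.

```python
from statistics import mean


def calibration_factor(history):
    """Average ratio of actual cost to estimated cost over past projects.

    `history` is a list of (estimate, actual) pairs from the same tool.
    """
    return mean(actual / estimate for estimate, actual in history)


def calibrated(estimate, history):
    """Scale a new raw estimate by the tool's historical bias."""
    return estimate * calibration_factor(history)


# Hypothetical history: a tool that reliably comes in about 10% low.
history = [(400_000, 440_000), (300_000, 330_000), (500_000, 550_000)]

factor = calibration_factor(history)
print(f"calibration factor: {factor:.2f}")
print(f"raw $450,000 becomes ${calibrated(450_000, history):,.0f}")
```

This only works because the tool is consistent. If the same plans produced a different number on every run, the ratio would be noise and there would be nothing to correct for.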

Where AI Lost

The Portland renovation. Every AI tool choked. A 1920s bungalow with knob-and-tube wiring, plaster-and-lath walls, a foundation that had settled unevenly, and a previous owner who’d done unpermitted work in the bathroom — none of that is on the blueprints. The AI estimates came in 22–35% below actual. The manual estimator, who drove by the house and spent 20 minutes in the crawl space, was off by 9%.

Renovations are where estimates go to die, and AI makes them die faster by creating false confidence. The plans say “remove existing wall.” They don’t say the wall has asbestos texture or that someone ran HVAC ductwork through it in 1987.

Subcontractor reality. None of the tools know that the only two plumbers available in rural Central Texas are booked through September and charging a 15% premium. The AI priced plumbing at regional average. Actual cost was 22% higher because of local market constraints — the kind of thing you learn by making phone calls, not running algorithms.

Contingency. The AI tools tacked on a flat 10% contingency. An experienced estimator adjusts contingency by project type: 5–8% on a simple new build, 15–25% on renovation, 30%+ on historic rehab. CMiC’s 2025 overrun analysis confirms that flat contingency is the single biggest source of estimate failure on complex projects.
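
To put numbers on the gap, here is a minimal Python sketch comparing the flat 10% the tools applied with the adjusted ranges above. The project-type keys, the function names, and the $300,000 base cost are assumptions for illustration. The "30%+" for historic rehab has no stated ceiling, so the sketch leaves its upper bound open.

```python
# Contingency ranges by project type as (low %, high %), from the paragraph above.
# Historic rehab is "30%+", so its upper bound is left as None.
CONTINGENCY = {
    "simple_new_build": (5, 8),
    "renovation": (15, 25),
    "historic_rehab": (30, None),
}
FLAT_PCT = 10  # what the AI tools applied regardless of project type


def contingency_dollars(base_cost, project_type):
    """Return (low, high) contingency in dollars; high is None if unbounded."""
    low, high = CONTINGENCY[project_type]
    low_amt = base_cost * low / 100
    high_amt = None if high is None else base_cost * high / 100
    return low_amt, high_amt


def flat_shortfall(base_cost, project_type):
    """Dollars by which flat contingency misses the low end of the adjusted range.

    Returns 0 when the flat figure already covers the low end.
    """
    low_amt, _ = contingency_dollars(base_cost, project_type)
    return max(0.0, low_amt - base_cost * FLAT_PCT / 100)


# Example with an assumed $300,000 base cost.
for kind in CONTINGENCY:
    print(kind, contingency_dollars(300_000, kind), flat_shortfall(300_000, kind))
```

On a $300,000 renovation, flat contingency sets aside $30,000 where the low end of the adjusted range is $45,000, so the estimate starts $15,000 short before anyone opens a wall.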

My Uncomfortable Conclusion

The AI tools beat the manual estimator on two of four projects — the Bay Area new build and the Texas ranch. Clean plans, straightforward construction, good regional data. The manual estimator beat the AI on the renovation and the custom Denver home (where the architect specified six unusual material choices the AI couldn’t price accurately).

Combined? A human estimator using AI tools would have beaten both approaches on all four projects. The AI does the counting. The human does the crawl space. Neither is sufficient alone.

That’s not a sexy conclusion. Nobody’s writing press releases about “AI achieves adequate accuracy when supervised by an expensive professional.” But it’s the truth, and if you’re about to build a house, the truth is worth more than a marketing claim.

My 14.3% miss rate over 22 years? With the AI doing my takeoffs and pricing last year, I got it down to 8.1%. That’s real. That’s not 97%, and I’m not pretending it is. But on a $500,000 house, the difference between 14% and 8% is $30,000. That’s a kitchen. That’s someone’s savings account not getting emptied.

I’ll take it.

Sources