Titusville, Florida, population 49,000, sits on the Space Coast between the Indian River and Kennedy Space Center. Lockheed Martin just broke ground on a $140 million manufacturing facility there, joining Blue Origin, SpaceX supply chain vendors, and a wave of commercial development that has overwhelmed the city's planning infrastructure. Rocket launches are projected to hit 120 this year, and a development services department staffed for a sleepy riverside town is drowning in permit applications it was never sized to handle. Last January, the city extended a contract with SwiftGov, an AI-powered plan review platform, after processing 132 real-world development projects through the system's pilot program.
Buried in the announcement was a number that should concern anyone whose building permit recently passed through an AI review engine: approximately 80% accuracy.
Eighty percent.
On a system evaluating roughly 250 regulatory rules per engineering review, two out of every ten checks were wrong, not "flagged for human review" but flat-out incorrect.
The building code does not grade on a curve.
Why Every City Wants One
The rush to automate makes sense. Seven months. That is the average permitting delay in the United States, long enough to watch lumber prices swing 20%, long enough for a buyer's rate lock to expire, long enough to kill a deal. Meanwhile disposable income rises at half the pace of regulatory costs, and the building code compliance bill has become the single largest regulatory line item on a new home. Municipal leaders need to look like they are doing something about housing affordability without actually rewriting the codes, and AI plan review offers an irresistible pitch: keep every rule while cutting the time and cost of enforcement.
Jacksonville, facing a 50,000-unit affordable housing shortage, deployed SwiftGov alongside Microsoft and saved 600 hours. Naples partnered with Blitz AI, whose model was trained on all 800 pages of the Florida Building Code and produces redlined drawings in minutes. After Hurricane Helene buried Hernando County in rebuilding applications, SwiftGov cut single-family review times by 93%, collapsing thirty-day reviews into two-day ones. Austin, Honolulu, Seattle, Bellevue, Miami, Louisville, and Los Angeles have followed.
North of the border, 23% of Canadian municipalities report using AI for permitting, with a national training initiative teaching 2,500 officials to deploy these tools. Nobody is publishing accuracy audits.
What 80% Accuracy Actually Means
Titusville's disclosure deserves closer scrutiny because the city remains the only municipality that has put a number on AI plan review performance. SwiftGov itself frames the tool carefully, noting in the city's announcement: "The AI does not automate approvals or replace professional judgment. Instead, it accelerates how staff identify applicable regulations, surface potential compliance issues and maintain consistency across reviews."
That framing is important. It positions the AI as a first pass: read the plans, check them against 250-odd rules, hand a report to a human reviewer who catches whatever the machine missed.
Sounds reasonable. Until you consider what happens to human attention when a machine has already done the checking.
Aviation calls it automation complacency, medicine calls it alert fatigue, and every field where a human monitors a mostly-correct system produces the same result: the human stops looking. A 2016 NASA study on flight deck automation found that pilots missed critical anomalies at significantly higher rates when automated monitoring systems were active, precisely because they trusted the automation to catch what needed catching, precisely because the automation usually did.
Building plan review is not air traffic control, but the cognitive dynamic is identical: a highly competent automated system lulls an overworked human into trusting outputs they would otherwise verify. A plan reviewer who receives an AI-generated report showing 200 rules checked and passed is not going to re-check all 200. They will zero in on the 50 flagged as potential violations, while the wrong verdicts hiding among the 200 green checkmarks go unquestioned, because the whole point of deploying AI was to stop making humans read every line.
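To see how many wrong verdicts could sit in the unexamined pile, here is a rough Python sketch. It assumes, purely for illustration, that the roughly 20% error rate applies evenly to passed and flagged verdicts; no published data confirms that split, and the function name is hypothetical.

```python
def split_expected_errors(total_rules, flagged, error_rate):
    """Expected wrong verdicts among passed and flagged rules.

    ASSUMPTION: errors are spread uniformly across both piles.
    Real systems may err far more on some rule types than others.
    """
    passed = total_rules - flagged
    wrong_passes = passed * error_rate   # marked compliant, actually not
    wrong_flags = flagged * error_rate   # marked as violations, actually fine
    return wrong_passes, wrong_flags


# Figures from the scenario above: 250 rules, 50 flagged, ~80% accuracy.
wrong_passes, wrong_flags = split_expected_errors(250, 50, 0.20)
print(f"Expected wrong verdicts among the green checkmarks: {wrong_passes:.0f}")
print(f"Expected wrong verdicts among the flagged rules:    {wrong_flags:.0f}")
```

Under that assumption, most of the errors land in the pile the reviewer never opens. A wrong "pass" is by definition a missed violation, while a wrong flag is a false alarm that a human is at least positioned to catch.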
The Liability Gap Nobody Has Addressed
Consider a concrete scenario. An AI plan review engine checks a residential set of drawings against the Florida Building Code's stairway provisions. Section R311.7.5.1 of the IRC specifies a maximum riser height of 7¾ inches, a dimension that matters because a riser just three-eighths of an inch too tall changes the angle of every step on a flight and multiplies the probability of a misstep fall, particularly for elderly occupants navigating stairs in low light. SwiftGov scans the architectural stair detail, reads the noted dimension as compliant, and marks it green. But the actual dimension on the drawing is 8⅛ inches, rendered in a hand-annotated note that the OCR module misread. Seeing the green checkmark, the human reviewer moves on, the permit issues, and the builder builds the stairs as drawn.
Eighteen months later, someone falls. A stair riser exceeds code by ⅜ of an inch. The field inspection was a 15-minute visit, one of eleven that day, and ⅜ of an inch is not visible without a tape measure and a reason to suspect a problem.
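The ⅜-inch figure is simple fraction arithmetic, the kind of check a rule engine performs once it has read the dimension correctly. A minimal Python sketch, assuming the 7¾-inch limit quoted above; the function name `riser_overage` is hypothetical and not drawn from any vendor's software.

```python
from fractions import Fraction

# 7 3/4 inches: the maximum riser height cited above (IRC R311.7.5.1).
MAX_RISER_IN = Fraction(31, 4)


def riser_overage(riser_in, limit_in=MAX_RISER_IN):
    """Return how far a riser exceeds the limit, or 0 if it complies."""
    return max(riser_in - limit_in, Fraction(0))


drawn = Fraction(65, 8)  # 8 1/8 inches, the hand-annotated dimension
print(f"Over the limit by {riser_overage(drawn)} inch")  # 3/8
```

The comparison itself is trivial. The failure in this scenario sits upstream, in misreading a handwritten 8⅛ as a compliant number.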
Who is liable? Every vendor's terms disclaim consequential damages, and every vendor positions its product as a "decision-support tool," a legal distinction that means its contractual obligations end at delivering software, not safety. Blitz AI's VP of Growth told the Business Observer: "A human takes over. The AI gives them a head start." SwiftGov says the AI "does not automate approvals or replace professional judgment."
Translation: if something goes wrong, the human reviewer bears the professional responsibility, the municipal liability remains anchored to the plan review function, and the vendor walks away having collected its fee and contractually disclaimed the accuracy of its output.
No state legislature has enacted a statute allocating liability for AI-assisted plan review errors, even as dozens of municipalities have deployed these systems across thousands of permits. Insurance carriers have not published underwriting guidance on how AI adoption affects municipal errors-and-omissions coverage. The closest federal precedent is the CFPB's 2024 rule governing automated valuation models in home appraisals, which requires safeguards for accuracy and nondiscrimination but applies to valuations, not code compliance. Nothing comparable exists for the AI systems deciding whether your home's structural connections, fire separation, and egress paths meet standards that exist because people died in buildings that lacked them.
Calculating the Cost of a Missed Violation
Plan review exists to catch code violations on paper, where fixing them costs a red pen and a revised drawing. Catching the same violation during construction means demolition and reconstruction, and catching it after occupancy, when drywall covers the framing and families have moved in, costs litigation, remediation, sometimes relocation, and always more money than anyone budgeted because nobody budgets for the thing the permit review was supposed to prevent.
| When Caught | Typical Fix Cost | Multiplier (vs. $500 paper fix) |
|---|---|---|
| Plan review (on paper) | $0 to $500 (designer revision) | 1x |
| During construction (pre-drywall) | $1,500 to $8,000 | 3-16x |
| During construction (post-drywall) | $5,000 to $25,000 | 10-50x |
| After occupancy (remediation) | $10,000 to $100,000+ | 20-200x |
| After incident (litigation) | $50,000 to $1M+ | 100-2000x |
These ranges come from published construction defect data. A National Institute of Standards and Technology study estimated that inadequate interoperability in the capital facilities industry costs $15.8 billion annually, with two-thirds attributable to costs incurred during operations and maintenance rather than design and construction. Isolating the residential share is difficult, but the multiplier effect is well-documented: the Construction Industry Institute's rule of thumb is that every dollar spent on quality during design saves five to ten dollars during construction and twenty to one hundred dollars during operations.
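The multiplier column in the table above appears to be each cost range divided by a $500 paper-stage fix, the top of the plan-review row. That baseline is an inference from the numbers, not something the underlying sources state. A short Python sketch that reproduces the column under that assumption:

```python
# ASSUMPTION: multipliers are measured against the $500 upper bound
# of the paper-stage fix cost. The open-ended "+" on the last two rows
# is ignored, so those upper multipliers are floors, not caps.
BASELINE_USD = 500

STAGES = {
    "During construction (pre-drywall)": (1_500, 8_000),
    "During construction (post-drywall)": (5_000, 25_000),
    "After occupancy (remediation)": (10_000, 100_000),
    "After incident (litigation)": (50_000, 1_000_000),
}


def multiplier_range(low_usd, high_usd, baseline_usd=BASELINE_USD):
    """Express a fix-cost range as multiples of the paper-stage baseline."""
    return low_usd / baseline_usd, high_usd / baseline_usd


for stage, (low, high) in STAGES.items():
    lo_x, hi_x = multiplier_range(low, high)
    print(f"{stage}: {lo_x:.0f}x to {hi_x:.0f}x")
```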
Do the math. Set those multipliers beside NAHB's $40,200 code compliance cost. Suppose AI plan review assesses 80% of rules correctly at the paper stage (cost: minimal) but gets the other 20% wrong, and that a fraction of those errors become built defects costing an average of $5,000 each to remediate. For a system reviewing 250 rules, the numbers look like this:
250 rules × 20% error rate = 50 rules incorrectly assessed. Not all incorrect assessments produce real-world violations: many are false positives on rules that the designer actually satisfied, or misses that a subsequent inspection would catch. But even if 10% of incorrect assessments result in a constructed defect, that is five defects per project that passed plan review unchallenged. At $5,000 average remediation per defect: $25,000 in rework for problems that would have cost next to nothing to correct on paper had those 50 rules been assessed correctly.
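The same arithmetic as a Python sketch, with the 10% materialization rate exposed as a parameter because it is an analytical assumption rather than a measured value. The function and parameter names are illustrative only.

```python
def expected_rework(rules=250, error_rate=0.20,
                    materialization_rate=0.10, cost_per_defect=5_000):
    """Return (wrong assessments, built defects, rework cost in USD).

    ASSUMPTIONS: materialization_rate and cost_per_defect are the
    article's illustrative inputs, not empirical measurements.
    """
    wrong = rules * error_rate
    defects = wrong * materialization_rate
    return wrong, defects, defects * cost_per_defect


wrong, defects, cost = expected_rework()
print(f"{wrong:.0f} wrong assessments, {defects:.0f} defects, ${cost:,.0f} rework")

# Sensitivity: how the estimate moves with the assumed materialization rate.
for rate in (0.05, 0.10, 0.25):
    _, d, c = expected_rework(materialization_rate=rate)
    print(f"rate {rate:.0%}: {d:.1f} defects, ${c:,.0f}")
```

The estimate scales linearly with the materialization rate, so halving or doubling that single assumption halves or doubles the rework figure.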
Read that again. On these assumptions, $25,000 in preventable rework per project could exceed the schedule-delay savings that AI plan review is sold on, which means a municipality adopting the tool to save money on permitting may be shifting a larger cost onto the builders and homeowners whose projects pass through the system with uncaught violations. Faster permitting, higher rework: that is not efficiency, it is cost displacement.
The Strongest Case for AI Plan Review
A fair counterargument deserves its full weight. Human plan reviewers are not 100% accurate either. They are inconsistent, they make more mistakes at 4 PM than at 9 AM, and they interpret the same code section differently depending on which reviewer pulls your application. The Independent Institute compared the problem to baseball umpires' varying strike zones. Worse, the municipal development workforce has been shrinking for 15 years while the regulatory load has grown more complex, a "de facto national underinvestment in permitting capacity" that HousingWire estimates in the tens of billions.
An AI that catches 80% consistently may outperform a burned-out human reviewer working a 60-application backlog who catches 70% on a good day and 50% on a bad one, which is the entire justification that vendors are selling and cities are buying. Fair enough. Hernando County's 93% reduction in review times is real. Blitz AI's reported 85% reduction in resubmittals suggests that its upfront AI screening catches enough issues early to save builders multiple rounds of revision, which saves real money even if the system is not perfect.
None of this means AI plan review should stop. It means that deployment without independent accuracy benchmarking, without published error analysis, and without liability frameworks is not innovative: it is reckless. Three questions should precede every adoption decision, and no municipality deploying these systems has answered any of them: how accurate does an AI plan review system need to be before a city can rely on it, who independently verifies that threshold, and what happens to the homeowner whose structural defect sailed through the system's blind spot?
What Nobody Is Requiring
Aviation has the FAA. Medical devices have the FDA. Financial algorithms have the SEC, CFPB, and a growing body of state law (Colorado's SB24-205, now being challenged by xAI with DOJ backing, attempted to regulate algorithmic discrimination). Building code plan review AI? Nothing. No certification, no accuracy threshold, no mandatory audit, no disclosure requirement. A city can deploy an AI plan review system trained on its code, process thousands of permits through it, and never tell an applicant that a machine participated in the review.
A 2026 paper published in Springer Nature's Discover Cities journal audited LLM judgment against human professional judgment in urban infrastructure governance using the Delphi method. The authors' recommendation: "LLMs should be treated as junior analytical assistants whose outputs require review and validation." They called for audit-based adoption with explicit readiness levels defining boundaries on permissible use, and warned that "outputs from models classified at higher readiness levels may support internal analysis but should not enter official records without independent verification."
Zero cities have published results of such an audit. And consider the source: Titusville's 80% figure came from a press release about extending a contract, not from an independent evaluation, and it describes self-reported performance by a vendor with a financial interest in continued deployment. Draw your own conclusions.
What Should Happen Next
Builders: request the AI-generated compliance report alongside your permit approval letter. You are entitled to those records under most state public records laws. Compare the AI's findings to your own code analysis section by section, and document discrepancies in writing before construction proceeds.
Homeowners: if your jurisdiction uses AI plan review, hire a code consultant to independently verify structural connections, fire separation assemblies, egress dimensions, and energy code compliance. Cost: $500 to $1,500, which is negligible compared to the cost of remediating a code violation after construction.
Municipalities: require your vendor to publish accuracy data. Not self-reported accuracy on a curated test set, but measured accuracy on your jurisdiction's actual permit applications, benchmarked against a human reviewer processing the same drawings blind. Publish annually, broken down by code section. Averaged accuracy is a lie. A system 100% accurate on setbacks and 0% accurate on structure is not 50% safe, and those two numbers should never share a denominator.
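A small Python sketch of why the breakdown matters, using made-up sample data that mirrors the setbacks-versus-structure example above. The record format, one `(code_section, ai_verdict_was_correct)` pair per check, is a hypothetical layout and not any vendor's export schema.

```python
from collections import defaultdict


def accuracy_by_section(results):
    """Map each code section to the share of AI verdicts that were correct."""
    tally = defaultdict(lambda: [0, 0])  # section -> [correct, total]
    for section, correct in results:
        tally[section][0] += int(correct)
        tally[section][1] += 1
    return {section: c / n for section, (c, n) in tally.items()}


def pooled_accuracy(results):
    """Single averaged figure across every check, regardless of section."""
    results = list(results)
    return sum(int(correct) for _, correct in results) / len(results)


# Fabricated illustration: perfect on setbacks, wrong on every structural check.
sample = [("setbacks", True)] * 10 + [("structural", False)] * 10

print(f"Pooled accuracy: {pooled_accuracy(sample):.0%}")
for section, acc in accuracy_by_section(sample).items():
    print(f"  {section}: {acc:.0%}")
```

The pooled figure reports 50% while the structural checks, the ones that decide whether a building stands, are wrong every time. An audit comparing AI verdicts against a blind human review would supply the correct/incorrect labels this sketch takes as given.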
Limitations of This Analysis
Titusville's 80% accuracy figure is the only publicly reported accuracy metric for a deployed AI plan review system in the United States. It may not be representative of SwiftGov deployments elsewhere or of other platforms (Blitz AI, Archistar eCheck, CivCheck, AutoReview.AI). Our $25,000 rework estimate assumes a 10% defect materialization rate from incorrectly assessed rules, which is an analytical assumption, not an empirical measurement, and the actual rate could be significantly higher or lower depending on the types of rules the AI gets wrong. Structural and fire safety errors are far more consequential than landscaping setback errors, and this analysis does not distinguish between them because no published data breaks AI accuracy down by code section. The automation complacency research cited here is drawn from aviation and medicine; no study has examined the phenomenon in building plan review specifically. NAHB's $131,734 regulatory cost figure reflects average national conditions and varies substantially by jurisdiction, with some markets significantly higher (coastal California) and others lower (rural markets with minimal overlay districts). CFPB's AVM rule applies to automated home valuations, not plan review, and is cited here as a regulatory precedent, not a direct analog.