AI Checked Your Handrail Height in Seconds. It Can't Tell You If Your ADU Is Legal.

Researchers at the Urban Institute uploaded Minneapolis's entire 467-page zoning code to ChatGPT 5.1 and asked a straightforward question that any homeowner building an accessory dwelling unit might ask: what are the setback requirements for my lot? It had the full document, web search enabled, and a retrieval-augmented generation framework pulling relevant sections. It still gave an unhelpful answer, because the setback rules for that lot required cross-referencing three different chapters, an overlay district amendment from 2019, and a state preemption statute that overrode two of the local provisions. The AI retrieved some of those sections, but it also retrieved irrelevant ones, and it could not tell which was which.

More than a thousand miles south in Gainesville, Florida, a different AI system reviewed a set of residential construction plans against the Florida Building Code and flagged a fire egress violation in under 30 minutes. A human reviewer would have taken weeks to find the same violation. Same technology, same year, radically different results, and what distinguished the two was not the model. It was the code.

Two Kinds of Rules

Building codes and zoning codes govern the same parcel of land, but they are written in fundamentally different languages. Building codes are mostly prescriptive: IRC R311.7.8.1 says handrails shall be 34 to 38 inches high, which is a number a machine can check. Florida's building code specifies concrete thickness, fire separation distances, glazing standards, roof ventilation ratios. Every requirement resolves to a measurement, a material, or a yes-or-no compliance question.
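To make "a number a machine can check" concrete, here is a minimal sketch of a prescriptive rule expressed as data. The 34-to-38-inch range is the handrail figure quoted above. The `RangeRule` class and `review` function are hypothetical names invented for illustration, not part of any vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeRule:
    """A prescriptive requirement: the measured value must fall in [minimum, maximum]."""
    citation: str
    description: str
    minimum: float
    maximum: float
    unit: str

    def check(self, measured: float) -> bool:
        # Inclusive bounds: a value exactly at either limit complies.
        return self.minimum <= measured <= self.maximum


# Handrail height range as quoted in the text (IRC R311.7.8.1).
HANDRAIL_HEIGHT = RangeRule("IRC R311.7.8.1", "handrail height", 34.0, 38.0, "in")


def review(rule: RangeRule, measured: float) -> str:
    """Format a one-line compliance finding for a single measurement."""
    verdict = "complies" if rule.check(measured) else "VIOLATION"
    return f"{rule.citation} {rule.description}: {measured} {rule.unit} -> {verdict}"


print(review(HANDRAIL_HEIGHT, 36.0))
print(review(HANDRAIL_HEIGHT, 40.5))
```

There is no equivalent `check` method for "compatible with the character of the surrounding neighborhood," because that phrase does not reduce to a comparison.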

Zoning codes are discretionary. Wildly so. They say things like "compatible with the character of the surrounding neighborhood" and "adequate buffering from adjacent uses" and "the proposed structure shall not create an undue concentration of density." Those phrases have legal meaning, established through decades of variance hearings, appeals board decisions, and case law. A planning director in Minneapolis interprets "compatible character" differently than one in Palo Alto, and both of them interpret it differently on a Tuesday than they did on a Thursday, depending on who showed up to the public hearing.

AI does not attend public hearings.

Where AI Plan Review Actually Works

AutoReview.AI, developed over ten years of research at the University of Florida, is the clearest success story in automated code compliance. Named UF Invention of the Year in 2022, the system parses construction plans against the full Florida Building Code, an 800-page prescriptive document, and identifies violations in minutes. Gainesville, Pasco County, and Altamonte Springs are customers. Pasco County Commission Chairman Jack Mariano reported that turnaround time improved and the backlog shortened. In Altamonte Springs, City Manager Frank Martz said work that took a staff employee a few weeks could be completed by AI in as little as 30 minutes.

Those numbers are real, and they matter because permit backlogs cost builders money directly: every week a plan sits in a review queue adds carrying costs on land, construction loans, and labor contracts. A survey of Florida building professionals conducted by the UF team identified the top delay factors as high submission volume, lack of qualified staff, complex plans, and the sheer number of corrections required on resubmission. The most frequently violated provisions were means of egress, fire resistance, glazing, and roof ventilation. All prescriptive. All measurable. All things a machine can verify by comparing a number on a plan to a number in a code table. That is exactly the kind of work that should be automated, and exactly the kind of work that Florida's AI plan review vendors have chosen to target first. The reasons become obvious the moment you ask a model to interpret "compatible with the character of the surrounding neighborhood."

In Hernando County, SwiftBuild.ai cut review times from roughly 30 days to as little as 2, saving an estimated $1 million in administrative costs. But its scope is single-family homes and subdivisions, the simplest zoning case there is. Final approvals still require a human reviewer's signature, and although the platform covers both zoning and building review, it explicitly targets the category of project where the zoning answer is almost always yes.

467 pages: the length of Minneapolis's zoning code. The Urban Institute found AI could not reliably cross-reference its sections to answer basic homeowner questions.

Why Zoning Breaks the Model

The Urban Institute's benchmarking study tested multiple AI models, including Mistral, Llama, GPT 5-mini, and ChatGPT 5.1 with web search, against Minneapolis's zoning code. They tested two personas: a professional developer evaluating a multifamily project and a homeowner trying to build an ADU. Their RAG framework retrieved relevant text from the document, fed it to the model, and asked for answers.

Results were not catastrophic, which would have been easier to dismiss. They were worse: confidently mediocre. Models often provided answers that sounded authoritative but missed critical provisions that would have changed the outcome entirely. When questions required synthesizing information from multiple code sections, the retrieval step pulled the right text alongside irrelevant text, and the models could not distinguish between the two. Researchers also tried uploading the full 467-page code directly to ChatGPT 5.1, bypassing retrieval entirely. This did not help at all. "Information overload," the researchers wrote. In other words, even a context window large enough to hold the entire code was no substitute for structured retrieval. The underlying document actively resists being parsed by a machine that has never sat through a public hearing, never read a variance decision's footnotes, and never learned the informal conventions that every planning department develops over decades of interpreting its own code.
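A toy example shows how retrieval can return the right text alongside the wrong text. This sketch is not the Urban Institute's RAG framework: it uses a deliberately crude keyword-overlap score, and every section name and sentence in `SECTIONS` is invented for illustration.

```python
import re

# Hypothetical code sections, invented for this example only.
SECTIONS = {
    "base-district": "Minimum side yard setback for residential lots is five feet.",
    "adu-standards": "An accessory dwelling unit shall meet the setback of the base district unless modified by an overlay.",
    "overlay-amendment": "Within the overlay, the minimum yard for an accessory dwelling unit is reduced.",
    "fence-rules": "Fences in a required side yard setback shall not exceed six feet.",
    "sign-rules": "Signs shall observe a setback of ten feet from residential lots.",
}


def tokens(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, k=4):
    """Return the k section names sharing the most words with the question."""
    q = tokens(question)
    ranked = sorted(SECTIONS, key=lambda name: len(q & tokens(SECTIONS[name])), reverse=True)
    return ranked[:k]


QUESTION = "What is the side yard setback for an accessory dwelling unit on my residential lot?"
print(retrieve(QUESTION))
```

With `k=4`, the three sections that bear on the question come back, and so does the fence rule, which shares the words "side yard setback" but says nothing about dwelling units. The score cannot tell the scorer which passages govern the answer; it only measures overlap.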

One finding, though, was genuinely encouraging: the models were reasonably well-calibrated on confidence, meaning they said "I don't know" when they were uncertain rather than hallucinating an answer. That matters. A model that admits ignorance is less dangerous than one that fabricates a confident wrong answer about whether your lot allows a second dwelling unit. But "I don't know" is not what a homeowner needs when they are deciding whether to spend $150,000 on an ADU, and it is not what a planning department can use to clear a backlog.

The researchers' conclusion was blunt: even further iteration on the AI models would not dramatically improve results "without changes to zoning code documentation itself to make information retrieval easier for AI." In plain language, the problem is not the software. Codes are written in a way that machines cannot parse, and rewriting them is a policy problem, not a technology problem.

The Cross-Reference Trap

Consider what happens when a homeowner in a mid-size city asks whether they can build an ADU on their lot. The answer might require reading the base zoning district table (Chapter 20), then the ADU-specific provisions (Chapter 20, Article XIV, Sections 20-340 through 20-348), then an overlay district modification adopted by ordinance in 2021 that amends three of those sections, then the state's ADU preemption statute that overrides two of the local provisions, then the local implementing ordinance that adds conditions the state statute did not contemplate. Five documents, each cross-referencing the others, some contradicting each other, with the contradictions resolved by a hierarchy of authority that the code itself does not always make explicit.
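The five-document problem can be sketched as a precedence lookup. Everything here is an assumption made for illustration: the ordering in `PRECEDENCE`, the `Provision` class, and the setback values are all hypothetical, and a real hierarchy of authority is rarely this tidy.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed order of authority, strongest first. This ranking is illustrative;
# real hierarchies vary by state and are often not written down in the code.
PRECEDENCE = [
    "state_preemption_statute",
    "local_implementing_ordinance",
    "overlay_ordinance",
    "adu_provisions",
    "base_district_table",
]


@dataclass(frozen=True)
class Provision:
    source: str   # one of the names in PRECEDENCE
    topic: str
    value: float


def resolve(provisions: List[Provision], topic: str) -> Optional[Provision]:
    """Return the provision on `topic` from the highest-ranked source present."""
    candidates = [p for p in provisions if p.topic == topic]
    if not candidates:
        return None
    return min(candidates, key=lambda p: PRECEDENCE.index(p.source))


# Hypothetical values, invented for this example.
ALL_FIVE = [
    Provision("base_district_table", "rear_setback_ft", 20.0),
    Provision("adu_provisions", "rear_setback_ft", 10.0),
    Provision("overlay_ordinance", "rear_setback_ft", 8.0),
    Provision("state_preemption_statute", "rear_setback_ft", 4.0),
    Provision("local_implementing_ordinance", "owner_occupancy_required", 1.0),
]

complete = resolve(ALL_FIVE, "rear_setback_ft")

# Simulate a retrieval step that failed to surface the state statute.
without_statute = [p for p in ALL_FIVE if p.source != "state_preemption_statute"]
partial = resolve(without_statute, "rear_setback_ft")

print(complete.source, complete.value)
print(partial.source, partial.value)
```

The lookup itself is trivial once the ranking and all five documents are in hand. The failure mode is the second call: drop one document and the function still returns an answer, with nothing to signal that a higher authority was missing.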

A human planner navigates this by calling a colleague, pulling up the variance history on the parcel, and checking whether the planning commission issued guidance on that overlay district last year. An AI model navigates it by retrieving text fragments and hoping the right ones land in the context window together. When they do not, the answer is wrong, and the homeowner discovers this after spending $8,000 on architectural plans that the planning department rejects at intake.

Tampa's procurement office understood this boundary perfectly when it issued a request for information seeking AI-based plan review software. Its RFI specified "basic prescriptive (exact) requirements" and listed examples: square footage calculations, code references, building elevations, product approval numbers, engineer stamps. Zoning interpretation was not on the list. Smart. That city already knew what the Urban Institute's researchers proved with a study.

Florida's Legislative Framework

Florida House Bill 803 explicitly authorizes an "automated or software-based plans review system" for single-trade plan reviews, including compliance with the National Electrical Code and the Florida Building Code. Its scope is narrowly limited to prescriptive code, and the distinction appears intentional. Nobody in Tallahassee wrote a bill authorizing AI to interpret whether a proposed structure is "compatible with the character of the surrounding neighborhood." Good. That phrase has generated more litigation than the entire electrical code combined.

Meanwhile, the International Code Council has launched an AI Navigator, a custom LLM trained on the model building codes, which lets officials ask questions like "What are the gaps in our high-rise fire safety rules versus the latest IBC?" The model codes are prescriptive. It works. Nobody has built the equivalent for zoning. Why? Because nobody has figured out how to encode "the board's general sentiment during the April 2023 hearing" into a retrievable document.

What This Means If You're Building

If your project requires only building code review, automated plan review is real and available today. Expect faster turnaround, fewer resubmission cycles, and lower carrying costs. Ask your jurisdiction whether they use or are piloting an AI review system. Cities adopting these tools, particularly in Florida, are already seeing review times collapse from weeks to days.

If your project involves a zoning question, do not rely on any AI tool for the answer. Not ChatGPT, not a vendor's zoning lookup product, not a chatbot on a city website. Current technology cannot reliably cross-reference overlay districts, variance histories, or discretionary provisions. Pay a land use attorney or a planning consultant to read the code, check the parcel history, and call the planning department. That $500 consultation is cheaper than the $8,000 in architectural plans you will waste on a project the zoning code does not allow.

If you work in a planning department, the Urban Institute's most actionable recommendation was not about better AI. It was about better codes: machine-readable text, metadata tagging, restructured tables, symbol-free formatting. The AI is waiting for the codes to catch up. The cities that rewrite their zoning documents for machine readability will be the first to benefit from automated zoning review. Every other city will keep paying human planners to interpret human-readable codes, which is exactly what those codes were designed for.
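What machine-readable text with metadata tagging might look like can be sketched in a few lines. The schema below is hypothetical: the section IDs, the `kind` and `links` fields, and both helper functions are invented for illustration and do not come from the Urban Institute's recommendations or any city's actual code.

```python
import json

# Hypothetical tagged zoning code: each section declares its type and the
# other sections that amend, override, or are referenced by it.
TAGGED_CODE = json.loads("""
{
  "adu-standards":     {"kind": "prescriptive",  "links": ["base-district", "overlay-amendment"]},
  "base-district":     {"kind": "prescriptive",  "links": []},
  "overlay-amendment": {"kind": "prescriptive",  "links": ["state-preemption"]},
  "state-preemption":  {"kind": "prescriptive",  "links": []},
  "design-review":     {"kind": "discretionary", "links": []}
}
""")


def linked_sections(start, code=TAGGED_CODE):
    """Return every section reachable from `start` by following explicit links."""
    seen, stack = [], [start]
    while stack:
        section_id = stack.pop()
        if section_id in seen:
            continue  # already visited; also guards against circular references
        seen.append(section_id)
        stack.extend(code[section_id]["links"])
    return seen


def needs_human(section_ids, code=TAGGED_CODE):
    """Return the sections tagged as discretionary, which a person must interpret."""
    return [s for s in section_ids if code[s]["kind"] == "discretionary"]


sections = linked_sections("adu-standards")
print(sorted(sections))
print(needs_human(sections + ["design-review"]))
```

With explicit links, gathering every section that bears on a question becomes a graph walk instead of a similarity guess, and a `kind` tag lets a system route discretionary provisions to a planner rather than attempting an answer.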

Limitations

The Urban Institute tested AI against only Minneapolis's zoning code. Cities with simpler or more recently codified zoning documents may get better results. We do not have accuracy rates from the Florida AI plan review pilots. Hernando County, Gainesville, Pasco County, and Altamonte Springs have published time savings and cost savings, but no jurisdiction has published a head-to-head comparison of AI versus human reviewer accuracy on the same set of plans. The prescriptive-versus-discretionary distinction is a spectrum rather than a binary: many code provisions contain elements of both, and a provision that looks prescriptive ("minimum lot width: 50 feet") can become discretionary when a variance has been granted or an overlay modifies the standard. We also lack data on how well AI performs on zoning codes that have been recodified with machine readability in mind, since very few cities have done this.
