The AI Drew 31 Floor Plans. Only 8 Had Windows That Worked.

Nine point six lux. That is the amount of daylight an AI-generated floor plan delivered to a living room in Tromsø, Norway, on a winter afternoon. IES Lighting Handbook recommendation for a living space: 300 lux. This AI hit three percent of that. You could read your phone in that room, barely, if you held it close to your face and squinted. You could not read a book, find your keys, or tell whether the shirt you grabbed from the closet was navy or black.

That number comes from a peer-reviewed study published in AI EDAM (Cambridge University Press), which tested AI-generated floor plans against daylight simulation software across five climate zones, from Jakarta to the Arctic Circle. Researchers fed climate-specific prompts into three AI tools: ChatGPT, Microsoft Copilot, and LookX AI, a platform marketed specifically for architects. They generated 31 plans and could use eight.
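Those two counts are enough to check the headline share. A minimal sketch in Python, using only the figures reported above (31 plans generated, 8 usable); the variable names are illustrative and not taken from the study:

```python
# Counts reported in the article: 31 plans generated, 8 usable for simulation.
generated = 31
usable = 8

usable_share = usable / generated      # fraction that could be simulated
unusable_share = 1 - usable_share      # fraction that had to be set aside

print(f"usable:   {usable_share:.0%}")
print(f"unusable: {unusable_share:.0%}")
```

Roughly one plan in four survived to the simulation stage, so every daylight figure that follows describes the better outputs, not the typical one.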

74% of AI-generated floor plans were too architecturally illegible to simulate for daylight performance. Only 8 of 31 outputs passed basic usability thresholds. (AI EDAM, Cambridge University Press)

LookX, the one built for architects, could not draw windows.

I will say that again. Could not draw windows. Not windows that opened the wrong direction or windows that were too small for the climate zone, but windows that simply were not there. With academic restraint, the researchers wrote that LookX "consistently failed to depict critical plan elements, such as fenestration or defined room boundaries." They excluded it entirely. An architecture-specific AI tool that cannot render the most fundamental aperture in a building is not a design assistant. It is a box-drawing machine that forgot what boxes are for.

What the 70% Actually Gets You

Patrick Murphy, CEO of Maket, the first residential AI floor plan tool to offer what the industry calls "agentic editing" (you tell it "make the kitchen bigger" in a chat, and it redraws the plan live), puts the value proposition simply: the AI handles 70 to 75 percent of schematic design work, and then you bring in an architect for the rest.

That remaining 25 to 30 percent, according to Maket's own documentation, includes structural review, code compliance, engineering, furniture placement, material selection, and anything involving construction documents. What Maket produces is a schematic floor plan, not a permit set. It cannot tell you whether your dream kitchen island fits the room or whether the load-bearing wall you just deleted was holding up the second floor.

Maket is transparent about the limitations, which I respect. Their agentic editing guide advises users to make one change at a time because "the longer the prompt, the less reliably the AI respects every part of it." Vague requests like "reorganize this for better flow" produce unpredictable results because, as they acknowledge, "the AI has to guess what 'better' means to you."

Architects do not guess what better means. They have spent years learning why a circulation path from the front door through the living room to the kitchen should not bisect the space where a family watches television. They know that a south-facing window wall in Phoenix requires a completely different shading strategy than a south-facing window wall in Portland, and that the difference is the reason one house is luminous and the other is an oven. This knowledge is not the 25 percent the AI leaves on the table. It is the entire premise of spatial design.

The Daylight Problem Nobody Tested Until Now

What makes the Cambridge study striking is not that AI floor plans have problems. Everyone knew that. What is striking, what should concern anyone about to build a house using AI-generated schematics, is the specific dimension where these tools fail most comprehensively: environmental responsiveness, the relationship between a building and the climate it sits in. These tools do not merely produce imperfect layouts. They produce layouts that have no relationship to the sun.

After reconstructing usable AI plans in AutoCAD, the researchers built 3D models, assigned standard material reflectances (white painted walls at 0.85, clear glazing at 0.65, wooden flooring at 0.35), and ran daylight simulations through Velux Daylight Visualizer across equinox and solstice dates. They measured illuminance in living rooms, kitchens, and bedrooms.

ChatGPT-generated plans performed better than Copilot's. In tropical climates like Jakarta, ChatGPT hit 900-plus lux in the living room, well above the 300-lux threshold, though dangerously close to glare territory with no evidence of passive shading logic. Copilot's plans were erratic in a way that revealed the underlying absence of spatial reasoning: the Jakarta kitchen measured 1,347.8 lux, which suggests massive unshaded openings dumping equatorial sun directly onto countertops, while the Alice Springs living room measured only 306.1 lux, barely scraping the minimum in a desert climate that offers some of the most intense natural light on earth.

1,741 lux in Copilot's Winnipeg living room in summer. Nearly 6x the 300-lux recommendation. No passive shading logic. No seasonal modulation. The AI pointed every window at the sun and forgot that seasons exist.

Winnipeg told the most damning story of the entire study. Copilot generated a summer living room at 1,740.6 lux, almost six times the 300-lux recommendation, a space so bright it would be uninhabitable without blinds drawn shut for months. Confronted with the same climate's winter, that same AI offered no compensating strategy for the opposite extreme. The researchers' conclusion was careful but unmistakable: "generative AI tools, despite interpreting prompt keywords like 'natural light' or 'cross-ventilation,' do not yet possess a functional understanding of solar geometry or daylight performance across seasons."
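To put the quoted readings on one scale, here is a small Python sketch that expresses each as a multiple of the 300-lux living-space recommendation cited earlier. The readings are the ones quoted in this article; applying the same 300-lux target to the kitchen is an assumption made for comparison only, and the function name is illustrative.

```python
# Illuminance readings quoted in the article, in lux.
# Assumption: the 300-lux living-space target is applied to every room,
# including the kitchen, purely to make the readings comparable.
TARGET_LUX = 300.0

readings = {
    "Tromso living room (winter)": 9.6,
    "Alice Springs living room (Copilot)": 306.1,
    "Jakarta kitchen (Copilot)": 1347.8,
    "Winnipeg living room, summer (Copilot)": 1740.6,
}

def ratio_to_target(lux: float, target: float = TARGET_LUX) -> float:
    """Return a reading as a multiple of the target illuminance."""
    return lux / target

ratios = {room: ratio_to_target(lux) for room, lux in readings.items()}

for room, ratio in ratios.items():
    print(f"{room}: {ratio:.2f}x the 300-lux target")
```

The spread runs from about 0.03x to about 5.8x the target, which is the point: the outputs are not consistently too dark or too bright, they are unrelated to the climate they were prompted for.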

Seasons. It cannot reason about seasons. Let that sit for a moment. The most fundamental variable in residential architecture, the one that determines where you put the bedroom so the morning sun wakes you gently instead of baking you awake at 5 AM in July, is invisible to these tools.

The Math That Matters

An architect charges $100 to $350 per hour, according to Angi's 2026 data. Typical residential project fees run $2,189 to $11,557. Full-service design on a $400,000 build costs $20,000 to $60,000 at the standard 5 to 15 percent of construction cost, per the National Association of Home Builders.

AI floor plan tools cost $20 to $35 per month. Read that twice.
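The gap is easier to see with the arithmetic written out. A rough Python sketch using the figures above; the 12-month subscription period is an assumption for illustration, and the function names are hypothetical:

```python
def design_fee_range(construction_cost: int, low_pct: int = 5, high_pct: int = 15):
    """Return (low, high) full-service design fees as a percentage of construction cost."""
    return construction_cost * low_pct / 100, construction_cost * high_pct / 100

def subscription_cost(monthly_low: int, monthly_high: int, months: int):
    """Return (low, high) total subscription cost over a number of months."""
    return monthly_low * months, monthly_high * months

# $400,000 build at the 5 to 15 percent range quoted above.
fee_low, fee_high = design_fee_range(400_000)

# $20 to $35 per month; 12 months is an assumed period, not a figure from the article.
ai_low, ai_high = subscription_cost(20, 35, months=12)

print(f"Full-service design fee: ${fee_low:,.0f} to ${fee_high:,.0f}")
print(f"AI tool, 12 months:      ${ai_low:,} to ${ai_high:,}")
```

Even a full year of the most expensive subscription tier comes to a small fraction of the low end of the full-service range, which is why the comparison is tempting and why the rest of this section looks at what each price actually buys.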

A homeowner in Nashville, documented by Coohom, used an AI-generated draft as a starting point, then hired an architect on an hourly consult-plus-permit-plans basis for a 1,500-square-foot bungalow. Total design cost: $5,600, roughly half what her neighbor paid for the same size house using full-service architecture from day one.

That Nashville case is the strongest argument for the hybrid approach, and it works; I am not pretending otherwise. For a rectangular lot, a single-story home with a standard program, in a temperate climate where solar orientation matters less than in Jakarta or Tromsø, AI-generated schematics can save $1,000 to $4,000 during the exploration phase. You figure out what you want faster. You bring fewer questions to the architect who stamps the plans. Fewer change orders, fewer billable hours spent on "actually, can you move the master to the other side?"

Where the Savings Disappear

But the Cambridge study exposes the cost of that efficiency. A separate MDPI study evaluating AI floor plan metrics found that "none of the existing metrics are effective in evaluating generated residential floor plans." The standard image-quality metrics that AI researchers use, FID and SSIM and PSNR, "perform poorly in capturing the structural and spatial characteristics unique to residential floor plans." There is no automated way to measure whether an AI-generated layout is actually a good place to live. Researchers at the Australian Institute of Building Sciences trained a convolutional neural network to classify layout "rationality," meaning basic spatial adjacency and circulation logic, and achieved only 82.28 percent accuracy. The classifier itself misjudged roughly one layout in six, so a plan that passes the automated check can still have broken circulation or irrational room placement.

MDPI researchers working under the CorbuAI framework identified what they call "functional-aesthetic decoupling." AI tools are good at either spatial logic or visual quality, but not both simultaneously. The result is "architectural hallucinations," generated forms that look plausible in a rendering but have no functional basis. A staircase that leads to a wall. A bathroom accessible only through a bedroom closet. A kitchen island that blocks the only path to the back door.

When the floor plan goes wrong in ways that daylight simulation or a neural network can detect, you catch it before construction. When it goes wrong in the ways that only living in the space reveals, you catch it the first morning your bedroom is 85 degrees because the AI placed your window wall facing west in Tucson, or the first dinner party where your guests have to walk through the laundry room to reach the bathroom.

What the AI Cannot See

I have been designing residential spaces for a long time. The things that make a house feel right, what clients describe as "I don't know why, but this house just works," are almost never in the 70 percent an AI can optimize. They are in the geometry of how morning light enters a breakfast nook. How a kitchen sink relates to the view outside the window above it. How a hallway narrows slightly before opening into a living room, compressing your visual field so the room feels larger than its square footage. A gradient from public to private, from loud to quiet, from shared to solitary, that turns a collection of rooms into a home.

An AI floor plan generator knows that a living room should be adjacent to a kitchen. It does not know that the transition between them determines whether the space feels generous or cramped, whether cooking is an act of isolation or participation, whether the person at the stove can see the child doing homework at the dining table or is staring at a blank wall. These are design decisions that emerge from understanding human behavior in space, not from optimizing room adjacency graphs.

A question from the Cambridge researchers deserves repeating: "Can machines think like architects, or do they merely draw like them?"

Right now, they draw. Fast, and cheap, and competent at the minimum spatial logic of a box connected to another box connected to another box, which is a useful thing to be good at when you are brainstorming at nine in the evening and not yet ready to pay an architect's hourly rate. What they do not draw is the relationship between a house and the sun that lights it, the climate that shapes it, and the people who inhabit it.

Actionable Recommendations

If you are exploring what you want in a home: AI floor plan generators are excellent brainstorming tools. Use Maket ($20-$35/month) or similar platforms to generate dozens of options, test room sizes, and clarify your priorities before your first architect meeting. This is where the Nashville homeowner saved money, and the approach works.

If you are building in a climate with strong seasonal variation: Do not trust AI-generated window placement, room orientation, or passive solar strategy. Cambridge data shows consistent failure in climate-responsive design across every tool tested. Bring an architect or energy consultant into the project before the schematic phase is finalized, not after.

If you are an architect evaluating whether to use AI tools: They accelerate the divergent-thinking phase of design, generating variations you might not have explored. They do not replace the convergent-thinking phase where you evaluate those variations against site, climate, code, and the specific way a family lives. Use them as sketch generators, not as design partners.

If you are comparing costs: AI floor plans at $20-$35/month versus architect schematics at $1,000-$5,000 is not a fair comparison. AI produces room arrangements; an architect produces spatial design. The Nashville model, AI for exploration and then an architect for engineering and code, appears to cut total design costs by roughly 40 to 50 percent on straightforward residential projects, and that is real savings. But it requires the owner to know when the AI's contribution has ended and the architect's must begin.

Limitations of This Analysis

The AI EDAM study's daylight results come from text-to-image diffusion models (ChatGPT/DALL-E, Copilot), not dedicated floor plan generators like Maket, TestFit, or Planner 5D. Purpose-built tools with constraint-based algorithms may perform better on spatial logic, though no peer-reviewed daylight study of those tools currently exists. Daylight simulations used static material reflectances and did not account for context-specific obstructions, nearby buildings, or trees. Cost comparison data comes from Angi's national averages and a single Nashville case study; regional variation in architect fees is substantial.

Note that the 82 percent accuracy figure for AI layout rationality classification (ANZAScA study) applies to the neural network evaluating AI plans, not to the AI plan generators themselves. It means our tools for measuring plan quality are themselves imperfect, which makes the problem harder, not easier.

Most importantly, this analysis treats daylight as a proxy for spatial quality. Daylight is measurable and can be simulated with validated software, but it is only one dimension of what makes a house work. Acoustics, thermal comfort, privacy, accessibility, and the ineffable qualities of spatial experience are harder to quantify, and AI's performance on those dimensions remains largely untested.
