Education

exploratory

What school spaces do to children

Where the evidence on classroom air, acoustics, light and green is robust, where it is thin, and what to measure before the build.

By Christian Huser, in The Built Review · 12 May 2026 · 13 min read · 19 named sources

Evidence status as of 12 May 2026 · Version 1

Situation

School spaces work on children through several channels, and each one can be measured on its own: air quality, acoustics, light and colour, the connection to green. The effects are documented in field studies and in controlled work. What rarely happens is the measurement of any of these channels before a district decides to build or renovate.

Boston Public Schools is the exception that shows what is otherwise normal. Starting in 2021 the district installed more than 4,000 air sensors across over 100 schools. The program was paid for with about 7 million dollars in federal COVID-relief money and close to 200,000 dollars from a city clean-air grant, a sum that covers the sensors, the installation, the public dashboard and the analysis. Each sensor reads six quantities: temperature, humidity, CO2, carbon monoxide and two particulate fractions, PM2.5 and PM10. The readings feed a public dashboard that parents can watch in real time. Patricia Fabian and her group at the Boston University School of Public Health analyze the series and are building models that link air quality to student outcomes. The sensors have surfaced hidden HVAC faults before the equipment failed. Denver and Montgomery County are following. This is a measurement apparatus that runs ahead of the decision, not an audit that arrives after it.

The research base behind the four channels can be put in numbers.

For air, Wargocki published a quantitative synthesis of prior studies in 2020 (Wargocki et al., 2020, Building and Environment 173:106749). Aggregating the regressions, his model implies that lowering classroom CO2 from 2,300 to 900 ppm would raise performance on learning-progress tests by about 5 percent. This is a modeled association across the pooled studies, not a single measured effect. Haverinen-Shaughnessy and Shaughnessy measured 70 schools and 140 fifth-grade classrooms with 3,109 pupils in 2015 to see how ventilation rate and room temperature relate to standardized test scores (Haverinen-Shaughnessy and Shaughnessy, 2015, PLOS ONE 10(8):e0136165). Each additional litre per second per person of fresh air was associated with roughly 7 mathematics points, at a 95 percent confidence interval of 1 to 12, and with 11 points once the outliers were removed, at an interval of 2 to 20. The average ventilation rate they observed was 3.6 litres per second per person, well under the standard of 7.1.
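The size of the ventilation gap can be put next to the reported coefficients with a few lines of arithmetic. The sketch below is a naive linear extrapolation, an assumption that the study itself does not make, and the coefficients are associations rather than causal effects. The function name `implied_points` is hypothetical.

```python
# Naive linear extrapolation of the reported association; illustrative only.
OBSERVED_LPS = 3.6  # mean ventilation rate observed, L/s per person
STANDARD_LPS = 7.1  # ventilation standard cited in the text, L/s per person

# Mathematics points per additional L/s per person: (estimate, CI low, CI high)
FULL_SAMPLE = (7.0, 1.0, 12.0)
WITHOUT_OUTLIERS = (11.0, 2.0, 20.0)


def implied_points(coef, observed=OBSERVED_LPS, target=STANDARD_LPS):
    """Scale a per-unit coefficient and its interval by the ventilation gap.

    ASSUMPTION: the association stays linear across the whole gap.
    """
    gap = target - observed
    return tuple(round(c * gap, 1) for c in coef)


if __name__ == "__main__":
    print("full sample:     ", implied_points(FULL_SAMPLE))
    print("outliers removed:", implied_points(WITHOUT_OUTLIERS))
```

On the full-sample coefficient, closing the gap of 3.5 litres per second per person would correspond to about 24.5 points, with an interval from 3.5 to 42. The width of that interval is the point: the direction is clear, the magnitude is not.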

For light and for sound the school-specific evidence predates the current sensor wave. The Heschong Mahone Group analyzed more than 21,000 student records across some 2,000 classrooms in three districts in 1999 (Heschong, 1999, Daylighting in Schools, commissioned by Pacific Gas and Electric). In the Capistrano district in California, the one district where the data allowed a learning-rate analysis, students in the most daylit classrooms progressed about 20 percent faster in mathematics and 26 percent faster in reading over a year than those in the least daylit rooms. Shield and Dockrell surveyed the noise outside 142 London primary schools and inside 16 of them in 2008 and matched it to national test scores (Shield and Dockrell, 2008, Journal of the Acoustical Society of America 123(1):133-144). External noise was negatively associated with attainment at age 11, with correlations around minus 0.4, and the association held after they controlled for free-school-meal share, English as an additional language and special-educational-need share. The background level inside the occupied classroom mattered more still, correlating around minus 0.7 with English scores.

For the learning effect of overall physical design, Barrett and colleagues reported the HEAD study in 2015 (Barrett, Davies, Zhang and Barrett, 2015, Building and Environment 89:118-133). It covered 3,766 pupils in 153 classrooms across 27 schools, with a multilevel model that separates the physical room features from the pupil features. The finding: physical classroom design explains about 16 percent of the variation in learning progress over a school year. It is the most cited single number on the K-12 design effect.

For executive function, Morgan analyzed a longitudinal sample of 11,010 children in 2019 (Morgan et al., 2019, Early Childhood Research Quarterly 46:20-32). Kindergartners with working-memory deficits ran roughly three to five times the odds of repeated academic difficulty across elementary school, with the largest effect in mathematics. The figure holds for working memory specifically, not for executive function and school success in general.

For green, Dadvand tested about 2,600 children aged seven to ten in Barcelona four times over twelve months in 2015 (Dadvand et al., 2015, PNAS 112(26):7937-7942). More greenness at and around the school was associated with stronger development of working memory and with less inattentiveness, and part of that association ran through lower exposure to air pollution.

For the sensory side, Park built a freestanding Sensory Well-Being Hub in a Chicago high school in 2020 and tested it with about 60 adolescents who have developmental disabilities (Park et al., 2020, Journal of Interior Design 45(1):13-32). Self-reported well-being rose from 2.96 to 3.42 on the scale during the hub visit and carried back into the classroom. Nair ran a questionnaire of 87 parents and caretakers of autistic children in 2022 on how light and colour in the built environment affect the children’s behavior (Nair et al., 2022, Frontiers in Psychiatry 13:1042641). The data are parent perception, not direct measurement of the children.

Then the reference values. The World Health Organization has recommended a 24-hour mean for PM2.5 below 15 µg/m³ since 2021. ANSI S12.60 sets classroom reverberation time below 0.6 seconds and background noise below about 35 dBA in the unoccupied room. CO2 near 1,000 ppm is a widely used proxy for adequate fresh air. It is a rule of thumb rather than a limit in ASHRAE 62.1, which prescribes ventilation rates per person and not a CO2 ceiling.
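Why a CO2 reading can stand in for a per-person ventilation rate follows from a steady-state mass balance: indoor CO2 settles at the outdoor level plus the CO2 generated per person divided by the fresh air supplied per person. The sketch below works that out. The generation rate and the outdoor concentration are assumptions chosen for illustration (a typical figure for a sedentary adult; children emit less), and the function names are hypothetical.

```python
# Steady-state CO2 from a per-person mass balance; illustrative assumptions.
CO2_GENERATION_LPS = 0.0052  # ASSUMPTION: CO2 exhaled per person, L/s (sedentary adult)
OUTDOOR_PPM = 420.0          # ASSUMPTION: outdoor CO2 concentration, ppm
PPM = 1_000_000              # parts per million scaling factor


def steady_state_co2_ppm(vent_lps_per_person,
                         generation_lps=CO2_GENERATION_LPS,
                         outdoor_ppm=OUTDOOR_PPM):
    """Indoor CO2 (ppm) once generation and dilution are in balance."""
    if vent_lps_per_person <= 0:
        raise ValueError("ventilation rate must be positive")
    return outdoor_ppm + generation_lps / vent_lps_per_person * PPM


def implied_ventilation_lps(co2_ppm,
                            generation_lps=CO2_GENERATION_LPS,
                            outdoor_ppm=OUTDOOR_PPM):
    """Fresh air per person (L/s) implied by a steady-state CO2 reading."""
    excess = co2_ppm - outdoor_ppm
    if excess <= 0:
        raise ValueError("indoor CO2 must exceed the outdoor level")
    return generation_lps / excess * PPM


if __name__ == "__main__":
    for rate in (3.6, 7.1):
        print(rate, "L/s per person ->", round(steady_state_co2_ppm(rate)), "ppm")
    print("1,000 ppm ->", round(implied_ventilation_lps(1000.0), 1), "L/s per person")
```

The result moves with the assumed generation rate and outdoor level, which is one way to see why 1,000 ppm works as a proxy and not as a limit.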

The channels are measurable and the research is broad. One district acts on that before it builds. The rest decide first and measure later, if at all.

Finding

The tension runs not between research and practice but between two readings of the same research.

The strong reading says school spaces lift learning in a measurable way, and Barrett’s 16 percent is the proof. Trade publications such as EDspaces turn this into design recommendations for acoustics, sensory relief, wall treatment and outdoor space. In this reading good design is a learning lever, and the conclusion is to design better.

Against that reading stands a named objection that predates Barrett’s study and applies squarely to it anyway. The National Research Council had an expert committee review the evidence in 2007 in Green Schools: Attributes for Health and Learning. Its Finding 8 states that the methodologies used to correlate the overall condition of a building with student achievement are not adequate to determine whether a relationship exists at all. The reason is confounding by socioeconomic status. Children are not assigned to schools at random, and poorer and minority children sit disproportionately in the older and worse buildings. What looks like a building effect is, to an unknown degree, a poverty effect.

Barrett’s study is real and its numbers hold. It is also cross-sectional, single-year and correlational, and its effect rests on the multilevel model. No named formal rebuttal of the 16 percent figure exists in the literature, and I will not invent one. The NRC critique hits exactly this class of study though. Anyone who derives a building program from 16 percent of variance is betting on an effect whose causal share is not secured.

My position is that the robust evidence does not sit with holistic design but with the narrow physical channels. Ventilation and CO2 are documented across more than one sampling logic. Wargocki’s synthesis of aggregated studies and Haverinen-Shaughnessy’s cross-section across 70 schools point the same way. Acoustics carries data of its own. Shield and Dockrell measured noise and attainment in London and found the negative association survived a control for free-school-meal share, language and special-educational-need, which is precisely the confounder the NRC committee warned about. Daylight is a separate lever with school-specific evidence behind it. These channels can be measured before anyone builds, and they do not hang on a single model.

The confounding objection cuts the other way here. Where Barrett rests on one cross-section, the physical-channel findings repeat across several sampling logics. Ventilation shows up in a pooled synthesis, a multi-school cross-section and a prospective sensor study, with a dose-dependent physiological mechanism behind it. The noise effect in Shield and Dockrell held after the socioeconomic control that sinks the building-condition studies. A single correlation has neither the convergence nor the mechanism.

A qualifier belongs here, and it comes from inside the physical-channel research itself. Mendell measured about 150 classrooms across 28 California schools prospectively in 2016, with daily CO2 sensors rather than a one-time reading (Mendell et al., 2016, Indoor Air 26(4):546-557). For English he found 0.6 extra test points per 10 percent higher ventilation rate at p=0.01, and a similar but non-significant value for mathematics. His own conclusion was that most models showed small positive associations but few confidence intervals excluded the null. Even the robust channel delivers modest and sometimes non-significant effects once the measurement is strictly prospective.

Daylight shows the same shape from the other side. The Heschong Mahone correlation is real and survived several controls, including a check on whether more experienced teachers had been assigned to the brighter rooms. The group could not cleanly replicate its own 1999 result in a 2002 follow-up though, where the story shifted from the quantity of daylight toward the quality of the view out of the window (Heschong, Wright and Okura, 2002, Leukos 1(3)). Peter Boyce at the Lighting Research Center warned against treating the daylight and achievement correlation as a settled causal relationship (Boyce, 2005, Leukos). The direction is robust. The 20 to 26 percent headline is not a causal constant that a planner can bank on.

The weakest number is also the one that sells best, for a reason that has nothing to do with the evidence. A renovation sold on sixteen percent better learning gives a buyer something visible to point to, a design they can see and stand behind. Better ventilation buys a few test points and a duct nobody notices. So the market leans hardest on the claim with the least behind it, exactly the one a buyer should weigh most carefully.

A small, channel-dependent effect is the reason to measure, not the reason to stop. It cannot be replaced by a design promise, only secured by data at the specific site. The effect is measurable, but before a renovation it is rarely measured. Boston shows the gap can be closed operationally, though the outcome models that would prove the payoff are still being built. Almost every other district does not even start.

Research context

The effect evidence is present, but the durable field data is thin. Palacios Temprano published a protocol in 2020 for a large-scale sensor deployment across 280 classrooms over five years (Palacios Temprano et al., 2020, BMJ Open 10(3):e031233). By his count, more than 90 percent of prior studies measured for under ten days, and none for more than thirty. That is exactly the gap where Boston and Palacios Temprano go to work, one operationally and one as a study.

The channels are unevenly supported. Air is the most robust, carried by three different sampling logics: Wargocki’s synthesis, Haverinen-Shaughnessy’s cross-section and Mendell’s prospective design. Acoustics is solid as well, and unlike the building-condition studies the NRC faulted, the noise effect in Shield and Dockrell held up after socioeconomic controls. Daylight is robust in direction but contested in cause, which is why it belongs among the levers to measure rather than among the promises to sell. Green has one strong and large study in Dadvand, but the greenness measure is satellite-based and the mechanism runs partly through reduced air pollution rather than through time spent in the green itself. The sensory side is smaller and partly self-reported. Park’s hub measured about 60 adolescents on self-reported well-being, and Nair’s result comes from a parent survey. Carrying findings from autistic children over to the general classroom is inference, not a measured result.

The weakest bridge is the one between executive function and space. Morgan shows that early working-memory deficits predict later school difficulty. That this function can be moved through the design of the room does not follow from it. Executive-function research is mostly pedagogical, not spatial. Anyone who justifies wall coverage with executive function is reasoning from a real finding to a lever that no study measures directly.

What is missing can be named cleanly. The first gap is causal, longitudinal and space-specific designs in place of cross-sectional correlations. The second is calibrated thresholds for wall coverage and green share, which are practitioner defaults so far. The third is K-12-specific outdoor studies, since Dadvand measures the green in the school and home surroundings rather than the effect of a specific schoolyard redesign. The evidence also sits across disciplines that rarely share a table. Building science and indoor-air epidemiology carry the air channel, acoustics and environmental psychology carry the noise channel, and developmental and educational research carry the executive-function work. No single field sees the whole room, and the planning decision depends on putting them together. These gaps are the reason the report weights the holistic levers below the physical ones.

Consequences

The decision framework follows from the state of the evidence, not from a design preference. The basic principle is to measure the robustly supported channels before the building decision, because the measurement is cheap and the decision is expensive and binding for decades. The order of confidence is not arbitrary. Air and acoustics come first, daylight second, and the holistic design and green promises last, with their uncertainty named.

For the facility manager the most immediate lever is the air. A continuous measurement on the Boston pattern makes hidden HVAC faults visible and gives the ventilation decision data instead of assumption. The thresholds already exist: the WHO guideline value for PM2.5 and the common proxy of CO2 near 1,000 ppm for adequate fresh air. The framework is not a recipe but a decision rule. If a threshold is exceeded, the ventilation plan belongs on the table before the renovation budget is released. If both are met, the renovation argument itself needs examining, rather than spending money against a problem the measurement does not show. The link back to learning is the cleanest in the whole report. Wargocki’s synthesis ties a CO2 level to an expected performance change, so a sensor on the wall gives a rough indication of the room’s effect on a test, not a guarantee.
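The decision rule for the air channel is simple enough to write down. The following is a minimal sketch, assuming a per-room summary has already been computed from the sensor series. The `RoomSummary` fields, the choice of summary statistic for CO2 and the function name `air_decision` are hypothetical; the two thresholds are the reference values named in this report.

```python
from dataclasses import dataclass

PM25_24H_GUIDELINE = 15.0  # µg/m³, WHO 2021 24-hour guideline value
CO2_PROXY_PPM = 1000.0     # ppm, rule-of-thumb proxy, not an ASHRAE limit


@dataclass
class RoomSummary:
    """Hypothetical per-room summary derived from continuous sensor data."""
    room: str
    pm25_24h_mean: float     # µg/m³
    co2_occupied_ppm: float  # ASSUMPTION: a typical level during occupied hours


def air_decision(summary: RoomSummary) -> str:
    """Apply the rule: exceedance puts the ventilation plan first."""
    exceeded = []
    if summary.pm25_24h_mean > PM25_24H_GUIDELINE:
        exceeded.append("PM2.5")
    if summary.co2_occupied_ppm > CO2_PROXY_PPM:
        exceeded.append("CO2")
    if exceeded:
        return "ventilation plan first: " + ", ".join(exceeded) + " above reference"
    return "thresholds met: re-examine the renovation argument"


if __name__ == "__main__":
    # Hypothetical readings, not measured data.
    for s in (RoomSummary("Room 12", 9.0, 1600.0), RoomSummary("Room 14", 8.0, 850.0)):
        print(s.room, "->", air_decision(s))
```

The rule returns a next step, not a verdict on the building. Which statistic of the CO2 series to compare against the proxy is a choice the district has to make and document.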

For the architect the lever sits with acoustics and material. ANSI S12.60 gives the target values, reverberation under 0.6 seconds and background noise under 35 dBA for the unoccupied room. Shield and Dockrell’s London data show why the occupied room matters even more. The noise level with children in it tracked test scores more closely than the traffic outside, and it did so after socioeconomic controls. Measuring before the material choice tells you whether hard surfaces like exposed concrete are tolerable or whether acoustic ceiling baffles and absorbent surfaces need priority. Daylight is the second lever and belongs in early planning, not in the fit-out at the end, with the caveat that its measured effect is more modest than the headline.
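Before a measurement is available, the material question can be roughed out with Sabine's formula, which estimates reverberation time as 0.161 times the room volume divided by the total absorption (surface area times absorption coefficient, summed over surfaces, in metric units). The sketch below compares an all-hard room with the same room under an absorbent ceiling. The room dimensions and both absorption coefficients are hypothetical, illustrative values, not product data, and the estimate ignores furniture and occupants.

```python
SABINE_CONSTANT = 0.161  # s/m, metric form of Sabine's formula
RT_TARGET_S = 0.6        # reverberation target cited from ANSI S12.60


def sabine_rt60(volume_m3, surfaces):
    """Estimate reverberation time in seconds.

    surfaces: iterable of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * alpha for area, alpha in surfaces)
    if absorption <= 0:
        raise ValueError("total absorption must be positive")
    return SABINE_CONSTANT * volume_m3 / absorption


# HYPOTHETICAL classroom: 9 m x 7 m floor plan, 3 m ceiling height.
LENGTH, WIDTH, HEIGHT = 9.0, 7.0, 3.0
VOLUME = LENGTH * WIDTH * HEIGHT
FLOOR = CEILING = LENGTH * WIDTH
WALLS = 2 * HEIGHT * (LENGTH + WIDTH)

ALPHA_HARD = 0.03      # ASSUMPTION: hard, reflective surface such as exposed concrete
ALPHA_ABSORBER = 0.70  # ASSUMPTION: absorbent acoustic ceiling

hard_room = [(FLOOR, ALPHA_HARD), (CEILING, ALPHA_HARD), (WALLS, ALPHA_HARD)]
treated_room = [(FLOOR, ALPHA_HARD), (CEILING, ALPHA_ABSORBER), (WALLS, ALPHA_HARD)]

if __name__ == "__main__":
    for label, room in (("all hard", hard_room), ("absorbent ceiling", treated_room)):
        rt = sabine_rt60(VOLUME, room)
        verdict = "meets" if rt < RT_TARGET_S else "misses"
        print(f"{label}: {rt:.2f} s, {verdict} the {RT_TARGET_S} s target")
```

Under these assumed coefficients the all-hard room comes out at several seconds, and the absorbent ceiling alone brings it close to the target without clearly passing it. That is the kind of margin a measured reverberation time should settle, not an estimate.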

For the K-12 planner the most important consequence is a stop-doing rule. The holistic levers, such as keeping 20 to 50 percent of the wall area clear or setting a minimum green share in the schoolyard, are practitioner defaults rather than calibrated thresholds. Barrett supports the order of magnitude, that a moderate level of complexity beats both the cluttered wall and the bare one, but not the exact value. Defaults like these are usable as orientation as long as they are treated as what they are. The mistake would be to shift budget toward a 16 percent promise while neglecting the channels whose evidence holds. The design budget belongs where the measurement points, not on a feature that a case study made look like settled science.

The consequence is measurement before the decision, on the channels where the research carries, rather than more design. The cost is asymmetric: the sensor hardware runs in the low hundreds of dollars per room, and even Boston’s full program with installation, dashboard and multi-year analysis came to about 7 million dollars against renovation budgets in the millions per school. The district that measures first spends less and decides better than the one that designs on a brochure figure. School spaces matter, and almost every district still commits the expensive decision before taking the cheap measurement that would inform it.
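The asymmetry can be checked with the Boston figures from this report. The sketch below divides the full program cost by the sensor and school counts. Because the counts are stated as "more than 4,000" and "over 100", the per-unit results are upper bounds. The renovation budget passed to `share_of_renovation` is a hypothetical figure, not a reported one.

```python
# Program cost as reported: federal relief money plus the city clean-air grant.
PROGRAM_COST_USD = 7_000_000 + 200_000
SENSORS = 4_000  # "more than 4,000", so per-sensor cost is an upper bound
SCHOOLS = 100    # "over 100", so per-school cost is an upper bound

COST_PER_SENSOR = PROGRAM_COST_USD / SENSORS
COST_PER_SCHOOL = PROGRAM_COST_USD / SCHOOLS


def share_of_renovation(renovation_budget_usd):
    """Per-school measurement cost as a fraction of a renovation budget."""
    if renovation_budget_usd <= 0:
        raise ValueError("renovation budget must be positive")
    return COST_PER_SCHOOL / renovation_budget_usd


if __name__ == "__main__":
    print(f"per sensor (all-in): at most ${COST_PER_SENSOR:,.0f}")
    print(f"per school (all-in): at most ${COST_PER_SCHOOL:,.0f}")
    # HYPOTHETICAL renovation budget of 5 million dollars for one school.
    print(f"share of a $5M renovation: {share_of_renovation(5_000_000):.1%}")
```

The all-in figure per sensor, at most 1,800 dollars, includes installation, dashboard and analysis, which is why it sits above the hardware price alone.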

Sources

  1. Wargocki, Porras-Salazar, Contreras-Espinoza and Bahnfleth, 2020, Building and Environment 173:106749. Quantitative synthesis of published studies on classroom CO2 and performance.
  2. Haverinen-Shaughnessy and Shaughnessy, 2015, PLOS ONE 10(8):e0136165. 70 schools, 140 fifth-grade classrooms, N=3,109; ventilation and temperature against test scores.
  3. Mendell, Eliseeva, Davies and Lobscheid, 2016, Indoor Air 26(4):546-557. Prospective study, ~150 classrooms in 28 California schools, daily CO2 sensors.
  4. Heschong (Heschong Mahone Group), 1999, Daylighting in Schools, commissioned by Pacific Gas and Electric. 21,000+ student records; Capistrano +20% math and +26% reading learning-rate.
  5. Heschong, Wright and Okura, 2002, Leukos 1(3). Follow-up that could not cleanly replicate the 1999 daylight effect and shifted toward view quality.
  6. Boyce, 2005, Leukos. Caution against treating the daylight and achievement correlation as a settled causal relationship.
  7. Shield and Dockrell, 2008, Journal of the Acoustical Society of America 123(1):133-144. London; external and internal classroom noise negatively associated with attainment after socioeconomic controls.
  8. Barrett, Davies, Zhang and Barrett, 2015, Building and Environment 89:118-133. HEAD project; 3,766 pupils, 153 classrooms, 27 schools; design explains ~16% of variance in learning progress.
  9. Morgan, Farkas, Wang, Hillemeier, Oh and Maczuga, 2019, Early Childhood Research Quarterly 46:20-32. N=11,010; working-memory deficits and repeated mathematics difficulty.
  10. Dadvand, Nieuwenhuijsen, Esnaola et al., 2015, PNAS 112(26):7937-7942. ~2,600 Barcelona schoolchildren; green space and cognitive development, BREATHE project.
  11. Park, Nanda, Adams, Essary and Hoelting, 2020, Journal of Interior Design 45(1):13-32. Sensory Well-Being Hub for ~60 adolescents with developmental disabilities.
  12. Nair, Priya, Rajagopal et al., 2022, Frontiers in Psychiatry 13:1042641. Questionnaire of 87 parents and caretakers of autistic children on light and colour.
  13. Palacios Temprano, Eichholtz, Willeboordse and Kok, 2020, BMJ Open 10(3):e031233. Protocol for sensor deployment across 280 classrooms over five years.
  14. National Research Council, 2007, Green Schools: Attributes for Health and Learning, National Academies Press. Committee review; Finding 8 on the inadequacy of building-condition methodologies.
  15. Boston Public Schools indoor air-quality sensor program, 2021 onward. 4,000+ sensors across 100+ schools; analysis by Boston University School of Public Health (Patricia Fabian).
  16. EDspaces, trade-practitioner design recommendations for K-12 sensory, executive-function and outdoor spaces.
  17. World Health Organization, 2021 global air quality guidelines. PM2.5 24-hour mean below 15 µg/m³.
  18. ANSI S12.60. Classroom reverberation time below 0.6 s, background noise below NC-30 (~35 dBA) in the unoccupied room.
  19. ASHRAE Standard 62.1. Per-person ventilation rates; CO2 near 1,000 ppm is a common rule-of-thumb proxy, not an ASHRAE limit.