Education

exploratory

Which classroom conditions affect learning

Where the evidence on classroom air, acoustics, light and green is strong, where it is thin, and what to measure before the build.

By Christian Huser, in The Built Review · 12 May 2026 · 13 min read · 19 named sources

Evidence status as of 12 May 2026 · Version 1

Situation

School spaces work on children through four channels that can each be measured on their own: air quality, acoustics, light and colour, and the connection to green space. Field studies and controlled work document the effects. What rarely happens is the measurement of any channel before a district decides to build or renovate.

Boston Public Schools is the exception that shows what is otherwise normal. Starting in 2021 the district installed more than 4,000 air sensors across over 100 schools, paid for with about 7 million dollars in federal COVID-relief money and close to 200,000 dollars from a city clean-air grant. That sum covers the sensors, the installation, the public dashboard and the analysis. Each sensor reads six quantities: temperature, humidity, CO2, carbon monoxide and two particulate fractions, PM2.5 and PM10. The readings feed a public dashboard that parents can watch in real time. Patricia Fabian and her group at the Boston University School of Public Health analyse the series and are building models that link air quality to student outcomes. The sensors have surfaced hidden HVAC faults before the systems failed. Denver and Montgomery County are following. In Boston the readings arrive before the building decision, early enough to change what gets built.

The research base behind the four channels can be put in numbers.

For air, Wargocki published a quantitative synthesis of prior studies in 2020 (Wargocki et al., 2020, Building and Environment 173:106749). Aggregating the regressions, his model implies that lowering classroom CO2 from 2,300 to 900 ppm would raise performance on learning-progress tests by about 5 per cent. This is a modelled association across the pooled studies, not a single measured effect. Haverinen-Shaughnessy and Shaughnessy measured 70 schools and 140 fifth-grade classrooms with 3,109 pupils in 2015 to see how ventilation rate and room temperature relate to standardised test scores (Haverinen-Shaughnessy and Shaughnessy, 2015, PLOS ONE 10(8):e0136165). Each additional litre per second per person of fresh air was associated with roughly 7 mathematics points, at a 95 per cent confidence interval of 1 to 12, and with 11 points once the outliers were removed, at an interval of 2 to 20. The average ventilation rate they observed was 3.6 litres per second per person, well under the standard of 7.1.
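The ventilation gap in the Haverinen-Shaughnessy figures can be put into rough arithmetic. The sketch below multiplies the shortfall between the observed mean of 3.6 litres per second per person and the standard of 7.1 by the reported per-unit association and its interval bounds. It assumes the association stays linear across that range and treats the interval endpoints as simple multipliers, neither of which the study guarantees. The function name is illustrative.

```python
# Illustrative arithmetic only: a linear extrapolation of a reported
# association, not a predicted causal gain.

OBSERVED_LPS = 3.6   # mean ventilation rate observed, L/s per person
STANDARD_LPS = 7.1   # ventilation standard cited in the text, L/s per person


def implied_points(points_per_lps: float) -> float:
    """Scale a per-(L/s per person) association across the ventilation gap."""
    gap = STANDARD_LPS - OBSERVED_LPS
    return round(gap * points_per_lps, 1)


# Central estimate and 95 per cent interval bounds for mathematics scores.
central = implied_points(7)
low, high = implied_points(1), implied_points(12)

print(f"gap to standard: {round(STANDARD_LPS - OBSERVED_LPS, 1)} L/s per person")
print(f"implied association: {central} points (range {low} to {high})")
```

On these assumptions the central figure comes to about 24.5 points, with a range from 3.5 to 42. The width of that range is the point: the direction is clear, the size is not.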

For light and for sound the school-specific evidence predates the current sensor wave. The Heschong Mahone Group analysed more than 21,000 student records across some 2,000 classrooms in three districts in 1999 (Heschong, 1999, Daylighting in Schools, commissioned by Pacific Gas and Electric). In the Capistrano district in California, the one district where the data allowed a learning-rate analysis, students in the most daylit classrooms progressed about 20 per cent faster in mathematics and 26 per cent faster in reading over a year than those in the least daylit rooms. Shield and Dockrell surveyed the noise outside 142 London primary schools and inside 16 of them in 2008 and matched it to national test scores (Shield and Dockrell, 2008, Journal of the Acoustical Society of America 123(1):133-144). External noise was negatively associated with attainment at age 11, with correlations around minus 0.4, and the association held after they controlled for free-school-meal share, English as an additional language and special-educational-need share. The background level inside the occupied classroom mattered more still, correlating around minus 0.7 with English scores.

For the learning effect of overall physical design, Barrett and colleagues reported the HEAD study in 2015 (Barrett, Davies, Zhang and Barrett, 2015, Building and Environment 89:118-133). It covered 3,766 pupils in 153 classrooms across 27 schools, with a multilevel model that separates the physical room features from the pupil features. The finding: physical classroom design explains about 16 per cent of the variation in learning progress over a school year. It is the most cited single number on the K-12 design effect.

For executive function, Morgan analysed a longitudinal sample of 11,010 children in 2019 (Morgan et al., 2019, Early Childhood Research Quarterly 46:20-32). Kindergartners with working-memory deficits ran roughly three to five times the odds of repeated academic difficulty across elementary school, with the largest effect in mathematics. The figure holds for working memory specifically. It does not extend to executive function and school success in general.

For green, Dadvand tested about 2,600 children aged seven to ten in Barcelona four times over twelve months in 2015 (Dadvand et al., 2015, PNAS 112(26):7937-7942). More greenness at and around the school was associated with stronger development of working memory and with less inattentiveness, and part of that association ran through lower exposure to air pollution.

For the sensory side, Park built a freestanding Sensory Well-Being Hub in a Chicago high school in 2020 and tested it with about 60 adolescents who have developmental disabilities (Park et al., 2020, Journal of Interior Design 45(1):13-32). Self-reported well-being rose from 2.96 to 3.42 on the scale during the hub visit and carried back into the classroom. Nair ran a questionnaire of 87 parents and caretakers of autistic children in 2022 on how light and colour in the built environment affect the children’s behaviour (Nair et al., 2022, Frontiers in Psychiatry 13:1042641). The data are parent perception. They are not a direct measurement of the children.

Then the reference values. The World Health Organization has recommended a 24-hour mean for PM2.5 below 15 µg/m³ since 2021. ANSI S12.60 sets classroom reverberation time below 0.6 seconds and background noise below about 35 dB-A in the unoccupied room. CO2 near 1,000 ppm is a widely used proxy for adequate fresh air. It is a rule of thumb rather than a limit in ASHRAE 62.1, which prescribes ventilation rates per person and not a CO2 ceiling.

The channels are measurable and the research is broad. Boston acts on that before it builds. In most districts the building decision comes first and the measurement later, if it comes at all.

Finding

Two readings draw on the same research and disagree over how far it reaches.

The strong reading says school spaces lift learning in a measurable way, and Barrett’s 16 per cent is the proof. Trade publications such as EDspaces turn this into design recommendations for acoustics, sensory relief, wall treatment and outdoor space. In this reading, good design raises learning directly, and the conclusion is to design better.

Against that reading stands a named objection that is older than Barrett and aimed squarely at him anyway. The National Research Council had an expert committee review the evidence in 2007 in Green Schools: Attributes for Health and Learning. Its Finding 8 states that the methodologies used to correlate the overall condition of a building with student achievement are not adequate to determine whether a relationship exists at all. The reason is confounding by socioeconomic status. Children are not assigned to schools at random, and poorer and minority children sit disproportionately in the older and worse buildings. What looks like a building effect is, to an unknown degree, a poverty effect.

Barrett’s study is real, and its 16 per cent figure is published. It is also cross-sectional, single-year and correlational, and the effect rests on a single multilevel model. I know of no named formal rebuttal of the 16 per cent figure in the literature. The NRC critique hits exactly this class of study, though. Anyone who derives a building programme from 16 per cent of variance is betting on an effect whose causal share is not secured.

My position is that the evidence is strongest for the narrow physical channels, air, acoustics and light. It is weaker for whole-room design. Ventilation and CO2 are documented across more than one sampling logic. Wargocki’s synthesis of aggregated studies and Haverinen-Shaughnessy’s cross-section across 70 schools point the same way. Acoustics has its own data. Shield and Dockrell measured noise and attainment in London and found the negative association survived a control for free-school-meal share, language and special-educational-need, which is precisely the confounder the NRC committee warned about. Daylight brings school-specific evidence too. These channels can be measured before anyone builds, and they do not hang on a single model.

The confounding objection cuts the other way here. Barrett draws on one cross-section. The air findings repeat across several sampling logics. Ventilation shows up in a pooled synthesis, a multi-school cross-section and a prospective sensor study, with a dose-dependent physiological mechanism behind it. The noise effect in Shield and Dockrell survived the socioeconomic control that sinks the building-condition studies. A single correlation has neither the convergence nor the mechanism.

A qualifier belongs here, and it comes from inside the physical-channel research itself. Mendell measured about 150 classrooms across 28 California schools prospectively in 2016, with daily CO2 sensors rather than a one-time reading (Mendell et al., 2016, Indoor Air 26(4):546-557). For English he found 0.6 extra test points per 10 per cent higher ventilation rate at p=0.01, and a similar but non-significant value for mathematics. His own conclusion was that most models showed small positive associations but few confidence intervals excluded the null. Even the air channel, the one with the most behind it, delivers modest and sometimes non-significant effects once the measurement is strictly prospective.

Daylight shows the same shape from the other side. The Heschong Mahone correlation is real and survived several controls, including a check on whether more experienced teachers had been assigned to the brighter rooms. The group could not cleanly replicate its own 1999 result in a 2002 follow-up though, where the story shifted from the quantity of daylight towards the quality of the view out of the window (Heschong, Wright and Okura, 2002, Leukos 1(3)). Peter Boyce at the Lighting Research Center warned against treating the daylight and achievement correlation as a settled causal relationship (Boyce, 2005, Leukos). The correlation between daylight and achievement stays positive. The 20 to 26 per cent headline is not a causal constant that a planner can bank on.

The weakest number is also the one that sells best, for a reason that has nothing to do with the evidence. A renovation sold on 16 per cent better learning gives a buyer something visible to point to, a design they can see and stand behind. Better ventilation buys a few test points and a duct nobody notices. So renovation vendors lean hardest on the claim with the least behind it, exactly the one a buyer should weigh most carefully.

A small, channel-dependent effect is still worth measuring; it is not a reason to walk away from the channel. A design promise cannot substitute for that reading. Only data at the specific site secures the effect. The effect is measurable, yet before a renovation it is rarely measured. Boston shows the gap can be closed in operation, though the outcome models that would confirm the effect on learning are still being built. Most other districts do not measure anything before they build.

Research context

The effect evidence is present, but the durable field data is thin. Palacios Temprano published a protocol in 2020 for a large-scale sensor deployment across 280 classrooms over five years (Palacios Temprano et al., 2020, BMJ Open 10(3):e031233). By his count, more than 90 per cent of prior studies measured for under ten days, and none for more than thirty. That short-window record is the gap Boston and Palacios Temprano set out to close, Boston operationally and Palacios Temprano as a study.

The channels are unevenly supported. Air stands on the firmest evidence. Three different sampling logics agree: Wargocki’s synthesis, Haverinen-Shaughnessy’s cross-section and Mendell’s prospective design. Acoustics depends on one cross-section, Shield and Dockrell, whose noise effect survived the socioeconomic control that sinks the building-condition studies the NRC faulted; the convergence across sampling logics that air has is still missing. Daylight points the same direction but stays contested in cause, which is why it belongs among the channels to measure rather than the promises to sell. Green has one strong and large study in Dadvand, but the greenness measure is satellite-based and the mechanism operates partly through reduced air pollution rather than through time spent in the green itself. The sensory side is smaller and partly self-reported. Park’s hub measured about 60 adolescents on self-reported well-being, and Nair’s result comes from a parent survey. Extending findings from autistic children to the general classroom is inference. It is not a measured result.

The weakest bridge is the one between executive function and space. Morgan shows that early working-memory deficits predict later school difficulty. That this function can be moved through the design of the room does not follow from it. Executive-function research is mostly pedagogical, not spatial. Anyone who justifies wall coverage with executive function is reasoning from a real finding to a design choice that no study has tied to it.

The missing pieces are specific: causal, longitudinal and space-specific designs in place of cross-sectional correlations; calibrated thresholds for wall coverage and green share, which are practitioner defaults so far; and K-12-specific outdoor studies, since Dadvand measures the green in the school and home surroundings rather than the effect of a specific schoolyard redesign. The evidence also sits across disciplines that rarely share a table. Building science and indoor-air epidemiology cover the air channel, acoustics and environmental psychology cover the noise channel, and developmental and educational research cover the executive-function work. No single field sees the whole room, and the planning decision depends on putting them together. These gaps are the reason the report weights the whole-room design below the physical channels.

Implications

The decision framework follows from the strength of the evidence, channel by channel. The basic principle is to measure the channels the evidence backs before the building decision, because the measurement is cheap and the decision is expensive and binding for decades. Air and acoustics first, daylight second, the overall design and green promises with their uncertainty named.

For the facility manager, the air is the first channel to act on. A continuous measurement on the Boston pattern makes hidden HVAC faults visible and gives the ventilation decision data instead of assumption. The thresholds already exist: the WHO guideline value for PM2.5, and the common proxy of CO2 near 1,000 ppm for adequate fresh air. The framework is a decision rule, keyed to what the sensor shows. If the threshold is exceeded, the ventilation plan belongs on the table before the renovation budget is released. If it is met, the renovation argument itself needs examining, rather than spending money against a problem the measurement does not show. The link back to learning is the cleanest in the whole report. Wargocki’s synthesis ties a CO2 level to an expected performance change, so a sensor on the wall gives a rough indication of the room’s effect on a test, not a guarantee.
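The decision rule for the air channel can be sketched in a few lines. This is a minimal illustration, not a district's actual procedure: the `RoomReading` record, the choice to summarise CO2 as a mean over occupied hours, and the sample values are all assumptions made for the example. Only the two reference values come from the text.

```python
from dataclasses import dataclass

# Reference values named in the text. The CO2 figure is a rule-of-thumb
# proxy for adequate fresh air, not an ASHRAE limit.
PM25_24H_GUIDELINE_UGM3 = 15.0
CO2_PROXY_PPM = 1000.0


@dataclass
class RoomReading:
    """Hypothetical summary of one classroom's sensor series."""
    room: str
    pm25_24h_mean_ugm3: float
    co2_occupied_mean_ppm: float  # assumed aggregation: mean over occupied hours


def ventilation_decision(reading: RoomReading) -> str:
    """Apply the rule: threshold exceeded -> ventilation plan before budget."""
    exceeded = []
    if reading.pm25_24h_mean_ugm3 > PM25_24H_GUIDELINE_UGM3:
        exceeded.append("PM2.5")
    if reading.co2_occupied_mean_ppm > CO2_PROXY_PPM:
        exceeded.append("CO2")
    if exceeded:
        return "ventilation plan first (" + ", ".join(exceeded) + " above reference)"
    return "re-examine renovation argument (air within reference)"


# Made-up readings for two rooms, for illustration only.
rooms = [
    RoomReading("Room 12", pm25_24h_mean_ugm3=9.0, co2_occupied_mean_ppm=1450.0),
    RoomReading("Room 14", pm25_24h_mean_ugm3=6.5, co2_occupied_mean_ppm=820.0),
]
for reading in rooms:
    print(f"{reading.room}: {ventilation_decision(reading)}")
```

The rule only sorts rooms into two piles. How long a series must run before the mean is trustworthy is a separate question that the short measurement windows in the literature leave open.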

The size of the payoff depends on which study backs the pitch to a school board. Pooling published studies, Wargocki puts the gain from cutting CO2 from 2,300 to 900 ppm at about five per cent. Mendell, using daily sensors across about 150 California classrooms rather than one-time readings, found a smaller effect: 0.6 extra points in English per 10 per cent more ventilation, and a similar figure for mathematics that missed significance (Mendell et al., 2016, Indoor Air). Cite Mendell’s number to a board weighing the cost. It is the stricter measurement, and the number most likely to hold up under scrutiny.

For the architect the work sits with acoustics and material. ANSI S12.60 gives the target values, reverberation under 0.6 seconds and background noise under 35 dB-A for the unoccupied room. In Shield and Dockrell’s London data, the occupied room matters even more. The noise level with children in it tracked test scores more closely than the traffic outside, and it did so after socioeconomic controls. Measuring before the material choice tells you whether hard surfaces like exposed concrete are tolerable or whether acoustic ceiling baffles and absorbent surfaces need priority. Daylight comes second and belongs in early planning, not in the fit-out at the end, with the caveat that its measured effect is more modest than the headline.

Where the daylight budget is tight, view matters more than raw glazing area. In a 2002 follow-up, Heschong, Wright and Okura could not cleanly replicate the 1999 quantity effect; they found the story shifting towards the quality of the view instead (Heschong, Wright and Okura, 2002, Leukos). Boyce warned against treating the original correlation as settled causal ground, a caution that applies most directly here (Boyce, 2005, Leukos). A specification that maximises glazing area is betting on the half of Heschong’s finding that did not replicate in 2002. The half that held up was the quality of the view. Sightlines and glare control are where the budget should go.

For the K-12 planner the most important consequence is a stop-doing rule. The whole-room targets, such as wall displays that leave between 20 and 50 per cent of the wall area clear or a minimum green share in the schoolyard, are practitioner defaults rather than calibrated thresholds. Barrett supports the order of magnitude, that a moderate level of complexity beats both the cluttered wall and the bare one, but not the exact value. Defaults like these are usable as orientation as long as they are treated as what they are. The mistake would be to shift budget towards a 16 per cent promise while neglecting the channels the evidence supports. The design budget belongs where the measurement points. A single cross-sectional study is a weak anchor for a wall-coverage programme.

The same test that separated air and acoustics from Barrett’s figure works on the next vendor claim. Two checks apply: did the number survive a control for free-school-meal share or an equivalent socioeconomic measure, the test Shield and Dockrell’s noise finding passed and the building-condition literature the NRC reviewed did not, and does it replicate across more than one sampling logic, a pooled synthesis, a cross-section and a prospective design, the way the air evidence does. A claim that fails both checks is a Barrett-class number: real, published and too thin to anchor a design budget by itself.
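The two checks can be written down as a small screening function. This is a sketch under stated assumptions: the `Claim` record, the verdict wording and the way the two example claims are encoded are readings of this report, not fields or labels taken from any of the cited studies.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """A vendor or study claim, described by the two checks in the text."""
    label: str
    survived_ses_control: bool  # held after a free-school-meal or similar control
    sampling_logics: set = field(default_factory=set)  # e.g. {"synthesis", "cross-section"}


def weigh(claim: Claim) -> str:
    """Return a rough verdict; the verdict wording is illustrative."""
    replicated = len(claim.sampling_logics) > 1
    passed = int(claim.survived_ses_control) + int(replicated)
    if passed == 2:
        return "passes both checks"
    if passed == 1:
        return "passes one check: weigh with care"
    return "fails both checks: too thin to anchor a design budget alone"


# Encoded as this report reads them, not as the studies describe themselves.
noise = Claim("classroom noise", True, {"cross-section"})
whole_room = Claim("whole-room design, 16 per cent", False, {"cross-section"})

for claim in (noise, whole_room):
    print(f"{claim.label}: {weigh(claim)}")
```

A screen like this does not judge effect size. It only asks whether a number has earned the right to be discussed alongside a budget.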

The consequence is measurement before the decision, on the channels where the evidence is strongest, rather than more design. The cost is asymmetric: the sensor hardware sits in the low hundreds of dollars per room, and even Boston’s full programme with installation, dashboard and multi-year analysis came to about 7 million dollars against renovation budgets in the millions per school. The district that measures first spends less and decides better than the one that designs on a brochure figure. Most districts still make the building decision before measuring anything.
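The asymmetry can be checked with the Boston figures quoted earlier. The sketch below divides the full programme cost by the stated sensor and school counts. Because the text gives those counts as "more than 4,000" and "over 100", the per-unit results are upper bounds. The one-million-dollar renovation figure is an assumed floor for "in the millions per school", used only to show the ratio.

```python
# Programme cost from the text: federal relief money plus the city grant.
PROGRAMME_COST_USD = 7_000_000 + 200_000

SENSORS_AT_LEAST = 4_000   # "more than 4,000 air sensors"
SCHOOLS_AT_LEAST = 100     # "over 100 schools"

# Assumption for illustration: the low end of "millions per school".
ASSUMED_RENOVATION_USD = 1_000_000

# Counts are lower bounds, so these per-unit costs are upper bounds.
max_cost_per_sensor = PROGRAMME_COST_USD / SENSORS_AT_LEAST
max_cost_per_school = PROGRAMME_COST_USD / SCHOOLS_AT_LEAST
share_of_renovation = max_cost_per_school / ASSUMED_RENOVATION_USD

print(f"all-in cost per sensor: at most {max_cost_per_sensor:,.0f} USD")
print(f"all-in cost per school: at most {max_cost_per_school:,.0f} USD")
print(f"share of an assumed renovation: at most {share_of_renovation:.1%}")
```

Even with installation, dashboard and analysis folded in, the all-in figure stays at or below about 1,800 dollars per sensor and 72,000 dollars per school on these numbers, a single-digit percentage of the assumed renovation.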

Sources

Wargocki, Porras-Salazar, Contreras-Espinoza and Bahnfleth, 2020, Building and Environment 173:106749. Quantitative synthesis of published studies on classroom CO2 and performance.
Haverinen-Shaughnessy and Shaughnessy, 2015, PLOS ONE 10(8):e0136165. 70 schools, 140 fifth-grade classrooms, N=3,109; ventilation and temperature against test scores.
Mendell, Eliseeva, Davies and Lobscheid, 2016, Indoor Air 26(4):546-557. Prospective study, ~150 classrooms in 28 California schools, daily CO2 sensors.
Heschong (Heschong Mahone Group), 1999, Daylighting in Schools, commissioned by Pacific Gas and Electric. 21,000+ student records; Capistrano +20% maths and +26% reading learning-rate.
Heschong, Wright and Okura, 2002, Leukos 1(3). Follow-up that could not cleanly replicate the 1999 daylight effect and shifted towards view quality.
Boyce, 2005, Leukos. Caution against treating the daylight and achievement correlation as a settled causal relationship.
Shield and Dockrell, 2008, Journal of the Acoustical Society of America 123(1):133-144. London; external and internal classroom noise negatively associated with attainment after socioeconomic controls.
Barrett, Davies, Zhang and Barrett, 2015, Building and Environment 89:118-133. HEAD project; 3,766 pupils, 153 classrooms, 27 schools; design explains ~16% of variance in learning progress.
Morgan, Farkas, Wang, Hillemeier, Oh and Maczuga, 2019, Early Childhood Research Quarterly 46:20-32. N=11,010; working-memory deficits and repeated mathematics difficulty.
Dadvand, Nieuwenhuijsen, Esnaola et al., 2015, PNAS 112(26):7937-7942. ~2,600 Barcelona schoolchildren; green space and cognitive development, BREATHE project.
Park, Nanda, Adams, Essary and Hoelting, 2020, Journal of Interior Design 45(1):13-32. Sensory Well-Being Hub for ~60 adolescents with developmental disabilities.
Nair, Priya, Rajagopal et al., 2022, Frontiers in Psychiatry 13:1042641. Questionnaire of 87 parents and caretakers of autistic children on light and colour.
Palacios Temprano, Eichholtz, Willeboordse and Kok, 2020, BMJ Open 10(3):e031233. Protocol for sensor deployment across 280 classrooms over five years.
National Research Council, 2007, Green Schools: Attributes for Health and Learning, National Academies Press. Committee review; Finding 8 on the inadequacy of building-condition methodologies.
Boston Public Schools indoor air-quality sensor programme, 2021 onwards. 4,000+ sensors across 100+ schools; analysis by Boston University School of Public Health (Patricia Fabian).
EDspaces, trade-practitioner design recommendations for K-12 sensory, executive-function and outdoor spaces.
World Health Organization, 2021 global air quality guidelines. PM2.5 24-hour mean below 15 µg/m³.
ANSI S12.60. Classroom reverberation time below 0.6 s, background noise below NC-30 (~35 dB-A) in the unoccupied room.
ASHRAE Standard 62.1. Per-person ventilation rates; CO2 near 1,000 ppm is a common rule-of-thumb proxy, not an ASHRAE limit.
