Quantile Regression Ensemble Summer Year (QRESY)

The zip file contains four datasets in CSV format. Each is a weather file holding one hot summer year of hourly data, derived from weather observed over a 40-year basis period, 1974-2013. Two are the so-called probabilistic design summer years (PDSY) for the cities of London (UK) and Joao Pessoa (Brazil). The PDSY uses an overheating metric based on the number of hours in which the temperature is above a certain threshold while a building is occupied. The PDSY is then created by selecting the entire year that has the third hottest mean according to this overheating metric. PDSY is currently used in the UK as a reference for warm summers; this is, however, the first time that a PDSY has been created for Brazil. The other two weather files correspond to the new Quantile Regression Ensemble Summer Year (QRESY), which also aims to represent hot summers for both London and Joao Pessoa. QRESY is created by combining observed summer extreme temperatures. This is done by giving higher weights to quantiles away from the median for ensembles within upper quantiles, while increasing the importance of quantiles near the median when combining lower quantiles.
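The PDSY selection described above can be illustrated with a minimal Python sketch. It is not the procedure used to build the files in this dataset. The function names, the threshold value, the occupancy schedule and the tie-breaking rule are all assumptions made for illustration. The overheating metric is also simplified to a plain count of occupied hours above the threshold.

```python
from typing import Dict, List, Sequence


def occupied_overheating_hours(
    temps_c: Sequence[float], occupied: Sequence[bool], threshold_c: float
) -> int:
    """Count the hours that are both occupied and warmer than the threshold."""
    return sum(1 for t, occ in zip(temps_c, occupied) if occ and t > threshold_c)


def rank_years_by_overheating(
    yearly_temps_c: Dict[int, Sequence[float]],
    occupied: Sequence[bool],
    threshold_c: float,
) -> List[int]:
    """Return the years ordered from most to least overheating.

    Ties are broken by taking the earlier year first (an assumption).
    """
    scores = {
        year: occupied_overheating_hours(temps, occupied, threshold_c)
        for year, temps in yearly_temps_c.items()
    }
    return sorted(scores, key=lambda year: (-scores[year], year))


def select_third_hottest_year(
    yearly_temps_c: Dict[int, Sequence[float]],
    occupied: Sequence[bool],
    threshold_c: float,
) -> int:
    """Pick the year ranked third by the simplified overheating metric."""
    ranking = rank_years_by_overheating(yearly_temps_c, occupied, threshold_c)
    if len(ranking) < 3:
        raise ValueError("at least three candidate years are required")
    return ranking[2]
```

In practice each year would hold a full set of hourly values and the occupancy schedule would have the same length; the whole selected year would then be used as the weather file.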

Cite this dataset as:
Herrera Fernandez, M., Ramallo-González, A., Eames, M., Ferreira, A., Coley, D., 2018. Quantile Regression Ensemble Summer Year (QRESY). Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-00480.




application/zip (163kB)
Creative Commons: Attribution 4.0

Data under Licence: Creative Commons Attribution 4.0 International


Matt Eames
University of Exeter

Aida A. Ferreira
Pernambuco Federal Institute of Education, Science, and Technology

David Coley
University of Bath


University of Bath
Rights Holder


Temporal coverage:

From 1 January 1974 to 31 December 2013

Geographical coverage:

London (UK) and Joao Pessoa (Brazil)


Data collection method:

The Quantile Regression Ensemble Summer Year (QRESY) creation process starts by collecting hourly weather data over a long period. Weather files typically aim to be representative of periods of around 20-40 years, and we do the same here, although much longer periods could be used. The availability, variables and quality of hourly weather data vary with location. The variables usually include temperature, atmospheric pressure, cloud cover, wind speed and direction, and precipitation, among others.

Data processing and preparation activities:

For the QRESY process, this data must be preprocessed to ensure it contains no long sequences of missing data. If a large amount of data is missing in any of the variables, the whole year is removed from the analysis. At this point it is also necessary to decide the target level of extreme weather to work with, that is, to fix the quantile level for the subsequent construction of the Quantile Regression models according to its distance from the median (quantile 50, Q50). Running a Quantile Regression model for every year under analysis is an "embarrassingly parallel" problem: it is straightforward to separate it into a number of independent tasks and run the code on a parallel machine. The set of regressions is combined into a single year of hourly data. This is done by giving higher weights to quantiles away from Q50 for ensembles within upper quantiles, while increasing the importance of quantiles near Q50 when combining lower quantiles. The idea is to focus on explaining critical phases of summer temperatures. Each ensemble is thereby made over the predictors of a number of regression models, one for each of the years in the database. The ensemble parameters can be tuned by cross-validation over random partitions of the data into training and test summer periods.
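The steps above can be sketched in Python, under several loudly stated assumptions. This record does not give the weighting formula, the form of the per-year models, or the missing-data limit, so everything below is hypothetical: missing values are represented as None, whole years are dropped by a missing-fraction limit, an empirical quantile stands in for a fitted Quantile Regression model, and the weight function only mimics the qualitative behaviour described. The related paper should be consulted for the actual method.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence

Series = Sequence[Optional[float]]


def drop_incomplete_years(
    yearly: Dict[int, Series], max_missing_fraction: float = 0.05
) -> Dict[int, Series]:
    """Keep years whose share of missing (None) hours is within the limit.

    The 5% default is an illustrative assumption, not a documented value.
    """
    kept = {}
    for year, temps in yearly.items():
        missing = sum(1 for t in temps if t is None)
        if temps and missing / len(temps) <= max_missing_fraction:
            kept[year] = temps
    return kept


def empirical_quantile(values: Series, q: float) -> float:
    """Linear-interpolation quantile; a stand-in for a quantile regression fit."""
    data = sorted(v for v in values if v is not None)
    if not data:
        raise ValueError("no observations available")
    pos = q * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])


def per_year_quantiles(yearly: Dict[int, Series], q: float) -> Dict[int, float]:
    """Years are independent, so the per-year work is mapped over a pool.

    Threads keep the sketch simple; a process pool would suit CPU-bound fits.
    """
    years = sorted(yearly)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda y: empirical_quantile(yearly[y], q), years))
    return dict(zip(years, results))


def ensemble_weights(levels: Sequence[float], upper: bool) -> List[float]:
    """Hypothetical normalised weights over quantile levels.

    Upper ensembles favour levels far from Q50 (0.5); lower ensembles
    favour levels near Q50.
    """
    distance = [abs(q - 0.5) for q in levels]
    raw = distance if upper else [0.5 - d for d in distance]
    total = sum(raw)
    if total <= 0:
        raise ValueError("weights cannot be normalised for these levels")
    return [r / total for r in raw]


def weighted_combine(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of member predictions; weights are assumed to sum to 1."""
    return sum(v * w for v, w in zip(values, weights))
```

One possible use is to filter the years, compute per-year values at several quantile levels, and combine them with `ensemble_weights`. The tuning of ensemble parameters by cross-validation is not covered by this sketch.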


Engineering and Physical Sciences Research Council (EPSRC)

COLBE - The Creation of Localized Current and Future Weather for the Built Environment

Publication details

Publication date: 21 March 2018
by: University of Bath

Version: 1

DOI: https://doi.org/10.15125/BATH-00480

URL for this record: https://researchdata.bath.ac.uk/id/eprint/480

Related papers and books

Herrera, M., Ramallo-González, A. P., Eames, M., Ferreira, A. A., and Coley, D. A., 2018. Creating extreme weather time series through a quantile regression ensemble. Environmental Modelling & Software, 110, 28-37. Available from: https://doi.org/10.1016/j.envsoft.2018.03.007.

Contact information

Please contact the Research Data Service in the first instance for all matters concerning this item.

Contact person: Manuel Herrera Fernandez


Faculty of Engineering & Design
Architecture & Civil Engineering