These processed datasets contain all numerical sensor data used to train and test the ML algorithms discussed in the associated publication. They include data from temperature, pressure, humidity, VOC and spectral sensors. The data is split into four datasets (as defined in Table V of the associated publication), each containing a different combination of sensor data. Each dataset is subdivided into data ("x") and labels ("y") for both the training and the testing sets. The testing set is a random 30% sample of the cleaned data; the remaining 70% forms the training set. Each data subset is balanced, as discussed in section 3.E.3 of the associated publication.
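The random 70/30 split described above can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the function name `random_split`, the seed, and the synthetic arrays are assumptions made for illustration only, and the file format of the published datasets is not assumed here.

```python
import numpy as np


def random_split(x, y, test_fraction=0.3, seed=0):
    """Randomly split samples into training and testing subsets.

    Hypothetical helper for illustration; the seed and the rounding
    rule are assumptions, not taken from the associated publication.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")

    # Shuffle the sample indices once so that x and y stay aligned.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))

    # The first 30% of the shuffled indices form the testing set,
    # the remaining 70% form the training set.
    n_test = int(round(test_fraction * len(x)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]


# Synthetic stand-in data: 100 samples, 2 features, binary labels.
x = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100) % 2

x_train, y_train, x_test, y_test = random_split(x, y)
```

Shuffling a single index array and applying it to both "x" and "y" keeps each label attached to its sample. Note that this sketch does not balance the classes; balancing is a separate step, discussed in the associated publication.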