THIS README IS FOR THE DATA/PROCESSED_DATA DIRECTORY. EACH SUB-DIRECTORY ALSO CONTAINS ITS OWN README FILE.

THIS DIRECTORY CONTAINS THE CODE FOR THE CREATION OF THE COUNT DATA AT DIFFERENT GRID RESOLUTIONS FOR THE GRID-MESH OPTIMISATION METHOD AND FOR MODELLING OF THE CRIME DATA.
THIS DIRECTORY CONTAINS 3 SUB-DIRECTORIES: ONE EACH FOR THE SHAPEFILES, THE SOCIO-ECONOMIC VARIABLES AND THE CRIME DATA. THE CRIME SUB-DIRECTORY ALSO CONTAINS THE CODE THAT COMBINES THE CRIME DATA, THE COVARIATE DATA AND THE CENSUS TRACT SHAPEFILES TO PRODUCE COUNT DATA OVER THE CENSUS TRACTS AND GRIDS, WITH THE RELEVANT SOCIO-ECONOMIC VARIABLES ATTACHED.

THE MANIPULATED DATA FILES (COPIED OVER FROM THE RAW_DATA DIRECTORY) ARE NOT INCLUDED IN THIS ARCHIVED FOLDER, BUT THEY CAN BE RECREATED BY RUNNING THE R SCRIPTS IN THE RELEVANT RAW_DATA DIRECTORY. ALTHOUGH THESE DATA FILES ARE NOT ARCHIVED, WE DESCRIBE THEIR NAMING CONVENTIONS BELOW BECAUSE THE R SCRIPTS RELY ON THEM.
HOWEVER, WE DO ARCHIVE THE COUNT DATA GENERATED OVER THE CENSUS TRACTS AND GRIDS WITHIN THEIR RELEVANT SUB-DIRECTORIES WITHIN THE PROCESSED_DATA/CRIME DIRECTORY.

- CRIME: this sub-directory contains four further sub-directories, which take the homicide and motor vehicle theft data created in DATA/RAW_DATA/CRIME and produce count data over the census tracts as well as count data over different grid resolutions. The details of the interpolation of the variables are discussed in Chapter 4.1 of my thesis.
	-- POINT_PATTERN: this contains the homicide and motor vehicle theft data produced in and copied over from DATA/RAW_DATA/CRIME.
	-- COUNT_DATA_CENSUS_TRACT: this sub-directory contains the R script CountDataGen_CT_final.R, which creates the census tract-level count data. These are saved in a sub-directory for each city as *2015CTCountData_projFinal.rda and *2015CTSFCountData_projFinal.rda, where the latter holds the same data saved as an sf object.
	-- COUNT_DATA_GMO: this sub-directory contains the R script CountDataGen_GMO_final.R, which takes the census tract-level socio-economic variables stored in PROCESSED_DATA/COVARIATES (copied over from RAW_DATA/COVARIATES) as well as the crime data stored in PROCESSED_DATA/CRIME/POINT_PATTERN (copied over from RAW_DATA/CRIME) to generate count data over grids of different resolutions. Importantly, this produces the gridded count data used to create the covariates for the Grid-Mesh Optimisation method implemented on the Los Angeles window (Chapter 4). The second covariate is derived from the average income file *_CTInc_15_imp_proj.rds, in which all missing data for census tracts are interpolated, regardless of the total household estimates for the census tract. This income is interpolated onto the grids slightly differently from the income data used for modelling the city data: the weights for the areal interpolation are the proportion of the area of the census tract intersected with the grid cells. This is discussed in more detail in Chapter 4 of my thesis.
	-- COUNT_DATA_FINAL: this sub-directory contains the R scripts CountDataGen_final.R and CountDataGen_Scale_final.R. The former produces the count data on different grid resolutions in projected (UTM) coordinates. The latter takes this count data frame, together with the grids used to create it, and rescales them so that a unit shift in the x or y direction corresponds to a 10km shift rather than a 1m shift, and then shifts the locations so that the bottom-left corner of the boundary window of the city's polygon is at (0,0). As with the count data generated in DATA/PROCESSED_DATA/CRIME/COUNT_DATA_GMO, we take the census tract-level socio-economic variables stored in PROCESSED_DATA/COVARIATES (copied over from RAW_DATA/COVARIATES) as well as the crime data stored in PROCESSED_DATA/CRIME/POINT_PATTERN (copied over from RAW_DATA/CRIME). However, unlike the count data generated in DATA/PROCESSED_DATA/CRIME/COUNT_DATA_GMO, we use the average income file *_CTInc_15_imp0_proj.rds, in which missing data for census tracts with an estimated total of zero households are set to zero instead of imputed. Additionally, the interpolation of the average income onto the grids uses the proportion of the area of the grid cell intersected with the census tracts. This is discussed in more detail in Chapter 4 of my thesis. For the different resolutions we also output additional data, such as meshes and the boundary polygon for each city.

- COVARIATES: this contains the processed socio-economic data on the census tracts for the necessary cities, extracted in and copied over from DATA/RAW_DATA/COVARIATES.

- SHAPEFILES: this contains the processed shapefiles for the census tracts for each city, created in and copied over from DATA/RAW_DATA/SHAPEFILES.
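
The rescaling described for CountDataGen_Scale_final.R in the COUNT_DATA_FINAL entry above can be illustrated with a small sketch. The archived scripts are written in R; the Python below is not taken from them and only shows the arithmetic of moving from UTM metres to 10km units with the bottom-left corner of the boundary window at (0,0). The function names, the boundary polygon and the point coordinates are hypothetical values chosen for illustration.

```python
# Hypothetical sketch of the coordinate rescaling; not the thesis code.
METRES_PER_UNIT = 10_000.0  # one scaled unit corresponds to 10km


def window_origin(boundary):
    """Bottom-left corner of the bounding window of a boundary polygon."""
    xs = [x for x, _ in boundary]
    ys = [y for _, y in boundary]
    return min(xs), min(ys)


def to_scaled(points, origin):
    """Shift points so the origin maps to (0, 0), in units of 10km."""
    ox, oy = origin
    return [((x - ox) / METRES_PER_UNIT, (y - oy) / METRES_PER_UNIT)
            for x, y in points]


# Made-up UTM coordinates in metres (assumption, for illustration only).
boundary = [(360000.0, 3730000.0), (400000.0, 3730000.0),
            (400000.0, 3780000.0), (360000.0, 3780000.0)]
crime_points = [(365000.0, 3740000.0)]

origin = window_origin(boundary)
scaled_boundary = to_scaled(boundary, origin)
scaled_points = to_scaled(crime_points, origin)
```

Subtracting the origin before dividing gives the same result as scaling first and then shifting, so the order of the two steps does not affect the final coordinates.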

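The two areal interpolation weightings mentioned in the COUNT_DATA_GMO and COUNT_DATA_FINAL entries above differ only in what the intersected area is divided by: the area of the census tract (GMO) or the area of the grid cell (FINAL). The sketch below is a hypothetical Python illustration, not the R code from the thesis. It assumes axis-aligned rectangles in place of real tract and cell polygons, and it assumes the weights are combined as a normalised weighted mean, which this README does not state; see Chapter 4 of the thesis for the actual method.

```python
# Hypothetical sketch of the two weighting schemes; not the thesis code.
# Rectangles are (xmin, ymin, xmax, ymax); real tracts are polygons.

def rect_area(rect):
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def intersection_area(a, b):
    """Area of overlap between two axis-aligned rectangles."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(overlap)


def interpolate(cell, tracts, values, normalise_by):
    """Weighted mean of tract values for one grid cell.

    normalise_by="tract": weight = overlap area / tract area (GMO entry).
    normalise_by="cell":  weight = overlap area / cell area (FINAL entry).
    Combining the weights as a normalised mean is an assumption.
    """
    weights = []
    for tract in tracts:
        overlap = intersection_area(cell, tract)
        denom = rect_area(tract) if normalise_by == "tract" else rect_area(cell)
        weights.append(overlap / denom)
    total = sum(weights)
    if total == 0:
        return None  # the cell does not overlap any tract
    return sum(w * v for w, v in zip(weights, values)) / total


# Made-up geometry and incomes (assumption, for illustration only).
cell = (0.0, 0.0, 2.0, 2.0)
tracts = [(0.0, 0.0, 1.0, 2.0), (1.0, 0.0, 5.0, 2.0)]
incomes = [100.0, 200.0]

income_cell_weighted = interpolate(cell, tracts, incomes, "cell")    # 150.0
income_tract_weighted = interpolate(cell, tracts, incomes, "tract")  # 120.0
```

In this toy example both tracts cover half of the cell, so the cell-based weights are equal, whereas the tract-based weights favour the small tract that lies entirely inside the cell.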