THIS README IS FOR THE DATA/PROCESSED_DATA/CRIME DIRECTORY; EACH SUB-DIRECTORY ALSO CONTAINS ITS OWN README FILE.

THIS DIRECTORY CONTAINS THE OUTPUT FROM THE MANIPULATION OF THE RAW DATA, AS WELL AS THE CODE AND OUTPUTS FOR CREATING THE COUNT DATA AT THE CENSUS TRACT LEVEL AND AT DIFFERENT GRID RESOLUTIONS FOR THE GRID-MESH OPTIMISATION METHOD AND MODELLING.
IT CONTAINS FOUR SUB-DIRECTORIES. THE FIRST HOLDS THE PROCESSED POINT PATTERN DATA FOR HOMICIDE AND MOTOR VEHICLE THEFTS IN THE DIFFERENT CITIES, AS PRODUCED FROM DATA/RAW_DATA/CRIME. THE SECOND CREATES COUNT DATA FRAMES FOR EACH CRIME IN EACH CITY OVER THAT CITY'S CENSUS TRACTS. THE THIRD CREATES GRIDDED COUNT DATA FRAMES FOR THE GRID-MESH OPTIMISATION METHOD. THE FINAL SUB-DIRECTORY PRODUCES THE DATA, GRIDS AND MESHES FOR MODELLING THE CRIME DATA USING INLA OR THE UNIVARIATE AND MULTIVARIATE INLA WITHIN MCMC ALGORITHMS.

THE MANIPULATED DATA FILES (COPIED OVER FROM THE RAW_DATA DIRECTORY INTO THE POINT_PATTERN DIRECTORY) ARE NOT CONTAINED WITHIN THIS ARCHIVED FOLDER BUT CAN BE CREATED THROUGH THE R FILES IN THE RELEVANT RAW_DATA DIRECTORY. HOWEVER, THE GENERATED COUNT DATA OVER THE CENSUS TRACTS AND GRIDS ARE ARCHIVED IN THE RELEVANT SUB-DIRECTORIES. ALTHOUGH NOT ALL OF THE DATA FILES ARE ARCHIVED, WE DISCUSS THE NAMING CONVENTIONS BELOW BECAUSE THEY ARE USED WITHIN THE R SCRIPTS.

- POINT_PATTERN: this contains the homicide and motor vehicle theft data produced in DATA/RAW_DATA/CRIME.
- COUNT_DATA_CENSUS_TRACT: this sub-directory contains the R script CountDataGen_CT_final.R, which produces the census tract count data. These are saved in a sub-directory for each city as *2015CTCountData_projFinal.rda and *2015CTSFCountData_projFinal.rda, where the latter is the same data saved as an sf object.
- COUNT_DATA_GMO: this sub-directory contains the R script CountDataGen_GMO_final.R, which takes as input the census tract level socio-economic variables stored in PROCESSED_DATA/COVARIATES (taken from RAW_DATA/COVARIATES) and the crime data stored in PROCESSED_DATA/CRIME/POINT_PATTERN (taken from RAW_DATA/CRIME). Importantly, this produces the gridded count data used to create the covariates for the Grid-Mesh Optimisation method implemented on the Los Angeles window (Chapter 4). The second covariate is created from the average income file *_CTInc_15_imp_proj.rds, in which all missing data for census tracts are interpolated, regardless of the total household estimates for the census tract. The average income is then interpolated onto the grids slightly differently from COUNT_DATA_FINAL, using the proportion of the area of the census tract intersected with the grid cells as the weights for the areal interpolation. This is discussed in more detail in Chapter 4 of my thesis.
- COUNT_DATA_FINAL: this sub-directory contains the R scripts CountDataGen_final.R and CountDataGen_Scale_final.R. The former produces the count data on different grid resolutions at the projected scale, while the latter takes this count data frame, together with the grids used to create it, and rescales them so that a unit shift in the x or y direction corresponds to a 10km shift rather than a 1m shift, and then shifts the locations so that the bottom-left corner of the boundary window of the city's polygon is at (0,0). Unlike the count data generated in DATA/PROCESSED_DATA/CRIME/COUNT_DATA_GMO, we use the average income file *_CTInc_15_0imp_proj.rds (as for the census tract count data sets), where the missing data corresponding to census tracts with an estimated zero total households are set to zero instead of imputed. Additionally, the interpolation of the average income onto the grids uses the proportion of the area of the grid cell intersected with the census tracts as the weights. This is discussed in more detail in Chapter 4 of my thesis. For the different resolutions we output additional data such as meshes and the boundary polygon for the cities.
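
As a rough illustration of the rescaling and of the two interpolation weightings described in the list above, the following sketch may help. It is written in Python rather than R so that it is self-contained, and it is not taken from the archived scripts. All function names, the dictionary layout and the toy areas and coordinates are hypothetical. How the weights are combined with the average income is covered in Chapter 4 of the thesis and is not reproduced here.

```python
# Illustrative sketch only: not taken from the archived R scripts.
# All names and the toy numbers below are hypothetical.

METRES_PER_UNIT = 10_000.0  # one rescaled unit corresponds to 10 km


def rescale_coords(points, bottom_left, metres_per_unit=METRES_PER_UNIT):
    """Rescale projected coordinates (metres) and move the window corner to (0, 0)."""
    x0, y0 = bottom_left
    rescaled = []
    for x, y in points:
        # Scale first (1 unit = 10 km), then shift by the scaled corner.
        rescaled.append((x / metres_per_unit - x0 / metres_per_unit,
                         y / metres_per_unit - y0 / metres_per_unit))
    return rescaled


def tract_proportion_weights(overlap_area, tract_area):
    """Weighting described for COUNT_DATA_GMO: overlap / area of the census tract."""
    return {(t, c): a / tract_area[t] for (t, c), a in overlap_area.items()}


def cell_proportion_weights(overlap_area, cell_area):
    """Weighting described for COUNT_DATA_FINAL: overlap / area of the grid cell."""
    return {(t, c): a / cell_area[c] for (t, c), a in overlap_area.items()}


# Toy example: two tracts (A, B) and two grid cells (1, 2), areas in arbitrary units.
overlap_area = {("A", 1): 2.0, ("A", 2): 2.0, ("B", 2): 2.0}
tract_area = {"A": 4.0, "B": 2.0}
cell_area = {1: 2.0, 2: 4.0}

gmo_weights = tract_proportion_weights(overlap_area, tract_area)
final_weights = cell_proportion_weights(overlap_area, cell_area)
window_points = rescale_coords([(385000.0, 3745000.0)],
                               bottom_left=(365000.0, 3735000.0))
```

With the tract-based weights, the weights of a single tract sum to one across the cells it overlaps. With the cell-based weights, the weights within a single cell sum to one whenever the tracts cover the cell completely.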

NOTE THAT THE OUTPUTS FOUND IN THIS DIRECTORY ALL CONTAIN ADDITIONAL SOCIO-ECONOMIC VARIABLES; HOWEVER, ANY USE OF THESE DATA SETS WITHIN THE THESIS IS FOCUSSED ONLY ON THE TOTAL POPULATION AND AVERAGE INCOME. THEREFORE, ANY MENTION OF THESE ADDITIONAL VARIABLES IN THE RELEVANT README DOCUMENTS IS AN ASIDE, AND THE R CODE TO INCLUDE THESE ADDITIONAL VARIABLES IS COMMENTED OUT. THE CODE AND THE ACCESS TO THESE VARIABLES ARE STILL AVAILABLE IF REQUIRED.

