THIS README IS FOR THE DATA/MODELS DIRECTORY. EACH SUB-DIRECTORY ALSO CONTAINS ITS OWN README FILE.

THIS DIRECTORY CONTAINS TWO SUB-DIRECTORIES. THE FIRST CONTAINS THE CODE TO FIT GENERALISED LINEAR MODELS (POISSON AND NEGATIVE BINOMIAL) TO THE HOMICIDE AND MOTOR VEHICLE THEFT COUNTS AT THE CENSUS TRACT LEVEL FOR EACH CITY. THE SECOND CONTAINS THE MINIMUM CONTRAST CODE USED TO ESTIMATE THE RANGE AND VARIANCE FOR THE LOS ANGELES CRIME POINT PATTERNS. ADDITIONALLY, THE SECOND CONTAINS THE CODE TO GENERATE THE LA 2014 DATA AGGREGATED OVER THE 200m-BY-200m GRID FOR A SIMPLE GLM FIT WITHIN INLA. THE RESULTS OF THIS FINAL R SCRIPT WERE PRODUCED FOR INTEREST AND FOR COMPARISON WITH THE SIZE AND MAGNITUDE OF THE COVARIATE EFFECTS FROM THE GRID-MESH OPTIMISATION IMPLEMENTATION ON THE LOS ANGELES POLYGON.

THE MANIPULATED DATA FILES (CREATED WITHIN THIS DIRECTORY) ARE NOT ALL CONTAINED WITHIN THIS ARCHIVED FOLDER, BUT THOSE THAT ARE NOT STORED CAN BE RECREATED USING THE R FILES IN THIS DIRECTORY. ALTHOUGH NOT ALL OF THE DATA FILES ARE ARCHIVED, THEIR NAMING CONVENTIONS ARE DISCUSSED IN THE RELEVANT SUB-DIRECTORIES, AS THEY ARE USED WITHIN THE R SCRIPTS.

- GLMS: this sub-directory contains the R script to calculate Ripley's K function (homogeneous and inhomogeneous) for the crime data sets and to produce figures comparing the results against the theoretical function. These figures are presented and discussed in Chapter 1 of my thesis. Additionally, it contains the code to fit the Poisson and Negative Binomial generalised linear models using the HMC algorithm through the rstan and rstanarm packages, and three sub-directories (one for each city) to store the output models and relevant plots of the results, which are presented and discussed in Chapter 1 and Appendix A of my thesis.

- MINIMUM_CONTRAST: this sub-directory contains the code to extract the 2014 point patterns and socio-economic variables (population and income only) for Los Angeles and to generate the census-tract level count data for 2014. The code then applies the minimum.contrast function from the lgcp R package to estimate the range and variance for the 2014 homicide and motor vehicle theft point patterns in LA, in order to gauge what a reasonable upper limit on the grid and mesh resolution would be in Chapter 4 of my thesis. Additionally, this sub-directory contains the code to generate the 2014 LA data over the 200m-by-200m grid for some quick GLM fits within INLA. While the generated count data are archived, the generated socio-economic variables (sub-sets of the original data, as we only require the data on the relevant census tracts) are not archived, as is also the case for those produced in DATA/RAW_DATA/COVARIATES. More details can be found in the relevant README files.
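
As a rough illustration of the grid aggregation step mentioned above, the following base R sketch counts points falling in 200m-by-200m cells. It is not the archived script: the bounding box, the simulated points and all object names (pts, cell_size, grid_counts) are hypothetical, and the coordinates are assumed to be in a projected system measured in metres.

```r
# Hypothetical sketch: aggregate a point pattern to counts on a
# 200 m x 200 m grid. Simulated points stand in for the crime data.
set.seed(42)

cell_size <- 200          # grid resolution in metres
xlim <- c(0, 1000)        # hypothetical bounding box, NOT the LA polygon
ylim <- c(0, 800)

# Simulated stand-in for a point pattern (x, y in metres)
n_points <- 500
pts <- data.frame(
  x = runif(n_points, xlim[1], xlim[2]),
  y = runif(n_points, ylim[1], ylim[2])
)

# Number of cells along each axis
nx <- ceiling(diff(xlim) / cell_size)
ny <- ceiling(diff(ylim) / cell_size)

# Cell index of each point; pmin() keeps points lying exactly on the
# upper boundary inside the last cell
ix <- pmin(floor((pts$x - xlim[1]) / cell_size) + 1, nx)
iy <- pmin(floor((pts$y - ylim[1]) / cell_size) + 1, ny)

# Tabulate with explicit factor levels so that empty cells get a zero count
counts <- table(
  factor(ix, levels = seq_len(nx)),
  factor(iy, levels = seq_len(ny))
)

# Long format: one row per cell with its centre coordinates and count.
# as.vector() unrolls the table column by column, so the x index varies fastest.
grid_counts <- data.frame(
  x_centre = xlim[1] + (rep(seq_len(nx), times = ny) - 0.5) * cell_size,
  y_centre = ylim[1] + (rep(seq_len(ny), each = nx) - 0.5) * cell_size,
  count    = as.vector(counts)
)

head(grid_counts)
```

A data frame of this shape (one row per cell, with a count column and any cell-level covariates joined on) is the usual input for a count GLM. The sketch ignores clipping to an irregular study region, which the real LA polygon would require.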