THIS README IS FOR THE GRID_MESH/IRREGULAR_POLYGON_LGCP/IRREGPOLLGCP_CODE DIRECTORY.

THIS DIRECTORY CONTAINS THE DATA AND CODE FOR THE IMPLEMENTATION OF THE TRADITIONAL AND SBC SIMULATION STUDIES FOR LGCP DATA ON THE LOS ANGELES POLYGON. BOTH R SCRIPTS INCLUDE THE CODE THAT WAS RUN TO PRODUCE THE COVARIATES, MESHES AND GRIDS. THIS CODE WAS COMMENTED OUT ONCE THESE OBJECTS HAD BEEN CREATED AND STORED, TO AVOID WASTING TIME RE-CREATING THE MESH AT EACH STEP AND TO ENSURE CONSISTENCY.

- GridMeshOptimIrregTrad_final.R: this file contains the code to simulate the covariates and meshes at different resolutions for the simulation study, which are commented out after their initial creation. This R script contains the code to run the traditional simulation study for the grid-mesh optimisation method.

- GridMeshOptimIrreg_final.R: this file contains the code to simulate the covariates and meshes at different resolutions for the simulation study, which are commented out after their initial creation. This R script contains the code to run the SBC simulation study for the grid-mesh optimisation method. There was a minor error in the mean field SBC rank calculations when the code was run; this typo has since been fixed, with a comment in the code discussing the alteration.

- TimeErrorProcessandDataGeneration_final.R: this code finds the time errors in the completed runs, notes the Simulation-Grid-Mesh indices and creates a data frame for each process for use within GridMeshOptimIrreg_TimeErrorRuns_final.R for the re-runs. These are saved as a list of data frames in TimingErrorDataFrames.rda. This R script also takes the final SBC outputs (with time errors still present) and saves them under the same file name with _TIMEERRORFINAL added as a suffix, so that they are stored separately. These are stored in the IRREGULAR_POLYGON_LGCP/IRREGPOLLGCP_OUTPUT directory.

- GridMeshOptimIrreg_TimeErrorRuns_final.R: this file contains the code (an altered version of GridMeshOptimIrreg_final.R) that re-ran the SBC simulation study for each simulation iteration, grid and mesh that resulted in a TIME ERROR (did not complete in 6 hours), run with all cores of the node available (16). The outputs loaded for the re-runs have the suffix `_TIMEERRORFINAL' and the re-run saves the final output without any suffix added - the original output file name. The re-runs are guided by the simulation-grid-mesh information in the list of time error data frames, one for each process. There was a minor error in the mean field SBC rank calculations when the code was run; this typo has since been fixed, with a comment in the code discussing the alteration.
As well as the Grid-Mesh Optimisation outputs, we also output the data frames for each individual process, which were produced as a single list by TimeErrorProcessandDataGeneration_final.R and saved as TimingErrorDataFrames.rda. After the TIME ERROR re-runs, these data frames should have a 1 in the final column if the run completed. A 2 indicates that there was an error when running; usually this occurred after 12 hours (or the error was the run exceeding 12 hours), and so the run was not re-run again. Additionally, in our case there are two entries with the value 0, which originally indicated that the run had not been re-run. In these two cases the re-run did take place but took over 12 hours, and when we manually stopped it no 2 was placed in the final column as it should have been. These output files can be found in the IRREGULAR_POLYGON_LGCP/IRREGPOLLGCP_OUTPUT/ directory.

- GridMeshOptimIrreg_SpaceErrorRuns_final.R: this file contains the code (an altered version of GridMeshOptimIrreg_final.R) that re-ran the SBC simulation study for each simulation iteration, grid and mesh that resulted in a SPACE ERROR, run with all cores of the node available (16). These errors arose from issues with the temporary directory: jobs initially failed due to a lack of space, and when the error was removed and the jobs were re-started, some completed without another error while others still presented errors. The latter were re-run after all other simulations and TIME ERROR re-runs were completed. The outputs loaded for the re-runs have the suffix `_SPACEERRORFINAL' and the re-run saves the final output without any suffix added - the original output file name. There was a minor error in the mean field SBC rank calculations when the code was run; this typo has since been fixed, with a comment in the code discussing the alteration.
Note: had there been no problem with the temporary directory initially, these re-runs would likely have been unnecessary for the simulation study.
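
As a minimal sketch of how the final-column codes described above might be tallied after the re-runs: the data frame below is a made-up stand-in for one element of the list in TimingErrorDataFrames.rda. Its column names (Sim, Grid, Mesh, Status) and values are assumptions for illustration only, not the names or results used in the scripts.

```r
# Hypothetical stand-in for one process's data frame; the real column
# names and contents in TimingErrorDataFrames.rda may differ.
time_err <- data.frame(
  Sim    = c(3, 3, 7, 9),
  Grid   = c(1, 2, 2, 4),
  Mesh   = c(2, 2, 3, 4),
  Status = c(1, 2, 0, 1)  # final column: 0 = not re-run, 1 = complete, 2 = error
)

# Take the final column by position rather than by name.
status <- time_err[[ncol(time_err)]]

# Label the codes and count how many runs fall under each.
status_labels <- c("0" = "not re-run", "1" = "complete", "2" = "error")
status_counts <- table(factor(status_labels[as.character(status)],
                              levels = unname(status_labels)))
print(status_counts)

# Simulation-grid-mesh combinations without a completed re-run.
incomplete <- time_err[status != 1, c("Sim", "Grid", "Mesh")]
print(incomplete)
```

Reading the final column by position keeps the sketch independent of whatever that column is actually called in the stored data frames.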

FOR THE DATA BELOW, NOTE THAT EACH OF THE LISTS WAS CREATED BASED ON THE CHOICE OF 5KM, 2KM, 1KM, 0.5KM AND 0.2KM FOR THE GRID CELL WIDTHS AND CORRESPONDING MAXIMUM MESH EDGE LENGTHS. THE LISTS OF MESHES, GRIDS WITH THE CORRESPONDING COORDINATES, AND AGGREGATED COVARIATES ARE THEREFORE OF LENGTH 5, ALTHOUGH WE ONLY USE THE COARSER 4 RESOLUTIONS AS THE 0.2KM RESOLUTION WOULD RESULT IN EXTREMELY EXPENSIVE COMPUTATIONS. THE R SCRIPTS FOR THE SIMULATION STUDIES EXTRACT THE REQUIRED ELEMENTS FROM THE LISTS.

- MeshesIrregPolLGCP.rda: this contains a list of meshes for each resolution.

- QuadratsRegPolLGCP.rda, CoordsIrregPolLGCP.rda: pre-calculated grid discretisation of the regular window as well as the grid centres, stored to save time when simulating data, especially as these are fixed for each data set. Additionally, the latter includes the grid cell areas for each of the grid resolutions and a list of data frames giving the ordering of the grid cells, so that we traverse down the y axis before travelling across the x axis with respect to the grid cells.

- GridMeshIrregPolLGCPSSCov.rda: the simulated covariates as rasters/pixel images, where the latter are used for the data simulation. These covariates are produced from the 0.2km-by-0.2km resolution Los Angeles city population and average income. The average income is not quite the true gridded interpolation of the data that we intend to use for our final crime models. These data sets are found in DATA/PROCESSED_DATA/CRIME/COUNT_DATA_GMO.

- CovAggGridIrregPolLGCP.rda: covariates pre-aggregated over the different grid resolutions, which avoids re-calculating them at each data simulation and saves time, especially for the finer grid aggregation.

- WindowsIrregPolLGCP.rda: this is used for the data generation and is the Los Angeles city window.
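
The list layout and grid cell ordering described above can be illustrated with a small sketch. The objects here are placeholders built in the session, since the object names stored inside the .rda files are not stated in this README, so only the indexing pattern is meant to carry over. It assumes the five list elements are ordered from 5km to 0.2km, and it shows one possible reading of the cell ordering (down the y axis within a column, then across the x axis) on a toy grid.

```r
# Grid cell widths (km) named in this README; their order within the
# stored lists is an assumption here.
resolutions_km <- c(5, 2, 1, 0.5, 0.2)

# Placeholder list standing in for e.g. the list of meshes; the real
# elements would be mesh objects rather than character strings.
meshes_all <- lapply(resolutions_km, function(w) paste0("mesh_", w, "km"))
names(meshes_all) <- paste0(resolutions_km, "km")

# Keep only the four coarser resolutions, dropping the 0.2km element.
keep <- resolutions_km > 0.2
meshes_used <- meshes_all[keep]
print(names(meshes_used))

# Toy grid of cell centres: 2 columns (x) by 3 rows (y).
cells <- expand.grid(x = c(0.5, 1.5), y = c(0.5, 1.5, 2.5))

# Order the cells so that we move down the y axis (largest y first)
# within a column before stepping across to the next x value.
ord <- order(cells$x, -cells$y)
cells_ordered <- cells[ord, ]
print(cells_ordered)
```

In the actual scripts the stored ordering data frames should be used instead of recomputing an ordering, since they are fixed for each data set.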


THERE IS ALSO A SUB-DIRECTORY, IRREGPOLLGCP_CODE_TESTINGVALUES, CONTAINING THE CODE THAT WAS USED TO SIMULATE DATA AND ASSESS WHETHER REASONABLE DATA SETS WERE BEING PRODUCED, RATHER THAN EXTREMELY LARGE DATA SETS OCCURRING OFTEN OR ERRORS BEING PRODUCED WITHIN THE lgcp FUNCTION DUE TO A VERY LARGE NUMBER OF POINTS NEEDING TO BE SIMULATED.

- Data: these are the same data files created in the main directory and used here for the simulations, although not all of these are necessary for the simulations. While the window is not used here, the census tract shapefiles are transformed into the required polygon for the data generation.
	-- MeshesIrregPolLGCP.rda
	-- QuadratsRegPolLGCP.rda
	-- GridMeshIrregPolLGCPSSCov.rda
	-- CovAggGridIrregPolLGCP.rda

- LGCPCovarianceandFixedPriorTest_final*.R: this contains some rough code for quick tests of the behaviour of the data generation under different priors for sigma and the fixed effects, while the prior for rho is held fixed. With *=bRho,cRho a slightly different prior for rho is used - see the prior.sim functions in each script for the rho prior information. The prior for rho is always held fixed for our simulations.

- LACovFixedEffectsPrior_Final*.rda: these are the outputs from different combinations of priors. They have an additional b or c suffix if they were output from LGCPCovarianceandFixedPriorTest_final/b/c.R, where the different suffixes (none, b or c) correspond to runs using different Rho priors. Initial simulations produced N=50 point patterns for each combination, but results with _Long_i in the suffix set N=100 and those with the additional suffix of 2 or Take2 set N=500.

- thetatilden_final.R: this produces some quick checks on the simulation of the point patterns for fixed values of the parameters, without any interest in the priors.

- DataPlottingIrregPol_final.R: This produces some plots (using tmap) of the simulated data sets from the fixed parameter values that will be used in the simulation studies.
	-- Outputs: MeshesIrregPolLGCP.pdf, QuadratsIrregPolLGCP.pdf, IrregPolLGCPCovariates.pdf, IrregPolLGCPGriddedCovariates, IrregPolSimStudySimulatedDataSeti*.pdf and LA*IrregPolSimStudy.pdf: these are the plots of the grids, meshes and covariates for the simulation studies as well as the true Los Angeles crime data, and the simulated data on two of the grid resolutions as a comparison to the true crime data. There were three simulated data sets for these plots, and so i = 1, 2, 3.