Dataset Title: Simulation data for toppling and height probabilities in sandpiles
Supervisor: Antal A. Jarai, A.Jarai@bath.ac.uk
PhD student: Minwei Sun, ms2271@bath.ac.uk
Date of data collection: 09-Oct-2017 -- 16-May-2019

This dataset provides the simulation data analysed in Chapter 3 (simulation results) of the student's thesis. All data were collected using the Balena HPC cluster at the University of Bath. The research of the student was supported by an EPSRC doctoral training grant to the University of Bath.

Note: The simulation data are freely available from the University of Bath research data archive, and sample simulation codes are also available.

This dataset contains simulation data in 2d, 3d, 5d, and 32d, in folders named sandpile_data_xd.

In 2d, we simulate the sandpile model in a box of size 2L x 2L, both with periodic boundary conditions for systems with L = 512, 1024, 2048, and 4096 with sample sizes 2 x 10^7, 1.5 x 10^7, 3 x 10^6, and 7.5 x 10^5, and with Dirichlet boundary conditions for systems with L = 512, 1024, 2048, 4096, and 8192 with sample sizes 6 x 10^7, 3 x 10^7, 7.5 x 10^6, 4 x 10^6, and 10^6, respectively. The characteristics simulated include the toppling probability, the number of waves, and the height probability at the origin.

In 3d, we generate toppling probability data with Dirichlet boundary conditions for systems with L = 32, 64, 128, and 256 with sample sizes 8 x 10^7, 2 x 10^7, 4.5 x 10^6, and 4 x 10^6, respectively.

In 5d, we simulate the toppling probability using hashing in a box with radius L = 32. The number of samples taken was 4 x 10^7, with approximately 400 samples discarded due to a full hashtable.

In 32d, we simulate the height probability at the origin using hashing for a system with L = 128 with a sample size of 4 x 10^6.

We check our results in two ways to confirm that our methods give results consistent with earlier work. First, we check that the data agree with some exponents reported in earlier work in 2d and 3d.
In 2d, the data are in out files named xxxsink-cluster-origin-aaa, where L = xxx, and the overall averages are in the text files s-origin-2d-average and s-distinct-origin-2d-average. In 3d, the data are in out files named 3d-xxxsink-cluster-origin, where L = xxx, and the overall averages are in the text files s-origin-3d-average and s-distinct-origin-3d-average.

Second, we check that our methods yield the known height probabilities in 2d with L = 4096 and Dirichlet boundary conditions. The data used are in the text files named probability4096sink-aaa, and the overall average height probability is in the text file probability4096sink-average. The number of samples generated in each file is 5 x 10^4. There are 80 files in total, so the total sample size is 4 x 10^6.

Naming conventions:

2D data:

box4096box256sink-aaa contains simulation run number aaa of 2D toppling probabilities in a box [-256,256]^2 for a system with L = 4096 with Dirichlet boundary conditions. The first two columns give the x and y coordinates, and the third column gives the toppling probability. The number of samples generated in each file is 5 x 10^4. There are 70 text files in total, so the total sample size is 3.5 x 10^6.

box4096box256sink-average contains the average toppling probabilities over all samples in a box [-256,256]^2 for a system with L = 4096 with Dirichlet boundary conditions.

portion512new-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 512 with periodic boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 2 x 10^6. There are 10 text files in total, so the total sample size is 2 x 10^7.

portion1024new-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 1024 with periodic boundary conditions. The top row corresponds to x = 0.
The number of samples generated in each file is 5 x 10^5. There are 30 text files in total, so the total sample size is 1.5 x 10^7.

portion2048new-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 2048 with periodic boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 1 x 10^5. There are 30 text files in total, so the total sample size is 3 x 10^6.

portion4096new-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 4096 with periodic boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 1 x 10^4. There are 75 text files in total, so the total sample size is 7.5 x 10^5.

portionxxx-average contains the average toppling probabilities over all samples in 2D for a system with L = xxx with periodic boundary conditions.

portion512sink-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 512 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 5 x 10^6. There are 12 text files in total, so the total sample size is 6 x 10^7.

portion1024sink-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 1024 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 1 x 10^6. There are 30 text files in total, so the total sample size is 3 x 10^7.

portion2048sink-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 2048 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 2.5 x 10^5. There are 30 text files in total, so the total sample size is 7.5 x 10^6.

portion4096sink-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 4096 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 5 x 10^4. There are 80 text files in total, so the total sample size is 4 x 10^6.

portion8192sink-aaa contains simulation run number aaa of 2D toppling probabilities along the positive x-axis in a box with radius L = 8192 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each file is 2.5 x 10^4. There are 40 text files in total, so the total sample size is 10^6.

portionxxxsink-average contains the average toppling probabilities over all samples in 2D for a system with L = xxx with Dirichlet boundary conditions.

waves-8192-Dir-aaa contains simulation run number aaa of the 2D number of waves during an avalanche in a box with radius L = 8192 with Dirichlet boundary conditions. The top row corresponds to the number of waves = 0 (i.e., there is no avalanche), the second row corresponds to the number of waves = 1, and so on. The number of samples generated in each file is 2.5 x 10^4. There are 40 text files in total, so the total sample size is 10^6.

waves-8192-Dir-average contains the average of the number-of-waves data over all samples.

xxx.out is an example HPC out file of the simulation in 2D for a system with L = xxx with periodic boundary conditions.

xxxsink.out is an example HPC out file of the simulation in 2D for a system with L = xxx with Dirichlet boundary conditions.

3D data:

portion3d-32sink-aaa contains simulation run number aaa of 3D toppling probabilities along the positive x-axis in a box with radius L = 32 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each text file is 5 x 10^6. There are 16 text files in total, so the total sample size is 8 x 10^7.

portion3d-64sink-aaa contains simulation run number aaa of 3D toppling probabilities along the positive x-axis in a box with radius L = 64 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each text file is 2.5 x 10^6. There are 8 text files in total, so the total sample size is 2 x 10^7.

portion3d-128sink-aaa contains simulation run number aaa of 3D toppling probabilities along the positive x-axis in a box with radius L = 128 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each text file is 1 x 10^5. There are 45 text files in total, so the total sample size is 4.5 x 10^6.

portion3d-256sink-aa contains simulation run number aa of 3D toppling probabilities along the positive x-axis in a box with radius L = 256 with Dirichlet boundary conditions. The top row corresponds to x = 0. The number of samples generated in each text file is 4 x 10^4. There are 100 text files in total, so the total sample size is 4 x 10^6.

portion3d-xxxsink-average contains the average toppling probabilities over all samples in 3D for a system with L = xxx.

3d-xxxsink.out is an example HPC out file of the simulation in 3D for a system with L = xxx.

5D data:

portion5d-32m32new-aaa contains simulation run number aaa of 5D toppling probabilities along the positive x-axis in a box with radius L = 32 with Dirichlet boundary conditions. The top row corresponds to x = 0. The simulation uses hashing with m = 32. The number of samples requested in each text file is 1.5 x 10^6, with about 15 samples discarded due to a full hashtable. There are 27 text files in total, so the total sample size is about 4 x 10^7.

portion5d-32m32new-average contains the average toppling probabilities over all samples.

32D data:

probability32d-128qdvar-aaa contains simulation run number aaa of the 32D height probability at the origin for a system with L = 128.
The top row corresponds to the probability that the number of particles at the origin is 0. The number of samples generated in each text file is 2.5 x 10^5. There are 16 text files in total, so the total sample size is 4 x 10^6.

probability32d-128qdvar-average contains the average height probability over all samples.

In 5d and 32d, xxx.out is the HPC out file for the text file of the same name; it records the number of samples generated.
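
Example: recomputing an average file from the run files

The following Python sketch illustrates how an "-average" file could be recomputed from the individual run files. It is not the code used to produce the dataset. It assumes that each run file is plain whitespace-separated text with one record per row and the quantity of interest in the last column (as described above for the box4096box256sink-aaa files), and that all run files in a group contain the same number of samples, so that the unweighted mean over files equals the mean over all samples. The function names read_last_column and average_runs are hypothetical.

```python
from pathlib import Path


def read_last_column(path):
    """Return the last numeric field of every non-empty row of a text file."""
    values = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            values.append(float(fields[-1]))
    return values


def average_runs(paths):
    """Average several run files row by row.

    Assumption: every run file holds the same number of samples, so the
    unweighted mean over files equals the mean over all samples.
    """
    runs = [read_last_column(p) for p in paths]
    if not runs:
        raise ValueError("no run files given")
    n_rows = len(runs[0])
    if any(len(run) != n_rows for run in runs):
        raise ValueError("run files have different numbers of rows")
    # zip(*runs) groups the values that share a row index across all runs.
    return [sum(row) / len(runs) for row in zip(*runs)]
```

For instance, passing the 80 portion4096sink-aaa files (and not portion4096sink-average) to average_runs should, under the assumptions above, give values comparable to those in portion4096sink-average. For the 5d files, where a few samples per file were discarded, the unweighted mean is only approximate.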