Dataset for "Timing of replication is a determinant of neutral substitution rates but does not explain slow Y chromosome evolution in rodents"

The dataset consists of intronic substitution rates (Ki) for mouse-rat orthologs, calculated using the July 2007 (mm9) mouse and November 2004 (rn4) rat assemblies, both obtained from the UCSC Table Browser. Intronic substitution rates were corrected for multiple hits according to the model of Tamura and Kumar (2002) Mol Biol Evol, 19:1727-1736. The dataset was further filtered to control for selective effects as described in the methodologies of Pink et al. (2009) Genome Biol Evol. 1:13-22, and Pink and Hurst (2010) Mol Biol Evol. 27(5): 1077-1086. Two intronic substitution rate datasets are provided. The main findings of Pink and Hurst (2010) were based on a filtered dataset, purged of all introns that had failed a test for clusters of conserved bases (potentially indicative of hidden functional sites) and were therefore thought to be evolving under purifying selection. An unfiltered dataset underpins supplementary findings. Full details of the test and other filters for selection are provided in Pink et al. (2009).

The dataset combines intronic substitution rates with mouse replication times. Replication timing data for mouse cell lines prior to differentiation were downloaded from www.replicationdomain.org (Hiratani et al. (2008) PLoS Biol, 6:e245). The four available datasets were treated as replicates: three derived from embryonic stem cells and a fourth derived from induced pluripotent stem (iPS) cells. Positive values indicate early replication and negative values indicate replication later during S-phase.
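The sign convention for replication timing values can be illustrated with a minimal Python sketch. The replicate values below are hypothetical, and averaging across the four replicates is an assumption made for illustration only; the record does not state how the replicates were combined, and the function names are not taken from the deposited scripts.

```python
from statistics import mean


def classify_timing(value):
    """Label a replication timing value using the sign convention above."""
    if value > 0:
        return "early"
    if value < 0:
        return "late"
    # The record does not define a label for a value of exactly zero.
    return "neither"


def pool_replicates(replicate_values):
    """Average one probe's values across replicates (assumed pooling method)."""
    return mean(replicate_values)


# Hypothetical values for one probe: three ES cell datasets and one iPS dataset.
replicates = [0.9, 1.1, 0.7, 0.5]
pooled = pool_replicates(replicates)
label = classify_timing(pooled)
print(pooled, label)
```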

Positions of genes on the mouse genome were defined by the terminal 5’ and 3’ base pairs of the coding sequence. These positions were obtained from annotations of the July 2007 assembly (mm9). As mouse replication timing data were assigned genomic coordinates based on the February 2006 assembly (mm8), the standalone liftOver utility and associated chain file mm9ToMm8.over.chain, both obtained from UCSC, were used to convert positions between builds. Genic replication times were then calculated by averaging the times determined for probes overlapping any part of the orthologous gene, within the limits of the coding sequence. Both means and medians are provided for each gene.
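The averaging step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the deposited scripts: the function name, the tuple layout for probes, and the example coordinates are hypothetical, and all coordinates are assumed to lie on the same chromosome and to have already been converted to the same assembly.

```python
from statistics import mean, median


def genic_replication_time(cds_start, cds_end, probes):
    """Summarise replication times of probes overlapping a coding-sequence span.

    probes is an iterable of (probe_start, probe_end, value) tuples with
    inclusive coordinates. Returns (mean, median), or None when no probe
    overlaps the span.
    """
    # Two inclusive intervals overlap when each starts before the other ends.
    values = [v for (start, end, v) in probes
              if start <= cds_end and end >= cds_start]
    if not values:
        return None
    return mean(values), median(values)


# Hypothetical probes: the first three overlap the span 120-920, the last does not.
probes = [(100, 150, 0.8), (400, 450, 0.6), (900, 950, -0.2), (2000, 2050, -1.0)]
result = genic_replication_time(120, 920, probes)
print(result)
```

Returning None for genes with no overlapping probe keeps missing data distinct from a genuine replication time of zero.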

The dataset also includes intronic GC content and extent of intronic G+T skew for each ortholog, the latter used as a proxy for germ-line expression rate. The datasets are published in the original .txt and .xls formats produced by the scripts, and in .xlsx and .csv versions for preservation purposes; all contain the variables described above. Details of methodologies are provided in the publications Pink et al. (2009) and Pink and Hurst (2010), and in the accompanying readme file. The readme file also contains details of the original sources of input data. Scripts used to process these input data and create the final datasets are also provided.
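As a rough illustration of the two sequence composition variables, the Python sketch below computes GC content and a G+T skew for a single sequence. The skew formula used here, ((G+T) - (A+C)) / (G+T+A+C), is an assumed definition chosen for illustration; the exact definition used for the dataset is given in the publications and readme file. The function names and example sequence are hypothetical.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return None
    return (seq.count("G") + seq.count("C")) / counted


def gt_skew(seq):
    """Assumed skew definition: ((G+T) - (A+C)) / (G+T+A+C)."""
    seq = seq.upper()
    gt = seq.count("G") + seq.count("T")
    ac = seq.count("A") + seq.count("C")
    if gt + ac == 0:
        return None
    return (gt - ac) / (gt + ac)


# Hypothetical intron fragment; ambiguous bases (N) are ignored by both functions.
intron = "GGTTGTACNN"
gc = gc_content(intron)
skew = gt_skew(intron)
print(gc, skew)
```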

Cite this dataset as:
Pink, C., Hurst, L., Lercher, M., 2015. Dataset for "Timing of replication is a determinant of neutral substitution rates but does not explain slow Y chromosome evolution in rodents". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-00091.

Data

mm_rn_Ki_RT … with_filter.txt
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT … with_filter.xls
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT … with_filter.csv
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT … with_filter.xlsx
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT_dataset_no_filter.txt
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT_dataset_no_filter.xls
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT_dataset_no_filter.csv
text/plain (1MB)
Creative Commons: Attribution 4.0

mm_rn_Ki_RT … no_filter.xlsx
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet (1MB)
Creative Commons: Attribution 4.0

Input_Files.zip
application/zip (23kB)
Creative Commons: Attribution 4.0

Code

Scripts_with_filter.zip
application/zip (70kB)
Creative Commons: Attribution 4.0

Scripts_no_filter.zip
application/zip (70kB)
Creative Commons: Attribution 4.0

All files available under CC-BY 4.0 licence.

Creators

Catherine Pink
University of Bath

Laurence Hurst
University of Bath

Martin Lercher
Heinrich Heine Universität

Contributors

University of Bath
Rights Holder

Documentation Files

ReadMe.txt
text/plain (12kB)

Funders

Medical Research Council (MRC)
https://doi.org/10.13039/501100000265

Publication details

Publication date: 2015
by: University of Bath

Version: 1

DOI: https://doi.org/10.15125/BATH-00091

URL for this record: https://researchdata.bath.ac.uk/id/eprint/91

Related papers and books

Pink, C. J., and Hurst, L. D., 2010. Timing of Replication Is a Determinant of Neutral Substitution Rates but Does Not Explain Slow Y Chromosome Evolution in Rodents. Molecular Biology and Evolution, 27(5), 1077-1086. Available from: https://doi.org/10.1093/molbev/msp314.

Pink, C. J., Swaminathan, S. K., Dunham, I., Rogers, J., Ward, A., and Hurst, L. D., 2009. Evidence That Replication-Associated Mutation Alone Does Not Explain Between-Chromosome Differences In Substitution Rates. Genome Biology and Evolution, 1, 13-22. Available from: https://doi.org/10.1093/gbe/evp001.

Contact information

Please contact the Research Data Service in the first instance for all matters concerning this item.

Contact person: Catherine Pink

Departments:

Life Sciences
Biology & Biochemistry