Topic outline

  • Introduction to self-help resources

    This is a curated collection of the best tutorials, websites, videos and books available for self-help in statistical analysis and experimental design. I have focused on collecting resources which will be relevant to undergraduates and taught postgraduates in SBBS, but research students and staff will also find resources to help them. Where resources focus on implementation in software, I have selected resources which use R because it is open-source and widely used.

    Where an item is particularly recommended for an SBBS module, it is marked with the module code.

    You don't have to be subscribed to the QMPlus page to view this module, so feel free to share the link with other students and colleagues.

    I always welcome new suggestions of items to add to the list from undergraduates, postgraduates and staff. Please contact me on j.e.littlefair [at] qmul.ac.uk to add a resource.  You can also contact me if you are a member of staff and you would like me to add your module codes next to resources for your students (please specify which resources).

  • General resources

    Biostatistical Design and Analysis Using R: A Practical Guide by Murray Logan. Covers introductory statistical principles, plotting graphics in base R, correlation, t-tests, ANOVA, linear modelling, generalized linear modelling, and the implementation of all this in R. Clear step-by-step examples of how to perform and interpret the different models. Available as an ebook from QMUL library. [BIO209, BIO782P]

    Experimental design and data analysis for biologists by Quinn & Keough. Covers the basics of the purpose of statistics, descriptive statistics, correlation, linear modelling and basic ordination techniques. Available as an ebook from QMUL library.

    Design and analysis of ecological experiments by Scheiner & Gurevitch. A more complex introduction to hypothesis testing, ANOVA, and linear modelling. Also covers some more advanced tests such as logistic regression, path analysis, time series analysis, meta-analysis and spatial statistics. Available as an ebook from QMUL library.

    Our Coding Club provides a comprehensive set of online tutorials covering data manipulation, visualisation and basic to advanced analysis in R. It is aimed at ecological and environmental data scientists, and you can also choose to do your own mini-course by following recommended streams (Stats from Scratch, Wiz of Data Viz, Mastering Modelling).

    Recommended for BIO209: Intro to R Part I and II, Coding etiquette.

    Recommended for BIO782P: Intro to R Part I and II, Basic data manipulation, Introduction to Github for version control, Coding etiquette, From distributions to linear models, browse others as interested (e.g. modelling tutorials, visualisation, reproducible research).

    Quebec Centre for Biodiversity Science provides a comprehensive set of online tutorials at an advanced and fast-paced level. Scroll down the wiki and click on the titles to see follow-along tutorials, each containing scripts, practice datasets, instructions and a Prezi presentation. Themes include introductory R (although this is a reasonably advanced introduction), visualisation, linear modelling, generalised additive models, mixed effects models of different types, and multivariate analyses. Probably best suited to environmental/ecological postgraduates and above with some existing experience. Available in English or French!

    The R Book by Mick Crawley, Imperial College. Touted as the definitive R bible. The vast length of this book means that it has been somewhat superseded by the plethora of more accessible online tutorials. However, for breadth of coverage it is unmatched by any other resource. Comprehensible conceptual explanations of statistics as well as implementation in R. Available as an ebook from QMUL library. [BIO209, BIO782P]


  • Visualisation

    A basic introduction to visualisation using the ggplot2 package, including histograms, boxplots, bar graphs, scatterplots and plot customisation. Includes a mixture of videos to watch and commands to type into your own RStudio session. By the R Ladies Sydney group. [BIO209]

    The R graph catalogue Shiny app by Joanna Zhao and Jennifer Bryan. Great interactive app which will help you to choose a graph according to what type you need (you can also filter by "good" and "not recommended" formats!). Once you have chosen your graph, you can click on it to copy and paste the code. [BIO209, BIO782P]

    An introduction to visualisation. Takes you through plotting and customising a scatterplot using ggplot2 and tidyverse. Includes commands to type into your own RStudio session and exercises to try (although no answers to those exercises!). By the R for Data Science team: Garrett Grolemund and Hadley Wickham.

    The R graph gallery. Has some novel ideas for visualisation and can help you choose a graph type.

  • t-tests

    Simple introductions to one-sample and two-sample t-tests by the STHDA website. Focuses on the implementation of tests in R with commands that you can copy into your own R script. [BIO209]

    Robustness and power of the t-test Shiny app by Michael Whitlock of UBC. Nice little interactive app which demonstrates how means, standard deviations and sample sizes affect test power and type I error rates. [BIO209, BIO782P]
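
    The relationship between sample size, power and type I error can also be explored directly in R. The following is a minimal sketch using base R only; the effect size (1 unit), standard deviation (2) and sample sizes are hypothetical values chosen for illustration and are not taken from any of the resources above.

```r
# Power of a two-sample t-test for a hypothetical true difference of 1 unit
# and a standard deviation of 2, at several sample sizes per group.
sample_sizes <- c(10, 20, 50, 100)
power_by_n <- sapply(sample_sizes, function(n) {
  power.t.test(n = n, delta = 1, sd = 2, sig.level = 0.05)$power
})
print(data.frame(n_per_group = sample_sizes, power = round(power_by_n, 2)))

# Simulated type I error rate: both samples come from the same population,
# so roughly 5% of tests should be significant at the 0.05 level.
set.seed(1)
p_values <- replicate(2000, t.test(rnorm(20), rnorm(20))$p.value)
type_I_rate <- mean(p_values < 0.05)
print(type_I_rate)
```

    Power rises with sample size for a fixed effect size, while the type I error rate stays close to the chosen significance level when the test's assumptions hold.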

  • ANOVA

    Quebec Centre for Biodiversity Science linear modelling tutorial. The second part of this follow-along website tutorial covers the implementation of ANOVA in R (scroll down to section 3). Good coverage of one-way and two-way ANOVA, ANCOVA and testing the assumptions. Tutorial contains instructions, scripts, practice datasets, challenges and a Prezi presentation. Also available in French. [BIO782P]

    Introductory R by Rob Knell (available from the course QMPlus pages) has one of the best step-by-step run-throughs for understanding one-way ANOVA, and building on that with two-way ANOVA and interactions. Skip to Chapter 12, p. 258. [BIO209, BIO782P]
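
    For orientation before starting these tutorials, a one-way ANOVA with basic assumption checks might look like the sketch below. It uses PlantGrowth, a dataset built into R, as a stand-in; neither resource above necessarily uses this dataset or this exact sequence of checks.

```r
# PlantGrowth (built into R): dried plant weight for a control and two treatments.
fit <- aov(weight ~ group, data = PlantGrowth)

# F test for a difference among the three group means.
anova_table <- summary(fit)
print(anova_table)

# Pairwise comparisons between groups, adjusted for multiple testing.
print(TukeyHSD(fit))

# Assumption checks: normality of residuals and homogeneity of variances.
normality <- shapiro.test(residuals(fit))
variances <- bartlett.test(weight ~ group, data = PlantGrowth)
print(normality$p.value)
print(variances$p.value)
```

    Large p-values from the two assumption tests give no evidence against normality or equal variances; residual plots are a useful complement to these formal tests.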

  • Linear modelling

    Linear models in R: understanding the coefficients table for regression and simple linear models by SBCS's Dr Rob Knell. YouTube video covering the visualisation and model-fitting of linear regressions and ANCOVA-style models, and how to interpret their model coefficients. [BIO782P]

    Quebec Centre for Biodiversity Science linear modelling tutorial. Follow-along website tutorial covering implementation of linear models in R. Good coverage of background information, testing assumptions and data transformation if assumptions are violated. Tutorial contains instructions, scripts, practice datasets, challenges and a Prezi presentation. Also available in French. [BIO782P]

    Shiny app to explain linear regression residuals by Michael Whitlock of UBC. Very clear little app to click through. [BIO209, BIO782P]
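
    As a companion to these resources, the sketch below fits a simple linear regression and extracts the coefficients table and residuals. It uses cars, a dataset built into R, purely as an illustration; it is not the data used in the video, tutorial or app above.

```r
# cars (built into R): stopping distance (dist) against speed.
fit <- lm(dist ~ speed, data = cars)

# Coefficients table: estimate, standard error, t value and p-value
# for the intercept and the slope.
coef_table <- coef(summary(fit))
print(coef_table)

# A residual is the observed value minus the value fitted by the model.
res <- residuals(fit)
manual_res <- cars$dist - fitted(fit)
print(all.equal(unname(res), unname(manual_res)))

# With an intercept in the model, least-squares residuals sum to
# (numerically) zero.
print(sum(res))
```

    Calling plot(fitted(fit), res) afterwards draws the residuals against the fitted values, which is the usual first check for non-linearity or unequal variance.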

  • Experimental design

    Practical Field Ecology: A project guide by Wheater, Bell & Cook. Introduces experimental designs for common ecological questions, how to monitor site characteristics, and different sampling techniques arranged by popular taxonomic groups. Available as an ebook from the QMUL library. [BIO209]

    Statistical Bioinformatics: For Biomedical and Life Sciences Researchers, edited by Jae Lee. An advanced discussion of some of the unique problems faced by those designing and analysing experiments with high-throughput data. Includes data quality control, statistical testing for large biological datasets, supervised and unsupervised learning, multidimensional analysis, network analysis, resampling techniques, and analysing GWAS. Available as an ebook from the QMUL library. [BIO782P]


  • Ordination techniques

    Ordination using non-metric multidimensional scaling by SBCS's Dr Rob Knell. A YouTube video approaching NMDS from a community ecology angle, covering interpretation and implementation in R.

    Ordination methods for ecologists by Mike Palmer, Oklahoma State University. Implementation in CANOCO but good conceptual explanations of statistical concepts.

    Quebec Centre for Biodiversity Science tutorial on multivariate analysis and advanced multivariate analysis. A fairly advanced run-through of data exploration, association analyses, unconstrained ordination, and canonical analyses. Tutorial contains instructions, scripts, practice datasets, challenges and a Prezi presentation. Suitable for ecological and environmental postgraduates and above. Also available in French.

  • Other statistical approaches

    Biomedical Data Science analytical pipeline by Rafael Irizarry & Michael Love. Website featuring a comprehensive take on fairly advanced analysis of biomedical data including basic modelling, dimension reduction, machine learning, genomic annotation, and hypothesis-testing and visualisation with genomic data. Features commands in R & Bioconductor to copy into your own scripts, plus exercises to perform.

    Guessing correlation coefficients Shiny app by Michael Whitlock of UBC. An online interactive app which will help you to develop an intuitive sense of how strong correlation coefficients are.
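
    A similar intuition can be built in R by simulating data with a known correlation and comparing it with the sample estimate. This is a minimal sketch in base R; the sample size and target correlations are arbitrary choices for illustration and are unrelated to how the app itself works.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
target_r <- c(0.2, 0.5, 0.9)

# For standard normal x and independent noise e, the variable
# y = r * x + sqrt(1 - r^2) * e has population correlation r with x.
observed_r <- sapply(target_r, function(r) {
  y <- r * x + sqrt(1 - r^2) * rnorm(n)
  cor(x, y)
})

print(data.frame(target = target_r, observed = round(observed_r, 2)))
```

    Plotting y against x for each target value shows how diffuse a scatterplot with a correlation of 0.2 looks compared with one of 0.9.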

  • Reproducible research

    Happy Git with R by Jenny Bryan and Jim Hester. Comprehensive website tutorial that explains how to integrate RStudio with GitHub for version-controlled, reproducible, collaborative research. Step-by-step guide containing commands that you can copy and run yourself.