This module provides an introduction (or refresher) to the essential quantitative theory that underpins modern bioinformatics. Concepts are introduced through a series of core problems, the details of which are explored in greater depth in later modules.
Quantitative topics will include:
• Linear Algebra: essential matrix-vector operations, least squares
• Probability Theory: rules of probability, conditional probability, Bayes’ rule, distributions
• Descriptive Statistics: summary statistics, visualisation
• Hypothesis Testing: Fisher's exact test, chi-square test, t-test
• Correlation and Causation: parametric and non-parametric measures
• Introduction to Statistical Modelling in the R programming language: linear models, estimation
The module combines a variety of learning formats, including interactive lectures and tutorials that explain aspects of the assessment and give feedback on them.
Learning Outcomes
By the end of the module students should be able to:
Understand essential mathematical and statistical concepts and apply the correct techniques to solve elementary data analysis problems
Correctly apply techniques for the graphical representation and visualisation of data
Perform essential statistical data analysis in a computer programming language, specifically R
Solve quantitative problems inspired by real-world bioinformatics by applying the correct mathematical and statistical techniques to biological applications
Demonstrate the qualities and transferable skills necessary for employment requiring the exercise of initiative and personal responsibility, decision making in complex and unpredictable situations, and the independent learning ability required for continuing professional development
Assessment
Open-book exam to be completed within 2 hours. Students will select a subset of questions to answer. The exam tests overall understanding of statistical concepts and their application to biological problems. (40%)
Coursework covering the implementation of statistical concepts in R, general programming (functions, loops, etc.), visualisation, and presentation of results. (40%)
Four short multiple-choice quizzes (roughly 5 to 10 minutes each). These will take place during the in-person practical sessions on the Wednesday and Friday afternoons of the two weeks. They test the memorisation of statistical concepts, which is especially important for learning in subsequent modules. (20%)