# Discovering Statistics Using R

- Andy Field - University of Sussex, UK
- Jeremy Miles - RAND Corporation, USA
- Zoë Field - University of Sussex, UK


Hot on the heels of the award-winning and best-selling *Discovering Statistics Using SPSS, Third Edition*, Andy Field has teamed up with Jeremy Miles (co-author of *Discovering Statistics Using SAS*) to write **Discovering Statistics Using R**. Keeping the uniquely humorous and self-deprecating style that has made students across the world fall in love with Andy Field's books, **Discovering Statistics Using R** takes students on a journey of statistical discovery using *R*, a free, flexible and dynamically changing software tool for data analysis that is becoming increasingly popular across the social and behavioral sciences throughout the world.

The journey begins by explaining basic statistical and research concepts before a guided tour of the *R* software environment. Next, readers discover the importance of exploring and graphing data before moving on to the statistical tests that are the foundations of the rest of the book (for example, correlation and regression). Readers will then stride confidently into intermediate-level analyses such as ANOVA, before ending their journey with advanced techniques such as MANOVA and multilevel models. Although there is enough theory to help the reader gain the necessary conceptual understanding of what they're doing, the emphasis is on *applying* what's learned to playful and real-world examples that should make the experience more fun than expected.

Like its sister textbooks, *Discovering Statistics Using R* is written in an irreverent style and follows the same ground-breaking structure and pedagogical approach. The core material is augmented by a cast of characters to help the reader on their way, hundreds of examples, self-assessment tests to consolidate knowledge, and additional website material for those wanting to learn more (at www.uk.sagepub.com/dsur/).

Given this book's accessibility, fun spirit, and use of bizarre real-world research, it should be essential for *anyone* wanting to learn about statistics using the freely available *R* software.

What will this chapter tell me?

What the hell am I doing here? I don't belong here

Initial observation: finding something that needs explaining

Generating theories and testing them

Data collection 1: what to measure

Data collection 2: how to measure

Analysing data

What have I discovered about statistics?

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Building statistical models

Populations and samples

Simple statistical models

Going beyond the data

Using statistical models to test research questions

What have I discovered about statistics?

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Before you start

Getting started

Using R

Getting data into R

Entering data with R Commander

Using other software to enter and edit data

Saving data

Manipulating data

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

What will this chapter tell me?

The art of presenting data

Packages used in this chapter

Introducing ggplot2

Graphing relationships: the scatterplot

Histograms: a good way to spot obvious problems

Boxplots (box-whisker diagrams)

Density plots

Graphing means

Themes and options

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

What are assumptions?

Assumptions of parametric data

Packages used in this chapter

The assumption of normality

Testing whether a distribution is normal

Testing for homogeneity of variance

Correcting problems in the data

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

What will this chapter tell me?

Looking at relationships

How do we measure relationships?

Data entry for correlation analysis

Bivariate correlation

Partial correlation

Comparing correlations

Calculating the effect size

How to report correlation coefficients

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

What will this chapter tell me?

An introduction to regression

Packages used in this chapter

General procedure for regression in R

Interpreting a simple regression

Multiple regression: the basics

How accurate is my regression model?

How to do multiple regression using R Commander and R

Testing the accuracy of your regression model

Robust regression: bootstrapping

How to report multiple regression

Categorical predictors and multiple regression

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Background to logistic regression

What are the principles behind logistic regression?

Assumptions and things that can go wrong

Packages used in this chapter

Binary logistic regression: an example that will make you feel eel

How to report logistic regression

Testing assumptions: another example

Predicting several categories: multinomial logistic regression

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Packages used in this chapter

Looking at differences

The t-test

The independent t-test

The dependent t-test

Between groups or repeated measures?

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

The theory behind ANOVA

Assumptions of ANOVA

Planned contrasts

Post hoc procedures

One-way ANOVA using R

Calculating the effect size

Reporting results from one-way independent ANOVA

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

What is ANCOVA?

Assumptions and issues in ANCOVA

ANCOVA using R

Robust ANCOVA

Calculating the effect size

Reporting results

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Theory of factorial ANOVA (independent design)

Factorial ANOVA as regression

Two-way ANOVA: behind the scenes

Factorial ANOVA using R

Interpreting interaction graphs

Robust factorial ANOVA

Calculating effect sizes

Reporting the results of two-way ANOVA

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Introduction to repeated-measures designs

Theory of one-way repeated-measures ANOVA

One-way repeated-measures designs using R

Effect sizes for repeated-measures designs

Reporting one-way repeated-measures designs

Factorial repeated-measures designs

Effect sizes for factorial repeated-measures designs

Reporting the results from factorial repeated-measures designs

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Mixed designs

What do men and women look for in a partner?

Entering and exploring your data

Mixed ANOVA

Mixed designs as a GLM

Calculating effect sizes

Reporting the results of mixed ANOVA

Robust analysis for mixed designs

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

When to use non-parametric tests

Packages used in this chapter

Comparing two independent conditions: the Wilcoxon rank-sum test

Comparing two related conditions: the Wilcoxon signed-rank test

Differences between several independent groups: the Kruskal-Wallis test

Differences between several related groups: Friedman's ANOVA

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

When to use MANOVA

Introduction: similarities and differences to ANOVA

Theory of MANOVA

Practical issues when conducting MANOVA

MANOVA using R

Robust MANOVA

Reporting results from MANOVA

Following up MANOVA with discriminant analysis

Reporting results from discriminant analysis

Some final remarks

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

When to use factor analysis

Factors

Research example

Running the analysis with R Commander

Running the analysis with R

Factor scores

How to report factor analysis

Reliability analysis

Reporting reliability analysis

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Packages used in this chapter

Analysing categorical data

Theory of analysing categorical data

Assumptions of the chi-square test

Doing the chi-square test using R

Several categorical variables: loglinear analysis

Assumptions in loglinear analysis

Loglinear analysis using R

Following up loglinear analysis

Effect sizes in loglinear analysis

Reporting the results of loglinear analysis

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

What will this chapter tell me?

Hierarchical data

Theory of multilevel linear models

The multilevel model

Some practical issues

Multilevel modelling on R

Growth models

How to report a multilevel model

What have I discovered about statistics?

R packages used in this chapter

R functions used in this chapter

Key terms that I've discovered

Smart Alex's tasks

Further reading

Interesting real research

Appendix

Table of the standard normal distribution

Critical values of the t-distribution

Critical values of the F-distribution

Critical values of the chi-square distribution

### Supplements

In statistics, R is the way of the future. The big boys and girls have known this for some time: There are now millions of R users in academia and industry. R is free (as in no cost) and free (as in speech). Andy, Jeremy, and Zoe's book now makes R accessible to the little boys and girls like me and my students. Soon all classes in statistics will be taught in R.

I have been teaching R to psychologists for several years and so I have been waiting for this book for some time. The book is excellent, and it is now the course text for all my statistics classes. I'm pretty sure the book provides all you need to go from statistical novice to working researcher.

Take, for example, the chapter on t-tests. The chapter explains how to compare the means of two groups from scratch. It explains the logic behind the tests, it explains how to do the tests in R with a complete worked example, which papers to read in the unlikely event you do need to go further, and it explains what you need to write in your practical report or paper. But it also goes further, and explains how t-tests and regression are related (and are really the same thing) as part of the general linear model. So this book offers not just the step-by-step guidance needed to complete a particular test, but it also offers the chance to reach the zen state of total statistical understanding.

Prof. Neil Stewart

Warwick University

Field's Discovering Statistics is popular with students for making a sometimes deemed inaccessible topic accessible, in a fun way. In Discovering Statistics Using R, the authors have managed to do this using a statistics package that is known to be powerful, but sometimes deemed just as inaccessible to the uninitiated, all the while staying true to Field's off-kilter approach.

Dr Marcel van Egmond

University of Amsterdam

Probably the wittiest and most amusing of the lot (no, really), this book takes yet another approach: it is 958 pages of R-based stats wisdom (plus online accoutrements)... A thoroughly engaging, expansive, thoughtful and complete guide to modern statistics. Self-deprecating stories lighten the tone, and the undergrad-orientated 'stupid faces' (Brian Haemorrhage, Jane Superbrain, Oliver Twisted, etc.) soon stop feeling like a gimmick, and help to break up the text with useful snippets of stats wisdom. It is very much a student textbook but it is brilliant... Field et al. is the complete package.

David M. Shuker

Journal of Animal Behaviour

"*This work should be in the library of every institution where statistics is taught. It contains much more content than what is required for a beginning or advanced undergraduate course, but instructors for such courses would do well to consider this book; it is priced comparably to books which contain only basic material, and students who are fascinated by the subject may find the additional material a real bonus. The book would also be very good for self-study. Overall, an excellent resource*."

**Northern Michigan University**

**Choice**

The main strength of this book is that it presents a lot of information in an accessible, engaging and irreverent way. The style is informal with interesting excursions into the history of statistics and psychology. There is reference to research papers which illustrate the methods explained, and are also very entertaining. The authors manage to pull off the Herculean task of teaching statistics through the medium of R... All in all, an invaluable resource.

**Research Officer, Praxis Care, Belfast**

This textbook is easy to understand, and it is written in a humorous way. It has all the information needed for an easy comprehension of statistical intricacies. The book also comes with a vast database that can be used by students to test their knowledge of the material, and/or how to operate R.

**Business Administration Dept, Southwestern Adventist Univ**

As I'm sure you know, R does more sophisticated statistical techniques for free that mainstream packages do not and so is potentially useful to the doctoral students I teach and supervise. An intermediate step is to use the R links in Field's SPSS book but in the end it is necessary to get to grips with R and this book is a relatively painless way of doing this (especially if you and/or your students are familiar with Field's SPSS book).

**School of Psychology, Surrey University**

Discovering Statistics using R is an excellent book to engage students in learning statistics using top of the line software. The content is presented in a clear and coherent way, and the exercises help reinforce and consolidate knowledge in quite a funny way. It is great material for teaching and learning, but also a handy reference book for researchers.

**Facultad de Ciencias Politicas, National Autonomous Univ of Mexico**

As open source software with a vast supporting community, R is increasingly adopted by researchers. Discovering Statistics Using R allows a soft transition from other statistical software to this open source alternative.

**Psychology, King's College London**

Clear and easy to use as an alternative to using SPSS for my psychology students

**Spectrum Centre for Mental Health Research, Division of Health Research, Lancaster University**

This book covers the material we need, with plenty of exercises and accessible explanations. Most importantly, it describes and teaches the R statistics platform integrated with the rest of the text.

**Mathematics Dept, CUNY College of Staten Island**

This textbook is quite thorough, but the overall style of the writing would unfortunately not land very well with my California audience.

**Social Sciences Division, Mount Saint Mary College**