Session 4 — Visualisation and basic statistics with ggplot2
Language Analytics in R
0.1 Learning objectives
By the end of this session you will:
- Build plots with ggplot2: data, aesthetics (aes()), and geoms.
- Use layers: add geoms, scales, labels, and themes.
- Choose appropriate geoms for the task (e.g. bar, point, line, boxplot, histogram).
- Add basic statistical summaries to plots (e.g. means, error bars, smooth trends).
- Customize axes, labels, and themes for clear, publishable figures.
- Apply ggplot2 to language and corpus data (e.g. frequencies, distributions, model results).
1 Grammar of graphics
- Data → Aesthetics (mapping variables to x, y, color, fill, etc.) → Geometries (how to draw).
ggplot(data, aes(x = ..., y = ...)) + geom_...()
[Add a minimal example: e.g. bar chart of word counts or scatter plot.]
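One way to fill in the template above is a bar chart of word counts. This is a minimal sketch: the `word_counts` data frame and its values are invented for illustration and do not come from any course corpus.

```r
library(ggplot2)

# Hypothetical toy data: these counts are made up for illustration only
word_counts <- data.frame(
  word = c("the", "of", "and", "to", "in"),
  n    = c(120, 75, 60, 55, 40)
)

# Data -> aesthetics -> geometry:
# map word to x and count to y, then draw one bar per row
p <- ggplot(word_counts, aes(x = reorder(word, -n), y = n)) +
  geom_col()

p  # printing the object draws the plot
```

`reorder(word, -n)` sorts the bars from most to least frequent; without it, ggplot2 orders the words alphabetically.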
2 Core geoms
- geom_bar() / geom_col(): counts vs pre-computed values.
- geom_point(): scatter plots.
- geom_line(): time series or trends.
- geom_histogram() / geom_density(): distributions.
- geom_boxplot() / geom_violin(): group comparisons.
[Add one or two code chunks per geom using course-relevant data.]
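The sketch below contrasts several of these geoms on one data set. The `sentences` data frame is simulated (hypothetical sentence lengths for two registers), so the shapes it produces only illustrate the geoms and say nothing about real language data.

```r
library(ggplot2)

set.seed(1)  # simulated data, for illustration only

# Hypothetical sentence lengths (in tokens) for two registers
sentences <- data.frame(
  register = rep(c("spoken", "written"), each = 100),
  length   = c(rpois(100, lambda = 9), rpois(100, lambda = 18))
)

# geom_bar() counts the rows itself, so only x is mapped
p_bar <- ggplot(sentences, aes(x = register)) +
  geom_bar()

# geom_col() plots values you have already computed, so x and y are mapped
mean_length <- aggregate(length ~ register, data = sentences, FUN = mean)
p_col <- ggplot(mean_length, aes(x = register, y = length)) +
  geom_col()

# geom_histogram() shows the distribution of one numeric variable
p_hist <- ggplot(sentences, aes(x = length)) +
  geom_histogram(binwidth = 2)

# geom_boxplot() compares a numeric variable across groups
p_box <- ggplot(sentences, aes(x = register, y = length)) +
  geom_boxplot()
```

Swapping `geom_boxplot()` for `geom_violin()`, or `geom_histogram()` for `geom_density()`, needs no change to the `aes()` mapping.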
3 Aesthetics and layers
- aes(): map variables to x, y, color, fill, size, shape.
- Faceting: facet_wrap(), facet_grid().
- Scales: scale_* for axes, colors, labels.
[Add an example with faceting or custom scales.]
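As a sketch of faceting combined with custom scales, the example below plots frequency against rank on log-log axes, with one panel per corpus. The `zipf` data frame, the corpus names, and all of its values are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data: frequency by rank in two made-up corpora
zipf <- data.frame(
  corpus = rep(c("news", "fiction"), each = 5),
  rank   = rep(1:5, times = 2),
  freq   = c(1000, 480, 330, 260, 200,
              800, 410, 270, 190, 160)
)

p <- ggplot(zipf, aes(x = rank, y = freq, color = corpus)) +
  geom_point() +
  geom_line() +
  facet_wrap(~ corpus) +                    # one panel per corpus
  scale_x_log10() +                         # log-scaled x axis
  scale_y_log10() +                         # log-scaled y axis
  scale_color_brewer(palette = "Dark2")     # discrete ColorBrewer palette

p
```

`facet_grid(rows ~ cols)` would be the alternative when panels should be arranged by two variables rather than one.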
4 Labels and themes
- labs(): title, subtitle, axis labels.
- theme_minimal(), theme_bw(), or custom theme().
- Saving: ggsave() or Quarto figure options.
[Add a polished plot example and ggsave.]
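A possible polished version of a word-count bar chart, followed by a `ggsave()` call, is sketched below. The data are invented, and the output path (a file in R's temporary directory), the figure size, and the resolution are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical toy data: these counts are made up for illustration only
word_counts <- data.frame(
  word = c("the", "of", "and", "to", "in"),
  n    = c(120, 75, 60, 55, 40)
)

p <- ggplot(word_counts, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "steelblue") +
  coord_flip() +                                   # horizontal bars read more easily
  labs(
    title    = "Most frequent words",
    subtitle = "Toy data for illustration",
    x        = NULL,
    y        = "Count"
  ) +
  theme_minimal(base_size = 12) +
  theme(panel.grid.major.y = element_blank())     # drop grid lines behind the bars

# Save to a temporary location; replace with a project path as needed
out_file <- file.path(tempdir(), "word_counts.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 300)
```

Passing `plot = p` explicitly avoids relying on whichever plot was displayed last, which is what `ggsave()` saves by default.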
5 Adding basic statistics to plots
- stat_summary() for means and confidence intervals/error bars.
- geom_smooth() for trend lines (e.g. linear model or loess).
- Combine group summaries with faceting to compare patterns across categories.
[Add one example with means + error bars and one example with geom_smooth().]
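The two statistical layers can be sketched as follows. The `rt` data frame is simulated (hypothetical reaction times that decrease with log word frequency), so the effect it shows is built in by construction and is not an empirical result. The error bars use `mean_se()`, which gives the mean plus or minus one standard error rather than a 95% confidence interval.

```r
library(ggplot2)

set.seed(42)  # simulated data, for illustration only

# Hypothetical reaction times (ms) as a function of log word frequency
rt <- data.frame(log_freq = runif(120, min = 0, max = 6))
rt$rt_ms <- 900 - 50 * rt$log_freq + rnorm(120, sd = 60)
rt$band  <- cut(rt$log_freq, breaks = c(0, 2, 4, 6),
                labels = c("low", "mid", "high"), include.lowest = TRUE)

# Means with error bars: one point and one error bar per frequency band
p_means <- ggplot(rt, aes(x = band, y = rt_ms)) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2)

# Trend line: linear model fit with its confidence band
p_trend <- ggplot(rt, aes(x = log_freq, y = rt_ms)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)
```

Replacing `method = "lm"` with `method = "loess"` gives a locally smoothed curve instead of a straight line, and adding `facet_wrap()` to either plot repeats the summary within each category.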
6 Hands-on exercises
- Reproduce a simple plot from the session (e.g. bar chart, boxplot).
- Build a plot from your own or course data; add labels and a theme.