Rainfall

I grew up in Fife and now live on the outskirts of Glasgow. And maybe it’s just the ‘lovely’ July weather we’ve been experiencing, or maybe it’s just that I’m getting older and have a selective memory for the past, but I don’t recall summers being quite so wet when I was growing up!

I thought I’d take the opportunity to explore the data and play around with some visualisation…

Data from Met Office

The Met Office provides historic data from weather stations across the UK and the stations at Leuchars (in Fife) and Paisley (just to the west of Glasgow, although I live just to the south) have data going back to the 1950s.

library(tidyverse)
library(stringr)
library(viridis)
library(hrbrthemes)
#Data is a bit messy in txt format (although they do have a help page on how to use the Excel 'text to columns wizard' to import the data...!)
leucharsRawData <- 
  readr::read_lines("http://www.metoffice.gov.uk/pub/data/weather/uk/climate/stationdata/leucharsdata.txt", skip = 5) %>% #skip first five lines of explanations
  stringr::str_trim() %>% #remove whitespace
  stringr::str_split(stringr::boundary("word")) %>% #split lines into columns
  purrr::discard(~("Provisional" %in% .x)) #Remove recent provisional results
paisleyRawData <- 
  readr::read_lines("http://www.metoffice.gov.uk/pub/data/weather/uk/climate/stationdata/paisleydata.txt", skip = 5) %>%
  stringr::str_trim() %>% #remove whitespace
  stringr::str_split(stringr::boundary("word")) %>% #split lines into columns
  purrr::discard(~("Provisional" %in% .x)) #Remove recent provisional results

#Check that column names are the same
stopifnot(leucharsRawData[[1]] == paisleyRawData[[1]])

#Note that 2nd row of each resulting list is an unhelpful second row of column headers
leucharsData <- 
  tail(leucharsRawData, -2) %>%
   purrr::map_df(~set_names(as.list(as.numeric(.x)), leucharsRawData[[1]])) %>%
  dplyr::mutate(dataDate = lubridate::dmy(paste(01, mm, yyyy, sep = "_")))
paisleyData <- 
  tail(paisleyRawData, -2) %>%
   purrr::map_df(~set_names(as.list(as.numeric(.x)), paisleyRawData[[1]])) %>%
  dplyr::mutate(dataDate = lubridate::dmy(paste(01, mm, yyyy, sep = "_")))

allData <- 
  list(Leuchars = leucharsData, Paisley = paisleyData) %>%
  dplyr::bind_rows(.id = "station") %>%
  dplyr::filter(
    dataDate >= max(min(leucharsData$dataDate), min(paisleyData$dataDate)))

The combined data goes back to 1959 and as a first pass, it might be helpful just to compare the total rainfall each year:

allData %>%
  group_by(station, yyyy) %>%
  dplyr::summarise(annualRainfall = sum(rain)) %>%
  ggplot() +
  hrbrthemes::theme_ipsum() +
  geom_line(
    aes(x = yyyy, y = annualRainfall, colour = forcats::as_factor(station)),
    size = 1.5) +
  scale_color_viridis(name = NULL, discrete = TRUE) +
  scale_y_comma(limits = c(0, NA), breaks = scales::pretty_breaks(n = 8)) +
  labs(
    subtitle = "Analysis of annual rainfall, comparing east to west",
    caption = "Data sourced from http://www.metoffice.gov.uk/pub/data/weather/uk/climate/stationdata/"
  ) +
  xlab(NULL) + ylab("Annual rainfall (mm)") +
  theme(legend.position = c(0.9, 0.2))

Well, that looks pretty clear. I really should be expecting a wetter time in the West of Scotland (nearly twice as much now, near Paisley, as I experienced as a child in the 80s).

allData %>%
  ggplot() +
  geom_tile(
    aes(x = forcats::as_factor(as.character(mm)), 
      y = forcats::fct_rev(forcats::as_factor(as.character(yyyy))), 
      fill = rain)
  ) +
  scale_fill_viridis(name = "Rainfall (mm)") + 
  scale_y_discrete(breaks = scales::pretty_breaks(n = 10)) +
  facet_wrap(~station, nrow = 1) +
  theme_minimal() +
  xlab("Months") + ylab("Years") +
  labs(
    subtitle = "Analysis of monthly rainfall over time, comparing East to West",
    caption = "Data sourced from http://www.metoffice.gov.uk/pub/data/weather/uk/climate/stationdata/"
  )

Overall it looks wetter in Paisley, predominantly based on the Autumn and Winter months. Hard to tell much about the summer months though.

If we filter the data to only look at June, July, and August…

allData %>%
  filter(mm %in% c(6, 7, 8)) %>%
  dplyr::mutate(yyyy = forcats::fct_rev(forcats::as_factor(as.character(yyyy)))) %>%
  ggplot() +
  geom_tile(
    aes(x = forcats::as_factor(as.character(mm)), 
      y = yyyy, fill = rain)) +
  scale_fill_viridis(name = "Rainfall (mm)") + 
  scale_y_discrete(breaks = scales::pretty_breaks(n = 10)) +
  facet_wrap(~station, nrow = 1) +
  theme_minimal() +
  xlab("Months") + ylab("Years") +
  labs(
    subtitle = "Analysis of summer rainfall over time, comparing East to West",
    caption = "Data sourced from http://www.metoffice.gov.uk/pub/data/weather/uk/climate/stationdata/"
  )

Whilst there doesn’t appear to be too much difference over the last few years in July, it’s still a lot wetter over in the West now than it was when I was growing up!