Exploring the route of this year's ‘Deutschland Tour’ with R
The “Deutschland Tour” is a major road cycling race in Germany. It is nowhere near as big as the “Tour de France”, but maybe it will get there :).
This year's “Deutschland Tour” is a special one: it passes near my hometown, so I am wondering where a good spot for spectators would be. For me, these key points are important:
To answer this question, I will use my R skills to import and visualise the data.
In this analysis the following libraries are used:
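Judging from the functions called in the rest of this post, a session needs roughly the following packages attached. This is an inferred set, not necessarily the exact list used for the original analysis:

```r
# Inferred from the calls used further down; not necessarily the original list.
library(tidyverse)  # tibble(), mutate(), across(), parse_number(), ggplot()
library(httr2)      # request(), req_perform(), resp_body_string()
library(rvest)      # read_html(), html_elements(), html_attr(), html_text()
library(sf)         # st_combine(), st_cast()
library(ggspatial)  # annotation_map_tile(), layer_spatial()
```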
If you want to reproduce this analysis, you have to perform the following steps:
renv::restore() (Ushey and Wickham 2024)
targets::tar_make() (Landau 2021)
Alternatively, you can run this analysis by copying and executing it chunk by chunk in your R session (after installing the above-mentioned packages manually).
First, define the URLs from which the GPX files can be read:
gpx_url <- c("https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E1_SW-HN_177km_inklneutral.gpx",
"https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E2_HN-GD_173km_inklneutral.gpx",
"https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E3_GD-VS_211km_inklneutral.gpx",
"https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E4_Annw-SB_182km_inklneutral.gpx")
Define a helper function that reads in the stage data using ‘httr2’ (Wickham 2024). It downloads the GPX file, parses the response body with read_html(), and selects the elements representing track points using a CSS selector.
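stage <- function(gpx_url, css_track_point) {
  # Download the GPX file and keep the response body as a string
  resp <- req_perform(request(gpx_url))
  # Parse the body and select the track point elements
  gpx_trackpoints <- resp_body_string(resp) |>
    read_html() |>
    html_elements(css_track_point)
  # One row per track point; all columns are still character at this stage
  tibble(
    lat = html_attr(gpx_trackpoints, "lat"),
    lon = html_attr(gpx_trackpoints, "lon"),
    elevation = html_text(gpx_trackpoints)
  )
}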
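One thing to keep in mind: html_text() on a track point returns the text of all its child elements. That is fine as long as the elevation is the only (or first) child, but if a GPX file also carried, say, a time element, the texts would run together. A minimal sketch of a more defensive parsing step, reading the elevation from the ele child directly. The function name parse_trackpoints() and the two track points are made up for illustration, and I have not checked whether the tour files actually contain further child elements:

```r
library(rvest)
library(tibble)

# Two made-up track points in GPX form (illustrative only, not tour data).
gpx_demo <- '<gpx><trk><trkseg>
  <trkpt lat="50.04494" lon="10.23479"><ele>238.1</ele></trkpt>
  <trkpt lat="50.04476" lon="10.23523"><ele>236.4</ele></trkpt>
</trkseg></trk></gpx>'

# Hypothetical variant of the parsing step: take the elevation from the
# <ele> child instead of from the full text of the track point.
parse_trackpoints <- function(gpx_text, css_track_point = "trkpt") {
  trackpoints <- html_elements(read_html(gpx_text), css_track_point)
  tibble(
    lat = html_attr(trackpoints, "lat"),
    lon = html_attr(trackpoints, "lon"),
    # html_element() returns exactly one match (or a missing node) per track point
    elevation = html_text(html_element(trackpoints, "ele"))
  )
}

demo_points <- parse_trackpoints(gpx_demo)
```

The result has the same three character columns as the helper above, so the later steps would not need to change.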
Define the CSS selector and apply the stage() helper to all URLs, resulting in one final data frame:
css_track_point <- "trkpt"
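The step that combines the per-stage results into df_stages could look roughly like this. It is a sketch under assumptions: stage_demo() is a hypothetical offline stand-in for stage() that returns made-up values, the file names are placeholders, and using purrr's map() together with dplyr's bind_rows() is my choice rather than necessarily what the original pipeline does:

```r
library(purrr)
library(dplyr)
library(tibble)

# Hypothetical offline stand-in for stage(): same arguments, same character
# columns, but made-up values instead of a download.
stage_demo <- function(gpx_url, css_track_point) {
  tibble(
    lat = c("50.04494", "50.04476"),
    lon = c("10.23479", "10.23523"),
    elevation = c("238.1", "236.4")
  )
}

gpx_url_demo <- c("stage-1.gpx", "stage-2.gpx")  # placeholders, not real URLs
css_track_point <- "trkpt"

# One data frame per URL, then stacked; .id records the position in the
# list as a character column, which serves as the stage number.
df_stages_demo <- map(gpx_url_demo, \(url) stage_demo(url, css_track_point)) |>
  bind_rows(.id = "stage_id")
```

With the real helper, stage_demo() and gpx_url_demo would be swapped for stage() and gpx_url.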
Convert the relevant columns to numeric values:
df_stages_pro <- mutate(df_stages, across(c(lon, lat, elevation), function(x) parse_number(x)))
Turn the data frame into an sf (Pebesma 2018) object:
Simple feature collection with 40834 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS: WGS 84
# A tibble: 40,834 × 3
   stage_id elevation            geometry
 * <chr>        <dbl>         <POINT [°]>
 1 1             238. (10.23479 50.04494)
 2 1             236. (10.23523 50.04476)
 3 1             236. (10.23523 50.04476)
 4 1             233  (10.23582 50.04443)
 5 1             233  (10.23582 50.04443)
 6 1             233. (10.23637 50.04409)
 7 1             233. (10.23637 50.04409)
 8 1             227. (10.23682 50.04377)
 9 1             227. (10.23682 50.04377)
10 1             228. (10.23678 50.04381)
# ℹ 40,824 more rows
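The printed object above is consistent with a conversion via st_as_sf(), using lon and lat as coordinates and WGS 84 (EPSG:4326) as the CRS. A minimal sketch of that step on made-up rows; the exact call used for sf_stages is an assumption:

```r
library(sf)
library(tibble)

# Made-up rows in the shape of df_stages_pro (numeric lon/lat/elevation).
df_demo <- tibble(
  stage_id = c("1", "1", "2"),
  lon = c(10.23479, 10.23523, 9.21813),
  lat = c(50.04494, 50.04476, 49.14150),
  elevation = c(238, 236, 160)
)

# coords is given as x (longitude) first, then y (latitude);
# EPSG:4326 is the code for WGS 84.
sf_demo <- st_as_sf(df_demo, coords = c("lon", "lat"), crs = 4326)
```

The lon and lat columns are consumed into the geometry column, which matches the "2 fields" (stage_id and elevation) reported in the output.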
At the moment, the spatial data is represented as individual points. Summarise the points per stage, combining them into one ‘multipoint’ geometry per stage, so that the result has one row per stage:
sf_stages_multipoint <- summarise(sf_stages, geometry = st_combine(geometry),
                                  .by = stage_id)
Simple feature collection with 4 features and 1 field
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS: WGS 84
# A tibble: 4 × 2
  stage_id                                                      geometry
  <chr>                                                 <MULTIPOINT [°]>
1 1        ((10.23479 50.04494), (10.23523 50.04476), (10.23523 50.04…
2 2        ((9.21813 49.1415), (9.21785 49.14149), (9.21706 49.14149)…
3 3        ((9.79703 48.80134), (9.79703 48.80133), (9.7971 48.80137)…
4 4        ((7.96211 49.20398), (7.96207 49.2041), (7.96204 49.20418)…
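st_combine() is a good fit here because it simply collects the points in the order they arrive and keeps duplicates, and the order is what will later define the course of the line. A small sketch on made-up points (deliberately not sorted by longitude, with one duplicate) to illustrate this:

```r
library(sf)

# Made-up points, listed in "riding order" rather than sorted by longitude;
# the last point is repeated on purpose.
pts_demo <- st_sfc(
  st_point(c(10.3, 50.0)),
  st_point(c(10.1, 50.0)),
  st_point(c(10.2, 50.0)),
  st_point(c(10.2, 50.0)),
  crs = 4326
)

# Combine into a single MULTIPOINT without dissolving or reordering anything.
mp_demo <- st_combine(pts_demo)

# Longitudes of the combined geometry, in stored order.
lon_order <- unname(st_coordinates(mp_demo)[, "X"])
```

Here lon_order has the same four values in the same order as the input points.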
Cast the multipoints into lines:
sf_stages_line <- st_cast(sf_stages_multipoint, "LINESTRING")
Simple feature collection with 4 features and 1 field
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS: WGS 84
# A tibble: 4 × 2
  stage_id                                                      geometry
  <chr>                                                 <LINESTRING [°]>
1 1        (10.23479 50.04494, 10.23523 50.04476, 10.23523 50.04476, …
2 2        (9.21813 49.1415, 9.21785 49.14149, 9.21706 49.14149, 9.21…
3 3        (9.79703 48.80134, 9.79703 48.80133, 9.7971 48.80137, 9.79…
4 4        (7.96211 49.20398, 7.96207 49.2041, 7.96204 49.20418, 7.96…
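A quick plausibility check after casting is to compute the length of each line and compare it with the distance given in the GPX file names (for example "177km" for the first stage). The sketch below shows the idea on a made-up three-point geometry; applying st_length() to sf_stages_line in the same way is a suggestion of mine and not part of the original analysis:

```r
library(sf)

# Made-up multipoint with three vertices (illustrative only, not tour data).
mp_demo <- st_sfc(
  st_multipoint(rbind(c(10.0, 50.0), c(10.1, 50.0), c(10.1, 50.1))),
  crs = 4326
)

# Casting keeps the vertex order and connects consecutive points.
line_demo <- st_cast(mp_demo, "LINESTRING")

# For geographic coordinates st_length() returns metres; convert to km.
length_km <- as.numeric(st_length(line_demo)) / 1000
```

For the real data, a result far away from the advertised stage distance would hint at points that are missing or out of order.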
We can now plot the data using familiar ‘tidyverse’ (Wickham et al. 2019) techniques. To include an underlying map, ‘ggspatial’ (Dunnington 2023) is used.
vis_stages_line <- function(sf_stages_line) {
  ggplot() +
    annotation_map_tile(zoom = 8, type = "cartolight") +
    layer_spatial(sf_stages_line, aes(color = stage_id)) +
    theme(legend.position = "bottom") +
    labs(
      color = "Stage Number",
      title = "Deutschland Tour 2024",
      subtitle = "Color indicates Stage Number"
    )
}
gg_stages_line <- vis_stages_line(sf_stages_line)