scooter analysis

Author

Harrison DeFord

Published

May 5, 2022

Setup

This script relies on three GeoPackage outputs from the previous script, one per vendor. It binds them into a single long data frame covering all three vendors, groups the observations by each scooter’s permanent, unique bike ID, and defines a trip as a movement of more than 50 meters between consecutive timestamps, in an attempt to adjust for GPS variability.

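library(sf)    # st_read(), st_transform(), st_write(), st_distance()
library(dplyr) # bind_rows(), %>%, and the data verbs used below
link_data <- st_read("../results/link_mon_am.gpkg")
lime_data <- st_read("../results/lime_mon_am.gpkg") # assumed file name, following the link_/spin_ pattern
spin_data <- st_read("../results/spin_mon_am.gpkg")
scooters_raw <- bind_rows(link_data, lime_data, spin_data) %>%
  st_transform(crs = 3857)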
if(!file.exists("../results/scooters_raw.gpkg")){
  st_write(scooters_raw, dsn = "../results/scooters_raw.gpkg")
}

First, we’ll drop duplicate rows and observations of disabled bikes, then split the data into a list with one element per vendor and bike, so that a function can be applied to each bike in turn.

scooters_split <- scooters_raw %>%
  distinct() %>%
  filter(is_disabled == 0) %>%
  group_by(vendor, bike_id) %>%
  group_split()
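As a minimal sketch of what this step produces, the same verbs can be run on a toy table. The values below are hypothetical and use a plain tibble with a made-up `ping` column instead of the real `sf` data. The point to notice is that the same `bike_id` under two vendors ends up in two separate list elements.

```r
library(dplyr)

# Hypothetical observations; the first two rows are exact duplicates
obs <- tibble(
  vendor      = c("a", "a", "a", "a", "b", "b"),
  bike_id     = c("1", "1", "1", "1", "1", "2"),
  is_disabled = c(0, 0, 0, 1, 0, 0),
  ping        = c(1, 1, 2, 3, 1, 1)
)

pieces <- obs %>%
  distinct() %>%                 # drops the duplicated first row
  filter(is_disabled == 0) %>%   # drops the disabled observation
  group_by(vendor, bike_id) %>%
  group_split()                  # one tibble per vendor/bike pair

length(pieces)                          # 3
vapply(pieces, nrow, integer(1))        # 2 1 1
```

Grouping on `vendor` as well as `bike_id` guards against two vendors reusing the same ID.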

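This function defines what constitutes a trip: an observation is flagged when it lies more than 50 meters from the previous or the next observation of the same scooter. It is slow. When it’s done, each list element, representing one scooter, will have a trip field indicating whether a given observation was part of a trip (1) or not (0).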

scooters_trip <- lapply(scooters_split, function(df){
  df %>% mutate(
    # distance to the previous and next observation, in CRS units (metres for EPSG:3857)
    dist_prev = units::drop_units(st_distance(geom, lag(geom), by_element = TRUE)),
    dist_next = units::drop_units(st_distance(geom, lead(geom), by_element = TRUE)),
    # this is what allows us to order points for QGIS analysis
    time_id = row_number(),
    # perhaps redundant, but easy solution for moving between R and QGIS
    movement_id = paste(bike_id, "_", row_number(), sep = ""),
    # define trip based on the distance columns
    trip = case_when(
      dist_prev > 50 | dist_next > 50 ~ 1,
      TRUE ~ 0))
})
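To make the flagging rule concrete, here is a small sketch on one hypothetical scooter. It is a simplification: the coordinates are made-up planar positions in meters, and plain Euclidean distance stands in for `st_distance()`, so no `sf` objects are involved.

```r
library(dplyr)

# Hypothetical pings for a single scooter, planar coordinates in meters
pings <- tibble(
  bike_id = "demo",
  x = c(0, 3, 120, 240, 242),
  y = c(0, 4, 0, 0, 1)
)

flagged <- pings %>%
  mutate(
    # Euclidean distance to the previous / next ping (NA at the ends)
    dist_prev = sqrt((x - lag(x))^2 + (y - lag(y))^2),
    dist_next = sqrt((x - lead(x))^2 + (y - lead(y))^2),
    # same rule as above: a ping is part of a trip if either gap exceeds 50
    trip = case_when(
      dist_prev > 50 | dist_next > 50 ~ 1,
      TRUE ~ 0
    )
  )

flagged$trip  # 0 1 1 1 0
```

The 5 m and roughly 2 m shifts at either end are treated as GPS jitter, while the ping just before the first long move and the ping just after the last one are both flagged, because the rule looks in both directions. The first and last rows have an `NA` distance on one side; `case_when()` treats an `NA` condition as not matched, so they fall through to 0 here.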

Note that this is some of the earliest R code I ever wrote. I’m leaving it alone for posterity’s sake, but I’m not sure why I split it into groups only to bind it back together.

trip_long <- bind_rows(scooters_trip) %>%
  filter(trip == 1) #filter by only trip points
trip_split <- trip_long %>% #split again by trips
  group_by(bike_id) %>%
  group_split()
#trip_split_id <- lapply(trip_split, function(df){
#  df %>% mutate(time_id = row_number())
#})
trip_id_long <-bind_rows(trip_split) #bind into trips
if(!file.exists("../results/trip_id_long.gpkg")){
  st_write(trip_id_long, "../results/trip_id_long.gpkg", append = FALSE)
}
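On the split-and-rebind question raised above: for per-group row operations, a grouped `mutate()` usually gives the same result without the intermediate list. The sketch below compares the two on a hypothetical plain tibble with a made-up `t` column. It has not been checked against the `sf` geometry columns used in this analysis, so treat it as an illustration rather than a drop-in replacement.

```r
library(dplyr)

# Hypothetical pings for two scooters, deliberately interleaved
pings <- tibble(
  bike_id = c("b", "a", "b", "a", "a"),
  t       = c(1, 1, 2, 2, 3)
)

# Split, apply, and re-bind, in the style of the code above
via_split <- pings %>%
  group_by(bike_id) %>%
  group_split() %>%
  lapply(function(df) mutate(df, time_id = row_number())) %>%
  bind_rows()

# Grouped mutate, no intermediate list
via_group <- pings %>%
  group_by(bike_id) %>%
  mutate(time_id = row_number()) %>%
  ungroup() %>%
  arrange(bike_id)   # group_split() orders by group key, so match that order

isTRUE(all.equal(as.data.frame(via_split), as.data.frame(via_group)))
```

The only visible difference is row order: `group_split()` returns groups sorted by key, whereas a grouped `mutate()` keeps the rows where they were.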