-
Notifications
You must be signed in to change notification settings - Fork 12
Expand file tree
/
Copy pathaccessibility.Rmd
More file actions
362 lines (302 loc) · 13.7 KB
/
accessibility.Rmd
File metadata and controls
362 lines (302 loc) · 13.7 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
---
title: "Introduction to accessibility: calculating accessibility measures"
output: rmarkdown::html_vignette
bibliography: '`r system.file("REFERENCES.bib", package = "accessibility")`'
vignette: >
%\VignetteIndexEntry{Introduction to accessibility: calculating accessibility measures}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
Sys.setenv(OMP_THREAD_LIMIT = 2)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
**accessibility** is an R package that offers a set of fast and convenient
functions to calculate multiple transport accessibility measures. Given a
pre-computed travel cost matrix and a land use dataset, the package allows one
to calculate active and passive accessibility levels using multiple
accessibility metrics, such as cumulative opportunities (using either a travel
cost cutoff or a travel cost interval), minimum travel cost to closest N
number of activities, gravitational measures and different floating catchment
area methods. This vignette briefly overviews the package with a few
reproducible examples.
# Installation
Before using **accessibility** please make sure that you have it installed in
your computer. You can download either the most stable version from CRAN...
```{r, eval = FALSE}
install.packages("accessibility")
```
...or the development version from GitHub.
```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("ipeaGIT/accessibility")
```
# Overview of the package
As of the time of writing this vignette, the **accessibility** package includes
five different functions to calculate accessibility:
1. `cost_to_closest()` - calculates the minimum travel cost to the closest
*n* number of opportunities.
2. `cumulative_cutoff()` - calculates the frequently used threshold-based
cumulative opportunities measure.
3. `cumulative_interval()` - calculates accessibility as the median/mean (or
any other summary measure, really) number of opportunities that can be
reached within a cost interval.
4. `gravity()` - calculates gravity-based accessibility, taking a decay
function specified by the user (more on that in a few paragraphs).
5. `floating_catchment_area()` - calculates accessibility taking into account
the effects of competition for opportunities with different floating
catchment area measures.
You may have noticed that we've mentioned a few times that the functions
calculate accessibility using travel *cost*, and not travel *time*. That's
because we're treating *costs* here in its generic sense: anything that
increases the impedance from an origin to a destination, such as travel time,
monetary costs, distances, risk perception, etc., can be considered a generic
*cost*.
The `gravity()` and `floating_catchment_area()` functions can use
different decay functions when estimating accessibility levels. These decay
functions effectively weigh the number of opportunities in a destination by a
factor that depends on the travel cost between the origin and the destination.
For convenience, the package currently includes the following decay functions:
1. `decay_binary()` - binary decay function (the one used in cumulative
opportunities measures).
2. `decay_exponential()` - negative exponential decay function.
3. `decay_linear()` - linear decay function (weights decay linearly from 1 to 0
until a specific travel cost cutoff is reached).
4. `decay_power()` - inverse power decay function.
5. `decay_stepped()` - stepped decay function (similar to `decay_binary()`, but
can take an arbitrary number of steps, instead of a single one).
The users can also specify their own custom decay functions, if they need to
use functions currently not included in the package. For more details on this,
please read the [decay functions vignette](decay_functions.html).
# Demonstration on sample data
Enough talking. Let's demonstrate some of the key features of the package.
First we'll need to load the libraries we'll be using:
```{r, message = FALSE, warning = FALSE}
library(accessibility)
library(data.table)
library(ggplot2)
library(sf)
```
## Data requirements
To use **accessibility**, you will need a pre-computed travel cost matrix and
some land use data (e.g. location of jobs, healthcare, population, etc.). As
mentioned before, travel costs can be presented in terms of travel times,
distances or monetary costs, for example. This dataset must be structured in a
`data.frame` containing, at least, the columns `from_id`, `to_id` and the
travel cost between each origin-destination pair.
Your data should look similar to this sample dataset with public transport
travel times for the city of Belo Horizonte, Brazil, included in the package
for demonstration purposes[^1].
[^1]: If you would like to calculate such travel cost matrices yourself, there
are several computational packages to do that in R, such as
[r5r](https://github.com/ipeaGIT/r5r),
[dodgr](https://github.com/UrbanAnalyst/dodgr),
[gtfsrouter](https://github.com/UrbanAnalyst/gtfsrouter),
[hereR](https://github.com/munterfi/hereR) and
[opentripplanner](https://github.com/ropensci/opentripplanner).
```{r}
data_dir <- system.file("extdata", package = "accessibility")
travel_matrix <- readRDS(file.path(data_dir, "travel_matrix.rds"))
head(travel_matrix)
```
The land use data must also be structured in a `data.frame` and must contain an
`id` column, referring to the ids listed in the travel matrix, and the number
of opportunities/facilities/services in each spatial unit. The sample dataset
we'll be using looks like this:
```{r}
land_use_data <- readRDS(file.path(data_dir, "land_use_data.rds"))
head(land_use_data)
```
## Minimum travel cost
`cost_to_closest()` calculates the minimum travel cost to a given number of
opportunities. Much like the other functions we'll be demonstrating in this
section, it takes as inputs the travel matrix and land use datasets, the name
of the column in the latter with the opportunities to be considered and the
name of the column in the former with the travel cost to be considered.
Additionally, it takes the minimum number of opportunities to be considered.
Here's how calculating the time from each origin in Belo Horizonte to the
closest school looks like:
```{r, message = FALSE}
mtc <- cost_to_closest(
travel_matrix,
land_use_data,
opportunity = "schools",
travel_cost = "travel_time",
n = 1
)
head(mtc)
```
## Cutoff-based cumulative opportunities
`cumulative_cutoff()` calculates the traditional cumulative opportunities
measure, indicating the number of opportunities that are accessible within a
given travel cost threshold. In this example, we estimate how many jobs can be
reached from each origin with trips taking up to 30 minutes of travel time.
```{r, message = FALSE}
cum_cutoff <- cumulative_cutoff(
travel_matrix,
land_use_data,
opportunity = "jobs",
travel_cost = "travel_time",
cutoff = 30
)
head(cum_cutoff)
```
Let's say that we wanted to, instead, calculate *passive* accessibility - i.e.
by how many people each destination can be reached within a given travel cost.
Doing so requires very few changes to the call: just change the "opportunity"
column to `"population"` and set `active` (`TRUE` by default) to `FALSE`.
```{r, message = FALSE}
passive_cum_cutoff <- cumulative_cutoff(
travel_matrix,
land_use_data,
opportunity = "population",
travel_cost = "travel_time",
cutoff = 30,
active = FALSE
)
head(passive_cum_cutoff)
```
The `active` parameter is available in most other accessibility functions as
well (with the exception of `floating_catchment_area()`), making it very easy
to calculate both active and passive accessibility.
## Interval-based cumulative opportunities
`cumulative_time_interval()` calculates the interval-based cumulative
opportunities measure. This measure, developed by @tomasiello2023time,
mitigates the impacts of arbitrary choices of cost cutoffs, one of the main
disadvantages of the traditional threshold-based cumulative opportunities
measure. Given a cost interval, it calculates several accessibility estimates
within the interval and summarizes it using a user-specified function. In the
example below, we calculate the average number of accessible jobs considering
multiple minute-by-minute time thresholds between 40 and 60 minutes.
```{r, message = FALSE}
cum_interval <- cumulative_interval(
travel_matrix = travel_matrix,
land_use_data = land_use_data,
opportunity = "jobs",
travel_cost = "travel_time",
interval = c(40, 60),
summary_function = base::mean
)
head(cum_interval)
```
## Gravity measures
`gravity()` calculates gravity-based measures - i.e. measures in which the
weight of opportunities is gradually discounted as the travel cost increases.
Of course, several different decay functions can be used to so, each one of
them with a range of possible different parameters. In order to accommodate such
generalization, the function takes the decay function to be used as a
parameter.
In the example below, we calculate accessibility using a negative exponential
function with a `decay_value` (usually referred as the \eqn{\beta} in its
formulation) of 0.2. Please see [the vignette on decay
functions](decay_functions.html) for more information on the decay functions
shipped with the package and how to use custom functions.
```{r, message = FALSE}
negative_exp <- gravity(
travel_matrix,
land_use_data,
opportunity = "schools",
travel_cost = "travel_time",
decay_function = decay_exponential(decay_value = 0.2)
)
head(negative_exp)
```
## Floating catchment area
`floating_catchment_area()` calculates accessibility accounting for competition
of resources using different floating catchment area (FCA) methods. The FCA
family includes several different methods, which can be specified using the
`method` parameter. As of the time of writing this vignette, the package
supports two different methods:
- 2-Step Floating Catchment Area (`"2sfca"`) - the first metric in the FCA
family, originally proposed by @luo2003measures.
- Balanced Floating Catchment Area (`"bfca"`) - takes competition affects into
account while correcting for issues of inflation of demand and service levels.
Originally proposed by @paez2019demand and named in @pereira2021geographic.
Please note that, since FCA measures consider competition effects, we have to
specify which column in the land use dataset represents the population
competing for opportunities with the `demand` parameter. The function
also supports different decay functions. In the example below, we calculate
accessibility to jobs using the BFCA method, considering that the entire
population of the city compete for these jobs and using a negative exponential
decay function.
```{r, message = FALSE}
bfca <- floating_catchment_area(
travel_matrix,
land_use_data,
opportunity = "jobs",
travel_cost = "travel_time",
demand = "population",
method = "bfca",
decay_function = decay_exponential(decay_value = 0.5)
)
head(bfca)
```
## Spatial availability
`spatial_availability()` also calculates accessibility considering competition
effects, though using the spatial availability measure proposed by
@soukhov2023introducing. The results from this metric are proportional both to
the demand in each origin and the travel cost it takes to reach the
destinations. As with the FCA function, we have to specify the column in the
land use dataset that contains the population competing for opportunities and we
can use different decay functions to calculate the impedance between
origin-destination pairs.
```{r}
spatial_avlblt <- spatial_availability(
travel_matrix,
land_use_data,
opportunity = "jobs",
travel_cost = "travel_time",
demand = "population",
decay_function = decay_exponential(decay_value = 0.1)
)
head(spatial_avlblt)
```
## Balancing cost
`balancing_cost()` calculates the balancing cost accessibility measure.
Originally proposed by @barboza2021balancing under the name "balancing time",
this metric is defined as the travel cost required to reach as many
opportunities as the number of people in a given origin. Just like the
previous two functions, `balancing_cost()` also includes a
parameter to specify the population competing for opportunities.
The function also includes a `cost_increment` parameter, that should be used to
specify the increment that defines the travel cost distribution from which the
potential balancing costs will be picked. For example, an increment of 1 (the
default) tends to suit travel time distributions, meaning that the function will
first check if any origins reach their balancing cost with a travel time of 0
minutes, then 1 minute, 2 minutes, 3, 4, ..., etc. On the other hand, an
increment of 1 might be too big for a distribution of monetary costs, which
could possibly benefit from a smaller increment of 0.05 (5 cents), for example.
Such increment results in the function looking for balancing costs first at a
monetary cost of 0, then 0.05, 0.10, ..., etc. In the example below, we use the
default cost increment of 1.
```{r}
bal_cost <- balancing_cost(
travel_matrix,
land_use_data,
opportunity = "jobs",
travel_cost = "travel_time",
demand = "population"
)
head(bal_cost)
```
## Visualize results
If you have the spatial data of your origins/destinations, you can easily merge
it with the accessibility to create spatial visualizations of the results. The
example below quickly shows how to create a simple map using `{ggplot2}`.
```{r, eval = requireNamespace(c("sf", "ggplot2"), quietly = TRUE), out.width = "80%", fig.width = 6, fig.height = 6}
grid <- system.file("extdata/grid_bho.rds", package = "accessibility")
grid <- readRDS(grid)
spatial_data <- merge(grid, cum_cutoff, by = "id")
ggplot() +
geom_sf(data = spatial_data, aes(fill = jobs), color = NA) +
labs(
title = "Job accessibility by transit in under 30 min.",
fill = "Accessible jobs"
) +
scale_fill_viridis_c() +
theme_void()
```
# References