In this vignette, we compare the computation time and memory usage of the dense matrix class (base R) with the sparse Matrix class (from the Matrix package).

Allocation and length

We begin with an analysis of the time/memory it takes to create these objects. In the atime code below, we allocate a vector for comparison, and we specify a result function which computes the length of the object x created by each expression. This means atime will save length as a function of data size N (in addition to time and memory).

library(Matrix)
N_seq <- unique(as.integer(10^seq(0,7,by=0.25)))
vec.mat.result <- atime::atime(
  N=N_seq,
  vector=numeric(N),
  matrix=matrix(0, N, N),
  Matrix=Matrix(0, N, N),
  result=function(x)data.frame(length=length(x)))
plot(vec.mat.result)

#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.

The plot above shows three panels, one for each measured unit (kilobytes, length, seconds).

kilobytes is the amount of memory used. We see that Matrix and vector use the same amount of memory asymptotically, whereas matrix uses more (larger slope on the log-log plot implies larger asymptotic complexity class).
length is the value returned by the length function. We see that matrix and Matrix have the same value, whereas vector has asymptotically smaller length (smaller slope on log-log plot).
seconds is the amount of time taken. We see that Matrix is slower than vector and matrix by a small constant overhead, which can be seen for small N. We also see that for large N, Matrix and vector have the same asymptotic time complexity, which is much faster than matrix.
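The length and memory observations above can be spot-checked for a single data size without atime. The following is a minimal sketch, assuming N=1000 and using base R's object.size, which reports the stored size of an object rather than the memory allocated while creating it, so the numbers are not expected to match the kilobytes panel exactly.

```r
library(Matrix)
N <- 1000
## One object of each kind, created the same way as in the atime call.
objects <- list(
  vector = numeric(N),
  matrix = matrix(0, N, N),
  Matrix = Matrix(0, N, N))
## length and stored size (in kilobytes) of each object.
sizes <- data.frame(
  length = sapply(objects, function(x) length(x)),
  kilobytes = sapply(objects, function(x) as.numeric(object.size(x)) / 1024))
sizes
```

For this N, matrix and Matrix should report the same length (N squared), while the stored size of Matrix should be far smaller than that of matrix.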

Comparison with bench::press

An alternative method to compute asymptotic timings is via bench::press, which provides functionality for parameterized benchmarking (similar to atime_grid). Because atime() has special treatment of the N parameter, the code required for asymptotic measurement is relatively simple; compare the atime code above to the bench::press code below, which measures the same asymptotic quantities (seconds, kilobytes, length).

seconds.limit <- 0.01
done.vec <- NULL
measure.vars <- c("seconds","kilobytes","length")
press_result <- bench::press(N = N_seq, {
  exprs <- function(...)as.list(match.call()[-1])
  elist <- exprs(
    vector=numeric(N),
    matrix=matrix(0, N, N),
    Matrix=Matrix(0, N, N))
  elist[names(done.vec)] <- NA #Don't run exprs which already exceeded limit.
  mark.args <- c(elist, list(iterations=10, check=FALSE))
  mark.result <- do.call(bench::mark, mark.args)
  ## Rename some columns for easier interpretation.
  desc.vec <- attr(mark.result$expression, "description")
  mark.result$description <- desc.vec
  mark.result$seconds <- as.numeric(mark.result$median)
  mark.result$kilobytes <- as.numeric(mark.result$mem_alloc/1024)
  ## Compute length column to measure in addition to time/memory.
  mark.result$length <- NA
  for(desc.i in seq_along(desc.vec)){
    description <- desc.vec[[desc.i]]
    result <- eval(elist[[description]])
    mark.result$length[desc.i] <- length(result)
  }
  ## Set NA time/memory/length for exprs which were not run.
  mark.result[desc.vec %in% names(done.vec), measure.vars] <- NA
  ## If expr went over time limit, indicate it is done.
  over.limit <- mark.result$seconds > seconds.limit
  over.desc <- desc.vec[is.finite(mark.result$seconds) & over.limit]
  done.vec[over.desc] <<- TRUE
  mark.result
})

#> Running with:
#>           N
#>  1        1
#>  2        3
#>  3        5
#>  4       10
#>  5       17
#>  6       31
#>  7       56
#>  8      100
#>  9      177
#> 10      316
#> 11      562
#> 12     1000
#> 13     1778
#> 14     3162
#> 15     5623
#> 16    10000
#> 17    17782
#> 18    31622
#> 19    56234
#> 20   100000
#> 21   177827
#> 22   316227
#> 23   562341
#> 24  1000000
#> 25  1778279
#> 26  3162277
#> 27  5623413
#> 28 10000000

#> Some expressions had a GC in every iteration; so filtering is disabled.

The bench::press code above is relatively complicated, because it re-implements two features that are provided by atime:

If an expression takes longer than the time limit of 0.01 seconds, then it will not be run for any larger N values. This keeps overall computation reasonable, even when comparing expressions which have different asymptotic time complexity (such as quadratic for matrix and linear for Matrix in this example).
If you want to measure quantities other than seconds and kilobytes as a function of N (such as length in this example), then atime makes that easy (just provide a result function), whereas it is more complex to implement in bench::press (a for loop is required).
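To make the first feature concrete, here is a minimal base R sketch of the stopping rule: an expression that exceeds the time limit at some N is not run for any larger N. The helper time_until_limit and its arguments are hypothetical names introduced only for this illustration. It uses a single system.time measurement per expression, which is much coarser than the timings from atime or bench::mark, and it is not meant to describe how atime is implemented.

```r
## Hypothetical helper: time each expression for increasing N, and stop
## running an expression once one of its timings exceeds seconds.limit.
time_until_limit <- function(expr.list, N.values, seconds.limit = 0.01) {
  done <- character()  # names of expressions which went over the limit.
  rows <- list()
  for (N in N.values) {
    for (expr.name in setdiff(names(expr.list), done)) {
      elapsed <- system.time(
        value <- eval(expr.list[[expr.name]], list(N = N))
      )[["elapsed"]]
      ## Save length in addition to time, like the result function.
      rows[[length(rows) + 1]] <- data.frame(
        N = N, expr.name = expr.name,
        seconds = elapsed, length = length(value))
      if (elapsed > seconds.limit) done <- c(done, expr.name)
    }
  }
  do.call(rbind, rows)
}
expr.list <- list(
  vector = quote(numeric(N)),
  matrix = quote(matrix(0, N, N)))
timings <- time_until_limit(expr.list, N.values = 10^(1:3))
timings
```

The done vector plays the same role as done.vec in the bench::press code: once an expression name is added to it, that expression is skipped for all larger N.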

Below we reshape the bench::press results to long format and then plot them:

library(data.table)
(press_long <- melt(
  data.table(press_result),
  measure.vars=measure.vars,
  id.vars=c("N","description"),
  na.rm=TRUE))

N	description	variable	value
1	vector	seconds	0.000000e+00
1	matrix	seconds	0.000000e+00
1	Matrix	seconds	0.000000e+00
3	vector	seconds	0.000000e+00
3	matrix	seconds	0.000000e+00
⋮	⋮	⋮	⋮
3162277	Matrix	length	9.999996e+12
5623413	vector	length	5.623413e+06
5623413	Matrix	length	3.162277e+13
10000000	vector	length	1.000000e+07
10000000	Matrix	length	1.000000e+14

if(require(ggplot2)){
  gg <- ggplot()+
    ggtitle("bench::press results for comparison")+
    facet_grid(variable ~ ., labeller=label_both, scales="free")+
    geom_line(aes(
      N, value,
      color=description),
      data=press_long)+
    scale_x_log10(limits=c(NA, max(press_long$N*2)))+
    scale_y_log10("")
  if(requireNamespace("directlabels")){
    directlabels::direct.label(gg,"right.polygons")
  }else gg
}

#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.

We can see that the plots from atime and bench::press are consistent.

Complexity class estimation with atime

Below we estimate the best asymptotic complexity classes:

vec.mat.best <- atime::references_best(vec.mat.result)
plot(vec.mat.best)

#> log-10 transformation introduced infinite values.
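The idea that a larger slope on the log-log plot implies a larger complexity class can also be illustrated by hand. The sketch below is only a rough illustration of that idea, assuming a least-squares fit on log-transformed values; it makes no claim about how atime::references_best chooses a complexity class. The helper loglog_slope is a hypothetical name, and the length measurement is used because it is deterministic.

```r
## Hypothetical helper: least-squares slope of log10(y) versus log10(N).
loglog_slope <- function(N, y) {
  fit <- lm(log10(y) ~ log10(N))
  unname(coef(fit)[2])
}
N.values <- 10^(1:3)
vector.length <- sapply(N.values, function(N) length(numeric(N)))
matrix.length <- sapply(N.values, function(N) length(matrix(0, N, N)))
slopes <- c(
  vector = loglog_slope(N.values, vector.length),
  matrix = loglog_slope(N.values, matrix.length))
slopes
```

A slope near 1 corresponds to linear growth in N (vector), and a slope near 2 corresponds to quadratic growth (matrix).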
