This is an implementation of the Markov Affinity-based Graph Imputation of Cells (MAGIC) algorithm described in Van Dijk, David, et al., modified for flexibility. It adds the mar_mat_input argument, which allows the data used to compute the diffusion operator and the data to be imputed to be specified independently. As a result, any low-dimensional representation of the data, including batch-corrected data, can be used directly to calculate the powered Markov affinity matrix.

magicBatch(
  data,
  select_features = NULL,
  mar_mat_input = NULL,
  import_mar_mat = FALSE,
  pca = 20,
  t_param = c(2, 4, 6),
  n_diffusion_components = 0,
  k = 9,
  ka = 3,
  epsilon = 1,
  rescale_percent = 90,
  rescale_method = "adaptive",
  python_command = system("which python3", intern = TRUE)
)

Arguments

data

An expression matrix where cells correspond to rows and genes correspond to columns.

select_features

A vector of features to use for imputation.

mar_mat_input

A matrix where cells correspond to rows and components or features correspond to columns. If left unspecified, the Markov matrix calculation is initialized with a PCA of data.
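As a rough sketch of the default initialization described for mar_mat_input, the base R code below reduces an expression matrix to its leading principal components. The toy matrix, the object names, and the choice of prcomp are assumptions made for illustration; the PCA routine used inside magicBatch may differ.

```r
# Toy expression matrix: 50 cells (rows) x 30 genes (columns).
set.seed(1)
expr <- matrix(rpois(50 * 30, lambda = 2), nrow = 50, ncol = 30)

# Number of principal components to keep (plays the role of `pca`).
n_pcs <- 20

# Centered PCA; rows of `x` are cells, columns are components.
pca_fit <- prcomp(expr, center = TRUE, scale. = FALSE)
cell_embedding <- pca_fit$x[, seq_len(n_pcs), drop = FALSE]

# The result is a cells x components matrix.
dim(cell_embedding)
```

A batch-corrected embedding with the same cells-by-components layout could be supplied in place of cell_embedding.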

import_mar_mat

Logical. Whether to return the Markov matrix.

pca

An integer specifying the number of PCA components to use.

t_param

An integer or a vector of integers specifying the power(s) to which the Markov affinity matrix is raised.
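To illustrate what powering the Markov affinity matrix does, here is a minimal base R sketch that raises a small row-stochastic matrix to several powers and multiplies each result by a toy expression matrix. The matrices and the helper mat_power are hypothetical and exist only for this example; they are not part of the package.

```r
# Small row-stochastic matrix over 4 cells (each row sums to 1).
markov <- matrix(c(0.6, 0.4, 0.0, 0.0,
                   0.3, 0.4, 0.3, 0.0,
                   0.0, 0.3, 0.4, 0.3,
                   0.0, 0.0, 0.4, 0.6), nrow = 4, byrow = TRUE)

# Toy expression data: 4 cells x 2 genes, with dropout-like zeros.
expr <- matrix(c(5, 0,
                 0, 2,
                 4, 0,
                 0, 3), nrow = 4, byrow = TRUE)

# Hypothetical helper: raise a square matrix to a positive integer power
# by repeated multiplication.
mat_power <- function(m, t) {
  out <- diag(nrow(m))
  for (i in seq_len(t)) out <- out %*% m
  out
}

# One imputed matrix per diffusion time, mirroring a vector-valued t_param.
t_values <- c(2, 4, 6)
imputed <- lapply(t_values, function(t) mat_power(markov, t) %*% expr)
names(imputed) <- paste0("t", t_values)
```

Each power of a row-stochastic matrix is itself row-stochastic, so every imputed value is a weighted average of the original values across cells.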

n_diffusion_components

Number of diffusion map components to compute. If set to 0, the diffusion map is not computed.

k

The number of nearest neighbors used to construct the k-nearest-neighbor (kNN) graph.

ka

Controls the width (standard deviation) of the Gaussian kernel for each cell, which is set to the distance from that cell to its ka-th nearest neighbor.
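The roles of k and ka can be sketched in base R as follows. This is a simplified illustration under stated assumptions: it uses the textbook Gaussian form exp(-(d / sigma)^2), dense distance matrices, and invented object names. The package's exact kernel, including how epsilon enters, is not reproduced here.

```r
set.seed(1)
embedding <- matrix(rnorm(40 * 5), nrow = 40)  # 40 cells x 5 components

k  <- 9  # neighbors kept per cell
ka <- 3  # neighbor whose distance sets the kernel width

# Pairwise Euclidean distances between cells.
d <- as.matrix(dist(embedding))

# Per-cell kernel width: distance to the ka-th nearest neighbor
# (position ka + 1 after sorting, because position 1 is the cell itself).
sigma <- apply(d, 1, function(row) sort(row)[ka + 1])

# Gaussian affinities; dividing by the vector scales row i by sigma[i].
affinity <- exp(-(d / sigma)^2)

# Keep only each cell's k nearest neighbors (plus the cell itself).
for (i in seq_len(nrow(d))) {
  keep <- order(d[i, ])[seq_len(k + 1)]
  affinity[i, -keep] <- 0
}

# Symmetrize, then row-normalize to obtain a row-stochastic Markov matrix.
affinity <- affinity + t(affinity)
markov <- affinity / rowSums(affinity)
```

Because each cell gets its own sigma, cells in sparse regions receive wider kernels than cells in dense regions.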

epsilon

Epsilon parameter used in MAGIC

rescale_percent

Percentile to which the data are rescaled after imputation.

rescale_method

A string passed to the rescale_method argument of the rescale_data function. Two methods are available: "adaptive" and "classic". See the rescale_data function for details.
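One plausible reading of percentile-based rescaling is sketched below: each gene in the imputed matrix is multiplied by a factor so that its chosen percentile matches that gene's percentile in the original data. This is an assumption for illustration only. It does not reproduce rescale_data, and it makes no claim about how the "adaptive" and "classic" methods differ.

```r
set.seed(1)
# Toy data: 100 cells (rows) x 3 genes (columns).
original <- matrix(rpois(100 * 3, lambda = 5), nrow = 100)
# Stand-in for imputed values that have been shrunk by averaging.
imputed <- original * 0.5

rescale_percent <- 90

# Per-gene percentile in the original and in the imputed data.
q_orig <- apply(original, 2, quantile, probs = rescale_percent / 100)
q_imp  <- apply(imputed,  2, quantile, probs = rescale_percent / 100)

# Scale each gene (column) so the chosen percentile matches the original.
rescaled <- sweep(imputed, 2, q_orig / q_imp, `*`)
```

A real implementation would also need to handle genes whose imputed percentile is zero, which this sketch does not.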

python_command

A character string passed to the "command" argument of the system2 function in order to invoke python. E.g. "/usr/local/bin/python3" on a Mac.

Value

A list that includes the following elements:

imputed_data

A cell-by-gene matrix of the imputed gene expression values.

diffusion_map

A cell-by-component matrix of diffusion map coordinates.

marcov_matrix

The cell-by-cell Markov affinity matrix.
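A hedged usage sketch follows. It builds toy inputs in the documented layout and calls magicBatch only if the function is already available in the session. The toy data, the object names, and the use of a plain PCA as a stand-in for a batch-corrected embedding are assumptions; the exact structure of the returned elements is not specified here, so the result is only inspected with str.

```r
set.seed(1)
# Toy counts: 200 cells (rows) x 50 genes (columns), as `data` expects.
expr <- matrix(rpois(200 * 50, lambda = 1), nrow = 200,
               dimnames = list(paste0("cell", 1:200), paste0("gene", 1:50)))

# Stand-in for a batch-corrected embedding: cells x components.
embedding <- prcomp(expr)$x[, 1:10]

# Only runs when magicBatch() is already loaded; it also needs a working
# Python installation reachable through `python_command`.
if (exists("magicBatch")) {
  res <- magicBatch(data = expr, mar_mat_input = embedding, t_param = 4)
  str(res$imputed_data)
}
```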

Author

Kevin Brulois