magicBatch: Impute data using the MAGIC algorithm

This is an implementation of the Markov Affinity-based Graph Imputation of Cells (MAGIC) algorithm described by van Dijk et al., modified for flexibility. It adds the mar_mat_input argument, which allows the data used to compute the diffusion operator and the data to be imputed to be specified independently. Any low-dimensional representation of the data, including batch-corrected data, can therefore be used directly to calculate the powered Markov affinity matrix.
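The separation between "data used to build the operator" and "data to be imputed" can be illustrated with a minimal base-R sketch. This is not the package's internal code: the variable names (embedding, expression, mat_power) are hypothetical, the inputs are simulated, and a fixed-bandwidth Gaussian kernel stands in for the adaptive kernel that MAGIC uses.

```r
set.seed(1)
n_cells <- 30

# Hypothetical inputs: a low-dimensional embedding (e.g. batch-corrected)
# and the expression matrix that should be imputed.
embedding  <- matrix(rnorm(n_cells * 5), nrow = n_cells)             # cells x components
expression <- matrix(rpois(n_cells * 8, lambda = 2), nrow = n_cells) # cells x genes

# Gaussian affinities from pairwise Euclidean distances in the embedding.
# A single global bandwidth is used here purely for illustration.
d <- as.matrix(dist(embedding))
affinity <- exp(-(d / median(d[d > 0]))^2)

# Row-normalise so that each row sums to 1 (a Markov transition matrix).
markov <- affinity / rowSums(affinity)

# Raise a square matrix to an integer power t by repeated multiplication.
mat_power <- function(m, t) Reduce(`%*%`, replicate(t, m, simplify = FALSE))

# The operator comes from the embedding; it is applied to the expression data.
t_values <- c(2, 4, 6)
imputed_by_t <- lapply(t_values, function(t) mat_power(markov, t) %*% expression)
names(imputed_by_t) <- paste0("t", t_values)
```

Each element of imputed_by_t is a cells-by-genes matrix; larger powers of t smooth the expression values over a wider neighbourhood of cells.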

magicBatch(
  data,
  select_features = NULL,
  mar_mat_input = NULL,
  import_mar_mat = FALSE,
  pca = 20,
  t_param = c(2, 4, 6),
  n_diffusion_components = 0,
  k = 9,
  ka = 3,
  epsilon = 1,
  rescale_percent = 90,
  rescale_method = "adaptive",
  python_command = system("which python3", intern = TRUE)
)

Arguments

data: An expression matrix where cells correspond to rows and genes correspond to columns.
select_features: A vector of features to use for imputation.
mar_mat_input: A matrix where cells correspond to rows and components or features correspond to columns. If left unspecified, the Markov matrix calculation is initialized with a PCA of data.
import_mar_mat: Logical; whether to return the Markov matrix.
pca: An integer specifying the number of principal components to use.
t_param: An integer or a vector of integers giving the power(s) to which the Markov affinity matrix is raised.
n_diffusion_components: Number of diffusion map components to compute. If set to 0, the diffusion map is not computed.
k: The number of nearest neighbors used to construct the kNN graph.
ka: Controls the width (standard deviation) of the Gaussian kernel for each cell, which is set to the distance from that cell to its ka-th nearest neighbor.
epsilon: Epsilon parameter used in MAGIC
rescale_percent: Percentile to rescale data to after imputation
rescale_method: A string passed to the rescale_method argument of the rescale_data function. Two methods are available: "adaptive" or "classic". See the rescale_data function for details.
python_command: A character string passed to the "command" argument of the system2 function in order to invoke python. E.g. "/usr/local/bin/python3" on a Mac.
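How k and ka interact can be sketched in base R. This is an assumption-laden illustration rather than the package's code: the input matrix x is simulated, the kernel form exp(-(d / sigma)^2) and the symmetrisation step are one common reading of the MAGIC description, and the role of epsilon is not modelled.

```r
set.seed(2)
x  <- matrix(rnorm(40 * 3), nrow = 40)  # hypothetical cells x components input
k  <- 9   # number of nearest neighbours kept per cell
ka <- 3   # neighbour whose distance sets the kernel width

d <- as.matrix(dist(x))
n <- nrow(d)
affinity <- matrix(0, n, n)

for (i in seq_len(n)) {
  # k nearest neighbours of cell i, skipping the cell itself (distance 0)
  nn <- order(d[i, ])[2:(k + 1)]
  # adaptive bandwidth: distance to the ka-th nearest neighbour
  sigma <- d[i, nn[ka]]
  affinity[i, nn] <- exp(-(d[i, nn] / sigma)^2)
}

# Symmetrise the kNN affinities, then row-normalise to a Markov matrix.
affinity <- affinity + t(affinity)
markov <- affinity / rowSums(affinity)
```

Cells in dense regions get a small sigma and cells in sparse regions a large one, so the kernel adapts to local density; k only limits which neighbours receive a non-zero affinity.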

Value

A list that includes the following elements:

imputed_data: A cell by gene matrix of the imputed gene expression values.
diffusion_map: A cell by diffusion map component matrix.
marcov_matrix: A cell by cell matrix containing the Markov affinity matrix.
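A call might look like the sketch below. It uses only the arguments listed in the usage above; the simulated expr matrix, the PCA-based embedding (a stand-in for a batch-corrected representation), and the run_magic switch are hypothetical, and it is assumed that the function is exported from a package named magicBatch. The call itself is disabled by default because it needs the package and a working python3.

```r
set.seed(3)
n_cells <- 200
n_genes <- 50

# Simulated count matrix: cells in rows, genes in columns.
expr <- matrix(
  rpois(n_cells * n_genes, lambda = 1),
  nrow = n_cells,
  dimnames = list(paste0("cell", seq_len(n_cells)),
                  paste0("gene", seq_len(n_genes)))
)

# Stand-in for a batch-corrected embedding: the first 10 principal components.
embedding <- prcomp(expr, center = TRUE)$x[, 1:10]

run_magic <- FALSE  # set to TRUE once magicBatch and python3 are available
if (run_magic) {
  res <- magicBatch::magicBatch(
    data = expr,               # data to be imputed
    mar_mat_input = embedding, # data used to build the Markov matrix
    import_mar_mat = TRUE,     # also return the Markov matrix
    t_param = 4,
    n_diffusion_components = 5,
    k = 9,
    ka = 3
  )
  # Inspect the top-level elements of the returned list.
  str(res, max.level = 1)
}
```

The rows of mar_mat_input must describe the same cells as the rows of data, which is why the embedding here is derived from expr and keeps its row names.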

Author

Kevin Brulois