API

Import MMoCHi as:

`import mmochi as mmc`

For usage examples, see the Tutorials.

Hierarchy Generation

Designing your hierarchy is the first step to classification. The Hierarchy class contains many methods used for building the hierarchy and defining high-confidence thresholds.

Hierarchy Class to organize a MMoCHi hierarchy.
Classification A Hierarchy building block, describing subsetting rules, whose parent is a subset (or "all").
Subset A Hierarchy building block, describing a population of cells beneath a classification layer.
hc_defs Helper function for defining simple or complex gating strategies for high-confidence thresholding.
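
To make the relationship between the two building blocks concrete, here is a minimal sketch of the alternating structure described above: classification layers hang off a subset (or "all"), and subsets hang off a classification layer. This is not MMoCHi's implementation or API. The names `ToyHierarchy`, `ClassificationNode`, `SubsetNode`, and `path_to`, and the example markers, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SubsetNode:
    """A population of cells beneath a classification layer (hypothetical)."""
    name: str
    parent: str  # name of the classification layer above this subset


@dataclass
class ClassificationNode:
    """A subsetting rule whose parent is a subset, or "all" (hypothetical)."""
    name: str
    parent: str
    markers: list = field(default_factory=list)


class ToyHierarchy:
    """Toy container enforcing the classification/subset alternation."""

    def __init__(self):
        self.classifications = {}
        self.subsets = {}

    def add_classification(self, name, parent, markers):
        # A classification layer must sit under an existing subset or "all".
        if parent != "all" and parent not in self.subsets:
            raise ValueError(f"unknown parent subset: {parent!r}")
        self.classifications[name] = ClassificationNode(name, parent, list(markers))

    def add_subset(self, name, parent):
        # A subset must sit under an existing classification layer.
        if parent not in self.classifications:
            raise ValueError(f"unknown parent classification: {parent!r}")
        self.subsets[name] = SubsetNode(name, parent)

    def path_to(self, name):
        """Return the node names from "all" down to `name`."""
        path = []
        node = name
        while node != "all":
            path.append(node)
            if node in self.subsets:
                node = self.subsets[node].parent
            else:
                node = self.classifications[node].parent
        path.append("all")
        return path[::-1]


h = ToyHierarchy()
h.add_classification("Gross", "all", ["CD3", "CD19"])
h.add_subset("T cell", "Gross")
h.add_classification("T subsets", "T cell", ["CD4", "CD8"])
h.add_subset("CD4 T cell", "T subsets")
path = h.path_to("CD4 T cell")
print(path)
```

The sketch assumes that subset and classification names never collide. The real Hierarchy class also stores high-confidence definitions and classifier settings, which are omitted here.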

Thresholding

High-confidence thresholding is performed primarily through methods of the mmc.Hierarchy object. The functions below perform the thresholding under the hood.

threshold(markname, adata[, data_key, ...]) Performs thresholding for a marker, displays the expression distribution (colored by "pos", "?", and "neg") for visualization and interactive adjustment, and optionally returns the thresholds and thresholded events.
run_threshold(markname, adata, data_key, thresh) Lightweight wrapper that finds the marker (via utils.get_data()) and then performs pos/neg/? thresholding on all events in the AnnData, given a list of positive and negative thresholds.
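
The pos/neg/? labelling can be illustrated without any MMoCHi calls. The sketch below is a plain-Python stand-in, not the library's code: the function name `label_events`, the keyword names `pos_thresh` and `neg_thresh`, and the treatment of values that fall exactly on a threshold are all assumptions.

```python
def label_events(values, pos_thresh, neg_thresh):
    """Label each value "pos", "neg", or "?" (hypothetical helper).

    Values at or above pos_thresh are "pos", values at or below neg_thresh
    are "neg", and everything in between is left undetermined as "?".
    """
    if neg_thresh > pos_thresh:
        raise ValueError("neg_thresh must not exceed pos_thresh")
    labels = []
    for value in values:
        if value >= pos_thresh:
            labels.append("pos")
        elif value <= neg_thresh:
            labels.append("neg")
        else:
            labels.append("?")
    return labels


expression = [0.1, 0.9, 2.4, 3.7, 1.5]
labels = label_events(expression, pos_thresh=2.0, neg_thresh=1.0)
print(labels)  # ['neg', 'neg', 'pos', 'pos', '?']
```

Leaving a gap between the two thresholds is what makes the calls high-confidence: events in the gap are labelled "?" and are not forced into either class.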

Classification

Once the Hierarchy is created and high-confidence thresholds are drawn, you are ready to classify. The mmc.classify function runs mmc.classifier_setup and mmc.hc_threshold internally, but you have the option to run these separately for testing.

classifier_setup(adata, x_modalities[, ...]) Setup that can optionally be completed before running mmc.classify.
hc_threshold(adata, hierarchy, level[, ...]) Performs high-confidence thresholding using the subset definitions defined in one level of a MMoCHi hierarchy.
classify(adata, hierarchy[, key_added, ...]) Classify subsets using the provided hierarchy.
terminal_names(adata[, obs_column, ...]) Create a column in the .obs featuring the most specific classification/subset for each event.
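
The idea behind picking "the most specific classification/subset for each event" can be sketched as follows. This is an illustrative assumption about the logic, not the code behind terminal_names: the function `most_specific`, the input layout (one list of labels per level, with `None` where an event was not classified at that level), and the example labels are all hypothetical.

```python
def most_specific(calls_by_level, level_order):
    """Return, per event, the label from the deepest level that classified it.

    calls_by_level maps a level name to a list of labels (None = not
    classified at that level). level_order runs from broad to specific.
    """
    n_events = len(calls_by_level[level_order[0]])
    terminal = [None] * n_events
    for level in level_order:
        # Later (more specific) levels overwrite earlier (broader) calls.
        for i, label in enumerate(calls_by_level[level]):
            if label is not None:
                terminal[i] = label
    return terminal


calls = {
    "Gross": ["T cell", "B cell", "T cell"],
    "T subsets": ["CD4 T cell", None, "CD8 T cell"],
}
terminal = most_specific(calls, ["Gross", "T subsets"])
print(terminal)  # ['CD4 T cell', 'B cell', 'CD8 T cell']
```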

Plotting

Once you have run your classification, you may be interested in plotting some metrics of its performance or evaluating feature importances.

plot_confusion(adata, levels[, hierarchy, ...]) Determine the performance at a single level by creating a confusion plot using high-confidence thresholds as truth.
plot_confidence(adata, levels[, hierarchy, ...]) Determine how confident classification is for each subset by displaying calibration curves, which compare the proportion of events classified into a given class with the confidence of those calls.
feature_importances(hierarchy, level) Returns a DataFrame of features used in classification and their importances in the random forest of a given level.
plot_important_features(adata, levels, hierarchy) Creates violin plots for the 25 most important genes or proteins for each specified level in levels.
plot_tree(hierarchy, level[, tree_number, save]) Plots a tree from the random forest at a specified level of the classifier.
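
For readers unfamiliar with calibration curves, the sketch below shows one common way to compute the points of such a curve: bin events by confidence, then compare the mean confidence in each bin with the fraction of calls that were correct. It is not the code behind plot_confidence. The function name `calibration_points`, the equal-width binning, and the per-event `is_correct` flags are assumptions made for illustration.

```python
def calibration_points(confidences, is_correct, n_bins=5):
    """Return (mean confidence, fraction correct) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, is_correct):
        # Equal-width bins on [0, 1]; a confidence of 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    points = []
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        frac_correct = sum(1 for _, ok in members if ok) / len(members)
        points.append((mean_conf, frac_correct))
    return points


confidences = [0.55, 0.65, 0.95, 0.9, 0.3]
is_correct = [True, False, True, True, False]
curve = calibration_points(confidences, is_correct, n_bins=2)
print(curve)
```

A well-calibrated classifier produces points close to the diagonal, where mean confidence and fraction correct agree.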

There are also a few plotting functions we have created for interrogating high-confidence thresholds and classifier performance using UMAPs:

umap_thresh(adata, h[, markers, batch_key, ...]) Plots UMAPs for the listed markers with thresholded expression data for the markers overlaid on top.
umap_interrogate_level(adata, level[, ...]) Plots UMAPs showing the events selected by high-confidence thresholding and used for training, and breaks down annotation confusion on the UMAP.

Landmark Registration

Prior to classification, you may be interested in performing batch correction on ADT expression. This module contains the tools necessary to perform and evaluate batch correction by landmark registration.

landmark_register_adts(adata[, batch_key, ...]) Batch correction for expression of all ADTs.
update_landmark_register(adata, batch, marker) Landmark registration batch correction for ADT expression for a single marker on a single batch.
stacked_density_plots(adata, marker_list[, ...]) Plots multiple density plots of positive and negative peaks across batches and data_keys, with properly placed labels.
density_plot(adata, marker, batch[, ...]) Creates a density plot of a single batch for a single marker for a given data_key.
density_plot_total(adata, marker, batch[, ...]) Plots density of a single marker on a single batch in front of the density of this marker for the whole dataset.
update_peak_overrides(batch, marker, ...[, ...]) Update the peak overrides object for a single batch and marker.
save_peak_overrides(path, peak_overrides) Saves peak overrides to a JSON file for easy loading.
load_peak_overrides(path) Loads peak overrides from a JSON file.
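
As a rough intuition for landmark registration, the sketch below aligns one batch to shared target positions using two landmarks, the negative and positive peaks of a marker. It is a deliberately simplified stand-in: it uses a linear rescaling where a real registration would typically use a smoother warping function, and it assumes the peak positions are already known (for example, supplied by hand). The function name `register_two_landmarks` and all numbers are hypothetical, and none of this reflects how landmark_register_adts is implemented.

```python
def register_two_landmarks(values, neg_peak, pos_peak, target_neg, target_pos):
    """Linearly map values so neg_peak -> target_neg and pos_peak -> target_pos."""
    if pos_peak == neg_peak:
        raise ValueError("the two landmarks must differ")
    scale = (target_pos - target_neg) / (pos_peak - neg_peak)
    return [target_neg + (value - neg_peak) * scale for value in values]


# One batch whose negative and positive peaks sit at 1.0 and 3.0,
# aligned to shared targets at 0.0 and 4.0.
batch_values = [1.0, 2.0, 3.0]
registered = register_two_landmarks(
    batch_values, neg_peak=1.0, pos_peak=3.0, target_neg=0.0, target_pos=4.0
)
print(registered)  # [0.0, 2.0, 4.0]
```

Applying the same targets to every batch places the peaks of each batch at common positions, which is the goal of the correction.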

Helper functions

We have also developed a suite of helper functions which may be useful for running MMoCHi or preparing your data.

marker(adata, parameter[, data_key, ...]) Look up a marker name within the .X or the .obsm[data_key].
get_data(adata, parameter[, ...]) Searches an AnnData object along its .var, .obs, .layers, and .obsm[preferred_data_key] for a specified parameter.
convert_10x(adata[, drop_totalseq, data_key]) Convert default 10X data to an AnnData with protein in the .obsm, and gene expression in the .X
obsm_to_X(adata[, data_key]) Makes the .obsm[data_key] of an AnnData object into its .X
preprocess_adatas([adatas, ...]) Function to load and preprocess adatas from either filename(s) or backup_url(s).
intersect_features(adatas[, data_key]) Subsets each AnnData object to only the genes and values of data_key found in every AnnData object.
generate_exclusive_features(adata_list[, ...]) Reads in adata objects from a list of paths, or takes in a list of adata objects, and finds the features present in all objects.
batch_iterator(adata, batch_key[, sort]) Generates a mask for each batch in batch_key, together with the corresponding batch name.
identify_group_markers(adata, group1[, ...]) Calculates differentially expressed genes between two provided groups (specified in the .obs).
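
The mask-per-batch pattern described for batch_iterator is easy to picture with plain lists. The generator below is a hypothetical stand-in that works on a list of batch labels rather than an AnnData object. Its name, `iter_batch_masks`, and the order of the yielded pair (mask first, then batch name) are assumptions.

```python
def iter_batch_masks(batch_labels, sort=True):
    """Yield (mask, batch_name) for each distinct batch label."""
    # dict.fromkeys keeps the first-seen order of the unique labels.
    batches = list(dict.fromkeys(batch_labels))
    if sort:
        batches = sorted(batches)
    for batch in batches:
        mask = [label == batch for label in batch_labels]
        yield mask, batch


labels = ["b2", "b1", "b2", "b1"]
masks = list(iter_batch_masks(labels))
for mask, batch in masks:
    print(batch, mask)
```

Each mask can then be used to select the events of one batch, so that per-batch steps run on one batch at a time.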

Logging

MMoCHi has built-in logs which can be helpful for debugging or reproducibility.

log_to_file(file_name[, file_mode]) Enable logging for all functions in the MMoCHi package
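
Package-wide file logging of this kind is usually built on Python's standard logging module. The sketch below shows that general pattern only and is not how log_to_file is implemented: the helper `attach_file_log`, the logger name "toy_package", and the log format are hypothetical.

```python
import logging
import os
import tempfile


def attach_file_log(logger_name, file_name, file_mode="w"):
    """Send all records from a package-level logger to a file."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(file_name, mode=file_mode)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return handler


log_path = os.path.join(tempfile.mkdtemp(), "run.log")
handler = attach_file_log("toy_package", log_path)

# Child loggers ("toy_package.<module>") propagate to the package-level handler.
logging.getLogger("toy_package.classifier").info("classification started")

# Detach and close the handler once logging is no longer needed.
logging.getLogger("toy_package").removeHandler(handler)
handler.close()
```

Because the handler is attached to the top-level package logger, every module that logs under that name writes to the same file, which is what makes a single call enough to capture a whole run.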