mmc.hc_threshold()

mmochi.classifier.hc_threshold(adata, hierarchy, level, data_key=utils.DATA_KEY, plot=False, reference=None, batch=None, log_reference=True)

Performs high-confidence thresholding using the subset definitions for one level of a MMoCHi hierarchy. Note: This function relies on mmc.utils.get_data() to extract expression information from the adata object. For each marker, it first searches for a matching protein, then a matching gene, and finally a matching column in .obs. Appending ‘_gex’ or ‘_obs’ to the marker name bypasses this priority order.
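The lookup order described above can be sketched in plain Python. This is a simplified illustration, not MMoCHi's implementation, and it does not call mmc.utils.get_data(). The function resolve_marker and its arguments (proteins, genes, obs_columns) are hypothetical names, and the sketch assumes the suffix is stripped before the feature is looked up.

```python
# Simplified illustration of the documented lookup order. This is NOT
# MMoCHi's implementation; resolve_marker and its arguments are hypothetical.

SUFFIXES = {"_gex": "gex", "_obs": "obs"}


def resolve_marker(marker, proteins, genes, obs_columns):
    """Return (source, feature_name) for a marker name."""
    sources = {"protein": proteins, "gex": genes, "obs": obs_columns}

    # A recognized suffix bypasses the priority order and pins the source.
    # Assumption: the suffix is stripped before the feature is looked up.
    for suffix, source in SUFFIXES.items():
        if marker.endswith(suffix):
            name = marker[: -len(suffix)]
            if name in sources[source]:
                return source, name
            raise KeyError(f"{name!r} not found in {source}")

    # Default priority: proteins first, then genes, then .obs columns.
    for source in ("protein", "gex", "obs"):
        if marker in sources[source]:
            return source, marker
    raise KeyError(f"{marker!r} not found in any source")


# Toy feature names; "CD4" exists in all three places on purpose.
proteins = {"CD4", "CD8"}
genes = {"CD4", "CD3E"}
obs_columns = {"batch", "CD4"}

print(resolve_marker("CD4", proteins, genes, obs_columns))
print(resolve_marker("CD4_gex", proteins, genes, obs_columns))
print(resolve_marker("CD3E", proteins, genes, obs_columns))
```

In this toy setup, a bare marker name that exists in several places resolves to the protein, while a suffixed name goes straight to the requested source.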

Parameters:
  • adata (AnnData) – Object containing gene expression data, plus expression data for each modality referenced by a data key in .obsm or .var[utils.MODALITY_COLUMN]

  • hierarchy (Hierarchy object) – Specifies one or more classification levels and subsets.

  • level (str) – Name of the classification level in the hierarchy at which to threshold markers and make high-confidence calls

  • data_key (Union[str, list, None] (default: utils.DATA_KEY)) – Key(s) in adata.obsm or .var[utils.MODALITY_COLUMN] to be used for high-confidence thresholding

  • plot (bool (default: False)) – Whether to display histograms of the thresholds; passed to thresholding.threshold

  • reference (Optional[str] (default: None)) – Column in adata.obs to use in logged debug info, revealing how hc_threshold performs

  • batch (Optional[str] (default: None)) – Name of a column in adata.obs that corresponds to a batch for use in the classifier

  • log_reference (bool (default: True)) – If true, logs high-confidence thresholding information

Returns:

DataFrame of high-confidence calls indicating which classification each event falls into at the specified level of the hierarchy

Return type:

DataFrame