mmc.classifier_setup()
-
mmochi.classifier.classifier_setup(adata, x_modalities, data_key=utils.DATA_KEY, reduce_features_min_cells=0, features_limit=None)

Setup that can optionally be completed before running mmc.classify. Run it ahead of time to reduce the runtime of the classifier function inside a parameter optimization loop; otherwise it is run automatically when training a classifier. It concatenates the .X with the .obsm[data_key], then performs feature reduction (if reduce_features_min_cells > 0). Next, if features_limit is defined, features are limited to an external feature set, so that they can match the features expected by the hierarchy. Finally, it sorts the resulting feature_names (the columns from the .X and .obsm[data_key]) and the matching columns of the csr_matrix alphabetically, to make the feature order reproducible across runs.
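The steps described above (concatenate, reduce, limit, sort) can be sketched in plain Python. This is an illustration only, not mmochi's implementation: the function setup_features_sketch and all of its arguments are hypothetical, matrices are plain lists of rows rather than a csr_matrix, a feature is assumed to "vary" in a cell when its value there is nonzero, and modalities missing from the features_limit dictionary are assumed to allow all features.

```python
def setup_features_sketch(x_rows, x_names, obsm_rows, obsm_names,
                          x_modality, obsm_modality,
                          min_cells=0, features_limit=None):
    """Hypothetical stand-in for the setup steps; not the mmochi API."""
    # 1. Concatenate the two blocks column-wise, one row per cell.
    rows = [list(a) + list(b) for a, b in zip(x_rows, obsm_rows)]
    names = list(x_names) + list(obsm_names)
    modality = [x_modality] * len(x_names) + [obsm_modality] * len(obsm_names)
    keep = list(range(len(names)))

    # 2. Feature reduction (assumption: "varies" means nonzero in that cell).
    if min_cells > 0:
        keep = [j for j in keep
                if sum(1 for r in rows if r[j] != 0) >= min_cells]

    # 3. Optional limit to an external feature set, given per modality.
    if features_limit is not None:
        def allowed(j):
            # Assumption: a modality absent from the dict allows everything.
            limit = features_limit.get(modality[j], 'All')
            return limit == 'All' or names[j] in limit
        keep = [j for j in keep if allowed(j)]

    # 4. Sort columns alphabetically by feature name for a reproducible order.
    keep.sort(key=lambda j: names[j])
    matrix = [[r[j] for j in keep] for r in rows]
    return matrix, [names[j] for j in keep]


# Three cells; "GeneA" is zero everywhere, so min_cells=1 removes it.
x_rows = [[1, 0, 3], [0, 0, 2], [4, 0, 0]]
x_names = ["GeneC", "GeneA", "GeneB"]
obsm_rows = [[5, 0], [6, 0], [7, 1]]
obsm_names = ["CD8", "CD4"]

matrix, names = setup_features_sketch(
    x_rows, x_names, obsm_rows, obsm_names,
    x_modality="GEX", obsm_modality="ADT",
    min_cells=1,
    features_limit={"GEX": ["GeneB", "GeneC"], "ADT": "All"},
)
print(names)
print(matrix)
```

Because the columns are sorted by name after filtering, the same inputs give the same feature order regardless of how the .X and .obsm columns were originally arranged.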
Parameters:
- adata (AnnData) – Object containing gene expression data, and expression data for modalities for every data key in .obsm
- x_modalities (Union[str, List[str]]) – Name of the modality of the data in the .X of adata
- data_key (Optional[str], default: utils.DATA_KEY) – Key in adata.obsm to concatenate into .X and to reduce features across
- reduce_features_min_cells (int, default: 0) – Remove features that vary in fewer than this number of cells; passed to _reduce_features
- features_limit (default: None) – List-like of str, or dictionary in the format {'modality_1': ['gene_1', 'gene_2', …], 'modality_2': 'All'}. Specifies the allowed features to classify on for a given modality
Returns:
- scipy.sparse.csr_matrix – Reduced adata data for classification
- list – List of features that were checked/used in the reduction process
- adata (