API reference #

stampede.pp#

preprocessing functions

stampede.pp.binarize(adata, verbose=True)#

Binarize the values in adata.X

Parameters:

adata (AnnData) – adata object
verbose (bool) – provide written feedback (default: True)

Return type:

None

Returns:

Nothing, updates adata.layers and adata.X
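
To illustrate what binarizing a count matrix means, here is a minimal sketch in plain NumPy rather than stampede itself. It assumes that every value greater than zero becomes 1; the exact rule applied by binarize is not stated above, and the toy matrix is made up.

```python
import numpy as np

# Toy count matrix: 3 cells (rows) x 4 genes (columns).
counts = np.array([
    [0, 3, 1, 0],
    [2, 0, 0, 0],
    [0, 0, 5, 1],
])

# Assumed rule: a gene counts as detected in a cell if its count is above zero.
binary = (counts > 0).astype(np.int8)

print(binary)
```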

stampede.pp.cell_qc_postfilter(adata)#

Compute metadata after filtering

Parameters:

adata (AnnData) – an adata object

Return type:

None

Returns:

Nothing, updates adata.obs

stampede.pp.detection_rates(adata, samples_column, normalize=True)#

Calculate gene detection rates per sample in the samples_column of adata.obs.

Parameters:

adata (AnnData) – adata object
samples_column (str) – column in adata.obs
normalize (bool) – normalize detection rates for sample quality (default: True)

Return type:

DataFrame

Returns:

a dataframe with gene detection rates per sample (normalized if normalize is True)
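
A detection rate can be read as the fraction of a sample's cells in which a gene is detected. The sketch below computes that un-normalized rate with plain NumPy on made-up data. How normalize=True adjusts for sample quality is not described above, so that step is deliberately left out.

```python
import numpy as np

# Binarized matrix: 5 cells x 3 genes (1 = gene detected in the cell).
binary = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
])
# Sample label of each cell, as it would appear in a column of adata.obs.
samples = np.array(["s1", "s1", "s1", "s2", "s2"])

# Detection rate = fraction of a sample's cells in which the gene is detected.
rates = {
    str(sample): binary[samples == sample].mean(axis=0)
    for sample in np.unique(samples)
}

for sample, rate in rates.items():
    print(sample, rate)
```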

stampede.pp.dim_red(adata, n_dims=50, key_added=None, random_state=42)#

Dimensionality reduction using Term Frequency Latent Semantic Indexing.

Parameters:

adata (AnnData) – adata object
n_dims (int) – number of PCs to use (default: 50)
key_added (str) – key in adata.obsm for function output (default: “X_svd”)
random_state (int) – random seed value (default: 42)

Return type:

None

Returns:

Nothing, updates adata.obsm and adata.uns
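
Latent semantic indexing is commonly implemented as a TF-IDF weighting followed by a truncated singular value decomposition. The sketch below shows that general recipe in NumPy on random data. The specific TF and IDF formulas are assumptions chosen for illustration and may differ from what dim_red does internally.

```python
import numpy as np

rng = np.random.default_rng(42)
# Toy binary matrix: 20 cells x 10 genes.
X = (rng.random((20, 10)) < 0.4).astype(float)
# Make sure no cell or gene is completely empty, to avoid division by zero.
X[:, 0] = 1.0
X[0, :] = 1.0

# Term frequency: each cell's values divided by the cell's total.
tf = X / X.sum(axis=1, keepdims=True)
# Inverse document frequency: genes found in fewer cells get a higher weight.
idf = np.log1p(X.shape[0] / X.sum(axis=0))
tfidf = tf * idf

# Truncated SVD: keep the first n_dims components as the cell embedding.
n_dims = 3
U, S, Vt = np.linalg.svd(tfidf, full_matrices=False)
embedding = U[:, :n_dims] * S[:n_dims]

print(embedding.shape)
```

The singular values in S are what a scree plot would typically be built from.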

stampede.pp.filter_cells(adata, dist2edge_px_min=0, falsecode_max=5, negprobe_max=3, ntranscript_min=250, ntranscript_max=1500, area_min=25, area_max=100, filter_columns=None, verbose=True)#

Filter adata.obs by a set of qc_params.

Parameters:

adata (AnnData) – adata object
dist2edge_px_min (int) – minimum distance (in pixels) the cell must have from the FOV edge
falsecode_max (int) – maximum number of false codes the cell may have
negprobe_max (int) – maximum number of negative probes the cell may have
ntranscript_min (int) – minimum number of transcripts the cell must have
ntranscript_max (int) – maximum number of transcripts the cell must have
area_min (int) – minimum area (in pixels) the cell must have
area_max (int) – maximum area (in pixels) the cell must have
filter_columns (list) – a list of additional columns to filter by. Columns must be (convertible to) boolean; cells with False values are removed.
verbose (bool) – provide written feedback (default: True)

Return type:

AnnData

Returns:

the filtered adata object
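
The threshold logic can be pictured with a small pure-Python sketch. The thresholds are the defaults from the signature above; the field names (ntranscript, area) are hypothetical stand-ins for the real adata.obs columns, and treating the bounds as inclusive is an assumption.

```python
# Default thresholds taken from the filter_cells signature.
thresholds = {
    "ntranscript_min": 250,
    "ntranscript_max": 1500,
    "area_min": 25,
    "area_max": 100,
}

# Hypothetical per-cell QC values; the real column names in adata.obs may differ.
cells = [
    {"id": "c1", "ntranscript": 300, "area": 50},
    {"id": "c2", "ntranscript": 100, "area": 60},   # too few transcripts
    {"id": "c3", "ntranscript": 900, "area": 140},  # area too large
    {"id": "c4", "ntranscript": 1500, "area": 25},  # exactly on two bounds
]


def passes(cell, t):
    """Return True if the cell lies within all bounds (assumed inclusive)."""
    return (
        t["ntranscript_min"] <= cell["ntranscript"] <= t["ntranscript_max"]
        and t["area_min"] <= cell["area"] <= t["area_max"]
    )


kept = [cell["id"] for cell in cells if passes(cell, thresholds)]
print(kept)
```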

stampede.pp.filter_genes(adata, ncell_min=0, ncell_max=inf, ntranscript_min=0, ntranscript_max=inf, signal2noise_threshold=1.0, filter_columns=None, verbose=True)#

Filter adata.var by a set of qc_params.

Parameters:

adata (AnnData) – adata object
ncell_min (int) – minimum number of cells the gene is found in.
ncell_max (int) – maximum number of cells the gene is found in.
ntranscript_min (int) – minimum number of transcripts the gene must have.
ntranscript_max (int) – maximum number of transcripts the gene must have.
signal2noise_threshold (float) – the minimum signal-to-noise ratio the gene must have.
filter_columns (str | list) – one or more additional columns to filter by. Columns must be (convertible to) boolean; genes with False values are removed.
verbose (bool) – provide written feedback (default: True)

Return type:

AnnData

Returns:

the filtered adata object

stampede.pp.gene_qc(adata, signal2noise_threshold=None, mult=1, overwrite=False)#

Add QC parameters to adata.var.

About the Signal-to-noise filter:

Approach from Wang et al., “Systematic benchmarking of imaging spatial transcriptomics platforms in FFPE tissues”, Nat Commun, 2025 (https://doi.org/10.1038/s41467-025-64990-y).

Calculate the mean and standard deviation (STD) of the expression of the negative control probes. Remove genes with average expression < mean + mult × STD of the control probes (the paper used mult=2).

Parameters:

adata (AnnData) – an adata object
signal2noise_threshold (float | Iterable) – manually specify the threshold. If None, use the filter specified above.
mult (int | float) – if signal2noise_threshold is None, mult is used in the signal2noise threshold computation specified above.
overwrite (bool) – overwrite existing qc columns (default: False)

Return type:

None

Returns:

Nothing, updates adata.var
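
As a worked example of the signal-to-noise threshold described above, the sketch below computes mean + mult × STD from made-up negative control probe values and compares a few made-up genes against it. Using the population standard deviation is an assumption; the library may use the sample standard deviation instead.

```python
from statistics import mean, pstdev

# Hypothetical average expression of the negative control probes.
neg_probe_means = [0.02, 0.04, 0.03, 0.05, 0.01]
# Hypothetical average expression per gene.
gene_means = {"GeneA": 0.50, "GeneB": 0.04, "GeneC": 0.09}

mult = 2  # the multiplier used in the paper

# Threshold = mean + mult * STD of the control probes.
threshold = mean(neg_probe_means) + mult * pstdev(neg_probe_means)

# Genes whose average expression is below the threshold fail the filter.
passing = {gene: value >= threshold for gene, value in gene_means.items()}

print(round(threshold, 4), passing)
```

With these numbers the control probes have a mean of 0.03 and a standard deviation of about 0.0141, so the threshold is about 0.0583 and only GeneB falls below it.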

stampede.pp.gene_qc_postfilter(adata)#

Compute metadata after filtering

Parameters:

adata (AnnData) – an adata object

Return type:

None

Returns:

Nothing, updates adata.var

stampede.pp.knn_count_smoothing(adata, layer_added=None, neighbors_use_rep=None, neighbors_key_added=None, neighbors_kwargs=None, verbose=True)#

For each cell, replace its gene vector with the average over its k-nearest-neighbor (KNN) neighborhood.

Runs sc.pp.neighbors if it has not been run yet. See https://scanpy.readthedocs.io/en/stable/api/generated/scanpy.pp.neighbors.html

Parameters:

adata (AnnData) – adata object
layer_added (str) – key in adata.layers for function output (default: “KNN_binary_mean”)
neighbors_use_rep (str) – See sc.pp.neighbors for details
neighbors_key_added (str) – See sc.pp.neighbors for details
neighbors_kwargs (dict) – kwargs passed to sc.pp.neighbors
verbose (bool) – provide written feedback (default: True)

Return type:

None

Returns:

Nothing, updates adata.layers and adata.X
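
The smoothing idea can be sketched with a brute-force neighbor search in NumPy. This is only an illustration: the real function relies on the neighbor graph from sc.pp.neighbors, and both the toy embedding and the choice to count a cell as part of its own neighborhood are assumptions.

```python
import numpy as np

# Binary matrix: 4 cells x 3 genes.
X = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
], dtype=float)

# Hypothetical 2D embedding used to find neighbors (one row per cell).
embedding = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])

k = 2  # neighborhood size, counting the cell itself (an assumption)

# Pairwise Euclidean distances between all cells.
dist = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=2)
# Indices of the k closest cells per row; each cell is its own nearest neighbor.
neighbors = np.argsort(dist, axis=1)[:, :k]

# Replace each cell's gene vector with the mean over its neighborhood.
smoothed = X[neighbors].mean(axis=1)

print(smoothed)
```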

stampede.pp.pseudobulk(adata, samples_column, samples=None, cluster_column=None, cluster=None, layer=None)#

Generate a pseudobulk table (genes x samples) for all samples in samples_column, restricted to the given cluster in cluster_column if specified.

Parameters:

adata (AnnData) – adata object
samples_column (str) – column in adata.obs
samples (Iterable) – samples in samples_column to use (default: all)
cluster_column (str) – column in adata.obs (only needed if cluster is specified)
cluster (str) – name of the cluster in cluster_column to aggregate to pseudobulk
layer (str) – layer to aggregate (default: “counts”)

Return type:

DataFrame

Returns:

a dataframe with summed layer values per sample
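
Pseudobulking amounts to summing the values of all cells that belong to the same sample, optionally after restricting to one cluster. A minimal NumPy sketch on made-up data follows; the real function returns a DataFrame, whereas this sketch keeps a plain array with genes as rows and samples as columns.

```python
import numpy as np

# Counts: 5 cells x 2 genes.
counts = np.array([
    [1, 0],
    [2, 1],
    [0, 4],
    [3, 3],
    [1, 1],
])
samples = np.array(["s1", "s1", "s2", "s2", "s2"])
clusters = np.array(["T", "B", "T", "T", "B"])

cluster = "T"  # restrict to one cluster; use None to keep all cells

if cluster is None:
    keep = np.ones(len(samples), dtype=bool)
else:
    keep = clusters == cluster

# Sum the counts of the kept cells per sample, giving genes x samples.
sample_names = sorted(set(samples.tolist()))
pseudobulk = np.column_stack([
    counts[keep & (samples == name)].sum(axis=0) for name in sample_names
])

print(sample_names)
print(pseudobulk)
```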

stampede.pp.slide_qc(adata, slides, data_dir=None)#

Use the fov_positions file to create a dataframe with metadata columns per slide and FOV, and store this in adata.uns[“fov_metadata”]. Additionally adds columns to adata.obs reflecting the distance from the cell to the camera’s FOV edge.

Parameters:

adata (AnnData) – adata object generated using the slides dict
slides (dict) – a dictionary with the slide number as keys, and a dictionary as values. Each value dict must contain the keys “exprmat” and “metadata”, which should map to the respective matching files
data_dir (str) – optional filepath prefix (default: “”)

Return type:

None

Returns:

Nothing, updates adata.uns and adata.obs
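
The distance from a cell to the FOV edge can be sketched as the smallest distance to any of the four borders of a rectangular field of view. The FOV size, the cell coordinates, and the helper name distance_to_fov_edge below are all hypothetical; the actual computation in slide_qc may differ.

```python
def distance_to_fov_edge(x, y, width, height):
    """Smallest distance (in pixels) from point (x, y) to any edge of a
    width x height field of view whose origin is at (0, 0)."""
    return min(x, width - x, y, height - y)


# Hypothetical FOV size and cell centers, in FOV-local pixel coordinates.
fov_width, fov_height = 1000, 800
cells = {"c1": (500, 400), "c2": (10, 400), "c3": (990, 795)}

dist2edge = {
    cell: distance_to_fov_edge(x, y, fov_width, fov_height)
    for cell, (x, y) in cells.items()
}
print(dist2edge)
```

A value like this is what a minimum-distance cutoff such as dist2edge_px_min in filter_cells would be compared against.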

stampede.pl#

plotting functions

stampede.pl.avg_per_pixel(adata, column, fill_cell_area=False, normalize_cell_area=True, log1p=False, cmap=None, background_color=None, figsize=(20, 15), subplot_kwargs=None, plot_kwargs=None)#

Plot the average values of the given column over all FOVs. Colors the cell’s center pixel, unless fill_cell_area is set to True (slow).

Parameters:

adata (AnnData) – an adata object
column (str) – a column in adata.obs with numeric values
fill_cell_area (bool) – distribute the column value over all pixels covered by the cell, assuming square cells (default: False)
normalize_cell_area (bool) – if fill_cell_area is True, normalize the column value over the cell area (default: True)
log1p (bool) – log1p-transform the final values per pixel (default: False)
cmap (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap (default: “gist_rainbow”)
background_color (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color for pixels with 0 values (default: “black”)
figsize (tuple) – figure size
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes
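
Averaging a column per pixel over many FOVs boils down to accumulating a sum and a count per pixel and dividing the two. The NumPy sketch below shows this for the center-pixel case (fill_cell_area=False) on made-up coordinates; it is an illustration, not the library's implementation.

```python
import numpy as np

# Hypothetical cell centers (FOV-local pixel coordinates) pooled over FOVs,
# with the numeric value of the column to average.
rows = np.array([0, 0, 2, 2, 2])
cols = np.array([1, 1, 3, 3, 0])
values = np.array([10.0, 20.0, 3.0, 5.0, 7.0])

height, width = 3, 4
total = np.zeros((height, width))
count = np.zeros((height, width))

# np.add.at accumulates correctly when several cells hit the same pixel.
np.add.at(total, (rows, cols), values)
np.add.at(count, (rows, cols), 1)

# Average per pixel; pixels without any cell stay at 0.
average = np.divide(total, count, out=np.zeros_like(total), where=count > 0)

print(average)
```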

stampede.pl.column_distribution(adata, column, axis=None, min_quantile=0.0, max_quantile=0.95, subplot_kwargs=None, plot_kwargs=None)#

Plot the distribution of values for a column present in either adata.obs or adata.var.

Parameters:

adata (AnnData) – an adata object.
column (str) – a column in either adata.obs or adata.var
axis (int) – only needed if the column name is present in both adata.obs and adata.var: 0 selects obs, 1 selects var.
min_quantile (float) – lowest quantile of values to plot (default: 0.00)
max_quantile (float) – highest quantile of values to plot (default: 0.95)
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes
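
The min_quantile and max_quantile parameters limit the plotted range so that a few extreme values do not dominate the plot. A minimal NumPy sketch of such a quantile window on made-up values follows; whether the library treats the bounds as inclusive is an assumption.

```python
import numpy as np

values = np.array([1, 2, 2, 3, 3, 3, 4, 5, 8, 250], dtype=float)
min_quantile, max_quantile = 0.0, 0.95

# Translate the quantiles into value bounds.
low, high = np.quantile(values, [min_quantile, max_quantile])

# Keep only the values inside the quantile window (bounds assumed inclusive).
shown = values[(values >= low) & (values <= high)]

print(low, high, shown)
```

Here the single outlier of 250 lies above the 0.95 quantile and is left out of the plotted values.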

stampede.pl.correlations(adata, xcolumn, ycolumn, log1p_xcolumn=False, log1p_ycolumn=False, color_xcolumn=None, color_ycolumn=None, cmap_2d=None, bins_1d=50, bins_2d=None, stat=None, figsize=(8, 7), subplot_kwargs=None, plot_kwargs=None)#

Plot the distributions and 2D correlation between two columns in adata.obs.

Parameters:

adata (AnnData) – an adata object
xcolumn (str) – column in adata.obs to plot on the x-axis
ycolumn (str) – column in adata.obs to plot on the y-axis
log1p_xcolumn (bool) – log1p-transform the xcolumn values (default: False)
log1p_ycolumn (bool) – log1p-transform the ycolumn values (default: False)
color_xcolumn (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color of the xcolumn plot
color_ycolumn (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color of the ycolumn plot
cmap_2d (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap of the 2d correlation plot (default: “Blues”)
bins_1d (str | int) – number of bins on the 1-dimensional histogram plots
bins_2d (str | int) – number of bins on the 2-dimensional histogram plot
stat (str) – which statistic to plot, see sns.histplot for more details (default: “percent”)
figsize (tuple) – figure size
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.dim_red(adata, columns, obsm_key=None, cmap='tab10', n_dims=6, subset_size=1000, random_state=42)#

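Plot the output of a dimensionality reduction (see stampede.pp.dim_red), colored by one or more columns in adata.obs.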

Parameters:

adata (AnnData) – adata object
columns (str | Iterable) – one or more columns in adata.obs to plot. One multiplot per column.
obsm_key (str) – key in adata.obsm with dim_red output (default: “X_svd”)
cmap (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap
n_dims (int) – number of PCs to use (default: 6)
subset_size (int) – subsample the data to this number (per column)
random_state (int) – random seed value

Return type:

list[tuple[Figure, Axes]]

Returns:

a list of tuples with matplotlib figure and axis

stampede.pl.ncell_per_condition(adata, columns, offset_between_conditions=1, palette=None, subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Plot the number of cells per condition in a column in adata.obs.

Parameters:

adata (AnnData) – an adata object
columns (str | list) – one or more columns in adata.obs to visualize, in order of significance
offset_between_conditions (int | list) – distance between different conditions. Can be a single value, or a list of offset values for each column (length=len(columns)-1)
palette (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color palette (default: “terrain”)
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.set_xticks and ax.set_yticks

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.paired_binomial_glm_volcano(df, drop_perfect_separation=True, pval_thresh=0.05, or_thresh=0.75, to_label=5, subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Generate a volcano plot from the detection_rates results dataframe.

Parameters:

df (DataFrame) – a dataframe
drop_perfect_separation (bool) – whether to drop the genes with perfect separation
pval_thresh (float) – p-value threshold for genes to be considered significant
or_thresh (float) – threshold for the log2 odds ratios to be considered significant
to_label (int | list) – the number of top genes (down and up each) to be labeled
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.text

Return type:

Returns:

matplotlib figure and axis object

stampede.pl.pydeseq2_volcano(df, symbol_column='index', log2fc_column='log2FoldChange', pvalue_column='padj', baseMean_column='baseMean', pval_thresh=0.05, log2fc_thresh=0.75, to_label=5, colors=None, subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Generate a volcano plot from a pyDESeq2 results dataframe.

Adapted from mousepixels/sanbomics

Parameters:

df (DataFrame) – a pyDESeq2 results dataframe
symbol_column (str) – column name of gene IDs to use
log2fc_column (str) – column name of log2 Fold-Change values
pvalue_column (str) – column name of the adjusted p values to be converted to -log10 p-values
baseMean_column (str) – column name of base mean values for each gene
pval_thresh (float) – threshold on pvalue_column for points to be considered significant
log2fc_thresh (float) – threshold for the absolute value of the log2 fold change to be considered significant
to_label (int | list) – If an int is passed, that number of top down and up genes will be labeled. If a list of gene IDs is passed, only those will be labeled.
colors (list) – order and colors to use
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.text

Return type:

Returns:

matplotlib figure and axis object
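
The two thresholds split the points of a volcano plot into up, down, and not significant. The pure-Python sketch below applies the default pval_thresh and log2fc_thresh to a few made-up result rows. Whether the library uses strict or non-strict comparisons is an assumption, and the helper name classify is hypothetical.

```python
import math

pval_thresh = 0.05
log2fc_thresh = 0.75

# Hypothetical rows of a results table: gene -> (log2FoldChange, padj).
results = {
    "GeneA": (2.1, 0.001),
    "GeneB": (-1.3, 0.020),
    "GeneC": (0.2, 0.0001),  # significant p-value, but small effect
    "GeneD": (1.5, 0.300),   # large effect, but not significant
}


def classify(log2fc, padj):
    """Label a point as 'up', 'down' or 'ns' (not significant)."""
    if padj < pval_thresh and abs(log2fc) > log2fc_thresh:
        return "up" if log2fc > 0 else "down"
    return "ns"


# The y-axis of the volcano plot is -log10 of the adjusted p-value.
points = {
    gene: (log2fc, -math.log10(padj), classify(log2fc, padj))
    for gene, (log2fc, padj) in results.items()
}

for gene, point in points.items():
    print(gene, point)
```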

stampede.pl.scree(adata, obsm_key=None)#

Scree plot

Parameters:

adata (AnnData) – adata object
obsm_key (str) – key in adata.obsm with dim_red output (default: “X_svd”)

Return type:

Returns:

matplotlib figure and array of axes
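
A scree plot usually shows how much variance each component explains, in decreasing order. Assuming the singular values of the decomposition are available, the share of each component can be sketched as below. The values are made up, and the ratio is taken relative to the retained components only, which may not match what scree plots.

```python
# Hypothetical singular values of the first components, in decreasing order.
singular_values = [4.0, 2.0, 1.0, 1.0]

# The variance captured by a component is proportional to its squared
# singular value; dividing by the total gives the explained variance ratio.
squared = [s ** 2 for s in singular_values]
explained_ratio = [v / sum(squared) for v in squared]

for i, ratio in enumerate(explained_ratio, start=1):
    print(f"component {i}: {ratio:.3f}")
```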

stampede.pl.sketch(adata, obs_column='subset', use_rep='X_svd', plot_kwargs=None)#

Scatterplot highlighting the cells that were sampled. Requires the full adata object.

Parameters:

adata (AnnData) – adata object
obs_column (str) – boolean column in adata.obs indicating whether the cell was kept (default: “subset”)
use_rep (str) – use the indicated representation
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.slide_qc(adata, columns=None, figsize=None, subplot_kwargs=None, plot_kwargs=None)#

Plot the values from one or more QC columns in adata.uns[“fov_metadata”] (added by stampede.pp.slide_qc()). Specify columns to limit the number of plots.

Parameters:

adata (AnnData) – an adata object
columns (str | Iterable) – columns in adata.uns[“fov_metadata”] to plot (default: all)
figsize (tuple) – figure size of a single plot; it is multiplied by the number of plots
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.value_distribution(adata, layer=None, min_quantile=0.0, max_quantile=0.95, subplot_kwargs=None, plot_kwargs=None)#

Plot the number of occurrences of values in the dataset.

Parameters:

adata (AnnData) – an adata object.
layer (str) – the layer the values are drawn from (default: X)
min_quantile (float) – lowest quantile of values to plot (default: 0.00)
max_quantile (float) – highest quantile of values to plot (default: 0.95)
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.violin(adata, columns, inner=None, fill=False, cut=0, log_scale=(False, True), subplot_kwargs=None, plot_kwargs=None)#

Violin plots for one or more columns in adata.obs.

Wraps seaborn’s violinplot. See https://seaborn.pydata.org/generated/seaborn.violinplot.html

Parameters:

adata (AnnData) – an adata object
columns (str | list) – one or more columns in adata.obs
inner (str) – See sns.violinplot for more details.
fill (bool) – See sns.violinplot for more details.
cut (int) – See sns.violinplot for more details.
log_scale (tuple[bool, bool]) – See sns.violinplot for more details.
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type: