Single Cell RNA-Seq Monocle3 Autocorrelation

Introduction

Monocle3 provides an alternative method for identifying genes that vary between groups of cells in low-dimensional space. In the context of trajectories, it focuses on genes that change along the trajectory graph within UMAP space.

Monocle3 identifies such genes with Moran's I, a measure of spatial autocorrelation. Spatial autocorrelation is the correlation of a signal among neighbouring locations in space. For scRNA-Seq trajectories, the space is the UMAP embedding, and the locations lie along the trajectory graph. With this test, Monocle3 aims to discover genes associated with specific locations on the trajectory, such as branching points.
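To make the statistic concrete, the following is a minimal Python sketch of Moran's I for a single gene. It is an illustration only, not the Monocle3 implementation: the function name `morans_i`, the toy six-cell path graph, and the use of binary (0/1) neighbour weights are all assumptions made for this example.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for one gene, given a cell-by-cell neighbour weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                  # deviation of each cell from the mean expression
    num = (w * np.outer(z, z)).sum()  # cross-products summed over neighbouring pairs
    den = (z ** 2).sum()              # total squared deviation
    return (len(x) / w.sum()) * num / den

# Toy trajectory (assumption): 6 cells on a path, each linked to the next one.
n = 6
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0

smooth = [0, 0, 0, 5, 5, 5]        # expression switches once along the path
alternating = [0, 5, 0, 5, 0, 5]   # neighbouring cells always differ

i_smooth = morans_i(smooth, w)
i_alt = morans_i(alternating, w)
print(i_smooth, i_alt)
```

The gene that changes once along the path yields a positive value, whereas the gene whose neighbours always differ yields a negative value.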

The concept of autocorrelation differs from that of differentially expressed (DE) genes. In scRNA-Seq trajectories, DE genes usually change their mean expression as a function of pseudotime. A DE test of this kind does not necessarily take the UMAP space or the branching patterns into account.

This tool is based on the R package Monocle3. Please cite Monocle3 as:

Qiu, Xiaojie, et al. "Reversed Graph Embedding Resolves Complex Single-Cell Trajectories." Nature Methods, vol. 14, no. 10, 21 Aug. 2017, pp. 979–982, 10.1038/nmeth.4402.

Qiu, Xiaojie, et al. "Single-Cell mRNA Quantification and Differential Analysis with Census." Nature Methods, vol. 14, no. 3, 23 Jan. 2017, pp. 309–315, 10.1038/nmeth.4150.

Trapnell, Cole, et al. "Pseudo-Temporal Ordering of Individual Cells Reveals Dynamics and Regulators of Cell Fate Decisions." Nature Biotechnology, vol. 32, no. 4, 1 Apr. 2014, pp. 381–386, 10.1038/nbt.2859.

Accessing Monocle3 Autocorrelation in OmicsBox

Perform trajectory analysis with Monocle3 in OmicsBox. After the trajectory analysis, the results table will appear in the Main Table Output. On the side panel of this output, click "Autocorrelation" to initiate Monocle3 Autocorrelation (refer to Figure 1). This action uses the current trajectory analysis results as input for the Monocle3 Autocorrelation Wizard.

By default, the autocorrelation analysis will consider all the results after the trajectory analysis. If you wish to run the analysis for a subset of the trajectory, please subset it beforehand.

image-20240110-093208.png

Figure 1. Autocorrelation with Monocle3 is available as a side panel option for Monocle3 Trajectory Inference Results in OmicsBox.

Configure Monocle3 Autocorrelation

  1. Neighbourhood Graph: Select whether to run the autocorrelation on the Principal graph (PQ-graph) or the KNN graph. The Principal graph is the default setting; it uses the trajectory line as the input, which is a PQ-Tree (a particular case of Minimum Spanning Tree).
  2. Nearest Neighbours: Number of nearest neighbours used to build the KNN graph. This option is only enabled when the Neighbourhood Graph is set to "KNN".
  3. Alternative Hypothesis: Hypothesis to test against the null hypothesis, which states that the expression of a gene shows no spatial autocorrelation. The available options are:
       a. Two-sided: This alternative hypothesis tests for any form of spatial autocorrelation without specifying the direction.
       b. Greater: This tests for positive spatial autocorrelation specifically. It suggests that cells with similar expression of a gene are clustered together in space more than expected under a random spatial distribution. Choose this hypothesis when there is a reason to test for the presence of clustering.
       c. Less: This tests for negative spatial autocorrelation. It implies that cells with dissimilar expression of a gene are located near each other more frequently than would be expected by chance, indicating a spatial pattern of dispersion. Choose this hypothesis when the interest lies in detecting spatial segregation or dispersion.
  4. Family: Expected distribution of the residuals for modelling the expression. Refer to the table below.

| Expression Family           | Accuracy | Speed | Recommendation                                           |
| Negative Binomial (Default) | +++      | +     | It is recommended for most users and is highly accurate. |
| Quasi-Poisson               | ++       | ++    | It is recommended for users with massive datasets.       |
| Poisson                     | -        | +++   | For debugging and testing only.                          |
| Binomial                    | ++       | ++    | High zero inflation with extremely low counts.           |
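
The effect of the Alternative Hypothesis option can be sketched as follows. This is a hedged illustration rather than the tool's code: it assumes that the test statistic behaves like a standard normal deviate under the null hypothesis, and the function name `p_value` is hypothetical.

```python
import math

def p_value(z, alternative="greater"):
    """P-value for a test statistic z, assumed standard normal under the null."""
    upper = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(Z >= z)
    lower = 0.5 * math.erfc(-z / math.sqrt(2.0))  # P(Z <= z)
    if alternative == "greater":      # positive autocorrelation (clustering)
        return upper
    if alternative == "less":         # negative autocorrelation (dispersion)
        return lower
    if alternative == "two-sided":    # autocorrelation in either direction
        return min(1.0, 2.0 * min(upper, lower))
    raise ValueError(f"unknown alternative: {alternative}")

z = 1.96
p_greater = p_value(z, "greater")
p_less = p_value(z, "less")
p_two_sided = p_value(z, "two-sided")
print(p_greater, p_less, p_two_sided)
```

For the same statistic, "Greater" and "Less" look at opposite tails of the distribution, and "Two-sided" doubles the smaller of the two tails.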

image-20240403-153330.png

Figure 2. Configuration of Autocorrelation Analysis.

Side Panel Actions

You can see the side panel actions after completing the analysis and obtaining the results. The currently available side panel options include:

  1. Summary Report: Produces a summary of the analysis.
  2. Update Tags: Update the significant tags based on the test results.
  3. Fisher’s Exact Test: Perform overrepresentation analysis of the significantly tagged genes.

image-20240110-095123.png

Figure 3. Side panel actions available after the Monocle3 Autocorrelation analysis in OmicsBox.

Output

Monocle3 Autocorrelation in OmicsBox generates a main table and a summary report. The main table has the following columns:

  1. Tags: Whether the gene is significant or not. Tags can be updated using the side panel actions.
  2. Gene ID: IDs of the genes.
  3. Gene Name: The gene names supplied; if the gene names are not supplied initially, this column will reflect the IDs.
  4. Moran's-I: This is a measure of spatial autocorrelation. Moran's I is a statistic that can range from -1 to +1, where a value close to +1 indicates strong positive spatial autocorrelation (meaning similar values cluster together in space), a value close to -1 indicates strong negative spatial autocorrelation (meaning dissimilar values are adjacent), and a value around 0 indicates a random spatial pattern (no autocorrelation).
  5. Moran's Test Statistic: Once Moran's I is calculated, it is typically transformed into a test statistic that follows a known probability distribution under the null hypothesis of no spatial autocorrelation. This statistic is used to evaluate the significance of the observed Moran's I, that is, whether the observed spatial pattern could reasonably occur by chance.
  6. P-Value: The p-value associated with Moran's I indicates the probability of observing a Moran's I at least as extreme as the one calculated from your data, assuming there is no spatial autocorrelation (the null hypothesis).
  7. Q-Value: The p-value adjusted for multiple testing. Because a separate Moran's I test is run for each gene, many tests are performed at once; the q-value controls the false discovery rate across all these tests.
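
As an illustration of how q-values relate to p-values, the sketch below applies the Benjamini-Hochberg procedure to a short list of p-values. The choice of Benjamini-Hochberg is an assumption made for this example, as this page does not state which correction is applied; the function name `bh_q_values` and the input values are hypothetical.

```python
import numpy as np

def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed correction method)."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)                        # indices of p-values, smallest first
    scaled = p[order] * n / np.arange(1, n + 1)  # p * n / rank
    # Enforce monotonicity, working from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(scaled, 1.0)           # restore the original gene order
    return q

q = bh_q_values([0.001, 0.04, 0.03, 0.5])
print(q)
```

Each q-value is at least as large as its p-value, so a gene that passes a threshold on the P-Value column may not pass the same threshold on the Q-Value column.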

image-20240403-160411.png

Figure 4. Main output table of the Monocle3 Autocorrelation analysis in OmicsBox.

Actions: Summary Report

Generates a summary report of the analysis.

image-20240408-105758.png

Figure 5. Detailed summary of the results and the parameters used during the analysis.

Actions: Update Tags

Update tags based on the results columns.

  1. Select Statistic: Choose between the Moran’s I and the Moran’s I Test Statistic.

  2. Select direction: Choose the direction for the selected statistic (Moran’s I or the Moran’s I Test Statistic).

  3. Significance Criteria: Choose between P-Value or Q-Value (Adjusted P-Value).

  4. Threshold: Select the threshold for the selected significance criteria.
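
The logic behind these four options can be sketched in Python as below. This is not the OmicsBox implementation: the function `tag_genes`, the column keys (`gene_id`, `morans_I`, `q_value`), the direction labels, and the example rows are all hypothetical and only illustrate how a statistic, a direction, a significance criterion, and a threshold could combine into a tag.

```python
def tag_genes(rows, statistic="morans_I", direction="positive",
              criterion="q_value", threshold=0.05):
    """Return {gene_id: bool}; True when the gene would be tagged significant."""
    tags = {}
    for row in rows:
        passes_threshold = row[criterion] <= threshold
        value = row[statistic]
        if direction == "positive":
            passes_direction = value > 0
        elif direction == "negative":
            passes_direction = value < 0
        else:                         # any other label: ignore the direction
            passes_direction = True
        tags[row["gene_id"]] = passes_threshold and passes_direction
    return tags

# Hypothetical result rows.
rows = [
    {"gene_id": "geneA", "morans_I": 0.42, "q_value": 0.001},
    {"gene_id": "geneB", "morans_I": 0.02, "q_value": 0.60},
    {"gene_id": "geneC", "morans_I": -0.15, "q_value": 0.01},
]

tags_positive = tag_genes(rows)
tags_any = tag_genes(rows, direction="any")
print(tags_positive, tags_any)
```

In this sketch, a gene with a significant but negative Moran's I is tagged only when the direction filter allows negative values.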

image-20240408-110356.png

Figure 6. Multiple options to update tags.

Actions: Fisher’s Exact Test

Please visit the section for Fisher’s Exact Test of the OmicsBox Manual for more details.

image-20240408-110902.png

Figure 7. Fisher’s Exact Test Wizard to perform overrepresentation analysis of significantly tagged genes.