scATAC-seq and identification of cell type clusters
We performed scATAC-seq on root tips isolated from rice Japonica group cultivar Nipponbare (Nip). High-quality nuclei from 120 root tips (5 mm in length of primary roots) were obtained from 3-day-old Nip seedlings using our improved nuclei isolation protocol (Additional file 1: Fig. S1). About 16,000 nuclei per replicate were incubated with Tn5 transposase in bulk. The transposed nuclei were then loaded onto a 10x Genomics microfluidic device and mixed with single-cell reaction reagents including cell barcodes. The resulting libraries were sequenced in a single pool (Fig. 1a). To exclude low-quality cells, we removed cells with less than 1000 unique nuclear fragments and a transcription start site (TSS) enrichment score of < 1.5. In addition, we removed doublets by using ArchR [19]. We found an excellent correlation between replicates, and different samples were still very similar despite being processed in different batches, suggesting that our process was robust (Additional file 1: Fig. S2a-c). For further quality control (QC), we first performed bulk ATAC-seq of independently isolated nuclei and again found a high correlation between scATAC-seq and bulk ATAC-seq (Additional file 1: Fig. S2a). Second, we compared the fragment size distribution in our samples to published data from other plants, such as maize [18], and found them to be similar (Additional file 1: Fig. S2b). Third, strong enrichment of fragments was observed within the regions that have been described as accessible chromatin (Additional file 1: Fig. S2d). Forth, the majority of scATAC peaks between replicates were overlapped (Additional file 1: Fig. S2e). Taken together, these results demonstrated that our scATAC-seq dataset is robust and of high quality. To give the community full access to the intuitive mining of our data, we have developed the Single-Cell Chromatin Accessibility of Rice root (SCAR) website, which is freely available at http://www.elabcaas.cn/scar/index.html.
Building on our quality controls, we were able to identify an average of 14,602 unique Tn5 integration sites in 25,312 nuclei, which corresponded to 13,848 ACRs covering ~ 18.6% of the rice genome (Additional file 1: Table S1). Based on unsupervised uniform manifold approximation and projection (UMAP), cells grouped into nine clusters representing distinct chromatin profiles (Fig. 1b). Following published strategies [15, 18], we annotated clusters based on specific ACRs located in the promoters of genes with cell type-specific expression. To this end, we first identified a list of genes with cluster-specific ACRs using a gene score matrix (Fig. 1c; Additional file 2: Tables S2, S3), before cross-referencing with our previously published cell type marker genes [7] (Fig. 1d). For example, the promoter of the root hair marker gene LOC_Os10g42750 contains a cluster 1 (C1)-specific ACR, the promoter of the root cap marker gene LOC_Os01g73700 contains a cluster 2 (C2)-specific ACR, the promoter of the endodermis marker gene LOC_Os01g15610 contains a cluster 3 (C3)- and cluster 4 (C4)-specific ACR, the promoter of the epidermis marker gene LOC_Os08g02300 contains a cluster 5 (C5)-specific ACR, the promoter of the stele marker gene LOC_Os01g15830 contains a cluster 6 (C6)- and cluster 7 (C7)-specific ACR, and the promoter of the cortex marker gene LOC_Os01g68589 contains a cluster 8 (C8)- and cluster 9 (C9)-specific ACR (Fig. 1d, e; Additional file 1: Fig. S3a). This strategy allowed us to annotate all scATAC-seq clusters and revealed that major cell types of the root were represented in our dataset. Interestingly, not all cluster-specific ACRs mapped to cell type marker genes. Consequently, we performed RNA in situ hybridization for a gene associated with differentially accessible regions and no prior evidence of cell type-specific expression. Reassuringly, the mRNA detection of LOC_Os04g53640 in the root cap was consistent with its enriched expression in the UMAP annotated root cap cluster (Fig. 1f, g; Additional file 1: Fig. S3b). It suggests that cluster-specific ACRs have the power to detect differentially expressed genes that single-cell RNA-seq is unable to identify.
Next, we compared the scATAC-seq with our recently reported scRNA-seq data from the protoplast of rice root tips [7]. We embedded them by adopting the approach used for integrating Arabidopsis scRNA-seq and scATAC-seq datasets [16] and plotted all cells with cell type labels from each dataset (Additional file 1: Fig. S4a). UMAP visualization showed similar distribution patterns for matching cell types between scRNA-seq and scATAC-seq, which further supported our cluster annotation. We also observed non-overlapping cells, which are likely caused by multiple factors, including (1) significant expression effects induced by the protoplasting during scRNA-seq, absent from the scATAC-seq process, and (2) genes with open chromatin accessibility but low/no expression (Additional file 2: Table S4), and genes in regions of closed chromatin, but high expression (Additional file 2: Table S5). Overall, cell type marker genes of scATAC-seq and scRNA-seq (Additional file 2: Table S6) showed a much higher correlation between matching cell types than between different cell types (Additional file 1: Fig. S4b), suggesting that the chromatin accessibility and gene expression datasets were positively correlated, further substantiating our cluster annotation. However, we found that some cell type marker genes of scRNA-seq, including LOC_Os08g03450 of endodermis, did not show cell type-specific chromatin accessibility. One explanation is that scATAC-seq was performed on the nuclei, while RNA in situ hybridization (scRNA-seq) was performed on the whole tissue (cell), including the cytoplasm. The regulation of RNA decay, RNA transport, RNA methylation, etc. may contribute to the differences between chromatin accessibility and gene expression. On the other hand, it suggested that further experimental validation would still be required.
To categorize the function of genes associated with cluster-specific ACRs, we performed Gene Ontology (GO) analyses and found converging terms between scATAC-seq and scRNA-seq for all cell types. For example, genes related to water transport were enriched among mRNAs of cortex cells in scRNA-seq [7], as well as genes associated with cortex-specific ACRs identified in scATAC-seq (Fig. 1h; Additional file 2: Table S7), further highlighting the close correlation between chromatin accessibility maps and cell type transcriptomes from rice root tips.
Characterization of cell type TF motifs in rice
We have previously described cell type-specific transcriptomes for root tips of two rice cultivars [7], but the regulatory mechanisms leading to these highly distinct expression programmes remained elusive. After having identified cell type-specific ACRs, we next asked which chromatin marks may be associated with these regions to gain an insight into their biochemical state. To this end, we integrated published ChIP-seq data on histone modifications with our ATAC-seq-based ACRs. Interestingly, we found that for all modifications tested, namely H3K4me1, H3K4me3, H3K27me3, and H3K9me2, the ChIP-seq signal at the middle point of ACRs represented a regional minimum. Furthermore, we found that levels of some modifications, such as H3K4me1 and H3K27me3, were different between cell types, whereas H3K9me2 appeared similar across all cell types (Fig. 2a). The underlying mechanism for this interesting pattern could be explored by future single-cell/cell type resolution histone modifications studies. These results suggested on the one hand that ACRs represent genomic regions with reduced histone methylation and on the other hand showed that these modifications show cell type-specific variations, which could contribute to chromatin accessibility and gene expression.
Since gene expression is dictated by the interplay of open chromatin with TF activity, we aimed to elucidate the regulatory networks responsible for cell type differentiation in rice roots. Our analyses were built on the notion that TF-binding elements in ACRs represent the substrate for expressed TFs, identified by scRNA-seq, since most TFs can only bind to open chromatin, which is readily identified by scATAC-seq [18]. To establish cell type TF signatures, we first annotated known TF-binding sequences across the rice genome and plotted the average motif coverage around ACRs. We found that the occurrence of potential TF-binding sites was highly enriched in ACRs relative to control and flanking regions of the rice genome (Fig. 2b), supporting the idea that ACRs represent important regulatory regions. Next, we identified highly represented TF motifs for each cell type by calculating the relative enrichment of TF binding sequences within the top 2000 differential ACRs. An average of 32 TF motif combinations were significantly enriched (Fisher exact test, FDR < 0.01) per cell type, and the largest number of motif combinations was 53, found in the C2 root cap. Our analysis revealed many cluster-enriched TF motifs (Fig. 2c, d, Additional file 1: Fig. S5). For example, WRKY family TF motifs enriched in root hair; NAC, AP2, and CAMTA family TF motifs enriched in root cap; MYB family TF motifs enriched in the stele; and C2H2 family TF motifs enriched in the cortex (Additional file 1: Fig. S6). The enrichment of TF motifs was coincident with the known regulators of cell identity, including the WRKY family in root hair development [20], NAC family TFs in root cap development [21], and DNA binding with one finger (Dof) TF in endodermis regulation [22]. Cross-referencing these findings with our scRNA-seq data, we found that some members of TFs with cell type-specific expression, such as members of bHLH, MYB, C2H2, and ERF family TFs (Fig. 2e), coincide with their enriched TF-binding motifs from the same cell type (Fig. 2d; Additional file 1: Fig. S5a), further indicating that our annotation was robust.
Chromatin accessibility trajectory of epidermal and root hair cells
While these analyses allowed the reconstruction of potential regulatory modules active in each cell type of the rice root, they were insufficient to reveal developmental dependencies between TF and target cell types. Pseudotime analysis has been a major step towards establishing causal relations in scRNA-seq data, and we have previously used it to reveal the differentiation trajectory of root hairs from epidermal cells [7]. Importantly, these analyses can also be applied to scATAC-seq data to order ACR heterogeneity within a cell type cluster into time-resolved chromatin accessibility dynamics. Using the root hair cluster as a model, we identified 13,848 ACRs, 131 TF motifs, and 3882 genes with significant differences in chromatin accessibility across the root hair pseudotime trajectory (Additional file 1: Fig. S7). Several known root hair developmental genes, including LOC_Os10g42750, were identified among the top differentially accessible genes throughout root hair development. Consistent with the central role of this gene for root hair development [23], the accessibility of chromatin in its regulatory region strongly increased during the transition from the epidermis to root hair identity, which was consistent with its RNA expression profile derived from scRNA-seq (Fig. 3a, b) [7]. Interestingly, we found a WRKY-binding motif in the ACR of LOC_Os10g42750, and cross-referencing with our scRNA-seq data allowed us to identify the WRKY TF LOC_Os04g50920, which is only expressed in the root hair and root cap and thus may represent a potential upstream activator (Fig. 3c, d). Thus, the pseudotime analysis of chromatin accessibility and RNA expression not only was able to resolve gene expression dynamics during root hair development but also allowed to resolve the regulatory mechanisms underlying a key developmental transition.
Single-cell chromatin accessibility dynamics in response to heat stress
An important aspect of plant development is its astounding plasticity in response to environmental variation. Therefore, we were interested to analyse how an environmental factor, such as temperature, would affect the chromatin accessibility and hence the transcriptional programmes leading to root cell type development. We have previously shown that chromatin accessibility is responsive to heat stress (HS) in rice roots but were lacking cell type resolution [24]. To explore the HS-induced chromatin dynamics at the single-cell level, we carried out two independent rounds of scATAC-seq on rice root tips after exposure of 3 h to 45 °C (Fig. 4, Additional file 1: Fig. S2; Additional file 2: Tables S8, S9). In total, we generated scATAC-seq profiles from 21,446 cells, which yielded on average 16,570 unique fragments mapping to the Nip genome (Additional file 1: Table S1). For comparative analysis of cells that had not experienced heat shock, scATAC-seq reads from control and HS samples were merged and subsequently clustered. UMAP visualization revealed nine clusters of HS-treated cells that overall were similar to clusters identified in controls (Fig. 4a, b). In addition to the general alignment of cell clusters, the relative numbers of cells per cluster were also comparable (Fig. 4c), demonstrating that chromatin accessibility was not massively disturbed by HS. Interestingly, we observed that cells of the root cap, root hair, and stele showed more divergence between normal growth conditions and HS (Fig. 4e) than those of the epidermis, endodermis, and cortex, suggesting chromatin accessibility in these tissues was specifically responsive to heat stress. To elucidate the mechanisms driving specific and more general responses to temperature, we identified heat shock-specific ACRs and searched for overrepresented regulatory motifs by MotifScan [25]. Notably, heat stress transcription factor (HSF)-binding motifs were the top enriched ones for all nine cell types (Additional file 3: Table S10). In the next step, we subjected genes associated with heat shock-specific ACRs to GO analysis and found that several GO terms were shared by all cell clusters (Fig. 4d). Reassuringly, “response to heat” was the top GO term across all cell types, demonstrating that HS generally affected the chromatin accessibility of heat response-related genes. Interestingly, we also observed cluster-specific GO terms. For example, “jasmonic acid biosynthetic process” was found in cluster 3 endodermis (Fig. 4d; Additional file 1: Fig. S8). To compare our scATAC and bulk ATAC data, we used gene track analysis on genes that showed responses in chromatin state in one of the assays (Fig. 4e, f), while we were able to identify many differential ATAC peaks between normal and HS that were cell type-specific in scATAC-seq (Fig. 4e), which could not be identified by bulk ATAC-seq (Fig. 4f), suggesting scATAC-seq not only offers single-cell resolution, but also is more powerful in identifying ACRs in response to environmental changes.