Official PyTorch Implementation of scRCL

Refinement Contrastive Learning of Cell-Gene Associations for Unsupervised Cell Type Identification (AAAI'26 Post)

Overview

Unsupervised cell type identification is crucial for uncovering and characterizing heterogeneous populations in single cell omics studies. Although a range of clustering methods have been developed, most focus exclusively on intrinsic cellular structure and ignore the pivotal role of cell-gene associations, which limits their ability to distinguish closely related cell types. To this end, we propose a Refinement Contrastive Learning framework (scRCL) that explicitly incorporates cell-gene interactions to derive more informative representations. Specifically, we introduce two contrastive distribution alignment components that reveal reliable intrinsic cellular structures by effectively exploiting cell-cell structural relationships. Additionally, we develop a refinement module that integrates gene-correlation structure learning to enhance cell embeddings by capturing underlying cell-gene associations. This module strengthens connections between cells and their associated genes, refining the representation learning to exploiting biologically meaningful relationships. Extensive experiments on several single-cell RNA sequencing and spatial transcriptomics benchmark datasets demonstrate that our method consistently outperforms state-of-the-art baselines in cell-type identification accuracy. Moreover, downstream biological analyses confirm that the recovered cell populations exhibit coherent gene-expression signatures, further validating the biological relevance of our approach.

Requirements

CUDA Version: 11.6
python>=3.9.16
numpy==1.26.1
torch>=1.12.1+cu116
torch_geometric==2.4.0
scanpy==1.9.8
tqdm==4.66.1
matplotlib==3.9.4

Datasets

SC

For scRNA-seq datasets, Quake_Smart-seq2_Diaphragm (Diaphragm), Quake_Smart-seq2_Lung (Lung), Quake_Smart-seq2_Trachea (Trachea), Quake_10x_Bladder (Bladder), Quake_10x_Limb_Muscle (Limb_Muscle), Quake_10x_Spleen (Spleen), and Baron_human can be downloaded here. While li_tumor (Tumor), Human_ESC, and Zeisel datasets can be found here.

Before running the runSC.py script, please place the downloaded dataset files in the ./datasets/scdata/ directory, e.g., ./datasets/scdata/Quake_Smart-seq2_Diaphragm.h5ad.

ST

With respect to spatial transcriptomics datasets, please refer to the GraphST paper to download the DLPFC, Human_Breast_Cancer, and Mouse_Brain_Anterior datasets. The Mouse_Embryo_E9.5 dataset can be downloaded here.

Before running the runST.py script, place the dataset files in the ./datasets/stdata/ directory. For example:

./datasets/stdata/DLPFC/151507/...
./datasets/stdata/Human_Breast_Cancer/...

Examples

SC

For the scRNA-seq cell type identification task, you can run the script as follows:

python runSC.py --dataset Quake_Smart-seq2_Diaphragm

ST

Regarding spatial transcriptomics domain identification task, you can run the following script:

python runST.py --dataset 151507

DEGs

For differentially expressed genes (DEGs) analysis, please refer to this tutorial. After saving the model-predicted labels in the ./results/SC directory, you can run the draw_DEGs.py file.

python draw_DEGs.py

Acknowledgments

Our code is primarily inspired by the following works:

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
assets		assets
config		config
datasets		datasets
figures		figures
results		results
LICENSE		LICENSE
README.md		README.md
clustering.py		clustering.py
dataset.py		dataset.py
draw_DEGs.py		draw_DEGs.py
evaluation.py		evaluation.py
model.py		model.py
runSC.py		runSC.py
runST.py		runST.py
train.py		train.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Official PyTorch Implementation of scRCL

Overview

Requirements

Datasets

SC

ST

Examples

SC

ST

DEGs

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Official PyTorch Implementation of scRCL

Overview

Requirements

Datasets

SC

ST

Examples

SC

ST

DEGs

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages