GeneExpressionGNN is a project focused on leveraging graph neural networks (GNNs) and other models for inferring patterns in gene expression data. This repository provides tools and models to explore relationships between genes and their expression profiles.
- Code for predicting RNAseq gene expression from L1000 data.
- Implementation of graph neural network architectures.
- Training code for the GNN, MLP, and SwinIR models.
- Evaluation metrics for gene expression prediction.
Clone the repository:
git clone https://github.com/HyunjinHwn/GeneExpressionGNN.git
Install the required dependencies:
cmapPy==2.2.2 numpy==1.26.4 torch==2.0.0 scipy==1.13.1 pandas==2.2.3
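The pinned versions above can also be kept in a requirements.txt file (a sketch assembled from the list above) so the environment is reproducible in one step:

```text
# requirements.txt — versions taken from the dependency list above
cmapPy==2.2.2
numpy==1.26.4
torch==2.0.0
scipy==1.13.1
pandas==2.2.3
```

Install everything with `pip install -r requirements.txt`.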
Download the dataset: please download the RNAseq file through this link and unzip it with the following command.
gunzip GSE92743_Broad_GTEx_RNAseq_Log2RPKM_q2norm_n3176x12320.gctx.gz
Place this file in data/ before you run the code.
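If gunzip is not available on your system, the archive can be unpacked with Python's standard library instead (a minimal sketch; the paths shown are the ones from the download step above):

```python
import gzip
import shutil
from pathlib import Path

def gunzip_file(src: str, dst: str) -> str:
    """Decompress a .gz archive (equivalent to `gunzip`), streaming to disk."""
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        # Stream in chunks so the full file is never held in memory.
        shutil.copyfileobj(fin, fout)
    return dst

# Usage with the file from the download step:
# gunzip_file(
#     "GSE92743_Broad_GTEx_RNAseq_Log2RPKM_q2norm_n3176x12320.gctx.gz",
#     "data/GSE92743_Broad_GTEx_RNAseq_Log2RPKM_q2norm_n3176x12320.gctx",
# )
```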
You can run this project using either the Python script or the Jupyter Notebook:
Open gnn_training.ipynb with Jupyter Notebook or JupyterLab, and run the cells in order.
To run the training as a script, use:
python gnn_training.py --select_method order --select_gene_num 970 --graph cos50_Lfull_Nposneg --lr 0.0005 --gpu 0 --loss L1
python gnn_training.py --select_method greedy_forward --select_gene_num 108 --graph cos50_Lfull_Nposneg --lr 0.0005 --gpu 0 --loss L1
To train the MLP model:
Open mlp_training.ipynb with Jupyter Notebook or JupyterLab, and run the cells in order.
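The flags in the gnn_training.py commands above suggest a command-line interface roughly like the following. This is an illustrative argparse sketch inferred from the example invocations, not the repository's actual parser; the choices and defaults are assumptions:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of a gnn_training.py-style CLI, inferred from the example
    commands above. Choices and defaults here are assumptions."""
    p = argparse.ArgumentParser(description="Train the gene-expression GNN")
    p.add_argument("--select_method", choices=["order", "greedy_forward"],
                   help="how input landmark genes are selected")
    p.add_argument("--select_gene_num", type=int,
                   help="number of selected input genes, e.g. 970 or 108")
    p.add_argument("--graph", help="gene-graph spec, e.g. cos50_Lfull_Nposneg")
    p.add_argument("--lr", type=float, default=0.0005, help="learning rate")
    p.add_argument("--gpu", type=int, default=0, help="CUDA device index")
    p.add_argument("--loss", default="L1", help="training loss, e.g. L1")
    return p

args = build_parser().parse_args(
    "--select_method order --select_gene_num 970 "
    "--graph cos50_Lfull_Nposneg --lr 0.0005 --gpu 0 --loss L1".split()
)
```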
To train the SwinIR model, move to the swinir/ directory.
cd swinir
Then, follow the instructions in swinir/readme.md.
After running gnn_training.py, gnn_training.ipynb, or mlp_training.ipynb, the inferred gene expression profiles for the test set are saved in the prediction/ directory.
To evaluate the prediction performance, run evaluation.ipynb.
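To give a sense of what such an evaluation computes, here is a minimal sketch (pure Python, not the notebook's actual code) of the Pearson correlation between a predicted and a measured expression vector, a standard metric for gene expression prediction:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A prediction that is a linear rescaling of the truth correlates perfectly.
print(round(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 6))
```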
If you want to generate intermediate results (e.g., level 4 data),
place the inferred gene expression profile files in evaluation/level3 and run the cells in metric_comparison.ipynb.
Note that metadata/model_summary.txt must be edited appropriately before running metric_comparison.ipynb.
Please refer to the evaluation instructions (evaluation/readme.md) for details.