Learning Complete Protein Representation by Dynamically Coupling of Sequence and Structure. [paper]
datasets.py provides the dataset functions used to process the data. The per-residue features are the amino acid type and seven physicochemical properties: a steric parameter, hydrophobicity, volume, polarizability, isoelectric point, helix probability, and sheet probability. Geometric features are also included.
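To make the feature layout concrete, here is a minimal sketch of how such per-residue features could be assembled. It is not the code in datasets.py: the function names, the alphabetical amino acid ordering, and the one-hot encoding are assumptions made for illustration. The property table is filled with zeros as a placeholder, so real values must be supplied by the caller. The geometric part is shown only as a pairwise distance matrix over 3D coordinates, one simple example of a geometric feature.

```python
import math

# The 20 standard amino acids in one-letter code. Alphabetical ordering is an
# assumption for this sketch; the repository may order them differently.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

# Names of the seven physicochemical properties listed in this README.
PROPERTY_NAMES = (
    "steric", "hydrophobicity", "volume", "polarizability",
    "isoelectric_point", "helix_probability", "sheet_probability",
)


def placeholder_properties():
    """Hypothetical table of zeros, NOT real measurements.

    Replace it with a mapping from amino acid letter to seven real values.
    """
    return {aa: [0.0] * len(PROPERTY_NAMES) for aa in AMINO_ACIDS}


def residue_features(sequence, properties=None):
    """Return one vector per residue: one-hot type (20) + properties (7)."""
    if properties is None:
        properties = placeholder_properties()
    features = []
    for aa in sequence:
        one_hot = [0.0] * len(AMINO_ACIDS)
        one_hot[AA_TO_INDEX[aa]] = 1.0
        features.append(one_hot + list(properties[aa]))
    return features


def pairwise_distances(coords):
    """Euclidean distance matrix between 3D points (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]
```

With this layout each residue gets a 27-dimensional vector (20 for the type, 7 for the properties), and a protein of N residues gets an N x N distance matrix.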
PyTorch 1.13.1
PyG, transformers
torch-geometric
pip install transformers
PyTorch Scatter
...
The code is released under MIT License.
Thanks for these great works:
@article{hu2024learning,
title={Learning Complete Protein Representation by Dynamically Coupling of Sequence and Structure},
author={Hu, Bozhen and Tan, Cheng and Xia, Jun and Liu, Yue and Wu, Lirong and Zheng, Jiangbin and Xu, Yongjie and Huang, Yufei and Li, Stan Z},
journal={Advances in Neural Information Processing Systems},
volume={37},
pages={137673--137697},
year={2024}
}
