An interpretable alphabet for local protein structure search based on amino acid neighborhoods
S. Zerefa, J. Cool, P. Singh, S. Petti
Bioinformatics (2025)
We design a “3Dn” structural alphabet that encodes the local neighborhoods
around each amino acid in an interpretable way. In a search benchmark task, a
combination of our alphabet and Foldseek’s 3Di alphabet outperforms each alphabet
individually and ranks best among local search methods that do not require amino acid
identity information. We provide software tools that enable the exploration of novel
alphabets and combinations of alphabets for protein structure search.