Notes on the identity of the orb-weaver spider Araneus nox Simon, 1877 (Araneae: Araneidae) from India, including its transfer to Eriovixia Archer, 1951 and one new synonymy

The orb-weaver genus Araneus Clerck, 1757 has historically served as a heterogeneous assemblage for numerous araneid spiders lacking clear generic placement, and several Asian species formerly assigned to Araneus have subsequently been transferred to more narrowly defined genera. One species whose true identity still requires investigation is Araneus nox (Simon, 1877), originally described as Epeira nox Simon, 1877 from Basilan Island, Philippines, and later transferred to Araneus by Simon (1905). In the same year as the description of Epeira nox, Thorell (1877) described Epeira pilula from the Moluccas (Indonesia), which was subsequently synonymised under Epeira nox by Simon (1880). Despite its broad Oriental distribution, the taxonomic identity and generic placement of A. nox have remained insuff...

MultiTox: A sequence-based stacked ensemble model for multiclass protein toxin classification

Abstract

Understanding the structural and functional diversity of toxin proteins is critical for elucidating macromolecular behavior, mechanistic variability, and structure-driven bioactivity. Traditional approaches have primarily focused on binary toxicity prediction, offering limited resolution of the distinct modes of action of toxins. Here, we present MultiTox, an ensemble stacking framework that classifies toxin proteins by their molecular mode of action into neurotoxins, cytotoxins, hemotoxins, and enterotoxins. We curated a comprehensive dataset of 24,756 proteins (20,361 toxins and 4,395 non-toxins) and extracted high-dimensional ESM-2 embeddings that encode evolutionary, structural, and biochemical features. The two-tier stacking framework integrates LightGBM (LGBM), multilayer perceptron (MLP), extra trees (ET), k-nearest neighbors (KNN), and quadratic discriminant analysis (QDA) as base classifiers and XGBoost as a meta classifier. MultiTox achieved an overall accuracy of 91.07 %, an F1-score of 90.73 %, and a Matthews Correlation Coefficient (MCC) of 91.61 %. Class-wise accuracies were 93.75 % (neurotoxins), 87.79 % (cytotoxins), 98.80 % (hemotoxins), 97.02 % (enterotoxins), and 95.83 % (toxins vs. non-toxins). SHAP-based interpretation and correlation with known physicochemical descriptors revealed class-specific features linked to biologically meaningful patterns in structural motifs, hydrophobicity, and solvent accessibility. Functional annotations using InterProScan, clusters of orthologs, and secretion signal analysis identified toxin class-specific signatures related to folding, localization, and host interactions. We deployed a public web server (https://cosylab.iiitd.edu.in/multitox/) for real-time and batch-mode predictions. MultiTox provides a scalable and biologically interpretable framework for protein classification, bridging sequence data with functional insights.
Sharma, H., Thakur, M. S., Barala, A., Khan, M. S., Bhagat, S., & Bagler, G. (2025). MultiTox: A sequence-based stacked ensemble model for multiclass protein toxin classification. International Journal of Biological Macromolecules, 327, 147399. https://doi.org/10.1016/j.ijbiomac.2025.147399