Abstract Details
Name
A Cross-Species Framework for Clustering H3NX Influenza A Hemagglutinin Nucleotide Sequences
Presenter
Hoc Tran, University of Guelph
Co-Author(s)
Hoc Tran (1), Angela McLaughlin (1, 2), Nicole Ricker (1), Olaf Berke (1), Zvonimir Poljak (1)
(1) Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
(2) Dalhousie University, Halifax, NS, Canada
Abstract Category
Breaking & Entering
Abstract
Influenza A viruses (IAVs) frequently cross species barriers, complicating surveillance and classification. H3 viruses are particularly widespread, circulating among hosts such as humans, swine, and diverse poultry and wild bird species. This broad host range highlights the need for a standardized, host-agnostic grouping system to track viral evolution across time, space, and species. Such consistency is increasingly important as machine learning (ML) methods are applied to influenza nucleotide sequences, including those of hemagglutinin (HA), where model validation depends on reliable classification across hosts. Existing systems for classifying H3 nucleotide sequences typically focus on a single host species, limiting their utility for genomic surveillance and real-time risk assessment. Therefore, this study aims to develop a preliminary standardized grouping system for the HA of H3NX IAVs using publicly available sequences to support ML model development. Whole-genome H3 IAV nucleotide sequence data were retrieved from public repositories and cleaned. Sequences of the HA genome segment alone were then used to infer a maximum-likelihood phylogenetic tree with 1000 ultrafast bootstrap replicates using IQ-TREE 3. Tree tips were subsequently clustered using TreeCluster with the max-clade method at patristic distance thresholds of 0.01 to 0.05 substitutions/site, in increments of 0.01. The number of clusters, the number of singletons, and the maximum cluster size were compared across thresholds, and 0.02 substitutions/site was identified as a suitable clustering threshold. Grouping H3 HA IAV sequences into clusters with this methodology will allow for phylogenetically informed model validation in future ML studies of H3 HA IAV host prediction and between-species transmission.
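The threshold comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each clustering run yields a tab-separated table with `SequenceName` and `ClusterNumber` columns in which singletons are labelled `-1` (the layout TreeCluster is understood to write), and the function name `summarize_clusters` and the toy assignments are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_clusters(tsv_text):
    """Summarise one cluster-assignment table (hypothetical helper).

    Assumes tab-separated columns 'SequenceName' and 'ClusterNumber',
    with singletons labelled -1.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    sizes = Counter()   # cluster label -> number of tips
    singletons = 0      # tips not assigned to any cluster
    for row in reader:
        label = row["ClusterNumber"].strip()
        if label == "-1":
            singletons += 1
        else:
            sizes[label] += 1
    return {
        "n_clusters": len(sizes),
        "n_singletons": singletons,
        "max_cluster_size": max(sizes.values(), default=0),
    }


# Toy, made-up assignments for two thresholds (illustration only, not study data).
toy_outputs = {
    0.01: "SequenceName\tClusterNumber\nA\t1\nB\t1\nC\t-1\nD\t-1\nE\t-1\n",
    0.02: "SequenceName\tClusterNumber\nA\t1\nB\t1\nC\t1\nD\t2\nE\t2\n",
}

summary = {t: summarize_clusters(text) for t, text in toy_outputs.items()}
for threshold in sorted(summary):
    print(threshold, summary[threshold])
```

In practice the toy strings would be replaced by the contents of one assignment file per threshold, so that the three summary statistics can be tabulated side by side when choosing a cutoff.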