Classification of Influenza Hemagglutinin Protein Sequences using Convolutional Neural Networks

Chrysostomou, Charalambos and Alexandrou, Floris and Nicolaou, Mihalis and Şeker, Hüseyin (2021) Classification of Influenza Hemagglutinin Protein Sequences using Convolutional Neural Networks. In: 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, October 31 – November 4, 2021, online.

EMBC21.pdf - Accepted Version


The Influenza virus is among the most severe viruses: it can infect multiple species, often with fatal consequences for the host. The hemagglutinin (HA) gene of the virus can be a target for antiviral drug development, which relies on accurate identification of its sub-types and, possibly, of the hosts it targets. This paper focuses on accurately predicting whether an Influenza type A virus can infect specific hosts, namely human, avian and swine hosts, using only the protein sequence encoded by the HA gene. We propose encoding the protein sequences into numerical signals using the Hydrophobicity Index and then applying a Convolutional Neural Network-based predictive model. The Influenza HA protein sequences used in this work were obtained from the Influenza Research Database (IRD); only complete and unique HA protein sequences from avian, human and swine hosts were used. The resulting dataset comprises 17,999 human-host proteins, 17,667 avian-host proteins and 9,278 swine-host proteins. On this dataset, the proposed method yields excellent results, outperforming previously proposed methods. As the results show, the proposed model can determine with high accuracy whether the virus under investigation can infect human, avian or swine hosts.
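The encoding step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the Kyte-Doolittle hydropathy scale stands in for the unspecified "Hydrophobicity Index", sequences are truncated or zero-padded to a fixed length, and a single hand-written 1D filter stands in for a convolutional layer. The names `encode_sequence` and `conv1d_valid` are hypothetical, and this is not the authors' model.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
# ASSUMPTION: the abstract does not say which hydrophobicity scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(seq, length, pad_value=0.0):
    """Map each residue to its hydropathy value, then truncate or
    right-pad with `pad_value` so the signal has exactly `length` entries.
    Unknown symbols (e.g. 'X') are also mapped to `pad_value`."""
    signal = [KYTE_DOOLITTLE.get(res, pad_value) for res in seq.upper()]
    signal = signal[:length]
    return signal + [pad_value] * (length - len(signal))


def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding (cross-correlation,
    the operation a 1D convolutional layer applies per filter)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy usage: an illustrative fragment, not taken from the paper's dataset.
toy_signal = encode_sequence("MKTIIALSYIFCLALG", length=20)
# A 3-tap averaging filter; a trained CNN would learn its filter weights.
toy_features = conv1d_valid(toy_signal, [1 / 3, 1 / 3, 1 / 3])
```

In a trained network the filter weights would be learned from the labelled human, avian and swine sequences, and several such filters would be stacked with pooling and dense layers; the sketch only shows how a protein string becomes a fixed-length numerical signal that such a model can consume.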

Item Type: Conference or Workshop Item (Paper)
Date Accepted: 15 July 2021
Date Published Online: 30 November 2021
Subjects: CAH01 - medicine and dentistry > CAH01-01 - medicine and dentistry > CAH01-01-01 - medical sciences (non-specific)
CAH03 - biological and sport sciences > CAH03-01 - biosciences > CAH03-01-02 - biology (non-specific)
CAH09 - mathematical sciences > CAH09-01 - mathematical sciences > CAH09-01-01 - mathematics
CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
CAH11 - computing > CAH11-01 - computing > CAH11-01-05 - artificial intelligence
Divisions: Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Depositing User: Huseyin Seker
Date Deposited: 10 Aug 2021 08:57
Last Modified: 30 Nov 2022 03:00
