PAN AFRICAN JOURNAL OF LIFE SCIENCES
Volume 4, No. 3, December 2020
Genomic Diversity and Phylogenetic Analysis of SARS-CoV-2 Circulating in Africa and Other Continents: Implications for Diagnosis, Transmission, and Prevention
Idowu A. Taiwo1,2, Bamidele A. Iwalokun1,3, Titilola A. Samuel1,4, Adesola Olalekan1,5, Khalid O. Adekoya1,2, Oluwabukola M. Akinloye6, Gloria Amegatcher7, Eyitayo Adenipekun1,5, Daniel Adewun1,5, Fatimah O. Anwoju1,2, Olayiwola A. Popoola1,5, Oluwatoyin P. Popoola1,8, Bolanle Iranloye1,9, Olubunmi A. Magbagbeola1,4 and Oluyemi Akinloye1,3,5*
1Centre for Genomics of Non-communicable Diseases and Personalized Healthcare (CGNPH).
2Department of Cell Biology and Genetics, University of Lagos.
3Department of Molecular Biology and Biotechnology, Nigerian Institute of Medical Research (NIMR), Yaba, Lagos.
4Department of Biochemistry, University of Lagos.
5Department of Medical Laboratory Sciences, University of Lagos.
6Department of Medical Laboratory Sciences, Oulton College, Moncton, New Brunswick, Canada.
7Department of Medical Laboratory Sciences and West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana.
8Department of Biomedical Engineering, University of Lagos.
9Department of Physiology, University of Lagos.
Background: The COVID-19 pandemic, caused by SARS-CoV-2, remains a global health threat. Assessing the genetic relatedness of SARS-CoV-2 genome sequences is a prerequisite to understanding the dynamics of the virus, which is important for improving diagnosis and preventive measures. This study determined the genomic diversity and SNP characteristics of SARS-CoV-2 genomes from Africa and the rest of the world, using molecular and phylogenetic analyses to understand the phylogeny and transmission dynamics of the virus.
Methods: SARS-CoV-2 genome sequence data were mined and retrieved from major databases over one year in two phases: Phase 1, December 2019 to May 2020, and Phase 2, June 2020 to December 2020. A maximum of four sequences per country that fulfilled the following predetermined criteria were randomly selected for inclusion in the study: (i) sequence length >29,700 nt, (ii) proportion of Ns in the sequence not >5%, and (iii) inclusion of the poly-A tail in the sequence record to ensure completeness.
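The three inclusion criteria and the per-country random selection can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The record layout (country, accession, sequence), the function names, the fixed random seed, and the rule that a poly-A tail means at least three trailing A's are all assumptions made for illustration.

```python
import random
from collections import defaultdict

MIN_LENGTH = 29_700     # criterion (i): sequence length > 29,700 nt
MAX_N_FRACTION = 0.05   # criterion (ii): Ns not more than 5% of the sequence
MIN_POLY_A = 3          # assumption: >= 3 trailing A's counts as a poly-A tail
MAX_PER_COUNTRY = 4     # at most four sequences per country


def passes_criteria(seq: str) -> bool:
    """Return True if a genome sequence meets all three inclusion criteria."""
    seq = seq.upper()
    if len(seq) <= MIN_LENGTH:
        return False
    if seq.count("N") / len(seq) > MAX_N_FRACTION:
        return False
    return seq.endswith("A" * MIN_POLY_A)


def select_sequences(records, seed=0):
    """Randomly pick up to MAX_PER_COUNTRY qualifying accessions per country.

    `records` is an iterable of (country, accession, sequence) tuples
    (a hypothetical layout, not a database export format).
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_country = defaultdict(list)
    for country, accession, seq in records:
        if passes_criteria(seq):
            by_country[country].append(accession)
    return {
        country: rng.sample(ids, min(len(ids), MAX_PER_COUNTRY))
        for country, ids in by_country.items()
    }
```

Countries with fewer than four qualifying sequences simply contribute all of them, and countries with none are absent from the result.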
Results: The similarity of SARS-CoV-2 genomes within and between countries was generally high, with an average of 99.9%. Thus, SARS-CoV-2 genomes vary between countries and continents by about 0.1% as a result of SNPs. Phylogenetic data revealed multiple origins of SARS-CoV-2 in Africa and also suggested that the virus spreads by a ‘founder effect’, whereby a few viruses newly introduced into a population multiply rapidly and accumulate mutations as they spread quickly by community transmission, creating a population-based identity. A tree of continental consensus sequences retrieved in Phase 1 suggested that SARS-CoV-2 forms two major clusters: an African cluster consisting of Africa, Europe, and North America, and an Asian cluster made up of Asia, South America, and Oceania. However, this clustering pattern vanished in Phase 2, upholding the view that SARS-CoV-2 is constantly evolving.
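To make the similarity figure concrete, the sketch below computes percent identity and the SNP count for two pre-aligned sequences of equal length. It is an illustration only and not the method used in the study. The function name and the choice to skip positions containing N or a gap are assumptions. For a genome of roughly 29,900 nt, a 0.1% difference corresponds to about 30 differing positions.

```python
def percent_identity(seq_a: str, seq_b: str):
    """Return (percent identity, mismatch count) for two aligned sequences.

    Assumption: the sequences are already aligned to the same length.
    Positions where either sequence has 'N' or '-' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "N-" or b in "N-":
            continue  # ambiguous base or alignment gap: not informative
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    identity = 100.0 * (compared - mismatches) / compared
    return identity, mismatches
```

Skipping ambiguous positions keeps low-quality bases from being counted as SNPs; treating them as mismatches would instead understate similarity.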
Conclusion: The dynamism and genetic diversity of SARS-CoV-2 have important implications for diagnosis, transmission, and prevention strategies.
Keywords: SARS-CoV-2, SNPs, Genomics, Phylogenetic, Data Mining