<?xml version="1.0" encoding="UTF-8"?>
<STUDY_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <STUDY accession="ERP129771" alias="ena-STUDY-IDIBELL, Bellvitge University Hospital-11-06-2021-11:10:04:104-850" center_name="IDIBELL, Bellvitge University Hospital">
    <IDENTIFIERS>
      <PRIMARY_ID>ERP129771</PRIMARY_ID>
      <EXTERNAL_ID namespace="BioProject">PRJEB45630</EXTERNAL_ID>
      <SUBMITTER_ID namespace="IDIBELL, Bellvitge University Hospital">ena-STUDY-IDIBELL, Bellvitge University Hospital-11-06-2021-11:10:04:104-850</SUBMITTER_ID>
    </IDENTIFIERS>
    <DESCRIPTOR>
      <STUDY_TITLE>Comparative pangenome analysis of capsulated Haemophilus influenzae serotype f highlights their high genomic stability</STUDY_TITLE>
      <STUDY_TYPE existing_study_type="Other"/>
      <STUDY_ABSTRACT>Haemophilus influenzae is an opportunistic pathogen highly adapted to the human respiratory tract. Although non-typeable H. influenzae has been reported to have high heterogeneity, few studies have analysed the genomic variability of capsulated strains. This study aims to examine the diversity of serotype f isolates from the Netherlands, Portugal, and Spain, and to compare all capsulated genomes available in public databases. Thirty-seven serotype f isolates were sequenced for pangenome and phylogenetic studies. To better understand the diversity of capsulated H. influenzae, all available capsulated genomes recorded in the European Nucleotide Archive (ENA) and National Center for Biotechnology Information (NCBI) databases were included in the analysis. The 37 serotype f isolates belonged to clonal complex 124. The isolates shared few single nucleotide polymorphisms (SNPs) (n = 10,999), but a high percentage of core genes (&gt;80%). Although all isolates were closely related, three main clades were identified by the presence of 75, 60 and 41 exclusive genes for clade I, II, and III, respectively. Multi-locus sequence type (MLST) analysis of all capsulated genomes revealed a reduced number of clonal complexes associated with each serotype: 5 for serotype a; 4 for serotype b; 1 for serotypes c, d and e, and 2 for serotype f. Pangenome analysis of the 800 capsulated genomes revealed a large pool of genes (n = 6,360), many of which were part of the accessory genome (n = 5,323). Phylogenetic analysis suggested higher diversity in serotype f, but with a significantly higher total number of SNPs in serotypes a, b, and e (p &lt; 0.0001), supporting the low variability of this serotype. Capsulated H. influenzae are genetically homogeneous, with few lineages in each serotype. Serotype f has high genetic stability regardless of time and country of isolation.</STUDY_ABSTRACT>
      <CENTER_PROJECT_NAME>Genomics of capsulated Haemophilus influenzae</CENTER_PROJECT_NAME>
      <STUDY_DESCRIPTION>Haemophilus influenzae is an opportunistic pathogen highly adapted to the human respiratory tract. Although non-typeable H. influenzae has been reported to have high heterogeneity, few studies have analysed the genomic variability of capsulated strains. This study aims to examine the diversity of serotype f isolates from the Netherlands, Portugal, and Spain, and to compare all capsulated genomes available in public databases. Thirty-seven serotype f isolates were sequenced for pangenome and phylogenetic studies. To better understand the diversity of capsulated H. influenzae, all available capsulated genomes recorded in the European Nucleotide Archive (ENA) and National Center for Biotechnology Information (NCBI) databases were included in the analysis. The 37 serotype f isolates belonged to clonal complex 124. The isolates shared few single nucleotide polymorphisms (SNPs) (n = 10,999), but a high percentage of core genes (&gt;80%). Although all isolates were closely related, three main clades were identified by the presence of 75, 60 and 41 exclusive genes for clade I, II, and III, respectively. Multi-locus sequence type (MLST) analysis of all capsulated genomes revealed a reduced number of clonal complexes associated with each serotype: 5 for serotype a; 4 for serotype b; 1 for serotypes c, d and e, and 2 for serotype f. Pangenome analysis of the 800 capsulated genomes revealed a large pool of genes (n = 6,360), many of which were part of the accessory genome (n = 5,323). Phylogenetic analysis suggested higher diversity in serotype f, but with a significantly higher total number of SNPs in serotypes a, b, and e (p &lt; 0.0001), supporting the low variability of this serotype. Capsulated H. influenzae are genetically homogeneous, with few lineages in each serotype. Serotype f has high genetic stability regardless of time and country of isolation.</STUDY_DESCRIPTION>
    </DESCRIPTOR>
    <STUDY_ATTRIBUTES>
      <STUDY_ATTRIBUTE>
        <TAG>ENA-FIRST-PUBLIC</TAG>
        <VALUE>2021-09-15</VALUE>
      </STUDY_ATTRIBUTE>
      <STUDY_ATTRIBUTE>
        <TAG>ENA-LAST-UPDATE</TAG>
        <VALUE>2021-09-15</VALUE>
      </STUDY_ATTRIBUTE>
    </STUDY_ATTRIBUTES>
  </STUDY>
</STUDY_SET>
