<?xml version="1.0" encoding="UTF-8"?>
<STUDY_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <STUDY accession="ERP123595" alias="ena-STUDY-KING ABDULLAH UNIVERSITY OF SCIENCE AND TECHNOLOGY-25-08-2020-05:07:32:124-453" center_name="KING ABDULLAH UNIVERSITY OF SCIENCE AND TECHNOLOGY">
    <IDENTIFIERS>
      <PRIMARY_ID>ERP123595</PRIMARY_ID>
      <EXTERNAL_ID namespace="BioProject">PRJEB40003</EXTERNAL_ID>
      <SUBMITTER_ID namespace="KING ABDULLAH UNIVERSITY OF SCIENCE AND TECHNOLOGY">ena-STUDY-KING ABDULLAH UNIVERSITY OF SCIENCE AND TECHNOLOGY-25-08-2020-05:07:32:124-453</SUBMITTER_ID>
    </IDENTIFIERS>
    <DESCRIPTOR>
      <STUDY_TITLE>Genome sequence of the genetically modified mutant of P. falciparum (NF54 strain) PfΔmei2</STUDY_TITLE>
      <STUDY_TYPE existing_study_type="Other"/>
      <STUDY_ABSTRACT>In the PfΔmei2 mutant, the Open Reading Frame (ORF) of mei2 (PF3D7_0623400) has been removed from the genome of the human malaria parasite P. falciparum (NF54 strain) using well-established CRISPR/Cas9 technology. The mei2 gene of WT PfNF54 was deleted using a donor DNA plasmid and 2 different sgRNA-donor containing plasmids targeting the mei2 gene, following the CRISPR/Cas9 method described by Marin-Mogollon et al. 1. PfΔmei2 has been engineered to remove nearly all heterologous DNA that was introduced for deletion of the mei2 ORF; what remains is a 34 bp nucleotide sequence (the FRT sequence). Heterologous DNA used to generate the mei2 deletion was flanked by two FRT sequences and was excised in the presence of FLPe recombinase, leaving one FRT sequence in the genome, flanked by 16 bp and 14 bp cloning restriction sites. This method of removing heterologous DNA by FLPe recombinase is similar to the method described by Roestenberg et al. 2. Whole genome sequencing showed correct deletion of the mei2 gene, confirmed the absence of sequences used in the CRISPR/Cas9, gDNA and flpe plasmids (cas9, ampicillin, blasticidin, hdhfr, yfcu, flpe recombinase) and showed the absence of unwanted recombination events in the endogenous 5'- and 3'-UTR sequences that were used in the plasmids to drive gene expression. Whole genome sequencing was performed at the King Abdullah University of Science and Technology (KAUST, Thuwal, Saudi Arabia; Prof. Arnab Pain). A total of 200 ng of DNA was used for DNA library preparation using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB). After library quantification and size verification, the DNA library was sequenced on a MiSeq platform (Illumina), producing 2x150 bp paired-end reads. The quality of the raw reads was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc).
Low-quality reads and Illumina adaptor sequences at the ends of the reads were removed using Trimmomatic (PMID: 24695404). Quality-trimmed reads were mapped to the P. falciparum 3D7 reference genome (release 40 in PlasmoDB, http://www.plasmoddb.org) using BWA 3. Read pairing information and flags were corrected, and duplicate reads were removed, using Picard's CleanSam, FixMateInformation, and MarkDuplicates tools. SNPs were called using the Genome Analysis Toolkit (GATK) best practices pipeline 4. Identified SNPs were filtered using vcftools to keep high-quality SNPs with quality score (Q) = 30 and depth (d) = 50. A total of 167 high-quality SNPs were identified. SNPs were annotated, and their effects on the coding sequences of genes were predicted, using SnpEff. For insertion and deletion (InDel) identification, raw gapped alignments were realigned using the GATK RealignerTargetCreator and IndelRealigner tools. Variants were called using the bcftools mpileup and call tools. Variants were tagged with quality score (Q) = 30 and depth (d) = 50 using the vcf-annotate option, and only InDels with quality score (Q) = 30 and depth (d) = 100 were kept for further analysis. Insertions of 20-2000 bp and deletions of 20-2000 bp were identified using GATK's SelectVariants tool.</STUDY_ABSTRACT>
      <CENTER_PROJECT_NAME>P. falciparum PfΔmei2 genome sequence</CENTER_PROJECT_NAME>
      <STUDY_DESCRIPTION>In the PfΔmei2 mutant, the Open Reading Frame (ORF) of mei2 (PF3D7_0623400) has been removed from the genome of the human malaria parasite P. falciparum (NF54 strain) using well-established CRISPR/Cas9 technology. The mei2 gene of WT PfNF54 was deleted using a donor DNA plasmid and 2 different sgRNA-donor containing plasmids targeting the mei2 gene, following the CRISPR/Cas9 method described by Marin-Mogollon et al. 1. PfΔmei2 has been engineered to remove nearly all heterologous DNA that was introduced for deletion of the mei2 ORF; what remains is a 34 bp nucleotide sequence (the FRT sequence). Heterologous DNA used to generate the mei2 deletion was flanked by two FRT sequences and was excised in the presence of FLPe recombinase, leaving one FRT sequence in the genome, flanked by 16 bp and 14 bp cloning restriction sites. This method of removing heterologous DNA by FLPe recombinase is similar to the method described by Roestenberg et al. 2. Whole genome sequencing showed correct deletion of the mei2 gene, confirmed the absence of sequences used in the CRISPR/Cas9, gDNA and flpe plasmids (cas9, ampicillin, blasticidin, hdhfr, yfcu, flpe recombinase) and showed the absence of unwanted recombination events in the endogenous 5'- and 3'-UTR sequences that were used in the plasmids to drive gene expression. Whole genome sequencing was performed at the King Abdullah University of Science and Technology (KAUST, Thuwal, Saudi Arabia; Prof. Arnab Pain). A total of 200 ng of DNA was used for DNA library preparation using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB). After library quantification and size verification, the DNA library was sequenced on a MiSeq platform (Illumina), producing 2x150 bp paired-end reads. The quality of the raw reads was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc).
Low-quality reads and Illumina adaptor sequences at the ends of the reads were removed using Trimmomatic (PMID: 24695404). Quality-trimmed reads were mapped to the P. falciparum 3D7 reference genome (release 40 in PlasmoDB, http://www.plasmoddb.org) using BWA 3. Read pairing information and flags were corrected, and duplicate reads were removed, using Picard's CleanSam, FixMateInformation, and MarkDuplicates tools. SNPs were called using the Genome Analysis Toolkit (GATK) best practices pipeline 4. Identified SNPs were filtered using vcftools to keep high-quality SNPs with quality score (Q) = 30 and depth (d) = 50. A total of 167 high-quality SNPs were identified. SNPs were annotated, and their effects on the coding sequences of genes were predicted, using SnpEff. For insertion and deletion (InDel) identification, raw gapped alignments were realigned using the GATK RealignerTargetCreator and IndelRealigner tools. Variants were called using the bcftools mpileup and call tools. Variants were tagged with quality score (Q) = 30 and depth (d) = 50 using the vcf-annotate option, and only InDels with quality score (Q) = 30 and depth (d) = 100 were kept for further analysis. Insertions of 20-2000 bp and deletions of 20-2000 bp were identified using GATK's SelectVariants tool.</STUDY_DESCRIPTION>
    </DESCRIPTOR>
    <STUDY_ATTRIBUTES>
      <STUDY_ATTRIBUTE>
        <TAG>ENA-FIRST-PUBLIC</TAG>
        <VALUE>2021-08-24</VALUE>
      </STUDY_ATTRIBUTE>
      <STUDY_ATTRIBUTE>
        <TAG>ENA-LAST-UPDATE</TAG>
        <VALUE>2021-08-24</VALUE>
      </STUDY_ATTRIBUTE>
    </STUDY_ATTRIBUTES>
  </STUDY>
</STUDY_SET>
