Comment[GEAAccession]	E-GEAD-452
MAGE-TAB Version	1.1
Investigation Title	The transcriptome analysis of controls and Mesp1 null embryos at gastrulation stage in mice
Experiment Description	New Mesp1 KO mice were established using genome editing techniques, without introducing the selection markers commonly used previously; these mice did not display any phenotype. To reveal gene expression changes in the KO mice, we performed transcriptome analysis of control and Mesp1 KO embryos at the early bud and late bud stages.
Experimental Design	development or differentiation design	genotype design
Experimental Factor Name	genotype
Experimental Factor Type	genotype
Person Last Name	Muraoka	Ajima	Saga
Person First Name	Masafumi	Rieko	Yumiko
Person Affiliation	Mammalian Development Laboratory, Department of Gene Function and Phenomics, National Institute of Genetics		
Person Roles	submitter	submitter	submitter
Public Release Date	2021-10-07
Protocol Name	P-GEAD-938	P-GEAD-939	P-GEAD-940	P-GEAD-941	P-GEAD-942
Protocol Type	sample collection protocol	nucleic acid extraction protocol	nucleic acid library construction protocol	nucleic acid sequencing protocol	normalization data transformation protocol
Protocol Description	Whole E7.5 mouse embryos were dissected in ice-cold PBS. A small piece of extraembryonic tissue was trimmed for genotyping, and the rest of the tissue was frozen in liquid nitrogen and stored at -80 degrees C until total RNA preparation.	Two stage-matched embryos were lysed together in 500 ul of TRIzol Reagent (Thermo Fisher Scientific) and heated at 55 degrees C for 5 minutes. After addition of 100 ul of chloroform, the lysate was centrifuged at 12,000 x g at 4 degrees C for 15 minutes, and the supernatant was saved. The supernatant was combined with 300 ul of ethanol and applied to an RNeasy MinElute Spin Column, and total RNA was harvested according to the manufacturer's instructions for the RNeasy Micro Kit (Qiagen).	Total RNA was used to make an RNA-seq library using the KAPA Stranded mRNA-Seq Kit for Illumina platforms (KAPA), and the library was barcoded with DNA adapters using the KAPA single-indexed adapter kit (KAPA). Quality assessment was performed using the Agilent DNA 1000 Kit (Agilent).	The libraries for each sample were sequenced as 100-bp paired-end reads on an Illumina HiSeq 2500 (Illumina) according to the manufacturer's protocol.	For all libraries, low-quality sequences and adapters were trimmed or removed using Cutadapt (version 3.4) with the following options: \"-j 8 -m 36 -q 20 -a GATCGGAAGAGCACACGTCTGAACTCCAGTCAC\". The raw and processed reads were checked using FastQC (version 0.11.9, http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). In preparation for mapping reads to the mouse reference genome, the Ensembl mouse reference genome (release-102, Mus_musculus.GRCm38.dna.primary_assembly.fa.gz) and the gene annotation file in General Transfer Format (GTF) (Mus_musculus.GRCm38.102.gtf.gz) were downloaded from the Ensembl FTP site (http://ftp.ensembl.org/), and then non-chromosomal sequences (scaffolds) and the related annotation data were removed from the reference genome and the gene annotation GTF file, respectively.
To increase the mapping accuracy of spliced reads, splice-site and exon information was extracted from the gene annotation GTF file using the Python scripts hisat2_extract_splice_sites.py and hisat2_extract_exons.py, respectively, from the HISAT2 (version 2.2.1) package. The HISAT2 index files of the reference genome were built, including the extracted genomic information, using the \"hisat2-build\" command with the options \"--ss\" and \"--exon\". Clean reads were then mapped to the HISAT2 index using HISAT2 with default options. The resulting Sequence Alignment Map (SAM) files were sorted by genomic coordinates and converted to Binary Alignment Map (BAM) files using the SAMtools (version 1.13) \"sort\" command with the option \"-O BAM\". Raw read counts per gene were calculated using featureCounts (version 2.0.3) with the options \"-s 2 -T 8 -t exon -g gene_id -a Mus_musculus.GRCm38.102.gtf\". Normalized counts and statistical values for differential gene expression analysis were calculated using the Bioconductor DESeq2 package (version 1.32.0) in R (version 4.1.0). To check the gene expression correlation between samples, pair-wise scatter plots were produced using log2(normalized counts + 1), the \"cor\" function with the parameters \"method='spearman', use='pairwise.complete.obs'\", and the ggplot2 package (version 3.3.5) in R. For principal component analysis (PCA), variance-stabilizing-transformed (vst) normalized counts were calculated using the vst function of DESeq2 with default settings, and PCA was performed on the top 500 most variable genes using DESeq2's plotPCA function. To check differentially expressed genes (DEGs), MA plots for each comparison of sample groups (Het vs KO in EB or LB) were produced from the results of the DESeq2 analysis using ggplot2. DEGs were detected using DESeq2 with a cut-off of adjusted p-value < 0.01.
SDRF File	E-GEAD-452.sdrf.txt
Comment[AEExperimentType]	RNA-seq of coding RNA
Comment[BioProject]	PRJDB12228
Comment[Last Update Date]	2021-10-07