Comment[GEAAccession]	E-GEAD-398
MAGE-TAB Version	1.1
Investigation Title	eQTL summary of ImmuNexUT
Experiment Description	eQTL top variant table of 28 immune cell subsets from the ImmuNexUT cohort. Whole genome sequencing was performed on whole blood samples. RNA-seq was performed on samples of each immune cell subset. After filtering and normalization of the gene expression data, eQTL analysis was performed in each immune cell type (Naive_CD4, Mem_CD4, Fr._I_nTreg, Fr._II_eTreg, Fr._III_T, Th1, Th2, Th17, Tfh, NK, Naive_CD8, Mem_CD8, EM_CD8, CM_CD8, TEMRA_CD8, Naive_B, USM_B, SM_B, DN_B, Plasmablast, CL_Mono (or CD16n_Mono), CD16p_Mono, Int_Mono, NC_Mono, mDC, pDC, LDG, Neu).
Experimental Design	case control design	translational bias design	all pairs
Experimental Factor Name	disease
Experimental Factor Type	disease
Person Last Name	Nagafuchi
Person First Name	Yasuo
Person Affiliation	Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo
Person Roles	submitter
Public Release Date	2021-04-30
Publication DOI	10.1016/j.cell.2021.03.056
Protocol Name	P-GEAD-654	P-GEAD-655	P-GEAD-656	P-GEAD-657	P-GEAD-658	P-GEAD-659
Protocol Type	sample collection protocol	nucleic acid extraction protocol	nucleic acid labeling protocol	nucleic acid hybridization to array protocol	array scanning and feature extraction protocol	normalization data transformation protocol
Protocol Description	Whole blood samples were collected from 416 individuals in the ImmuNexUT cohort.	Genomic DNA was isolated from peripheral blood using the QIAamp DNA Blood Midi Kit (QIAGEN).	nucleic acid library construction protocol; Libraries for whole genome sequencing were prepared using the TruSeq DNA PCR-Free Library Prep Kit (Illumina).	dummy protocol	nucleic acid sequencing protocol; Whole genomes were sequenced on Illumina HiSeq X with 151-bp paired-end reads.	WGS data processing was performed based on the standardized best-practice method proposed by GATK. Samples with genotyping call rates < 99% were removed. We used BEAGLE to impute missing genotypes. Variants with minor allele frequency < 1% were excluded. Genes expressed at low levels (< 5 counts in more than 80% of samples or < 0.5 CPM in more than 80% of samples) were filtered out in each cell subset. The remaining expression data were normalized between samples with TMM, converted to CPM and then normalized across samples using an inverse normal transform. The Probabilistic Estimation of Expression Residuals (PEER) method was applied to the normalized expression data to infer hidden covariates. The top 2 genetic principal components, sample collection phase, clinical diagnosis, sex and latent factors were used as covariates for eQTL analysis. Mem CD8 cells, which were collected in phase 1 and divided into CM CD8 and EM CD8 in phase 2, were analyzed jointly with EM CD8 for eQTL analysis because the majority of the Mem CD8 population consisted of EM CD8. For each cell subset, we used a QTLtools permutation pass with 10,000 permutations to obtain gene-level nominal P value thresholds corresponding to FDR < 0.05. We subsequently performed forward-backward stepwise regression eQTL analysis with a QTLtools conditional pass.
SDRF File	E-GEAD-398.sdrf.txt
Comment[Number of channel]	single-channel
Comment[Array Design REF]	A-GEAD-11
Comment[AEExperimentType]	transcription profiling by array
Comment[BioProject]	PRJDB10692
Comment[Last Update Date]	2021-04-30