Introduction
Rapeseed (Brassica napus L.) is an important source of edible oil and protein-rich livestock feed worldwide. B. napus (AACC) originated from interspecific hybridization between two diploid progenitors, B. rapa (AA, n = 10) and B. oleracea (CC, n = 9), less than 7500 years ago. In our previous study, we resequenced a worldwide collection of 991 B. napus germplasm accessions from 39 countries, comprising 658 winter types, 145 semi-winter types and 188 spring types (Wu et al., 2019).
The geographic distribution of rapeseed accessions
In genetics, a genome-wide association study (GWAS), also known as a whole-genome association study (WGAS), is an observational study of a genome-wide set of genetic variants in different individuals to see whether any variant is associated with a trait. GWAS typically focus on associations between single-nucleotide polymorphisms (SNPs) and traits of interest, such as major agronomic traits.
To make better use of this large collection of B. napus germplasm accessions, we developed this interactive application (BnaGWAS) in R with Shiny. The application can run GWAS, visualize the results (Manhattan plot and QQ plot), extract significant genes and annotate those genes.
Data input
phenotype data (.txt)
Note: the samples you upload MUST be the 300 core collection samples used here. If any core collection sample is missing from your list, add it and set its value to NA. If your list contains samples that are not among the 300 core collection samples, remove them.
We highly recommend first downloading the example of the expected input phenotype dataset below, and then replacing the phenotype values with your own data.
You only need to upload your phenotype data to run GWAS. Here we use only the 300 core collection germplasm accessions, which represent most of the genetic resources of the 1000 B. napus germplasm accessions. An example of the expected input data format is shown below:
R4157 | 0.859791123 |
R4158 | 0.87369142 |
R4163 | 0.842593709 |
R4168 | 0.884782609 |
R4171 | NA |
R4176 | 0.885619807 |
R4177 | 0.885884455 |
R4179 | 0.879374612 |
R4180 | 0.878567797 |
R4182 | 0.868825911 |
… | … |
Column one contains the sample IDs and column two contains the phenotype values.
An example of the expected input phenotype dataset is available here.
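The sample-matching rule above (add missing core samples as NA, drop samples outside the core collection) can be sketched in a few lines of Python. This is only an illustration and not part of BnaGWAS: the function names, the three-sample core list and the extra sample ID X0001 are hypothetical, and a tab-separated file without a header line is an assumption based on the example table.

```python
import csv
import io


def align_phenotypes(core_samples, user_values, missing="NA"):
    """Return (sample, value) rows in core-collection order.

    Core samples absent from user_values get the missing marker;
    user samples that are not in the core collection are dropped.
    """
    rows = []
    for sample in core_samples:
        value = user_values.get(sample)
        rows.append((sample, missing if value is None else value))
    return rows


def write_phenotype_table(rows, handle):
    """Write rows as tab-separated text with no header (assumed layout)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Hypothetical inputs: the real core list has 300 sample IDs.
core = ["R4157", "R4158", "R4171"]
measured = {"R4157": 0.859791123, "R4158": 0.87369142, "X0001": 0.5}

aligned = align_phenotypes(core, measured)
buffer = io.StringIO()
write_phenotype_table(aligned, buffer)
print(buffer.getvalue())
```

Here R4171 has no measurement and is written as NA, while X0001 is not a core sample and is left out. Replacing the StringIO buffer with a file opened for writing would produce a .txt file in the same layout.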
Other parameters
Next, enter your trait name (recommended) (default: Bna_trait). Only the EMMAX model is currently supported. Once everything is ready, click Run Analysis to start GWAS.
Visualization
The Visualization section of this app draws the Manhattan plot and QQ plot. You can choose the alternating colors for adjacent chromosomes and the p-value threshold (default: 5, presumably on the -log10(p) scale, since a raw p-value cannot exceed 1).
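As a rough sketch of what the two plots show, the Python snippet below converts p-values to -log10(p), applies a threshold, and pairs observed values with the values expected under the null hypothesis for a QQ plot. It is not the app's plotting code: the function names and the SNP records are made up for illustration, treating the threshold as a -log10(p) value is an assumption, and the (rank - 0.5)/n expectation is one common convention among several.

```python
import math


def significant_snps(snps, threshold=5.0):
    """Keep (chrom, pos, p) records whose -log10(p) reaches the threshold."""
    return [snp for snp in snps if -math.log10(snp[2]) >= threshold]


def qq_points(p_values):
    """Pair expected and observed -log10(p) values, largest first.

    Under the null hypothesis p-values are uniform on (0, 1), so the
    rank-th smallest of n p-values is expected near (rank - 0.5) / n.
    """
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    n = len(observed)
    expected = [-math.log10((rank - 0.5) / n) for rank in range(1, n + 1)]
    return list(zip(expected, observed))


# Made-up association results: (chromosome, position, p-value).
results = [("A01", 1200, 3e-7), ("A01", 5400, 0.02), ("C03", 880, 2e-6)]

hits = significant_snps(results)
points = qq_points([p for _, _, p in results])
print(hits)
print(points)
```

A Manhattan plot draws -log10(p) against genomic position with the threshold as a horizontal line. In a QQ plot, points that rise well above the diagonal at the upper end suggest true associations, whereas a shift along the whole range usually points to uncorrected population structure.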
Extraction
Significant genes are extracted based on the p-values of significant SNPs. Here you need to choose the p-value threshold and the distance upstream/downstream of the SNPs (recommended) (default: 75kb).
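The window-based extraction can be illustrated with a short Python sketch: a gene is reported when it overlaps the interval extending a fixed distance on either side of a significant SNP on the same chromosome. The function name, gene names and coordinates are hypothetical, and the overlap rule is an assumption about how such an extraction is typically done rather than a description of the app's internals.

```python
def genes_near_snps(snps, genes, window=75_000):
    """Return names of genes overlapping +/- window bp around any SNP.

    snps:  iterable of (chrom, position)
    genes: iterable of (name, chrom, start, end), with start <= end
    """
    hits = []
    for name, chrom, start, end in genes:
        for snp_chrom, pos in snps:
            same_chrom = chrom == snp_chrom
            overlaps = start <= pos + window and end >= pos - window
            if same_chrom and overlaps:
                hits.append(name)
                break  # report each gene once
    return hits


# Hypothetical significant SNP and gene models.
snps = [("A01", 500_000)]
genes = [
    ("geneA", "A01", 430_000, 432_000),
    ("geneB", "A01", 580_000, 590_000),
    ("geneC", "C03", 500_000, 501_000),
]

print(genes_near_snps(snps, genes))                  # 75 kb window
print(genes_near_snps(snps, genes, window=100_000))  # 100 kb window
```

With the 75 kb window only geneA is reported. Widening the window to 100 kb also picks up geneB, which shows how strongly the chosen distance affects the candidate gene list.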
Annotation
This section annotates genes using different databases (eggNOG, GO, KEGG, NR, etc.).
Analysis Workflow
Step 1: Data input. Upload your phenotype data via the RUN GWAS section; a distribution plot of your input data will be shown.
Step 2: Visualization. Generate the Manhattan plot and QQ plot.
Step 3: Extraction. Obtain the significant genes.
Step 4: Annotation. Annotate the significant genes based on different databases.
Frequently Asked Questions and Answers
Q1. How do I organize my own phenotype data?
A: The input data MUST currently be in .txt format. We highly recommend first downloading the example of the expected input phenotype dataset from here, and then replacing the phenotype values with your own data.
Q2. Can I choose other models (GLM, MLM, CMLM, etc.) to run GWAS?
A: No, this app can only run the EMMAX model. Other GWAS models would be easy to implement, but they take longer to run.
Q3. Can I run GWAS on a qualitative trait?
A: Yes! Both quantitative and qualitative traits are fully supported.
Q4. How do I choose the distance upstream/downstream for gene extraction?
A: This depends on your GWAS results and the linkage disequilibrium (LD) of your samples. Here we recommend 75kb~100kb.
Feedback
This app is developed and maintained by Tao Yan at the JiangLX Lab, Institute of Crop Science, College of Agriculture and Biotechnology (CAB), Zhejiang University.
If you have any questions, comments, or suggestions, feel free to contact the developer at tyan<at>zju.edu.cn.
Step1: Upload Your Phenotype File
Step2: Select Reference Genome, GWAS Model and Enter Trait Name
Distribution Of Your Phenotype
Result of GWAS
If you use EMMAX to publish research, please cite:
Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42:348-54.
Customize the visualization of the Manhattan plot and QQ plot
Manhattan Plot
QQ plot
Extract genes based on significant SNPs
Genes extracted based on significant SNPs
Gene annotation based on different databases
If you are sure your previous analyses are correct, then click the button: Run Annotation