Introduction

Rapeseed (Brassica napus L.) is an important source of edible oil and protein-rich livestock feed worldwide. B. napus (AACC) originated from interspecific hybridization between two diploid progenitors, B. rapa (AA, n = 10) and B. oleracea (CC, n = 9), less than 7500 years ago. In our previous study, we resequenced a worldwide collection of 991 B. napus germplasm accessions, including 658 winter types, 145 semi-winter types and 188 spring types, from 39 countries (Wu et al., 2019).

The geographic distribution of rapeseed accessions

In genetics, a genome-wide association study (GWAS), also known as a whole-genome association study (WGAS), is an observational study of a genome-wide set of genetic variants in different individuals to see whether any variant is associated with a trait. A GWAS typically focuses on associations between single-nucleotide polymorphisms (SNPs) and traits, here major agronomic traits.
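To make the idea of "association between a SNP and a trait" concrete, the sketch below regresses a phenotype on allele dosage for a single SNP. It is only an illustration: it uses ordinary least squares with no correction for population structure or kinship, so it is not the EMMAX model this app runs. The function name and the 0/1/2 dosage coding are assumptions made for the example.

```python
import math


def snp_association(genotypes, phenotypes):
    """Regress phenotype on allele dosage (0/1/2) for one SNP.

    Returns (slope, t_statistic). Samples whose phenotype is None
    (missing, i.e. NA) are skipped. Plain least squares only: no
    kinship or population-structure correction is applied.
    """
    pairs = [(g, y) for g, y in zip(genotypes, phenotypes) if y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least 3 samples with a phenotype")
    mean_g = sum(g for g, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((g - mean_g) ** 2 for g, _ in pairs)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in the phenotyped samples")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_g
    # Residual sum of squares and standard error of the slope.
    rss = sum((y - intercept - slope * g) ** 2 for g, y in pairs)
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, t_stat


# Toy data: phenotype rises by about 1 unit per extra allele copy.
slope, t_stat = snp_association(
    [0, 0, 1, 1, 2, 2, 1],
    [1.0, 1.2, 2.0, 2.2, 3.0, 3.2, None],
)
```

A large absolute t statistic corresponds to a small p-value for that SNP; a GWAS repeats such a test for every SNP in the genome.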

To make better use of this large collection of B. napus germplasm accessions, we developed this interactive application (BnaGWAS) in R with Shiny. The application can run GWAS, visualize GWAS results (Manhattan plot and QQ plot), extract genes near significant SNPs and annotate those genes.

Data input

phenotype data (.txt)

Note: the samples you upload MUST be the 300 core collection samples used here. If some of these samples are missing from your list, add them and set their value to NA. If your list contains samples that are not among the 300 core collection samples, remove them.

We highly recommend first downloading the example of the expected input phenotype dataset below, and then replacing the phenotype values with your own data.

You only need to upload your phenotype data to run GWAS. Here we use only the 300 core collection germplasm accessions, which represent most of the genetic resources of the 1000 B. napus germplasm accessions. An example of the expected input data format is shown below:


R4157	0.859791123
R4158	0.87369142
R4163	0.842593709
R4168	0.884782609
R4171	NA
R4176	0.885619807
R4177	0.885884455
R4179	0.879374612
R4180	0.878567797
R4182	0.868825911
…	…

Here, column one contains the sample names and column two contains the phenotype values.

An example of the expected input phenotype dataset is available here.
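The sample rule above (add missing core samples with NA, drop samples outside the core collection) can be sketched as a small helper. This is a minimal illustration, not part of the app: the function name is hypothetical, the core sample list must come from the downloadable example file, and writing the output in the order of the core list is an assumption.

```python
def align_phenotypes(core_samples, phenotype_lines):
    """Return one 'sample<TAB>value' line per core sample.

    Core samples missing from the input get the value NA; input samples
    that are not in core_samples are dropped.
    """
    values = {}
    for line in phenotype_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        values[fields[0]] = fields[1]
    return [f"{sample}\t{values.get(sample, 'NA')}" for sample in core_samples]


# Three names taken from the example above stand in for the full core list.
core = ["R4157", "R4158", "R4171"]
raw = [
    "R4157\t0.859791123",
    "R9999\t0.5",  # hypothetical sample outside the core collection
    "R4158\t0.87369142",
]
aligned = align_phenotypes(core, raw)
```

Writing `"\n".join(aligned)` to a .txt file with no header line gives a file in the two-column layout shown above.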

Other parameters

Next, enter your trait name (recommended; default: Bna_trait). Currently only the EMMAX model is supported. When everything is ready, click Run Analysis to start the GWAS.

Visualization

The Visualization section of this app draws the Manhattan plot and the QQ plot. You can choose the two alternating colors used for adjacent chromosomes and the significance threshold on the -log10(p-value) scale (default: 5).
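Both plots are drawn on the -log10(p-value) scale. The sketch below shows how the plotted values can be derived from raw p-values; it is an illustration under stated assumptions, not the app's own code. The function names are hypothetical, the expected quantiles use the common (i - 0.5)/n convention, and treating a SNP exactly at the threshold as significant is an assumption.

```python
import math


def neg_log10(p_values):
    """Convert p-values (each > 0) to the -log10 scale used on both plots."""
    return [-math.log10(p) for p in p_values]


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a QQ plot.

    Under the null hypothesis p-values are uniform on (0, 1), so the i-th
    smallest of n p-values is expected near (i - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted(neg_log10(p_values), reverse=True)
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))


def significant_indices(p_values, threshold=5.0):
    """Indices of SNPs whose -log10(p) reaches the threshold."""
    return [i for i, v in enumerate(neg_log10(p_values)) if v >= threshold]


p_values = [0.5, 1e-6, 0.01, 1e-4]
points = qq_points(p_values)
hits = significant_indices(p_values)  # only the SNP with p = 1e-6
```

In a Manhattan plot the same -log10(p) values are plotted against genomic position, with the threshold drawn as a horizontal line.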

Extraction

Significant genes are extracted based on the p-values of SNPs. Here you need to choose the p-value threshold and the distance upstream and downstream of each significant SNP (recommended default: 75 kb).
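The window-based extraction described above can be sketched as follows. This is a minimal illustration rather than the app's implementation: the function name, the tuple layouts, the gene IDs and the default threshold of 5 are hypothetical, and counting a gene when any part of it overlaps the window is an assumption.

```python
import math


def genes_near_snps(snps, genes, min_neg_log10=5.0, window_kb=75):
    """Return IDs of genes overlapping a window around significant SNPs.

    snps:  iterable of (chrom, position, p_value)
    genes: iterable of (gene_id, chrom, start, end)
    """
    window = window_kb * 1000
    hits = [
        (chrom, pos)
        for chrom, pos, p in snps
        if -math.log10(p) >= min_neg_log10
    ]
    selected = []
    for gene_id, chrom, start, end in genes:
        for snp_chrom, pos in hits:
            # Keep the gene if it overlaps [pos - window, pos + window].
            if snp_chrom == chrom and start <= pos + window and end >= pos - window:
                selected.append(gene_id)
                break
    return selected


snps = [("A01", 1_000_000, 1e-7), ("A01", 5_000_000, 0.2), ("C03", 200_000, 1e-6)]
genes = [
    ("geneA", "A01", 1_050_000, 1_052_000),  # 50 kb from a significant SNP
    ("geneB", "A01", 1_100_000, 1_101_000),  # 100 kb away
    ("geneC", "A01", 5_000_100, 5_001_000),  # next to a non-significant SNP
    ("geneD", "C03", 150_000, 160_000),      # 40 kb from a significant SNP
]
nearby = genes_near_snps(snps, genes)
```

With the 75 kb window, geneA and geneD are selected; widening the window to 100 kb also picks up geneB.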

Annotation

This section is designed for gene annotation based on different databases (eggNOG, GO, KEGG, NR, etc.).

Analysis Workflow

Step 1: Data Input. Upload your phenotype data in the RUN GWAS section; a distribution plot of your input data will be displayed.

Step 2: Visualization. Generate the Manhattan plot and QQ plot.

Step 3: Extraction. Obtain the significant genes.

Step 4: Annotation. Annotate the significant genes based on different databases.

Frequently Asked Questions and Answers

Q1. How to organize my own phenotype data?

A: Currently the input data MUST be in .txt format. We highly recommend first downloading the example of the expected input phenotype dataset from here, and then replacing the phenotype values with your own data.

Q2. Can I choose other models (GLM, MLM, CMLM, etc.) to run GWAS?

A: No, this app can only run the EMMAX model. Other GWAS models would be easy to implement, but they take longer to run.

Q3. Can I run GWAS on a qualitative trait?

A: Yes! Both quantitative and qualitative traits are fully supported.

Q4. How to choose the distance up/down-stream for gene extraction?

A: This depends on your GWAS results and on the linkage disequilibrium (LD) in your samples. Here we recommend 75 kb to 100 kb.

Feedback

This app is developed and maintained by Tao Yan at the JiangLX Lab, Institute of Crop Science, College of Agriculture and Biotechnology (CAB), Zhejiang University.

If you have any questions, comments, or suggestions, feel free to contact the developer at tyan<at>zju.edu.cn.

Step 1: Upload Your Phenotype File

Upload Your Phenotype File (supports txt, csv and xlsx; NO header!) (leave blank for an example run).

Step 2: Select Reference Genome, GWAS Model and Enter Trait Name

Select The Reference Genome:

Enter Your Trait Name:

Select the GWAS Model (currently only the EMMAX model is supported):

Distribution Of Your Phenotype

Result of GWAS

Download GWAS Results

If you use EMMAX to publish research, please cite:

Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42:348-54.

Customize the Manhattan Plot and QQ Plot

Select Color1:

Select Color2:

Choose -log10(p-value):

Manhattan Plot

Manhattan Plot Download

QQ plot

QQ Plot Download

Extract genes based on significant SNPs

Choose significant -log10(p-value):

Choose the distance (kb):

Genes extracted based on significant SNPs

Download Significant Genes

Gene annotation based on different databases

If you are sure your previous analyses are correct, click the button: Run Annotation

Gene annotation

Download Genes with Annotation