Biology

Chemistry

Biochemical Enrichment Analysis

Informatics

Identification of altered
biochemical domains between
pumpkin and tomatillo leaf
metabolites

Goal: Identify significantly over-represented biological
pathways based on significant differences in leaf
metabolites (use the data file: Pathway Enrichment data.csv)
Topics:
1. KEGG Database
2. MetaboAnalyst: Pathway enrichment analysis
3. MBRole: Over Representation Analysis
4. Hypergeometric test for enrichment
KEGG Pathway Visualization

Goals:
Use KEGG to:
1. Overview glutamate entry in KEGG (C00025)
2. Visualize a pathway of interest
3. Map metabolite of interest to pathway
• Glutamate entry: http://www.kegg.jp/dbget-bin/www_bget?C00025
• Pathway map: http://www.kegg.jp/kegg-bin/show_pathway?org_name=ath&mapno=00250

• Mapping example
C00064 green, black
C00025 green, black
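
The mapping example can be generated from a list of KEGG compound IDs rather than typed by hand. The following is a minimal sketch in R, assuming the KEGG coloring tool accepts one "ID background,foreground" entry per line as the example suggests; the variable names are illustrative only.

```r
# KEGG compound IDs from the mapping example
compounds <- c("C00064", "C00025")

# Assumed entry layout: "<ID> <background color>,<text color>"
bg <- "green"
fg <- "black"
mapping <- paste0(compounds, " ", bg, ",", fg)

# One entry per line, ready to paste into the KEGG color-mapping input box
cat(mapping, sep = "\n")
```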

Pathway Visualization

Pathway over representation
analysis (ORA)

Steps:
1. Use MBRole to conduct:
• Pathway over representation analysis
• url: http://csbg.cnb.csic.es/mbrole/

ORA:
• is used to evaluate whether a particular set of
metabolites is represented more than expected by
chance within a given compound list
[doi: 10.1093/nar/gkq329].
• p-value is calculated using hypergeometric or
Fisher’s exact test
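
For reference, the hypergeometric p-value can be written out explicitly. The symbols below are notation introduced here, not taken from the slides: N is the number of metabolites in the background, K the number of those that belong to the pathway, n the size of the submitted compound list, and k the number of submitted compounds found in the pathway.

```latex
p = P(X \ge k) = \sum_{i=k}^{\min(n,K)}
    \frac{\binom{K}{i}\,\binom{N-K}{n-i}}{\binom{N}{n}}
```

A small p means that k or more pathway members would rarely appear in a list of size n drawn at random from the background.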

MBRole:
Pathway Over Representation
Analysis (ORA)

Goal: Identify an over-represented pathway and visualize it in KEGG
MBRole

http://www.genome.jp/kegg-bin/show_pathway?map01070+C06427+C00158+C00049+C00493+C00079+C00026+C00042+C00751+C00149+C00078+C00073+
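
A link of this form can be assembled from a pathway map ID and the compound IDs reported for that pathway. The sketch below is in R; `kegg_pathway_url` is a hypothetical helper, and the base address is an assumption taken from the usual `kegg-bin` location of the link above. Only the first three compounds of the link are used.

```r
# Hypothetical helper: join a map ID and compound IDs with "+",
# following the "show_pathway?map+compound+compound" pattern of the link above.
kegg_pathway_url <- function(map_id, compounds,
                             base = "http://www.genome.jp/kegg-bin/show_pathway") {
  paste0(base, "?", paste(c(map_id, compounds), collapse = "+"))
}

# First three compounds from the link above
url <- kegg_pathway_url("map01070", c("C06427", "C00158", "C00049"))
print(url)
```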

Test for significance:
Hypergeometric Test

How do we calculate statistics to determine network enrichment?
hit.num <- 51    # significantly changed metabolites that are in the pathway
set.num <- 1455  # metabolites in the pathway
full    <- 3358  # all possible metabolites in the organism
q.size  <- 72    # significantly changed metabolites (query list)

# P(X >= hit.num): hit.num - 1 is used because lower.tail = FALSE gives P(X > x)
phyper(hit.num - 1, set.num, full - set.num, q.size, lower.tail = FALSE)
# = 1.717553e-06
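
The same numbers can be used to check that the hypergeometric upper tail matches a one-sided Fisher's exact test, the other test named for ORA. This is a minimal sketch in R using only base functions (`phyper`, `fisher.test`); the 2 x 2 table layout and the names `tab`, `p_hyper`, `p_fisher`, and `expected_hits` are illustrative choices, not part of the original example.

```r
# Values from the example above
hit.num <- 51    # changed metabolites in the pathway
set.num <- 1455  # metabolites in the pathway
full    <- 3358  # all metabolites in the organism
q.size  <- 72    # changed metabolites

# Hypergeometric upper tail: P(X >= hit.num)
p_hyper <- phyper(hit.num - 1, set.num, full - set.num, q.size,
                  lower.tail = FALSE)

# Equivalent 2 x 2 contingency table
tab <- matrix(c(hit.num,           q.size - hit.num,
                set.num - hit.num, full - set.num - (q.size - hit.num)),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("changed", "not changed"),
                              c("in pathway", "not in pathway")))

# One-sided Fisher's exact test (enrichment only)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Number of hits expected by chance, for comparison with hit.num
expected_hits <- q.size * set.num / full

print(c(hypergeometric = p_hyper, fisher = p_fisher))
print(expected_hits)
```

Comparing `hit.num` with `expected_hits` gives a quick sense of effect size alongside the p-value.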

MetaboAnalyst:
Pathway Enrichment Analysis
(PEA)

Use MetaboAnalyst to conduct:
• Pathway enrichment analysis
• url: http://www.metaboanalyst.ca/MetaboAnalyst/faces/UploadView.jsp

PEA:
• is an advanced form of over representation analysis (ORA) which takes
into account pathway topology and is based on gene set enrichment
analysis (GSEA) [doi:10.1093/bioinformatics/btq418]
• p-value is calculated using hypergeometric or Fisher’s exact test

Questions:
1. What pathway is the most important based on ORA and topology?

KEGG Pathway Enrichment

6 metabolite enrichment analysis