The Big Data for Discovery Science Center (BDDS), composed of leading experts in biomedical imaging, genetics, proteomics, and computer science, is taking an "-ome to home" approach toward streamlining big data management, aggregation, manipulation, integration, and the modeling of biological systems across spatial and temporal scales.


The PANTHER (Protein ANalysis THrough Evolutionary Relationships) classification system combines gene function classifications, ontologies, pathways, and statistical analysis tools so that biologists can analyze large-scale, genome-wide data from sequencing, proteomics, or gene expression experiments. The system is built from 104 complete genomes whose genes are organized into families and subfamilies; the evolutionary relationships among them are captured in phylogenetic trees, multiple sequence alignments, and statistical models (hidden Markov models, or HMMs). Genes are classified by function in several ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER or Reactome pathways.

PANTHER provides two statistical tests that let users analyze large-scale, genome-wide experimental data against current annotated gene sets, including Gene Ontology and PANTHER Pathway annotations. These tests are widely used by bench scientists, bioinformaticians, computer scientists, and systems biologists. The first tool is the PANTHER overrepresentation test, which compares a test gene list to a reference gene list and determines whether a particular class of genes (e.g. molecular function, biological process, cellular component, PANTHER protein class, PANTHER pathway, or Reactome pathway) is overrepresented or underrepresented in the test list. A more detailed description of the algorithm is available here. The second tool is the PANTHER enrichment test, which takes a numerical value for each gene and determines whether the values of the genes associated with a particular ontology class or pathway were drawn randomly from the overall distribution of values. The Mann-Whitney U test (Wilcoxon rank-sum test) is used to determine the P-value. A more detailed description of the algorithm is available here. Both tools can be accessed via web-service calls. Support for the VCF file format is available from the PANTHER website.
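The two tests can be illustrated with a minimal Python sketch. This is not PANTHER's implementation. The function names are hypothetical, the overrepresentation statistic is assumed here to be a hypergeometric tail probability (the text above does not say which statistic PANTHER uses), and the Mann-Whitney P-value uses a normal approximation with tie correction instead of an exact calculation. No multiple-testing correction is applied.

```python
from math import comb, erfc, sqrt


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the class."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / comb(N, n)


def overrepresentation(test_genes, reference_genes, class_genes):
    """Compare a test list to a reference list for one annotation class.

    Assumption: hypergeometric model; genes outside the reference are ignored.
    """
    ref = set(reference_genes)
    test = set(test_genes) & ref
    members = set(class_genes) & ref
    k, n, K, N = len(test & members), len(test), len(members), len(ref)
    return {
        "observed": k,
        "expected": n * K / N,
        "p_over": hypergeom_tail(k, n, K, N),            # P(X >= k)
        "p_under": 1.0 - hypergeom_tail(k + 1, n, K, N),  # P(X <= k)
    }


def mann_whitney_u(class_values, other_values):
    """Return (U, two-sided P) for the class values against all other values.

    Ties receive average ranks; P uses the normal approximation.
    """
    n1, n2 = len(class_values), len(other_values)
    pooled = sorted([(v, 0) for v in class_values] + [(v, 1) for v in other_values])
    rank_sum, tie_term, i = 0.0, 0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        tied = j - i
        tie_term += tied ** 3 - tied
        rank_sum += avg_rank * sum(1 for _, group in pooled[i:j] if group == 0)
        i = j
    u = rank_sum - n1 * (n1 + 1) / 2
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))


if __name__ == "__main__":
    # Made-up gene identifiers and values, for illustration only.
    reference = [f"g{i}" for i in range(20)]
    gene_class = reference[:5]
    test_list = ["g0", "g1", "g2", "g3", "g10"]
    print(overrepresentation(test_list, reference, gene_class))
    print(mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

In practice each test would be repeated for every ontology class or pathway, so the resulting P-values would normally need a correction for multiple comparisons before interpretation.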

