Protein Family Neighborhood Analysis - Gene list



In prokaryotic genomes, functionally coupled genes can be organized in conserved clusters of neighboring genes, called operons, which enables their coordinated regulation. Thus, it is possible to predict the function of uncharacterised genes by analysing functional annotations within their neighborhoods. Here, we present an algorithm that gives insight into the genomic neighborhoods of a query protein family by calculating the statistical significance of overrepresentation of functional domains in those neighborhoods.
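As a rough illustration of what an overrepresentation test can look like, the sketch below computes a one-sided hypergeometric p-value for a single domain. The choice of the hypergeometric test, the function name `enrichment_p_value`, and the example counts are assumptions made for illustration; this page does not state which test the server actually uses.

```python
from math import comb


def enrichment_p_value(background_total, background_with_domain,
                       neighborhood_total, neighborhood_with_domain):
    """One-sided hypergeometric p-value: P(X >= neighborhood_with_domain).

    Assumption: neighborhood genes are treated as a sample drawn without
    replacement from the background gene set.
    """
    N, K = background_total, background_with_domain
    n, k = neighborhood_total, neighborhood_with_domain
    upper = min(K, n)
    # Sum the probability mass of observing k or more genes with the domain.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 40 of 10,000 background genes carry the domain,
# and 5 of 50 neighborhood genes do.
p = enrichment_p_value(10_000, 40, 50, 5)
print(f"p-value: {p:.3g}")
```

A small p-value here means the domain appears in the neighborhoods more often than its background frequency would suggest.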

» As a query, the user must upload a file that contains a list of genes (one gene per line) given as IMG-JGI gene identifiers.

» The size of the neighborhood (e.g. 5000 bp) is given as the number of base pairs (bp) before and after the gene whose protein product possesses the query Pfam domain. Typically, a neighborhood size of ±5000 bp corresponds to roughly 10 genes in the whole neighborhood.

» Because the false discovery rate can be high when multiple tests are performed (e.g. when many Pfam domains are evaluated for overrepresentation), the user can choose the method of multiple test correction: Bonferroni correction or the Benjamini-Hochberg procedure.
» The genomic data used by this server is downloaded from IMG-JGI (Integrated Microbial Genomes & Microbiomes) and contains complete prokaryotic genomes as well as incomplete genomic data from environmental sequencing.
» Currently, genes encoded on both strands are considered during neighborhood analysis.
» The cut-off value limits the output based on p-value. The name of the analysis is optional.
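To make the difference between the two correction options concrete, here is a minimal sketch of Bonferroni and Benjamini-Hochberg adjustment followed by a cut-off filter. The function names, the domain labels, and the raw p-values are hypothetical, and applying the cut-off to the adjusted p-values is an assumption; this page does not describe how the server implements these steps.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four domains (labels are placeholders).
raw = {"domain_A": 0.001, "domain_B": 0.01, "domain_C": 0.03, "domain_D": 0.5}
cutoff = 0.05  # assumed here to apply to the adjusted p-values

names = list(raw)
bonf = dict(zip(names, bonferroni(list(raw.values()))))
bh = dict(zip(names, benjamini_hochberg(list(raw.values()))))

significant_bonf = [n for n in names if bonf[n] <= cutoff]
significant_bh = [n for n in names if bh[n] <= cutoff]
print("Bonferroni:", significant_bonf)
print("Benjamini-Hochberg:", significant_bh)
```

With these made-up values, Bonferroni keeps two domains while Benjamini-Hochberg keeps three, which reflects the general pattern that Bonferroni is the more conservative of the two.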


Upload file that contains gene list to analyze

Example input file