Protein Family Neighborhood Analysis - Sequence
(All database)



In prokaryotic genomes, functionally coupled genes can be organized in conserved clusters of neighboring genes, called operons, which enables their coordinated regulation. It is therefore possible to predict the function of uncharacterised genes by analysing the functional annotations within their neighborhoods. Here, we present an algorithm that gives insight into the genomic neighborhoods of a query protein family by calculating the statistical significance of overrepresentation of functional domains in those neighborhoods.
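
The page does not say which statistical test the server uses. As a rough illustration only, overrepresentation of a domain in the neighborhoods is often scored with a one-sided hypergeometric test, sketched below. The function name `hypergeom_upper_tail`, its parameters, and the example counts are all assumptions made for this sketch, not the server's implementation.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    population: total number of genes in the background (assumed)
    successes:  background genes carrying the domain of interest
    draws:      number of genes found in the query neighborhoods
    k:          neighborhood genes carrying the domain of interest
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability of every outcome at least as extreme as k.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 1000 background genes, 50 with the domain,
# 20 genes in the neighborhoods, 5 of them with the domain.
p_value = hypergeom_upper_tail(5, 1000, 50, 20)
print(f"p-value for overrepresentation: {p_value:.3g}")
```

A small p-value here would mean the domain occurs in the neighborhoods more often than expected from its background frequency.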

» As the query, the user inputs a protein sequence in FASTA format, as in the example.
» The neighborhood size (e.g. 5000 bp) is given as the number of base pairs (bp) before and after the gene whose protein product possesses the query Pfam domain. Typically, a neighborhood size of ±5000 bp corresponds to about 10 genes in the whole neighborhood.
» Because performing many tests (e.g. evaluating many Pfam domains for overrepresentation) can lead to a high false discovery rate, the user can choose the method of multiple-testing correction: Bonferroni correction or the Benjamini-Hochberg procedure.
» The genomic data used by this server are downloaded from JGI IMG (Integrated Microbial Genomes & Microbiomes) and contain complete prokaryotic genomes as well as incomplete genomic data from environmental sequencing.
» Currently, the neighborhood analysis collects information about protein families from both strands. In the future, it will also be possible to restrict the analysis to the strand on which the query Pfam domain is present in the analysed genome.
» The cut-off value narrows the output based on p-value; the name of the analysis is only for the user's convenience.
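
To make the two correction options and the cut-off concrete, here is a minimal sketch of Bonferroni and Benjamini-Hochberg adjustment followed by filtering on the adjusted p-value. The function names, the domain labels, and the p-values are hypothetical and chosen only for illustration; the server's own code may differ.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four domains found in the neighborhoods.
domains = ["domain_A", "domain_B", "domain_C", "domain_D"]
raw = [0.01, 0.04, 0.03, 0.005]
cutoff = 0.05  # assumed user-supplied cut-off

bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)

kept_bonf = [d for d, p in zip(domains, bonf) if p <= cutoff]
kept_bh = [d for d, p in zip(domains, bh) if p <= cutoff]
print("Bonferroni keeps:", kept_bonf)
print("Benjamini-Hochberg keeps:", kept_bh)
```

Bonferroni controls the chance of any false positive and is the stricter of the two; Benjamini-Hochberg controls the false discovery rate and usually retains more domains at the same cut-off.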



BlastP input

Your sequence



Neighborhood analysis parameters