About

Research into human diseases has expanded rapidly in an effort to understand their causes and find treatments, and many genes have been associated with these diseases as a result. However, the underlying mechanisms often remain unknown or unclear. There is currently a bias toward well-studied genes and a tendency to ignore poorly annotated genes, which may be the missing pieces of the puzzle. By using the same pieces over and over again, we may never be able to see the complete picture of the disease under study. To identify new pieces and complete the puzzle, we have created this online tool. It associates unstudied genes with well-studied and disease-related genes by identifying which genes tend to work together. These associations can then be used to predict the function of unstudied genes, adding new pieces to the puzzle.

GeneFriends:RNAseq relies on a co-expression map that describes which genes tend to activate (increase in expression) and deactivate (decrease in expression) simultaneously across approximately 4000 human RNAseq samples, obtained from the Sequence Read Archive (SRA) database at NCBI. The map gives a general picture of which genes tend to be active together.
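
As a rough illustration of what "activating and deactivating simultaneously" means numerically, the sketch below computes a plain Pearson correlation between expression profiles. The gene names and expression values are invented for the example, and the unweighted correlation is used only for simplicity; it is not claimed to be the exact measure behind the GeneFriends map.

```python
from math import sqrt


def pearson(x, y):
    """Plain (unweighted) Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized expression values across five samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises and falls together with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],  # moves in the opposite direction
}

r_ab = pearson(expression["GENE_A"], expression["GENE_B"])  # close to +1
r_ac = pearson(expression["GENE_A"], expression["GENE_C"])  # close to -1
```

A value near +1 marks a pair as co-expressed; a value near -1 marks genes whose expression moves in opposite directions.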

Since co-expressed genes tend to be involved in the same biological processes, this map can be used to:

  1. Assign putative functions to poorly annotated genes.
  2. Identify new target genes related to a disease or biological process using a guilt-by-association approach.
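
Both uses can be sketched as a simple vote among co-expression partners. The example below is a minimal illustration under stated assumptions: the gene names, correlation values, annotation terms, the `predict_function` helper and the `top_k` cutoff are all hypothetical and are not part of the GeneFriends tool.

```python
from collections import Counter


def predict_function(gene, coexpression, annotations, top_k=3):
    """Rank annotation terms by how often they occur among the top_k
    strongest co-expression partners of `gene` (hypothetical helper)."""
    scores = coexpression[gene]
    partners = sorted(scores, key=scores.get, reverse=True)[:top_k]
    votes = Counter()
    for partner in partners:
        # Partners without annotations simply contribute no votes.
        votes.update(annotations.get(partner, ()))
    return votes.most_common()


# Hypothetical co-expression strengths for one poorly annotated gene.
coexpression = {
    "UNKNOWN1": {"KNOWN_A": 0.92, "KNOWN_B": 0.88, "KNOWN_C": 0.81, "KNOWN_D": 0.10},
}

# Hypothetical annotations of the well-studied partner genes.
annotations = {
    "KNOWN_A": {"cell cycle"},
    "KNOWN_B": {"cell cycle", "DNA repair"},
    "KNOWN_C": {"cell cycle"},
    "KNOWN_D": {"lipid metabolism"},
}

prediction = predict_function("UNKNOWN1", coexpression, annotations)
```

Here the three strongest partners of `UNKNOWN1` all carry the "cell cycle" term, so that term tops the list as the putative function, while the weakly co-expressed `KNOWN_D` is ignored. The same vote, run with disease-related genes as the annotated set, would flag candidate target genes.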

GeneFriends Results

The full co-expression maps can be downloaded below:

Human:
Mouse:

Downloads

The GeneFriends tool employs a genome-wide co-expression map that describes which genes are related based on how often they are co-expressed. To construct this map, 4133 human and 2531 mouse RNAseq samples with at least 10,000,000 reads each were downloaded and aligned using STAR. Reads were counted using a custom script that does a job similar to the freely available HTSeq tool (albeit more than 20-fold faster). Each sample was normalized by dividing the read count for each gene/exon by the total read count in the sample (only reads that overlap features are included). The correlation between each gene pair was calculated using a weighted Pearson correlation. For each gene, all other genes were ranked by their co-expression strength. The mutual rank of a gene pair was then calculated by adding the ranks of the two genes in each other's correlation-ranked lists and dividing the sum by two. The mouse co-expression map will be updated on 21-03-2015 to include >4000 samples.
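
The normalization, correlation and mutual-rank steps described above can be sketched in a few lines of Python. This is an illustration only, not the GeneFriends code: the read counts and gene names are invented, and because the weighting scheme is not specified here, the example assumes equal weights for all samples.

```python
from math import sqrt


def normalize(counts):
    """Divide each gene's read count by the sample's total feature-overlapping reads."""
    total = sum(counts.values())
    return {gene: c / total for gene, c in counts.items()}


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with per-sample weights w."""
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / sqrt(var_x * var_y)


def mutual_ranks(corr):
    """Average the rank of b in a's list with the rank of a in b's list."""
    rank = {}
    for gene, partners in corr.items():
        ordered = sorted(partners, key=partners.get, reverse=True)
        rank[gene] = {p: i + 1 for i, p in enumerate(ordered)}
    return {
        (a, b): (rank[a][b] + rank[b][a]) / 2
        for a in corr
        for b in corr[a]
        if a < b  # report each unordered pair once
    }


# Hypothetical raw read counts per gene in four samples.
samples = [
    {"G1": 10, "G2": 20, "G3": 70},
    {"G1": 20, "G2": 30, "G3": 50},
    {"G1": 30, "G2": 45, "G3": 25},
    {"G1": 40, "G2": 10, "G3": 50},
]
weights = [1.0] * len(samples)  # assumption: every sample weighted equally

normalized = [normalize(s) for s in samples]
genes = sorted(samples[0])
profiles = {g: [s[g] for s in normalized] for g in genes}

# Correlation of every gene with every other gene.
corr = {
    a: {b: weighted_pearson(profiles[a], profiles[b], weights) for b in genes if b != a}
    for a in genes
}
mr = mutual_ranks(corr)
```

A low mutual rank means both genes place each other near the top of their ranked lists, which guards against pairs where only one gene ranks the other highly.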