Bioinformatics

Source Genomics offers a range of bioinformatics & computational analysis options to aid researchers in the interpretation of their data. We understand the need for biological and clinical data analysis and support.

Our Bioinformatics Services Offer

  • Free Consultation & Design
  • Large Selection of Tools & Packages
  • Primary & Secondary Data Analysis
  • Downstream Data Analysis Workflows

Talk to us about your requirements today. We are here to help you reach your research goals quickly and efficiently.

Gene Expression

Digital Gene Expression

Quantification of transcripts in a sample using RNA-Seq data.

This analysis uses RNA-Seq data to quantify expression levels by comparison to a reference genome, or to a de novo transcriptome assembly if a reference genome is not available. Raw reads are trimmed, aligned and quantified by gene-wise read counting with featureCounts. You will receive raw (fastq) and aligned (bam) files for each sample, as well as text files reporting expression as transcripts per million (TPM) or fragments per kilobase per million mapped reads (FPKM).
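For readers unfamiliar with the two units, the following is a minimal sketch of how TPM and FPKM are conventionally derived from gene-wise counts. It is an illustration only, not our production pipeline; the function names and the example counts and gene lengths are hypothetical, and both functions assume at least one non-zero count.

```python
def fpkm(counts, lengths_bp):
    """FPKM: count / gene length in kb / total mapped fragments in millions."""
    total_millions = sum(counts) / 1e6
    return [c / (length / 1e3) / total_millions
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM: length-normalised rates rescaled so they sum to one million."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates) / 1e6
    return [r / scale for r in rates]


# Hypothetical counts and gene lengths (bp) for three genes in one sample.
example_counts = [100, 300, 600]
example_lengths = [1000, 1500, 3000]

example_fpkm = fpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

TPM values always sum to one million within a sample, which makes proportions easier to compare between samples than FPKM.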

Differential Gene Expression

Pairwise comparison of gene expression between sample groups.

This analysis is similar to digital gene expression, but adds a comparative analysis between groups of samples. You will receive raw (fastq) and aligned (bam) files and text files of gene counts, as well as a PCA plot of all samples, a sample-to-sample distance heatmap, and MA plots and differential expression heatmaps for each comparison.
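As a rough guide to reading an MA plot, the sketch below computes the two coordinates plotted for one gene: M (log2 fold change between groups) and A (average log2 expression). It is a simplified illustration with a hypothetical function name and made-up counts; a real differential expression analysis also normalises for library size and applies a statistical test, which this sketch omits.

```python
import math


def ma_values(group_a, group_b, pseudocount=1.0):
    """Return (M, A) for one gene from per-sample counts in two groups.

    The pseudocount avoids log2(0) for genes with no reads in a group.
    """
    mean_a = sum(group_a) / len(group_a) + pseudocount
    mean_b = sum(group_b) / len(group_b) + pseudocount
    m = math.log2(mean_b) - math.log2(mean_a)          # log2 fold change
    a = 0.5 * (math.log2(mean_a) + math.log2(mean_b))  # mean log2 expression
    return m, a


# Hypothetical counts for one gene: two control and two treated samples.
m_value, a_value = ma_values([7, 7], [31, 31])
```

Genes far above or below M = 0 are the candidates for up- or down-regulation in the second group.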

Other Gene Expression Models

Custom analyses of other RNA-Seq data, such as (differential) miRNA or methylation quantification; identification of splice variants; detection and quantification of fusion genes.

Variant Calling

Germline

Investigation of germline single nucleotide polymorphisms (SNPs), insertions and deletions (INDELs) present in either whole genome or exome data.

This analysis uses NGS data to identify germline mutations by comparison to a reference sequence. Raw reads are trimmed, aligned and deduplicated, followed by variant calling, recalibration, filtering and finally annotation of variants with entries from relevant public databases such as dbSNP and ClinVar. You will receive raw (fastq) and aligned (bam) files for each sample, as well as VCF files listing all variants.

Somatic

Identification and quantification of somatic single nucleotide variants (SNVs), insertions and deletions (INDELs).

This analysis uses NGS data from cancer samples to identify somatic mutations by comparison to a reference sequence, a matched healthy sample, or both. Raw reads are trimmed, aligned and deduplicated, and somatic variants are called via local de novo assembly of haplotypes in active regions. Regions showing signs of somatic variation are then reassembled to generate candidate variant haplotypes. Further steps include contamination calculation, orientation bias detection and finally filtering and functional annotation of variants with databases such as GENCODE and dbSNP.

Genome Assembly

De novo or reference-guided assembly of reads into sequence contigs.

Depending on the availability of a reference sequence, reads are assembled de novo, against a reference genome, or with a combination of the two to create sequence scaffolds.

The analysis includes quality trimming of raw reads, assembly into contigs – mapping against a reference, if available – filtering, creating sequence scaffolds and structural annotation. You will receive raw (fastq) files, the assembled sequence file (fasta) and a bioinformatics report detailing the tools used and contig size statistics.
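The sketch below shows how typical contig size statistics can be computed from a list of contig lengths. Treat it as an illustration: the choice of N50 as a headline metric, the function names and the example lengths are assumptions, not a description of the exact contents of our report.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length


def contig_stats(contig_lengths):
    """Summarise an assembly as a small dictionary of size statistics."""
    return {
        "contigs": len(contig_lengths),
        "total_bp": sum(contig_lengths),
        "longest_bp": max(contig_lengths),
        "n50_bp": n50(contig_lengths),
    }


# Hypothetical assembly of five contigs (lengths in bp).
example_stats = contig_stats([100, 200, 300, 400, 500])
```

A higher N50 for the same total assembly size generally indicates a more contiguous assembly.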

ChIP-Seq

Mapping of protein binding sites using a combination of chromatin immunoprecipitation (ChIP) and DNA sequencing.

ChIP-Seq allows identification of the binding sites of DNA-associated proteins of interest, such as transcription factors.

This analysis includes quality trimming of raw reads, mapping to the reference genome, peak calling and peak annotation. You will receive raw (fastq) and aligned (bam) files for each sample, a list of called and annotated peaks (xls), as well as a list of peak summits and locations (bed).

Transcriptome Assembly

De novo or reference-guided assembly of RNA-Seq data into one transcriptome.

Depending on the availability of a reference sequence, reads are assembled de novo, against a reference genome, or with a combination of the two to create sequence scaffolds.

The analysis includes quality trimming of raw reads, assembly into contigs (mapping against a reference, if available), filtering and creating sequence scaffolds. You will receive raw (fastq) files and the assembled sequence file (fasta).

Taxonomical Classification

Taxonomic classification of mixed samples to provide a profile of microbial diversity.

Taxonomic classification can be achieved by sequencing conserved marker regions: 16S ribosomal RNA for prokaryotic samples, 18S for eukaryotic samples, or the ITS region for fungal samples. The analysis involves trimming raw sequencing reads, picking operational taxonomic units (OTUs) and matching these against a database of choice, such as Greengenes or SILVA, to obtain a breakdown for each sample at each of 7 levels: kingdom, phylum, class, order, family, genus and species. You will receive raw sequencing files (fastq), text files detailing OTUs and taxonomic classifications, as well as bar charts for each sample.
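To show how a per-level breakdown can be derived from classified OTUs, here is a minimal sketch that sums OTU read counts at a chosen rank and converts them to relative abundances. The function name, the OTU identifiers, the counts and the data layout are hypothetical; the actual file formats depend on the tools and database used.

```python
from collections import Counter

RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


def summarize_by_rank(otu_counts, otu_lineages, rank):
    """Relative abundance per taxon at the given rank for one sample.

    OTUs whose lineage does not reach the requested rank are grouped
    under "unclassified".
    """
    depth = RANKS.index(rank) + 1
    totals = Counter()
    for otu, count in otu_counts.items():
        lineage = otu_lineages[otu]
        key = ";".join(lineage[:depth]) if len(lineage) >= depth else "unclassified"
        totals[key] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {taxon: n / grand_total for taxon, n in totals.items()}


# Hypothetical sample: three OTUs classified down to phylum only.
example_counts = {"OTU_1": 60, "OTU_2": 20, "OTU_3": 20}
example_lineages = {
    "OTU_1": ("Bacteria", "Firmicutes"),
    "OTU_2": ("Bacteria", "Firmicutes"),
    "OTU_3": ("Bacteria", "Proteobacteria"),
}
phylum_profile = summarize_by_rank(example_counts, example_lineages, "phylum")
```

The resulting proportions per rank are the kind of values a stacked bar chart for each sample would display.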

Custom Analysis

In addition to the standard pipelines listed above, our bioinformatics team can develop analyses suited to your needs.

Contact us today and one of our skilled account managers will be in touch with a free consultation including further information and pricing details.