Tag Archives: newbler

Small project on genome shotgun: comparing results

As some of you correctly guessed, the script with no name and no instructions takes a single sequence as input and simulates a whole genome shotgun.

It takes three parameters: filename of the sequence, desired coverage and read length.
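For readers who want to experiment, here is a minimal sketch of what such a simulator might look like. It is not the original script: the function name simulate_shotgun, the uniform choice of start positions, the forward-strand-only reads and the absence of sequencing errors are all assumptions made for illustration.

```python
import random


def simulate_shotgun(sequence, coverage, read_length, seed=None):
    """Sample fixed-length reads at random positions until the target coverage is reached.

    Hypothetical helper: assumes error-free, forward-strand reads with
    uniformly distributed start positions.
    """
    if read_length > len(sequence):
        raise ValueError("read length exceeds sequence length")
    rng = random.Random(seed)
    # Number of reads N such that N * read_length is about coverage * sequence length.
    n_reads = round(coverage * len(sequence) / read_length)
    last_start = len(sequence) - read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # inclusive on both ends
        reads.append(sequence[start:start + read_length])
    return reads
```

A call such as simulate_shotgun(chromosome, 20, 400) would return a list of 400 bp strings that could then be written out as FASTA for the assembler.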

I used yeast chromosome 2 as the test sequence, and tried both 20X and 50X coverage with increasing read lengths: 50 bp (SOLiD-like), 400 bp (454-like) and 800 bp (Sanger-like).

All the datasets were independently assembled with Newbler. We know that we started from a single sequence 573,563 bp long.
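To get a feeling for the size of each dataset, the number of reads can be approximated as coverage times sequence length divided by read length. The sketch below applies this to the six combinations described above; the helper names are hypothetical, and it assumes every read has exactly the nominal length.

```python
GENOME_LENGTH = 573_563  # bp, the sequence length quoted in the text


def reads_needed(genome_length, coverage, read_length):
    """Approximate read count: N = coverage * genome_length / read_length."""
    return round(coverage * genome_length / read_length)


def dataset_table(genome_length, coverages=(20, 50), read_lengths=(50, 400, 800)):
    """Map each (coverage, read length) pair to its approximate read count."""
    return {
        (cov, length): reads_needed(genome_length, cov, length)
        for cov in coverages
        for length in read_lengths
    }


for (cov, length), n in dataset_table(GENOME_LENGTH).items():
    print(f"{cov}X, {length} bp reads: about {n:,} reads")
```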

"de novo" assembly with Newbler: results



A “primer” on Genome Assembly

During our first lesson we had a brief overview of the sequencing strategy used for Nannochloropsis.

First, a whole genome shotgun approach was used to obtain a first draft (see picture below). In particular, we used the Roche 454 machine, which provides reads about 500 bp long, and assembled them with the Newbler package.

As we saw, both read length and sequence coverage affect the quality of the assembly. In particular, repeated regions cause the assembly program to “break” the sequence. This happens when a repeated region is longer than the individual fragments (reads) sequenced, so that no single read can span it. The copies of a repeat collapse into the same contig, which will therefore have a higher coverage (approximately n times the average, where n is the number of copies of the repeat in the genome).
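The coverage rule of thumb can be turned around to guess how many copies of a repeat were collapsed into a contig. This is only a rough sketch: the function names, the 1.5x threshold and the example contig names are assumptions for illustration, not something Newbler provides, and real coverage is noisy enough that the estimate should be read as approximate.

```python
def estimate_copy_number(contig_coverage, average_coverage):
    """Estimate n as the ratio of contig coverage to average coverage, at least 1."""
    if average_coverage <= 0:
        raise ValueError("average coverage must be positive")
    return max(1, round(contig_coverage / average_coverage))


def flag_collapsed_repeats(contig_coverages, average_coverage, threshold=1.5):
    """Return {contig name: estimated copies} for contigs well above average coverage.

    The threshold (an arbitrary choice here) sets how far above the average
    a contig must be before it is treated as a likely collapsed repeat.
    """
    return {
        name: estimate_copy_number(cov, average_coverage)
        for name, cov in contig_coverages.items()
        if cov / average_coverage >= threshold
    }


# Made-up coverage values for three hypothetical contigs in a 20X assembly.
example = {"contig1": 19.5, "contig2": 41.0, "contig3": 98.0}
print(flag_collapsed_repeats(example, average_coverage=20.0))
```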

