Category Archives: perl

Databases: the very basic introduction

Databases… a whole ‘big thing’ in the bioinformatics field. Databases are the “backbone” of all the public data banks of course, but can be very useful for many other aspects. Even our scaffolding program uses a database for fast data retrieval.

Databases store data, much like a very simple “text file”, but they have a powerful engine to quickly retrieve data using selected criteria. Consider our “pick primer” program: it takes a while to fetch the sequences of both contigs because it has to parse the file containing all of them. If we decided to store all the sequences in a database, we could perform a simple query to get them.

This blog is powered by a quite complex database.

Take a self-test

If you want to test your Perl skills try this online test made for you 🙂

Let me know your score!


Good work guys! (final script solution)

Again, you outperformed! We still have to check your primers, but you worked as scientists should: with care, analyzing alternative hypotheses rather than launching a couple of programs and taking note of the output. Good work!

Some of you asked for the script. Well, you deserve something more: an updated script 🙂
Did you notice the bottleneck of the script? It takes a while to load the sequences of the two contigs because it has to parse a long file. So I decided to split the 454contigs.fna file, producing one file for each contig. Thus, knowing the name of the contig, you can directly open its file (e.g. contig00323.fna).
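The splitting step could look roughly like the sketch below. It is not the script actually used for the course: the subroutine name `split_fasta` and the script name in the comment are made up for illustration, and the sketch assumes that the first word of each FASTA header is the contig name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a multi-FASTA file into one file per sequence.
# Each output file is named after the first word of the header line,
# e.g. ">contig00323 ..." goes to "$out_dir/contig00323.fna".
sub split_fasta {
    my ($fasta, $out_dir) = @_;
    open(my $in, '<', $fasta) or die "Cannot read $fasta: $!";
    my $out;
    my $count = 0;
    while (my $line = <$in>) {
        if ($line =~ /^>(\S+)/) {
            # A new header: close the previous contig file, open the next one
            close($out) if $out;
            my $path = "$out_dir/$1.fna";
            open(my $fh, '>', $path) or die "Cannot write $path: $!";
            $out = $fh;
            $count++;
        }
        print {$out} $line if $out;
    }
    close($out) if $out;
    close($in);
    return $count;
}

# Runs only when a file name is given, e.g.
#   perl split_contigs.pl 454contigs.fna output_dir
# (split_contigs.pl is a hypothetical script name)
if (@ARGV) {
    my ($fasta, $out_dir) = @ARGV;
    $out_dir = '.' unless defined $out_dir;
    my $n = split_fasta($fasta, $out_dir);
    print "Wrote $n contig files to $out_dir\n";
}
```

After the split, loading a contig becomes a single `open` on a small file instead of a scan of the whole 454contigs.fna.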

The script (after the break) was modified accordingly…


Fifth lab: pick your primers

Today we’ll meet at 1.30 pm in aula F pr to finish our “” program together. During the last lab most of you got Primer3 running with our custom parameters. Primer3 output consists of a list of possible primer pairs.

We want to blast each primer against the reference contigs to check how many sequences similar to it are present in the genome. This is a pivotal test to obtain specific PCRs. To do this we design a subroutine called ‘blast’ that takes a sequence (the primer) as input, blasts it against the contigs and parses the output, returning a list of hits. In particular, we consider dangerous those alignments that are at least as long as the query minus 2 bp. Thus, if our primer is 20 bases long and the match is 15 bases long, we will ignore that match. If the match with another contig is 18 bases long, we will report it to the user.
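The length rule alone can be sketched as follows. This is not the ‘blast’ subroutine from the lab: it assumes the BLAST output has already been parsed into a list of hashes, and the name `dangerous_hits`, the keys `contig` and `length`, and the example hits are all made up for illustration.

```perl
use strict;
use warnings;

# Keep only the hits whose alignment is at least (primer length - 2) bases long.
# $hits is a reference to a list of hashes with the hypothetical keys
# 'contig' and 'length' (the alignment length in bases).
sub dangerous_hits {
    my ($primer, $hits) = @_;
    my $min_length = length($primer) - 2;
    return grep { $_->{length} >= $min_length } @$hits;
}

# Example with made-up hits for a 20-base primer.
my $primer = 'ACGTACGTACGTACGTACGT';
my @hits = (
    { contig => 'contig00001', length => 15 },   # shorter than 18: ignored
    { contig => 'contig00002', length => 18 },   # 20 - 2 = 18: reported
);
for my $hit (dangerous_hits($primer, \@hits)) {
    print "Warning: $hit->{contig} matches $hit->{length} bases\n";
}
```

Note that the primer's perfect match on its own contig would pass this filter too, so a real script would have to tell that expected hit apart from the others.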



Lab 4: solution

Click “read more” to see today’s solution…


Fourth lab: shell commands via Perl

Today we:

  • have a small lesson on shell commands with Perl
  • optimize last time’s script so that Primer3 execution will be faster in most cases
  • see some more examples of regular expressions
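As a small reminder of the first topic, here is a minimal sketch of the two most common ways to run a shell command from Perl. It assumes a Unix-like shell where `echo` is available; `echo` only stands in for a real program, and `run_command` is a name made up for this example.

```perl
use strict;
use warnings;

# Run a shell command and return everything it printed on standard output.
sub run_command {
    my ($command) = @_;
    my $output = `$command`;    # backticks capture the output of the command
    die "Command failed: $command\n" if $? != 0;
    return $output;
}

# system() runs a command without capturing its output and
# returns 0 when the command succeeds.
my $status = system('echo', 'Hello from the shell');
print "system() returned $status\n";

my $text = run_command('echo captured');
print "Backticks returned: $text";
```

Use backticks when the script needs to parse what the program prints, and `system` when only success or failure matters.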

Small project on genome shotgun: comparing results

As some of you correctly understood, the script with no name and no instructions takes a single sequence as input and simulates a whole genome shotgun.

It takes three parameters: filename of the sequence, desired coverage and read length.

I used yeast chromosome 2 as the test sequence, and tried both 20X and 50X coverage with increasing read lengths: 50 bp (SOLiD-like), 400 bp (454-like) and 800 bp (Sanger-like).

All the datasets were independently assembled with Newbler. We know that we started from a single sequence 573,563 bp long.
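To get a feeling for the size of these datasets, the sketch below shows one simple way to simulate a shotgun. It is not the script from the game: the subroutine names are made up, reads are taken from the forward strand only, and no sequencing errors are introduced.

```perl
use strict;
use warnings;

# Number of reads needed to reach a given coverage:
# reads = coverage * sequence length / read length, rounded up.
sub number_of_reads {
    my ($seq_length, $coverage, $read_length) = @_;
    return int(($seq_length * $coverage + $read_length - 1) / $read_length);
}

# Draw reads of fixed length from random positions of the sequence.
# Simplification: forward strand only, no sequencing errors.
sub simulate_shotgun {
    my ($sequence, $coverage, $read_length) = @_;
    my $last_start = length($sequence) - $read_length;
    die "Read length exceeds sequence length\n" if $last_start < 0;
    my @reads;
    for (1 .. number_of_reads(length($sequence), $coverage, $read_length)) {
        my $start = int(rand($last_start + 1));
        push @reads, substr($sequence, $start, $read_length);
    }
    return @reads;
}

# Reads needed for the 573,563 bp sequence at 20X with 400 bp reads.
print number_of_reads(573_563, 20, 400), "\n";    # prints 28679
```

At the same coverage, shorter reads mean many more reads: the number of reads is inversely proportional to the read length.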

"de novo" assembly with Newbler: results



A small game: guess what this script does

Below you’ll find a script.
Guess the purpose of the program… and if you like it, give it a try.
Write me your guess at . I’ll post both the solution and a test on the output produced on Mar 25th.


Now that we know where Perl stores command line parameters (hint: @ARGV), we can revise the original Hello world! script.

1) Write a script that takes a single argument from command line, that is the name of a person, and then prints “Hello name!”.
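One possible solution, as a minimal sketch (the subroutine name `greeting` and the script name in the usage message are made up for this example):

```perl
use strict;
use warnings;

# Build the greeting for one name.
sub greeting {
    my ($name) = @_;
    return "Hello $name!\n";
}

# $ARGV[0] is the first command line argument.
if (@ARGV) {
    print greeting($ARGV[0]);
} else {
    print "Usage: perl hello_name.pl NAME\n";
}
```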

2) Write a script that receives a list of names as input, and then prints a “Hello name” line for each name.
This means that we want to type a command like this:

perl Lineweaver Burk

and to have, as output:

Hello Lineweaver!
Hello Burk!
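A possible solution, sketched under the assumption that every command line argument is a name (the subroutine name `greetings` is made up for this example):

```perl
use strict;
use warnings;

# Return one "Hello name!" line for each name received.
sub greetings {
    my @names = @_;
    return map { "Hello $_!\n" } @names;
}

# @ARGV holds all the command line arguments, in order.
print greetings(@ARGV);
```

A `foreach` loop over `@ARGV` with a `print` inside would work just as well; `map` simply builds the whole list of lines in one step.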

3) Write a program that counts the people to greet and prints the number. This means that we want to type a command like this:

perl Lineweaver Burk 

and to have, as output:

Hello you 2 guys!
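A possible solution, again only as a sketch (the subroutine name `count_greeting` is made up for this example). The trick is that an array evaluated in scalar context gives the number of its elements:

```perl
use strict;
use warnings;

# An array used in scalar context gives the number of its elements.
sub count_greeting {
    my @names = @_;
    my $count = scalar @names;
    return "Hello you $count guys!\n";
}

print count_greeting(@ARGV);
```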

IF statement and Perl operators

See this tutorial for a comprehensive list.

Remember that “=” has to be used only to assign a value to a scalar. To compare two numbers we’ll use the ‘==’ operator, while to compare two strings we’ll use the ‘eq’ operator.
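The difference between the two comparisons shows up when the same value is written in two ways. A minimal sketch, with example values chosen only for illustration:

```perl
use strict;
use warnings;

my $first  = '10';
my $second = '10.0';

# '==' compares the numeric values: 10 and 10.0 are the same number.
print "Same number\n" if $first == $second;

# 'ne' is the string counterpart of '!=': the text '10' is not the text '10.0'.
print "Different strings\n" if $first ne $second;
```

Both lines are printed: the two scalars are equal as numbers but different as strings.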

This script is a short example of a numeric comparison:

my $n1 = 10;
my $n2 = $n1;           # "=" assigns a value
if ($n1 == $n2) {       # "==" compares two numbers
    print "Equal numbers: $n1 = $n2.\n";
} elsif ($n1 > $n2) {
    print "$n1 is greater than $n2.\n";
} else {
    print "$n2 is greater than $n1.\n";