Last lab, guys!

We are approaching the end of a cycle… During this genomics laboratory you got hands-on experience with genome sequencing and finishing, and of course you had a nice «primer» on Perl programming, perhaps with a boost to your primer design skills as an extra bonus.

It’s time to sum up, with a «self service» laboratory…

Given the remarkable success of this year’s laboratory, which was an experiment I wanted to try, we will repeat the 20-hour “Perl practical” next year. So stay tuned 🙂

See you this afternoon,

Andrea

Refresh your Perl, the size you want.

Today we aim to:

  • Briefly discuss the wet lab part: the results and what they mean
  • Get a brief introduction to SQL queries
  • Enjoy our Linux-powered Paolotti computer room (i.e. refresh our Perl and test our skills). This is your «full self service» area: review past posts and, if you like, try the tasks below.

Perl tasks:

  1. Write a script that takes at least two numbers from the user (via command line) and prints their sum and their average.
  2. Write a script that reads a file (filename supplied by the user via command line) and prints the number of lines and the average line length.
  3. Write a script that parses a multifasta file (filename supplied by the user via command line) and prints the reverse complement of every sequence. Of course the output should be in multifasta format as well, perhaps with a “-rc” string appended to the name (header) of each sequence.
  4. Write a script that asks the user for a pattern (e.g. TATAA) and the filename of a multifasta file, and prints the NAME of every sequence containing that pattern. As a plus, report the percentage of positive sequences and, even better, how many times the pattern occurs in each sequence.
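If you get stuck on task 3, here is one possible starting point rather than the official solution. It is a minimal sketch: the subroutine names `reverse_complement` and `parse_fasta` are my own choices, the demo records are made up, and reading the file named on the command line is left to you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the reverse complement of a DNA string.
# Case is preserved; any other character (e.g. N) is left unchanged.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;            # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    return $rc;
}

# Parse multifasta text into a list of [name, sequence] pairs,
# joining the lines of each record into a single string.
sub parse_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;            # drop trailing whitespace
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            push @records, [$1, ''];  # new record starts at each header
        } elsif (@records) {
            $records[-1][1] .= $line; # append to the current record
        }
    }
    return @records;
}

# Tiny made-up example (not the sample file)
my $demo = ">demo1\nAACG\nTTNa\n>demo2\nGGGC\n";
for my $rec (parse_fasta($demo)) {
    my ($name, $seq) = @$rec;
    print ">$name-rc\n", reverse_complement($seq), "\n";
}
```

In a real script the text would come from the file named in `$ARGV[0]`, and you may want to wrap long output sequences to a fixed line width.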

Sample multifasta file for your scripts:

>Sequence1
ACGTACGTACGTACGTACTACTTATATCGATCGTAGCATCGATCGTACGTAC
ACGGGCGACGATCTACTGACAGCTACGTAGCAGTCacgacgtNNNNNNNCAG
CAGCTACGTACGTACGACTTACTGACTGATCGATCGTACAGCATCGTAGCTA
>Sequence2
CGGCGACGGCGACACGAGGCGCAGGCGCGGCGGCGCGCGGCGCGGCGCTATA
AAGCGACGCAGCGAGCGACGAGCGGCGGCGCGCGCGCCCCCCCCCCCCCCCC
>Sequence3
TGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGG
AGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGG
TGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTC
CTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTA
AGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGT
GACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGG
TCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGC
TGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTAT
CACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCC
CTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTT
>Sequence4
AGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGG
TGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTC
CTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTA
TAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTG
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGT
GACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGG
TCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGG
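
A hint for task 4: a pattern can span a line break in the file. In Sequence2, TATAA starts at the end of the first line and finishes on the second, so the lines of each record should be joined before searching. The sketch below shows one way to do it, under a few assumptions of mine: the helper name `count_pattern` is hypothetical, matching ignores case (the sample contains lowercase bases), overlapping occurrences are counted, and only the first two sample records are embedded to keep it short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count occurrences of $pattern in $seq, ignoring case.
# The lookahead makes overlapping occurrences count as well.
sub count_pattern {
    my ($seq, $pattern) = @_;
    my $count = () = $seq =~ /(?=\Q$pattern\E)/gi;
    return $count;
}

# First two records of the sample file above
my $fasta = <<'END';
>Sequence1
ACGTACGTACGTACGTACTACTTATATCGATCGTAGCATCGATCGTACGTAC
ACGGGCGACGATCTACTGACAGCTACGTAGCAGTCacgacgtNNNNNNNCAG
CAGCTACGTACGTACGACTTACTGACTGATCGATCGTACAGCATCGTAGCTA
>Sequence2
CGGCGACGGCGACACGAGGCGCAGGCGCGGCGGCGCGCGGCGCGGCGCTATA
AAGCGACGCAGCGAGCGACGAGCGGCGGCGCGCGCGCCCCCCCCCCCCCCCC
END

my $pattern = 'TATAA';

# Join the lines of each record into one string per sequence name
my (%seq_of, @names, $name);
for my $line (split /\n/, $fasta) {
    if ($line =~ /^>(\S+)/) {
        $name = $1;
        push @names, $name;
        $seq_of{$name} = '';
    } elsif (defined $name) {
        $line =~ s/\s+//g;
        $seq_of{$name} .= $line;
    }
}

# Print the name and hit count of every positive sequence
my $positive = 0;
for my $n (@names) {
    my $hits = count_pattern($seq_of{$n}, $pattern);
    next unless $hits;
    $positive++;
    print "$n\t$hits\n";
}
printf "%.1f%% of sequences contain %s\n", 100 * $positive / @names, $pattern;
```

Searching line by line instead would miss the hit in Sequence2, which is why the records are assembled first.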

As a final remark, it has been a pleasure for us to work with you on a high-level genomics project.
We hope you’ll appreciate this experience too (in the near future, when you work on your own master’s project)…
