We are approaching the end of a cycle. During this genomics laboratory you got hands-on experience with genome sequencing and finishing, and of course you had a nice «primer» on Perl programming, perhaps with a boost to your primer design skills as an extra bonus.
It’s time to sum up, with a «self service» laboratory…
Given the remarkable success of this year’s laboratory, which was an experiment I wanted to try, we will repeat the 20-hour “Perl practical” next year. So stay tuned 🙂
See you this afternoon,
Today we aim to:
- Briefly discuss the wet lab part: the results and what they mean
- Get a brief introduction to SQL queries
- Enjoy our Linux-powered Paolotti computer room (i.e. refreshing Perl and testing our skills). This is your «full self service» area: review past posts and, if you like, try some of the tasks below.
- Write a script that takes at least two numbers from the user (via command line) and prints their sum and their average.
- Write a script that reads a file (filename supplied by the user via command line) and prints the number of lines and the average line length.
- Write a script that parses a multifasta file (filename supplied by the user via command line) and prints the reverse complement of every sequence. Of course the output should be in multifasta format as well, perhaps with a “-rc” string appended to the name (header) of each sequence.
- Write a script that asks the user for a pattern (e.g. TATAA) and the filename of a multifasta file, and prints the NAME of every sequence containing that pattern. As a plus, report the percentage of positive sequences and, even better, how many times the pattern occurs in each sequence.
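If you get stuck on the first two tasks, here is one possible sketch (certainly not the only solution). For brevity it packs both tasks into a single file: one argument is treated as a filename, two or more as numbers. The sub names `sum_and_average` and `line_stats` and the script name `lab_basics.pl` are made up for this example, and counting line length without the trailing newline is an assumption you may want to change.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Return the sum and the average of a non-empty list of numbers.
sub sum_and_average {
    my @numbers = @_;
    my $sum = 0;
    $sum += $_ for @numbers;
    return ($sum, $sum / scalar(@numbers));
}

# Return the number of lines in a file and their average length.
# Assumption: the trailing newline is not counted as part of the line.
sub line_stats {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!\n";
    my ($lines, $chars) = (0, 0);
    while (my $line = <$fh>) {
        chomp $line;
        $lines++;
        $chars += length $line;
    }
    close $fh;
    my $average = $lines ? $chars / $lines : 0;   # avoid dividing by zero
    return ($lines, $average);
}

if (@ARGV == 1) {
    # One argument: treat it as a filename (second task).
    my ($lines, $average) = line_stats($ARGV[0]);
    printf "Lines: %d\nAverage line length: %.2f\n", $lines, $average;
} elsif (@ARGV >= 2) {
    # Two or more arguments: treat them as numbers (first task).
    my @bad = grep { !looks_like_number($_) } @ARGV;
    die "Not a number: @bad\n" if @bad;
    my ($sum, $average) = sum_and_average(@ARGV);
    printf "Sum: %s\nAverage: %s\n", $sum, $average;
} else {
    # No arguments: show how the script is meant to be called.
    warn "Usage: $0 NUMBER NUMBER [NUMBER ...]   or   $0 FILENAME\n";
}
```

For example, `perl lab_basics.pl 3 4 8` prints a sum of 15 and an average of 5, while `perl lab_basics.pl myfile.txt` reports the line statistics of that file.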
Sample multifasta file for your scripts:
>Sequence1
ACGTACGTACGTACGTACTACTTATATCGATCGTAGCATCGATCGTACGTAC
ACGGGCGACGATCTACTGACAGCTACGTAGCAGTCacgacgtNNNNNNNCAG
CAGCTACGTACGTACGACTTACTGACTGATCGATCGTACAGCATCGTAGCTA
>Sequence2
CGGCGACGGCGACACGAGGCGCAGGCGCGGCGGCGCGCGGCGCGGCGCTATA
AAGCGACGCAGCGAGCGACGAGCGGCGGCGCGCGCGCCCCCCCCCCCCCCCC
>Sequence3
TGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGG
AGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGG
TGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTC
CTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTA
AGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGT
GACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGG
TCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGC
TGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTAT
CACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCC
CTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTT
>Sequence4
AGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGG
TGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTC
CTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTA
TAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTG
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGT
GACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGG
TCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGG
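For the two multifasta tasks, the sketch below shows one way to go. It assumes the usual FASTA layout (each header on its own line starting with ">", followed by one or more sequence lines). The sub names (`read_fasta`, `reverse_complement`, `count_pattern`) and the script name `fasta_tools.pl` are invented for this example. Three further choices are assumptions: the pattern comes from the command line rather than an interactive prompt, it is matched as a literal string without regard to case, and occurrences are counted without overlaps. Joining the lines of each record before searching matters, because a pattern can span a line break.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a multifasta file into a list of [name, sequence] pairs,
# joining all the sequence lines of a record into a single string.
sub read_fasta {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!\n";
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                 # tolerate Windows line endings
        next if $line =~ /^\s*$/;         # skip empty lines
        if ($line =~ /^>\s*(.*?)\s*$/) {
            push @records, [$1, ''];      # new record, empty sequence
        } elsif (@records) {
            $line =~ s/\s+//g;
            $records[-1][1] .= $line;     # extend the current sequence
        }
    }
    close $fh;
    return @records;
}

# Reverse complement; case is preserved, N and other symbols are kept.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Count non-overlapping, case-insensitive occurrences of a literal pattern.
sub count_pattern {
    my ($seq, $pattern) = @_;
    return 0 unless length $pattern;
    my $count = () = $seq =~ /\Q$pattern\E/gi;
    return $count;
}

if (@ARGV) {
    my ($filename, $pattern) = @ARGV;
    my @records = read_fasta($filename);
    if (defined $pattern) {
        # Pattern search: name and hit count of each positive sequence.
        my ($total, $positive) = (scalar(@records), 0);
        for my $record (@records) {
            my ($name, $seq) = @$record;
            my $hits = count_pattern($seq, $pattern);
            next unless $hits;
            $positive++;
            print "$name\t$hits\n";
        }
        my $percent = $total ? 100 * $positive / $total : 0;
        printf "%d of %d sequences (%.1f%%) contain %s\n",
            $positive, $total, $percent, $pattern;
    } else {
        # No pattern: print the reverse complement as multifasta,
        # wrapping the sequence at 60 characters per line.
        for my $record (@records) {
            my ($name, $seq) = @$record;
            my $rc = reverse_complement($seq);
            print ">${name}-rc\n";
            print "$1\n" while $rc =~ /(.{1,60})/g;
        }
    }
} else {
    warn "Usage: $0 FASTA_FILE [PATTERN]\n";
}
```

Save the sample above as, say, `sample.fasta`: `perl fasta_tools.pl sample.fasta` prints the reverse complements, and `perl fasta_tools.pl sample.fasta TATAA` lists the positive sequences. If you want overlapping occurrences counted too, a lookahead such as `/(?=\Q$pattern\E)/gi` is one option.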
As a final remark, it has been a pleasure for us to play with you on a high-level genomics project.
We hope you’ll appreciate this experience too (in the near future, when you work on your own master’s project)…