Exercise: DNA Sequence Analyzer


Create a function for researching DNA sequences. The "source" file is shown below. It contains all kinds of extra information that is not interesting to us right now. Implement the read_file function, which goes over the given file, extracts all the DNA sequences, and returns them as a list. The user of your function can then write a foreach loop that goes over the list of DNA sequences and calls her own function to analyze each sequence and print the results. Below you will find the input file, the skeleton of the solution (to which I already added the analyze_dna function the end user is going to write), and the expected output. (Of course, read_file should work regardless of what the analyze function does.)


examples/references/dna.txt
Some header text
-----

name: Name of the source
date: 1999.01.23
DNA: GAGATTC

name: Name of the source
date: 1999.01.24
DNA: GAGATCCTGC

name: Name of the source
date: 2007.01.24
DNA: CGTGAGAATCTGC

name: Name of the source
date: 2008.01.24
DNA: CGTGATCTGC

examples/references/dna1_skeleton.pl
#!/usr/bin/perl
use strict;
use warnings;

my $file = 'dna.txt';

my @dna_sequences = read_file($file);
foreach my $seq (@dna_sequences) {
    analyze_dna($seq);
}

# Written by the end user: reports the first doubled letter, if any.
sub analyze_dna {
    my ($dna) = @_;
    if ($dna =~ /(.)\1/) {
        print "$dna has double $1\n";
    }
}


sub read_file {
   ...
}

Expected output:

GAGATTC has double T
GAGATCCTGC has double C
CGTGAGAATCTGC has double A
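
One possible read_file, as a minimal sketch rather than the official solution. It assumes every sequence sits on a single line that starts with "DNA:", as in the sample file above, and it ignores every other line. The variable names and the error message are my own choices.

```perl
use strict;
use warnings;

# Sketch of read_file: collect the sequence from every "DNA:" line.
sub read_file {
    my ($filename) = @_;

    open(my $fh, '<', $filename)
        or die "Could not open '$filename': $!";

    my @sequences;
    while (my $line = <$fh>) {
        # Assumption: one sequence per line, written as "DNA: <letters>".
        # Header, name, date and empty lines do not match and are skipped.
        if ($line =~ /^DNA:\s*(\w+)/) {
            push @sequences, $1;
        }
    }
    close $fh;

    return @sequences;
}
```

Pasted in place of the empty read_file in the skeleton, this should return the four sequences from dna.txt in file order. The function knows nothing about analyze_dna, so the end user can swap in any analysis she likes.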