Answer to Question #254629 in Bioinformatics for area08

Question #254629

The frequent words with mismatches problem

One way to solve the Frequent Words with Mismatches problem is to generate all 4^k k-mers Pattern, compute ApproximatePatternCount(Text, Pattern, d) for each k-mer Pattern, and then find the k-mers with the maximum number of approximate occurrences. This approach is inefficient in practice, since many of the 4^k k-mers need not be considered: neither they nor their mutated versions (with up to d mismatches) appear in Text.

Genome= GCAAAATGGAGCAGGATCAGCAAAATGGAAAATAAATGGAGGATCAAAATAAATGGAGGAGGAAAATGGAGGAAAATAAATGGATCAGGAAAATGCAGCAGGATCATCATCAGGAGCAGGATCAAAATTCAGGAGCAGGAGGATCAGCATCAGGAGGATCAGCAGGAAAATGCAGGAGGAGGAGGAAAATTCAAAATGGAGGAGGAGGAGCATCAGCAGCATCAGGAGGAGGATCAGCAGCAGGAGGAGGAGGAGGAAAATGGAGGAGGAGCAGGAGGAGCATCAGGAGGATCAGGAGCATCAGCAAAATTCAAAATGGAGGAAAATGCAGGAAAATGGAGCAGGAAAATAAATTCATCAAAATGCAGGAGGA

k= 6

d= 2


Expert's answer
2021-10-22T13:42:02-0400

With no mismatches allowed (d = 0), ACTAT is a most frequent 5-mer of ACAACTATGCATACTATCGGGAACTATCCT, and ATA is a most frequent 3-mer of CGATATATCCATAG.
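
For the version with mismatches asked about in the question, one possible approach is to avoid enumerating all 4^k k-mers and instead count only the k-mers that lie within Hamming distance d of some window of Text. The sketch below is a minimal Python illustration of that idea, not necessarily the method the course intends; the function names (hamming_distance, neighbors, frequent_words_with_mismatches) are chosen here for illustration and do not come from the question.

```python
from collections import Counter
from itertools import combinations, product


def hamming_distance(p, q):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(p, q))


def neighbors(pattern, d):
    """Set of all DNA strings within Hamming distance d of pattern."""
    result = {pattern}
    for m in range(1, d + 1):
        # Choose m positions and try every letter combination at them.
        # Re-using the original letter is allowed, so this covers
        # every distance from 0 up to m.
        for positions in combinations(range(len(pattern)), m):
            for letters in product("ACGT", repeat=m):
                chars = list(pattern)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                result.add("".join(chars))
    return result


def frequent_words_with_mismatches(text, k, d):
    """Most frequent k-mers in text, allowing up to d mismatches."""
    counts = Counter()
    for i in range(len(text) - k + 1):
        # Each window adds one approximate occurrence to every k-mer
        # in its d-neighborhood (a set, so nothing is counted twice).
        for candidate in neighbors(text[i:i + k], d):
            counts[candidate] += 1
    if not counts:
        return []
    best = max(counts.values())
    return sorted(p for p, c in counts.items() if c == best)


if __name__ == "__main__":
    # Small example; each reported 4-mer has 5 approximate occurrences.
    print(frequent_words_with_mismatches(
        "ACGTTGCATGTCGCATGATGCATGAGAGCT", 4, 1))
```

To apply this to the question, call frequent_words_with_mismatches(genome, 6, 2), where genome holds the Genome string given above. The function returns every k-mer tied for the maximum count, so more than one 6-mer may be reported.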

