## Exercise classes: Information Theory

### Formalities

The exercise class takes place in the Westbau seminar room on Wednesdays, 9-11. Problem sets are distributed the week before the class. Please hand in your solutions by Tuesday noon before the class. To receive credit for the class (Scheinkriterium), you need at least 50 percent of the total marks.

A number of problem sets involve programming in Matlab. An introduction to Matlab will be provided for those without programming experience. Please submit your solutions by email in the following format: the code should be compatible with either Matlab or Octave 3.03. Your code should be a function whose %-comments state your name, your platform (Matlab or Octave), and an example line showing how to run the code. Please also hand in a hardcopy of your code for grading. Marks will be given for code that is functioning, clear and well-documented, efficient, and compact, in that order of priority.

### Problem sets

The problem sets are also available for download.

- 5.11. Problem set 1 (.ps) (.pdf)
- 12.11. Problem set 2 (.ps) (.pdf)
- 19.11. Problem set 3 (.ps) (.pdf)
- 25.11. This is a computer class for those with little or no prior experience with Matlab. A free alternative to Matlab, called Octave, is publicly available. To prepare for this class, read the first ten sections of the Matlab Primer.
- 3.12. Implement the Lempel-Ziv encoding in Matlab. (120 points)
- 17.12. Implement the exon finder (Grosse et al., Phys. Rev. E, 1999): write a function that takes a string of nucleotides and returns the in-frame and out-of-frame information I(k=3,4). Apply your code to substrings of DNA to find the location of exons. The DNA sequence of a human gene, containing four exons of length 100 to 400 base pairs, is available here. For testing purposes, a synthetic sequence of length 6000 is available here. The first 3000 nucleotides were generated independently and equiprobably; the second 3000 nucleotides represent codons. (120 points)
- 7.1. Problem set 7 (.ps) (.pdf)
- 14.1. Tutorial class on the Gibbs sampler, in preparation for next week's exercise class.
- 21.1. Implement the Gibbs sampler to find regulatory motifs in DNA. A set of synthetic DNA sequences is available here. Each of the 20 sequences, of length 35 nucleotides, contains an instance of the 5-nucleotide motif at a different location. Where are the instances located? (120 points)
- 28.1. Problem set 9 (.ps) (.pdf)
- 4.2. By request, we will have a question and answer session.
- 11.2. Problem set 10 (.ps) (.pdf)
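As orientation for the Lempel-Ziv problem set (3.12.), here is a minimal sketch of dictionary-based parsing. It assumes the LZ78 variant, in which each new phrase is emitted as a pair (index of the longest known prefix, next symbol); the variant treated in the lecture may differ. The sketch is written in Python purely for illustration, the function names `lz78_encode` and `lz78_decode` are hypothetical, and submissions still need to be Matlab or Octave functions.

```python
def lz78_encode(text):
    """Parse text into (prefix_index, next_symbol) pairs (LZ78-style sketch)."""
    dictionary = {"": 0}  # phrase -> index; index 0 stands for the empty phrase
    output = []
    phrase = ""
    for symbol in text:
        candidate = phrase + symbol
        if candidate in dictionary:
            # Keep extending while the phrase is already known.
            phrase = candidate
        else:
            # New phrase: emit (index of known prefix, new symbol) and store it.
            output.append((dictionary[phrase], symbol))
            dictionary[candidate] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it as prefix plus last symbol.
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output


def lz78_decode(pairs):
    """Rebuild the text from the (prefix_index, next_symbol) pairs."""
    phrases = [""]  # phrases[i] is the phrase with dictionary index i
    pieces = []
    for index, symbol in pairs:
        phrase = phrases[index] + symbol
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)


if __name__ == "__main__":
    pairs = lz78_encode("1011010100010")
    print(pairs)
    print(lz78_decode(pairs))
```

The binary string in the usage example parses into the phrases 1, 0, 11, 01, 010, 00, 10. Turning the pairs into an actual bit stream (how many bits to spend on each index) is deliberately left open here.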

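For the exon finder problem set (17.12.), the quantity I(k) can be read as the mutual information between nucleotides that are k positions apart. The sketch below estimates it from raw pair frequencies, under the assumption that no finite-size correction is applied; the estimator used by Grosse et al. may differ in detail. It is again a Python illustration rather than a Matlab submission, and the function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(seq, k):
    """Estimate I(k) in bits between symbols k positions apart in seq."""
    # Count all pairs (seq[i], seq[i + k]).
    pairs = Counter(zip(seq, seq[k:]))
    n = sum(pairs.values())
    if n == 0:
        return 0.0  # sequence too short to contain any pair at distance k
    # Marginal counts of the first and second symbol of each pair.
    first = Counter()
    second = Counter()
    for (a, b), count in pairs.items():
        first[a] += count
        second[b] += count
    # I(k) = sum over pairs of p_ab * log2(p_ab / (p_a * p_b)).
    return sum(
        count / n * log2(count * n / (first[a] * second[b]))
        for (a, b), count in pairs.items()
    )


if __name__ == "__main__":
    sequence = "ACG" * 100  # toy input, not real DNA
    print(mutual_information(sequence, 3), mutual_information(sequence, 4))
```

Applied to a sliding window along a DNA sequence, comparing the values at k = 3 and k = 4 gives one signal per window. The window length and the decision threshold are not specified here and would need to be chosen.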