Use the Lander and Waterman application of the Poisson distribution to calculate the number of reads that you need to sequence to obtain 20X coverage for a genome of size 5 Mb and Illumina reads of 100 bp. How does that compare with the number of reads that you would need with Sanger capillary sequencing (read length 800 bp)?
The calculated gap fraction of 0.2% refers to the genome size given in the question. For instance, to sequence a 4 Mb genome with 100 bp reads, the total number of reads is 800,000. The total sequence information is then 80,000,000 bp (80 Mbp), which is 20 times the genome size, i.e. 20X coverage. By the same arithmetic (N = C × G / L), the 5 Mb genome in the question needs 1,000,000 Illumina reads of 100 bp, or 125,000 Sanger reads of 800 bp, which is eight times fewer. In fact, read length changes the distribution of coverage across the genome only insignificantly, so the total amount of sequence needed stays the same. What is likely to be affected is the overall assembly of the genome: with reads of about 250 bp, an average coverage of at most 25X may suffice, whereas 100 bp reads might need more than 120X.
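
The read counts can be checked with a few lines of Python. This is only a minimal sketch, assuming the usual Lander-Waterman relation C = N × L / G (coverage C, number of reads N, read length L, genome size G) and the Poisson probability e^(-C) that a given base is covered by no read. The function names `reads_needed` and `uncovered_fraction` are made up for this illustration and do not come from any library.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Smallest number of reads N with N * L / G >= coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def uncovered_fraction(coverage):
    """Poisson probability that a base is covered by zero reads: e^(-C)."""
    return math.exp(-coverage)


genome = 5_000_000  # 5 Mb genome from the question
illumina = reads_needed(genome, 100, 20)  # 100 bp Illumina reads
sanger = reads_needed(genome, 800, 20)    # 800 bp Sanger reads

print("Illumina reads:", illumina)
print("Sanger reads:", sanger)
print("Ratio Illumina/Sanger:", illumina / sanger)
print("Expected uncovered fraction at 20X:", uncovered_fraction(20))
```

Both platforms deliver the same total of 100 Mbp here, so the ratio of read counts is simply the ratio of read lengths (800 / 100). The sketch covers only the coverage arithmetic and says nothing about how read length affects assembly.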