In silico prediction of UTR repeats using clustered EST data

Stefan A. Rensing*, Daniel Lang and Ralf Reski

University of Freiburg, Plant Biotechnology, Sonnenstr. 5, D-79104 Freiburg, Germany
* stefan.rensing@biologie.uni-freiburg.de, phone +49 761 203-6974, fax -6990

Abstract

Clustering of EST data is a method for the non-redundant representation of an organism's transcriptome. When large amounts of EST data are clustered, some large clusters (>500 sequences) are usually created. These can lead to iterative contig builds, heavy consumption of computing time and improbable exon alignments. In addition, such clusters sometimes contain transcripts of more than one gene, which is undesirable. Large clusters arise from: (1) large numbers of identical ESTs / high transcript levels; (2) large gene families with highly similar members; (3) false clustering due to a) unremoved vector or rRNA sequences, b) undetected cloning artifacts or c) repetitive elements in UTRs.

Rensing S.A., Lang D. and Reski R. (2003): In silico prediction of UTR repeats using clustered EST data. In: Proceedings of the German Conference on Bioinformatics 2003, Mewes H.-W., Heun V., Frishman D., Kramer S. (eds.), pp 117-122, Belleville Verlag Michael Farin, Munich, Germany
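
As an illustration of the cluster-size criterion mentioned in the abstract, the following minimal sketch flags clusters with more than 500 sequences so they can be inspected for the causes listed there. The input layout (pairs of EST identifier and cluster identifier) and the function name `flag_large_clusters` are assumptions made for this example; they are not the output format of any particular clustering tool, nor the authors' method.

```python
from collections import Counter

# Threshold taken from the abstract: "large clusters (>500 sequences)".
LARGE_CLUSTER_THRESHOLD = 500


def flag_large_clusters(assignments, threshold=LARGE_CLUSTER_THRESHOLD):
    """Return {cluster_id: size} for clusters with more than `threshold` ESTs.

    `assignments` is an iterable of (est_id, cluster_id) pairs. This layout
    is a hypothetical one chosen for the sketch.
    """
    sizes = Counter(cluster_id for _est_id, cluster_id in assignments)
    return {cid: n for cid, n in sizes.items() if n > threshold}


# Toy data: cluster "c1" has 501 members, cluster "c2" has exactly 500.
assignments = [(f"est{i}", "c1") for i in range(501)]
assignments += [(f"x{i}", "c2") for i in range(500)]

large = flag_large_clusters(assignments)
print(large)
```

Only clusters strictly above the threshold are reported, matching the ">500" wording; whether a flagged cluster is caused by high transcript levels, a gene family or false clustering still has to be decided by further analysis.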
Currently available: 17 predicted Physcomitrella repeats as a FASTA file
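
A FASTA file of this kind can be read with a few lines of plain Python. The sketch below is a generic reader, not code from the authors; the record names and sequences in the example are made up for illustration and do not come from the actual repeat file.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record example standing in for the real file.
example = io.StringIO(
    ">repeat_1 hypothetical\nACGT\nACGT\n>repeat_2 hypothetical\nTTGA\n"
)
records = list(read_fasta(example))
print(len(records))
```

For a file on disk, the same function can be called with an open text file handle instead of the `io.StringIO` object.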