Mouse BAC ends quality assessment and sequence analyses

S. Zhao, S. Shatsman, B. Ayodeji, K. Geer, G. Tsegaye, M. Krol, E. Gebregeorgis, A. Shvartsbeyn, D. Russell, L. Overton, L. Jiang, G. Dimitrov, K. Tran, J. Shetty, J. A. Malek, T. Feldblyum, W. C. Nierman, C. M. Fraser

Research output: Contribution to journal › Article

45 Citations (Scopus)

Abstract

A large-scale BAC end-sequencing project at The Institute for Genomic Research (TIGR) has generated one of the most extensive sets of sequence markers for the mouse genome to date. With a sequencing success rate of >80%, an average read length of 485 bp, and ABI3700 capillary sequencers, we have generated 449,234 nonredundant mouse BAC end sequences (mBESs) with 218 Mb total from 257,318 clones from libraries RPCI-23 and RPCI-24, representing 15× clone coverage, 7% sequence coverage, and a marker every 7 kb across the genome. A total of 191,916 BACs have sequences from both ends, providing 12× genome coverage. The average Q20 length is 406 bp and 84% of the bases have phred quality scores ≥20. RPCI-24 mBESs have more Q20 bases and longer reads on average than RPCI-23 sequences. ABI3700 sequencers and the sample tracking system ensure that >95% of mBESs are associated with the right clone identifiers. We have found that a significant fraction of mBESs contains L1 repeats and ∼48% of the clones have both ends with ≥100 bp contiguous unique Q20 bases. About 3% of mBESs match ESTs and >70% of matches were conserved between the mouse and the human or the rat. Approximately 0.1% of mBESs contain STSs. About 0.2% of mBESs match human finished sequences and >70% of these sequences have EST hits. The analyses indicate that our high-quality mouse BAC end sequences will be a valuable resource to the community.
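The headline figures in the abstract can be cross-checked against one another with simple arithmetic. The sketch below is illustrative only and is not part of the published analysis: the constant and function names are hypothetical, the genome size is not taken from the paper but back-calculated from the quoted 7% sequence coverage, and the ratio of mean Q20 length to mean read length is used only as a rough proxy for the fraction of bases at phred ≥20.

```python
# Figures quoted in the abstract.
READS = 449_234           # nonredundant mBESs
CLONES = 257_318          # clones with at least one end sequence
PAIRED_CLONES = 191_916   # clones sequenced from both ends
MEAN_READ_BP = 485        # average read length
MEAN_Q20_BP = 406         # average Q20 length
SEQ_COVERAGE = 0.07       # "7% sequence coverage"


def summarize():
    """Derive consistency checks from the quoted figures (hypothetical helper)."""
    single_end_clones = CLONES - PAIRED_CLONES
    # Two reads per paired clone plus one read per single-ended clone.
    reads_from_clones = 2 * PAIRED_CLONES + single_end_clones
    total_bp = READS * MEAN_READ_BP
    # Assumption: genome size implied by the quoted coverage, not a cited value.
    implied_genome_bp = total_bp / SEQ_COVERAGE
    return {
        "single_end_clones": single_end_clones,
        "reads_from_clones": reads_from_clones,
        "total_mb": total_bp / 1e6,
        "q20_fraction": MEAN_Q20_BP / MEAN_READ_BP,
        "implied_genome_gb": implied_genome_bp / 1e9,
        "marker_spacing_kb": implied_genome_bp / READS / 1e3,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the read count equals twice the paired clones plus the single-ended clones, the total comes to roughly 218 Mb, and the marker spacing comes to roughly 7 kb, in line with the abstract.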

Original language: English
Pages (from-to): 1736-1745
Number of pages: 10
Journal: Genome Research
Volume: 11
Issue number: 10
DOI: https://doi.org/10.1101/gr.179201
Publication status: Published - 20 Nov 2001

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Zhao, S., Shatsman, S., Ayodeji, B., Geer, K., Tsegaye, G., Krol, M., Gebregeorgis, E., Shvartsbeyn, A., Russell, D., Overton, L., Jiang, L., Dimitrov, G., Tran, K., Shetty, J., Malek, J. A., Feldblyum, T., Nierman, W. C., & Fraser, C. M. (2001). Mouse BAC ends quality assessment and sequence analyses. Genome Research, 11(10), 1736-1745. https://doi.org/10.1101/gr.179201