"Motivation: Mammalian genomes are highly complex.
Identifying the unique sequences of each gene in a
mammalian gene database containing tens of thousands
of DNA sequences is a computationally intensive task. With
the advent of parallel genetic analysis methods such
as microarrays and the availability of more and more
whole genome sequences of organisms, an algorithm
that rapidly identifies unique gene probes for
functional studies of individual genes would be a very
useful tool.
Results: We have developed a fast algorithm, and a
software program based on it, for identifying
gene-specific probes in complex organisms. The algorithm
was applied to the assemblies of gene sequences and
was highly efficient for large databases such as the TIGR
human THC and mouse TC databases. The results were
assessed with the BLAST sequence alignment software.
Two probe data sets have been compiled that contain
specific probes for around 100 000 putative human gene
transcripts and 70 000 putative mouse gene transcripts.
Availability: The gene-specific probes for the putative
human and mouse genes referenced in the TIGR gene
indices are available at:
ftp://genestamp.ibms.sinica.edu.tw/pub/SpecificP/
The software program and its source code are available
upon request.
Contact: [email protected]"