[Genome] knownIsoforms

Fan Hsu fanhsu at soe.ucsc.edu
Mon May 7 11:38:32 PDT 2007


Hi Jeff,

The knownIsoforms table is generated by the program txGeneCanonical,
written by Jim Kent.  Jim is out of town today, and I am not sure whether
he has email access.

My guess is that this program first identifies overlapping genes, then
finds the longest one and designates it as the representative
canonical gene of the cluster.
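
To make that guess concrete, here is a minimal sketch in Python. It is
not the txGeneCanonical code and may not match what that program really
does. The names Tx and cluster_and_pick_longest are made up for this
illustration. It assumes that "overlapping" means same chromosome, same
strand, and intersecting half-open [txStart, txEnd) ranges, and that
"longest" means the largest txEnd - txStart. The sample coordinates are
a few rows taken from the query results quoted below.

```python
from typing import List, NamedTuple, Tuple


class Tx(NamedTuple):
    """Hypothetical record holding a few knownGene columns."""
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int


def cluster_and_pick_longest(txs: List[Tx]) -> List[Tuple[List[Tx], Tx]]:
    """Group overlapping same-strand transcripts, then pick the longest.

    Returns one (members, canonical) pair per cluster. This follows the
    guess in the message above, not the real txGeneCanonical logic.
    """
    clusters: List[List[Tx]] = []
    current: List[Tx] = []
    current_key = None
    current_end = -1
    # Sorting by start lets a single sweep merge overlapping ranges.
    for tx in sorted(txs, key=lambda t: (t.chrom, t.strand, t.tx_start, t.tx_end)):
        same_group = current and (tx.chrom, tx.strand) == current_key
        if same_group and tx.tx_start < current_end:
            # Overlaps the running cluster: extend it.
            current.append(tx)
            current_end = max(current_end, tx.tx_end)
        else:
            # No overlap (or new chrom/strand): start a new cluster.
            if current:
                clusters.append(current)
            current = [tx]
            current_key = (tx.chrom, tx.strand)
            current_end = tx.tx_end
    if current:
        clusters.append(current)
    # Assumption: "longest" is measured by genomic span, txEnd - txStart.
    return [(c, max(c, key=lambda t: t.tx_end - t.tx_start)) for c in clusters]


example = [
    Tx("uc001aaa.1", "chr1", "+", 1736, 4121),
    Tx("uc001aab.1", "chr1", "-", 4558, 14764),
    Tx("uc001aac.1", "chr1", "-", 4558, 19346),
    Tx("uc001aaf.1", "chr1", "-", 4832, 19672),
]
clusters = cluster_and_pick_longest(example)
for members, canonical in clusters:
    print([t.name for t in members], "canonical:", canonical.name)
```

The real program may apply other criteria (for example, exon or CDS
overlap rather than whole-transcript overlap), so treat this only as a
starting point for reading the source.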

In the meantime, you can find the details of this program
by downloading our source tree and looking
under:

kent/src/hg/txGene/txGeneCanonical

Fan.
-----Original Message-----
From: genome-bounces at soe.ucsc.edu [mailto:genome-bounces at soe.ucsc.edu]On
Behalf Of Jeffrey Rosenfeld
Sent: Monday, May 07, 2007 10:16 AM
To: genome at soe.ucsc.edu
Subject: [Genome] knownIsoforms


How is the knownIsoforms table constructed? It seems that all
overlapping genes are clustered together, but there are examples, such
as at the very beginning of chromosome 1, where a transcript within a
cluster becomes its own cluster.  When I run the following query on hg18:

select clusterID,name,chrom,strand,txStart,txEnd,cdsStart,cdsEnd  from
knownIsoforms, knownGene where transcript = name;

These are the results I get:

| clusterID | name       | chrom | strand | txStart | txEnd  | cdsStart | cdsEnd |
+-----------+------------+-------+--------+---------+--------+----------+--------+
|         1 | uc001aaa.1 | chr1  | +      |    1736 |   4121 |     1736 |   1736 |
|         2 | uc001aab.1 | chr1  | -      |    4558 |  14764 |     4558 |   4558 |
|         2 | uc001aac.1 | chr1  | -      |    4558 |  19346 |     4558 |   4558 |
|         2 | uc001aad.1 | chr1  | -      |    4558 |   7231 |     4558 |   7173 |
|         2 | uc001aae.1 | chr1  | -      |    4558 |   9622 |     4558 |   4558 |
|         2 | uc001aaf.1 | chr1  | -      |    4832 |  19672 |     4832 |   4832 |
|         2 | uc001aag.1 | chr1  | -      |    5658 |   7231 |     5658 |   5658 |
|         2 | uc001aah.1 | chr1  | -      |    6720 |  19346 |     6720 |   6720 |
|         2 | uc001aai.1 | chr1  | -      |    6720 |   9622 |     6720 |   6720 |
|         3 | uc001aaj.1 | chr1  | -      |    7777 |  19346 |     7777 |  14749 |


Shouldn't cluster 3 be included as part of cluster 4?

Thank You,

Jeffrey Rosenfeld
_______________________________________________
Genome maillist  -  Genome at soe.ucsc.edu
http://www.soe.ucsc.edu/mailman/listinfo/genome


