[Genome] Pfam foreign keys in data tables // mapping protein domains to genomic coordinates

John Major major at cbio.mskcc.org
Thu Mar 1 09:34:49 PST 2007


Hello-

I am trying to map the Pfam protein domains to genomic coordinates and
am having some problems.
I see that in the proteome tables, there are 2 obvious Pfam tables:
pfamDesc and pfamXref.
Neither of these tables appears to be linked to other tables... or at
least the table description pages do not offer any information as to
which tables these 2 link to.
Also, I do not see a table which gives the start and end coordinates
for the Pfam domains (in protein, mRNA, or genomic space).

What I would like to get is a simple table of domain information in
genomic coordinate space, e.g.:

GenomeBuildID  Chrm  Start   End     ProteinDomainName  SourceDatabase
hg18           chr1  100000  100050  Protein-Kinase     Pfam
hg18           chr2  200010  200090  X-binding-site     UniProt


I would like to get this info for both UniProt and Pfam.  The UniProt
tables (uniprot.feature and uniprot.description) appear to be linked to
kgXref via acc->spid, so I should be able to derive genomic
coordinates for the UniProt features via these tables.
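
The coordinate arithmetic behind that derivation might look roughly like
the sketch below. It is only an illustration under stated assumptions:
the gene model is taken to be a knownGene-style row (exonStarts,
exonEnds, cdsStart, cdsEnd, strand; 0-based, half-open) reached through
kgXref, and the feature's residue positions are taken to be 1-based and
inclusive, which would need to be checked against the actual
uniprot.feature columns. The function name and its parameters are
hypothetical, not part of any existing tool.

```python
def protein_range_to_genomic(exon_starts, exon_ends, cds_start, cds_end,
                             strand, aa_start, aa_end):
    """Map a protein residue range onto genomic intervals.

    Assumptions (not confirmed by the table docs): genomic coordinates
    are 0-based half-open; aa_start/aa_end are 1-based inclusive.
    Returns a sorted list of (start, end) genomic intervals.
    """
    # Keep only the coding part of each exon.
    blocks = []
    for s, e in zip(exon_starts, exon_ends):
        s, e = max(s, cds_start), min(e, cds_end)
        if s < e:
            blocks.append((s, e))

    # On the minus strand the protein starts at the last coding block.
    if strand == "-":
        blocks.reverse()

    # Residue range expressed as a 0-based half-open CDS nucleotide range.
    nt_start = (aa_start - 1) * 3
    nt_end = aa_end * 3

    pieces = []
    offset = 0  # CDS nucleotides consumed by earlier blocks
    for s, e in blocks:
        length = e - s
        lo = max(nt_start, offset)
        hi = min(nt_end, offset + length)
        if lo < hi:
            if strand == "+":
                pieces.append((s + (lo - offset), s + (hi - offset)))
            else:
                # Minus strand: CDS offsets count down from the block end.
                pieces.append((e - (hi - offset), e - (lo - offset)))
        offset += length
    return sorted(pieces)


# Toy two-exon gene; residues 16-18 span the exon boundary.
print(protein_range_to_genomic([100, 300], [200, 400], 150, 350, "+", 16, 18))
```

A feature that spans an intron comes back as more than one interval, so
one protein domain may need several rows in the output table.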


If you have any advice on an easier way to get this mapping of domains
to genomic coordinates, I'd be thrilled to hear it.  Otherwise, could
you please advise me on the Pfam tables?

Thanks!
John Major

