[Genome] question on chain data format
Kayla Smith
kayla at soe.ucsc.edu
Tue Aug 7 12:18:58 PDT 2007
Hello Zhaoshi,
1. The chain files don't contain sequence data. Here is where you can
download sequence data:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips
You might also find the axt downloads useful (but note that these are
for the nets and so will not include all chains):
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/vsSelf/axtNet/
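Since the axt files carry the aligned bases, percent identity can be
computed directly from them. Here is a minimal Python sketch, assuming the
usual axt layout (a summary line, then the two aligned sequence lines, with
blank lines between records). The function names axt_records and
percent_identity and the example record are made up for illustration, and
"identity" is taken here as matches over columns with no gap on either
side, which is only one of several possible definitions:

```python
def axt_records(lines):
    """Yield (summary_fields, target_seq, query_seq) from axt-formatted lines."""
    block = []
    for line in lines:
        line = line.strip()
        if line.startswith("#"):      # skip comment lines
            continue
        if line:
            block.append(line)
        if len(block) == 3:           # summary line + two sequence lines
            yield block[0].split(), block[1], block[2]
            block = []


def percent_identity(target_seq, query_seq):
    """Matches / aligned columns, skipping columns with a gap ('-') on either side."""
    matches = aligned = 0
    # upper() so that soft-masked (lowercase) bases still compare equal
    for t, q in zip(target_seq.upper(), query_seq.upper()):
        if t == "-" or q == "-":
            continue
        aligned += 1
        if t == q:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# Invented record, for illustration only (not taken from any real axt file).
example = """\
0 chr1 101 110 chr2 201 209 + 500
ACGTACGTAC
ACGTTCG-AC
""".splitlines()

for fields, t_seq, q_seq in axt_records(example):
    print(fields[1], fields[4], round(percent_identity(t_seq, q_seq), 1))
```

For real data you would pass an open file handle to axt_records instead
of the example list.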
2. The dt/dq distances are the number of bases skipped in the reference
(dt) and in the query (dq) before the next aligning block. They are gaps,
not mismatches: not everything aligns. Here is an example
(from the chain format help page):
chain 4900 chrY 58368225 + 25985406 25985566 chr5 151006098 - 43549808 43549970 2
16 0 2
60 4 0
10 0 4
70
Columns: size dt dq
This shows 4 ungapped alignment blocks, sizes 16, 60, 10, and 70, with
one 4bp gap in the reference (dt=4, between the 2nd and 3rd blocks). So the
total extent of the chain is 16+60+4+10+70 (160bp) in the reference,
which agrees with the tEnd-tStart from the header line (25985566-25985406).
In the query, the dq gaps of 2 and 4 give 16+2+60+10+4+70 (162bp), which
agrees with qEnd-qStart (43549970-43549808).
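The same check can be done in a few lines of Python. This is only a
sketch: the function name chain_extents is made up, and it assumes a single
chain whose header fields are in the order given on the chain format help
page (score, tName, tSize, tStrand, tStart, tEnd, qName, qSize, qStrand,
qStart, qEnd, id):

```python
chain_text = """\
chain 4900 chrY 58368225 + 25985406 25985566 chr5 151006098 - 43549808 43549970 2
16 0 2
60 4 0
10 0 4
70
"""


def chain_extents(text):
    """Return (summed ref span, tEnd-tStart, summed query span, qEnd-qStart)."""
    rows = [ln.split() for ln in text.strip().splitlines()]
    header, blocks = rows[0], rows[1:]
    t_start, t_end = int(header[5]), int(header[6])
    q_start, q_end = int(header[10]), int(header[11])

    t_len = q_len = 0
    for block in blocks:
        size = int(block[0])
        # The last block has only a size; it is followed by no gap.
        dt, dq = (int(block[1]), int(block[2])) if len(block) == 3 else (0, 0)
        t_len += size + dt   # reference advances by the block plus dt
        q_len += size + dq   # query advances by the block plus dq
    return t_len, t_end - t_start, q_len, q_end - q_start


print(chain_extents(chain_text))  # (160, 160, 162, 162)
```

The first two numbers are the reference span (summed from the blocks, then
from the header) and the last two are the same for the query.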
I hope this information is helpful to you. Please don't hesitate to
contact us again if you require further assistance.
Kayla Smith
UCSC Genome Bioinformatics Group
zhaoshi wrote:
> Hi--
>
> I was trying to get the pairwise sequence identity information from the
> human self chain data.
> I downloaded the chain files and read
> http://genome.ucsc.edu/goldenPath/help/chain.html and
> have some questions:
>
> 1) it seems that the chain file does not contain sequence identity
> information. Is there any other data that contains this
> information, or do I need to compute it on my own based on the chain data?
>
> 2) I read the chain format explanation, but I do not quite understand
> what 'dt' and 'dq' mean.
> It states in the explanation like:
> dt -- the difference between the end of this block and the beginning of
> the next block (reference sequence)
> dq -- the difference between the end of this block and the beginning of
> the next block (query sequence)
> What does this really mean? Does 'difference' mean mismatch?
>
> Thanks for your help.
>
> Zhaoshi