diffseq
diffseq finds the region of overlap of the input sequences and then reports differences within this region, like a local alignment.
The start and end positions of the overlap are reported.
diffseq should be of value when looking for SNPs, differences between strains of an organism and anything else that requires the differences between sequences to be highlighted.
The sequences can be very long. The program does a match of all sequence words of size 10 (by default). It then reduces this to the minimum set of overlapping matches by sorting the matches in order of size (largest size first) and then for each such match it removes any smaller matches that overlap. The result is a set of the longest ungapped alignments between the two sequences that do not overlap with each other. The mismatched regions between these matches are reported.
It should be possible to find differences between sequences that are megabases long.
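The procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual source: the function names are invented for this example, two matches are treated as overlapping if they overlap in either sequence, and the retained matches are assumed to appear in the same order in both sequences.

```python
from collections import defaultdict


def exact_matches(a, b, wordsize=10):
    """Return maximal ungapped exact matches as (a_start, b_start, length)."""
    # Hash every word of length `wordsize` in b by its start position.
    index = defaultdict(list)
    for j in range(len(b) - wordsize + 1):
        index[b[j:j + wordsize]].append(j)
    matches = []
    for i in range(len(a) - wordsize + 1):
        for j in index.get(a[i:i + wordsize], ()):
            # Skip hits that merely continue a match that started further left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            n = wordsize
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            matches.append((i, j, n))
    return matches


def overlaps(m, k):
    """True if two matches overlap in either sequence (an assumption here)."""
    (i, j, n), (ki, kj, kn) = m, k
    return (i < ki + kn and ki < i + n) or (j < kj + kn and kj < j + n)


def longest_nonoverlapping(matches):
    """Keep the longest matches first, dropping smaller ones that overlap."""
    kept = []
    for m in sorted(matches, key=lambda x: x[2], reverse=True):
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    return sorted(kept)


def differences(a, b, wordsize=10):
    """Report the mismatched regions lying between consecutive kept matches.

    Coordinates are 0-based and half-open:
    (a_start, a_end, a_bases, b_start, b_end, b_bases).
    """
    kept = longest_nonoverlapping(exact_matches(a, b, wordsize))
    out = []
    for (i, j, n), (ni, nj, _) in zip(kept, kept[1:]):
        out.append((i + n, ni, a[i + n:ni], j + n, nj, b[j + n:nj]))
    return out
```

With a word size of 4, `differences("ACGGTCATTGCAAGCT", "ACGGTCATCGCAAGCT", 4)` reports the single substitution between the two flanking matches.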
    % diffseq tembl:ap000504 tembl:af129756
    Find differences between nearly identical sequences
    Word size [10]:
    Output report [ap000504.diffseq]:
    Output features [AP000504.diffgff]:
    Second output features [AF129756.diffgff]:
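The same run can be scripted without prompts. The sketch below builds the command from qualifiers listed on this page (`-asequence`, `-bsequence`, `-wordsize`, `-outfile`, `-aoutfeat`, `-boutfeat`, `-auto`); the helper names `build_diffseq_command` and `run_diffseq` are invented for this example, and it assumes that `diffseq` is on the `PATH` and that the `tembl` test database is configured.

```python
import shutil
import subprocess


def build_diffseq_command(aseq, bseq, outfile, aoutfeat, boutfeat, wordsize=10):
    """Assemble a non-interactive diffseq command line as an argument list."""
    return [
        "diffseq",
        "-asequence", aseq,
        "-bsequence", bseq,
        "-wordsize", str(wordsize),
        "-outfile", outfile,
        "-aoutfeat", aoutfeat,
        "-boutfeat", boutfeat,
        "-auto",  # turn off prompts
    ]


def run_diffseq(*args, **kwargs):
    """Run diffseq, raising if the program is missing or exits with an error."""
    if shutil.which("diffseq") is None:
        raise FileNotFoundError("diffseq not found on PATH")
    subprocess.run(build_diffseq_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    run_diffseq(
        "tembl:ap000504", "tembl:af129756",
        "ap000504.diffseq", "AP000504.diffgff", "AF129756.diffgff",
    )
```

Passing the arguments as a list avoids shell quoting problems with USAs that contain colons or other special characters.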
    Standard (Mandatory) qualifiers:
      [-asequence]         sequence   Sequence USA
      [-bsequence]         sequence   Sequence USA
       -wordsize           integer    The similar regions between the two
                                      sequences are found by creating a hash
                                      table of 'wordsize'd subsequences. 10 is a
                                      reasonable default. Making this value
                                      larger (20?) may speed up the program
                                      slightly, but will mean that any two
                                      differences within 'wordsize' of each
                                      other will be grouped as a single region
                                      of difference. This value may be made
                                      smaller (4?) to improve the resolution of
                                      nearby differences, but the program will
                                      go much slower.
      [-outfile]           report     Output report file name
      [-aoutfeat]          featout    File for output of first sequence's
                                      features
      [-boutfeat]          featout    File for output of second sequence's
                                      features

    Additional (Optional) qualifiers:
       -globaldifferences  boolean    Normally this program will find regions of
                                      identity that are the length of the
                                      specified word-size or greater and will
                                      then report the regions of difference
                                      between these matching regions. This works
                                      well and is what most people want if they
                                      are working with long overlapping nucleic
                                      acid sequences. You are usually not
                                      interested in the non-overlapping ends of
                                      these sequences. If you have protein
                                      sequences or short RNA sequences however,
                                      you will be interested in differences at
                                      the very ends. If this option is set to be
                                      true then the differences at the ends will
                                      also be reported.
    Advanced (Unprompted) qualifiers: (none)

    Associated qualifiers:

       "-asequence" associated qualifiers
       -sbegin1            integer    Start of the sequence to be used
       -send1              integer    End of the sequence to be used
       -sreverse1          boolean    Reverse (if DNA)
       -sask1              boolean    Ask for begin/end/reverse
       -snucleotide1       boolean    Sequence is nucleotide
       -sprotein1          boolean    Sequence is protein
       -slower1            boolean    Make lower case
       -supper1            boolean    Make upper case
       -sformat1           string     Input sequence format
       -sdbname1           string     Database name
       -sid1               string     Entryname
       -ufo1               string     UFO features
       -fformat1           string     Features format
       -fopenfile1         string     Features file name

       "-bsequence" associated qualifiers
       -sbegin2            integer    Start of the sequence to be used
       -send2              integer    End of the sequence to be used
       -sreverse2          boolean    Reverse (if DNA)
       -sask2              boolean    Ask for begin/end/reverse
       -snucleotide2       boolean    Sequence is nucleotide
       -sprotein2          boolean    Sequence is protein
       -slower2            boolean    Make lower case
       -supper2            boolean    Make upper case
       -sformat2           string     Input sequence format
       -sdbname2           string     Database name
       -sid2               string     Entryname
       -ufo2               string     UFO features
       -fformat2           string     Features format
       -fopenfile2         string     Features file name

       "-outfile" associated qualifiers
       -rformat3           string     Report format
       -rname3             string     Base file name
       -rextension3        string     File name extension
       -rdirectory3        string     Output directory
       -raccshow3          boolean    Show accession number in the report
       -rdesshow3          boolean    Show description in the report
       -rscoreshow3        boolean    Show the score in the report
       -rusashow3          boolean    Show the full USA in the report

       "-aoutfeat" associated qualifiers
       -offormat4          string     Output feature format
       -ofopenfile4        string     Features file name
       -ofextension4       string     File name extension
       -ofdirectory4       string     Output directory
       -ofname4            string     Base file name
       -ofsingle4          boolean    Separate file for each entry

       "-boutfeat" associated qualifiers
       -offormat5          string     Output feature format
       -ofopenfile5        string     Features file name
       -ofextension5       string     File name extension
       -ofdirectory5       string     Output directory
       -ofname5            string     Base file name
       -ofsingle5          boolean    Separate file for each entry

    General qualifiers:
       -auto               boolean    Turn off prompts
       -stdout             boolean    Write standard output
       -filter             boolean    Read standard input, write standard output
       -options            boolean    Prompt for standard and additional values
       -debug              boolean    Write debug output to program.dbg
       -verbose            boolean    Report some/full command line options
       -help               boolean    Report command line options. More
                                      information on associated and general
                                      qualifiers can be found with -help -verbose
       -warning            boolean    Report warnings
       -error              boolean    Report errors
       -fatal              boolean    Report fatal errors
       -die                boolean    Report deaths
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Standard (Mandatory) qualifiers | | | |
[-asequence] (Parameter 1) | Sequence USA | Readable sequence | Required |
[-bsequence] (Parameter 2) | Sequence USA | Readable sequence | Required |
-wordsize | The similar regions between the two sequences are found by creating a hash table of 'wordsize'd subsequences. 10 is a reasonable default. Making this value larger (20?) may speed up the program slightly, but will mean that any two differences within 'wordsize' of each other will be grouped as a single region of difference. This value may be made smaller (4?) to improve the resolution of nearby differences, but the program will go much slower. | Integer 2 or more | 10 |
[-outfile] (Parameter 3) | Output report file name | Report output file | |
[-aoutfeat] (Parameter 4) | File for output of first sequence's features | Writeable feature table | $(asequence.name).diffgff |
[-boutfeat] (Parameter 5) | File for output of second sequence's features | Writeable feature table | $(bsequence.name).diffgff |
Additional (Optional) qualifiers | | | |
-globaldifferences | Normally this program will find regions of identity that are the length of the specified word-size or greater and will then report the regions of difference between these matching regions. This works well and is what most people want if they are working with long overlapping nucleic acid sequences. You are usually not interested in the non-overlapping ends of these sequences. If you have protein sequences or short RNA sequences however, you will be interested in differences at the very ends. If this option is set to be true then the differences at the ends will also be reported. | Boolean value Yes/No | No |
Advanced (Unprompted) qualifiers | | | |
(none) | | | |
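The effect of -wordsize on resolution follows from the word-matching step: two substitutions can only be reported separately if the identical stretch between them is long enough to contain a complete word. The sketch below is a simplified model of that consequence, not part of the program; the function name `group_differences` is invented, and it considers single-base substitutions only.

```python
def group_differences(positions, wordsize=10):
    """Group substitution positions that cannot be separated by a word match.

    Two neighbouring differences merge into one region when fewer than
    `wordsize` identical bases lie between them. Returns (start, end) pairs.
    """
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][1] - 1 < wordsize:
            regions[-1][1] = pos  # identical stretch too short: extend region
        else:
            regions.append([pos, pos])
    return [tuple(region) for region in regions]
```

For hypothetical substitutions at positions 100, 105 and 200, a word size of 10 gives the regions (100, 105) and (200, 200), whereas a word size of 4 resolves all three separately.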
ID AP000504 standard; DNA; HUM; 100000 BP. XX AC AP000504; BA000025; XX SV AP000504.1 XX DT 28-SEP-1999 (Rel. 61, Created) DT 22-AUG-2001 (Rel. 68, Last updated, Version 3) XX DE Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section DE 3/20. XX KW . XX OS Homo sapiens (human) OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; OC Eutheria; Primates; Catarrhini; Hominidae; Homo. XX RN [1] RP 1-100000 RA Hirakawa M., Yamaguchi H., Imai K., Shimada J.; RT ; RL Submitted (21-SEP-1999) to the EMBL/GenBank/DDBJ databases. RL Mika Hirakawa, Japan Science and Technology Corporation (JST), Advanced RL Databases Department; 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-0081, Japan RL (E-mail:mika@tokyo.jst.go.jp, URL:http://www-alis.tokyo.jst.go.jp/, RL Tel:81-3-5214-8491, Fax:81-3-5214-8470) XX RN [2] RA Shiina S., Tamiya G., Oka A., Inoko H.; RT "Homo sapiens 2,229,817bp genomic DNA of 6p21.3 HLA class I region"; RL Unpublished. XX DR SWISS-PROT; O00299; CLI1_HUMAN. DR SWISS-PROT; O43196; MSH5_HUMAN. DR SWISS-PROT; O95445; APOM_HUMAN. DR SWISS-PROT; O95865; DDH2_HUMAN. DR SWISS-PROT; O95867; NG24_HUMAN. DR SWISS-PROT; P13862; KC2B_HUMAN. XX CC This sequence is conducted by Tokai University as a JST sequencing CC Team. CC Principal Investigator: Hidetoshi Inoko Ph.D CC Phone:+81-463-93-1121, Fax:+81-463-94-8884, CC The sequence is submitted by Human Genome Sequencing in ALIS CC project of JST CC Japan Science and Technology Corporation (JST) CC 5-3, Yonbancyo, Chiyoda-ku, Tokyo, 102-0081 Japan CC For further infomation about this sequences, please visit our CC sequence archive Web site (http://www-alis.tokyo.jst.go.jp/HGS/top. 
[Part of this file has been deleted for brevity] gggtggatca tgaggtcaag agatcgagac tatcctggct aacatgatga aaccccgtct 97080 ctactaaaaa tacaaaaaat tagctgggca tggtggcggg cacctgtagt cccagctact 97140 cgggaggctg agtcaggaga atggtgtgaa cccaggagac ggagcttgca gtgagctgag 97200 gtcgcaccac tgcactccag cctgggtgat agagcgagac tctgtctcaa aaaaaaaaaa 97260 aaaaaaaaaa aaaacaaaaa ttagccgggt gtggtggcag gcaacttaat cccagctact 97320 tgggaggcag aggcaggaga atcgtttgaa cctgggaggc ggaggttgaa gagaatagaa 97380 gctctgctgg tccagagaag gattgggcca gggctctggg agaccaggga gaaagagggc 97440 acatgtggtc cctgttgact gtgagggtgg gaatctgagg aaggctttgg ctcattgccc 97500 cttgggtttg tccacagcca tccttcccct gcggagtatg tcgaggtgct ccaggagcta 97560 cagcggctgg agagtcgcct ccagcccttc ttgcagcgct actacgaggt tctgggtgct 97620 gctgccacca cggactacaa taacaatgtg agccctttga tggccctgcc ctttctcctc 97680 agccccagta ctcccaaaac agaacaggct gaaatacaga taactctttc cctccctgga 97740 aaaacattgc aacagggcca ggtgcagtgg ctcacgcctg taatcccagc actttgggag 97800 gccaaggtgg gcggatcatc tgagatcggg agtttgagac cagcctggcc aacatggtgc 97860 aaccccatct ctactgaaaa tataaacatt agctggatgt agtggtgcac acctgtaatc 97920 ccagctactc aggaggctga ggcaggagaa tcgctagaac tcgggaggag ggggttgcag 97980 tgagccgaga ttgcactact gcactctagc ctgggtgaca gagcgagact gtctcaaaaa 98040 acaaaacaaa acaaaaaaac acacattgca acaaaacaat ttctctctaa acctgtaagt 98100 gattttgtcc tcccttacag agaaggtgat aatctttgct gtaagcactg tcctcgtatc 98160 gtaccccttg tgcccctgaa tgaatttaga aaatgtaaag tacaggagat cagtatatga 98220 tgacttactg attcatagta gtgttttaat aggatgttcc ttatgtgaat aagatataat 98280 ttatttgcaa agatttggtc tacatgtaaa cttccaagga tataactgaa agttttggag 98340 gacatggtat tctcagtagg cattattgct tttattagtg agatggactc cagcttgata 98400 ttttctgcct ttttgtgttt ggctggttgt gcgcagcacg agggccggga ggaggatcag 98460 cggttgatca acttggtagg ggagagcctg cgactgctgg gcaacacctt tgttgcactg 98520 tctgacctgc gctgcaatct ggcctgcacg cccccacgac acctgcatgt ggtccggcct 98580 atgtctcact acaccacccc catggtgctc cagcaggcag ccattcccat acaggtgggt 98640 
tagggggagt ctggcctgag ggagagtgag gggtgttgat agagtgaccc agggtagcta 98700 ctgggcctga aggaggttag gaaaggagga gactggaaac atggtgatga aggctggaga 98760 tactttagag gtttatcatg aggttttctt ggttaggctc ttgtattttt ctcacatctg 98820 cctgtccatc tgtctttttc agatcaatgt gggaaccact gtgaccatga caggaaatgg 98880 gactcggccc cccccaactc ccaatgcaga ggcacctccc cctggtcctg ggcaggcctc 98940 atccgtggct ccgtcttcta ccaatgtcga gtcctcagct gagggggctc ccccgccagg 99000 tccagctccc ccgccagcca ccagccaccc gagggtcatc cggatttccc accagagtgt 99060 ggaacccgtg gtcatgatgc acatgaacat tcaaggtgag aatagttgct ggcgagaaga 99120 gcaggatcag catgatgagg gaggttcatg ctgaggtgtg agggaacagg gtggggaagg 99180 gagaggcaca tgctggtggt ggtagcctgg ggaccagagc agaagcttaa gtagacagat 99240 gtggggggtg tgggggttgg tttgtctttg gaggtgtgtt tgtgtggtga agggagtacc 99300 tctccctgtt tagatggagg gaaaggcagg ctttctgatt gggggattat gggcctgaag 99360 tatgcctgat ctcagaagga tatagttagg ccttggccct acctacctca gggccactgt 99420 ctctgtctcc ctgcccagat tctggcacac agcctggtgg tgttccgagt gctcccactg 99480 gccccctggg accccctggt catggccaaa ccctgggtaa gagtgagggc atcagggcag 99540 gctgagctct gggtagagaa agggaagggc tgagtgggtg ggttgaaggg gtccaggttc 99600 aaggttacat cagacccgcc ccccaggctc caccctcatc cagctgccct ccctgccccc 99660 tgagttcatg cacgccgtcg cccaccagat cactcatcag gccatggtgg cagctgttgc 99720 ctccgcggcc gcaggtaatg acctggaagg ggaggcttgg gaggtagggc acagtccatg 99780 gtggcagctg gctggcaagg gcctggccct cagccctctt cggtctgtct cttctgccac 99840 ccacaggaca gcaggtgcca ggcttcccaa cagctccaac ccgggtggtg attgcccggc 99900 ccactcctcc acaggctcgg ccttcccatc ctggagggcc cccagtctct gggacactgg 99960 tgagcaaggg tcggggagtt ctagtgcgta acagtctagg 100000 // |
ID AF129756 standard; DNA; HUM; 184666 BP. XX AC AF129756; XX SV AF129756.1 XX DT 12-MAR-1999 (Rel. 59, Created) DT 29-OCT-1999 (Rel. 61, Last updated, Version 2) XX DE Homo sapiens MSH55 gene, partial cds; and CLIC1, DDAH, G6b, G6c, G5b, G6d, DE G6e, G6f, BAT5, G5b, CSK2B, BAT4, G4, Apo M, BAT3, BAT2, AIF-1, 1C7, LST-1, DE LTB, TNF, and LTA genes, complete cds. XX KW . XX OS Homo sapiens (human) OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; OC Eutheria; Primates; Catarrhini; Hominidae; Homo. XX RN [1] RP 1-184666 RA Rowen L., Madan A., Qin S., Shaffer T., James R., Ratcliffe A., Abbasi N., RA Dickhoff R., Loretz C., Madan A., Dors M., Young J., Lasky S., Hood L.; RT "Sequence of the human major histocompatibility complex class III region"; RL Unpublished. XX RN [2] RP 1-184666 RA Rowen L.; RT ; RL Submitted (22-FEB-1999) to the EMBL/GenBank/DDBJ databases. RL Department of Molecular Biotechnology, Box 357730 University of Washington, RL Seattle, WA 98195, USA XX RN [3] RP 1-184666 RA Rowen L.; RT ; RL Submitted (28-OCT-1999) to the EMBL/GenBank/DDBJ databases. RL Multimegabase Sequencing Center, University of Washington, PO Box 357730, RL Seattle, WA 98195, USA XX DR EPD; EP11158; HS_TNFA. DR EPD; EP11159; HS_TNFB. DR SPTREMBL; O00452; O00452. DR SPTREMBL; O14931; O14931. DR SPTREMBL; O95866; O95866. DR SPTREMBL; O95868; O95868. DR SPTREMBL; O95869; O95869. DR SPTREMBL; O95870; O95870. 
[Part of this file has been deleted for brevity] aaaccagttt accaccactc ctaacactaa acttaaatct gactctaaat gtaagtccaa 181740 tctgagccac aagcctaaag ttgaacttta tcctgcttta tgaattattc atccattcct 181800 ccatttagtg agtatctgcg tgcctaacac atgctgggca ttgtcctaag gcaggaggga 181860 catggaggca aagggatcag agaaggtacc agcacctgtg gagcttgtat tccagtgagg 181920 ccagacggaa aagaaagaaa ctgaagaaga aattggtact atgagaaaat aagacaggct 181980 gatgttgtaa gagtggcagg gagctacttt taaatacagt agtcagcaaa atcctctttg 182040 agtgtttggg tggcactgga gctgagaccc aaatgacaaa aaatagtgac caggtaaaag 182100 tttgggagca aagcatttca ggtaaaggga gcagctactg caaaggctgg aaggcggaac 182160 caagctgggg gtgttgacga caaacagaag gccagtgtgg ctggagcaga gagagagact 182220 gggaggcggg tgggagatga ggtcagagag gagggcaggg gccaggtcat gcagggccat 182280 gcaagaaggg taaagcctct agatttcatc cagccacagg aagcctttaa aggtcgtcag 182340 agtgtgtggt gcgtgcgtgt gtgtgtgtgt gtgtgtgtgt gttgcagggg agagaggggg 182400 agggagagag agagagagag agagaagagg gaggtgagca gaggtgattg gatttttttt 182460 tcttttgaca tggtgtcttg ctctgtggcc taggctggag tgcagtggca ccatcatagc 182520 ccactgcaac ctcaaaacca tgggctcaag tcatccttcc acctcagctt cccaagtatc 182580 taggactaca ggtgtgtgcc actgtgcctg gctaatttta aaaaatattt taaaattttt 182640 gttgagacag ggtctatgct gctcaggctg gtctcgaact cctggtttca agtgatctgc 182700 ccatcttggc ctcccaaagt ttttttttgt tagtttgaga ggcggtttcg ctcgttgccc 182760 aggctggagt gcaatgactg atctcatctc actgcaacct ctgcctcctg ggttcaagcg 182820 attctcctgc ttcagcctcc caagtagctg ggattacagg tgcatgccac cattcccggc 182880 taattttttg tatttagtag agatggggtt tcaccatgtt agtcaggctg atctcaaact 182940 cctgacctca ggtgatccgc ctgcctcagc ctcccaaagt tttgggatta caggtgtgag 183000 ccaccatgct gggccagcct cccaaagttt tgggattaca ggcatgagtc accacactgg 183060 ccctggattt tttttctttc ttttttttgg agacggagtc tcactctgtt gcccaggctg 183120 gagtgcaatg gcgtaatctc agctcactgc aacctctgct gcccgggttc aaacgattct 183180 cctgtcttag cctcctgagt agctgggatt ataggtgcat gccaccatgc ctggctaatt 183240 tttgtacttt tagtagagaa agtacaccat cttggccagg 
ctggtctcga actcctgacc 183300 tcaggtgatc cacttgcgtc ggcctcccaa agtgctggga ttacaggcgt gagacaccgc 183360 acccagcctt tttttttttt tttcttttaa gacagaatcg ctctgtcacc caggctggag 183420 tgcagtggca caatctcggc tcactgcaac ctctgcctcc caggtttaag caatccacct 183480 atgtcagtct cccaagtagc tgggattata ggtgcatgtc accatgcctg gctaattttt 183540 gtacttttag tatagaaagt acaccatgtt ggccaggctg gtcttgaact cctgacctca 183600 agtgatccgc ctgcctcagc ctcccgaagt gctggaatta cagacatgtg ccactgcacc 183660 cggcctggtt ttttttttct aagagatgga gtctcacttt tctgcccagg ttggagtgca 183720 atggcaccat catagctcac tgcagccttc aactcttggc ctcaggcaat ccttgcacct 183780 tagcctcgca gtgttgggat tacaggcatg agccactgag ccttgcctgg actttttttt 183840 ttttttgaga tggcgtctcg ctctgttgcc caggttggag tgctacggca tgatcttggc 183900 tcactgcaac ttccacctcc caggttcaag cgattctctt gcctcggccc cccgagtagc 183960 tgggattaca ggcatgcgcc accgtgcctg gctaattttg gtatttttag tagagatagg 184020 gtttcatcat gttgggcagg ctggtcttga actcctgacc tcgtgatcca cccacctcgg 184080 cctcccaaag tgctgggatt ataggcatag ccaacgcgcc cagcctggac ttgtttttaa 184140 aagatcactg tggctcctgt gtttaggctg gctggtagga gacaggtggc agtggcattg 184200 atggtgaaga gaaaatagtg gcagccatgg agatggagag aagtagacaa gtttgggata 184260 tattatacat tccaggggta gaaacaacag gactagatga tggattgatg ggtgggagat 184320 gtagatactg ggagagaagc aggattctga tggatggaaa aactaaaaaa ttctattttg 184380 ggtgtggtaa gtctaagtct attagacatg caagtagaga tgtcactggg cagatacaca 184440 tctggatttc aggggcaagg tccaagctag agaaagaaac ctgggcatgg tcagcatgag 184500 gatggtgttt aaagccatgg aacttatctt gtgcatccct ataagacccc tttgaggcac 184560 ttgtttcccc tcacaatgga tgcagtgcat cttccattct gaattccaga ggcaacaacc 184620 tcctgctcct agaagctaaa ctctccagac ttagtcttct gaattc 184666 // |
The output is a standard EMBOSS report file.
The results can be output in one of several styles by using the command-line qualifier -rformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel, feattable, motif, regions, seqtable, simple, srs, table, tagseq
See: http://emboss.sf.net/docs/themes/ReportFormats.html for further information on report formats.
By default diffseq writes a 'diffseq' report file.
    ########################################
    # Program: diffseq
    # Rundate: Fri Jul 15 2005 12:00:00
    # Report_format: diffseq
    # Report_file: ap000504.diffseq
    # Additional_files: 2
    # 1: AP000504.diffgff (Feature file for first sequence)
    # 2: AF129756.diffgff (Feature file for second sequence)
    ########################################

    #=======================================
    #
    # Sequence: AP000504 from: 1 to: 100000
    # HitCount: 119
    #
    # Compare: AF129756 from: 1 to: 184666
    #
    # AP000504 overlap starts at 1
    # AF129756 overlap starts at 6036
    #
    # (AP000504) start end length sequence
    # (AF129756) start end length sequence
    #
    #
    #=======================================

    AP000504 847-847 Length: 1
    Sequence: a
    Sequence: t
    AF129756 6882-6882 Length: 1

    AP000504 1795-1795 Length: 1
    Sequence: g
    Sequence: a
    AF129756 7830-7830 Length: 1

    AP000504 2273-2273 Length: 1
    Sequence: t
    Sequence:
    Feature: repeat_region 7920-8351 rpt_family='MSTB'
    AF129756 8307 Length: 0

    AP000504 2466-2466 Length: 1
    Sequence: g
    Sequence: a
    Feature: repeat_region 8391-8686 rpt_family='AluSg'
    AF129756 8500-8500 Length: 1

    AP000504 2655-2658 Length: 4

    [Part of this file has been deleted for brevity]

    Sequence: t
    Sequence: c
    AF129756 99280-99280 Length: 1

    AP000504 93696-93696 Length: 1
    Sequence: t
    Sequence: g
    AF129756 99726-99726 Length: 1

    AP000504 93860-93860 Length: 1
    Sequence: t
    Sequence: g
    AF129756 99890-99890 Length: 1

    AP000504 95451-95451 Length: 1
    Sequence: c
    Sequence: t
    AF129756 101481-101481 Length: 1

    AP000504 96650-96650 Length: 1
    Sequence: c
    Sequence: t
    AF129756 102680-102680 Length: 1

    AP000504 97273-97274 Length: 2
    Sequence: aa
    Sequence:
    Feature: repeat_region 103299-103402 rpt_family='AluSq'
    AF129756 103302 Length: 0

    AP000504 97716-97716 Length: 1
    Sequence: a
    Sequence: g
    AF129756 103744-103744 Length: 1

    AP000504 97827-97827 Length: 1
    Sequence: c
    Sequence: t
    Feature: repeat_region 103784-104083 rpt_family='AluSx'
    AF129756 103855-103855 Length: 1

    #---------------------------------------
    #
    # Overlap_end: 100000 in AP000504
    # Overlap_end: 106028 in AF129756
    #
    # SNP_count: 86
    # Transitions: 58
    # Transversions: 28
    #
    #---------------------------------------
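A report in this style is straightforward to post-process. The following sketch reads the difference records and classifies single-base substitutions as transitions (a/g or c/t) or transversions, which is the distinction summarised in the report footer. It assumes the on-disk file has one item per line (a location line for the first sequence, two `Sequence:` lines, optional `Feature:` lines, then a location line for the second sequence); the function names and the `SAMPLE` text, copied from the first records above, are for illustration only.

```python
import re

# "NAME start-end Length: n", or "NAME pos Length: 0" for a zero-length region.
LOCATION = re.compile(r"^(\S+)\s+(\d+)(?:-(\d+))?\s+Length:\s*(\d+)\s*$")
TRANSITIONS = {frozenset("ag"), frozenset("ct")}

SAMPLE = """\
AP000504 847-847 Length: 1
Sequence: a
Sequence: t
AF129756 6882-6882 Length: 1

AP000504 1795-1795 Length: 1
Sequence: g
Sequence: a
AF129756 7830-7830 Length: 1

AP000504 2273-2273 Length: 1
Sequence: t
Sequence:
Feature: repeat_region 7920-8351 rpt_family='MSTB'
AF129756 8307 Length: 0
"""


def parse_report(text):
    """Return a list of difference records from diffseq-style report text."""
    regions, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        if line.startswith("Sequence:"):
            if current is not None:
                current["bases"].append(line[len("Sequence:"):].strip())
            continue
        m = LOCATION.match(line)
        if not m:
            continue  # Feature: lines and anything else are ignored here
        start = int(m.group(2))
        end = int(m.group(3)) if m.group(3) else start
        loc = (m.group(1), start, end, int(m.group(4)))
        if current is None:
            current = {"a": loc, "bases": []}  # first sequence opens a record
        else:
            current["b"] = loc                 # second sequence closes it
            regions.append(current)
            current = None
    return regions


def classify(region):
    """Label a record 'transition', 'transversion' or 'other' (non-SNP)."""
    bases = [b.lower() for b in region["bases"]]
    if len(bases) != 2 or any(len(b) != 1 for b in bases) or bases[0] == bases[1]:
        return "other"
    return "transition" if frozenset(bases) in TRANSITIONS else "transversion"


if __name__ == "__main__":
    for region in parse_report(SAMPLE):
        print(region["a"], region["b"], classify(region))
```

On the three sample records this labels the a/t difference a transversion, the g/a difference a transition, and the single-base insertion as "other".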
##gff-version 2.0 ##date 2005-07-15 ##Type DNA AF129756 AF129756 diffseq conflict 6882 6882 1.000 + . Sequence "AF129756.1" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 7830 7830 1.000 + . Sequence "AF129756.2" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 8500 8500 1.000 + . Sequence "AF129756.3" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 10945 10962 1.000 + . Sequence "AF129756.4" ; note "Insertion of 18 bases in AF129756" ; replace "" AF129756 diffseq conflict 10999 11001 1.000 + . Sequence "AF129756.5" ; note "AP000504" ; replace "aaa" AF129756 diffseq conflict 12915 12915 1.000 + . Sequence "AF129756.6" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 15139 15139 1.000 + . Sequence "AF129756.7" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 17192 17192 1.000 + . Sequence "AF129756.8" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 19761 19761 1.000 + . Sequence "AF129756.9" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 20291 20291 1.000 + . Sequence "AF129756.10" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 20462 20462 1.000 + . Sequence "AF129756.11" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 25686 25686 1.000 + . Sequence "AF129756.12" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 26192 26192 1.000 + . Sequence "AF129756.13" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 27227 27227 1.000 + . Sequence "AF129756.14" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 27837 27837 1.000 + . Sequence "AF129756.15" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 29328 29328 1.000 + . Sequence "AF129756.16" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 29458 29458 1.000 + . Sequence "AF129756.17" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 29629 29629 1.000 + . 
Sequence "AF129756.18" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 29646 29646 1.000 + . Sequence "AF129756.19" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 30838 30838 1.000 + . Sequence "AF129756.20" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 31349 31349 1.000 + . Sequence "AF129756.21" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 31901 31901 1.000 + . Sequence "AF129756.22" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 36682 36682 1.000 + . Sequence "AF129756.23" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 38225 38226 1.000 + . Sequence "AF129756.24" ; note "Insertion of 2 bases in AF129756" ; replace "" AF129756 diffseq conflict 38379 38379 1.000 + . Sequence "AF129756.25" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 38537 38537 1.000 + . Sequence "AF129756.26" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 39114 39114 1.000 + . Sequence "AF129756.27" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 39816 39816 1.000 + . Sequence "AF129756.28" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 40807 40807 1.000 + . Sequence "AF129756.29" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 40977 40977 1.000 + . Sequence "AF129756.30" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 41204 41204 1.000 + . Sequence "AF129756.31" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 42548 42548 1.000 + . Sequence "AF129756.32" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 45315 45315 1.000 + . Sequence "AF129756.33" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 48382 48382 1.000 + . Sequence "AF129756.34" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 50635 50635 1.000 + . 
Sequence "AF129756.35" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 50809 50809 1.000 + . Sequence "AF129756.36" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 51286 51286 1.000 + . Sequence "AF129756.37" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 51645 51645 1.000 + . Sequence "AF129756.38" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 52388 52388 1.000 + . Sequence "AF129756.39" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 52646 52646 1.000 + . Sequence "AF129756.40" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 53596 53596 1.000 + . Sequence "AF129756.41" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 53621 53621 1.000 + . Sequence "AF129756.42" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 54883 54883 1.000 + . Sequence "AF129756.43" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 55377 55377 1.000 + . Sequence "AF129756.44" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 55571 55571 1.000 + . Sequence "AF129756.45" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 55611 55611 1.000 + . Sequence "AF129756.46" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 55655 55661 1.000 + . Sequence "AF129756.47" ; note "Insertion of 7 bases in AF129756" ; replace "" [Part of this file has been deleted for brevity] AF129756 diffseq conflict 66604 66604 1.000 + . Sequence "AF129756.55" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 69445 69445 1.000 + . Sequence "AF129756.56" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 70182 70183 1.000 + . Sequence "AF129756.57" ; note "AP000504" ; replace "ta" AF129756 diffseq conflict 70195 70195 1.000 + . Sequence "AF129756.58" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 71102 71102 1.000 + . 
Sequence "AF129756.59" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 73566 73566 1.000 + . Sequence "AF129756.60" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 73758 73758 1.000 + . Sequence "AF129756.61" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 74597 74597 1.000 + . Sequence "AF129756.62" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 76175 76176 1.000 + . Sequence "AF129756.63" ; note "Insertion of 2 bases in AF129756" ; replace "" AF129756 diffseq conflict 76463 76463 1.000 + . Sequence "AF129756.64" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 76710 76710 1.000 + . Sequence "AF129756.65" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 77331 77331 1.000 + . Sequence "AF129756.66" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 77597 77597 1.000 + . Sequence "AF129756.67" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 78092 78092 1.000 + . Sequence "AF129756.68" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 79671 79671 1.000 + . Sequence "AF129756.69" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 80042 80042 1.000 + . Sequence "AF129756.70" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 80115 80115 1.000 + . Sequence "AF129756.71" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 81882 81882 1.000 + . Sequence "AF129756.72" ; note "AP000504" ; replace "tttggaat" AF129756 diffseq conflict 82132 82132 1.000 + . Sequence "AF129756.73" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 83649 83649 1.000 + . Sequence "AF129756.74" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 84290 84290 1.000 + . Sequence "AF129756.75" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 86465 86465 1.000 + . 
Sequence "AF129756.76" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 86842 86844 1.000 + . Sequence "AF129756.77" ; note "Insertion of 3 bases in AF129756" ; replace "" AF129756 diffseq conflict 87014 87014 1.000 + . Sequence "AF129756.78" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 87102 87102 1.000 + . Sequence "AF129756.79" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 87605 87605 1.000 + . Sequence "AF129756.80" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 87893 87893 1.000 + . Sequence "AF129756.81" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 88359 88359 1.000 + . Sequence "AF129756.82" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 88635 88635 1.000 + . Sequence "AF129756.83" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 88750 88750 1.000 + . Sequence "AF129756.84" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 88822 88826 1.000 + . Sequence "AF129756.85" ; note "Insertion of 5 bases in AF129756" ; replace "" AF129756 diffseq conflict 89118 89118 1.000 + . Sequence "AF129756.86" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 89738 89738 1.000 + . Sequence "AF129756.87" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 91271 91271 1.000 + . Sequence "AF129756.88" ; note "Insertion of 1 bases in AF129756" ; replace "" AF129756 diffseq conflict 92311 92311 1.000 + . Sequence "AF129756.89" ; note "SNP in AP000504" ; replace "g" AF129756 diffseq conflict 92345 92345 1.000 + . Sequence "AF129756.90" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 93979 93979 1.000 + . Sequence "AF129756.91" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 94959 94959 1.000 + . Sequence "AF129756.92" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 95246 95246 1.000 + . 
Sequence "AF129756.93" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 95809 95810 1.000 + . Sequence "AF129756.94" ; note "AP000504" ; replace "aat" AF129756 diffseq conflict 96756 96756 1.000 + . Sequence "AF129756.95" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 97713 97713 1.000 + . Sequence "AF129756.96" ; note "AP000504" ; replace "tgtgtgtgtgtgtgtgt" AF129756 diffseq conflict 97827 97827 1.000 + . Sequence "AF129756.97" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 98195 98195 1.000 + . Sequence "AF129756.98" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 99280 99280 1.000 + . Sequence "AF129756.99" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 99726 99726 1.000 + . Sequence "AF129756.100" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 99890 99890 1.000 + . Sequence "AF129756.101" ; note "SNP in AP000504" ; replace "t" AF129756 diffseq conflict 101481 101481 1.000 + . Sequence "AF129756.102" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 102680 102680 1.000 + . Sequence "AF129756.103" ; note "SNP in AP000504" ; replace "c" AF129756 diffseq conflict 103744 103744 1.000 + . Sequence "AF129756.104" ; note "SNP in AP000504" ; replace "a" AF129756 diffseq conflict 103855 103855 1.000 + . Sequence "AF129756.105" ; note "SNP in AP000504" ; replace "c" |
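The feature files can be read with a few lines of Python. This is a minimal sketch that assumes one GFF version 2 record per line, with eight whitespace-separated fields followed by the attribute text; the function names and the returned dictionary layout are choices made for this example, not part of EMBOSS.

```python
import re

# Attributes look like: Sequence "AF129756.1" ; note "SNP in AP000504" ; replace "a"
ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gff2_line(line):
    """Parse one GFF2 feature line into a dictionary, or None for comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # The first eight fields never contain spaces; the ninth is free text.
    fields = line.split(None, 8)
    if len(fields) < 8:
        raise ValueError("not a GFF2 feature line: %r" % line)
    seqname, source, feature, start, end, score, strand, frame = fields[:8]
    attributes = dict(ATTRIBUTE.findall(fields[8])) if len(fields) > 8 else {}
    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "score": float(score),
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


def parse_gff2(text):
    """Parse all feature lines in a block of GFF2 text."""
    records = (parse_gff2_line(line) for line in text.splitlines())
    return [record for record in records if record is not None]
```

For example, the first record above yields start and end 6882, the note "SNP in AP000504" and the replacement base "a"; insertion records have an empty `replace` value.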
##gff-version 2.0
##date 2005-07-15
##Type DNA AP000504
AP000504 diffseq conflict 847 847 1.000 + . Sequence "AP000504.1" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 1795 1795 1.000 + . Sequence "AP000504.2" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 2273 2273 1.000 + . Sequence "AP000504.3" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504 diffseq conflict 2466 2466 1.000 + . Sequence "AP000504.4" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 2655 2658 1.000 + . Sequence "AP000504.5" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504 diffseq conflict 4951 4953 1.000 + . Sequence "AP000504.6" ; note "AF129756" ; replace "tat"
AP000504 diffseq conflict 6600 6600 1.000 + . Sequence "AP000504.7" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504 diffseq conflict 6868 6868 1.000 + . Sequence "AP000504.8" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 8218 8221 1.000 + . Sequence "AP000504.9" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504 diffseq conflict 9096 9096 1.000 + . Sequence "AP000504.10" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 11149 11149 1.000 + . Sequence "AP000504.11" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 13718 13718 1.000 + . Sequence "AP000504.12" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 14248 14248 1.000 + . Sequence "AP000504.13" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 14419 14419 1.000 + . Sequence "AP000504.14" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 19643 19643 1.000 + . Sequence "AP000504.15" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 20149 20149 1.000 + . Sequence "AP000504.16" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 21316 21319 1.000 + . Sequence "AP000504.17" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504 diffseq conflict 21797 21797 1.000 + . Sequence "AP000504.18" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 23288 23288 1.000 + . Sequence "AP000504.19" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 23418 23418 1.000 + . Sequence "AP000504.20" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 23589 23589 1.000 + . Sequence "AP000504.21" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 23606 23606 1.000 + . Sequence "AP000504.22" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 24798 24798 1.000 + . Sequence "AP000504.23" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 25309 25309 1.000 + . Sequence "AP000504.24" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 25861 25861 1.000 + . Sequence "AP000504.25" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 28039 28040 1.000 + . Sequence "AP000504.26" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504 diffseq conflict 30644 30644 1.000 + . Sequence "AP000504.27" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 32339 32339 1.000 + . Sequence "AP000504.28" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 32497 32497 1.000 + . Sequence "AP000504.29" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 33074 33074 1.000 + . Sequence "AP000504.30" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 33776 33776 1.000 + . Sequence "AP000504.31" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 34767 34767 1.000 + . Sequence "AP000504.32" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 35163 35163 1.000 + . Sequence "AP000504.33" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 36507 36507 1.000 + . Sequence "AP000504.34" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 37760 37762 1.000 + . Sequence "AP000504.35" ; note "Insertion of 3 bases in AP000504" ; replace ""
AP000504 diffseq conflict 38680 38683 1.000 + . Sequence "AP000504.36" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504 diffseq conflict 42347 42347 1.000 + . Sequence "AP000504.37" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 42637 42638 1.000 + . Sequence "AP000504.38" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504 diffseq conflict 44602 44602 1.000 + . Sequence "AP000504.39" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 44776 44776 1.000 + . Sequence "AP000504.40" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 45253 45253 1.000 + . Sequence "AP000504.41" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 46354 46354 1.000 + . Sequence "AP000504.42" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 46612 46612 1.000 + . Sequence "AP000504.43" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 47562 47562 1.000 + . Sequence "AP000504.44" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 47587 47587 1.000 + . Sequence "AP000504.45" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 48849 48849 1.000 + . Sequence "AP000504.46" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 49343 49343 1.000 + . Sequence "AP000504.47" ; note "SNP in AF129756" ; replace "a"
[Part of this file has been deleted for brevity]
AP000504 diffseq conflict 58685 58685 1.000 + . Sequence "AP000504.55" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 60558 60558 1.000 + . Sequence "AP000504.56" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 61209 61209 1.000 + . Sequence "AP000504.57" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504 diffseq conflict 62958 62959 1.000 + . Sequence "AP000504.58" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504 diffseq conflict 63402 63402 1.000 + . Sequence "AP000504.59" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 64139 64140 1.000 + . Sequence "AP000504.60" ; note "AF129756" ; replace "at"
AP000504 diffseq conflict 64152 64152 1.000 + . Sequence "AP000504.61" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 65317 65317 1.000 + . Sequence "AP000504.62" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504 diffseq conflict 67523 67523 1.000 + . Sequence "AP000504.63" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 67715 67715 1.000 + . Sequence "AP000504.64" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 68554 68554 1.000 + . Sequence "AP000504.65" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 69285 69285 1.000 + . Sequence "AP000504.66" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504 diffseq conflict 70419 70419 1.000 + . Sequence "AP000504.67" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 70666 70666 1.000 + . Sequence "AP000504.68" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 71287 71287 1.000 + . Sequence "AP000504.69" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 71553 71553 1.000 + . Sequence "AP000504.70" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 72048 72048 1.000 + . Sequence "AP000504.71" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 73627 73627 1.000 + . Sequence "AP000504.72" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 73998 73998 1.000 + . Sequence "AP000504.73" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 74071 74071 1.000 + . Sequence "AP000504.74" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 75838 75845 1.000 + . Sequence "AP000504.75" ; note "AF129756" ; replace "g"
AP000504 diffseq conflict 76095 76095 1.000 + . Sequence "AP000504.76" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 77612 77612 1.000 + . Sequence "AP000504.77" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 78253 78253 1.000 + . Sequence "AP000504.78" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 80428 80428 1.000 + . Sequence "AP000504.79" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 80974 80974 1.000 + . Sequence "AP000504.80" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 81564 81564 1.000 + . Sequence "AP000504.81" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 81852 81852 1.000 + . Sequence "AP000504.82" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 82318 82318 1.000 + . Sequence "AP000504.83" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 82594 82594 1.000 + . Sequence "AP000504.84" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 82709 82709 1.000 + . Sequence "AP000504.85" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 83072 83072 1.000 + . Sequence "AP000504.86" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 83692 83692 1.000 + . Sequence "AP000504.87" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 86264 86264 1.000 + . Sequence "AP000504.88" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 86298 86298 1.000 + . Sequence "AP000504.89" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 87932 87932 1.000 + . Sequence "AP000504.90" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 88912 88912 1.000 + . Sequence "AP000504.91" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 89199 89199 1.000 + . Sequence "AP000504.92" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 89762 89764 1.000 + . Sequence "AP000504.93" ; note "AF129756" ; replace "ca"
AP000504 diffseq conflict 90710 90710 1.000 + . Sequence "AP000504.94" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 91667 91683 1.000 + . Sequence "AP000504.95" ; note "AF129756" ; replace "g"
AP000504 diffseq conflict 91797 91797 1.000 + . Sequence "AP000504.96" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 92165 92165 1.000 + . Sequence "AP000504.97" ; note "SNP in AF129756" ; replace "a"
AP000504 diffseq conflict 93250 93250 1.000 + . Sequence "AP000504.98" ; note "SNP in AF129756" ; replace "c"
AP000504 diffseq conflict 93696 93696 1.000 + . Sequence "AP000504.99" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 93860 93860 1.000 + . Sequence "AP000504.100" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 95451 95451 1.000 + . Sequence "AP000504.101" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 96650 96650 1.000 + . Sequence "AP000504.102" ; note "SNP in AF129756" ; replace "t"
AP000504 diffseq conflict 97273 97274 1.000 + . Sequence "AP000504.103" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504 diffseq conflict 97716 97716 1.000 + . Sequence "AP000504.104" ; note "SNP in AF129756" ; replace "g"
AP000504 diffseq conflict 97827 97827 1.000 + . Sequence "AP000504.105" ; note "SNP in AF129756" ; replace "t" |
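Each record in the feature files above follows the GFF version 2 layout: eight fixed columns (sequence name, source, feature type, start, end, score, strand, frame) followed by `;`-separated attributes. The following is a minimal Python sketch of reading one such record, assuming the whitespace-separated form shown in this example. The function name `parse_diffgff_record` is hypothetical and is not part of EMBOSS.

```python
import re


def parse_diffgff_record(line):
    """Split one GFF 2 record into its eight fixed columns and its attributes.

    Hypothetical helper for illustration only; it assumes the layout of the
    diffseq example output shown in this document.
    """
    fields = line.strip().split(None, 8)
    if len(fields) < 8:
        raise ValueError("not a GFF record: %r" % line)
    seqname, source, feature, start, end, score, strand, frame = fields[:8]

    attributes = {}
    if len(fields) == 9:
        # Attributes look like: Sequence "AP000504.1" ; note "SNP in AF129756"
        for tag, value in re.findall(r'(\w+)\s+"([^"]*)"', fields[8]):
            attributes[tag] = value

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "score": float(score),
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


record = parse_diffgff_record(
    'AP000504 diffseq conflict 847 847 1.000 + . '
    'Sequence "AP000504.1" ; note "SNP in AF129756" ; replace "t"'
)
print(record["start"], record["attributes"]["replace"])
```

The `replace` attribute holds the bases found in the other sequence; it is empty when the difference is an insertion in the sequence the file describes.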
The first line is the title giving the names of the sequences used.
The next two non-blank lines state the positions in each sequence where the detected overlap between them starts.
There then follows a set of reports of the mismatches between the sequences.
Each report consists of 4 or more lines.
This is followed by the equivalent information for the second sequence, but in the reverse order, namely 'Sequence:' line, 'Feature:' lines and line giving the position of the mismatch in the second sequence.
At the end of the report are two non-blank lines giving the positions in each sequence where the detected overlap between them ends.
The last three lines of the report give the counts of SNPs (defined as a change of one nucleotide to one other nucleotide; deletions, insertions and multi-base changes are not counted).
If the input sequences are nucleic acid, the counts of transitions (pyrimidine to pyrimidine or purine to purine) and transversions (pyrimidine to purine, or purine to pyrimidine) are also given.
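The transition/transversion distinction can be illustrated with a short Python sketch. This is not diffseq's own code; the function name `classify_snp` is hypothetical, and only the four DNA bases are handled.

```python
PURINES = {"a", "g"}
PYRIMIDINES = {"c", "t"}


def classify_snp(base_a, base_b):
    """Return 'transition' or 'transversion' for a single-base change.

    Illustrative helper only (not part of EMBOSS); accepts a, c, g, t in
    either case.
    """
    a, b = base_a.lower(), base_b.lower()
    known = PURINES | PYRIMIDINES
    if a not in known or b not in known:
        raise ValueError("expected one of a, c, g, t")
    if a == b:
        raise ValueError("bases are identical, so this is not a SNP")
    # Same chemical class on both sides means a transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


print(classify_snp("a", "g"))  # purine to purine
print(classify_snp("c", "a"))  # pyrimidine to purine
```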
It should be noted that not all features are reported.
The 'source' feature found in all EMBL/GenBank feature table entries is not reported. It covers the whole sequence, so it overlaps every difference found in that sequence and adds nothing to the report. It has therefore been removed from the output report.
The translation information of CDS features is often extremely long and does not add useful information to the report. It has therefore been removed from the output report.
If you run out of memory, use a larger word size.
Using a larger word size increases the distance over which neighbouring mismatches are merged and reported as one event. Thus a word size of 50 will report two single-base differences that are within 50 bases of each other as one mismatch.
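The effect of the word size on grouping can be sketched with a simplified model in Python. This is an assumption-laden illustration, not diffseq's actual algorithm: it assumes that two differences are merged whenever fewer than `word_size` matching bases lie between them, since no exact word match can then separate them. The function name `group_differences` is hypothetical.

```python
def group_differences(positions, word_size=10):
    """Merge single-base difference positions into reported regions.

    Simplified model (an assumption, not EMBOSS code): neighbouring
    differences are merged when fewer than word_size matching bases lie
    between them.
    """
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][1] - 1 < word_size:
            # Too few matching bases since the previous difference: extend it.
            regions[-1][1] = pos
        else:
            regions.append([pos, pos])
    return [tuple(region) for region in regions]


differences = [100, 130, 400]
print(group_differences(differences, word_size=10))
print(group_differences(differences, word_size=50))
```

With a word size of 10 the differences at 100 and 130 stay separate; with a word size of 50 they are reported as one region spanning 100 to 130.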
18th Aug 2000 - Added writing out GFF files of the mismatched regions