biosed
biosed was inspired by the UNIX utility sed, which searches text for a pattern and can replace or delete each match. biosed does the same for sequences: it searches each input sequence for a target subsequence and replaces or deletes it.
If the target subsequence occurs more than once, every instance is replaced.
The target subsequence is not an ambiguity pattern; it is just a short literal sequence. A simple string match is done, and the replacement is made wherever the target matches exactly. Matching ignores the case of both the sequence and the target, so uppercase and lowercase forms match each other.
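The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not biosed's source code: the function name `biosed_like` is hypothetical, and the sketch assumes that matches are found left to right without overlapping and that the replacement text is also written in uppercase.

```python
def biosed_like(sequence: str, target: str, replace: str = "", delete: bool = False) -> str:
    """Illustrative model of a case-insensitive literal replace (not the real biosed code).

    Assumptions: non-overlapping, left-to-right matching; output is uppercase.
    """
    if not target:
        raise ValueError("target must not be empty")
    # Deleting a target is modelled as replacing it with an empty string.
    replacement = "" if delete else replace.upper()
    # Uppercasing both sides makes the literal match independent of case.
    return sequence.upper().replace(target.upper(), replacement)


if __name__ == "__main__":
    print(biosed_like("ttcctctttc", "T", "U"))          # UUCCUCUUUC
    print(biosed_like("QCWPPPEAFD", "PPP", "XXPPPXX"))  # QCWXXPPPXXEAFD
    print(biosed_like("ACGTacgt", "t", delete=True))    # ACGACG
```

The target is treated as a plain string rather than a pattern, so characters such as `N` match only a literal `N` and never act as ambiguity codes.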
Replace all 'T's with 'U's to create an RNA sequence:

    % biosed tembl:hsfau hsfau.rna -target T -replace U
    Replace or delete sequence sections
Example 2
Replace all 'PPP' protein motifs with 'XXPPPXX':

    % biosed tsw:AMIR_PSEAE AMIR_PSEAE.pep -target PPP -replace XXPPPXX
    Replace or delete sequence sections

       Standard (Mandatory) qualifiers (* if not always prompted):
      [-sequence]          seqall     Sequence database USA
       -target             string     Sequence section to match
    *  -replace            string     Replacement sequence section
      [-outseq]            seqout     Output sequence USA

       Additional (Optional) qualifiers: (none)
       Advanced (Unprompted) qualifiers:
       -delete             toggle     Delete the target sequence sections

       Associated qualifiers:

       "-sequence" associated qualifiers
       -sbegin1            integer    Start of each sequence to be used
       -send1              integer    End of each sequence to be used
       -sreverse1          boolean    Reverse (if DNA)
       -sask1              boolean    Ask for begin/end/reverse
       -snucleotide1       boolean    Sequence is nucleotide
       -sprotein1          boolean    Sequence is protein
       -slower1            boolean    Make lower case
       -supper1            boolean    Make upper case
       -sformat1           string     Input sequence format
       -sdbname1           string     Database name
       -sid1               string     Entryname
       -ufo1               string     UFO features
       -fformat1           string     Features format
       -fopenfile1         string     Features file name

       "-outseq" associated qualifiers
       -osformat2          string     Output seq format
       -osextension2       string     File name extension
       -osname2            string     Base file name
       -osdirectory2       string     Output directory
       -osdbname2          string     Database name to add
       -ossingle2          boolean    Separate file for each entry
       -oufo2              string     UFO features
       -offormat2          string     Features format
       -ofname2            string     Features file name
       -ofdirectory2       string     Output directory

       General qualifiers:
       -auto               boolean    Turn off prompts
       -stdout             boolean    Write standard output
       -filter             boolean    Read standard input, write standard output
       -options            boolean    Prompt for standard and additional values
       -debug              boolean    Write debug output to program.dbg
       -verbose            boolean    Report some/full command line options
       -help               boolean    Report command line options. More
                                      information on associated and general
                                      qualifiers can be found with -help -verbose
       -warning            boolean    Report warnings
       -error              boolean    Report errors
       -fatal              boolean    Report fatal errors
       -die                boolean    Report deaths

| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| Standard (Mandatory) qualifiers | | | |
| [-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required |
| -target | Sequence section to match | Any string is accepted | N |
| -replace | Replacement sequence section | Any string is accepted | A |
| [-outseq] (Parameter 2) | Output sequence USA | Writeable sequence | <sequence>.format |
| Additional (Optional) qualifiers | | | |
| (none) | | | |
| Advanced (Unprompted) qualifiers | | | |
| -delete | Delete the target sequence sections | Toggle value Yes/No | No |

    ID   HSFAU      standard; RNA; HUM; 518 BP.
    XX
    AC   X65923;
    XX
    SV   X65923.1
    XX
    DT   13-MAY-1992 (Rel. 31, Created)
    DT   23-SEP-1993 (Rel. 37, Last updated, Version 10)
    XX
    DE   H.sapiens fau mRNA
    XX
    KW   fau gene.
    XX
    OS   Homo sapiens (human)
    OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
    OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
    XX
    RN   [1]
    RP   1-518
    RA   Michiels L.M.R.;
    RT   ;
    RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
    RL   L.M.R. Michiels, University of Antwerp, Dept of Biochemistry,
    RL   Universiteisplein 1, 2610 Wilrijk, BELGIUM
    XX
    RN   [2]
    RP   1-518
    RX   MEDLINE; 93368957.
    RA   Michiels L., Van der Rauwelaert E., Van Hasselt F., Kas K., Merregaert J.;
    RT   " fau cDNA encodes a ubiquitin-like-S30 fusion protein and is expressed as
    RT   an antisense sequences in the Finkel-Biskis-Reilly murine sarcoma virus";
    RL   Oncogene 8:2537-2546(1993).
    XX
    DR   SWISS-PROT; P35544; UBIM_HUMAN.
    DR   SWISS-PROT; Q05472; RS30_HUMAN.
    XX
    FH   Key             Location/Qualifiers
    FH
    FT   source          1..518
    FT                   /chromosome="11q"
    FT                   /db_xref="taxon:9606"
    FT                   /organism="Homo sapiens"
    FT                   /tissue_type="placenta"
    FT                   /clone_lib="cDNA"
    FT                   /clone="pUIA 631"
    FT                   /map="13"
    FT   misc_feature    57..278
    FT                   /note="ubiquitin like part"
    FT   CDS             57..458
    FT                   /db_xref="SWISS-PROT:P35544"
    FT                   /db_xref="SWISS-PROT:Q05472"
    FT                   /gene="fau"
    FT                   /protein_id="CAA46716.1"
    FT                   /translation="MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAG
    FT                   APLEDEATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKKKKTG
    FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
    FT   misc_feature    98..102
    FT                   /note="nucleolar localization signal"
    FT   misc_feature    279..458
    FT                   /note="S30 part"
    FT   polyA_signal    484..489
    FT   polyA_site      509
    XX
    SQ   Sequence 518 BP; 125 A; 139 C; 148 G; 106 T; 0 other;
         ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc gccaatatgc        60
         agctctttgt ccgcgcccag gagctacaca ccttcgaggt gaccggccag gaaacggtcg       120
         cccagatcaa ggctcatgta gcctcactgg agggcattgc cccggaagat caagtcgtgc       180
         tcctggcagg cgcgcccctg gaggatgagg ccactctggg ccagtgcggg gtggaggccc       240
         tgactaccct ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc       300
         gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag aagaagaaga       360
         agacaggtcg ggctaagcgg cggatgcagt acaaccggcg ctttgtcaac gttgtgccca       420
         cctttggcaa gaagaagggc cccaatgcca actcttaagt cttttgtaat tctggctttc       480
         tctaataaaa aagccactta gttcagtcaa aaaaaaaa                               518
    //

    ID   AMIR_PSEAE     STANDARD;      PRT;   196 AA.
    AC   P10932;
    DT   01-JUL-1989 (Rel. 11, Created)
    DT   01-JUL-1989 (Rel. 11, Last sequence update)
    DT   15-DEC-1998 (Rel. 37, Last annotation update)
    DE   ALIPHATIC AMIDASE REGULATOR.
    GN   AMIR.
    OS   Pseudomonas aeruginosa.
    OC   Bacteria; Proteobacteria; gamma subdivision; Pseudomonas group;
    OC   Pseudomonas.
    RN   [1]
    RP   SEQUENCE FROM N.A.
    RC   STRAIN=PAC;
    RX   MEDLINE; 89211409.
    RA   LOWE N., RICE P.M., DREW R.E.;
    RT   "Nucleotide sequence of the aliphatic amidase regulator gene (amiR)
    RT   of Pseudomonas aeruginosa.";
    RL   FEBS Lett. 246:39-43(1989).
    RN   [2]
    RP   CHARACTERIZATION.
    RX   MEDLINE; 95286483.
    RA   WILSON S.A., DREW R.E.;
    RT   "Transcriptional analysis of the amidase operon from Pseudomonas
    RT   aeruginosa.";
    RL   J. Bacteriol. 177:3052-3057(1995).
    CC   -!- FUNCTION: POSITIVE CONTROLLING ELEMENT OF AMIE, THE GENE FOR
    CC       ALIPHATIC AMIDASE. ACTS AS A TRANSCRIPTIONAL ANTITERMINATION
    CC       FACTOR. IT IS THOUGHT TO ALLOW RNA POLYMERASE READ THROUGH A RHO-
    CC       INDEPENDENT TRANSCRIPTION TERMINATOR BETWEEN THE AMIE PROMOTER AND
    CC       GENE.
    CC   --------------------------------------------------------------------------
    CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
    CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
    CC   the European Bioinformatics Institute. There are no restrictions on its
    CC   use by non-profit institutions as long as its content is in no way
    CC   modified and this statement is not removed. Usage by and for commercial
    CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
    CC   or send an email to license@isb-sib.ch).
    CC   --------------------------------------------------------------------------
    DR   EMBL; X13776; CAA32023.1; -.
    DR   PIR; S03884; S03884.
    KW   Transcription regulation; Activator.
    SQ   SEQUENCE   196 AA;  21776 MW;  560A8AE3 CRC32;
         MSANSLLGSL RELQVLVLNP PGEVSDALVL QLIRIGCSVR QCWPPPEAFD VPVDVVFTSI
         FQNGHHDEIA ALLAAGTPRT TLVALVEYES PAVLSQIIEL ECHGVITQPL DAHRVLPVLV
         SARRISEEMA KLKQKTEQLQ DRIAGQARIN QAKVLLMQRH GWDEREAHQH LSREAMKRRE
         PILKIAQELL GNEPSA
    //

The output sequence is written in uppercase.

    >HSFAU X65923.1 H.sapiens fau mRNA
    UUCCUCUUUCUCGACUCCAUCUUCGCGGUAGCUGGGACCGCCGUUCAGUCGCCAAUAUGC
    AGCUCUUUGUCCGCGCCCAGGAGCUACACACCUUCGAGGUGACCGGCCAGGAAACGGUCG
    CCCAGAUCAAGGCUCAUGUAGCCUCACUGGAGGGCAUUGCCCCGGAAGAUCAAGUCGUGC
    UCCUGGCAGGCGCGCCCCUGGAGGAUGAGGCCACUCUGGGCCAGUGCGGGGUGGAGGCCC
    UGACUACCCUGGAAGUAGCAGGCCGCAUGCUUGGAGGUAAAGUUCAUGGUUCCCUGGCCC
    GUGCUGGAAAAGUGAGAGGUCAGACUCCUAAGGUGGCCAAACAGGAGAAGAAGAAGAAGA
    AGACAGGUCGGGCUAAGCGGCGGAUGCAGUACAACCGGCGCUUUGUCAACGUUGUGCCCA
    CCUUUGGCAAGAAGAAGGGCCCCAAUGCCAACUCUUAAGUCUUUUGUAAUUCUGGCUUUC
    UCUAAUAAAAAAGCCACUUAGUUCAGUCAAAAAAAAAA

    >AMIR_PSEAE P10932 ALIPHATIC AMIDASE REGULATOR.
    MSANSLLGSLRELQVLVLNPPGEVSDALVLQLIRIGCSVRQCWXXPPPXXEAFDVPVDVV
    FTSIFQNGHHDEIAALLAAGTPRTTLVALVEYESPAVLSQIIELECHGVITQPLDAHRVL
    PVLVSARRISEEMAKLKQKTEQLQDRIAGQARINQAKVLLMQRHGWDEREAHQHLSREAM
    KRREPILKIAQELLGNEPSA