biosed
biosed was inspired by the useful UNIX utility sed, which searches for a pattern in text and can replace or delete the found pattern.
If the target subsequence occurs more than once, every instance of the target is replaced.
The target subsequence is not an ambiguity pattern of any kind; it is just a short sequence. A simple string match is performed, and the replacement is made only where the target matches exactly. Matching ignores the case of both the sequence and the target: uppercase and lowercase will both match.
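The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not biosed's actual source: the function name `biosed_like_replace` is hypothetical, and the handling of overlapping matches (left to right, non-overlapping) is an assumption taken from Python's `str.replace`.

```python
def biosed_like_replace(sequence: str, target: str, replacement: str) -> str:
    """Hypothetical helper: replace every literal occurrence of target, ignoring case."""
    if not target:
        raise ValueError("target must not be empty")
    # Upper-casing both sides makes the literal match case-independent and
    # mirrors the uppercase output noted in this document.
    # Assumption: matches are found left to right without overlapping,
    # which is how str.replace behaves; biosed itself may differ.
    return sequence.upper().replace(target.upper(), replacement.upper())


if __name__ == "__main__":
    # Start of the HSFAU sequence, T -> U
    print(biosed_like_replace("ttcctctttc", "T", "U"))
    # Fragment of AMIR_PSEAE around the PPP motif
    print(biosed_like_replace("RQCWPPPEAFD", "ppp", "XXPPPXX"))
```

Because the target is treated as a plain string, characters such as `N` or `X` in the target would match only themselves here, not any base or residue.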
Example 1
Replace all 'T's with 'U's to create an RNA sequence
% biosed tembl:hsfau hsfau.rna -target T -replace U
Replace or delete sequence sections
Example 2
Replace all 'PPP' protein motifs with 'XXPPPXX'
% biosed tsw:AMIR_PSEAE AMIR_PSEAE.pep -target PPP -replace XXPPPXX
Replace or delete sequence sections
Standard (Mandatory) qualifiers (* if not always prompted):
[-sequence] seqall Sequence database USA
-target string Sequence section to match
* -replace string Replacement sequence section
[-outseq] seqout Output sequence USA
Additional (Optional) qualifiers: (none)
Advanced (Unprompted) qualifiers:
-delete toggle Delete the target sequence sections
Associated qualifiers:
"-sequence" associated qualifiers
-sbegin1 integer Start of each sequence to be used
-send1 integer End of each sequence to be used
-sreverse1 boolean Reverse (if DNA)
-sask1 boolean Ask for begin/end/reverse
-snucleotide1 boolean Sequence is nucleotide
-sprotein1 boolean Sequence is protein
-slower1 boolean Make lower case
-supper1 boolean Make upper case
-sformat1 string Input sequence format
-sdbname1 string Database name
-sid1 string Entryname
-ufo1 string UFO features
-fformat1 string Features format
-fopenfile1 string Features file name
"-outseq" associated qualifiers
-osformat2 string Output seq format
-osextension2 string File name extension
-osname2 string Base file name
-osdirectory2 string Output directory
-osdbname2 string Database name to add
-ossingle2 boolean Separate file for each entry
-oufo2 string UFO features
-offormat2 string Features format
-ofname2 string Features file name
-ofdirectory2 string Output directory
General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write standard output
-filter boolean Read standard input, write standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options. More information on associated and general qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report deaths
| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| Standard (Mandatory) qualifiers | | | |
| [-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required |
| -target | Sequence section to match | Any string is accepted | N |
| -replace | Replacement sequence section | Any string is accepted | A |
| [-outseq] (Parameter 2) | Output sequence USA | Writeable sequence | <sequence>.format |
| Additional (Optional) qualifiers | | | |
| (none) | | | |
| Advanced (Unprompted) qualifiers | | | |
| -delete | Delete the target sequence sections | Toggle value Yes/No | No |
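For scripted use, the qualifiers listed above can be assembled into a command line programmatically. The following is a minimal sketch, assuming biosed is installed and on the PATH; the helper name `build_biosed_command` is hypothetical. It also assumes that `-replace` can be omitted when `-delete` is given, which is suggested by `-replace` being marked as not always prompted but is not stated outright here.

```python
import shlex
from typing import List, Optional


def build_biosed_command(
    sequence: str,
    outseq: str,
    target: str,
    replace: Optional[str] = None,
    delete: bool = False,
) -> List[str]:
    """Hypothetical helper: build a biosed argument list from the documented qualifiers."""
    # Require exactly one of the two modes: replace or delete.
    if delete == (replace is not None):
        raise ValueError("give either replace=... or delete=True, not both or neither")
    cmd = ["biosed", sequence, outseq, "-target", target]
    if delete:
        # Assumption: -replace is not needed when -delete is set.
        cmd.append("-delete")
    else:
        cmd += ["-replace", replace]
    # -auto turns off prompts, so the command can run unattended.
    cmd.append("-auto")
    return cmd


if __name__ == "__main__":
    cmd = build_biosed_command("tembl:hsfau", "hsfau.rna", "T", replace="U")
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a target or file name contains special characters.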
ID HSFAU standard; RNA; HUM; 518 BP.
XX
AC X65923;
XX
SV X65923.1
XX
DT 13-MAY-1992 (Rel. 31, Created)
DT 23-SEP-1993 (Rel. 37, Last updated, Version 10)
XX
DE H.sapiens fau mRNA
XX
KW fau gene.
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN [1]
RP 1-518
RA Michiels L.M.R.;
RT ;
RL Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL L.M.R. Michiels, University of Antwerp, Dept of Biochemistry,
RL Universiteisplein 1, 2610 Wilrijk, BELGIUM
XX
RN [2]
RP 1-518
RX MEDLINE; 93368957.
RA Michiels L., Van der Rauwelaert E., Van Hasselt F., Kas K., Merregaert J.;
RT " fau cDNA encodes a ubiquitin-like-S30 fusion protein and is expressed as
RT an antisense sequences in the Finkel-Biskis-Reilly murine sarcoma virus";
RL Oncogene 8:2537-2546(1993).
XX
DR SWISS-PROT; P35544; UBIM_HUMAN.
DR SWISS-PROT; Q05472; RS30_HUMAN.
XX
FH Key Location/Qualifiers
FH
FT source 1..518
FT /chromosome="11q"
FT /db_xref="taxon:9606"
FT /organism="Homo sapiens"
FT /tissue_type="placenta"
FT /clone_lib="cDNA"
FT /clone="pUIA 631"
FT /map="13"
FT misc_feature 57..278
FT /note="ubiquitin like part"
FT CDS 57..458
FT /db_xref="SWISS-PROT:P35544"
FT /db_xref="SWISS-PROT:Q05472"
FT /gene="fau"
FT /protein_id="CAA46716.1"
FT /translation="MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAG
FT APLEDEATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKKKKTG
FT RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT misc_feature 98..102
FT /note="nucleolar localization signal"
FT misc_feature 279..458
FT /note="S30 part"
FT polyA_signal 484..489
FT polyA_site 509
XX
SQ Sequence 518 BP; 125 A; 139 C; 148 G; 106 T; 0 other;
ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc gccaatatgc 60
agctctttgt ccgcgcccag gagctacaca ccttcgaggt gaccggccag gaaacggtcg 120
cccagatcaa ggctcatgta gcctcactgg agggcattgc cccggaagat caagtcgtgc 180
tcctggcagg cgcgcccctg gaggatgagg ccactctggg ccagtgcggg gtggaggccc 240
tgactaccct ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc 300
gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag aagaagaaga 360
agacaggtcg ggctaagcgg cggatgcagt acaaccggcg ctttgtcaac gttgtgccca 420
cctttggcaa gaagaagggc cccaatgcca actcttaagt cttttgtaat tctggctttc 480
tctaataaaa aagccactta gttcagtcaa aaaaaaaa 518
//
ID AMIR_PSEAE STANDARD; PRT; 196 AA.
AC P10932;
DT 01-JUL-1989 (Rel. 11, Created)
DT 01-JUL-1989 (Rel. 11, Last sequence update)
DT 15-DEC-1998 (Rel. 37, Last annotation update)
DE ALIPHATIC AMIDASE REGULATOR.
GN AMIR.
OS Pseudomonas aeruginosa.
OC Bacteria; Proteobacteria; gamma subdivision; Pseudomonas group;
OC Pseudomonas.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=PAC;
RX MEDLINE; 89211409.
RA LOWE N., RICE P.M., DREW R.E.;
RT "Nucleotide sequence of the aliphatic amidase regulator gene (amiR)
RT of Pseudomonas aeruginosa.";
RL FEBS Lett. 246:39-43(1989).
RN [2]
RP CHARACTERIZATION.
RX MEDLINE; 95286483.
RA WILSON S.A., DREW R.E.;
RT "Transcriptional analysis of the amidase operon from Pseudomonas
RT aeruginosa.";
RL J. Bacteriol. 177:3052-3057(1995).
CC -!- FUNCTION: POSITIVE CONTROLLING ELEMENT OF AMIE, THE GENE FOR
CC ALIPHATIC AMIDASE. ACTS AS A TRANSCRIPTIONAL ANTITERMINATION
CC FACTOR. IT IS THOUGHT TO ALLOW RNA POLYMERASE READ THROUGH A RHO-
CC INDEPENDENT TRANSCRIPTION TERMINATOR BETWEEN THE AMIE PROMOTER AND
CC GENE.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; X13776; CAA32023.1; -.
DR PIR; S03884; S03884.
KW Transcription regulation; Activator.
SQ SEQUENCE 196 AA; 21776 MW; 560A8AE3 CRC32;
MSANSLLGSL RELQVLVLNP PGEVSDALVL QLIRIGCSVR QCWPPPEAFD VPVDVVFTSI
FQNGHHDEIA ALLAAGTPRT TLVALVEYES PAVLSQIIEL ECHGVITQPL DAHRVLPVLV
SARRISEEMA KLKQKTEQLQ DRIAGQARIN QAKVLLMQRH GWDEREAHQH LSREAMKRRE
PILKIAQELL GNEPSA
//
The sequence will be in uppercase.
>HSFAU X65923.1 H.sapiens fau mRNA
UUCCUCUUUCUCGACUCCAUCUUCGCGGUAGCUGGGACCGCCGUUCAGUCGCCAAUAUGC
AGCUCUUUGUCCGCGCCCAGGAGCUACACACCUUCGAGGUGACCGGCCAGGAAACGGUCG
CCCAGAUCAAGGCUCAUGUAGCCUCACUGGAGGGCAUUGCCCCGGAAGAUCAAGUCGUGC
UCCUGGCAGGCGCGCCCCUGGAGGAUGAGGCCACUCUGGGCCAGUGCGGGGUGGAGGCCC
UGACUACCCUGGAAGUAGCAGGCCGCAUGCUUGGAGGUAAAGUUCAUGGUUCCCUGGCCC
GUGCUGGAAAAGUGAGAGGUCAGACUCCUAAGGUGGCCAAACAGGAGAAGAAGAAGAAGA
AGACAGGUCGGGCUAAGCGGCGGAUGCAGUACAACCGGCGCUUUGUCAACGUUGUGCCCA
CCUUUGGCAAGAAGAAGGGCCCCAAUGCCAACUCUUAAGUCUUUUGUAAUUCUGGCUUUC
UCUAAUAAAAAAGCCACUUAGUUCAGUCAAAAAAAAAA
>AMIR_PSEAE P10932 ALIPHATIC AMIDASE REGULATOR.
MSANSLLGSLRELQVLVLNPPGEVSDALVLQLIRIGCSVRQCWXXPPPXXEAFDVPVDVV
FTSIFQNGHHDEIAALLAAGTPRTTLVALVEYESPAVLSQIIELECHGVITQPLDAHRVL
PVLVSARRISEEMAKLKQKTEQLQDRIAGQARINQAKVLLMQRHGWDEREAHQHLSREAM
KRREPILKIAQELLGNEPSA