freak

 

Function

Residue/base frequency table or plot

Description

freak reads one or more sequences and a set of bases or residues to search for. It moves a window along each sequence and, at each window position, calculates the frequency of those bases or residues within the window. The frequencies are written to a data file or, optionally, plotted.

The default set of letters is 'gc', which gives the combined frequency of 'G' and 'C' bases within each window. The default window is 30 bases and the default step is 1, so the window advances one base at a time.
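The windowed calculation described above can be sketched in Python. This is a minimal illustration, not the EMBOSS source: the function name window_frequencies is hypothetical, and it assumes that matching is case-insensitive, that only complete windows are reported, and that each frequency is the match count divided by the window size. Those assumptions agree with the sample output on this page (first value 0.500000, last position 489 for a 518 base sequence).

```python
def window_frequencies(sequence, letters="gc", window=30, step=1):
    """Return (1-based window start, frequency) pairs for a sequence.

    Illustrative sketch only: assumes case-insensitive matching and
    that only complete windows are reported.
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive integers")
    seq = sequence.lower()
    wanted = set(letters.lower())
    results = []
    # The last complete window starts at len(seq) - window (0-based).
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        hits = sum(1 for ch in chunk if ch in wanted)
        results.append((start + 1, hits / window))
    return results


# First 40 bases of the HSFAU entry used in the usage example.
hsfau_start = "ttcctctttctcgactccatcttcgcggtagctgggaccg"

for position, frequency in window_frequencies(hsfau_start)[:3]:
    print(f"{position:<10d} {frequency:.6f}")
# 1          0.500000
# 2          0.533333
# 3          0.566667
```

Under these assumptions a sequence of N bases yields (N - window) // step + 1 windows when N is at least the window size, and none otherwise.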

Usage

Here is a sample session with freak


% freak tembl:hsfau 
Residue/base frequency table or plot
Residue letters [gc]: 
Output file [hsfau.freak]: 


Command line arguments

   Standard (Mandatory) qualifiers (* if not always prompted):
  [-seqall]            seqall     Sequence database USA
   -letters            string     Residue letters
*  -graph              xygraph    Graph type
*  -outfile            outfile    Output file name

   Additional (Optional) qualifiers:
   -step               integer    Stepping value
   -window             integer    Averaging window

   Advanced (Unprompted) qualifiers:
   -plot               toggle     Produce graphic

   Associated qualifiers:

   "-seqall" associated qualifiers
   -sbegin1             integer    Start of each sequence to be used
   -send1               integer    End of each sequence to be used
   -sreverse1           boolean    Reverse (if DNA)
   -sask1               boolean    Ask for begin/end/reverse
   -snucleotide1        boolean    Sequence is nucleotide
   -sprotein1           boolean    Sequence is protein
   -slower1             boolean    Make lower case
   -supper1             boolean    Make upper case
   -sformat1            string     Input sequence format
   -sdbname1            string     Database name
   -sid1                string     Entryname
   -ufo1                string     UFO features
   -fformat1            string     Features format
   -fopenfile1          string     Features file name

   "-graph" associated qualifiers
   -gprompt             boolean    Graph prompting
   -gtitle              string     Graph title
   -gsubtitle           string     Graph subtitle
   -gxtitle             string     Graph x axis title
   -gytitle             string     Graph y axis title
   -goutfile            string     Output file for non interactive displays
   -gdirectory          string     Output directory

   "-outfile" associated qualifiers
   -odirectory          string     Output directory

   General qualifiers:
   -auto                boolean    Turn off prompts
   -stdout              boolean    Write standard output
   -filter              boolean    Read standard input, write standard output
   -options             boolean    Prompt for standard and additional values
   -debug               boolean    Write debug output to program.dbg
   -verbose             boolean    Report some/full command line options
   -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning             boolean    Report warnings
   -error               boolean    Report errors
   -fatal               boolean    Report fatal errors
   -die                 boolean    Report deaths


Standard (Mandatory) qualifiers          Allowed values                        Default
   [-seqall]      Sequence database USA  Readable sequence(s)                  Required
   (Parameter 1)
   -letters       Residue letters        Any string is accepted                gc
   -graph         Graph type             EMBOSS has a list of known devices,   EMBOSS_GRAPHICS value, or x11
                                         including postscript, ps, hpgl,
                                         hp7470, hp7580, meta, colourps, cps,
                                         xwindows, x11, tektronics, tekt,
                                         tek4107t, tek, none, null, text,
                                         data, xterm, png, xml
   -outfile       Output file name       Output file                           <sequence>.freak

Additional (Optional) qualifiers         Allowed values                        Default
   -step          Stepping value         Any integer value                     1
   -window        Averaging window       Any integer value                     30

Advanced (Unprompted) qualifiers         Allowed values                        Default
   -plot          Produce graphic        Toggle value Yes/No                   No
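
For scripted use, the qualifiers listed above can be assembled into a non-interactive command line. The Python sketch below is one possible way to do this; the helper names build_freak_command and run_freak are hypothetical, and it assumes the freak executable is installed and on the PATH. Only qualifiers documented on this page are used (-letters, -window, -step, -outfile, -auto).

```python
import shlex
import subprocess


def build_freak_command(usa, letters="gc", window=30, step=1, outfile=None):
    """Build an argument list for a non-interactive freak run.

    Hypothetical helper: it only assembles qualifiers documented on
    this page and does not validate the USA or the letters.
    """
    cmd = ["freak", usa,
           "-letters", letters,
           "-window", str(window),
           "-step", str(step)]
    if outfile is not None:
        cmd += ["-outfile", outfile]
    cmd.append("-auto")  # turn off prompts so defaults are accepted
    return cmd


def run_freak(cmd):
    """Run the command; assumes the freak executable is on the PATH."""
    return subprocess.run(cmd, check=True)


cmd = build_freak_command("tembl:hsfau", window=50, step=5,
                          outfile="hsfau.freak")
print(shlex.join(cmd))
# run_freak(cmd)  # uncomment on a system with EMBOSS installed
```

Since this page states that freak always exits with status 0, a calling script may want to check that the output file exists rather than rely on the exit code alone.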

Input file format

freak reads any normal sequence USA (Uniform Sequence Address).

Input files for usage example

'tembl:hsfau' is a sequence entry in the example nucleic acid database 'tembl'

Database entry: tembl:hsfau

ID   HSFAU      standard; RNA; HUM; 518 BP.
XX
AC   X65923;
XX
SV   X65923.1
XX
DT   13-MAY-1992 (Rel. 31, Created)
DT   23-SEP-1993 (Rel. 37, Last updated, Version 10)
XX
DE   H.sapiens fau mRNA
XX
KW   fau gene.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-518
RA   Michiels L.M.R.;
RT   ;
RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   L.M.R. Michiels, University of Antwerp, Dept of Biochemistry,
RL   Universiteisplein 1, 2610 Wilrijk, BELGIUM
XX
RN   [2]
RP   1-518
RX   MEDLINE; 93368957.
RA   Michiels L., Van der Rauwelaert E., Van Hasselt F., Kas K., Merregaert J.;
RT   " fau cDNA encodes a ubiquitin-like-S30 fusion protein and is expressed as
RT   an antisense sequences in the Finkel-Biskis-Reilly murine sarcoma virus";
RL   Oncogene 8:2537-2546(1993).
XX
DR   SWISS-PROT; P35544; UBIM_HUMAN.
DR   SWISS-PROT; Q05472; RS30_HUMAN.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..518
FT                   /chromosome="11q"
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /tissue_type="placenta"
FT                   /clone_lib="cDNA"
FT                   /clone="pUIA 631"
FT                   /map="13"
FT   misc_feature    57..278
FT                   /note="ubiquitin like part"
FT   CDS             57..458
FT                   /db_xref="SWISS-PROT:P35544"
FT                   /db_xref="SWISS-PROT:Q05472"
FT                   /gene="fau"
FT                   /protein_id="CAA46716.1"
FT                   /translation="MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAG
FT                   APLEDEATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKKKKTG
FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT   misc_feature    98..102
FT                   /note="nucleolar localization signal"
FT   misc_feature    279..458
FT                   /note="S30 part"
FT   polyA_signal    484..489
FT   polyA_site      509
XX
SQ   Sequence 518 BP; 125 A; 139 C; 148 G; 106 T; 0 other;
     ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc gccaatatgc        60
     agctctttgt ccgcgcccag gagctacaca ccttcgaggt gaccggccag gaaacggtcg       120
     cccagatcaa ggctcatgta gcctcactgg agggcattgc cccggaagat caagtcgtgc       180
     tcctggcagg cgcgcccctg gaggatgagg ccactctggg ccagtgcggg gtggaggccc       240
     tgactaccct ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc       300
     gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag aagaagaaga       360
     agacaggtcg ggctaagcgg cggatgcagt acaaccggcg ctttgtcaac gttgtgccca       420
     cctttggcaa gaagaagggc cccaatgcca actcttaagt cttttgtaat tctggctttc       480
     tctaataaaa aagccactta gttcagtcaa aaaaaaaa                               518
//

Output file format

Output files for usage example

File: hsfau.freak

FREAK of HSFAU from 1 to 518 Window 30 Step 1

1          0.500000
2          0.533333
3          0.566667
4          0.533333
5          0.533333
6          0.566667
7          0.566667
8          0.566667
9          0.600000
10         0.633333
11         0.633333
12         0.666667
13         0.666667
14         0.666667
15         0.666667
16         0.633333
17         0.666667
18         0.633333
19         0.633333
20         0.633333
21         0.666667
22         0.666667
23         0.700000
24         0.733333
25         0.700000
26         0.666667
27         0.633333
28         0.600000
29         0.566667
30         0.600000
31         0.633333
32         0.600000
33         0.600000
34         0.633333
35         0.600000
36         0.600000
37         0.566667
38         0.566667
39         0.533333
40         0.533333
41         0.500000
42         0.500000
43         0.500000
44         0.500000
45         0.533333
46         0.566667
47         0.566667
48         0.600000


  [Part of this file has been deleted for brevity]

439        0.433333
440        0.400000
441        0.366667
442        0.333333
443        0.333333
444        0.300000
445        0.333333
446        0.366667
447        0.400000
448        0.366667
449        0.333333
450        0.300000
451        0.333333
452        0.333333
453        0.333333
454        0.333333
455        0.300000
456        0.300000
457        0.300000
458        0.300000
459        0.300000
460        0.266667
461        0.266667
462        0.233333
463        0.233333
464        0.266667
465        0.300000
466        0.333333
467        0.300000
468        0.333333
469        0.333333
470        0.333333
471        0.333333
472        0.366667
473        0.333333
474        0.333333
475        0.333333
476        0.300000
477        0.300000
478        0.300000
479        0.333333
480        0.333333
481        0.300000
482        0.300000
483        0.266667
484        0.266667
485        0.266667
486        0.266667
487        0.266667
488        0.266667
489        0.266667

The output consists of a title line followed by two columns: the position of the start of the window, and the frequency within that window of the bases or residues being searched for.
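
A file in this layout can be read back with a few lines of Python. The sketch below is an illustration under stated assumptions: the function name parse_freak_output is hypothetical, it treats the first non-numeric line as the title, and it assumes a single sequence per file (with several input sequences the file may contain more than one title line, which this sketch does not separate).

```python
def parse_freak_output(text):
    """Split freak data-file text into a title and (position, frequency) rows.

    Illustrative sketch: the first line that is not a data row is taken
    as the title; any later non-data lines are ignored.
    """
    title = ""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 2 and fields[0].isdigit():
            rows.append((int(fields[0]), float(fields[1])))
        elif not title:
            title = line.strip()
    return title, rows


sample = """FREAK of HSFAU from 1 to 518 Window 30 Step 1

1          0.500000
2          0.533333
3          0.566667
"""

title, rows = parse_freak_output(sample)
print(title)
print(rows[0])
# FREAK of HSFAU from 1 to 518 Window 30 Step 1
# (1, 0.5)
```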

Data files

None.

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

None.

See also

Program name   Description
backtranseq    Back translate a protein sequence
banana         Bending and curvature plot in B-DNA
btwisted       Calculates the twisting in a B-DNA sequence
chaos          Create a chaos game representation plot for a sequence
charge         Protein charge plot
checktrans     Reports STOP codons and ORF statistics of a protein
compseq        Count composition of dimer/trimer/etc words in a sequence
dan            Calculates DNA RNA/DNA melting temperature
emowse         Protein identification by mass spectrometry
iep            Calculates the isoelectric point of a protein
isochore       Plots isochores in large DNA sequences
mwcontam       Shows molwts that match across a set of files
mwfilter       Filter noisy molwts from mass spec output
octanol        Displays protein hydropathy
pepinfo        Plots simple amino acid properties in parallel
pepstats       Protein statistics
pepwindow      Displays protein hydropathy
pepwindowall   Displays protein hydropathy of a set of sequences
sirna          Finds siRNA duplexes in mRNA
wordcount      Counts words of a specified size in a DNA sequence

Author(s)

Alan Bleasby (ajb © ebi.ac.uk)
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

History

Written (Aug 2000) Alan Bleasby

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None