embed program

Documentation for the embed program is below, with links to related programs in the "see also" section.

{   version = 1.28; (* of embed.p 2007 Nov 15}

(* begin module describe.embed *)
(*
name
   embed: embed an aligned set of DNA sequences into random sequences

synopsis
   embed(inst: in, book: in, mkvseqs: in, ranbook: in, embedp: in,
         embedbk: out, output: out)

files
   inst:  delila instructions of the form 'get from 56 -5 to 56 +10;'

   book:  the book generated by delila using inst

   mkvseqs: random sequence output from the markov program

   ranbook: book made from random sequences using the makebk program; either
            mkvseqs or ranbook must contain sequence.  If both contain
            sequence, then mkvseqs will be used as the source for random
            sequences.

   embedp:  parameters to control the program.  The file must contain the
            following parameters, one per line:

      parameterversion:  The version number of the program.  This allows the
         user to be warned if an old parameter file is used.

      alignmenttype:  The type of alignment to use. f: first base, i: inst,
         b: book alignment
        
         'b' is to be used when 'default coordinate zero;' is used in the
         inst file, resulting in a book whose coordinates do not match the
         inst coordinates. 'i' is to be used when the book contains a normal
         coordinate system corresponding to the inst file. 'f' simply aligns
         by the first base in the book.  See alist.p for more details on
         alignmenttype.

      InFrom, InTo: the from-to range of the input sequences to be used.

      OutFrom, OutTo: the from-to range of the sequences to output.
         This includes the InFrom to InTo range AND the random sequences.

   embedbk: book created by the program. Contains the sequences embedded
            within random sequences to the specified range.

   output: messages to the user
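
   The embedp layout described above can be read with a few lines of
   code.  The following Python sketch is illustrative only and is not
   part of embed.p.  It assumes that each parameter line begins with
   its value(s) followed by a free-text comment, in the order
   parameterversion, alignmenttype, InFrom InTo, OutFrom OutTo.  The
   names parse_embedp and EmbedParams are hypothetical.

```python
from typing import NamedTuple


class EmbedParams(NamedTuple):
    """Values read from an embedp file (hypothetical container)."""
    parameterversion: float
    alignmenttype: str
    in_from: int
    in_to: int
    out_from: int
    out_to: int


def parse_embedp(text):
    """Parse embedp text, assuming values come first on each line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 4:
        raise ValueError("embedp needs at least 4 parameter lines")
    version = float(lines[0].split()[0])
    alignmenttype = lines[1].split()[0]
    if alignmenttype not in ("f", "i", "b"):
        raise ValueError("alignmenttype must be f, i or b")
    # Only the leading tokens are values; the rest of the line is comment.
    in_from, in_to = (int(tok) for tok in lines[2].split()[:2])
    out_from, out_to = (int(tok) for tok in lines[3].split()[:2])
    return EmbedParams(version, alignmenttype, in_from, in_to, out_from, out_to)


# Assumed example file contents; the first two lines are made up for
# illustration, the last two follow the example given in this page.
example = "\n".join([
    "1.28    parameterversion",
    "f       alignmenttype: f, i or b",
    "-10 10  InFrom, InTo: range of input sequences to be used",
    "-30 30  OutFrom, OutTo: range of the sequences to output",
])
params = parse_embedp(example)
print(params.alignmenttype, params.in_from, params.in_to,
      params.out_from, params.out_to)
```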

description

   Embed embeds a given set of aligned sequences into random sequences
   having a specified range.  If there is an incomplete sequence in
   the region to be embedded, it is filled in with random sequences as
   well.

   This allows one to destroy a pattern in the aligned sequences, so
   that the sequences can be realigned to find other patterns nearby.

   The parameters OutFrom, InFrom, InTo, OutTo in embedp set the range
   over which to do the embedding.  In order for the program to function
   correctly, the following must be true: OutFrom <= InFrom <= InTo <=
   OutTo.  The sequence from InFrom to InTo is not changed, and
   random sequence is filled in around it from OutFrom on the left to
   OutTo on the right.  See example below.

   If the original sequence is longer than the range OutFrom to OutTo
   then the book will contain the embedded sequence with original
   sequence on either side of the random sequence.
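
   The embedding rule for a single sequence can be sketched as
   follows.  This is a minimal Python illustration, not the code of
   embed.p.  It assumes consecutive integer coordinates that include
   zero and an original sequence that overlaps the output range.  The
   function name embed_one is hypothetical, and lower case is used for
   the random bases only to make them visible.

```python
def embed_one(original, first_coord, in_from, in_to, out_from, out_to,
              random_bases):
    """Return (embedded sequence, coordinate of its first base)."""
    if not (out_from <= in_from <= in_to <= out_to):
        raise ValueError("need OutFrom <= InFrom <= InTo <= OutTo")
    rand = iter(random_bases)
    last_coord = first_coord + len(original) - 1
    # Cover the output range plus any original sequence hanging outside it.
    start = min(first_coord, out_from)
    stop = max(last_coord, out_to)
    result = []
    for coord in range(start, stop + 1):
        has_original = first_coord <= coord <= last_coord
        inside_input = in_from <= coord <= in_to
        outside_output = coord < out_from or coord > out_to
        if has_original and (inside_input or outside_output):
            # Kept: the InFrom..InTo region and anything beyond OutFrom..OutTo.
            result.append(original[coord - first_coord])
        else:
            # Replaced or missing: fill with the next random base.
            result.append(next(rand))
    return "".join(result), start


# Original covers -3..3; it is kept from -2 to 2 and padded out to -4..4.
print(embed_one("TTACGTT", -3, -2, 2, -4, 4, "acgt"))
# Original covers -5..5; the bases at -5 and +5 lie outside the output
# range and are copied unchanged.
print(embed_one("CCTTACGTTCC", -5, -2, 2, -4, 4, "acgt"))
```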

   The program stores the random sequence as a string and then uses it
   base by base until there is no more in the string. Then it reads
   another string of random sequence.  In this way, none of the random
   sequence is "thrown away".

   If the program finds the end of mkvseqs or ranbook before it has
   embedded all the sequences, it gives a message that it is out of
   random sequence and halts.  Why doesn't the program reuse the
   random sequence?  This is not a good idea because the embedded
   sequences are designed to be fed into malign, and malign would pick
   up on this reused sequence and find unnatural sequence
   conservation.
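
   The buffering behavior described above can be sketched in Python.
   This is an assumption-laden illustration rather than the actual
   implementation: the class RandomBaseSource and the exception
   OutOfRandomSequence are hypothetical names, and the random strings
   are passed in as a plain list instead of being read from mkvseqs
   or ranbook.

```python
class OutOfRandomSequence(Exception):
    """Raised when no random sequence is left (hypothetical name)."""


class RandomBaseSource:
    """Hand out random bases one at a time, never reusing any."""

    def __init__(self, strings):
        self._strings = iter(strings)
        self._buffer = ""
        self._pos = 0

    def next_base(self):
        # Refill only when the current string is fully used, so that
        # no random sequence is thrown away.
        while self._pos >= len(self._buffer):
            try:
                self._buffer = next(self._strings)
            except StopIteration:
                # Halt instead of starting over from the first string.
                raise OutOfRandomSequence("out of random sequence") from None
            self._pos = 0
        base = self._buffer[self._pos]
        self._pos += 1
        return base


source = RandomBaseSource(["acg", "t"])
used = "".join(source.next_base() for _ in range(4))
print(used)
try:
    source.next_base()
    ran_out = False
except OutOfRandomSequence:
    ran_out = True
print(ran_out)
```

   Raising an error at the end of the input, rather than cycling back
   to the first string, mirrors the reason given above: reused random
   sequence would look like conservation to malign.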

   Aligned sequences can be viewed with the alist program.

   The random sequences are generated by the markov program.  They can
   be read from either mkvseqs or ranbook.  mkvseqs is generated
   directly by markov with a given composition and length.  ranbook
   can be made using the makebk program.  If both files contain
   sequence, mkvseqs is used.

   The output of this program is designed to be fed into the malign
   program for multiple alignment.

examples 

   With the following parameters from embedp the sequence would be embedded
   as shown below.

   -10 10  InFrom, InTo: range of input sequences to be used
   -30 30  OutFrom, OutTo: range of the sequences to output
   
   original:
    -----|-------------------<---------0--------->-------------------|-----
        -30                 -10                 +10                 +30
       OutFrom              InFrom              InTo               OutTo
   
   embedded:
         ********************<---------0--------->********************
        -30     random      -10     original    +10     random      +30
                sequence            sequence            sequence

   Note that if there is any sequence in the original alignment outside
   the range OutFrom to OutTo, it will be copied to the embedbk.
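
   Because the program halts when it runs out of random sequence, it
   can help to estimate how much is needed before running markov.
   The following Python sketch does the arithmetic for the example
   above.  The function name random_bases_needed is hypothetical, and
   the result is only a lower bound: sequences that are incomplete
   within InFrom to InTo consume additional random bases.

```python
def random_bases_needed(n_sequences, in_from, in_to, out_from, out_to):
    """Minimum number of random bases for n complete sequences."""
    if not (out_from <= in_from <= in_to <= out_to):
        raise ValueError("need OutFrom <= InFrom <= InTo <= OutTo")
    left = in_from - out_from    # bases from OutFrom up to InFrom - 1
    right = out_to - in_to       # bases from InTo + 1 up to OutTo
    return n_sequences * (left + right)


# One sequence with the example parameters: 20 bases on each side.
print(random_bases_needed(1, -10, 10, -30, 30))    # 40
# One hundred such sequences.
print(random_bases_needed(100, -10, 10, -30, 30))  # 4000
```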

documentation
 
see also

   alist.p, markov.p, makebk.p, malign.p
 
author

   Elaine Bucheimer
 
bugs

   The program cannot handle sequences longer than dnamax.  This is a fixable
   bug.

   A possible future addition to the program would be to allow the user to
   specify if they want the old sequence hanging around or if the sequence
   should be chopped outside the OutFrom and OutTo coordinates.

   It appears that the 'i' option does not embed correctly.  The resulting
   book does not have the advertised coordinates.  A temporary solution is to
   use the 'f' option with appropriate ranges.

technical notes
 
*)
(* end module describe.embed *)
{This manual page was created by makman 1.44}
{created by htmlink 1.52}