Format

This command is used in a data block to define the format of the character matrix. The correct usage is

format datatype=<name> ... <parameter>=<option>;

The format command must be the second command in a data block, immediately after the dimensions command. The following provides an example of the proper use of this command:

begin data;
   dimensions ntax=4 nchar=10;
   format datatype=dna gap=-;
   matrix
   taxon_1 AACGATTCGT
   taxon_2 AAGGAT--CA
   taxon_3 AACGACTCCT
   taxon_4 AAGGATTCCT
   ;
end;

Here, the format command tells MrBayes to expect a matrix with DNA characters and with gaps coded as "-".

The following are valid options for format:

Datatype -- This parameter MUST BE INCLUDED in the format command. Moreover, it must be the first parameter in the line. The datatype parameter specifies what type of characters are in the matrix. The following are valid options:

Datatype = Dna: DNA states (A,C,G,T,R,Y,M,K,S,W,H,B,V,D,N)
Datatype = Rna: RNA states (A,C,G,U,R,Y,M,K,S,W,H,B,V,D,N)
Datatype = Protein: Amino acid states (A,R,N,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V)
Datatype = Restriction: Restriction site (0,1) states
Datatype = Standard: Morphological (0,1) states
Datatype = Continuous: Real number valued states
Datatype = Mixed(<type>:<range>,...,<type>:<range>): A mixture of the above datatypes. For example, "datatype=mixed(dna:1-100,protein:101-200)" would specify a mixture of DNA and amino acid characters with the DNA characters occupying the first 100 sites and the amino acid characters occupying the last 100 sites.
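
To make the structure of the mixed specifier concrete, here is a minimal Python sketch that splits such a string into its partitions. It is only an illustration of the syntax described above, not MrBayes's own parser; the function name parse_mixed is hypothetical.

```python
import re

def parse_mixed(spec):
    """Split 'mixed(type:start-end,...)' into (type, start, end) tuples."""
    match = re.fullmatch(r"mixed\((.*)\)", spec.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a mixed datatype specifier: {spec!r}")
    partitions = []
    for part in match.group(1).split(","):
        # Each part looks like "dna:1-100": a datatype, a colon, a site range.
        datatype, _, site_range = part.partition(":")
        start, _, end = site_range.partition("-")
        partitions.append((datatype.strip().lower(), int(start), int(end)))
    return partitions

print(parse_mixed("mixed(dna:1-100,protein:101-200)"))
# prints [('dna', 1, 100), ('protein', 101, 200)]
```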

Interleave -- This parameter specifies whether the data matrix is in interleaved format. The valid options are "Yes" or "No", with "No" as the default. An interleaved matrix looks like

format datatype=dna gap=- interleave=yes;
matrix
taxon_1 AACGATTCGT
taxon_2 AAGGAT--CA
taxon_3 AACGACTCCT
taxon_4 AAGGATTCCT

taxon_1 CCTGGTAC
taxon_2 CCTGGTAC
taxon_3 ---GGTAG
taxon_4 ---GGTAG
;
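
An interleaved matrix carries the same information as a sequential one: the pieces for each taxon are simply joined in the order the blocks appear. The Python sketch below shows that idea on the matrix rows above. It assumes that each row is a taxon name followed by its characters and that names contain no whitespace; the function name deinterleave is hypothetical, and this is not how MrBayes itself reads the file.

```python
def deinterleave(lines):
    """Concatenate each taxon's pieces across the interleaved blocks."""
    sequences = {}  # dicts keep insertion order, so taxon order is preserved
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line separating two blocks
        name, chunk = fields[0], "".join(fields[1:])
        sequences[name] = sequences.get(name, "") + chunk
    return sequences

rows = [
    "taxon_1 AACGATTCGT",
    "taxon_2 AAGGAT--CA",
    "taxon_3 AACGACTCCT",
    "taxon_4 AAGGATTCCT",
    "",
    "taxon_1 CCTGGTAC",
    "taxon_2 CCTGGTAC",
    "taxon_3 ---GGTAG",
    "taxon_4 ---GGTAG",
]
for name, seq in deinterleave(rows).items():
    print(name, seq)
```

Each taxon ends up with 18 characters, so the corresponding dimensions command for this matrix would need nchar=18.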

Gap -- This parameter specifies the format for gaps. Note that the gap character can only be a single character and that it cannot correspond to a standard state (e.g., A,C,G,T,R,Y,M,K,S,W,H,B,V,D,N for nucleotide data).

Missing -- This parameter specifies the format for missing data. Note that the missing character can only be a single character and that it cannot correspond to a standard state (e.g., A,C,G,T,R,Y,M,K,S,W,H,B,V,D,N for nucleotide data). Setting this parameter is often unnecessary because many data types, such as nucleotide or amino acid, already have a missing character specified. However, for morphological or restriction site data, "missing=?" is often used to specify ambiguity or unobserved data.
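
The two restrictions shared by the gap and missing parameters (a single character, and no clash with a state symbol) can be expressed as a short check. The Python sketch below uses the nucleotide state list given above. Treating lower-case letters as clashing with their upper-case states is an assumption made for this example, and the function name is hypothetical.

```python
# Nucleotide state symbols as listed for datatype=dna above.
NUCLEOTIDE_STATES = set("ACGTRYMKSWHBVDN")

def is_valid_symbol(symbol, states=NUCLEOTIDE_STATES):
    """Return True if symbol could serve as a gap or missing character."""
    # Must be exactly one character and must not be a state symbol.
    # Assumption: case is ignored when comparing against the states.
    return len(symbol) == 1 and symbol.upper() not in states

print(is_valid_symbol("-"))   # prints True
print(is_valid_symbol("N"))   # prints False
print(is_valid_symbol("--"))  # prints False
```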

Matchchar -- This parameter specifies the matching character for the matrix. For example,

format datatype=dna gap=- matchchar=.;
matrix
taxon_1 AACGATTCGT
taxon_2 ..G...--CA
taxon_3 .....C..C.
taxon_4 ..G.....C.
;

is equivalent to

format datatype=dna gap=-;
matrix
taxon_1 AACGATTCGT
taxon_2 AAGGAT--CA
taxon_3 AACGACTCCT
taxon_4 AAGGATTCCT
;
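
The equivalence between the two matrices can be checked mechanically. The Python sketch below replaces every matching character with the state found at the same position in the first taxon, which is the behaviour the example above implies. It is an illustration only, and the function name expand_matchchar is hypothetical.

```python
def expand_matchchar(rows, matchchar="."):
    """Replace matchchar with the first taxon's state at the same position."""
    reference = rows[0][1]  # sequence of the first taxon
    expanded = []
    for name, seq in rows:
        full = "".join(
            ref if ch == matchchar else ch
            for ch, ref in zip(seq, reference)
        )
        expanded.append((name, full))
    return expanded

rows = [
    ("taxon_1", "AACGATTCGT"),
    ("taxon_2", "..G...--CA"),
    ("taxon_3", ".....C..C."),
    ("taxon_4", "..G.....C."),
]
for name, seq in expand_matchchar(rows):
    print(name, seq)
```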

The only non-standard NEXUS format options are the "mixed", "restriction", "standard" and "continuous" datatypes. Hence, if you use any of these datatype specifiers, a program like PAUP* or MacClade will report an error (as it should, because MrBayes is not strictly NEXUS compliant).