The ASPEX programs use the ``TCL'' library for reading and parsing parameter files. TCL is a simple, flexible scripting language that is in the public domain, and is available for many different systems. The same parameter file can generally be used for all of the ASPEX programs, with each program extracting just the information it needs and ignoring the rest.
All the ASPEX programs accept the following command-line parameters:
-V
Reports the ASPEX release number for this program.
-v
Selects more verbose output. Multiple -v options can be specified for increasing levels of verbosity.
-q
Selects ``quiet'' output: suppresses warnings about Mendelian incompatibilities in marker data files.
-c cmd
Executes the specified TCL command.
-f file
Executes TCL commands from the specified file.
The programs read parameters either from the command line or from separate parameter files. Multiple -c and -f options are evaluated in order from left to right.
Here is a sample TCL parameter file:
# The number of marker loci
set nloc 5
# Marker names
set loc { l1 l2 l3 l4 l5 }
# Distances between markers
set dist { 0.1 0.1 0.1 0.1 0.1 0.1 }
# Sibling recurrence risk ratio
set risk 2.0
# Identity by descent probabilities: additive model
set z "[expr 0.25/$risk] 0.50 [expr 0.50-0.25/$risk]"
Parameters are specified by name (as in ``set nloc 5''). Comments are preceded by the ``#'' character. The TCL interpreter can perform basic math, as shown in the last line.
Lists of values can be grouped using braces or quotes. List elements are separated by blanks, not commas. Lists can also be split across several lines, for example:
set dist {
0.1 0.1 0.1 0.1 0.1
}
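The two grouping styles differ when a value is computed rather than typed. As a brief sketch relying only on standard TCL behavior: braces keep their contents literal, while quotes allow $variable and [command] substitution. The variable risk is the same helper variable used in the sample file; the list command is standard TCL and is shown only as an alternative spelling.

```tcl
set risk 2.0

# Braces: contents are taken literally, so use them for plain lists.
set loc { l1 l2 l3 l4 l5 }

# Quotes: $risk and [expr ...] are substituted before the value is stored.
set z "[expr {0.25/$risk}] 0.50 [expr {0.50 - 0.25/$risk}]"

# The same three-element list built with the standard list command.
set z [list [expr {0.25/$risk}] 0.50 [expr {0.50 - 0.25/$risk}]]

# With braces, the bracketed text would NOT be evaluated:
# set z { [expr 0.25/$risk] 0.50 [expr 0.50-0.25/$risk] }   ;# stored as literal characters
```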
The ASPEX programs will also use default parameter files if present.
The aspexrc1 file in the current directory, if it exists, is processed before any command-line parameters, and the aspexrc2 file, if it exists, is processed after the command line. The names and paths for these parameter files can be changed by setting the ASPEXRC1 and ASPEXRC2 system environment variables.
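One plausible way to use the two default files is to keep overridable defaults in aspexrc1 and settings that must always win in aspexrc2. The sketch below assumes that a later set simply replaces an earlier value, which is how TCL variables normally behave; the particular values are illustrative only.

```tcl
# --- aspexrc1: read first, so the command line can still override these ---
set max_step 0.02
set most_likely 0

# --- aspexrc2: read last, so this replaces any earlier most_likely setting ---
set most_likely 1
```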
The following parameters are common to all the ASPEX programs:
nloc
An integer: the number of marker loci. If not specified explicitly, it is determined from the loc parameter.
loc
A list of nloc strings identifying the markers. The markers should be listed in map order along the chromosome.
A list of strings identifying families or individuals whose data should be left out of analyses. Specify an individual using a string of the form ``X.Y'' where X is the family ID and Y is the person's ID. To omit an entire family, specify ``X.*'' where X is the family ID.
A string: the allele identifier used for missing data. The default is ``0''.
An integer: the width of the family identifiers, in characters, to use when formatting tabular output. The default is 2.
An integer: the width of the person identifiers, in characters, to use when formatting tabular output. The default is 2.
An integer: the width of the allele names, in characters, to use when formatting tabular output. The default is 2.
An integer: the width of marker names, in characters, to use when formatting tabular output. The default is 10.
A boolean value: indicates whether partially typed loci (one allele known, one blank) should be counted or treated as both-blank. The default is 1 or true. The sib_tdt program always discards partial genotypes.
A boolean value: indicates if the allele data is for the sex chromosomes, as opposed to autosomes. The default is false (autosomal).
A string: the list of character values for the affected status field that are interpreted as ``affected''. The default is ``YyTt2''.
A string: the list of character values for the affected status field that are interpreted as ``unaffected''. The default is ``NnFf1''.
A string: the list of character values for the affected status field that are interpreted as ``unknown''. The default is ``Uu?0''.
Programs that generate exclusion maps (sib_ibd and sib_phase) use the following additional parameters:
dist
A list of nloc+1 map distances between all the markers, including distances from the end markers to the corresponding telomere. All map distances are specified in Morgans.
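Since parameter files are TCL scripts, a map published in centimorgans can be converted in place. The following is a sketch only: the variable cM and its values are made up for illustration, and the length check is a convenience for catching miscounted lists, not something the ASPEX programs are documented to require in this form.

```tcl
set loc { l1 l2 l3 l4 l5 }

# Hypothetical interval lengths in centimorgans:
# telomere-l1, l1-l2, l2-l3, l3-l4, l4-l5, l5-telomere.
set cM { 12.0 8.5 10.0 10.0 7.3 15.0 }

# Convert to Morgans (1 Morgan = 100 cM).
set dist {}
foreach d $cM {
    lappend dist [expr {$d / 100.0}]
}

# There must be one more distance than there are markers.
if {[llength $dist] != [llength $loc] + 1} {
    error "dist needs [expr {[llength $loc] + 1}] entries, got [llength $dist]"
}
```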
These parameters are similar to the dist parameter, but specify sex-specific recombination maps.
Specifies the mapping function for recombination fractions. Valid values are ``Kosambi'' or ``Haldane''. The default is ``Kosambi''.
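For reference, the two mapping functions in their usual textbook form convert a map distance d (in Morgans) to a recombination fraction theta: Haldane's is theta = (1 - exp(-2d))/2 and Kosambi's is theta = tanh(2d)/2. The procedures below are a sketch for checking values by hand; the names haldane and kosambi are hypothetical and are not ASPEX commands, and nothing here is taken from the ASPEX source.

```tcl
# Haldane: assumes no crossover interference.
proc haldane {d} {
    return [expr {0.5 * (1.0 - exp(-2.0 * $d))}]
}

# Kosambi: allows for some crossover interference.
proc kosambi {d} {
    return [expr {0.5 * tanh(2.0 * $d)}]
}

# Recombination fraction across a 0.1 Morgan interval under each function.
puts "Haldane: [haldane 0.1]"
puts "Kosambi: [kosambi 0.1]"
```

For short intervals both functions give values close to d itself; they diverge as d grows.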
z
A list of three numbers: the probabilities of two siblings being identical by descent for 0, 1, or 2 alleles, given that they are both affected. For sex-linked data, the list should consist of only two numbers, since identity by descent is only calculated for the maternal alleles. These values are only used when most_likely is false.
max_step
The maximum gap to leave between data points interpolated between markers in the lod map, in units of map distance. The default is 0.01 Morgans.
A flag indicating if the gap size between map points should be fixed at max_step, or whether it should be allowed to vary between markers. The default is false, which guarantees that there will be a data point at every marker position.
most_likely
A flag indicating if a maximum likelihood calculation of the sharing at each locus should be done. The default is false (don't do the calculation). When this flag is set, for each marker position or point along the map, the programs will determine the set of z values that give the highest LOD score at that position. The corresponding percent sharing and maximized LOD scores are reported.
A flag indicating if the maximum likelihood calculation should fit to a linear model, or to a two-parameter model over all possible Z values. The default is true (i.e., use just a linear model). At present, the two-parameter model does not use a ``possible triangle'' constraint.
no_Dv
A flag indicating if maximum likelihood calculations with a linear model should use a model with no dominance variance. The default is false. When false, maximum likelihood calculations assume the following ``multiplicative'' model for z values:
z[2] = y^2
z[1] = 2*y*(1-y)
z[0] = (1-y)^2
where y is the sharing at this locus. If no_Dv is true, then this model is replaced by an additive model, where z[1] is fixed at 0.5:
z[2] = y-0.25
z[1] = 0.5
z[0] = 0.75-y
In terms of the sibling recurrence risk ratio, lambda, the multiplicative model has the form:
z[2] = 1 + 0.25/lambda - 1/sqrt(lambda)
z[1] = 1/sqrt(lambda) - 0.5/lambda
z[0] = 0.25/lambda
and the additive model has the form:
z[2] = 0.5 - 0.25/lambda
z[1] = 0.5
z[0] = 0.25/lambda
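Because parameter files are TCL scripts, the fixed z values for either model can be computed from lambda rather than typed by hand. The procedures below are a sketch that transcribes the formulas above; the names z_mult and z_add are hypothetical, and each returns the list in the order z[0] z[1] z[2], matching the sample parameter file.

```tcl
# Multiplicative model: z values from the sibling recurrence risk ratio.
proc z_mult {lambda} {
    set z0 [expr {0.25 / $lambda}]
    set z1 [expr {1.0 / sqrt($lambda) - 0.5 / $lambda}]
    set z2 [expr {1.0 + 0.25 / $lambda - 1.0 / sqrt($lambda)}]
    return [list $z0 $z1 $z2]
}

# Additive model (no dominance variance): z[1] is fixed at 0.5.
proc z_add {lambda} {
    set z0 [expr {0.25 / $lambda}]
    set z2 [expr {0.5 - 0.25 / $lambda}]
    return [list $z0 0.5 $z2]
}

set risk 2.0
set z [z_add $risk]     ;# 0.125 0.5 0.375
```

In both models the three values sum to 1, which gives a quick check on any hand-entered list.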
If most_likely is turned on, then this flag indicates if sharing should be required to be at least 50% for the likelihood maximization. The effect is that positive LOD scores will only be indicated for positions with greater than expected sharing. For affected sib pairs, this is sensible because a predicted sharing of less than 50% would be inconsistent with any simple genetic model. If count_discordant is enabled, then the direction of truncation is reversed: sharing is required to be no more than 50%. The default is 1 or true.
A floating point value, in LOD score units. If set, then instead of finding a maximum likelihood model, the programs will find the model farthest from the null hypothesis that has a LOD score no higher than the specified value. Thus, this finds an upper bound on the effect of a putative gene at a given position, for exclusion at this level. The default (0.0) disables the exclusion calculation. This value should never be positive.
count_once
A boolean value: indicates if only strictly independent sib pairs should be counted. Normally, for families with more than two sibs, all pairwise combinations are scored. If this flag is set, then only pairs with the first affected sib will be counted. The default is 0 or false.
A boolean value: indicates if only the first appropriate sib pair in each family should be counted, as opposed to all pairs, or all pairs including the first sib, as indicated by count_once. The default is 0 or false.
count_unaffected
A boolean value: indicates if sib pairs should be counted where the first sib is unaffected. The default is 0 or false, i.e., count pairs with affected sibs.
count_discordant
A boolean value: indicates if the disease status for the second sib in a pair should be discordant with the first sib. The default is 0 or false, i.e., count pairs that are concordant for disease status.
A floating-point number: this specifies the probability of a typing error at an arbitrary marker position. The sib_ibd, sib_phase, and sib_map programs use this to identify marker data that is likely to represent typing errors. The method is based on detection of unlikely recombination patterns, so it is only effective in regions that are densely typed. When an error is detected, all marker data at that position for that family will be excluded from subsequent calculations. The default error frequency is 0 (meaning that all data is assumed to be correct). Reasonable values are on the order of 0.01.
Be careful when using count_once in conjunction with count_discordant. The disease status of the first member of each pair is always determined by count_unaffected. When count_once is enabled, the number of discordant sib pairs counted will depend on whether count_unaffected is set or not. If count_unaffected is false, then within each family, pairs of the first affected sib with all unaffected sibs will be counted. If count_unaffected is true, then pairs of the first unaffected sib with all affected sibs will be counted.
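The interaction of these flags can be easier to follow as code. The procedure below is an illustrative sketch of the pair selection described above, not the ASPEX implementation: the name sib_pairs, the A/U status codes, and the {id status} input layout are all assumptions, and the option that counts only the first pair per family is not modeled.

```tcl
# Hypothetical helper, not part of ASPEX: list the sib pairs that would be
# counted for one family. sibs is a list of {id status} entries in file
# order, with status A (affected) or U (unaffected).
proc sib_pairs {sibs count_once count_unaffected count_discordant} {
    # Status required of the first and second member of each pair.
    set first [expr {$count_unaffected ? "U" : "A"}]
    if {$count_discordant} {
        set second [expr {$first eq "A" ? "U" : "A"}]
    } else {
        set second $first
    }
    set pairs {}
    set n [llength $sibs]
    for {set i 0} {$i < $n} {incr i} {
        lassign [lindex $sibs $i] id1 s1
        if {$s1 ne $first} continue
        # Concordant pairs are unordered, so only look forward in the list;
        # discordant partners may appear anywhere in the family.
        set start [expr {$count_discordant ? 0 : $i + 1}]
        for {set j $start} {$j < $n} {incr j} {
            lassign [lindex $sibs $j] id2 s2
            if {$j != $i && $s2 eq $second} {
                lappend pairs [list $id1 $id2]
            }
        }
        if {$count_once} break   ;# only the first qualifying sib is paired
    }
    return $pairs
}

set fam { {1 A} {2 A} {3 U} {4 A} }
puts [sib_pairs $fam 0 0 0]   ;# all pairs of affected sibs
puts [sib_pairs $fam 1 0 1]   ;# first affected sib with each unaffected sib
puts [sib_pairs $fam 1 1 1]   ;# first unaffected sib with each affected sib
```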