
6. The sib_tdt program

This program reads a file of genotype data for nuclear families consisting of two parents and one or more "affected" siblings. It reports frequencies for all alleles in the parents, and allele transmission frequencies for transmission disequilibrium tests (TDT).

This version of sib_tdt calculates empirical p-values for the chi-squared statistics. These p-values accurately reflect association independent of linkage within families. Each replicate is generated by permuting parental alleles while holding the IBD (identity by descent) status of the sibs within a family fixed.

Two run-time parameters affect the accuracy of p-values calculated by the permutation procedure: min_reps and max_reps. For each replicate dataset, a TDT score is calculated and compared with the score for the actual data. The empirical p-value is the number of replicate scores that exceed the actual score, divided by the total number of replicates. The algorithm generates new replicates until the numerator reaches min_reps or the denominator reaches max_reps, whichever happens first.
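The stopping rule described above can be sketched in Python as follows. This is a minimal illustration, not the sib_tdt implementation: the function name `empirical_p_value` and the callable `draw_replicate_score` are hypothetical, and the callable stands in for the permutation step (permuting parental alleles with IBD status fixed), which is not reproduced here.

```python
import random
from typing import Callable, Tuple


def empirical_p_value(
    observed_score: float,
    draw_replicate_score: Callable[[], float],  # hypothetical stand-in for one permuted replicate
    min_reps: int = 1000,
    max_reps: int = 40000,
) -> Tuple[float, int, int]:
    """Return (p_value, exceed_count, replicate_count) under the min_reps/max_reps rule."""
    if min_reps < 1 or max_reps < 1:
        raise ValueError("min_reps and max_reps must be at least 1")
    exceed_count = 0      # numerator: replicates scoring above the observed score
    replicate_count = 0   # denominator: replicates generated so far
    while exceed_count < min_reps and replicate_count < max_reps:
        replicate_count += 1
        if draw_replicate_score() > observed_score:
            exceed_count += 1
    return exceed_count / replicate_count, exceed_count, replicate_count


# Usage with a toy score generator (uniform random numbers, for illustration only).
rng = random.Random(1)
p, hits, reps = empirical_p_value(0.5, rng.random)
print(p, hits, reps)
```

With this rule, a large p-value stops early once min_reps exceedances are seen, while a small p-value runs to the max_reps limit.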

6.1 Command syntax

sib_tdt [-v] [-c cmd] [-f file] [marker_data ...] [> tdt_data]

One or more marker data files can be listed on the command line. If no files are specified, marker data will be read from standard input. The TDT listing will be sent to standard output.

6.2 TCL parameters

The following parameters should be set using TCL commands, supplied through either the -c option (a command) or the -f option (a parameter file):

one_sib

A boolean value. If true, only transmissions to the first sib in each family are scored. The default is false.

min_reps

During p-value estimation, stop generating replicates when at least this many have scores larger than the observed TDT score. This determines the accuracy of large p-values. The default is 1000.

max_reps

During p-value estimation, never generate more than this number of sample replicates. This determines the accuracy of small p-values. The default is 40000.

Here is a sample parameter file for sib_tdt:

set loc { "l1" "l2" "l3" "l4" }
set blank "00"
set discard_partial true

6.3 Output

The default (non-verbose) output is a table summarizing sample coverage and chi-squared statistics for each marker. For each marker, the table lists the number of distinct alleles, the heterozygosity based on typed parents only, and the overall percentage of typed individuals. The TDT results are summarized by two statistics, each with its estimated p-value: the sum of the chi-squared statistics for transmission of all alleles, and the maximum chi-squared obtained for any one allele.

If sex_split is true, then separate chi-squared scores and p-values are reported for maternal and paternal transmissions.

For the default min_reps and max_reps settings, p-values are accurate to within about 5%, but gradually become less accurate for p<0.01.
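One way to get a feel for these accuracy figures is a rough binomial approximation. The sketch below is an assumption, not taken from sib_tdt: it treats the number of replicates as fixed at its expected value (about min_reps / p, capped at max_reps) and reports the relative standard error of the estimated proportion. The function name `approx_relative_error` is hypothetical.

```python
import math


def approx_relative_error(p: float, min_reps: int = 1000, max_reps: int = 40000) -> float:
    """Approximate relative standard error of an empirical p-value whose true value is p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Expected number of replicates: about min_reps / p, capped at max_reps.
    expected_reps = min(max_reps, min_reps / p)
    # Binomial standard error of a proportion, sqrt(p(1-p)/n), divided by p.
    return math.sqrt((1.0 - p) / (expected_reps * p))


for p in (0.5, 0.1, 0.025, 0.01, 0.001):
    print(f"p = {p:<6} relative error ~ {approx_relative_error(p):.1%}")
```

Under this approximation the relative error with the default settings is about 5% at p = 0.01 and grows to roughly 16% at p = 0.001, which is consistent with the behaviour described above. Raising max_reps is the way to tighten estimates of small p-values.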

The sum statistic and maximum statistic are generally similar, but the sum statistic is more sensitive to cases where multiple alleles are in disequilibrium.
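The relationship between the two statistics can be illustrated with a short Python sketch. It assumes the standard McNemar-type TDT score for each allele, (t - n)^2 / (t + n), where t and n are the transmitted and not-transmitted counts. The source does not state the exact formula sib_tdt uses, and the function names and the allele counts below are hypothetical.

```python
from typing import Dict, Tuple


def allele_chi_squared(transmitted: int, not_transmitted: int) -> float:
    """McNemar-type chi-squared for one allele (assumed form, 1 degree of freedom)."""
    total = transmitted + not_transmitted
    if total == 0:
        return 0.0  # no informative transmissions for this allele
    return (transmitted - not_transmitted) ** 2 / total


def sum_and_max(counts: Dict[str, Tuple[int, int]]) -> Tuple[float, float]:
    """Return (sum statistic, maximum statistic) over all alleles at a marker."""
    scores = [allele_chi_squared(t, n) for t, n in counts.values()]
    return sum(scores), max(scores, default=0.0)


# Hypothetical counts: allele -> (transmitted, not transmitted).
example = {"A": (30, 10), "B": (10, 25), "C": (15, 20)}
total_stat, max_stat = sum_and_max(example)
print(total_stat, max_stat)
```

In this made-up example two alleles (A and B) deviate from equal transmission, so the sum statistic picks up both contributions while the maximum statistic reflects only allele A.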

If verbose (-v) output is selected, the summary table will be replaced by allele transmission tables for each marker, with the form:

[al] [n] [%]  [ft] [fn] [fc]  [mt] [mn] [mc]  [st] [sn] [sc]

where [al] is the allele name, [n] is the number of times it is seen in the parents, and [%] is the percent frequency. [ft] is the number of times the allele was transmitted through the father, [fn] is the number of times it was not transmitted, and [fc] is the chi-squared score for this outcome. [mt], [mn], and [mc] are the same, for the mother. Likewise, [st], [sn], and [sc] combine the results for both parents.

The combined counts may be larger than the sum of the counts for the two parents. If both parents and a child are all heterozygous with the same genotype at a given marker, the transmitted alleles cannot be assigned to specific parents, but one copy of each of the child's alleles was transmitted and the other was not. These cases are added only to the "combined" totals.
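The following Python sketch shows why such a family can still be counted in the combined totals. It is an illustration under stated assumptions, not sib_tdt code: genotypes are represented as pairs of allele names, homozygous parents are skipped as uninformative (the usual TDT convention), and the function names are hypothetical.

```python
from collections import Counter
from typing import Set, Tuple

Genotype = Tuple[str, str]


def consistent_assignments(father: Genotype, mother: Genotype, child: Genotype) -> Set[Tuple[str, str]]:
    """All (allele_from_father, allele_from_mother) pairs consistent with the trio."""
    c1, c2 = child
    options = set()
    if c1 in father and c2 in mother:
        options.add((c1, c2))
    if c2 in father and c1 in mother:
        options.add((c2, c1))
    return options


def combined_counts(father: Genotype, mother: Genotype, assignment: Tuple[str, str]):
    """Transmitted and not-transmitted tallies over both parents for one assignment."""
    transmitted, not_transmitted = Counter(), Counter()
    for parent, allele in zip((father, mother), assignment):
        if parent[0] == parent[1]:
            continue  # assumption: homozygous parents are uninformative and skipped
        other = parent[1] if allele == parent[0] else parent[0]
        transmitted[allele] += 1
        not_transmitted[other] += 1
    return transmitted, not_transmitted


trio = (("A", "B"), ("A", "B"), ("A", "B"))
options = consistent_assignments(*trio)
print(len(options))  # two assignments, so per-parent scoring is ambiguous
for option in sorted(options):
    print(option, combined_counts(trio[0], trio[1], option))
```

Both assignments give the same combined tallies (each allele transmitted once and not transmitted once), even though the father's and mother's individual counts cannot be determined.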

If only one parent is typed at a particular marker, transmission through that parent is scored only in cases where it is not biased by allele frequencies. This reduces to cases where the parent and child are both heterozygous but have only one allele in common (e.g., parent AB, child AC). This differs from the treatment in earlier versions: prior to version 1.11, single-parent transmissions were scored (incorrectly) even in situations that were biased, and from version 1.11 through 1.16, single-parent cases were never scored.
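The single-parent rule can be expressed as a small Python check. This is a sketch of the rule as described above, not the program's own code. The function name `single_parent_transmission` and the representation of genotypes as pairs of allele names are assumptions.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]


def single_parent_transmission(parent: Genotype, child: Genotype) -> Optional[Tuple[str, str]]:
    """Return (transmitted, not_transmitted) if the pair is scorable, else None."""
    # Both the parent and the child must be heterozygous.
    if parent[0] == parent[1] or child[0] == child[1]:
        return None
    # They must have exactly one allele in common.
    shared = set(parent) & set(child)
    if len(shared) != 1:
        return None
    transmitted = shared.pop()
    not_transmitted = parent[1] if parent[0] == transmitted else parent[0]
    return transmitted, not_transmitted


print(single_parent_transmission(("A", "B"), ("A", "C")))  # scorable: A transmitted, B not
print(single_parent_transmission(("A", "B"), ("A", "B")))  # not scorable: two shared alleles
print(single_parent_transmission(("A", "A"), ("A", "C")))  # not scorable: homozygous parent
```

When parent and child share both alleles, the transmitted allele cannot be identified without the other parent, so the pair is left unscored.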

If very verbose (-vv) output is selected, the TDT results are replaced by a detailed listing of allele transmissions for each child. The listing indicates which allele was inherited from which parent, for all children. In the special case described above where sib_tdt cannot assign the transmitted alleles to specific parents, the allele pair is enclosed in square brackets.

