
11. Other useful add-ons

These utilities are all written in ``Perl'', a widely used, freely available interpreted programming language. Some of them run one of the other programs in this package and post-process its output. Others are useful for manipulating genetic marker data files.

For more information about Perl, or to find out how to get it, see http://www.perl.com.

11.1 The ligate filter

UNIX command syntax:

ligate [-r] [-a] [-n] [-f] [-p] [-b blank] [-c markers] [-d markers] [marker_data ...]

The ligate filter can be used to cut and paste linkage data files together, and to convert between common file format variants. Any number of files can be specified, in either Risch or LINKAGE format, in any combination. The default output is a LINKAGE format file formed by merging all the input data.

If a linkage file is missing the ASPEX-style header line listing the marker names, then the marker names ``M1.1'', ``M1.2'', etc., will be generated automatically. The next such file would get the marker names ``M2.1'', ``M2.2'', and so on.
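The naming scheme described above can be illustrated with a short Python sketch. This is not the ligate source code (ligate itself is a Perl script); the function name and its arguments are hypothetical, and the sketch assumes that the first number counts only the files that lack a header line.

```python
def default_marker_names(file_index, n_markers):
    """Default names for the markers of one headerless input file.

    file_index: 1-based position among the files that lack a header line
                (an assumption about how ligate counts such files).
    n_markers:  number of markers found in that file.
    """
    return [f"M{file_index}.{j}" for j in range(1, n_markers + 1)]


print(default_marker_names(1, 3))  # ['M1.1', 'M1.2', 'M1.3']
print(default_marker_names(2, 2))  # ['M2.1', 'M2.2']
```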

Command line parameters:

-r

Specifies that the data should be written in Risch format, as opposed to LINKAGE format.

-a

Specifies ``alphabetic'' codes for gender and blanks. Numeric gender codes (1, 2) will be translated to ``m'' and ``f'', and blank parents and alleles will be coded as ``x''.

-n

Specifies ``numeric'' codes for gender and blanks. Alphabetic gender codes (``m'', ``f'') will be translated to the corresponding numbers, and blanks will be coded as ``0''.

-b blank

Specifies the code for the blank allele. The default is ``0''.

-f

Specifies that new rows (with all blank genotype data) should be created in the output file for untyped parents. ASPEX does not require that an untyped parent be listed in the data file, but some linkage programs do.

-p

Specifies that the original allele codes should be ``packed'' and recoded as 1, 2, ... for each marker. Some linkage programs require that alleles be coded this way, rather than, say, as raw allele sizes.

-c markers

Only the listed markers will be included in the output file, in the order given; all other markers will be omitted.

-d markers

The specified list of markers will be deleted from the output file.
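As an illustration of what the -p option does, the following Python sketch recodes the alleles of a single marker as 1, 2, and so on. It is a sketch only, not the ligate implementation: the function name, the representation of genotypes as pairs of strings, and the choice to number alleles in order of first appearance are all assumptions, and blank alleles are assumed to be left unchanged.

```python
def pack_alleles(genotypes, blank="0"):
    """Recode the allele codes of one marker as "1", "2", ...

    genotypes: list of (allele_a, allele_b) string pairs, one per person.
    Assumptions: codes are assigned in order of first appearance, and
    the blank allele is passed through untouched.
    Returns the recoded genotypes and the old-to-new code mapping.
    """
    recode = {}
    packed = []
    for pair in genotypes:
        new_pair = []
        for allele in pair:
            if allele == blank:
                new_pair.append(blank)
                continue
            if allele not in recode:
                recode[allele] = str(len(recode) + 1)
            new_pair.append(recode[allele])
        packed.append(tuple(new_pair))
    return packed, recode


# Raw allele sizes for one marker; the second person is untyped.
raw = [("152", "156"), ("0", "0"), ("156", "160")]
packed, mapping = pack_alleles(raw)
print(packed)   # [('1', '2'), ('0', '0'), ('2', '3')]
print(mapping)  # {'152': '1', '156': '2', '160': '3'}
```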

11.2 The restrict filter

UNIX command syntax:

restrict [-b blank] [-kids [+-]n] [-sick [+-]n] [-well [+-]n] [-parents [+-]n] [-onlysick] [-pos m[-n]] [-sib m[-,]n] [-keep id...] [-remove id...] [marker_data]

The restrict filter selects families from a linkage data file based on family structure. It is structured somewhat like the UNIX find command. This filter replaces the ``*_kids'', ``*_parent'', and ``only_sick'' filters in previous ASPEX releases.

Command line parameters:

-b blank

Specifies the code for the blank allele.

-parents [+-]n

Only families with the specified number of typed parents will be accepted. A parameter of the form ``+n'' selects for families with at least ``n'' typed parents; ``-n'' selects for no more than ``n''; and a plain number selects for an exact match.

-kids [+-]n

Selects families based on the total number of children.

-sick [+-]n

Selects families based on the number of affected children.

-well [+-]n

Selects families based on the number of unaffected children.

-onlysick

Specifies that only affected children should be included in the output.

-pos m[-n]

For families meeting the other criteria, this selects based on family order in the input file. Either a single family number or a range of family numbers can be specified.

-sib m[-,]n

For families meeting the other criteria, this selects specific siblings from each family. Sibs are numbered starting at 1 in their order in the input file. Either ranges (``m-n'') or specific pairs (``m,n'') can be selected.

-keep id...

Only include families with the specified IDs. The ID list may be space- or comma-delimited; if space-delimited, it should be enclosed in quotes. An ID can be either a family ID alone, or a family ID and a person ID separated by a period.

-remove id...

Remove individuals with the specified IDs, with the same format rules as -keep.
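The ``[+-]n'' count arguments taken by -parents, -kids, -sick, and -well, and the ID lists taken by -keep and -remove, follow simple rules that can be sketched in Python. The function names below are hypothetical and this is not the restrict source code; in particular, the sketch assumes that family IDs do not themselves contain a period.

```python
import re


def matches_count(spec, count):
    """Apply a [+-]n selector: "+n" means at least n, "-n" means
    no more than n, and a plain "n" means exactly n."""
    if spec.startswith("+"):
        return count >= int(spec[1:])
    if spec.startswith("-"):
        return count <= int(spec[1:])
    return count == int(spec)


def parse_ids(id_list):
    """Split a space- or comma-delimited ID list into
    (family_id, person_id) pairs; person_id is None when the entry
    names a whole family."""
    ids = []
    for token in re.split(r"[,\s]+", id_list.strip()):
        if not token:
            continue
        family, _, person = token.partition(".")
        ids.append((family, person or None))
    return ids


print(matches_count("+2", 3))    # True  (at least 2)
print(matches_count("-1", 2))    # False (no more than 1)
print(matches_count("2", 2))     # True  (exactly 2)
print(parse_ids("12, 15.3 20"))  # [('12', None), ('15', '3'), ('20', None)]
```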

11.3 The list_untyped filter

UNIX command syntax:

list_untyped [blank_allele] [< marker_data]

DOS command syntax:

perl list_unt [blank_allele] [< marker_data]

This filter scans a file of marker data in sib_ibd format, and extracts any individual that has an untyped or partially typed marker. It accepts one argument:

blank_allele

Specifies the allele that indicates missing data. The default is ``0''.
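The distinction between typed, partially typed, and untyped markers can be illustrated as follows. This Python sketch is not the list_untyped script; the function names are hypothetical, and it assumes that each genotype is available as a pair of allele strings once the data file has been parsed.

```python
def typing_status(genotype, blank="0"):
    """Classify one genotype (a pair of allele codes) as
    "typed", "partial", or "untyped"."""
    n_blank = sum(1 for allele in genotype if allele == blank)
    if n_blank == 0:
        return "typed"
    if n_blank == len(genotype):
        return "untyped"
    return "partial"


def needs_listing(genotypes, blank="0"):
    """True if any marker of this individual is untyped or
    partially typed."""
    return any(typing_status(g, blank) != "typed" for g in genotypes)


print(typing_status(("3", "5")))              # typed
print(typing_status(("3", "0")))              # partial
print(typing_status(("0", "0")))              # untyped
print(needs_listing([("3", "5"), ("3", "0")]))  # True
```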

11.4 The list_incompat filter

UNIX command syntax:

list_incompat [-c cmd] [-f file] [< marker_data]

DOS command syntax:

perl list_inc [-c cmd] [-f file] [< marker_data]

The list_incompat filter has exactly the same syntax as the sib_ibd program, except that the -v option is not used. It runs the sib_ibd program, and filters the allele inheritance data to produce a listing of all siblings whose genotypes are incompatible with their parents.

11.5 The rec_dist filter

UNIX command syntax:

rec_dist [-v] [-c cmd] [-f file] [< marker_data]

The rec_dist filter has the same syntax as sib_ibd. It processes the output of sib_ibd and derives estimates of the recombination fractions between all pairs of markers. For compactness, the default output is a matrix showing the recombination fraction multiplied by 1000.

If -v is specified, then a table is generated showing, for each marker pair, the number of times the IBD state was known to be the same at those two positions, and the number of times it differed, with the corresponding recombination fraction.
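The relationship between the -v counts and the default matrix entries can be sketched in Python. This is a deliberately naive illustration, not the rec_dist algorithm: it assumes the fraction is simply the proportion of observations in which the IBD state differed, and this section does not say whether rec_dist applies any further correction. The function names are hypothetical.

```python
def recombination_fraction(n_same, n_diff):
    """Naive estimate: the share of informative observations in which
    the IBD state differed between the two markers.
    Returns None when there are no informative observations."""
    total = n_same + n_diff
    if total == 0:
        return None
    return n_diff / total


def matrix_entry(n_same, n_diff):
    """Fraction multiplied by 1000 and rounded to an integer, in the
    style of the compact default output."""
    total = n_same + n_diff
    if total == 0:
        return None
    return round(1000 * n_diff / total)


print(recombination_fraction(45, 5))  # 0.1
print(matrix_entry(45, 5))            # 100
```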

11.6 The xmgr_map script

UNIX command syntax:

xmgr_map file

DOS command syntax:

perl xmgr_map file

The xmgr_map script is used with the xmgr plotting program to generate nicely formatted multipoint exclusion maps. It reads a sib_ibd parameter file, and outputs a set of xmgr commands to label the X axis with the marker names at the appropriate positions. The output of xmgr_map should be appended to the output of sib_ibd to create a complete xmgr input file.

The xmgr program is a completely separate public domain graphing program for Unix systems. Current information about xmgr can be found at http://plasma-gate.weizmann.ac.il/Xmgr/.

