[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3. Checking Pedigree Validity



3.1 Introduction to pedcheck

pedcheck reads the pedigree file and checks for errors in the pedigree structure. Specifically, it checks for the following errors:

Note that, unlike in some other packages, name identifiers must be unique across the entire data set, not only within pedigree components. If using pedigree files from other packages, we recommend creating new name identifiers where necessary, for example by combining the pedigree (family) and individual identifiers as ‘pedname_indname’. To accommodate this translation easily, MORGAN names are arbitrary strings of up to 15 characters, containing no whitespace.
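As an illustration of the renaming suggested above, the following Python sketch combines family and individual identifiers and checks the result against the constraints stated in this section (no whitespace, at most 15 characters, unique across the data set). It is not part of MORGAN; the function names and the choice of ‘_’ as separator are assumptions made for the example.

```python
MAX_NAME_LEN = 15  # name length limit stated in this section


def make_unique_name(pedname: str, indname: str, sep: str = "_") -> str:
    """Combine a family identifier and an individual identifier into one name."""
    name = f"{pedname}{sep}{indname}"
    if any(ch.isspace() for ch in name):
        raise ValueError(f"name contains whitespace: {name!r}")
    if len(name) > MAX_NAME_LEN:
        raise ValueError(f"name longer than {MAX_NAME_LEN} characters: {name!r}")
    return name


def rename_all(records):
    """records: iterable of (pedname, indname) pairs. Returns the combined names."""
    names = [make_unique_name(ped, ind) for ped, ind in records]
    # Combined names must still be unique across the whole data set.
    if len(set(names)) != len(names):
        raise ValueError("combined names are not unique")
    return names


if __name__ == "__main__":
    # The same individual identifier may appear in two different families.
    print(rename_all([("fam1", "101"), ("fam2", "101")]))
```

Names that exceed the length limit are rejected rather than truncated, since silent truncation could make two distinct individuals collide.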

If no errors are found, pedcheck reports the number of components (connected pedigrees) found and lists for each component:

If there are changes to the file, pedcheck writes an output pedigree file. Requested changes may include reordering of the pedigree chronologically (by component, then by individual), the addition of gender, the addition of an observed indicator, and reversing the order of the parental names.

Other MORGAN programs do their own pedigree checking by calling the relevant pedcheck functions, but it is still useful to do preliminary processing of data files first.

See Concept Index for: pedcheck introduction, pedigree validity, component, pedigree component.



3.2 Sample pedcheck parameter file

Files for pedcheck may be found in the ‘Pedcheck’ subdirectory of ‘MORGAN_Examples’. Below is the sample parameter file ‘check.par’ for pedcheck:


 
set printlevel 5
input pedigree file 'check.ped' 
input pedigree size 30 
input pedigree record gender absent 
input pedigree record observed present 
assign gender 
output pedigree chronological 
output overwrite pedigree file 'check.oped' 
output overwrite individuals file 'indiv_oped'

The ‘assign gender’ statement requests that pedcheck determine gender, when possible, and output that information to the output pedigree file. The gender determination is made based on the default order for the listing of parents, which is father followed by mother. Individuals who are not parents will be assigned missing gender, ‘0’.
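The rule described above can be sketched in Python as follows. This is only an illustration of the idea, not pedcheck's code. It assumes each record is a (name, father, mother) triplet in the default parental order, that a missing parent is written ‘0’, and that male and female are coded ‘1’ and ‘2’; only the missing-gender code ‘0’ is taken from the text above.

```python
MISSING = "0"             # missing-gender code, as stated above
MALE, FEMALE = "1", "2"   # assumed gender codes, for illustration only
NO_PARENT = "0"           # assumed marker for an absent (founder) parent


def assign_gender(triplets):
    """triplets: list of (name, father, mother). Returns {name: gender code}."""
    gender = {name: MISSING for name, _, _ in triplets}
    for _, father, mother in triplets:
        for parent, code in ((father, MALE), (mother, FEMALE)):
            if parent == NO_PARENT:
                continue
            previous = gender.get(parent, MISSING)
            if previous not in (MISSING, code):
                # The same name is used in both parental positions.
                raise ValueError(f"{parent} appears as both father and mother")
            gender[parent] = code
    return gender


if __name__ == "__main__":
    pedigree = [("1", "0", "0"), ("2", "0", "0"), ("3", "1", "2"), ("4", "1", "2")]
    print(assign_gender(pedigree))
```

In this small pedigree, individuals 3 and 4 are not parents, so they keep the missing code ‘0’.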

‘output pedigree chronological’ causes the pedigree to be sorted into chronological order in the output pedigree file, first by component, then by individual name. MORGAN refers to each connected pedigree (i.e., distinct family) in a file as a component. The first individual in the input listing who is not genealogically connected to individual 1 defines component 2, and the first who is not connected to either of these defines component 3, etc. Although pedcheck groups individuals by their MORGAN-assigned component numbers in the output pedigree file, it does not list the component numbers. That is, the first three columns of the output file are just as they were in the input file: individual name, father’s name, mother’s name.
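The component numbering described above can be illustrated with a short Python sketch. It is not MORGAN's implementation. It treats two individuals as genealogically connected when a chain of parent-child links joins them, and it assumes (name, father, mother) triplets with ‘0’ marking an absent parent.

```python
from collections import deque

NO_PARENT = "0"  # assumed marker for an absent (founder) parent


def number_components(triplets):
    """triplets: list of (name, father, mother). Returns {name: component number}."""
    # Build an undirected graph with one edge per parent-child link.
    neighbours = {}
    for name, father, mother in triplets:
        neighbours.setdefault(name, set())
        for parent in (father, mother):
            if parent != NO_PARENT:
                neighbours[name].add(parent)
                neighbours.setdefault(parent, set()).add(name)

    component = {}
    next_number = 0
    # Scan in input order: the first individual not yet reached starts a new component.
    for name, _, _ in triplets:
        if name in component:
            continue
        next_number += 1
        component[name] = next_number
        queue = deque([name])
        while queue:
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in component:
                    component[other] = next_number
                    queue.append(other)
    return component


if __name__ == "__main__":
    pedigree = [
        ("a", "0", "0"), ("x", "0", "0"), ("b", "0", "0"),
        ("c", "a", "b"), ("y", "x", "z"), ("z", "0", "0"),
    ]
    print(number_components(pedigree))
```

Here ‘x’ is the first individual in the listing not connected to ‘a’, so ‘x’, ‘y’ and ‘z’ form component 2, while ‘a’, ‘b’ and ‘c’ form component 1.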

‘output overwrite pedigree file 'check.oped'’ specifies the output pedigree file. The overwrite option permits a previously existing ‘check.oped’ to be overwritten. You should be cautious in using this option, so as not to overwrite files you wish to keep. However, if this option is not used and ‘check.oped’ already exists, you will get an error message and the program will quit. If this occurs, you may delete the file and try again, or use another output file name.

Finally, the ‘output overwrite individuals file 'indiv_oped'’ statement in this version of ‘check.par’ provides for the additional creation of an individuals file. See Creating an individuals file.

See Concept Index for: pedcheck sample parameter file, component, individuals file, overwrite file options.



3.3 Running pedcheck examples

Examples for the program pedcheck are under the subdirectory ‘Pedcheck/’. The commands using example files are listed below. Have a look inside the pedigree and parameter files, then verify that the output files are as you would expect them to be. If error messages are generated, verify that they make sense and see if you can make the necessary corrections so that pedcheck will run.

./pedcheck  check.par

runs on input pedigree file ‘check.ped’. The pedigree contains no errors, but has no gender specified and is not in chronological order. Look at the parameter file: you will see that it specifies the absence of gender, and requests that gender be assigned and that the output pedigree be chronologically ordered. Then, indeed, the output pedigree file ‘check.oped’ has gender assigned and has the members reordered. Notice that individuals who are not parents (531 and 541) have missing gender, ‘0’, in the fourth column of ‘check.oped’. The overwrite option permits a previously existing ‘check.oped’ to be overwritten.

./pedcheck  imp.par

runs on input pedigree file ‘imp.ped’. The pedigree contains an individual who is his own ancestor.
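To make this kind of error concrete, the following Python sketch finds individuals who are their own ancestors by walking up the parental links. It is an illustration only, not the check performed by pedcheck, and it assumes (name, father, mother) triplets with ‘0’ marking an absent parent. The data in the example are invented and are not the contents of ‘imp.ped’.

```python
NO_PARENT = "0"  # assumed marker for an absent (founder) parent


def find_own_ancestors(triplets):
    """Return the names, in input order, of individuals who are their own ancestor."""
    parents = {
        name: [p for p in (father, mother) if p != NO_PARENT]
        for name, father, mother in triplets
    }
    offenders = []
    for start in parents:
        seen = set()
        stack = list(parents[start])  # ancestors still to be visited
        while stack:
            current = stack.pop()
            if current == start:
                offenders.append(start)
                break
            if current in seen:
                continue
            seen.add(current)
            stack.extend(parents.get(current, []))
    return offenders


if __name__ == "__main__":
    # Invented data: a -> c -> b -> a is an impossible ancestry loop.
    looped = [("a", "c", "0"), ("b", "a", "0"), ("c", "b", "0"), ("d", "a", "0")]
    print(find_own_ancestors(looped))
```

Individual ‘d’ descends from the loop but is not part of it, so only ‘a’, ‘b’ and ‘c’ are reported.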

./pedcheck  empty.par  ped sex.ped

runs with an empty parameter file. The input pedigree file ‘sex.ped’ is specified on the command line, and includes the set printlevel 5 request. What does the output say is wrong with this pedigree?

./pedcheck  empty.par  ped dup.ped

runs with an empty parameter file, with input pedigree file ‘dup.ped’ specified on the command line. (In this case there is no printlevel statement.) What does the output say is wrong with this pedigree?

See Concept Index for: pedcheck examples.



3.4 Creating an individuals file

An individuals file may be created for those downstream programs that require it by using the output individuals file parameter statement. Inclusion of the statement

 
output overwrite individuals file "indiv.oped"

in any pedcheck parameter file will cause pedcheck to produce a list of individuals, their component numbers, and any integer and real covariate or trait information that was included in the input pedigree file.

If an individuals file with that name already exists, the program will append the output to the end of the file unless the overwrite option is specified. Since appending to a previous individuals file is probably not what the user wants, it is highly recommended that the overwrite option be used.

The ordering of individuals is that of the output pedigree file (or input file, if this is unchanged): see File identification statements and Sample pedcheck parameter file.

As an example you may add this line into the file ‘check.par’ in the ‘Pedcheck’ subdirectory of ‘MORGAN_Examples’. Compare the output individuals file ‘indiv.oped’ with the output pedigree file ‘check.oped’.



3.5 pedcheck statements

pedcheck statements also apply to other MORGAN programs, since those programs first call the pedcheck functions to check the pedigree file before doing computations on the pedigree data.

(assign | ignore) gender

Optional. ‘assign gender’ causes gender to be determined by parentage, whether or not gender is included in the pedigree file. ‘ignore gender’ causes the program to neither check nor assign gender. The default action is to assign gender when it is absent and to check gender if it is present.

output pedigree chronological

Optional. If this statement is present and the input file is not in chronological order, the pedigree is sorted and written out in chronological order. The pedigree is sorted by components, and within each component, each non-founding member is preceded by her or his parents. If this statement is not given, the input order is preserved in the output file, if one is written. See Sample pedcheck parameter file for further discussion of pedigree components.
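The within-component rule (each non-founding member preceded by her or his parents) can be sketched in Python as below. This is a minimal illustration, not pedcheck's sorting code: it does not group by component, and it assumes (name, father, mother) triplets with ‘0’ marking an absent parent. The function name is hypothetical.

```python
NO_PARENT = "0"  # assumed marker for an absent (founder) parent


def parents_first(triplets):
    """Reorder (name, father, mother) triplets so parents precede their children."""
    record = {name: (name, father, mother) for name, father, mother in triplets}
    placed, in_progress, ordered = set(), set(), []

    def place(name):
        if name in placed or name not in record:
            return
        if name in in_progress:
            # Reaching a name again while placing its ancestors means a loop.
            raise ValueError(f"{name} is their own ancestor")
        in_progress.add(name)
        _, father, mother = record[name]
        for parent in (father, mother):
            if parent != NO_PARENT:
                place(parent)  # ancestors are written out first
        in_progress.discard(name)
        placed.add(name)
        ordered.append(record[name])

    # Visit individuals in input order, so an already valid order is preserved.
    for name, _, _ in triplets:
        place(name)
    return ordered


if __name__ == "__main__":
    unordered = [
        ("kid", "dad", "mom"), ("mom", "0", "0"),
        ("dad", "gpa", "0"), ("gpa", "0", "0"),
    ]
    for row in parents_first(unordered):
        print(*row)
```

The recursion follows parental links only, so its depth is bounded by the number of generations in the pedigree.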

output pedigree record (father mother | mother father)

Optional. This statement causes the parents to be named in the specified order. The default arrangement for each triplet of names is the input order.

output [overwrite] individuals file filename

An individuals file may be created for those downstream programs that require it by using this pedcheck parameter statement. See Creating an individuals file.

See Concept Index for: pedcheck statements, pedigree options, individuals –gender, pedigree order, component, individuals file.



This document was generated by Elizabeth Thompson on September 6, 2019 using texi2html 1.82.