README file: Interval Censoring Programs

1. "IntervalCensoring" is a C application written by Piet Groeneboom to
   implement the methods discussed in the book

       Groeneboom, P. and Wellner, J. A. (1992) Information Bounds and
       Nonparametric Maximum Likelihood Estimation. Birkhauser, Basel.

   (Go to the URL
   http://www.stat.washington.edu/jaw/RESEARCH/BOOKS/book2.html
   for more information.) See also

       Groeneboom, P. (1996) Lectures on inverse problems, in: Lectures
       on probability theory, Ecole d'Eté de Probabilités de Saint-Flour
       XXIV-1994, Editor: P. Bernard. Springer Verlag, Berlin.

2. The model for "interval-censored data" treated here is as follows:

   (a) X_1, ..., X_n are i.i.d. with d.f. F.
   (b) (U_1, V_1), ..., (U_n, V_n) are i.i.d. pairs with U_i < V_i,
       which are independent of the X_i's in (a).
   (c) The observations consist of (U_i, V_i, D_i), i = 1, ..., n, where
           D_i = 1 if X_i <= U_i,
           D_i = 2 if U_i < X_i <= V_i,
           D_i = 3 if V_i < X_i.

   The objective is to estimate the unknown d.f. F nonparametrically;
   the estimator computed in the program is the NPMLE (nonparametric
   maximum likelihood estimator).

2. Methods available in the program:

       the ICM algorithm (the iterative convex minorant algorithm as
       described in Groeneboom and Wellner (1992));
       the EM algorithm;
       the Hybrid algorithm (alternating steps of EM and ICM).

   One can choose to do ICM, EM, or Hybrid in the menu "Algorithms".
   The default is the Hybrid algorithm. One can also switch between
   the two by the keyboard equivalents "command-1" and "command-2".

3. Data included and generated in the application:

   (i) The data sets of size n = 100 and n = 1000, called data100 and
       data1000, respectively, generated from a Uniform(0,1)
       distribution for the X_i and the Uniform distribution for
       (U_i, V_i) on the upper triangle of the unit square. This
       situation is analyzed in detail, in particular in connection
       with smooth functional estimation, in Groeneboom (1996).
   (ii) Monte-Carlo generation routines for the situation where the X_i
       are generated by a standard exponential distribution, and the
       (U_i, V_i) are the order statistics of a sample of size 2 from a
       standard exponential distribution.

   One can choose to use input files or to generate random samples by
   the menu "Samples".

4. User-specified data is allowed; input files should be in the form of
   three columns, the first two for U_i and V_i, and the third for the
   indicators D_i; see the format of the example files data100 and
   data1000.

5. To run one of the algorithms on either input files or randomly
   generated data, press "command-r" or use the menu "Run".

6. Output:

   (i) Values of the Fenchel conditions and log-likelihood for
       successive iterations of the algorithm. For more information on
       the Fenchel conditions, see the reference above: Groeneboom and
       Wellner (1992). The iterations can be stopped at any time
       (particularly needed for the EM algorithm) by pressing
       "command-period" (= "apple-period"). The iterations are stopped
       automatically if the criteria are zero to 6 decimals. The text
       output of the iterations can be saved if desired; the file name
       can be specified in the dialog box that will appear. The value
       of the estimator at the end of the iterations is written to a
       file called "NPMLE", giving the NPMLE at the jump times (not at
       all observations).

   (ii) A plot of the estimated d.f. F, which can be saved as a PICT
       file via the menu or "command-s". The PICT file can be opened by
       SimpleText or applications like Adobe Illustrator or
       Mathematica. One can exit the plot by pressing return or
       clicking the "go-away" box.

   Because the output files are large for big n, usually bigger than
   32K, the program uses a freeware replacement of the Apple toolbox
   Text Edit routines. At the time that the program was written (1996),
   the toolbox Text Edit routines still had the limitation of 32K.
   Since in Mac OS 8.1 SimpleText still refuses to open files bigger
   than 32K, we fear that the situation in this respect still has not
   changed, which is an absolute disgrace in this day and age! The
   replacement of the Apple toolbox Text Edit routines is called
   TE32K.c and was written by Roy Wood and Michael J. Lowe, whose
   contribution (given unknowingly) is gratefully acknowledged. Text
   files bigger than 32K cannot be read by SimpleText, but they can be
   read by any editor that handles larger text files, like BBEdit or
   Textures.

7. For more information on Macintosh-compatible C source code for this
   program, please contact either Jon Wellner or Piet Groeneboom.
   (Copyright issues are involved in making source code available.)

Written by Jon Wellner and Piet Groeneboom, June 1998.