Next: Error Messages & Troubleshooting Up: MemExp Documentation Previous: Analysis of the Lifetime


Files Output by MemExp

Recommendation: Because MemExp generates several files, it's convenient to place each raw data file in its own subdirectory prior to analyzing it with MemExp.
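One way to follow this recommendation is sketched below in Python. The helper name place_in_subdirs and the dir1, dir2, ... naming scheme are assumptions made for illustration (the scheme mirrors the dir1 used in the test case); neither is part of MemExp.

```python
from pathlib import Path
import shutil


def place_in_subdirs(data_files, root):
    """Move each raw data file into its own subdirectory (dir1, dir2, ...) under root.

    Hypothetical helper: the directory naming is an assumption, not a MemExp rule.
    Returns the new paths in the same order as the input.
    """
    root = Path(root)
    moved = []
    for i, data_file in enumerate(data_files, start=1):
        subdir = root / f"dir{i}"
        subdir.mkdir(parents=True, exist_ok=True)
        target = subdir / Path(data_file).name
        shutil.move(str(data_file), str(target))
        moved.append(target)
    return moved
```

Calling place_in_subdirs(["data.001", "data.002"], ".") would leave dir1/data.001 and dir2/data.002, so that the output of each analysis stays alongside its own data file.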

MemExp produces textual and graphical summaries of each numerical inversion. Let's look at the output generated for the test case that is downloaded with the MemExp software. It results from the following command:

memexp.exe allin1 1 memc_no_errors.def dir1/data.001 0. mem_no_errors.analysis dir1/data.001_e1 memc_errors.def dir1/data.001_e2

Here, the data in file data.001 stored in subdirectory dir1 are to be analyzed. All MemExp output will be written to this subdirectory. This allin1 analysis calls for three calculations in turn: one preliminary MEM inversion, an estimation of the data's standard errors based on the appropriate MEM fit, and a final series of fits to the data by both continuous and discrete kinetic descriptions. NOTE: When MemExp is run on different platforms, small differences in round-off may propagate to produce slightly different output. Therefore, your results for the test case may not be exactly the same as the output shown here.

1. Preliminary MEM Fit with Uniform Errors

The preliminary MEM-only run is governed by the parameters read from the file memc_no_errors.def. The progress of the calculation is written to the file data.001.out. NOTE: If MemExp should abort due to incorrect input, the last few lines in the truncated .out file will indicate where the input needs to be corrected.

Information is plotted in the PostScript file data.001.out.ps each time a MEM distribution is written to disk, for up to the first nine distributions. The calculation proceeds from top to bottom in these graphical summaries. Each row of plots characterizes the MEM distribution at one point along the calculation. On the left, the data (black), fit (colored), and baseline (dotted, if nonzero) are plotted for the last half of the time interval spanned by the measurement. This column is useful in evaluating the computed baseline. In the middle column, the data, fit, and baseline are plotted over the entire temporal range. More importantly, the residuals, $R_i = ({\cal F}_i-D_i)/\sigma_i$ (colored), and the autocorrelation of the residuals (black) are also plotted in the middle column. On the right, the continuous g (colored solid) and h (colored dashed) rate distributions are plotted. If F is derived from f during the MEM calculation by uniform or differential blurring, F is also plotted (black solid); see data.001_e1.out.ps below.
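The quantities in the middle column can be reproduced outside MemExp for a quick check. The sketch below computes the normalized residuals from the formula above. The autocorrelation estimator (mean-subtracted, normalized to 1 at zero lag) and the definition of chi-square as the mean squared residual are common conventions assumed here; they are not necessarily the exact definitions used inside MemExp.

```python
import numpy as np


def normalized_residuals(fit, data, sigma):
    """R_i = (F_i - D_i) / sigma_i, as defined in the text."""
    fit = np.asarray(fit, dtype=float)
    data = np.asarray(data, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return (fit - data) / sigma


def autocorrelation(residuals):
    """Autocorrelation of the residuals at lags 0..N-1.

    Assumed estimator: subtract the mean, then normalize so that lag 0 equals 1.
    Undefined (division by zero) if all residuals are identical.
    """
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    n = r.size
    acf = np.correlate(r, r, mode="full")[n - 1:]  # keep non-negative lags
    return acf / acf[0]


def chi_square(residuals):
    """Mean squared normalized residual (assumed normalization by N)."""
    r = np.asarray(residuals, dtype=float)
    return float(np.mean(r ** 2))
```

For a good fit with realistic errors, the residuals scatter about zero with roughly uniform magnitude and the autocorrelation falls quickly toward zero away from lag 0.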

In each graphical summary, the fit recommended by MemExp is marked by an asterisk (*) on the plot's right hand side. Because no automated selection criteria will choose optimally in all cases, the user of MemExp should compare these recommended fits to those that immediately precede and follow them in the graphical summary. The MemExp selection criteria performed very well for realistic simulated data [1].

2. Error Estimation

Following this preliminary MEM inversion, the recommended fit is analyzed to determine error estimates according to parameters read from the file mem_no_errors.analysis. These errors are assigned to the kinetics by writing a new file, named in the MemExp command as dir1/data.001_e1. The errors are also plotted in the file data.001_e1.sigma.ps. Here, the root-mean-square deviations (rmsd) of the data and the recommended MEM fit are plotted (squares), along with the smoothed error estimates derived from them (dashed). Also plotted at the top of the page (solid) are the deviations of the smoothed estimates from the unsmoothed rmsd values.

3. MEM / NLS Fits with Time-dependent Errors

Next, data.001_e1 (original kinetics with MEM-derived standard errors) is numerically inverted. The parameters in the file memc_errors.def call for a hybrid MEM / NLS fitting of the data (MAXEXP $>$ 0). The MEM calculation is summarized in data.001_e1.out and in data.001_e1.out.ps. Note that with the time-dependent error estimates, the residuals are more uniform in magnitude and the values of $\chi^2$ are near 1.0. Here, F was derived from the f plotted in the second row by uniform blurring, and the revised F is plotted in black in the third row. From the third row down, this new F defines maximum entropy according to equation 3.

In addition, the NLS fits by discrete exponentials are summarized in data.001_e1.exp.ps. Up to nine NLS fits can be plotted in this PostScript summary. On the right, the discrete exponentials (colored vertical lines, dashed for negative amplitudes) are plotted. Also plotted in the right column of data.001_e1.exp.ps is the continuous distribution from which the initial NLS parameter values were derived (black solid).

Once again, the errors estimated from this second MEM inversion are appended to a data file named in the memexp command line, dir1/data.001_e2. These errors are plotted in data.001_e2.sigma.ps. Convergence of the error estimates can be checked by comparing dir1/data.001_e1 and dir1/data.001_e2.
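As a rough illustration of such a convergence check, the sketch below compares two sets of error estimates point by point. It assumes the standard errors have already been read from the two files into arrays; the file layout is not described here, so no parsing is attempted. The function name and the choice of the largest relative change as the measure are assumptions.

```python
import numpy as np


def max_relative_change(sigma_first, sigma_second):
    """Largest relative change between two sets of error estimates.

    Hypothetical helper: both inputs are assumed to hold one positive standard
    error per data point, in the same order.
    """
    s1 = np.asarray(sigma_first, dtype=float)
    s2 = np.asarray(sigma_second, dtype=float)
    if s1.shape != s2.shape:
        raise ValueError("error arrays must have the same shape")
    return float(np.max(np.abs(s2 - s1) / np.abs(s1)))
```

A small value suggests that the second round of error estimation changed little. What counts as small enough is a judgment left to the user.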

The .out Files

The .out files produced by MemExp begin with the echoing of input parameters. During inversions, these lines begin with `RDPARS', the name of the subroutine that reads the parameters. (During analyses of images, these lines begin with `ANALYZ'.) Should a MemExp run terminate prematurely due to incorrect input data, refer to the last few lines of successfully input data and a sample .def file to correct the mistake in the input data. If all inversion parameters have been read successfully, the .out file contains the line, `Exiting routine RDPARS.'

The progress of the MEM calculation is recorded every NPRINT steps. Upon writing the MEM distributions to files, peaks in the distribution are characterized. The lines in the .out files beginning with `LT_i:' and `A_i:' report the mean log lifetime and the area estimated for each peak. The isolation/resolution of each peak is also characterized by two ratios: the intensity of the peak maximum divided by the intensity at the minimum on either side of the maximum. The smaller of these two ratios is reported as `r_i'. The total area of the distribution is given by `A_t'. When MAXEXP $>$ 0, NLS fits are performed with parameters initialized based on peaks in the MEM distribution. The NLS fits are summarized by reporting the initial and final parameter values: `Initial LT_i', `Initial A_i', `Final LT_i', etc.
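The peak characterization described above can be sketched as follows. This is an independent illustration, not MemExp's code: the peak detection (strict local maxima, so flat-topped peaks are skipped), the use of the bounding minima as integration limits, and the trapezoidal rule are all assumptions.

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def characterize_peaks(log_tau, f):
    """Return (mean log lifetime, area, r) for each strict local maximum of f.

    r is the smaller of the two ratios f_max / f_min, taken at the minimum on
    either side of the maximum, following the description in the text.
    """
    x = np.asarray(log_tau, dtype=float)
    y = np.asarray(f, dtype=float)
    peaks = []
    for i in range(1, len(y) - 1):
        if not (y[i] > y[i - 1] and y[i] > y[i + 1]):
            continue
        lo = i
        while lo > 0 and y[lo - 1] < y[lo]:  # walk downhill to the left minimum
            lo -= 1
        hi = i
        while hi < len(y) - 1 and y[hi + 1] < y[hi]:  # and to the right minimum
            hi += 1
        ratios = [y[i] / y[k] if y[k] > 0 else np.inf for k in (lo, hi)]
        xs, ys = x[lo:hi + 1], y[lo:hi + 1]
        area = _trapezoid(ys, xs)
        mean_log_tau = _trapezoid(xs * ys, xs) / area
        peaks.append((mean_log_tau, area, min(ratios)))
    return peaks
```

A large r indicates a well-isolated peak; a value near 1 indicates a shoulder that is barely resolved from its neighbor.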

MemExp automatically recommends one discrete and one distributed fit. See the lines that begin `CHOOSX' and `CHOOSE', respectively.
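To locate these lines quickly in a long .out file, a small script along the following lines may help. It relies only on the markers named in this section (`Exiting routine RDPARS.', `CHOOSX', and `CHOOSE'); the function name, the returned dictionary, and the tolerance for leading whitespace are assumptions.

```python
def scan_out_lines(lines):
    """Collect the recommendation lines from the text of a .out file.

    Hypothetical helper. Lines beginning with CHOOSX are taken to describe the
    recommended discrete fit and lines beginning with CHOOSE the recommended
    distributed fit, as stated in the text. Leading whitespace is ignored,
    which is an assumption about the file layout.
    """
    summary = {"params_read": False, "discrete": [], "distributed": []}
    for line in lines:
        text = line.strip()
        if "Exiting routine RDPARS." in text:
            summary["params_read"] = True
        elif text.startswith("CHOOSX"):
            summary["discrete"].append(text)
        elif text.startswith("CHOOSE"):
            summary["distributed"].append(text)
    return summary
```

For example, scan_out_lines(open("dir1/data.001_e1.out")) would report whether all inversion parameters were read and gather the lines describing both recommended fits.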

Image Files and Others

Distributions obtained by the MEM are written to files with names derived from the data file being inverted. For example, one MEM image stored during the numerical inversion of file data.001 might be named data.001.FLT.a.110. This would be the image obtained after 110 evaluations of the function Q to be optimized by the MEM (see above). The FLT indicates that the file was generated by MemExp, i.e., that it contains $f(\log \tau)$ values. The a specifies that the optional Lagrange multiplier $\alpha$ was used; an l would indicate that only the Lagrange multiplier $\lambda$ was used. When data are analyzed with both the g and h distributions (NDIST $= 2$), all the g parameters are written to the file, then all the h parameters, and finally the baseline parameters. For example, if a linear baseline is used, the final four lines in the output file are the positive constant coefficient, the negative constant coefficient, the positive linear coefficient, and the negative linear coefficient, respectively. To facilitate subsequent plotting of the results when NDIST $= 2$, the g and h distributions are also written separately to data.001.FLT.a.110_pos and data.001.FLT.a.110_neg, respectively.

The $\log \tau$ values and amplitudes obtained in fits by discrete exponentials are written to similar files, named data.001.FLT.a.N, where N is the number of exponentials in the fit. The associated baseline parameters are written to files named data.001.FLT.a.N_bln.
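When many of these files accumulate in one subdirectory, it can help to split their names into components. The sketch below does so with a regular expression built from the naming pattern described above. The function name and the returned field names are assumptions. The name alone does not say whether the trailing number counts evaluations of Q or exponentials, so it is returned simply as count.

```python
import re

# <data file>.FLT.<a or l>.<number>, optionally followed by _pos, _neg, or _bln
_NAME = re.compile(
    r"^(?P<data>.+)\.FLT\.(?P<multiplier>[al])\.(?P<count>\d+)"
    r"(?:_(?P<part>pos|neg|bln))?$"
)


def parse_output_name(name):
    """Split a MemExp output file name into its parts, or return None.

    Hypothetical helper based only on the naming pattern given in the text.
    """
    match = _NAME.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["count"] = int(parts["count"])
    return parts
```

For instance, parse_output_name("data.001.FLT.a.110_neg") separates the data file name data.001, the multiplier tag a, the number 110, and the suffix neg.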

The rate of MEM convergence is also recorded in the files data.001.cor and data.001_e1.cor, to which the step number, the estimated correlation length of the residuals $\tau_c$, and $\chi^2$ are written every NPRINT evaluations of Q. These .cor files are of less interest.

Recommendation

The residuals and the autocorrelation function of the residuals plotted in the MEM and NLS PostScript summaries should be inspected visually. These plots are very helpful in evaluating the `optimal' distributed and discrete fits recommended automatically by MemExp. Inspection of the PostScript file helps strike the desired compromise between two goals: a minimally structured kinetic description, and residuals that are acceptably uniform in magnitude throughout the entire temporal range of the data, with a value of $\chi^2$ of about 1.0 or somewhat less. Simply choosing the final f distribution stored during the inversion will generally result in over-fitting the data; the residuals may look good, but at the expense of unwarranted structure in f and an unnecessarily large number of exponentials in the discrete fit.



Steinbach 2002-04-09