
1993 Abstracts of MCS Reports and Preprints


Reports

C. H. Bischof, ``A Summary of Block Schemes for Reducing a General Matrix to Hessenberg Form,'' ANL/MCS-TM-175 (February 1993). Various strategies have been proposed for arriving at block algorithms for reducing a general matrix to Hessenberg form by means of orthogonal similarity transformations. This paper reviews and systematically categorizes the various strategies and discusses their computational characteristics.
C. H. Bischof, A. Griewank, and P. M. Khademi, ``Workshop Report on First Theory Institute on Computational Differentiation,'' ANL/MCS-TM-183 (December 1993). The first Theory Institute on Combinatorial Challenges in Computational Differentiation was held at Argonne National Laboratory, May 24-26, 1993. The primary purposes of the meeting were to explore the deep complexity issues that lie at the heart of the computation of derivatives from computer programs, and to provide a forum for brainstorming on future research directions, including the applications of automatic differentiation (AD) in scientific computing and the development of AD tools.
G. H. Chisholm, ``Toward a Methodology for Complexity Management,'' ANL-93/1 (December 1992). This report focuses on the Battle Management/Command, Control, and Communication element of the Global Protection Against Limited Strike system. The approach is based on the development and validation of a generic BM/C3 model. Central to the approach is the tenet that the design is divided into multiple layers. The critical functions make up the bottom layer, where trust is established and significant design effort is required.
A. J. Conley, ``Using a Transfer Function to Describe the Load-Balancing Problem,'' ANL-93/40 (November 1993). The dynamic load-balancing problem for mesh-connected parallel computers can be clearly described by introducing a function that identifies how much work is to be transmitted between neighboring processors. This function is a solution to an elliptic problem for which a wealth of knowledge exists. The nonuniqueness of the solution to the load-balancing problem is made explicit.
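The nearest-neighbor work exchange the abstract describes can be illustrated with a toy diffusion scheme on a one-dimensional chain of processors. This sketch is not Conley's transfer-function formulation; the function name `diffuse_load` and the step size `alpha` are illustrative choices, and the flow across each link plays the role of the work-transfer function.

```python
def diffuse_load(load, alpha=0.5, steps=200):
    # Discrete diffusion on a 1-D processor chain. Each step moves
    # alpha/2 of the load difference across every link; total work
    # is conserved, and the loads relax toward the uniform average.
    load = list(load)
    for _ in range(steps):
        # flow[i] is the work sent from processor i to processor i+1
        flow = [alpha * 0.5 * (load[i] - load[i + 1])
                for i in range(len(load) - 1)]
        for i, f in enumerate(flow):
            load[i] -= f
            load[i + 1] += f
    return load
```

Starting from `[8, 0, 0, 0]`, the loads converge to 2 on every processor, with the total conserved throughout.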
I. Foster, ``A Toolkit for Building Earth System Models,'' ANL/MCS-TM-171 (March 1993). An earth system model is a computer code designed to simulate the interrelated processes that determine the earth's weather and climate, such as atmospheric circulation, atmospheric physics, atmospheric chemistry, oceanic circulation, and biosphere. This paper is a proposal to develop a toolkit that would support a modular, or object-oriented, approach to the implementation of such models.
I. Foster and K. M. Chandy, ``Fortran M Language Definition,'' ANL-93/28 (August 1993). This document defines the Fortran M extensions to Fortran 77. It updates an earlier definition, dated June 1992, in several minor respects.
I. Foster, R. Olson, and S. Tuecke, ``Programming in Fortran M,'' ANL-93/26 (August 1993). Fortran M is a small set of extensions to Fortran that supports a modular approach to the construction of sequential and parallel programs. This report incorporates both a tutorial introduction to Fortran M and a users guide for the Fortran M compiler developed at Argonne.
I. Foster and S. Tuecke, ``Parallel Programming with PCN,'' ANL-91/32, Rev. 2 (January 1993). PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, a set of tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN.
N. Galbreath, W. Gropp, and D. Levine, ``Applications-driven Parallel I/O,'' Proceedings of Supercomputing '93, 1993, pp. 462-471. (Also MCS-P381-0893). This paper investigates the needs of some massively parallel applications running on distributed-memory parallel computers at Argonne. The authors identify some common parallel I/O operations for which routines were developed that hide the details of the actual implementation (such as the number of parallel disks) from the application, while providing good performance. An important feature is the ability for the application programmer to specify that a file be accessed either as a high-performance parallel file or as a conventional Unix file, simply by changing the value of a parameter on the file open call. These routines are examples of a parallel I/O abstraction that can enhance development, portability, and performance of I/O operations in applications. Some of the specific issues in their design and implementation in a distributed-memory toolset are discussed.
M. Garbey and H. G. Kaper, ``Asymptotic Analysis: Working Note No. 1, Basic Concepts and Definitions,'' ANL/MCS-TM-179 (July 1993). In this note the authors introduce the basic concepts of asymptotic analysis. After some comments of historical interest, they define order relations; introduce order functions, asymptotic sequences of order functions, and more general gauge sets of order functions; and define the concepts of an asymptotic approximation and an asymptotic expansion with respect to a given gauge set. They conclude with the asymptotic analysis of an initial value problem whose solution is obtained in the form of a regular asymptotic expansion.
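For reference, the standard definitions the note builds on can be stated as follows (notation here is the conventional one, not necessarily that of the note):

```latex
f = O(g) \ (\varepsilon \to 0) \iff \exists\, C > 0:\ |f(\varepsilon)| \le C\,|g(\varepsilon)|
\ \text{near } \varepsilon = 0, \qquad
f = o(g) \iff \lim_{\varepsilon \to 0} f(\varepsilon)/g(\varepsilon) = 0 .
```

A sequence of order functions $\{\delta_n(\varepsilon)\}$ is an asymptotic sequence if $\delta_{n+1} = o(\delta_n)$ for each $n$, and $f(\varepsilon) \sim \sum_n a_n\,\delta_n(\varepsilon)$ is an asymptotic expansion if $f(\varepsilon) - \sum_{n=0}^{N} a_n\,\delta_n(\varepsilon) = o(\delta_N)$ for every $N$.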
M. Garbey and H. G. Kaper, ``Asymptotic Analysis: Working Note No. 2, Approximation of Integrals,'' ANL/MCS-TM-180 (July 1993). In this note the authors discuss the approximation of integrals that depend on a parameter. The basic tool is integration by parts. The applications discussed include Laplace integrals, generalized Laplace integrals, Fourier integrals, and Stokes's method of stationary phase for generalized Fourier integrals.
M. Garbey and H. G. Kaper, ``Asymptotic Analysis: Working Note No. 3, Boundary Layers,'' ANL/MCS-TM-181 (September 1993). This working note discusses the asymptotic approximation of functions that display boundary-layer behavior. The purpose is to introduce the basic concepts underlying the phenomenon, to illustrate its importance, and to describe some of the fundamental tools available for its analysis. The focus here is on functions that are assumed to be given explicitly.
W. Gropp, ed., ``Early Experiences with the IBM SP-1,'' ANL/MCS-TM-177 (June 1993). The IBM SP1 is IBM's newest parallel distributed-memory computer. As part of a joint project with IBM, Argonne took delivery of an early system in order to evaluate the software environment and to begin porting programming packages and applications to this machine. This report discusses the results of those early efforts. Despite the newness of the machine and the lack of a fast interprocessor switch (part of the SP1 but not available for the beta machine), every code that the Argonne researchers attempted to port ran on the SP1 with little or no modification. 
W. Gropp, ed., ``Early Experiences with the IBM SP1 and the High-Performance Switch,'' ANL-93/41 (November 1993). This report discusses the results of Argonne's early experience on the IBM SP1 once the high-performance switch was installed. It updates the technical report TM-177. 
M. T. Jones and P. E. Plassmann, ``BlockSolve v1.1: Scalable Library Software for the Parallel Solution of Sparse Linear Systems,'' ANL-92/46 (March 1993). BlockSolve is a software library for solving large, sparse systems of linear equations on massively parallel computers. The matrices must be symmetric, but may have an arbitrary sparsity structure. BlockSolve is a portable package that is compatible with several different message-passing paradigms. This report gives detailed instructions on the use of BlockSolve in applications programs. 
R. Overbeek and M. Price, ``Accessing Integrated Genomic Data Using GenoBase: A Tutorial, Part 1,'' ANL/MCS-TM-173 (January 1993). GenoBase integrates genomic information from many existing databases, offering convenient access to the curated data. This document is the first part of a two-part tutorial on how to use GenoBase for accessing integrated genomic data.
M. Price, M. Muralidharan Raju, and R. Taylor, ``A Summary of Genomic Data Relating to E. coli Organized by Metabolic Pathways: An Initial Version,'' ANL/MCS-TM-174 (January 1993). This report summarizes the reactions that occur in some of the principal metabolic pathways of E. coli. These pathways have been encoded as objects in GenoBase, an integrated database under development at Argonne in collaboration with researchers at the National Institutes of Health and at Harvard University. The report lists the substrates, products, enzymes, and cofactors for each pathway as a whole, followed by a detailed description of each reaction in the pathway. In addition, for each enzyme, the report displays a description and activity as listed in the Enzyme Data Bank, followed by the corresponding Swiss Protein Data Bank entries. Separate summary lines are included for each of the E. coli genes associated with each system.
G. Schlossnagle, J. M. Restrepo, and G. K. Leaf, ``Periodized Wavelets,'' ANL-93/34 (December 1993). The properties of periodized Daubechies wavelets on [0,1] are detailed and contrasted against their counterparts which form a basis for L^2(R). Numerical examples illustrate the analytical estimates for convergence and demonstrate by comparison with Fourier spectral methods the superiority of wavelet projection methods for approximations. The analytical solution to inner products of periodized wavelets and their derivatives, which are known as connection coefficients, is presented, and several tabulated values are included.

Preprints

M. Dryja, B. F. Smith, and O. B. Widlund, ``Schwarz Analysis of Iterative Substructuring Algorithms for Elliptic Problems in Three Dimensions,'' Preprint MCS-P250-0791 and SIAM J. Numer. Anal. 31(6) (December 1994), pp. 1662-1694. This paper explores a number of different domain decomposition algorithms that can be built from modules that represent local and global components of the preconditioner. The authors demonstrate how Schwarz analysis can be used to design and analyze fast iterative substructuring algorithms for 3D elliptic problems.
J. J. More' and D. J. Thuente, ``Line Search Algorithms with Guaranteed Sufficient Decrease,'' Preprint MCS-P330-1092 and ACM Transactions on Mathematical Software 20(3) (September 1994), pp. 286-307. The problem of finding a point that satisfies the sufficient decrease and curvature conditions is formulated in terms of finding a point in a set T(mu). This paper describes a search algorithm for this problem that produces a sequence of iterates converging to a point in T(mu) and that, except for pathological cases, terminates in a finite number of steps. Numerical results for an implementation of the search algorithm on a set of test functions show that the algorithm terminates within a small number of iterations.
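The sufficient decrease and curvature conditions the abstract refers to are the (strong) Wolfe conditions on the one-dimensional function phi(alpha) = f(x + alpha*d). The sketch below checks them and runs a naive bracketing search; it is an illustration of the conditions only, not the More'-Thuente algorithm, and the names `line_search`, `c1`, and `c2` are illustrative.

```python
def line_search(phi, dphi, alpha=1.0, c1=1e-4, c2=0.9, max_iter=50):
    # phi(a) = f(x + a*d), dphi(a) = phi'(a); descent requires dphi(0) < 0.
    lo, hi = 0.0, float('inf')
    for _ in range(max_iter):
        decrease = phi(alpha) <= phi(0) + c1 * alpha * dphi(0)   # sufficient decrease
        curvature = abs(dphi(alpha)) <= c2 * abs(dphi(0))        # strong Wolfe curvature
        if decrease and curvature:
            return alpha
        if not decrease:
            hi = alpha                     # step too long: shrink the bracket
        else:
            lo = alpha                     # decrease holds, slope still steep: lengthen
        alpha = 2.0 * lo if hi == float('inf') else 0.5 * (lo + hi)
    return alpha
```

For phi(a) = (a - 1)^2 the unit step already satisfies both conditions, so the search returns immediately; starting from a longer step, bisection brings it back into the acceptable interval.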
A. Griewank, C. Bischof, G. Corliss, A. Carle, and K. Williamson, ``Derivative Convergence for Iterative Equation Solvers,'' Preprint MCS-P333-1192 and Optimization Methods and Software 2 (1993), pp. 321-355. This paper shows that derivative convergence can be achieved with an R-linear or possibly R-superlinear rate for a large class of memoryless contractions or secant updating methods. For a wider class of multistep contractions, the authors obtain R-linear convergence of a simplified derivative recurrence, which is more economical and can be easily generalized to second and higher derivatives. The authors also formulate a constructive criterion for derivative convergence based on the implicit function theorem. All theoretical results are confirmed by numerical experiments on small test examples.
S. Wright, ``A Path-Following Infeasible-Interior-Point Algorithm for Linear Complementarity Problems,'' Preprint MCS-P334-1192 and Optimization Methods and Software 2 (1993), pp. 79-106. This paper describes an infeasible-interior-point algorithm for monotone linear complementarity problems that has polynomial complexity, global linear convergence, and local superlinear convergence with a Q-order of 2. Only one matrix factorization is required per iteration, and the analysis assumes only that a strictly complementary solution exists. 
L. Wos, ``The Problem of Automated Theorem Finding,'' Preprint MCS-P335-1192 and Journal of Automated Reasoning 10 (1993), pp. 137-138.  The problem posed for research asks one to identify appropriate properties to permit an automated reasoning program to find new and interesting theorems, in contrast to proving conjectured theorems. Such programs are now functioning in many domains as valuable reasoning assistants. A sharp increase in their value would occur if they could also be used as colleagues to (so to speak) produce research on their own.  
L. Wos, ``The Problem of Selecting an Approach Based on Prior Success,'' Preprint MCS-P336-1192 and Journal of Automated Reasoning 10 (1993), pp. 283-284.  The problem posed for research asks one to find criteria for choosing from among previously answered questions a question that dictates the approach to take for attacking the problem of current interest. Especially because the better automated reasoning programs often offer a wide range of choices for representation, inference rule, and strategy, a solution to the proposed problem would materially reduce the difficulty of using these powerful aids for research. 
L. Wos, ``The Problem of Reasoning by Analogy,'' Preprint MCS-P337-1192 and Journal of Automated Reasoning 10 (1993), pp. 421-422.  The problem posed for research asks one to find criteria that an automated reasoning program can apply to determine whether to attack a given question with reasoning by analogy. The imprecise term ``reasoning by analogy'' refers to a type of reasoning in which the type of proof being sought is sharply influenced by the style of proof that was successfully used to prove related theorems. 
T. E. Hull, T. F. Fairgrieve, and P. T. P. Tang, ``Implementing Complex Elementary Functions Using Exception Handling,'' Preprint MCS-P338-1192 and ACM Trans. on Math. Softw. 20(2) (June 1994), pp. 215-244.

Not available by FTP.

The authors present algorithms for reliable and accurate evaluations of the complex elementary functions required in Fortran 77 and Fortran 90, namely, cabs, csqrt, cexp, clog, csin, and ccos. The algorithms are first presented in a pseudo-code which has convenient exception-handling facilities. A tight error bound is derived for each algorithm. Corresponding Fortran programs for an IEEE environment are also developed to illustrate the practicality of the algorithms, and these programs have been tested very carefully to help confirm the correctness of the algorithms and their error bounds.
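As a taste of the numerical care such functions require, here is the textbook stable formula for the principal complex square root, which avoids cancellation and intermediate overflow. This is a generic recipe shown for illustration, not the authors' exception-handling algorithms; the name `csqrt` is borrowed for convenience.

```python
import math

def csqrt(z):
    # Principal branch of sqrt(z) via the stable half-angle formula.
    # math.hypot computes |z| without intermediate overflow/underflow.
    x, y = z.real, z.imag
    if x == 0.0 and y == 0.0:
        return complex(0.0, 0.0)
    w = math.sqrt((math.hypot(x, y) + abs(x)) / 2.0)
    if x >= 0.0:
        return complex(w, y / (2.0 * w))          # real part is the large root
    # For x < 0 the roles swap; copysign keeps the result on the principal branch.
    return complex(abs(y) / (2.0 * w), math.copysign(w, y))
```

For example, csqrt(3+4i) = 2+i and csqrt(-4) = 2i, the principal roots.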
H. G. Kaper and M. K. Kwong, ``Vortex Configurations in High-Tc Superconducting Films,'' Journal of Computational Physics 119 (1995), pp. 120-131. Also Preprint MCS-P340-1292. This article addresses the Ginzburg-Landau (GL) model for high-temperature superconductivity in thin films (two-dimensional periodic domains). A new gauge is defined to reduce the coupling between the equations for the nonzero components of the vector potential. The GL equations are written in a novel form by means of continuous link variables; this form is symmetric and has particular advantages for numerical analysis. The continuous GL model is approximated by a discrete model, which is shown to be second-order accurate. Two methods are used for the numerical solution of the discrete model: a modified Newton's method, in combination with a sweeping algorithm for the solution of the linear system, and a time-like integration method based on gradient flow. Numerical experiments demonstrate that the discrete GL model leads to asymmetric solutions in the plane; symmetry is recovered only in the limit as the mesh size goes to zero. The results of computational experiments to find the upper critical field and establish an empirical power law for vortex interactions are given.
J. N. Lyness and S. E. Plowman, ``Trapezoidal Rule Quadrature Algorithms for MIMD Distributed-Memory Computers,'' MCS-P342-1292.

Not available by FTP.

An approach to multidimensional quadrature, designed to exploit parallel architectures, is described. This involves transforming the integral in such a way that an accurate result is given by the trapezoidal rule, and evaluating the resulting sum in a manner that may be efficiently implemented on parallel architectures. This approach is to be implemented in the Liverpool NAG transputer library.
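The transformation works because the trapezoidal rule is exceptionally (spectrally) accurate for smooth periodic integrands, where it reduces to an equally weighted sum that parallelizes trivially. A small sketch of that fact, with illustrative names not drawn from the NAG library:

```python
import math

def trapezoid_periodic(f, a, b, n):
    # For a smooth (b-a)-periodic integrand, f(a) == f(b), so the
    # composite trapezoidal rule is just an equally weighted sum --
    # ideal for splitting across processors.
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

f = lambda x: math.exp(math.cos(x))          # smooth and 2*pi-periodic
coarse = trapezoid_periodic(f, 0.0, 2 * math.pi, 16)
fine = trapezoid_periodic(f, 0.0, 2 * math.pi, 512)
# Convergence is geometric: 16 points already agree with 512 to ~1e-13.
```

For non-periodic integrands the rule is only second order, which is why the integral must first be transformed.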
M. K. Kwong, ``A Review of MATLAB 4.0,'' Preprint MCS-P343-1292 and SIAM News 26(2) (March 1993).  This article reviews the new features of MATLAB 4.0 and presents the author's experiences in using the new package. Upward incompatibilities are discussed.  
S. Wright and D. Ralph, ``A Superlinear Infeasible-Interior-Point Algorithm for Monotone Complementarity Problems,'' Mathematics of Operations Research 21, no. 4 (1996), pp. 815-838. Also Preprint MCS-P344-1292, Rev. 1. We use the globally convergent framework proposed by Kojima, Noma, and Yoshise to construct an infeasible-interior-point algorithm for monotone nonlinear complementarity problems. Superlinear convergence is attained when the solution is nondegenerate and also when the problem is linear. Numerical experiments confirm the efficacy of the proposed approach. 
I. Foster, ``Fortran M as a Language for Building Earth System Models,'' Preprint MCS-P345-0193 and Parallel Supercomputing in Atmospheric Sciences, World Scientific, Singapore, 1993. Fortran M is a small set of extensions to Fortran 77 that supports a modular or object-oriented approach to the development of parallel programs. This paper focuses on the use of Fortran M as a tool for building earth system models on massively parallel computers. The author hypothesizes that the use of Fortran M has software engineering advantages. Experiments are outlined to investigate this hypothesis. 
K. M. Chandy and I. T. Foster, ``A Deterministic Notation for Cooperating Processes,'' IEEE Transactions on Parallel and Distributed Systems, Vol. 6, no. 8 (August 1995), pp. 863-871. Also preprint MCS-P346-0193. This paper proposes extensions of sequential programming languages for parallel programming that have the following features: (1) dynamic structures, (2) paradigm integration, and (3) determinism. The ideas have been incorporated in an extension of Fortran, but the underlying sequential imperative language is not central to the ideas described here. 
C. Bischof, M. Marques, and X. Sun, ``Parallel Band Reduction and Tridiagonalization,'' MCS-P347-0193. This paper presents a parallel implementation of a blocked band reduction algorithm suggested by Bischof and Sun for symmetric matrices. The reduction to tridiagonal or block tridiagonal form is a special case of this algorithm. A blocked double torus wrap mapping is used as the underlying data distribution and the so-called WY representation is employed to represent block orthogonal transformations. Preliminary performance results on the Intel DELTA indicate that the algorithm is well suited to a MIMD computing environment and that the use of a block approach significantly improves performance. 
B. M. Averick, J. J. More', C. H. Bischof, A. Carle, and A. Griewank, ``Computing Large Sparse Jacobian Matrices Using Automatic Differentiation,'' Preprint MCS-P348-0193 and SIAM J. Sci. Comput. 15(2) (March 1994), pp. 285-294. The computation of large sparse Jacobian matrices is required in many important large-scale scientific problems. This paper considers three approaches to computing such matrices: hand-coding, difference approximations, and automatic differentiation using the ADIFOR tool. The authors compare the numerical reliability and computational efficiency of these approaches on applications from the MINPACK-2 test problem collection. The conclusion is that automatic differentiation is the method of choice, leading to results that are as accurate as hand-coded derivatives, while at the same time outperforming difference approximations in both accuracy and speed. 
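The principle behind automatic differentiation, propagating exact derivatives through each operation by the chain rule, can be sketched with a minimal forward-mode dual-number class. This toy illustrates the idea only; it is not ADIFOR (a source-transformation tool for Fortran), and the class name `Dual` is an illustrative choice.

```python
class Dual:
    # Carries (value, derivative) through each arithmetic operation,
    # applying the chain rule exactly -- no truncation error, unlike
    # difference approximations.
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)  # product rule
    __rmul__ = __mul__

def derivative(f, x):
    # Seed the independent variable with derivative 1 and read off f'(x).
    return f(Dual(x, 1.0)).der
```

For f(x) = x^3 + 2x, `derivative(f, 3.0)` returns exactly 29.0, with none of the step-size tuning a difference approximation would need.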
J. J. More', ``Generalizations of the Trust Region Method,'' Preprint MCS-P349-0193 and Optimization Methods and Software 2 (1993), pp. 189-209. The trust region problem requires the global minimum of a general quadratic function subject to an ellipsoidal constraint. The development of algorithms for the solution of this problem has found applications in nonlinear and combinatorial optimization. This paper generalizes the trust region problem by allowing a general quadratic constraint. The main results are a characterization of the global minimizer of the generalized trust region problem and the development of an algorithm that finds an approximate global minimizer in a finite number of iterations.
K. Mani Chandy and I. Foster, ``Deterministic Parallel Fortran,'' MCS-P350-0193. This paper describes Fortran M, message-passing extensions to Fortran 77 that provide deterministic execution and information hiding while preserving desirable properties of message passing.
I. Foster and P. H. Worley, ``Parallelizing the Spectral Transform Method: A Comparison of Alternative Parallel Algorithms,'' MCS-P351-0193. The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. This paper outlines different approaches to parallelizing the method and describes experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere but are designed to permit evaluation in the context of a global model. They allow one to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research.
J. M. Boyle, ``Simplification-driven Automated Partial Evaluation,'' MCS-P352-0193.

Not available by FTP.

This paper presents an automated approach to partial evaluation based on simplification and implemented by program transformations. The approach emphasizes program algebra and relies on canonical forms and distributive laws to expose instances to which simplifications can be applied. The author discusses some of the considerations that led to the design of this approach. The discussion should be useful in understanding the structure of partial evaluation transformations and as an example of how to approach the design of automated program transformations in general. This approach has been applied to a number of practical examples; the chief barrier to its wider application is the growth of the intermediate program text during partial evaluation. Nevertheless, the approach has the virtues of being implemented, automated, and able to partially evaluate specifications containing implicit data.
H. G. Kaper, G. K. Leaf, D. M. Levine, and V. Vinokur, ``Glassy Motion of an Elastic String,'' Preprint MCS-P353-0293 and Physical Review Letters 71(22) (29 November 1993), pp. 3713-3716. Numerical simulations of the dynamics of an elastic string in a quenched random potential at finite temperatures confirm the existence of glassy motion for sufficiently weak driving forces. The results show excellent qualitative and quantitative agreement with analytical predictions and thus confirm the basic assumptions underlying the theoretical model.
I. Foster, ``Compositional Parallel Programming Languages,'' ACM Trans. on Programming Languages and Systems 18, no. 4 (1996), pp. 454-476. Also Preprint MCS-P354-0293, Rev. 1, April 1996. In task-parallel programs, diverse activities can take place concurrently, and communication and synchronization patterns are complex and not easily predictable. Previous work has identified compositionality as an important design principle for task-parallel programs. In this paper, we discuss alternative approaches to the realization of this principle, which holds that properties of program components should be preserved when those components are composed in parallel with other program components. We review two programming languages, Strand and Program Composition Notation, that support compositionality via a small number of simple concepts, namely monotone operations on shared objects, a uniform addressing mechanism, and parallel composition. Both languages have been used extensively for large-scale application development, allowing us to provide an informed assessment of their strengths and weaknesses. We observe that while compositionality simplifies development of complex applications, the use of specialized languages hinders use of existing code and tools, and the specification of domain decomposition strategies. This suggests an alternative approach based on small extensions to existing sequential languages. We conclude the paper with a discussion of two languages that realize this strategy.
A. Griewank, ``Some Bounds on the Complexity of Gradients, Jacobians, and Hessians,'' MCS-P355-0393. The evaluation or approximation of derivatives is an important part of many nonlinear computations. The cost of evaluating first- and second-derivative matrices is often assumed to grow linearly and quadratically with the number of independent variables, respectively. It is shown here that much tighter bounds can be achieved through the exploitation of partial function- and argument-separability in combination with the forward and reverse mode of computational, or automatic, differentiation. The new separability concepts facilitate the reduction of chromatic numbers and maximal row lengths, which determine the complexity of the Curtis-Powell-Reid and Newsam-Ramsdell schemes for estimating sparse derivative matrices. Because of the duality between the forward and reverse modes, these techniques can be applied to Jacobians as well as their transposes and the associated row-intersection graphs. A key result presented in this paper is that gradients and Hessians of partially separable functions can also be obtained surprisingly cheaply in the easily implemented forward mode as well as in the more sophisticated reverse mode of computational differentiation. 
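The Curtis-Powell-Reid idea mentioned in the abstract (structurally orthogonal columns can share one difference or one forward-mode pass) can be sketched with a greedy grouping. This is a toy with an illustrative name `group_columns`; production codes use proper graph-coloring heuristics on the column-intersection graph.

```python
def group_columns(pattern):
    # pattern[j] = set of rows where column j of the Jacobian may be nonzero.
    # Greedily place each column in the first group whose columns have
    # disjoint row supports; the Jacobian then needs only one function
    # evaluation (or one AD pass) per group instead of one per column.
    groups, rows_used = [], []
    for j, rows in enumerate(pattern):
        for group, used in zip(groups, rows_used):
            if not (rows & used):
                group.append(j)
                used |= rows
                break
        else:
            groups.append([j])
            rows_used.append(set(rows))
    return groups
```

For a 5x5 tridiagonal sparsity pattern this yields three groups, matching the classical result that a tridiagonal Jacobian costs only three evaluations however large the matrix.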
B. F. Smith and W. D. Gropp, ``The Design of Data-Structure-Neutral Libraries for the Iterative Solution of Sparse Linear Systems,'' MCS-P356-0393. Over the past few years several proposals have been made for the standardization of sparse matrix storage formats in order to allow for the development of portable matrix libraries for the iterative solution of linear systems. Here, instead, the authors present a different approach. Rather than define one standard for matrix storage, the community should define an interface for the functions that act on the data. In addition, one must consider the interface to the vector operations since, in many applications, vectors may not be stored as consecutive elements in memory. With the acceptance of shared-memory, distributed-memory, and cluster-memory parallel machines, the flexibility of the distribution of the elements of vectors is extremely attractive. 
R. D. C. Monteiro and S. Wright, ``Local Convergence of Interior-Point Algorithms for Degenerate Monotone LCP,'' Preprint MCS-P357-0493 and Computational Optimization and Applications 3 (1994), pp. 131-155. Most asymptotic convergence analysis of interior-point algorithms for monotone linear complementarity problems assumes that the problem is nondegenerate, that is, that the solution set contains a strictly complementary solution. Here the authors investigate the behavior of these algorithms when this assumption is removed. 
G. F. Corliss and A. Griewank, ``Operator Overloading as an Enabling Technology for Automatic Differentiation,'' MCS-P358-0493. This paper presents an example of the science that is enabled by object-oriented programming techniques. Scientific computation often needs derivatives for solving nonlinear systems such as those arising in many PDE algorithms, optimization, parameter identification, stiff ordinary differential equations, or sensitivity analysis. Automatic differentiation computes derivatives accurately and efficiently by applying the chain rule to each arithmetic operation or elementary function. Operator overloading enables the techniques of either the forward or the reverse mode of automatic differentiation to be applied to real-world scientific problems. We illustrate automatic differentiation with an example drawn from a model of unsaturated flow in a porous medium. The problem arises from planning for the long-term storage of radioactive waste. 
G. H. Chisholm, ``Toward a Verifiable Approach to the Design of Concurrent Computations,'' MCS-P359-0493.

Not available by FTP.

This paper investigates an approach for proving correctness of distributed programs under an assumed data-exchange capability. Stated informally, the data-exchange assumption is realized by embedding an abstract model of the data communications mechanism into the program specification. The Message Passing Interface (MPI) standard provides a basis for such a model. The author presents here a high-level specification developed using the ASLAN specification language. The specification is based on a generalized communications model from which the MPI model may be derived. The author also describes an approach to the specification of distributed programs with explicit message passing based on a verifiable data exchange model.
R. D. C. Monteiro and S. J. Wright, ``Superlinear Primal-Dual Affine Scaling Algorithms for LCP,'' MCS-P360-0693. This paper describes an interior-point algorithm for monotone linear complementarity problems in which primal-dual affine scaling is used to generate the search directions. The algorithm is shown to have global and superlinear convergence with Q-order up to (but not including) two. The technique is shown to be consistent with a potential-reduction algorithm, yielding the first potential-reduction algorithm that is both globally and superlinearly convergent. 
R. D. C. Monteiro and S. J. Wright, ``A Superlinear Infeasible-Interior-Point Affine Scaling Algorithm for LCP,'' SIAM J. Optimization, Vol. 6, no. 1 (1996), pp. 1-18. Also Preprint MCS-P361-0693. This paper presents an infeasible-interior-point algorithm for monotone linear complementarity problems in which the search directions are affine scaling directions and the step lengths are obtained from simple formulas that ensure both global and superlinear convergence. By choosing the value of a parameter in appropriate ways, polynomial complexity and convergence with Q-order up to (but not including) two can be achieved. The only assumption made to obtain the superlinear convergence is the existence of a strictly complementary solution.
R. M. Butler and E. L. Lusk, ``Monitors, Messages, and Clusters: The p4 Parallel Programming System,'' MCS-P362-0493. p4 is a portable library of C and Fortran subroutines for programming parallel computers. It is the current version of a system that has been in use since 1984. It includes features for explicit parallel programming of shared-memory machines, distributed-memory machines (including heterogeneous networks of workstations), and clusters (shared-memory multiprocessors communicating via message passing). This paper discusses the design goals, history, and system architecture of p4. Also described is a diverse collection of applications that have demonstrated the utility of p4. 
J. N. Lyness, ``Quadrature over Curved Surfaces by Extrapolation,'' MCS-P363-0593. This paper describes and justifies a method for integrating over curved surfaces. This method does not require that the Jacobian be known explicitly. This is a natural extension of extrapolation (or Romberg integration) for plane squares or triangles. 
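For reference, the planar extrapolation scheme that the paper generalizes is classical Romberg integration: trapezoid rules on successively halved meshes, accelerated by Richardson extrapolation. A standard one-dimensional sketch (not the paper's surface method):

```python
import math

def romberg(f, a, b, levels=6):
    """Classical Romberg integration on [a, b]: trapezoid rule at
    successively halved mesh widths, then Richardson extrapolation."""
    R = [[0.0] * levels for _ in range(levels)]
    h = b - a
    R[0][0] = 0.5 * h * (f(a) + f(b))
    n = 1
    for i in range(1, levels):
        h *= 0.5
        n *= 2
        # Trapezoid refinement: only the new midpoints are evaluated
        R[i][0] = 0.5 * R[i - 1][0] + h * sum(
            f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
        # Richardson extrapolation across the row
        for j in range(1, i + 1):
            R[i][j] = R[i][j - 1] + (R[i][j - 1] - R[i - 1][j - 1]) / (4**j - 1)
    return R[levels - 1][levels - 1]

approx = romberg(math.sin, 0.0, math.pi)   # exact value is 2
assert abs(approx - 2.0) < 1e-8
```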
L. T. Biegler, J. Nocedal, and C. Schmid, ``A Reduced Hessian Method for Large-Scale Constrained Optimization,'' MCS-P364-0693. A quasi-Newton algorithm is proposed for solving large optimization problems with nonlinear equality constraints. It is designed for problems with few degrees of freedom and is motivated by the need to use sparse matrix factorizations. The algorithm can be considered to be, in some sense, a practical implementation of an algorithm of Coleman and Conn. The authors give conditions under which local and superlinear convergence is obtained. 
L. Wos, ``The Problem of Naming and Function Replacement,'' MCS-P365-0693. This article is the twenty-eighth of a series of articles discussing various open research problems in automated reasoning. The problem proposed for research asks one to find criteria that an automated reasoning program can profitably use to remove functions present in the representation and replace them with appropriate predicates or constants that name the entities that were named by the functions. The notation used to present a problem to a reasoning program can have a profound effect on the likelihood of the program's success. 
L. Wos, ``The Problem of Reasoning by Case Analysis,'' MCS-P366-0693. The problem posed for research asks one to find criteria that an automated reasoning program can apply to decide when to conduct a case analysis argument and to decide which cases are appropriate. When the choice of employing a case analysis argument is wise and the cases are well chosen, the likelihood that the reasoning program (or, for that matter, a person) will find an answer to the given question is sharply increased. 
L. Wos, ``The Problem of Induction,'' MCS-P367-0693. The problem posed for research asks for criteria for accurately determining when an induction argument is the appropriate form of argument for an automated reasoning program to use. This research problem also asks for criteria for choosing well the property on which to conduct the induction argument. 
T. Gaasterland, ``Restricting Query Relaxation through User Constraints,'' MCS-P368-0693.

Not available by FTP.

This paper describes techniques to restrict and to heuristically control the relaxation of deductive database queries. Query relaxation automatically identifies new queries related to the user's original query; for large databases, however, many relaxations may be possible. The methods introduced in this paper control and restrict the relaxation process, making it more efficient. Restrictions over the database domain may be expressed as user constraints, and the paper describes how such constraints can restrict relaxed queries. A set of heuristics based on cooperative answering techniques is also presented for controlling the relaxation process. Finally, the interaction of the methods for relaxing queries, processing user constraints, and applying the heuristic rules is described.
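The control mechanism can be pictured with a toy example. The sketch below (all names hypothetical, not Gaasterland's system or syntax) relaxes a query by generalizing its type up a small taxonomy and uses user constraints to discard unwanted relaxations:

```python
# Hypothetical type taxonomy: each type maps to its more general parent.
GENERALIZES = {
    "sports_car": "car",
    "car": "vehicle",
    "truck": "vehicle",
}

def relaxations(query_type):
    """Generate successively more general versions of a query
    by walking up the taxonomy (the relaxation step)."""
    t = query_type
    while t in GENERALIZES:
        t = GENERALIZES[t]
        yield t

def relax(query_type, user_constraints):
    """Return relaxed query types, discarding any relaxation that
    violates a user constraint (a predicate the user insists on)."""
    return [t for t in relaxations(query_type)
            if all(c(t) for c in user_constraints)]

# The user asks about sports cars but rejects answers about arbitrary vehicles.
not_too_general = lambda t: t != "vehicle"
assert relax("sports_car", [not_too_general]) == ["car"]
```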
T. Gaasterland and J. Lobo, ``Using Semantic Information for Processing Negation and Disjunction in Logic Programming,'' MCS-P369-0693.

Not available by FTP.

Integrity constraints play an important role in many applications. An example is the semantic query optimization method developed by Chakravarthy, Grant, and Minker for definite deductive databases. They use integrity constraints during query processing to prevent the exploration of search space that is bound to fail. This paper generalizes the semantic query optimization method to apply to negated atoms. The generalized method is referred to as semantic compilation. The authors show that semantic compilation provides an alternative search space for negative query literals. They also show how semantic compilation can be used to transform a disjunctive database with or without functions and denial constraints without negation into a new disjunctive database that complies with the integrity constraints.
T. Gaasterland and J. Lobo, ``Processing Negation and Disjunction in Logic Programs through Integrity Constraints,'' MCS-P370-0693.

Not available by FTP.

This paper generalizes the semantic query optimization method developed by Chakravarthy, Grant, and Minker to apply to negated atoms. The generalized method is referred to as semantic compilation. This exploration has led to two significant results. First, semantic compilation provides an alternative search space for negative query literals. The alternative search space can find answers in cases for which negation-as-finite-failure and constructive negation cannot. Second, semantic compilation can be used to transform a disjunctive database with or without functions and denial constraints without negation into a new disjunctive database that complies with the integrity constraints.
C. Y. Chang and M. K. Kwong, ``Proceedings of the Organized Session `Nonlinear Problems in Superconductivity,' First World Congress of Nonlinear Analysts,'' MCS-P371-0793. These papers represent recent efforts by mathematicians to study the Ginzburg-Landau model. The most current findings in experimental physics indicate that modifications to the original Ginzburg-Landau theory are needed to adequately describe the phenomena exhibited by the new superconductors, which have a layered structure at the atomic level and possess high anisotropy. Impurities in the material act as pinning centers for vortices, and random thermal noise tends to depin them. Theoretical physicists have suggested the existence of additional phases, the vortex-liquid and vortex-glass phases, but the debate is still going on. The papers presented here give a better understanding of the classical model and will surely shed some light on how to modify it to accommodate the newer theories. 
E. L. Lusk, S. Mudambi, R. Overbeek, and P. Szeredi, ``Applications of the Aurora Parallel Prolog System to Computational Molecular Biology,'' MCS-P372-0793. This paper describes an investigation into the use of the Aurora parallel Prolog system in two applications within the area of computational molecular biology. The applications had large computational requirements and were run on a scalable parallel computer (the BBN TC2000). Results demonstrate that logic programming can be effective in the context of demanding applications on large-scale parallel machines. Some insights into parallel programming in Prolog are also provided.  
L. V. Curfman, ``A New Finite Element Formulation for Incompressible Flow,'' MCS-P373-0793. A basic objective in computational fluid dynamics is the efficient solution of nonlinear systems of equations that arise in finite element modeling of convective-diffusive flow. The use of implicit Newton-like schemes to solve the coupled system of Navier-Stokes and continuity equations enables rapid convergence, although the well-known difficulty of indirect pressure linkage requires attention when forming the Jacobian matrices. This paper presents a primitive variable finite-element formulation that uses an auxiliary pressure equation derived from the Navier-Stokes and continuity equations. This formulation extends the work of Rice and Schnipke, where a similar equation was developed in the context of a segregated solution method. Approximate Newton methods using the new finite element formulation are evaluated in terms of accuracy, convergence rate, and overall efficiency for flow problems with varying degrees of linearity. 
R. M. Butler, A. L. Leveton, and E. L. Lusk, ``p4-Linda: A Portable Implementation of Linda,'' MCS-P374-0793. Facilities such as interprocess communication and protection of shared resources have been added to operating systems to support multiprogramming and have since been adapted to exploit explicit multiprocessing within the scope of two models: the shared-memory model and the distributed (message-passing) model. When multiprocessors (or networks of heterogeneous processors) are used for explicit parallelism, the difference between these models is exposed to the programmer. The p4 tool set was originally developed to buffer the programmer from synchronization issues while offering an added advantage in portability; however, two models are often still needed to develop parallel algorithms. Here the authors provide two implementations of Linda in an attempt to support a single high-level programming model on top of the existing paradigms. The result is a consistent semantics regardless of the underlying model. Linda's fundamental properties associated with generative communication eliminate the distinction between shared and distributed memory. 
C. Bischof, S. Huss-Lederman, X. Sun, and A. Tsao, ``The PRISM Project: Infrastructure and Algorithms for Parallel Eigensolvers,'' MCS-P383-0993. The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. The authors are investigating a complete eigensolver based on the invariant subspace decomposition algorithm for dense symmetric matrices (SYISDA). After reviewing SYISDA, they discuss the algorithmic highlights of a distributed-memory implementation of this approach. These include a fast matrix-matrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divide-and-conquer parallelism in the problem. Also presented are performance results of these kernels as well as the overall SYISDA implementation on the Intel DELTA. 
K. M. Chandy, I. Foster, K. Kennedy, C. Koelbel, and C.-W. Tseng, ``Integrated Support for Task and Data Parallelism,'' Preprint MCS-P384-0993 and The International Journal of Supercomputer Applications 8:2 (Summer 1994), pp. 80-98. This paper discusses CRPC research designed to provide an efficient, portable programming model for scientific applications possessing both task and data parallelism. The authors present an interface for using a task-parallel language to coordinate concurrent data-parallel computations. A key concept is the integration of Fortran M resource management constructs and Fortran D data decomposition constructs.  
C. Zhu, ``Inexact Proximal Point Algorithm for Minimization,'' MCS-P385-0993. All of the existing variants of the proximal point algorithm (PPA) require that the inner loop subproblem be solved exactly or asymptotically accurately, an approach that is computationally expensive in most cases. The author here develops an inexact PPA for minimization, where the inner loop can be restricted to only a few iterations. He proves that, under mild conditions, the sequence of the objective values generated by the inexact PPA converges linearly to the optimal value if the inner loop iteration has a linear rate on the regularized subproblems. Also discussed are applications of the inexact PPA on hemiquadratic extended linear-quadratic programming, and on minimization of functions with singular Hessian matrices. 
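The idea of truncating the inner loop can be sketched as follows. This is an illustrative implementation, not Zhu's algorithm; the step size and the Lipschitz assumption behind it are invented for the example:

```python
def inexact_ppa(grad, x0, c=1.0, outer=50, inner=3, lr=1.0 / 11.0):
    """Inexact proximal point iteration for min f(x): each outer step
    approximately minimizes the regularized subproblem
    f(y) + (c/2)||y - x||^2 using only a few gradient steps."""
    x = list(x0)
    for _ in range(outer):
        y = list(x)
        for _ in range(inner):   # inner loop deliberately truncated
            g = grad(y)          # subproblem gradient: grad f(y) + c (y - x)
            y = [yi - lr * (gi + c * (yi - xi))
                 for yi, gi, xi in zip(y, g, x)]
        x = y
    return x

# Quadratic test problem: f(x) = (x1^2 + 10 x2^2) / 2, minimizer at the origin
grad = lambda x: [x[0], 10.0 * x[1]]
sol = inexact_ppa(grad, [5.0, 5.0])
assert all(abs(v) < 1e-2 for v in sol)   # converges despite inexact inner solves
```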
R. Padmanabhan and W. McCune, ``Single Identities for Ternary Boolean Algebras,'' MCS-P386-1093. This paper presents the first known single axioms for ternary Boolean algebra. The first single axiom was found algebraically by a previous technique of Padmanabhan and Quackenbush; then simpler single axioms were found by using the automated reasoning program OTTER. 
L. Wos, ``The Resonance Strategy,'' MCS-P387-1093. This article focuses on the concept of resonator and on the resonance strategy that keys on resonators to direct the search for the information needed for assignment completion. Various examples from group theory, Robbins algebra, and logic calculi are discussed in which the resonance strategy played a role. 
R. C. Brown, D. B. Hinton, and M. K. Kwong, ``A Remark on a Weighted Landau Inequality of Kwong and Zettl,'' MCS-P388-1093. This paper extends a weighted Landau inequality of Kwong and Zettl introduced in 1981. 
J. Drake, I. Foster, J. J. Hack, J. Michalakes, B. D. Semeraro, B. Toonen, D. L. Williamson, and P. Worley, ``PCCM2: A GCM Adapted for Scalable Parallel Computers,'' MCS-P389-1093. This paper reviews the various parallel algorithms used in PCCM2 and the work done to arrive at a validated model.
R. Stevens, ``High-Performance Computing and Communications,'' MCS-P390-1093. This paper discusses the U.S. HPCC program: its goals, funding, progress, revisions, and research. The paper then turns to specific work conducted under this program at Argonne. 
I. Foster, ``Language Constructs for Modular Parallel Programs,'' MCS-P391-1093. This paper describes programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow one to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. The paper describes how these constructs are incorporated in PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications. 
W. Gropp and E. Lusk, ``An Abstract Device Definition to Support the Implementation of a High-Level Point-to-Point Message-Passing Interface,'' Preprint MCS-P392-1193. In this paper we describe an abstract device interface (ADI) that may be used to efficiently implement message-passing systems. This work was done to provide an implementation of the Message Passing Interface (MPI); however, the interface is appropriate for many message-passing systems. The ADI provides for both simple devices and those capable of significant processing. We discuss some of the issues in the implementation and provide a sample implementation for a "device" that is capable of message-passing.
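The shape of such a layered design can be sketched in a few lines. This is a hypothetical rendering in Python, not the actual ADI: the high-level message-passing layer calls only the abstract methods, and each transport supplies its own implementation, shown here as a trivial loopback "device":

```python
from abc import ABC, abstractmethod
from collections import deque

class AbstractDevice(ABC):
    """Hypothetical abstract device interface: the portable upper layer
    of a message-passing system is written against these methods only."""
    @abstractmethod
    def post_send(self, dest, tag, data): ...
    @abstractmethod
    def post_recv(self, source, tag): ...

class LoopbackDevice(AbstractDevice):
    """Trivial 'device' that delivers messages within a single process,
    matching receives to sends by (rank, tag)."""
    def __init__(self):
        self.queues = {}
    def post_send(self, dest, tag, data):
        self.queues.setdefault((dest, tag), deque()).append(data)
    def post_recv(self, source, tag):
        # In a loopback device, dest and source are the same rank.
        return self.queues[(source, tag)].popleft()

dev = LoopbackDevice()
dev.post_send(dest=0, tag=7, data=b"hello")
assert dev.post_recv(source=0, tag=7) == b"hello"
```

Swapping `LoopbackDevice` for, say, a socket-backed device would leave the upper layer untouched, which is the portability argument the paper makes.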
C. Bischof, X. Sun, A. Tsao, and T. Turnbull, ``A Study of the Invariant Subspace Decomposition Algorithm for Banded Symmetric Matrices,'' MCS-P394-1193. This paper gives an overview of the invariant subspace decomposition algorithm for banded symmetric matrices and describes a sequential implementation of this algorithm. The implementation uses a specialized routine for performing banded matrix multiplication together with successive band reduction, yielding a sequential algorithm that is competitive for large problems with the LAPACK QR code in computing all of the eigenvalues and eigenvectors of a dense symmetric matrix. Performance results are given on a variety of machines. 
J. S. Rowlan and B. T. Wightman, ``PORTAL: Communication Library for Run-Time Visualization of Distributed, Asynchronous Data,'' MCS-P395-1193. This paper presents a method for collecting and visualizing data generated by a parallel computational simulation during run time. Data distributed across multiple processes is sent across parallel communication lines to a workstation, which sorts and queues the data for visualization. The method has been implemented in a set of tools called PORTAL. The tools comprise generic routines for sending data from a parallel program and a run-time connection to the scientific visualization program AVS. The method is most valuable for examining large datasets that can be efficiently generated and do not need to be stored on disk. 
K. W. Dritz, ``The Numerics Annex and Related Material,'' MCS-P396-1193. This paper presents an overview of the numerical features of Ada 9X.
K. M. Chandy and I. T. Foster, ``Parallel Language Constructs for Paradigm Integration and Deterministic Computations,'' MCS-P397-1193. This paper describes parallel extensions of sequential programming languages for writing programs that integrate different programming paradigms and that execute in heterogeneous environments comprising both distributed and shared memory. The extensions can be used to write programs with dynamic process and communication structures. Programs can use shared-memory, message-passing, and data-parallel programming paradigms, and can be written in a way that permits the compiler and run-time system to verify that they are deterministic. The extensions also provide the programmer with control over how data and processes are mapped to processors, and hence how computational resources are allocated to different parts of a program. A subset of these ideas has been incorporated in Fortran M; however, the underlying sequential notation is not central to the ideas. 
J. B. Drake, R. E. Flanery, I. T. Foster, J. J. Hack, J. G. Michalakes, R. L. Stevens, D. W. Walker, D. L. Williamson, and P. H. Worley, ``The Message-Passing Version of the Parallel Community Climate Model,'' MCS-P398-1193. This paper is a brief overview of a parallel version of the NCAR Community Climate Model CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel DELTA with 512 processors; the initial target for the production code is the Intel Paragon with 2048 processors. The code can be easily ported to other multiprocessors supporting a message-passing programming paradigm, or run on machines distributed across a network. 
W. McCune, ``Otter 3.0,'' MCS-P399-1193. This brief note summarizes the important features of Otter 3.0, Argonne's most powerful general-purpose automated reasoning system. 
S. J. Wright, ``Stability of Linear Equations Solvers in Interior-Point Methods,'' SIAM J. Matrix Anal. Appl. 16, no. 4 (October 1995), pp. 1184-1196. Also Preprint MCS-P400-1293. Primal-dual interior-point methods for linear complementarity and linear programming problems solve a linear system of equations to obtain a modified Newton step at each iteration. These linear systems become increasingly ill conditioned in the later stages of the algorithm, but the computed steps are often sufficiently accurate to be useful. The author uses error analysis techniques tailored to the special structure of the linear systems to explain this observation. He examines how the theoretically superlinear convergence of a path-following algorithm is affected by roundoff errors. 
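A small numerical illustration (constructed for this summary, not taken from the paper) of the ill-conditioning in question: near a strictly complementary solution, the diagonal ratios s_i/x_i of an interior-point augmented system split into O(mu) and O(1/mu) groups, so the condition number of the system grows as the duality measure mu shrinks.

```python
import numpy as np

np.random.seed(0)
A = np.random.randn(3, 6)   # a small random constraint matrix

def kkt_cond(mu):
    """Condition number of an augmented (KKT-style) interior-point matrix.
    Near a strictly complementary solution, half of the diagonal ratios
    s_i/x_i behave like mu and the other half like 1/mu."""
    d = np.concatenate([mu * np.ones(3), np.ones(3) / mu])
    K = np.block([[np.diag(d), A.T], [A, np.zeros((3, 3))]])
    return np.linalg.cond(K)

conds = [kkt_cond(mu) for mu in (1e-2, 1e-4, 1e-6)]
assert conds[0] < conds[1] < conds[2]   # conditioning degrades as mu -> 0
```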
S. J. Wright, ``A Path-Following Interior-Point Algorithm for Linear and Quadratic Problems,'' Preprint MCS-P401-1293. This paper describes an algorithm for the monotone linear complementarity problem that converges from any positive, not necessarily feasible, starting point and exhibits polynomial complexity if some additional assumptions are made on the starting point. If the problem has a strictly complementary solution, the method converges subquadratically. The author shows that the algorithm and its convergence properties extend readily to the mixed monotone linear complementarity problem and hence to all the usual formulations of the linear programming and convex quadratic programming problems.  
G. J. Olsen, C. R. Woese, and R. A. Overbeek, ``The Winds of (Evolutionary) Change: Breathing New Life into Microbiology,'' Preprint MCS-P402-1293. This paper reports on the construction of a phylogenetic tree of over 1500 prokaryotes characterized by small subunit rRNA sequencing. The importance of this tree to scientists' understanding of the relationships between eukaryotes and prokaryotes and between archaea and bacteria is discussed. The paper also discusses broader evolutionary questions about the origins of parts of the eukaryotic genome and the nature of the most recent common ancestor of present-day life.
R. H. Byrd, P. Lu, and J. Nocedal, ``A Limited-Memory Algorithm for Bound-Constrained Optimization,'' MCS-P404-1293. An algorithm for solving large nonlinear optimization problems with simple bounds is described. It is based on the gradient projection method and uses a limited-memory BFGS matrix to approximate the Hessian of the objective function. The authors show how to take advantage of the form of the limited-memory approximation to implement the algorithm efficiently. The results of numerical tests on a set of large problems are reported. 
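The gradient projection backbone of such a method can be sketched as below. This toy version replaces the limited-memory BFGS model with a plain gradient step and a fixed step length, so it illustrates only the bound handling, not the paper's Hessian approximation:

```python
def projected_gradient(grad, x0, lower, upper, lr=0.1, iters=200):
    """Gradient projection for min f(x) subject to lower <= x <= upper:
    take a descent step, then project each coordinate back onto its bounds."""
    clip = lambda v, lo, hi: min(max(v, lo), hi)
    x = [clip(v, lo, hi) for v, lo, hi in zip(x0, lower, upper)]
    for _ in range(iters):
        g = grad(x)
        x = [clip(v - lr * gi, lo, hi)
             for v, gi, lo, hi in zip(x, g, lower, upper)]
    return x

# f(x) = (x1 - 3)^2 + (x2 + 1)^2 on the box [0,1] x [0,1]:
# the unconstrained minimizer (3, -1) projects to the corner (1, 0).
grad = lambda x: [2 * (x[0] - 3), 2 * (x[1] + 1)]
sol = projected_gradient(grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
assert abs(sol[0] - 1.0) < 1e-6 and abs(sol[1] - 0.0) < 1e-6
```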
J. M. Restrepo and J. L. Bona, ``Model for the Formation of Longshore Sand Ridges on the Continental Shelf,'' MCS-P406-1293. A model is proposed for the formation and evolution of three-dimensional sedimentary structures such as longshore sand ridges on the continental shelf in water deeper than that of the shoaling region. Owing to the striking similarity between the bar spacing and the length scales in which interactions among the most energetic modes of shallow water waves take place, the authors argue that these bars are formed slowly by flows in the turbulent boundary layer generated by weakly nonlinear, dispersive waves. Hence, the model is based on the interaction between surficial, weakly nonlinear shallow water waves, having weak spanwise spatial dependence, and the bottom topography. 
J. M. Restrepo and J. L. Bona, ``Model for the Formation of Longshore Sand Ridges on the Continental Shelf: The Interaction of Internal Waves and the Bottom Topography,'' MCS-P407-1293. Longshore sand ridges are frequently observed to occur on the continental shelf where the overlying ocean is stratified. This study formulates a model for the formation and evolution of three-dimensional longshore sand ridges on the continental shelf. The model is based on the interaction of interfacial, weakly nonlinear waves in a stratified ocean with the sedimentary bottom topography. 
J. M. Restrepo and J. L. Bona, ``Discretization of a Model for the Formation of Longshore Sand Ridges,'' MCS-P408-1293. This paper presents and evaluates the numerical solution of a coupled system of equations that arises in a model for the formation and evolution of three-dimensional longshore sand ridges. The model is based on the interaction between surficial or internal weakly nonlinear shallow-water waves, having weak spanwise spatial dependence, and the deformable bottom topography. 
J. M. Restrepo and J. L. Bona, ``Structure and Behavior of Triad Interactions for a Boussinesq System Arising in a Model for the Formation of Sand Ridges,'' MCS-P409-1293. The Boussinesq system describes weakly nonlinear dispersive long waves in plasmas and incompressible irrotational fluids. This study presents some results regarding the structure and behavior of a system of equations that yield the spatial structure of triad interactions in the Boussinesq system. Such a system forms part of a model for the formation and evolution of sand ridges on the continental shelf. The aims of this study are to provide some insight into the behavior of the triad system and into the sand ridge model in particular.
[MCS | Research | Resources | People | Collaboration | Software | Publications | Information]