Rice Header
CAAM Header

Technical Reports (2017)

TR17-09          PDF File

An Inverse Free Projected Gradient Descent Method for the Generalized Eigenvalue Problem
Frankie Camacho

The generalized eigenvalue problem is a fundamental numerical linear algebra problem whose applications are wide ranging. For truly large-scale problems, matrices themselves are often not directly accessible, but their actions as linear operators can be probed through matrix-vector multiplications. To solve such problems, matrix-free algorithms are the only viable option. In addition, algorithms that do multiple matrix-vector multiplications simultaneously (instead of sequentially), or so-called block algorithms, generally have greater parallel scalability that can prove advantageous on highly parallel, modern computer architectures.

In this work, we propose and study a new inverse-free, block algorithmic framework for generalized eigenvalue problems that is based on an extension of a recent framework called eigpen|an unconstrained optimization formulation utilizing the Courant Penalty function. We construct a method that borrows several key ideas, including projected gradient descent, back-tracking line search, and Rayleigh-Ritz (RR) projection. We establish a convergence theory for this framework.

We conduct numerical experiments to assess the performance of the proposed method in comparison to two well-known existing matrix-free algorithms, as well as to the popular solver arpack as a benchmark (even though it is not matrix-free). Our numerical results suggest that the new method is highly promising and worthy of further study and development.

May 2017


TR17-08          PDF File

Novel Techniques for the Zero-Forcing and p-Median Graph Location Problems
Caleb C. Fast

This thesis presents new methods for solving two graph location problems, the p-Median problem and the zero-forcing problem. For the p-median problem, I present a branch decomposition based method that finds the best p-median solution that is limited to some input support graph. The algorithm can be used to either find an integral solution from a fractional linear programming solution, or it can be used to improve on the solutions given by a pool of heuristics. In either use, the algorithm compares favorably in running time or solution quality to state-of-the-art heuristics.

For the zero-forcing problem, this thesis gives both theoretical and computational results. In the theoretical section, I show that the branchwidth of a graph is a lower bound on its zero-forcing number, and I introduce new bounds on the zero-forcing iteration index for cubic graphs. This thesis also introduces a special type of graph structure, a zero-forcing fort, that provides a powerful tool for the analysis and modeling of zero-forcing problems.

In the computational section, I introduce multiple integer programming models for finding minimum zero-forcing sets and integer programming and combinatorial branch and bound methods for finding minimum connected zero-forcing sets. While the integer programming methods do not perform better than the best combinatorial method for the basic zero-forcing problem, they are easily adapted to the connected zero-forcing problem, and they are the best methods for the connected zero-forcing problem.

May 2017


TR17-07          PDF File

A Parallel-In-Time Gradient-Type Method For Optimal Control Problems
Xiaodi Deng

This thesis proposes and analyzes a new parallel-in-time gradient-type method for time-dependent optimal control problems. When the classical gradient method is applied to such problems, each iteration requires the forward solution of the state equations followed by the backward solution of the adjoint equations before the gradient can be computed and control can be updated. The solution of the state/adjoint equations is computationally expensive and consumes most of the computation time. The proposed parallel-in-time gradient-type method introduces parallelism by splitting the time domain into N subdomains and executes the forward and backward computation in each time subdomain in parallel using state and adjoint variables at time subdomain boundaries from the last optimization iteration as initial values. The proposed method is generalized to allow different time domain partitions for forward/backward computations and overlapping time subdomains.

Convergence analyses for the parallel-in-time gradient-type method applied to discrete-time optimal control problems are established. For linear-quadratic problems, the method is interpreted as a multiple-part splitting method and convergence is proven by analyzing the corresponding iteration matrix. In addition, the connection of the new method to the multiple shooting reformulation of the problem is revealed and an alternative convergence proof based on this connection is established. For general non-linear problems, the new method is combined with metric projection to handle bound constraints on the controls and convergence of the method is proven.

Numerically, the parallel-in-time gradient-type method is applied to linear-quadratic optimal control problems and to a well-rate optimization problem governed by a system of highly non-linear partial differential equations. For linear-quadratic problems, the method exhibits strong scaling with up to 50 cores. The parallelization in time is on top of the existing parallelization in space to solve the state/adjoint equations. This is exploited in the well-rate optimization problem. If the existing parallelism in space scales well up to K processors, the addition of time domain decomposition by the proposed method scales well up to K X N processors for small to moderate number N of time subdomains.

May 2017


TR17-06          PDF File

Hermite Methods for the Simulation of Wave Propagation
Arturo Vargas

Simulations of wave propagation play a crucial role in science and engineering. In applications of geophysics, they are the engine of many seismic imaging algorithms. For electrical engineers, they can be a useful tool for the design of radars and antennas. In these applications achieving high fidelity, simulations are challenging due to the inherent issues in modeling highly oscillatory waves and the associated high computational cost of high-resolution simulations. Thus the ideal numerical method should be able to capture high-frequency waves and be suitable for parallel computing.

In both seismic applications and computational electromagnetics the Yee scheme, a finite difference time domain (FDTD) method, is the method of choice for structured grids. The scheme has the benefit of being easy to implement but performs poorly in the presence of high-frequency waves. High order accurate FDTD methods may be derived but ultimately rely on neighboring grid points when approximating derivative.

In contrast to FDTD methods, the Hermite methods of Goodrich and co-authors (2006) use Hermite interpolation and a staggered (dual) grid to construct high order accurate numerical methods for first order hyperbolic equations. These methods achieve high order approximations in both time and space by reconstructing local polynomials within cells of the computational domain and employing Hermite-Taylor time stepping. The resulting schemes are able to evolve the solution locally within a cell making them ideal for parallel computing. Building on the original Hermite methods this thesis focuses on two goals: (1) the development of new Hermite methods and (2) their implementation on modern computing architectures.

To accomplish the first objective, this thesis presents two variations of Hermite methods which are designed to simplify the scheme while preserving the favorable features. The first variation is a family of Hermite methods which do not require a dual grid. These methods eliminate the need for storing dual coefficients while maintaining optimal convergence rates. The second type of variation are Hermite methods which use leapfrog time-stepping. These schemes propagate the solution with less computation than the original scheme and may be used for either first or second order equations.

To address the second objective, this thesis presents algorithms which take advantage of the many-core architecture of graphics processing units (GPU). As threedimensional simulations can easily exceed the memory of a single GPU, techniques for partitioning the data across multiple GPUs are presented. Finally, this thesis presents numerical results and performance studies which confim the accuracy and efficiency of the proposed Hermite methods for linear and nonlinear wave equations.

May 2017


TR17-05          PDF File

Bilevel Clique Interdiction and Related Problems
Timothy Joseph Becker

I introduce a formulation of the bilevel clique interdiction problem. Interdiction, a military term, describes the removal of enemy resources. The single level clique interdiction problem describes the attempt of an attacker to interdict a maximum number of cliques. The bilevel form of the problem introduces a defender who attempts to minimize the number of cliques interdicted by the attacker. An algorithm and formulation for the bilevel clique interdiction problem has not previously been investigated. I start by introducing a formulation and a column-generation algorithm to solve the problem of bilevel interdiction of a minimum clique transversal and move forward to the creation of a delayed row-and-column generation algorithm for bilevel clique interdiction.

Next, I introduce a formulation and algorithm to solve the bilevel interdiction of a maximum stable set problem. Bilevel interdiction of a maximum stable set is choosing a maximum stable set, but with a defender who is attempting to minimize the maximum stable set that can be chosen by the interdictor. I introduce a deterministic formulation and a delayed column generation algorithm. Additionally, I introduce a stochastic formulation of the problem. I solve this problem using a cross-decomposition method that involves L-shaped cuts into a master problem as well as new "clique" cuts for the inner problem.

Lastly, I define new classes of valid inequalities and facets for the clique transversal polytope. The valid inequalities come from two graph structures who have a closed form for their vertex cover number, which we use as a specic case for nding a minimum clique transversal. The first class of facets are just the maximal clique constraints of the clique transversal polytope. The next class contains an odd hole with distinct cliques on each edge of the hole. Another similar class contains an odd clique with distinct maximal cliques on the edges of one of its spanning cycles. The fourth class contains a clique with distinct maximal cliques on every edge of the initial clique, while the last class is a prism graph with distinct maximal cliques on every edge of the prism.

May 2017


TR17-04          PDF File

GPU-Accelerated Discontinuous Galerkin Methods on Hybrid Meshes: Applications in Seismic Imaging
Zheng Wang

Seismic imaging is a geophysical technique assisting in the understanding of subsurface structure on a regional and global scale. With the development of computer technology, computationally intensive seismic algorithms have begun to gain attention in both academia and industry. These algorithms typically produce high-quality subsurface images or models, but require intensive computations for solving wave equations.

Achieving high-fidelity wave simulations is challenging: first, numerical wave solutions may suffer from dispersion and dissipation errors in long-distance propagations; second, the efficiency of wave simulators is crucial for many seismic applications. High-order methods have advantages of decreasing numerical errors efficiently and hence are ideal for wave modelings in seismic problems.

Various high order wave solvers have been studied for seismic imaging. One of the most popular solvers is the finite difference time domain (FDTD) methods. The strengths of finite difference methods are the computational efficiency and ease of implementation, but the drawback of FDTD is the lack of geometric flexibility. It has been shown that standard finite difference methods suffer from first order numerical errors at sharp media interfaces.

In contrast to finite difference methods, discontinuous Galerkin (DG) methods, a class of high-order numerical methods built on unstructured meshes, enjoy geometric flexibility and smaller interface errors. Additionally, DG methods are highly parallelizable and have explicit semi-discrete form, which makes DG suitable for large-scale wave simulations. In this dissertation, the discontinuous Galerkin methods on hybrid meshes are developed and applied to two seismic algorithms---reverse time migration (RTM) and full waveform inversion (FWI).

This thesis describes in depth the steps taken to develop a forward DG solver for the framework that efficiently exploits the element specific structure of hexahedral, tetrahedral, prismatic and pyramidal elements. In particular, we describe how to exploit the tensor-product property of hexahedral elements, and propose the use of hex-dominant meshes to speed up the computation.

The computational efficiency is further realized through a combination of graphics processing unit (GPU) acceleration and multi-rate time stepping. As DG methods are highly parallelizable, we build the DG solver on multiple GPUs with element-specific kernels. Implementation details of memory loading, workload assignment and latency hiding are discussed in the thesis. In addition, we employ a multi-rate time stepping scheme which allows different elements to take different time steps.

This thesis applies DG schemes to RTM and FWI to highlight the strengths of the DG methods. For DG-RTM, we adopt the boundary value saving strategy to avoid data movement on GPUs and utilize the memory load in the temporal updating procedure to produce images of higher qualities without a significant extra cost. For DG-FWI, a derivation of the DG-specific adjoint-state method is presented for the fully discretized DG system. Finally, sharp media interfaces are inverted by specifying perturbations of element faces, edges and vertices.

May 2017


TR17-03          PDF File

Numerical Methods and Applications for Reduced Models of Blood Flow
Charles Puelz

The human cardiovascular system is a vastly complex collection of interacting components, including vessels, organ systems, valves, regulatory mechanisms, mi- crocirculations, remodeling tissue, and electrophysiological signals. Experimental, mathematical, and computational research efforts have explored various hemody- namic questions; the scope of this literature is a testament to the intricate nature of cardiovascular physiology. In this work, we focus on computational modeling of blood ow in the major vessels of the human body. We consider theoretical questions related to the numerical approximation of reduced models for blood ow, posed as nonlinear hyperbolic systems in one space dimension. Further, we apply this modeling framework to abnormal physiologies resulting from surgical intervention in patients with congenital heart defects. This thesis contains three main parts: (i) a discussion of the implementation and analysis for numerical discretizations of reduced models for blood ow, (ii) an investigation of solutions to different classes of models in the realm of smooth and discontinuous solutions, and (iii) an application of these mod- els within a multiscale framework for simulating ow in patients with hypoplastic left heart syndrome. The two numerical discretizations studied in this thesis are a characteristics{based method for approximating the Riemann{invariants of reduced blood ow models, and a discontinuous Galerkin scheme for approximating solutions to the reduced models directly. A priori error estimates are derived in particular cases for both methods. Further, two classes of hyperbolic systems for blood ow, namely the mass{momentum and the mass{velocity formulations, are systematically compared with each numerical method and physiologically relevant networks of ves- sels and boundary conditions. Lastly, closed loop vessel network models of various Fontan physiologies are constructed. Arterial and venous trees are built from net- works of one{dimensional vessels while the heart, valves, vessel junctions, and organ beds are modeled by systems of algebraic and ordinary differential equations.

May 2017


TR17-02          PDF File

Representation and Estimation of Seismic Sources via Multipoles
Mario J. Bencomo

Accurate representation and estimation of seismic sources are essential to the seismic inversion problem. General sources can be approximated by a truncated series of multipoles depending on the source anisotropy. Most research in the joint inversion of source and medium parameters assumes seismic sources can be modeled as isotropic point-sources resulting in an inability to fit the anisotropy observed in data, ultimately impacting the recovery of medium parameters. In this thesis I lay the groundwork for joint source-medium parameter inversion with potentially anisotropic seismic sources via full waveform inversion through three key contributions: a mathematical and computational framework for the modeling and inversion of sources via multipoles, construction and analysis of discretizations of multipole sources on regular grids, and preconditioners based on fractional time derivative/integral operators for the ill-conditioned source estimation subproblem. As an application of my multipole framework, I also study the efficacy of multipoles in modeling the airgun array source, the most common type of active source in marine seismic surveying. Inversion results recovered a dominating isotropic component of the multipole source model that accounted for 84% of the observed radiation pattern. An extra 10% of the observed output pressure field can be explained when incorporating dipole terms in the source representation, thus motivating the use of multipoles to capture source anisotropy.

May 2017


TR17-01          PDF File

Projection-Based Model Reduction in the Context of Optimization with Implicit PDE Constraints
Caleb Magruder

I use reduced order models (ROMs) to substantially decrease the computational cost of Newton's method for large-scale time-dependent optimal control problems in settings where solving the implicit constraints and their adjoints can be prohibitively expensive. Reservoir management in particular can take weeks to solve the state and adjoint equations necessary to generate one gradient evaluation. Model order reduction has potential to reduce the cost; however, ROMs are valid only nearby their training and must be retrained as the optimization progresses. I will demonstrate that in the case of reservoir management, frequent retraining defeats of purpose the reduced order modeling the state equation altogether.

Rather than generating a reduced order model to replace the original implicit constraint, I use the structure of Hessian and subspace-based ROMs to compute approximate reduced order Hessian information by recycling data from the full order state and adjoint solves in the gradient computation. The resulting method often enjoys convergence rates similar to those of Newton's Method, but at the computational cost of the gradient method.

I demonstrate my approach on nonlinear parabolic optimal control problems on two cases: (1) a semilinear parabolic case with nonlinear reaction terms and (2) the well rate optimization problem in reservoir engineering. For semilinear parabolic case, I consider a cubic reaction term and an exponential reaction term modeling solid fuel injection. My results show a dramatic speedup in wall clock time for the entire optimization procedure. This overall acceleration includes the training and precomputation of the ROMs that is conventionally discounted as "offline" computational costs.

May 2017

Department of Computational and Applied Mathematics
6100 Main MS-134   Houston, TX 77005   713.348.4805

Rice University   |   School of Engineering   |   J. Joyce Young Memorial Fund   |   Pearlman Memorial Fund   |   Weiser Memorial Fund   |   Contact