TetGen is a C++ program for generating good quality tetrahedral meshes aimed to support numerical methods and scientific computing. The problem of quality tetrahedral mesh generation is challenged by many theoretical and practical issues. TetGen uses Delaunay-based algorithms which have theoretical guarantee of correctness. It can robustly handle arbitrary complex 3D geometries and is fast in practice. The source code of TetGen is freely available. This article presents the essential algorithms and techniques used to develop TetGen. The intended audience are researchers or developers in mesh generation or other related areas. It describes the key software components of TetGen, including an efficient tetrahedral mesh data structure, a set of enhanced local mesh operations (combination of flips and edge removal), and filtered exact geometric predicates. The essential algorithms include incremental Delaunay algorithms for inserting vertices, constrained Delaunay algorithms for inserting constraints (edges and triangles), a new edge recovery algorithm for recovering constraints, and a new constrained Delaunay refinement algorithm for adaptive quality tetrahedral mesh generation. Experimental examples as well as comparisons with other softwares are presented.

We describe the University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications. The Collection is widely used by the numerical linear algebra community for the development and performance evaluation of sparse matrix algorithms. It allows for robust and repeatable experiments: robust because performance results with artificially generated matrices can be misleading, and repeatable because matrices are curated and made publicly available in many formats. Its matrices cover a wide spectrum of domains, include those arising from problems with underlying 2D or 3D geometry (as structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that typically do not have such geometry (optimization, circuit simulation, economic and financial modeling, theoretical and quantum chemistry, chemical process simulation, mathematics and statistics, power networks, and other networks and graphs). We provide software for accessing and managing the Collection, from MATLAB™, Mathematica™, Fortran, and C, as well as an online search capability. Graph visualization of the matrices is provided, and a new multilevel coarsening scheme is proposed to facilitate this task.

A general-purpose MATLAB software program called GPOPSII is described for solving multiple-phase optimal control problems using variable-order Gaussian quadrature collocation methods. The software employs a Legendre-Gauss-Radau quadrature orthogonal collocation method where the continuous-time optimal control problem is transcribed to a large sparse nonlinear programming problem (NLP). An adaptive mesh refinement method is implemented that determines the number of mesh intervals and the degree of the approximating polynomial within each mesh interval to achieve a specified accuracy. The software can be interfaced with either quasi-Newton (first derivative) or Newton (second derivative) NLP solvers, and all derivatives required by the NLP solver are approximated using sparse finite-differencing of the optimal control problem functions. The key components of the software are described in detail and the utility of the software is demonstrated on five optimal control problems of varying complexity. The software described in this article provides researchers a useful platform upon which to solve a wide variety of complex constrained optimal control problems.

An algorithm is described to solve multiple-phase optimal control problems using a recently developed numerical method called the Gauss pseudospectral method . The algorithm is well suited for use in modern vectorized programming languages such as FORTRAN 95 and MATLAB. The algorithm discretizes the cost functional and the differential-algebraic equations in each phase of the optimal control problem. The phases are then connected using linkage conditions on the state and time. A large-scale nonlinear programming problem (NLP) arises from the discretization and the significant features of the NLP are described in detail. A particular reusable MATLAB implementation of the algorithm, called GPOPS , is applied to three classical optimal control problems to demonstrate its utility. The algorithm described in this article will provide researchers and engineers a useful software tool and a reference when it is desired to implement the Gauss pseudospectral method in other programming languages.

Bundle adjustment constitutes a large, nonlinear least-squares problem that is often solved as the last step of feature-based structure and motion estimation computer vision algorithms to obtain optimal estimates. Due to the very large number of parameters involved, a general purpose least-squares algorithm incurs high computational and memory storage costs when applied to bundle adjustment. Fortunately, the lack of interaction among certain subgroups of parameters results in the corresponding Jacobian being sparse, a fact that can be exploited to achieve considerable computational savings. This article presents sba, a publicly available C/C++ software package for realizing generic bundle adjustment with high efficiency and flexibility regarding parameterization.

NOMAD is software that implements the Mesh Adaptive Direct Search (MADS) algorithm for blackbox optimization under general nonlinear constraints. Blackbox optimization is about optimizing functions that are usually given as costly programs with no derivative information and no function values returned for a significant number of calls attempted. NOMAD is designed for such problems and aims for the best possible solution with a small number of evaluations. The objective of this article is to describe the underlying algorithm, the software's functionalities, and its implementation.

We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation. Finite element variational forms may be expressed in near mathematical notation, from which low-level code is automatically generated, compiled, and seamlessly integrated with efficient implementations of computational meshes and high-performance linear algebra. Easy-to-use object-oriented interfaces to the library are provided in the form of a C++ library and a Python module. This article discusses the mathematical abstractions and methods used in the design of the library and its implementation. A number of examples are presented to demonstrate the use of the library in application code.

An overview of the software design and data abstraction decisions chosen for deal.II, a general purpose finite element library written in C++, is given. The library uses advanced object-oriented and data encapsulation techniques to break finite element implementations into smaller blocks that can be arranged to fit users requirements. Through this approach, deal.II supports a large number of different applications covering a wide range of scientific areas, programming methodologies, and application-specific algorithms, without imposing a rigid framework into which they have to fit. A judicious use of programming techniques allows us to avoid the computational costs frequently associated with abstract object-oriented class libraries. The paper presents a detailed description of the abstractions chosen for defining geometric information of meshes and the handling of degrees of freedom associated with finite element spaces, as well as of linear algebra, input/output capabilities and of interfaces to other software, such as visualization tools. Finally, some results obtained with applications built atop deal.II are shown to demonstrate the powerful capabilities of this toolbox.

Tapenade is an Automatic Differentiation (AD) tool which, given a Fortran or C code that computes a function, creates a new code that computes its tangent or adjoint derivatives. Tapenade puts particular emphasis on adjoint differentiation, which computes gradients at a remarkably low cost. This article describes the principles of Tapenade, a subset of the general principles of AD. We motivate and illustrate with examples the AD model of Tapenade, that is, the structure of differentiated codes and the strategies used to make them more efficient. Along with this informal description, we formally specify this model by means of data-flow equations and rules of Operational Semantics, making this the reference specification of the tangent and adjoint modes of Tapenade. One benefit we expect from this formal specification is the capacity to formally study the AD model itself, especially for the adjoint mode and its sophisticated strategies. This article also describes the architectural choices of the implementation of Tapenade. We describe the current performance of Tapenade on a set of codes that include industrial-size applications. We present the extensions of the tool that are planned in a foreseeable future, deriving from our ongoing research on AD.

Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrake adopts the domain-specific language for the finite element method of the FEniCS project, but with a pure Python runtime-only implementation centered on the composition of several existing and new abstractions for particular aspects of scientific computing. The result is a more complete separation of concerns that eases the incorporation of separate contributions from computer scientists, numerical analysts, and application specialists. These contributions may add functionality or improve performance. Firedrake benefits from automatically applying new optimizations. This includes factorizing mixed function spaces, transforming and vectorizing inner loops, and intrinsically supporting block matrix operations. Importantly, Firedrake presents a simple public API for escaping the UFL abstraction. This allows users to implement common operations that fall outside of pure variational formulations, such as flux limiters.

CHOLMOD is a set of routines for factorizing sparse symmetric positive definite matrices of the form A or AA T , updating/downdating a sparse Cholesky factorization, solving linear systems, updating/downdating the solution to the triangular system Lx = b , and many other sparse matrix functions for both symmetric and unsymmetric matrices. Its supernodal Cholesky factorization relies on LAPACK and the Level-3 BLAS, and obtains a substantial fraction of the peak performance of the BLAS. Both real and complex matrices are supported. CHOLMOD is written in ANSI/ISO C, with both C and MATLAB TM interfaces. It appears in MATLAB 7.2 as x = A\b when A is sparse symmetric positive definite, as well as in several other sparse matrix functions.

An overview of the software design and data abstraction decisions chosen for deal. II, a general purpose finite element library written in C++, is given. The library uses advanced object-oriented and data encapsulation techniques to break finite element implementations into smaller blocks that can be arranged to fit users requirements. Through this approach, deal. II supports a large number of different applications covering a wide range of scientific areas, programming methodologies, and application-specific algorithms, without imposing a rigid framework into which they have to fit. A judicious use of programming techniques allows us to avoid the computational costs frequently associated with abstract object-oriented class libraries. The paper presents a detailed description of the abstractions chosen for defining geometric information of meshes and the handling of degrees of freedom associated with finite element spaces, as well as of linear algebra, input/output capabilities and of interfaces to other software, such as visualization tools. Finally, some results obtained with applications built atop deal. II are shown to demonstrate the powerful capabilities of this toolbox.

We present the basic principles that underlie the high-performance implementation of the matrix-matrix multiplication that is part of the widely used GotoBLAS library. Design decisions are justified by successively refining a model of architectures with multilevel memories. A simple but effective algorithm for executing this operation results. Implementations on a broad selection of architectures are shown to achieve near-peak performance.

We present the Unified Form Language (UFL), which is a domain-specific language for representing weak formulations of partial differential equations with a view to numerical approximation. Features of UFL include support for variational forms and functionals, automatic differentiation of forms and expressions, arbitrary function space hierarchies for multifield problems, general differential operators and flexible tensor algebra. With these features, UFL has been used to effortlessly express finite element methods for complex systems of partial differential equations in near-mathematical notation, resulting in compact, intuitive and readable programs. We present in this work the language and its construction. An implementation of UFL is freely available as an open-source software library. The library generates abstract syntax tree representations of variational problems, which are used by other software libraries to generate concrete low-level implementations. Some application examples are presented and libraries that support UFL are highlighted.

NFFT 3 is a software library that implements the nonequispaced fast Fourier transform (NFFT) and a number of related algorithms, for example, nonequispaced fast Fourier transforms on the sphere and iterative schemes for inversion. This article provides a survey on the mathematical concepts behind the NFFT and its variants, as well as a general guideline for using the library. Numerical examples for a number of applications are given.

We introduce TestU01 , a software library implemented in the ANSI C language, and offering a collection of utilities for the empirical statistical testing of uniform random number generators (RNGs). It provides general implementations of the classical statistical tests for RNGs, as well as several others tests proposed in the literature, and some original ones. Predefined tests suites for sequences of uniform random numbers over the interval (0, 1) and for bit sequences are available. Tools are also offered to perform systematic studies of the interaction between a specific test and the structure of the point sets produced by a given family of RNGs. That is, for a given kind of test and a given class of RNGs, to determine how large should be the sample size of the test, as a function of the generator's period length, before the generator starts to fail the test systematically. Finally, the library provides various types of generators implemented in generic form, as well as many specific generators proposed in the literature or found in widely used software. The tests can be applied to instances of the generators predefined in the library, or to user-defined generators, or to streams of random numbers produced by any kind of device or stored in files. Besides introducing TestU01 , the article provides a survey and a classification of statistical tests for RNGs. It also applies batteries of tests to a long list of widely used RNGs.

We present a collection of 52 nonlinear eigenvalue problems in the form of a MATLAB toolbox. The collection contains problems from models of real-life applications as well as ones constructed specifically to have particular properties. A classification is given of polynomial eigenvalue problems according to their structural properties. Identifiers based on these and other properties can be used to extract particular types of problems from the collection. A brief description of each problem is given. NLEVP serves both to illustrate the tremendous variety of applications of nonlinear eigenvalue problems and to provide representative problems for testing, tuning, and benchmarking of algorithms and codes.

The BLAS-like Library Instantiation Software (BLIS) framework is a new infrastructure for rapidly instantiating Basic Linear Algebra Subprograms (BLAS) functionality. Its fundamental innovation is that virtually all computation within level-2 (matrix-vector) and level-3 (matrix-matrix) BLAS operations can be expressed and optimized in terms of very simple kernels. While others have had similar insights, BLIS reduces the necessary kernels to what we believe is the simplest set that still supports the high performance that the computational science community demands. Higher-level framework code is generalized and implemented in ISO C99 so that it can be reused and/or reparameterized for different operations (and different architectures) with little to no modification. Inserting high-performance kernels into the framework facilitates the immediate optimization of any BLAS-like operations which are cast in terms of these kernels, and thus the framework acts as a productivity multiplier. Users of BLAS-dependent applications are given a choice of using the traditional Fortran-77 BLAS interface, a generalized C interface, or any other higher level interface that builds upon this latter API. Preliminary performance of level-2 and level-3 operations is observed to be competitive with two mature open source libraries (OpenBLAS and ATLAS) as well as an established commercial product (Intel MKL).

The Trilinos Project is an effort to facilitate the design, development, integration, and ongoing support of mathematical software libraries within an object-oriented framework for the solution of large-scale, complex multiphysics engineering and scientific problems. Trilinos addresses two fundamental issues of developing software for these problems: (i) providing a streamlined process and set of tools for development of new algorithmic implementations and (ii) promoting interoperability of independently developed software.Trilinos uses a two-level software structure designed around collections of packages . A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners, nonlinear solvers, etc. Packages exist underneath the Trilinos top level, which provides a common look-and-feel, including configuration, documentation, licensing, and bug-tracking.Here we present the overall Trilinos design, describing our use of abstract interfaces and default concrete implementations. We discuss the services that Trilinos provides to a prospective package and how these services are used by various packages. We also illustrate how packages can be combined to rapidly develop new algorithms. Finally, we discuss how Trilinos facilitates high-quality software engineering practices that are increasingly required from simulation software.

KLU is a software package for solving sparse unsymmetric linear systems of equations that arise in circuit simulation applications. It relies on a permutation to Block Triangular Form (BTF), several methods for finding a fill-reducing ordering (variants of approximate minimum degree and nested dissection), and Gilbert/Peierls' sparse left-looking LU factorization algorithm to factorize each block. The package is written in C and includes a MATLAB interface. Performance results comparing KLU with SuperLU, Sparse 1.3, and UMFPACK on circuit simulation matrices are presented. KLU is the default sparse direct solver in the Xyce TM circuit simulation package developed by Sandia National Laboratories.