
ARKODE: A Flexible IVP Solver Infrastructure for One-step Methods

Published: 15 June 2023


Abstract

We describe the ARKODE library of one-step time integration methods for ordinary differential equation (ODE) initial-value problems (IVPs). In addition to providing standard explicit and diagonally implicit Runge–Kutta methods, ARKODE supports one-step methods designed to treat additive splittings of the IVP, including implicit-explicit (ImEx) additive Runge–Kutta methods and multirate infinitesimal (MRI) methods. We present the role of ARKODE within the SUNDIALS suite of time integration and nonlinear solver libraries, the core ARKODE infrastructure for utilities common to large classes of one-step methods, as well as its use of “time stepper” modules enabling easy incorporation of novel algorithms into the library. Numerical results show example problems of increasing complexity, highlighting the algorithmic flexibility afforded through this infrastructure, and include a larger multiphysics application leveraging multiple algorithmic features from ARKODE and SUNDIALS.


1 INTRODUCTION

The SUite of Nonlinear and DIfferential/ALgebraic Solvers (SUNDIALS) [4, 18, 32] is a numerical software library providing time integrators and nonlinear solvers for use on a range of computing systems, from laptops to leadership-class supercomputers. The newest package in SUNDIALS, ARKODE, provides highly flexible time integration methods for additive systems with partitions of differing stiffness (implicit-explicit methods) and differing time scales (multirate methods). ARKODE focuses on one-step, multi-stage methods for first-order ordinary differential equation (ODE) initial-value problems (IVPs) that typically include robust temporal adaptivity, based on estimates of both solution error and solver efficiency. These methods leverage significant theoretical developments over recent decades in Runge–Kutta methods, and they naturally map to space-time systems of partial differential equations (PDEs) that employ spatially adaptive meshes. Furthermore, ARKODE can be considered as an infrastructure for one-step methods, providing a range of reusable components that may be replaced with problem-optimal choices, or even experimental methods for rapid development and testing. The ARKODE implementations for different classes of methods follow similar APIs and leverage many heuristics from other SUNDIALS packages, allowing for similar usage and efficiency.

Over the last two decades, there have been significant advances in the derivation of numerical methods that allow both high accuracy and increased flexibility with regard to how various components of a problem are treated. These methods range from those that apply a uniform time step size for an entire problem while varying the algorithms used on individual components, to multirate methods that evolve separate solution components using different step sizes.

Methods in the former category have been introduced primarily for problems that couple stiff and nonstiff processes. Instead of employing a fully implicit or fully explicit method that would be ideally suited to only the stiff or nonstiff components of the problem, respectively, these approaches apply robust implicit solvers to the stiff components while treating the remaining nonstiff (and frequently nonlinear) components explicitly. While simplistic first-order operator-split approaches have been utilized for decades, including higher-order variants that introduce complex arithmetic or backward integration [7], novel methods that provide increased accuracy and stability for such problems include implicit-explicit (ImEx) additive Runge–Kutta methods, first introduced in [10], with newer embedded versions supporting temporal adaptivity developed in [36, 37].

Multirate methods, on the other hand, evolve separate problem components using different time step sizes and are frequently applied to multiphysics problems that combine various physical processes that may separately evolve on disparate time scales. In such circumstances the “fast” processes are often evolved with small step sizes, while the “slow” processes are evolved using larger time steps. Here again, simple first-order “subcycling” approaches have been employed for many years; however, research into higher-order approaches has recently seen dramatic advances through the development of multirate methods [9, 20] including multirate infinitesimal methods [8, 38, 40, 50, 51, 52, 64] and multirate generalized additive Runge–Kutta methods [22, 49].

Recently, several software packages have been developed to meet some of the challenges presented by time-dependent multiphysics problems. Some of the most notable ones include the PETSc/TS [2] and Trilinos/Tempus [41] packages, which respectively provide C and C++ implementations of explicit, implicit, and additive Runge–Kutta methods for ODEs and differential-algebraic equations (DAEs). Additionally, the DifferentialEquations.jl package [43] provides a suite for numerically solving ordinary differential equations and stochastic differential equations (including back-end interfaces to SUNDIALS) written in Julia, with interfaces available from Python and R. All of these packages include temporal adaptivity and additive formulations that support splitting a problem into its stiff and nonstiff components. Additionally, these packages provide support for solving the nonlinear and linear systems of equations that arise from implicit time discretizations, as well as implementations that enable computations on GPU accelerators. Of these packages, multirate methods, which are critical for a wide range of multiphysics applications, are only supported by PETSc, which includes second-order Multirate Partitioned Runge–Kutta methods [34].

The goal of the ARKODE solver library is to provide a bridge between recent mathematical advances into flexible time integration methods and the large-scale simulations that need them. Our overall aims in designing ARKODE were threefold:

(1)

Provide high-order, stable, efficient, and scalable methods for multiphysics, multirate systems of IVPs, for which no single “textbook” solution technique optimally applies to all components.

(2)

Provide both an IVP solver and experimentation framework that allows users to rapidly explore algorithmic optimizations.

(3)

Provide a high-quality, scalable, open-source software library that can easily be incorporated into existing simulation codes without requiring application developers to rewrite or convert their native data structures, and that can be run on a broad range of computing systems.

While ARKODE may be used as a standalone library, it is released as part of, and leverages numerous components from, the larger SUNDIALS suite, which also consists of the solvers CVODE (for ODE IVP systems), IDA (for DAE IVP systems), CVODES and IDAS (forward and adjoint sensitivity analysis variants of CVODE and IDA, respectively), and KINSOL (for nonlinear algebraic systems). As part of SUNDIALS, ARKODE has access to a rich ecosystem of vector, linear solver, and nonlinear solver classes both to foster experimentation between SUNDIALS integrators (e.g., CVODE vs. ARKODE) and to leverage existing interfaces and infrastructure. SUNDIALS, including ARKODE, is written in C with modern Fortran interfaces. All operations on data within the SUNDIALS packages are done through a clear set of abstract vector, matrix, linear solver, and nonlinear solver interfaces. To wit, SUNDIALS integrators, including ARKODE, make no assumption about how a user’s data is laid out in memory; specifically, users can (if they wish) supply their own class implementations, as long as they provide the methods required by the integrator.

This flexibility allows for ARKODE, along with other SUNDIALS integrators, to be easily employed from within existing applications. For example, ARKODE has been used in several applications including the High-Order Methods Modeling Environment (HOMME) dynamical core in the Energy Exascale Earth System Model (E3SM) [63], the Tempest atmospheric dynamical core [17], the ParaDiS dislocation dynamics simulator [19], the Modular Finite Element Methods Library (MFEM) [3], the PeleC adaptive-mesh compressible hydrodynamics code for reacting flows [27], the MuSHrooM fusion code [15], and the MEUMAPPS phase-field modeling code [11, 44].

ARKODE also serves as an infrastructure for one-step methods, and, as such, it provides a variety of shared modules for (1) temporal adaptivity, to achieve a desired solution accuracy while efficiently utilizing any underlying iterative algebraic (nonlinear and linear) solvers, (2) interfaces that translate user-defined IVP right-hand side and Jacobian routines into the routines required by SUNDIALS’ general-purpose algebraic solver classes, (3) stiff and nonstiff temporal interpolation, for dense output and for efficient implicit predictors for iterative algebraic solvers, (4) root finding for event detection, and (5) enforcement of solution inequality constraints.

The rest of this article is organized as follows. In Section 2 we introduce the current time-stepping modules provided within ARKODE, as these user-facing components define both the types of problems that may be solved and the algorithms that can be applied to them. For readers interested in the underlying structure of ARKODE, in Section 3 we discuss the shared infrastructure on which ARKODE’s time-stepping modules are built. Then, in Section 4, we present the standard usage of ARKODE. In Section 5 we present a sequence of simple numerical examples highlighting the algorithmic flexibility of ARKODE and a demonstration example showing an advanced use of ARKODE for a large-scale multiphysics application. Finally, in Section 6 we point out open-source options for accessing ARKODE and provide concluding remarks. For the sake of readability, we omit some specific usage and implementation details; for this information, as well as additional details on ARKODE itself, we refer readers to the ARKODE documentation at sundials.readthedocs.io.


2 TIME-STEPPING MODULES

ARKODE considers time-dependent, first-order ODE IVPs in linearly implicit form: (1) \(\begin{align} M(t)\, y^{\prime }(t) = f(t,y),\qquad y(t_0) = y_0, \end{align}\) where the independent variable typically satisfies \(t\in [t_0,t_f]\) (ARKODE also allows \(y(t_f)=y_f\), in which case Equation (1) is a terminal-value problem), and the dependent variable is \(y\in \mathbb {R}^N\). \(M(t)\) is a user-specified nonsingular operator from \(\mathbb {R}^N\rightarrow \mathbb {R}^N\) that is assumed to be independent of y. For standard systems of ODEs and for problems arising from the spatial semi-discretization of PDEs using finite difference, finite volume, or spectral element methods, M is typically the identity matrix, I. For PDEs using finite-element spatial semi-discretizations, M is (or can be taken to be) a well-conditioned mass matrix.

While each of the time-stepping modules supplied with ARKODE considers different formulations of Equation (1), all share the same basic structure wherein y is provided at only the initial (or final) time, and ARKODE computes approximate values of \(y(t)\) at a discrete set of independent variable values, either \(t_0 < t_1 < \cdots < t_f\) for forward-in-time evolution or \(t_f > \cdots > t_1 > t_0\) for reverse evolution. We denote these approximations of the solution \(y(t_n)\) as \(y_n\). One-step methods for the solution of Equation (1) generate these approximate solutions using some formula: (2) \(\begin{align} y_n = \varphi (t_{n-1}, y_{n-1}, y_n, h_n), \end{align}\) where \(h_n = t_n-t_{n-1}\) is the step size, and \(\varphi\) denotes the approximation method that updates the solution \(y_{n-1} \rightarrow y_n\). ARKODE does not assume uniform step sizes, i.e., it allows \(h_n \ne h_{n-1}\), although fixed-step usage is also supported. Instead, ARKODE strives to take the largest possible step that simultaneously ensures that the approximations, \(y_n\), are sufficiently accurate and that the time-stepping algorithms used to evaluate \(\varphi\) are efficient and robust.

While the ARKODE infrastructure provides utilities for the temporal integration of one-step IVP methods, it specifies neither the type of IVP to be solved (i.e., the structure of the function \(f(t,y)\)) nor the numerical algorithm for taking the step, \(\varphi (t_{n-1}, y_{n-1}, y_n, h_n)\). These aspects of integration are left to the specific time-stepping modules that rest on the ARKODE infrastructure; a diagram of this relationship is shown in Figure 1. The time-stepping modules provide the user-facing software layer for problem definition, evolution, and reporting solver statistics. Upon creation of a time-stepping module, that module internally creates the main ARKODE memory structure and “attaches” itself to ARKODE by providing two functions used by ARKODE during the IVP evolution:

Fig. 1.

Fig. 1. The relationship between ARKODE and its time-stepping modules. ARKODE provides the time integration loop; each stepper performs individual time steps leveraging the shared infrastructure as needed.

(1)

an initialization function that is called by ARKODE once all setup has been completed and temporal integration of the problem is about to commence, and

(2)

a step function that computes a candidate update, \(\tilde{y} = \varphi (t_{n-1}, y_{n-1}, h_n)\), and the corresponding estimate of the local truncation error, \(\Vert T_n\Vert\).

Currently, ARKODE includes three time-stepping modules: ERKStep, ARKStep, and MRIStep, discussed in the following subsections.
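The division of labor described above can be sketched in a few lines of Python. This is purely illustrative (the actual interface is a pair of C function pointers registered with the ARKODE memory structure, and the real loop also performs the error test, adaptivity, root finding, and dense output); the names `Stepper` and `evolve` are invented for this sketch.

```python
# Illustrative sketch only: each ARKODE stepper registers an init function
# and a step function; the shared infrastructure owns the outer time loop.
class Stepper:
    def __init__(self, init_fn, step_fn):
        self.init = init_fn      # called once before integration commences
        self.step = step_fn      # returns (candidate update, error estimate)

def evolve(stepper, t0, y0, tf, h):
    """Minimal fixed-step outer loop for a scalar IVP; the real loop also
    handles the error test, step size adaptivity, and event detection."""
    t, y = t0, y0
    stepper.init(t, y)
    while t < tf:
        h_n = min(h, tf - t)              # do not step past tf
        y, err = stepper.step(t, y, h_n)  # stepper performs the single step
        t += h_n
    return t, y
```

For example, a forward Euler "stepper" with a vacuous error estimate evolves \(y' = -y\) via `evolve(Stepper(lambda t, y: None, lambda t, y, h: (y - h * y, 0.0)), 0.0, 1.0, 1.0, 0.01)`.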

2.1 ERKStep

The ERKStep time-stepping module is designed for IVPs of the form (3) \(\begin{equation} y^{\prime }(t) = f(t,y), \qquad y(t_0) = y_0, \end{equation}\) i.e., \(M(t)=I\) in (1). For such problems, ERKStep provides variable-step, embedded, explicit Runge–Kutta (ERK) methods that evolve a single time step \(t_{n-1}\rightarrow t_{n-1}+h_n = t_n\) using the algorithm (4a) \(\begin{align} z_i &= y_{n-1} + h_n \sum _{j=1}^{i-1} A_{i,j} f(t_{n,j}, z_j), \quad i=1,\ldots ,s, \end{align}\) (4b) \(\begin{align} y_n &= y_{n-1} + h_n \sum _{i=1}^{s} b_i f(t_{n,i}, z_i), \end{align}\) (4c) \(\begin{align} \tilde{y}_n &= y_{n-1} + h_n \sum _{i=1}^{s} \tilde{b}_i f(t_{n,i}, z_i), \end{align}\)

where the internal stage times are abbreviated using the notation \(t_{n,j}=t_{n-1}+c_jh_n\), and the coefficients \(A\in \mathbb {R}^{s\times s}\), \(b\in \mathbb {R}^s\), \(c\in \mathbb {R}^s\), and optional embedding coefficients \(\tilde{b}\in \mathbb {R}^s\) (a.k.a., the Butcher table) define the method. The embedding, \(\tilde{y}_n\), typically provides a slightly lower accuracy approximation than the computed solution, \(y_n\). ARKODE provides a number of ERK Butcher tables having orders of accuracy \(q=\lbrace 2,3,4,5,6,8\rbrace\); each is supplied with an embedding \(\tilde{b}\) that usually has order of accuracy \(p=q-1\). User-supplied Butcher tables are also supported and are not required to include embedding coefficients, although if these are not provided then automatic temporal adaptivity will be disabled. We have found this “fixed-step” mode to be particularly useful when existing simulation codes transition to ARKODE. By providing the ERK Butcher table that a code currently uses and ceding control over \(h_n\) to the application code, users may run verification tests to ensure reproducibility before switching to higher-order and/or adaptive ERK methods.
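As a concrete illustration, the update in Equations (4a)-(4c) can be written in a few lines for a scalar ODE; the table in this sketch is the two-stage Heun-Euler pair (\(q=2\) with a \(p=1\) embedding), chosen for brevity rather than as a representation of ARKODE's implementation.

```python
def erk_step(f, t, y, h, A, b, bt, c):
    """One embedded explicit Runge-Kutta step, Equations (4a)-(4c), for a
    scalar ODE y' = f(t, y).  Returns the solution y_n, the embedded
    solution, and their difference (the local error estimate)."""
    s = len(b)
    k = []                              # stage right-hand sides f(t_{n,i}, z_i)
    for i in range(s):
        z = y + h * sum(A[i][j] * k[j] for j in range(i))   # stage (4a)
        k.append(f(t + c[i] * h, z))
    y_n = y + h * sum(b[i] * k[i] for i in range(s))        # solution (4b)
    yt_n = y + h * sum(bt[i] * k[i] for i in range(s))      # embedding (4c)
    return y_n, yt_n, abs(y_n - yt_n)

# Heun-Euler pair: order q = 2 with an order p = 1 embedding.
A, b, bt, c = [[0.0, 0.0], [1.0, 0.0]], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0]
```

For \(y'=-y\) with \(y(0)=1\) and \(h=0.1\), this returns \(y_1 = 0.905\) with error estimate \(0.005\), i.e., the gap between the second- and first-order updates.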

In order to allow users to “tune” ERKStep to better exploit the problem structure, numerous optional features may be enabled or modified by the user. The full set of features is described in [48], a few of which are highlighted in the example problems in Section 5.1.1.

2.2 ARKStep

The ARKStep time-stepping module is designed for IVPs where the right-hand side function may be additively split into two components: (5) \(\begin{equation} M(t)\, y^{\prime }(t) = f^E(t,y) + f^I(t,y), \qquad y(t_0) = y_0, \end{equation}\) where the user-provided right-hand side function \(f^E(t,y)\) contains the “nonstiff” components of the system, and \(f^I(t,y)\) contains the “stiff” components of the system. In solving Equation (5), we first convert to standard additive form: (6) \(\begin{equation} y^{\prime }(t) = \hat{f}^E(t,y) + \hat{f}^I(t,y), \qquad y(t_0) = y_0, \end{equation}\) where \(\hat{f}^E(t,y) = M(t)^{-1}f^E(t,y)\) and \(\hat{f}^I(t,y) = M(t)^{-1}f^I(t,y)\). ARKStep utilizes variable-step, embedded, ImEx additive Runge–Kutta (ARK) methods. Given a pair of s-stage Butcher tables \((A^E, b^E, \tilde{b}^E, c^E)\) and \((A^I, b^I, \tilde{b}^I, c^I)\), a single time step \(t_{n-1}\rightarrow t_{n}\) is advanced via the algorithm (7a) \(\begin{align} z_i &= y_{n-1} + h_n \sum _{j=1}^{i-1} A^E_{i,j} \hat{f}^E\left(t^E_{n,j}, z_j\right) + h_n \sum _{j=1}^{i} A^I_{i,j} \hat{f}^I\left(t^I_{n,j}, z_j\right)\!, \quad i=1,\ldots ,s, \end{align}\) (7b) \(\begin{align} y_n &= y_{n-1} + h_n \sum _{i=1}^{s} \left(b^E_i \hat{f}^E\left(t^E_{n,i}, z_i\right) + b^I_i \hat{f}^I\left(t^I_{n,i}, z_i\right)\right)\!, \end{align}\) (7c) \(\begin{align} \tilde{y}_n &= y_{n-1} + h_n \sum _{i=1}^{s} \left(\tilde{b}^E_i \hat{f}^E\left(t^E_{n,i}, z_i\right) + \tilde{b}^I_i \hat{f}^I\left(t^I_{n,i}, z_i\right)\right)\!, \end{align}\)

where we denote the internal stage times as \(t^E_{n,j} = t_{n-1} + c^E_j h_n\) and \(t^I_{n,j} = t_{n-1} + c^I_j h_n\). We note that ARKStep enforces the constraint that the explicit and implicit Butcher tables share the same number of stages, s, but it allows for the possibility of different explicit and implicit abscissae, i.e., \(c^E \ne c^I\) (as required for various SSP ImEx-ARK methods [29, 30, 31, 42]).
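To make the stage structure of Equation (7a) concrete, the following sketch advances one ImEx-ARK step for the special case \(f^I(t,y)=\lambda_I y\), so that each diagonally implicit stage solve reduces to a scalar division (ARKStep uses a Newton iteration for general \(f^I\)). The first-order explicit/implicit Euler table pair at the end is purely illustrative and is not one of ARKStep's built-in tables.

```python
def imex_ark_step(fE, lamI, t, y, h, AE, AI, bE, bI, cE, cI):
    """One ImEx-ARK step, Equations (7a)-(7b), for a scalar problem whose
    implicit part is linear, f^I(t, y) = lamI*y, so each diagonally
    implicit stage has a closed-form solve."""
    s = len(bE)
    kE, kI = [], []
    for i in range(s):
        rhs = y + h * sum(AE[i][j] * kE[j] for j in range(i)) \
                + h * sum(AI[i][j] * kI[j] for j in range(i))
        z = rhs / (1.0 - h * AI[i][i] * lamI)  # stage solve of (7a)
        kE.append(fE(t + cE[i] * h, z))
        kI.append(lamI * z)
    return y + h * sum(bE[i] * kE[i] + bI[i] * kI[i] for i in range(s))

# First-order pair: explicit Euler on f^E, implicit Euler on f^I.
AE, AI = [[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]
bE, bI = [1.0, 0.0], [0.0, 1.0]
cE, cI = [0.0, 1.0], [0.0, 1.0]
```

With this pair, \(f^E(t,y)=y/2\), \(\lambda_I=-1\), and \(h=0.1\), a step from \(y=1\) reproduces the ImEx Euler update \(y_1 = (1 + h\,f^E)/(1 - h\lambda_I) = 1.05/1.1\).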

Through appropriate selection of the Butcher tables and right-hand side functions, ARKStep supports three classes of Runge–Kutta methods: ImEx, explicit, and diagonally implicit. Several Butcher tables are provided for each class, all of which include the embedding coefficients \(\tilde{b}^E\) and \(\tilde{b}^I\), although users may provide their own tables without embeddings (in which case adaptivity will be disabled). For mixed stiff/nonstiff problems, a user should provide both of the functions \(f^E\) and \(f^I\) that define the IVP system, in which case ARKStep provides the ImEx-ARK Butcher tables proposed in [35, 37] with orders of accuracy \(q = \lbrace 3,4,5\rbrace\) and embeddings of orders \(p = \lbrace 2,3,4\rbrace\), respectively.

Either of the function pointers for \(f^I\) or \(f^E\) may be NULL to disable that component. For nonstiff problems one can disable \(f^I\), in which case Equation (5) reduces to the non-split IVP: (8) \(\begin{equation} M(t)\, y^{\prime }(t) = f^E(t,y), \qquad y(t_0) = y_0. \end{equation}\) In this scenario, the implicit Butcher table (\(A^I\), \(b^I\), \(\tilde{b}^I\), \(c^I\)) in Equation (7) is ignored, and the ARK methods reduce to classical ERK methods. Here, all methods from ERKStep are again available in ARKStep. We note that in this mode, both the problem in Equation (8) and the method in Equation (7) fully encapsulate the ERKStep problem in Equation (3) and method in Equation (4). While it therefore follows that ARKStep can be used to solve every problem solvable by ERKStep (even with the same Butcher tables), we retain ERKStep as a distinct time-stepping module since its simplified form admits a more efficient and memory-friendly implementation than that provided by ARKStep.

Alternately, for stiff problems \(f^E\) may be disabled so that Equation (5) reduces to (9) \(\begin{equation} M(t)\, y^{\prime }(t) = f^I(t,y), \qquad y(t_0) = y_0. \end{equation}\) Here the explicit Butcher table (\(A^E\), \(b^E\), \(\tilde{b}^E\), \(c^E\)) in Equation (7) is ignored, and the ARK methods reduce to standard diagonally implicit Runge–Kutta (DIRK) methods, for which ARKODE provides tables with orders of accuracy \(q = \lbrace 2,3,4,5\rbrace\) and embeddings of orders \(p = \lbrace 1,2,3,4\rbrace\), respectively.

As with ERKStep, the ARKStep module includes a number of features that allow users to optimize its usage on their problems. The full list of these is included in [48]. One such feature is the routine ARKStepSetLinear, which allows the user to specify that \(f^I\) depends linearly on y, and thus each implicit stage equation for \(z_i\) from Equation (7) corresponds to a linear system of equations. In this case, only one iteration of the Newton nonlinear solver is performed, avoiding extraneous nonlinear iterations and residual norm computations to assess convergence.

A second notable feature in ARKStep is its native interface to the scalable parallel-in-time (PinT) library XBraid [12]. Instead of performing a standard time-marching approach from one step to the next \((t_{n-1}\rightarrow t_{n})\), XBraid solves for all time values simultaneously using multigrid reduction in time (MGRIT) [13], a highly parallel iterative method that exposes parallelism in the time domain in addition to spatial parallelization. We have designed the ARKStep + XBraid interface so that simulation codes using ARKStep should require minimal modifications to explore using PinT.

2.3 MRIStep

MRIStep is the newest time-stepping module in ARKODE and targets multirate IVPs of the form (10) \(\begin{equation} y^{\prime }(t) = f^{E}(t,y) + f^{I}(t,y) + f^F(t,y), \qquad y(t_0) = y_0. \end{equation}\) The right-hand side function is split into slow components, \(f^E(t,y) + f^I(t,y)\), that should be integrated with a large step size H, where \(f^E(t,y)\) contains nonstiff terms and \(f^I(t,y)\) contains stiff terms, and a fast component, \(f^F(t,y)\), that should be integrated with a small step size \(h \ll H\).

For approximating solutions to Equation (10), MRIStep utilizes multirate infinitesimal (MRI) methods, which are characterized by an alternation between an outer ARK-based method for the slow time scale and a sequence of fast time scale IVP systems that are evolved using an inner solver.

When both slow components \(f^E\) and \(f^I\) are present, MRIStep uses implicit-explicit multirate infinitesimal generalized-structure additive Runge–Kutta (ImEx-MRI-GARK) methods [8]. When either \(f^E\) or \(f^I\) is NULL-valued, ImEx-MRI-GARK methods simplify to implicit or explicit multirate infinitesimal GARK (MRI-GARK) methods [51], respectively. The following algorithm outlines a single step \(t_{n-1}\rightarrow t_{n-1}+H=t_n\) of an s-stage MRI method from either family:

(1)

Let \(z_1 = y_{n-1}\) and \(t_{n,1}=t_{n-1}\).

(2)

For \(i = 2,\ldots ,s\)

(a)

Let \(t_{n,i} = t_{n-1} + c_{i} H\), \(\Delta c_i = c_i-c_{i-1}\), \(v(t_{n,i-1}) = z_{i-1}\), and

\(r_i(t) = \frac{1}{\Delta c_i}\sum \nolimits _{j=1}^{i-1} \omega _{i,j}\left(\frac{t-t_{n,i-1}}{\Delta c_i H}\right) f^{E}(t_{n,j}, z_j) + \frac{1}{\Delta c_i}\sum \nolimits _{j=1}^i \gamma _{i,j}\left(\frac{t-t_{n,i-1}}{\Delta c_i H}\right) f^{I}(t_{n,j}, z_j)\).

(b)

Solve \(v^{\prime }(t) = f^F(t, v) + r_i(t)\) over \(t \in [t_{n,i-1}, t_{n,i}]\).

(c)

Set \(z_i = v(t_{n,i})\).

(3)

Set \(y_{n} = z_{s}\).

Here the coefficients, \(0 = c_1 \le \cdots \le c_s = 1\), define the abscissae for the method. The coefficient functions, \(\omega _{i,j}\) and \(\gamma _{i,j}\), are polynomials in time that dictate the couplings from the slow to the fast time scale and can be expressed, as in [8], as \(\begin{equation*} \omega _{i,j}(\theta) = \sum _{k=0}^K \omega _{i,j}^{\lbrace k\rbrace }\theta ^k, \quad \gamma _{i,j}(\theta) = \sum _{k=0}^K \gamma _{i,j}^{\lbrace k\rbrace } \theta ^k, \end{equation*}\) where typically \(0\le K \le 2\). The coupling tables, \(\Omega ^{\lbrace k\rbrace }\in \mathbb {R}^{s\times s}\) and \(\Gamma ^{\lbrace k\rbrace }\in \mathbb {R}^{s\times s}\), contain the explicit, \(\omega _{i,j}^{\lbrace k\rbrace }\), and implicit, \(\gamma _{i,j}^{\lbrace k\rbrace }\), coefficients, respectively.

As with Runge–Kutta methods, implicitness at the slow time scale is characterized by nonzero coefficients on or above the diagonal. MRIStep currently supports diagonally implicit, “solve-decoupled” methods, i.e., \(\omega _{i,j}^{\lbrace k\rbrace } = 0\) for \(j \ge i\), \(\gamma _{i,j}^{\lbrace k\rbrace }=0\) for \(j > i\), and \(\Delta c_i=0\) if \(\gamma _{i,i}^{\lbrace k\rbrace } \ne 0\). When \(\Delta c_i=0\), the “fast” IVP solve in step 2b reduces to a standard ImEx-ARK stage in Equation (7a), with step size H and coefficients \(\begin{equation*} A^E_{i,j} = \sum _{k=0}^K \frac{\omega _{i,j}^{\lbrace k\rbrace }}{k+1}, \quad A^I_{i,j} = \sum _{k=0}^K \frac{\gamma _{i,j}^{\lbrace k\rbrace }}{k+1}. \end{equation*}\) Thus, an MRI stage only requires a nonlinear implicit solve if \(A^I_{i,i} \ne 0\).
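The reduction just described amounts to integrating each degree-k coupling polynomial over \(\theta\in[0,1]\), which contributes a factor \(1/(k+1)\) per degree. A sketch (the helper name is invented for illustration):

```python
def coupling_to_ark(tables):
    """Collapse a list of MRI coupling tables [C^{0}, C^{1}, ...] (the
    Omega or Gamma tables) into the equivalent ARK coefficients
    A_{i,j} = sum_k C^{k}_{i,j}/(k+1), as used when Delta c_i = 0."""
    K, s = len(tables), len(tables[0])
    return [[sum(tables[k][i][j] / (k + 1) for k in range(K))
             for j in range(s)] for i in range(s)]
```

For example, degree-0 and degree-1 tables with entries 1 and 2 in position (2,1) combine to the ARK coefficient \(1/1 + 2/2 = 2\).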

MRIStep also supports multirate infinitesimal step (MIS) methods [53, 54, 55]. These methods are a subset of MRI-GARK methods where the coupling coefficients are uniquely defined based on a slow explicit Butcher table \((A^E,b^E,c^E)\) with \(s-1\) stages and sorted abscissae \(c^E\). The MIS method then has abscissae \(c = [c^E_1 \cdots c^E_{s-1} 1]^T\) and coupling coefficients \(\gamma _{i,j}^{\lbrace 0\rbrace } = 0\) and (11) \(\begin{equation} \omega _{i,j}^{\lbrace 0\rbrace } = {\left\lbrace \begin{array}{ll} 0, & \text{if } i=1,\\ A^E_{i,j} - A^E_{i-1,j}, & \text{if } 2\le i\le s-1,\\ b^E_{j} - A^E_{s-1,j}, & \text{if } i=s. \end{array}\right.} \end{equation}\)

At present, the MRIStep module implements all third- and fourth-order ImEx-MRI-GARK methods from [8], the third-order MIS method defined by the slow Butcher table from [38], and all of the third- and fourth-order MRI-GARK methods from [51]. Alternately, users may provide a custom set of coupling tables, \(\Gamma ^{\lbrace k\rbrace }\) and/or \(\Omega ^{\lbrace k\rbrace }\), to define their own MRI method. MRIStep also includes a utility routine that will convert a slow Butcher table \((A^E,b^E,c^E)\) with sorted abscissae \(c^E\) into a set of MRI-GARK coefficients using Equation (11). If the slow Butcher table has order of accuracy at least two, then the MIS method will also have order two. Third-order accuracy can be obtained if the slow table itself is at least third order and satisfies the condition [38] \(\begin{equation*} \sum _{i=2}^{\hat{s}} \left(c_i^E-c_{i-1}^E\right) \left(e_i+e_{i-1}\right)^T A^E c^E + \left(1-c_{\hat{s}}^E\right) \left(\frac{1}{2}+e_{\hat{s}}^T A^E c^E\right) = \frac{1}{3}, \end{equation*}\) where \(A^{E}\) has \(\hat{s}\) stages and \(e_k\) is the kth column of an \(\hat{s}\times \hat{s}\) identity matrix.
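The conversion in Equation (11) is mechanical enough to sketch directly (0-based indexing below; MRIStep's own utility routine performs this conversion internally, though its exact interface is not shown here):

```python
def mis_coupling(AE, bE):
    """Build the degree-zero explicit MRI coupling table omega^{0} from a
    slow explicit Butcher table (A^E, b^E) with s-1 stages and sorted
    abscissae, following Equation (11).  Rows/columns are 0-based here."""
    sm1 = len(bE)                       # the slow table has s-1 stages
    s = sm1 + 1
    W = [[0.0] * s for _ in range(s)]   # first row (i=1 in Eq. (11)) is zero
    for i in range(1, s):
        for j in range(sm1):
            if i < s - 1:
                W[i][j] = AE[i][j] - AE[i - 1][j]
            else:                        # final row uses b^E
                W[i][j] = bE[j] - AE[sm1 - 1][j]
    return W
```

For the explicit midpoint table \(A^E=[[0,0],[1/2,0]]\), \(b^E=[0,1]\), each nonzero row of \(\omega^{\lbrace 0\rbrace}\) sums to \(\Delta c_i = 1/2\), consistent with the MIS abscissae \(c = [0, 1/2, 1]^T\).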

The IVPs in step 2b may be solved using any sufficiently accurate method by supplying a “fast IVP” stepper to the MRIStep module. For convenience, ARKODE includes a constructor that will attach an ARKStep instance as the inner integrator, thereby enabling an adaptive explicit, implicit, or ImEx treatment of the fast time scale. Alternatively, applications can supply a user-defined integrator for the fast time scale as demonstrated in the simple example problem in Section 5.1.3 and in the large-scale demonstration problem in Section 5.2. Currently MRIStep supports temporal adaptivity only at the fast time scale. Efficient multirate temporal adaptivity is an open research question (see, e.g., [14]), but future support is anticipated to follow new developments in the field. Finally, like ERKStep and ARKStep, MRIStep provides numerous configuration options for users to control integrator behavior. The full list of options is included in [48].


3 SHARED INFRASTRUCTURE

As mentioned above, the ARKODE time-stepping modules rest on a shared infrastructure providing various utilities for one-step methods. This infrastructure manages the time loop, calling the time stepper modules both to advance the solution with a given step size and to compute a temporal error estimate for that step. This outer loop handles the temporal error control and time step adaptivity, the detection of events (root finding) and solution constraint violations, and, if necessary, interpolation of the solution to a requested output time. Moreover, ARKODE provides time steppers with interfaces to algebraic solvers and methods for predicting the new time solution. We note that ARKODE itself rests on the shared SUNDIALS infrastructure. Thus, throughout this section we focus on the ARKODE-specific utilities, deferring to more general SUNDIALS references [4, 18, 32] for descriptions of that shared infrastructure.

3.1 User-supplied Tolerances and Error Control

To control errors at various levels (time integration and algebraic solvers), ARKODE, like other SUNDIALS solvers [32, Equation (4)], uses a weighted root-mean-square norm for all error-like quantities: (12) \(\begin{equation} \Vert v\Vert _{\text{WRMS}} = \left(\frac{1}{N} \sum _{i=1}^N \left(v_i\,w_i\right)^2\right)^{1/2}. \end{equation}\) This norm allows problem-specific error measurement via the weighting vector, w, that combines the units of the problem with user-supplied values that specify “acceptable” error levels. Thus, at the start of each time step \(t_{n-1} \rightarrow t_n\), we construct two weight vectors. The first is an error weight vector that, as in CVODE [32, Equation (5)], uses the most recent step solution and user-supplied relative (rtol) and absolute (atol, scalar- or vector-valued) tolerances: (13) \(\begin{equation} w_i = \left(\text{rtol}\, |y_{n-1,i}| + \text{atol}_{i}\right)^{-1}. \end{equation}\) The second focuses on problems with a non-identity operator \(M(t)\), since the units of Equation (1) may differ from the units of the solution y. Here, we construct a residual weight vector: (14) \(\begin{equation} \sigma _i = \left(\text{rtol}\, |r_i| + \text{ratol}_i\right)^{-1}, \quad r = M(t_{n-1}) y_{n-1}, \end{equation}\) where the user may specify a separate residual absolute tolerance value or vector, \(\text{ratol}\).

ARKODE uses w or \(\sigma\) based on the quantity being measured: errors having “solution” units use w, whereas those having “equation” units use \(\sigma\). Obviously, when \(M=I\), the solution and equation units match, and ARKODE uses w for all error norms. In both cases, since \(w_i^{-1}\) (or \(\sigma _i^{-1}\)) represents a tolerance in the ith component of the solution (or equation) vector, an error whose WRMS norm is \(\le 1\) is regarded as being within an acceptable error tolerance.
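In code, the norm of Equation (12) and the error weights of Equation (13) amount to the following (a sketch of the formulas only; ARKODE builds these vectors internally at the start of each step):

```python
def wrms_norm(v, w):
    """Weighted root-mean-square norm of Equation (12)."""
    return (sum((vi * wi) ** 2 for vi, wi in zip(v, w)) / len(v)) ** 0.5

def error_weights(y, rtol, atol):
    """Error weight vector of Equation (13); atol may be a scalar or a
    vector with one entry per solution component."""
    if not isinstance(atol, (list, tuple)):
        atol = [atol] * len(y)
    return [1.0 / (rtol * abs(yi) + ai) for yi, ai in zip(y, atol)]
```

By construction, an error vector whose components each equal the corresponding per-component tolerance \(w_i^{-1}\) has WRMS norm exactly 1, matching the acceptance threshold described above.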

3.2 Stepsize Adaptivity

A critical utility provided by ARKODE is its adaptive control of local truncation error (LTE), \(T_n\), at each step, which corresponds to the error induced within the time step \(t_{n-1}\rightarrow t_n\) under the assumption that the initial condition for that step was exact. At every internal step, ARKODE requests both an approximate solution \(y_n\) and an estimate of \(T_n\) from the time-stepping module. These approximations have global orders of accuracy q and p, respectively, where generally \(p=q-1\). Since the global and local errors for one-step methods usually differ by one factor of \(h_n\) [25], we have (15) \(\begin{align} \Vert y_n - y(t_n) \Vert = C h_n^{q+1} + \mathop {}\mathopen {}\mathcal {O}\mathopen {}\left(h_n^{q+2}\right)\!, \end{align}\) (16) \(\begin{align} \Vert T_n \Vert = D h_n^{p+1} + \mathop {}\mathopen {}\mathcal {O}\mathopen {}\left(h_n^{p+2}\right)\!, \end{align}\) where C and D are constants independent of \(h_n\), and where we have assumed exact initial conditions for the step, i.e., \(y_{n-1} = y(t_{n-1})\). Given this time-stepper-estimated value of \(T_n\), ARKODE adopts the same local temporal error test as other SUNDIALS integrators, simply (17) \(\begin{equation} \Vert T_n\Vert _{\text{WRMS}} \le 1. \end{equation}\) If this test passes, the step is accepted and \(T_n\) is used to estimate a prospective next step size, \(h^{\prime }\), using an adaptive error controller. If Equation (17) fails, then the step is rejected, and \(T_n\) is used to estimate a reduced step size, \(h^{\prime }\). In either case, temporal adaptivity in ARKODE focuses on choosing the maximum \(h^{\prime }\) such that Equation (17) should pass on the next attempted step. We thus define the biased local error estimate: \(\begin{equation*} \varepsilon _k \ \equiv \ \beta \, \Vert T_k\Vert _{\text{WRMS}}, \end{equation*}\) where the error bias \(\beta > 1\) helps to account for the error constant D in Equation (16).
ARKODE stores the biased error estimates corresponding to the three most recent steps, \(\varepsilon _n\), \(\varepsilon _{n-1}\), and \(\varepsilon _{n-2}\). Of these, \(\varepsilon _{n-1}\) and \(\varepsilon _{n-2}\) correspond to the last two successful steps, whereas \(\varepsilon _n\) corresponds to the most recently attempted step. ARKODE provides a variety of error controllers that utilize these estimates, including PID (the ARKODE default) [35, 57, 58, 59], PI [35, 57, 58, 59], I (the “standard” method used in most publicly available IVP solvers) [33], the explicit Gustafsson controller [23], the implicit Gustafsson controller [24], and an “ImEx” Gustafsson controller that sets \(h^{\prime }\) as the minimum of the explicit and implicit Gustafsson controller values. Alternately, users may provide their own functions to produce \(h^{\prime }\) using the inputs \(y_n, t_n, h_n, h_{n-1}, h_{n-2}, \varepsilon _n, \varepsilon _{n-1}\), and \(\varepsilon _{n-2}\). Each controller bases its adaptivity on the order of accuracy of the error estimate, p, although the user may request to use the solution order q instead. All of the provided controllers include tuning parameters, with default values determined using an exhaustive genetic optimization process based on dozens of challenging IVP test problems, including nearly all from the “Test Set For IVP Solvers” [62].
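As a sketch of how such controllers map the biased error estimates to a new step size, the I and PI controllers can be written as below (the gains `k1` and `k2` are illustrative placeholders, not ARKODE's tuned defaults):

```python
def i_controller(h_n, eps_n, p):
    # "standard" integral controller: h' = h_n * eps_n^(-1/(p+1)),
    # using only the current step's biased error estimate
    return h_n * eps_n ** (-1.0 / (p + 1))

def pi_controller(h_n, eps_n, eps_nm1, p, k1=0.8, k2=0.31):
    # PI controller: also reacts to the previous step's error history,
    # damping oscillations in the step size sequence
    return h_n * eps_n ** (-k1 / (p + 1)) * eps_nm1 ** (k2 / (p + 1))
```

In both cases \(\varepsilon < 1\) grows the step and \(\varepsilon > 1\) shrinks it, with the controller exponents scaled by the estimate order p.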

In addition to error-based temporal adaptivity, ARKODE adopts many of the heuristics from CVODE [32] for bounding \(h^{\prime }\), including approaches for handling repeated temporal error test failures, overly aggressive growth or decay in prospective steps (\(h^{\prime } \gg h_n\) or \(h^{\prime } \ll h_n\)), step size reduction following a failed implicit solve, and user-supplied step size bounds \(h_\text{min} \le h^{\prime } \le h_\text{max}\). Complete details on these heuristics, as well as information on how each parameter may be adjusted by users, are provided in the ARKODE documentation [48].

3.3 Interpolation Modules

ARKODE supports interpolation of approximate solutions \(y(t_{out})\) and derivatives \(y^{(d)}(t_{out})\), where \(t_{out}\) occurs within the most recently completed step \(t_{n-1}\rightarrow t_n\). This utility also supports extrapolation of y and its derivatives outside this time step to construct predictors for iterative algebraic solvers. These approximations are based on polynomial interpolants \(\pi _{\xi }(t)\) of degree up to \(\xi =5\) that are constructed using one of two complementary approaches: “Hermite” and “Lagrange.”

3.3.1 Hermite Interpolation Module.

For non-stiff problems ARKODE provides Hermite polynomial interpolants, constructed using both solution and derivative data (\(y_n \approx y(t_n)\) and \(f_n \equiv f(t_n,y_n)\approx y^{\prime }(t_n)\)), following the approach outlined in [25, II.6]. The available Hermite interpolants are:

  • \(\pi _0(t)\): satisfies \(\pi _0(t) = \tfrac{1}{2}\left(y_{n-1} + y_{n}\right)\).

  • \(\pi _1(t)\): satisfies \(\pi _1(t_{n-1}) = y_{n-1}\) and \(\pi _1(t_n) = y_{n}\).

  • \(\pi _2(t)\): satisfies \(\pi _2(t_{n-1}) = y_{n-1}\), \(\pi _2(t_n) = y_n\), and \(\pi _2^{\prime }(t_n) = f_{n}\).

  • \(\pi _3(t)\): satisfies \(\pi _3(t_{n-1}) = y_{n-1}\), \(\pi _3(t_n) = y_n\), \(\pi _3^{\prime }(t_{n-1}) = f_{n-1}\), and \(\pi _3^{\prime }(t_n) = f_{n}\).

  • \(\pi _4(t)\): satisfies \(\pi _4(t_{n-1}) = y_{n-1}\), \(\pi _4(t_{n}) = y_{n}\), \(\pi _4^{\prime }(t_{n-1}) = f_{n-1}\), \(\pi _4^{\prime }(t_n) = f_{n}\), and

    \(\pi _4^{\prime }(t_{n} - \tfrac{h_n}{3}) = f(t_{n} - \tfrac{h_n}{3},\, \pi _3(t_{n} - \tfrac{h_n}{3}))\).

  • \(\pi _5(t)\): satisfies \(\pi _5(t_{n-1}) = y_{n-1}\), \(\pi _5(t_{n}) = y_{n}\), \(\pi _5^{\prime }(t_{n-1}) = f_{n-1}\), \(\pi _5^{\prime }(t_n) = f_{n}\),

    \(\pi _5^{\prime }(t_{n} - \tfrac{h_n}{3}) = f(t_{n} - \tfrac{h_n}{3},\, \pi _4(t_{n} - \tfrac{h_n}{3}))\), and \(\pi _5^{\prime }(t_{n} - \tfrac{2h_n}{3}) = f(t_{n} - \tfrac{2h_n}{3},\, \pi _4(t_{n} - \tfrac{2h_n}{3}))\).

The interpolants \(\pi _4(t)\) and \(\pi _5(t)\) require one and four additional evaluations of \(f(t,y)\), respectively, whereas all others are constructed using readily available data from the last successful time step. Due to these increasing costs, interpolants of degree \(\xi \gt 5\) are not currently provided.
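For illustration, the cubic interpolant \(\pi_3(t)\), which matches the solution and derivative at both endpoints of the step, can be evaluated with the standard Hermite basis; this scalar sketch (ARKODE applies the same formulas componentwise to vectors) uses hypothetical names:

```python
def hermite_cubic(t, tn1, yn1, fn1, tn, yn, fn):
    # pi_3(t): cubic Hermite interpolant satisfying
    #   pi_3(t_{n-1}) = y_{n-1}, pi_3(t_n) = y_n,
    #   pi_3'(t_{n-1}) = f_{n-1}, pi_3'(t_n) = f_n
    h = tn - tn1
    s = (t - tn1) / h  # normalized time in [0, 1]
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * yn1 + h10 * h * fn1 + h01 * yn + h11 * h * fn
```

By construction this reproduces any polynomial of degree up to three exactly on the step.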

3.3.2 Lagrange Interpolation Module.

For stiff problems, interpolants that use derivative values \(f=y^{\prime }\) can result in order reduction. We thus provide a second module based on polynomial interpolants of Lagrange type, constructed using an extended “history” of solution data, \(\lbrace y_{n}, y_{n-1}, \ldots , y_{n-\xi }\rbrace\), \(\begin{align*} \pi _{\xi }(t) = \sum _{j=0}^{\xi } y_{n-j}\, \ell _j(t),\quad \text{where}\quad \ell _j(t) = \prod _{\substack{i=0\\ i\ne j}}^{\xi } \left(\frac{t-t_{n-i}}{t_{n-j}-t_{n-i}}\right), \quad j=0,\ldots ,\xi . \end{align*}\) Since generally each solution vector \(y_{n-j}\) has many more than \(\xi +1\) entries, we evaluate \(\pi _{\xi }\) at any desired \(t\in \mathbb {R}\) by first evaluating the basis functions at t and then performing a linear combination of the stored solution vectors \(\lbrace y_{n-j}\rbrace _{j=0}^{\xi }\). Derivatives \(\pi _{\xi }^{(d)}(t)\) may be evaluated similarly since \(\begin{equation*} \pi _{\xi }^{(d)}(t) = \sum _{j=0}^{\xi } y_{n-j}\, \ell _j^{(d)}(t). \end{equation*}\) Due to the increasing algorithmic complexity involved in evaluating \(\ell _j^{(d)}\), ARKODE only supports derivatives up to \(d=3\). Also, since in the first \((\xi \,-\,1)\) internal time steps ARKODE has an insufficient solution history to construct the full \(\xi\)-degree interpolant, during these initial steps we construct the highest-degree interpolants that are currently available.
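The "basis values first, then one linear combination" evaluation strategy can be sketched as follows, with `ts` holding the history times and `ys` the stored solution vectors (names are illustrative):

```python
def lagrange_eval(t, ts, ys):
    # evaluate pi_xi(t) = sum_j y_{n-j} * l_j(t): first compute the scalar
    # Lagrange basis values l_j(t), then form a single linear combination of
    # the stored solution vectors (cheap when each vector is long)
    xi = len(ts) - 1
    basis = []
    for j in range(xi + 1):
        lj = 1.0
        for i in range(xi + 1):
            if i != j:
                lj *= (t - ts[i]) / (ts[j] - ts[i])
        basis.append(lj)
    n = len(ys[0])
    return [sum(basis[j] * ys[j][k] for j in range(xi + 1)) for k in range(n)]
```

This costs \(\mathop{}\mathopen{}\mathcal{O}\mathopen{}(\xi^2)\) scalar work for the basis plus one pass over the solution vectors, independent of their length beyond the final combination.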

3.4 Implicit Solvers

For both ARKStep and MRIStep, if \(f^{I}\) is nonzero, then each implicit stage, \(z_i\in \mathbb {R}^N\), may require the solution of an algebraic system of equations: (18) \(\begin{equation} M(t_{n,i})\left(z_i - a_i\right) - \gamma _i f^I(t_{n,i},z_i) = 0, \end{equation}\) where \(t_{n,i}\) is the corresponding implicit stage time, \(\gamma _i\in \mathbb {R}\) is proportional to \(h_n\), \(f^I(t,y)\) is the implicit portion of the IVP right-hand side, and \(a_i\) is “known data” that may include previous stages or IVP right-hand side evaluations. ARKODE provides utilities to map between \(f^I\) and the nonlinear or linear algebraic system for each stage (18), interfaces to a range of algebraic solvers from SUNDIALS, and optimizations to enhance the efficiency of these solvers for the IVP context.

3.4.1 Nonlinear Solver Methods.

SUNDIALS provides multiple nonlinear solver modules through its SUNNonlinearSolver interface [18], all of which utilize predictor-corrector form and are targeted to nonlinear systems of equations with either root-finding or fixed-point structure. Writing the stage solution as \(z_i=z_{i,p}+z_{i,c}\), where \(z_{i,p}\) is the prediction and \(z_{i,c}\) the desired correction, ARKODE rewrites the nonlinear system (Equation (18)) in root-finding or fixed-point form as (19a) \(\begin{align} 0 &= G_{rf}(z_{i,c}) := M(t_{n,i})z_{i,c} - \gamma _i f^I(t_{n,i},z_{i,p}+z_{i,c}) - M(t_{n,i})(a_i-z_{i,p}),\quad \text{or} \end{align}\) (19b) \(\begin{align} z_{i,c} &= G_{fp}(z_{i,c}) := M(t_{n,i})^{-1} \left(\gamma _i f^I(t_{n,i},z_{i,p}+z_{i,c}) \right) + (a_i - z_{i,p}), \end{align}\)

respectively. Note, these equations may be simplified in the cases where M is independent of t or when \(M=I\). When setting up an ARKODE simulation, users must identify the category of M at problem initialization, at which point ARKODE will provide a function pointer for the relevant nonlinear system to the generic SUNNonlinearSolver object. This ARKODE layer serves to translate from the user-supplied functions M and \(f^I\) and the user-selected IVP method to the generic nonlinear solver module. To simplify the discussion below, we focus on the general cases in Equation (19).

Nonlinear solvers for root-finding problems typically solve linear systems related to the Jacobian, (20) \(\begin{equation} \mathcal {A}(t,z,\gamma) := \frac{\partial G_{rf}}{\partial z_{i,c}} = M(t) - \gamma \frac{\partial f^I(t,z)}{\partial z}. \end{equation}\) When either the matrix \(\mathcal {A}\) is explicitly stored or a preconditioner is supplied by the user, ARKODE updates these infrequently to amortize the high costs of Jacobian construction and preconditioner formulation. Thus, ARKODE will use a Jacobian or preconditioner corresponding to Equation (20) at a previous time step, i.e., \(\tilde{\mathcal {A}}(\tilde{t},\tilde{z},\tilde{\gamma }),\) where \(\tilde{z}\), \(\tilde{t}\), and \(\tilde{\gamma }\) are previous time step values. The logic guiding this lagging process follows the heuristics described for CVODE in [32, Section 2.1] and each of the specific heuristic parameters may be modified by the user for their problem.

ARKODE utilizes an identical stopping test for the nonlinear iteration as is used in CVODE, which strives to solve the nonlinear system slightly more accurately than the temporal accuracy in an effort to minimize extraneous (and costly) nonlinear solver iterations. As with the other heuristics in ARKODE, all of the stopping test parameters are user modifiable.
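A scalar sketch of the modified Newton iteration on the root-finding form (19a) with \(M=I\) is shown below; the Jacobian-lagging idea (freezing \(\mathcal{A}\) rather than rebuilding it every iteration) is the point, while the function name, tolerances, and the simple lagging rule are illustrative assumptions:

```python
def solve_stage(a_i, gamma, f, dfdz, z_pred, rtol=1e-10, max_iters=10):
    # Newton iteration on the root-finding form with M = I:
    #   G(z) = z - a_i - gamma * f(z) = 0, started from the predictor z_pred.
    # The Jacobian A = 1 - gamma * f'(z) is formed once at the predicted value
    # and reused ("lagged") across iterations, mimicking a modified Newton method.
    z = z_pred
    A = 1.0 - gamma * dfdz(z_pred)   # frozen linearization
    for _ in range(max_iters):
        G = z - a_i - gamma * f(z)   # nonlinear residual
        delta = -G / A               # (lagged) Newton correction
        z += delta
        if abs(delta) <= rtol * abs(z):
            return z
    return z
```

For a linear \(f\) the frozen Jacobian is exact and the iteration converges in one or two corrections, as in the test below.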

3.4.2 Implicit Predictors.

For iterative nonlinear solvers, a good initial guess can dramatically affect both their speed and robustness, making the difference between rapid quadratic convergence versus divergence of the iteration. To this end, ARKODE provides a variety of algorithms to construct the predictor \(z_{i,p}\) from Equation (19), typically using an interpolating polynomial via ARKODE’s interpolation module (Section 3.3). Specifically, since each stage solution typically satisfies \(z_i \approx y(t_{n,i})\), where \(t_{n,i}\) is the stage time associated with \(z_i\), then we predict these stage solutions as (21) \(\begin{equation} z_{i,p} = \pi _\xi (t_{n,i}). \end{equation}\) Since \(t_{n,i}\) are usually outside of the previous successful step, \([t_{n-2},t_{n-1}]\) (containing the data used to construct \(\pi _\xi (t)\)), Equation (21) will correspond to an extrapolant instead of an interpolant. The dangers of using polynomial extrapolation are well known, with higher-order polynomials and evaluation points further outside the interpolation interval resulting in the greatest risk. To support “optimal” choices for different types of problems, ARKODE’s prediction algorithms construct a variety of interpolants with different degrees and using different interpolation data, as described below.

Trivial predictor. This predictor is given by \(\pi _0(t) = y_{n-1}\). While not highly accurate, this predictor is often the most robust choice for very stiff problems or for problems with implicit constraints whose violation may cause illegal solution values (e.g., a negative density or temperature).

Maximum order predictor. At the opposite end of the spectrum, we may construct the highest-degree interpolant available, \(\pi _{\xi _\text{max}}(t)\). We note that as \(\xi _\text{max}\) increases, this predictor provides a very accurate prediction for stage times that are “close” to \([t_{n-2},t_{n-1}]\) but could be dangerous otherwise. As discussed in Section 3.3, ARKODE caps the polynomial degree at 5, but the degree may be further limited by a user-specified maximum, \(\xi _\text{user}\), and the IVP solver method order, q, such that \(\xi _\text{max} \le \min \lbrace q-1,\xi _\text{user},5\rbrace\).

Variable order predictor. In between the two previous approaches, this predictor uses \(\pi _{\xi _\text{max}}(t)\) for predicting earlier stages and lower-degree polynomials for later stages. Thus, the polynomial degree is chosen adaptively based on the stage index i, \(\xi = \max \lbrace \xi _\text{max} - i, 1 \rbrace\), which may be reasonable under the assumption that the stage times are increasing, i.e., \(c_i \lt c_j\) for \(i\lt j\).

Cutoff order predictor. Following a similar idea as above, this predictor monitors the actual stage times to determine the polynomial degree for \(\pi _\xi (t)\): \(\begin{equation*} \xi = {\left\lbrace \begin{array}{ll} \xi _\text{max}, & \text{if}\quad \frac{t_{n,i}-t_{n-1}}{h_{n-1}} \lt \tfrac{1}{2},\\ 1, & \text{otherwise}. \end{array}\right.} \end{equation*}\)

User supplied. Finally, ARKODE supports a user-provided implicit predictor function that is called after the selected built-in routine. Thus, it could predict all components of \(z_{i,p}\) if desired, or it could merely “fix” specific components that result from a higher-order built-in predictor (e.g., to satisfy underlying physical constraints) before the prediction is passed to the nonlinear solver.
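The degree-selection logic behind these predictor choices can be summarized in a few lines; the strategy names and function signature here are illustrative stand-ins for ARKODE's internal options:

```python
def predictor_degree(strategy, i, t_stage, t_nm1, h_nm1, xi_max):
    # choose the interpolant degree xi used to predict stage i at time t_stage,
    # given the end t_{n-1} and size h_{n-1} of the last successful step
    if strategy == "trivial":
        return 0                                  # pi_0(t) = y_{n-1}
    if strategy == "max":
        return xi_max                             # highest degree available
    if strategy == "variable":
        return max(xi_max - i, 1)                 # lower degree for later stages
    if strategy == "cutoff":
        # full degree only for stage times in the first half of the new step
        return xi_max if (t_stage - t_nm1) / h_nm1 < 0.5 else 1
    raise ValueError("unknown predictor strategy: " + strategy)
```

Each returned degree then selects \(\pi_\xi\) for the extrapolated prediction \(z_{i,p} = \pi_\xi(t_{n,i})\).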

3.4.3 Linear Solver Methods.

For problems that require linear solves within a nonlinear solver or the action of \(M(t)^{-1}\), users may leverage any of the SUNLinearSolver and SUNMatrix modules provided by SUNDIALS [18]. These solvers generally fit within one of four categories: (1) direct solvers that store and operate on matrices, (2) iterative and matrix-based solvers, (3) iterative and matrix-free solvers, or (4) solvers that embed the linear system. For linear solvers of each type, ARKODE provides utilities to assist in translating from the user-supplied, problem-defining functions, \(f^I\), \(M(t)\) (or the product, \(M(t)\,v\)), and, optionally, the Jacobian, \(J(t,z) = \partial _z f^I(t,z)\), or product, \(J(t,z)\,v\), to the generic SUNDIALS modules. In the case of matrix-based linear solvers that utilize dense or banded matrices, ARKODE provides utilities to approximate the Jacobian matrix, J, using a minimum number of evaluations of \(f^I\) [32].
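A dense difference-quotient Jacobian approximation of the kind referenced above can be sketched as follows (one extra right-hand side evaluation per solution component; the increment choice and names are illustrative, and ARKODE's banded variant needs far fewer evaluations by grouping independent columns):

```python
def fd_jacobian(f, z, delta=1e-8):
    # approximate J[k][l] = d f_k / d z_l with one-sided difference quotients:
    # perturb one component of z at a time and difference against f(z)
    n = len(z)
    f0 = f(z)
    J = [[0.0] * n for _ in range(n)]
    for l in range(n):
        zp = list(z)
        zp[l] += delta
        fl = f(zp)
        for k in range(n):
            J[k][l] = (fl[k] - f0[k]) / delta
    return J
```

The resulting matrix can then be combined as \(M - \gamma J\) for the Newton systems discussed above.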

As with the nonlinear case, ARKODE attempts to solve each linear system to a tolerance slightly tighter than that required by the encompassing routine (e.g., an outer nonlinear solver in the case of \(\mathcal {A}\) from Equation (20), or the temporal error controller in the case of \(M^{-1}\)). Again, ARKODE follows the same techniques for this purpose as CVODE [32], with the only salient differences being ARKODE’s use of the residual weight vector, \(\sigma\), from Equation (14) for linear system residual norms. As before, all of the associated heuristic parameters are user modifiable. When solving problems with both a non-identity M and a matrix-based nonlinear solver, the “types” of the mass matrix and system SUNLinearSolver objects must match (matrix based, matrix-based iterative, matrix-free iterative, or embedded). If both are matrix based, these matrices must use the same SUNMatrix storage type for streamlined construction of the combination \(M - \gamma J\). Otherwise, ARKODE has no restriction that the solvers themselves match (e.g., a conjugate gradient method could be used for the mass matrix systems, along with a preconditioned generalized minimum residual method for the system matrix \(\mathcal {A}\)).

3.5 Temporal Root-finding

A particularly useful feature of SUNDIALS integrators that is also provided by the ARKODE infrastructure is event detection. While integrating an IVP, ARKODE can find roots of a set of user-defined functions, \(g_i(t,y(t)) = \tilde{g}_i(t)\). The number of these functions is arbitrary, and if more than one \(\tilde{g}_i\) is found to have a root in any given interval, the various root locations are found and reported in the order that they occur in the direction of integration. The basic scheme and heuristics match other SUNDIALS integrators [32]. During integration of Equation (1), following each successful step, ARKODE checks for sign changes of any \(\tilde{g}_i(t)\); if one is found, it will then home in on the root (or roots) with a modified secant method [28].
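The overall pattern, detecting a sign change over a step and then refining the root location, can be sketched as below; note this uses a plain secant update with a bisection fallback as an illustrative stand-in for the modified secant method of [28]:

```python
def find_root(g, t_lo, t_hi, tol=1e-10, max_iters=50):
    # refine a root of g in [t_lo, t_hi] once a sign change has been detected
    g_lo, g_hi = g(t_lo), g(t_hi)
    assert g_lo * g_hi <= 0.0, "no sign change in the interval"
    t_mid = 0.5 * (t_lo + t_hi)
    for _ in range(max_iters):
        t_mid = t_hi - g_hi * (t_hi - t_lo) / (g_hi - g_lo)  # secant estimate
        if not (t_lo < t_mid < t_hi):
            t_mid = 0.5 * (t_lo + t_hi)                      # bisection fallback
        g_mid = g(t_mid)
        if abs(g_mid) < tol or (t_hi - t_lo) < tol:
            return t_mid
        if g_lo * g_mid <= 0.0:            # keep the bracket containing the root
            t_hi, g_hi = t_mid, g_mid
        else:
            t_lo, g_lo = t_mid, g_mid
    return t_mid
```

Once located, the event time can be reported to the user and, if desired, the interpolation module evaluated there.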

3.6 Inequality Constraints

A final notable feature of the ARKODE infrastructure, also shared with other SUNDIALS integrators, is support for user-imposed inequality constraints on components of the solution y: \(y_i \gt 0\), \(y_i \lt 0\), \(y_i \ge 0\), or \(y_i \le 0\). Constraint satisfaction is tested after a successful step and before the temporal error test. If any constraint fails, ARKODE estimates a reduced step size based on a linear approximation in time for each violated component (including a safety factor to cover the strict inequality case). If applicable, a flag is set to update the Jacobian or preconditioner in the next step attempt.
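The step-size cut triggered by a constraint violation can be sketched as follows, assuming (as described above) a linear-in-time model of each violated component crossing zero; the 0.9 safety factor and names are illustrative:

```python
def constrained_step(y_prev, y_new, h, violated, safety=0.9):
    # if component k violated its inequality constraint during the step,
    # estimate where it crossed zero via a linear model in time and cut the
    # step accordingly; the safety factor covers the strict-inequality case
    h_new = h
    for k in violated:
        dy = y_new[k] - y_prev[k]
        if dy != 0.0:
            frac = abs(y_prev[k] / dy)     # fraction of the step before crossing
            h_new = min(h_new, safety * frac * h)
    return h_new
```

The step is then reattempted with the reduced size, optionally after refreshing the Jacobian or preconditioner.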


4 USAGE

Each time-stepping module in ARKODE admits a similar usage style, modeled after CVODE [32, Sec. 6]. Moreover, due to the recent addition of Fortran 2003 interfaces throughout SUNDIALS [18], users may call ARKODE from C, C++, or Fortran by following the same progression of steps. This general approach is as follows:

(1) The user provides one or more functions that define the IVP to be solved, e.g., \(f(t,y)\) or \(f^I(t,y)\), as well as any problem-specific preconditioners or utility routines.

(2) For problems with a non-identity mass matrix, the user provides a function to either compute the mass matrix, \(M(t)\), or to perform the mass matrix-vector product, \(M(t)\, v\).

(3) The user creates a vector of initial conditions, \(y_0\), for Equation (1).

(4) The user creates the ARKODE time-stepper memory structure that contains default values for solver options, such as solution error tolerances and time-stepping method order.

(5) The user creates nonlinear and/or linear solver objects for use by the stepper, e.g., a Newton solver, a sparse linear solver, and a sparse matrix. These objects also include default options.

(6) The user attaches the algebraic solver objects to the ARKODE time-stepper.

(7) The user adjusts time-stepper or algebraic solver options via “set” routines. The user may additionally specify temporal root-finding functions, inequality constraints, and so forth.

(8) The user advances the solution toward a desired output time, \(t_{out}\), using one of four “modes”:

  • NORMAL takes internal steps until the simulated time has just passed \(t_{out}\) and returns an approximation of \(y(t_{out})\) using interpolation (see Section 3.3).

  • ONE-STEP takes a single internal step and returns to the user. If this step overtakes \(t_{out}\), then the solver approximates \(y(t_{out})\); otherwise it returns \(y_n\).

  • NORMAL-TSTOP takes internal steps as in NORMAL mode but additionally checks if the next step will pass \(t_{stop}\) (specified by the user). If the subsequent step will overtake \(t_{stop}\), it is limited so that \(t_n = t_{stop}\), and the internal solution \(y_n\) is returned.

  • ONE-STEP-TSTOP takes a single internal step and returns to the user (like “one-step”). If the step will overtake \(t_{stop}\), then this mode is identical to NORMAL-TSTOP.

Most users call ARKODE in either NORMAL or NORMAL-TSTOP mode, where the latter is preferred when full integrator accuracy is required (to avoid interpolation error). The two ONE-STEP modes are frequently used when debugging or first beginning with ARKODE, in order to verify the results following each internal step.
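The control flow distinguishing these four modes can be sketched generically; here `step` and `interpolate` are hypothetical stand-ins for the stepper internals and dense-output module, and edge cases (e.g., \(t_{out}\) coinciding with \(t_{stop}\)) are ignored:

```python
def evolve(step, interpolate, t, y, t_out, mode, h, t_stop=None):
    # advance a one-step integrator toward t_out under ARKODE-style call modes
    while True:
        if t_stop is not None and t + h > t_stop:
            h = t_stop - t                     # limit the step to land on t_stop
        t, y = step(t, y, h)
        if t >= t_out:
            return t_out, interpolate(t_out)   # dense output inside the step
        if t_stop is not None and t >= t_stop:
            return t, y                        # internal solution at t_stop
        if mode.startswith("ONE-STEP"):
            return t, y                        # hand control back after each step
```

A NORMAL call loops until \(t_{out}\) is overtaken, while the TSTOP variants additionally truncate any step that would pass \(t_{stop}\).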

(9) The user retrieves optional outputs from the time-stepper via “get” routines, including solver statistics or interpolated solution values.

(10) The user may repeat the above process starting at step 7.

(11) When the solution process is complete, the user destroys any SUNDIALS objects they have created, e.g., solution vectors, algebraic solvers, time integrators, and so forth.

A variety of additional options are supported within this basic structure. Three particularly useful ARKODE features include the ability to re-initialize, reset, and resize the integrator. Re-initialization is useful when a user has already solved an IVP using ARKODE and wants to solve a new problem with a state vector of the same size. Here, no memory is (de)allocated, but all solver history (previous step sizes, temporal error estimates, stored solutions, solver statistics, etc.) is cleared. This is useful when performing parameter sweeps, where the user wishes to run repeated forward simulations using different problem parameters. This is also useful when treating jump discontinuities in the IVP right-hand side functions that cause solution characteristics to change dramatically. Here, one may run ARKODE up to the discontinuity in “tstop” or root-finding mode and then re-initialize the solver with the new right-hand side function to be used following the discontinuity. Alternately, ARKODE steppers may be “reset,” allowing them to solve an IVP with the same right-hand side function(s), but using an updated initial condition \((t_0,y_0)\). As with re-initialization, the memory footprint remains unchanged and the step size and solution history are cleared, but all solver statistics (total number of steps, etc.) are retained. This mode is used by MRIStep to ensure that each fast time scale IVP is solved using the prescribed initial condition, while still accumulating all fast time scale solver statistics. Finally, since ARKODE is built for one-step methods, it is natural to apply it to PDEs that adaptively refine the spatial grid. Here, the ARKODE “resize” functions support simulations in which the number of equations and unknowns in the IVP system change between time steps. This resizing obviously modifies ARKODE’s internal memory structures to use the new problem size, but does so without destroying the temporal adaptivity heuristics.


5 NUMERICAL RESULTS

To highlight the algorithmic flexibility provided by ARKODE, we first present a sequence of simple numerical examples that progressively increase in both problem and integrator complexity, emulating the progression of a typical new ARKODE user. We then present a demonstration problem showing an advanced use of ARKODE for a large-scale multiphysics application.

5.1 Simple Example Problem

For each stepper module example, we consider a one-dimensional advection-diffusion-reaction equation using a variation of the Brusselator chemical kinetics model [26]. The system is given by (22) \(\begin{equation} \begin{aligned}u_t &= -c\, u_x + d\, u_{xx} + a - (w + 1)\, u + v\, u^2, \\ v_t &= -c\, v_x + d\, v_{xx} + w\, u - v\, u^2, \\ w_t &= -c\, w_x + d\, w_{xx} + (b - w) / \epsilon - w\, u, \end{aligned} \end{equation}\) for \(t \in [0, 10]\) with \(x \in [0, 1]\), where u, v, and w are the chemical species concentrations; \(c = 0.001\) is the advection speed; d is the diffusion rate (typically 0.01); \(a = 0.6\) and \(b = 2\) are the concentrations of species that remain constant over time; and \(\epsilon = 0.01\) is a parameter that determines the stiffness of the system. The initial conditions are \(\begin{equation*} \begin{gathered}u(0,x) = a + 0.1 \sin (\pi x), \qquad v(0,x) = b/a + 0.1 \sin (\pi x), \qquad w(0,x) = b + 0.1 \sin (\pi x). \end{gathered} \end{equation*}\) Spatial derivatives are computed using second-order centered differences with the data distributed over \(N = 512\) points on a uniform spatial grid with stationary boundary conditions, i.e., \(\begin{equation*} \begin{gathered}u_t(t,0) = u_t(t,1) = 0, \qquad v_t(t,0) = v_t(t,1) = 0, \qquad w_t(t,0) = w_t(t,1) = 0. \end{gathered} \end{equation*}\) In the subsections that follow, we explore various time integration methods and ImEx splittings of Equation (22), so we first summarize the relevant time scales present in this problem for each of the advection, diffusion, and reaction terms. For this problem setup, if explicit time-stepping methods were used for all operators and linear stability were the only consideration, then diffusion would require the smallest time step, followed by reaction, followed by advection. 
However, if linear stability were not an issue (e.g., when using A-stable implicit methods) and temporal accuracy were the only consideration, then reaction would require the smallest time steps, followed by advection and diffusion, which would evolve on comparable time scales. Finally, we note that the \((b-w)/\epsilon\) term in Equation (22) induces a mild amount of additional stiffness into the reaction processes, thus requiring slightly smaller “stable” than “accurate” steps.
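A compact semi-discrete right-hand side for this system, using the second-order centered differences and stationary boundaries described above, can be sketched as follows (the function name and flat `[u; v; w]` data layout are illustrative assumptions, and parameter defaults come from the text):

```python
def adr_rhs(t, y, N, c=0.001, d=0.01, a=0.6, b=2.0, eps=0.01):
    # right-hand side of the 1D advection-diffusion-reaction system (22):
    # centered differences on a uniform grid over [0, 1]; boundary rates are
    # left at zero so that u_t = v_t = w_t = 0 at x = 0 and x = 1
    dx = 1.0 / (N - 1)
    u, v, w = y[0:N], y[N:2 * N], y[2 * N:3 * N]
    du, dv, dw = [0.0] * N, [0.0] * N, [0.0] * N
    for i in range(1, N - 1):
        ux = (u[i + 1] - u[i - 1]) / (2 * dx)
        vx = (v[i + 1] - v[i - 1]) / (2 * dx)
        wx = (w[i + 1] - w[i - 1]) / (2 * dx)
        uxx = (u[i + 1] - 2 * u[i] + u[i - 1]) / dx**2
        vxx = (v[i + 1] - 2 * v[i] + v[i - 1]) / dx**2
        wxx = (w[i + 1] - 2 * w[i] + w[i - 1]) / dx**2
        du[i] = -c * ux + d * uxx + a - (w[i] + 1.0) * u[i] + v[i] * u[i]**2
        dv[i] = -c * vx + d * vxx + w[i] * u[i] - v[i] * u[i]**2
        dw[i] = -c * wx + d * wxx + (b - w[i]) / eps - w[i] * u[i]
    return du + dv + dw
```

The ImEx splittings explored below amount to assigning subsets of these terms to \(f^E\) and \(f^I\).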

All of the examples use the serial N_Vector implementation and results were obtained on a single node of the Quartz cluster at Lawrence Livermore National Laboratory. Each Quartz node contains two Intel Xeon E5-2695 v4 CPUs. The source files for this example problem are ark_advection_diffusion_reaction.hpp and ark_advection_diffusion_reaction.cpp. In each test case the maximum relative error at the final time is computed using a reference solution generated with the default fifth-order DIRK method (ARK5(4)8L[2]SA-ESDIRK in [35]) with relative and absolute tolerances of \(10^{-8}\) and \(10^{-14}\), respectively.

5.1.1 ERKStep.

We first consider an advection-reaction setup (i.e., \(d = 0\) in Equation (22)) and evolve the system in time using the default second- (Heun-Euler), third- [5], fourth- [65], and fifth-order [6] ERK methods in ARKODE. The optimal method choice will depend on the user’s desired accuracy, the cost per step of the method, and the step sizes selected by the adaptivity controller. As such, we show how the options in ARKODE can help users determine more effective methods in their accuracy regimes of interest. We compare the performance of each method using “loose,” “medium,” and “tight” pairs of relative and absolute tolerances (\(10^{-4}\)/\(10^{-9}\), \(10^{-5}\)/\(10^{-10}\), and \(10^{-6}\)/\(10^{-11}\), respectively) and the PID, PI, I, and explicit Gustafsson controllers.

Figure 2 shows a work-precision plot for the various configurations where the total number of right-hand side function evaluations for successful and failed steps is used as the work metric. Given its lower cost per step, the second-order method gives the best performance at coarser levels of accuracy (three left-most solid red markers) despite requiring more than double the number of step attempts as the third-order method. For errors between \(10^{-5}\) and \(10^{-7}\) the third-order method is the most efficient (lower left orange markers) since the second-order method requires at least three times as many step attempts, while the fourth- and fifth-order methods make fewer attempts but the reduced number of attempts is insufficient to offset their higher cost per step. The fifth-order method becomes most efficient at accuracies below \(10^{-7}\) (blue markers near the bottom left).

Fig. 2.

Fig. 2. Work-precision plot for the default second- to fifth-order ERK methods with various error controllers and tolerances. Configurations further to the left and lower in the plot indicate methods with less work and greater accuracy, respectively.

Across the methods and tolerances, the I controller is frequently the least efficient for this example, as it only considers the error in the current step, leading to a high step rejection rate (between 13% and 50%). The additional error history retained by the PI and Gustafsson controllers leads to better step size predictions and often the greatest efficiency (rejection rates <7%). The longer history of the PID controller does not lead to further improvement, and it generally produces results that fall between the PI and I controllers. As explored in [45], the choice of controller parameters (tunable in ARKODE but not explored here) also plays an important role in performance.

5.1.2 ARKStep.

We extend the previous model to now include diffusion (i.e., \(d=0.01\) in Equation (22)). To avoid the stability limitations of a fully explicit approach, users could either treat the full system implicitly with a DIRK method or partition the right-hand side and apply an ImEx-ARK method with implicit diffusion. In the latter case, different splitting approaches could be considered, and in both cases the choice of predictor may not be clear. As such, we show how these methods could be compared. In the DIRK case, we advance the system in time using the implicit part of the ImEx-ARK method, ARK4(3)6L[2]SA, from [35], and, in the ImEx case, we use the ARK4(3)6L[2]SA ImEx-ARK method. We consider two different ImEx splittings that both treat advection explicitly, as is typical when evolving hyperbolic terms (particularly for flows with discontinuities), but the reaction terms are treated either implicitly (ImEx-1) to step over the fastest reaction dynamics or explicitly (ImEx-2) to track the reaction time scale. In each setup, we solve the nonlinear systems at each implicit stage using a modified Newton method paired with a banded direct linear solver. Additionally, in the ImEx-2 case we enable the “linearly implicit” ARKStep option to allow only one Newton iteration per nonlinear solve. The integration relative and absolute tolerances are set to \(10^{-4}\) and \(10^{-9}\), respectively. For each method we compare the trivial, max order, variable order, and cutoff implicit predictor algorithms.

Table 1 shows various integrator statistics for each configuration. In this test, where there are long periods of slow change followed by shorter, faster transition intervals, predictors using higher-order extrapolation can increase efficiency compared to the trivial predictor. In the DIRK and ImEx-1 cases, the max order predictor provides significant reductions (32% or more) in all integrator statistics while producing results close to the requested tolerance. The variable order and cutoff predictors also improve performance in the DIRK and ImEx-1 cases but to a lesser degree (reductions of 19% or more) and are better suited to cases where the max order predictor becomes unstable. For the ImEx-2 setup, where only a single Newton iteration is applied in the implicit solve, results are more sensitive to predictor error and the more conservative trivial and cutoff predictors have the best performance. Here, the cutoff predictor gives the best results, providing a small improvement (up to 20%) over the trivial predictor in all integrator statistics.

Table 1.

Stats                      DIRK Predictor             ImEx-1 Predictor           ImEx-2 Predictor
                           T      M      V      C     T      M      V      C     T      M      V      C
Steps                      34     21     24     24    31     21     25     25    255    258    256    250
Err fails                  0      0      0      0     0      0      0      0     56     67     62     44
Solve fails                6      0      1      1     4      0      1      1     0      0      0      0
\(f^E\) evals              --     --     --     --    204    129    158    158   1,869  1,953  1,911  1,767
\(f^I\) evals              758    385    460    487   672    385    475    495   3,424  3,578  3,501  3,237
NLS iters                  528    256    308    335   468    256    317    337   1,555  1,625  1,590  1,470
NLS fails                  24     6      10     14    18     6      10     13    0      0      0      0
LS setups                  46     18     24     28    35     18     25     28    156    171    167    134
\(J\) evals                25     7      11     15    19     7      11     14    57     68     63     45
Error \(\times 10^{-4}\)   0.34   1.7    2.0    1.8   0.18   2.5    2.1    2.2   0.05   7.2    3.2    0.1
  • The ImEx results use two splittings of Equation (22); in both cases diffusion is treated implicitly and advection explicitly, while the reactions are treated implicitly (ImEx-1) or explicitly (ImEx-2). Each method is paired with the trivial (T), max order (M), variable order (V), and cutoff (C) predictors. The statistics given are the number of successful steps (Steps), number of failed steps due to an error test or nonlinear solver failure (Err and Solve fails), number of explicit and implicit function evaluations (\(f^E\) and \(f^I\) evals), number of nonlinear solver iterations and failures (NLS iters and fails), number of linear solver setups (LS setups), and number of Jacobian evaluations (\(J\) evals). Additionally, the maximum relative error at the final time compared to a reference solution is also given.

Table 1. Comparison of Integrator Statistics Using the Default Fourth-order DIRK and ImEx Methods


Comparing the integration methods more broadly, the performance of the DIRK and ImEx-1 methods is nearly identical, as the two use the same implicit method. We note that with ImEx-ARK methods, the terms included in \(f^E\) are evaluated far less often than those in \(f^I\), since \(f^E\) is only evaluated once per stage rather than once per nonlinear iteration. As such, when \(f^E\) is expensive but non-stiff, an ImEx approach can offer greater efficiency. Alternatively, when an efficient nonlinear solver is unavailable for the unsplit right-hand side, an ImEx approach may offer similar performance without requiring an algebraic solver for the full system. Considering ImEx-2, we see that it is far more expensive than the DIRK or ImEx-1 approaches since its time steps are determined by the reactions. However, in terms of \(f^E\) evaluations, ImEx-2 is on par with the advection-reaction results using ERKStep, as its implicit treatment of diffusion successfully alleviates the step size restrictions that would be present if this problem were treated fully explicitly.

5.1.3 MRIStep.

With ARKStep, Equation (22) was split into implicit and explicit partitions using the same step size for all components, but, as was clear from Table 1, resolving the reactions required smaller step sizes than advection and diffusion. Therefore, we apply the default third-order ImEx-MRI-GARK method (IMEX-MRI-GARK3a from [8]) in MRIStep with advection and diffusion at the slow time scale using a fixed step size of \(H=0.1\), and the reactions at the fast time scale. For this fast scale we consider three integration methods: the default third-order adaptive step size ERK method [5] in ARKStep that will resolve the stiffest reaction time scale, the default third-order adaptive step size DIRK method (ARK3(2)4L[2]SA–ESDIRK from [35]) in ARKStep that will step over the stiffest reaction time scale but at an increased cost per step, or an adaptive order and step size BDF method from CVODE that will also step over the stiffest reaction time scale but with a lower cost per step. At the slow time scale, the linearly implicit systems are solved with a single Newton iteration and a banded direct linear solver. When using the DIRK and BDF methods at the fast time scale, the nonlinear implicit systems are solved with a modified Newton iteration paired with a banded direct linear solver. The integrators all use the same relative and absolute tolerances of \(10^{-4}\) and \(10^{-9}\), respectively.

To use BDF methods for the fast integration, the example code wraps CVODE as an MRIStepInnerStepper object. The derived class implemented in the example code contains the data needed by the fast integrator (e.g., the CVODE memory structure) and defines three methods utilized by MRIStep: Evolve, FullRHS, and Reset. The first evolves the fast IVP system from step 2b of the MRI method over a given time interval, the second evaluates \(f^f(t,y)\) for slow time scale dense output, and the third resets the integrator to a given time and state value. We note that the linear multistep methods in CVODE bootstrap up from first order following each Reset and, unlike the Runge–Kutta methods, do not retain the error history across fast integrations. To prevent overly aggressive step size changes in these bootstrap phases, we reduce the maximum step size change after the first internal step from \(10^4\) to 10, the default value for subsequent steps.
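
The three-method interface just described can be mimicked with plain function pointers. The sketch below is a simplified stand-in for the MRIStepInnerStepper pattern, with a toy explicit Euler integrator playing the role of CVODE; all type and function names here are hypothetical illustrations, not the actual SUNDIALS API.

```c
#include <assert.h>
#include <math.h>

/* The outer integrator interacts with the wrapped fast solver only
   through three operations, while `content` hides the wrapped solver's
   state (for the real wrapper, the CVODE memory structure). */
typedef struct InnerStepper {
  void *content;
  int (*evolve)(struct InnerStepper *s, double t0, double tf, double *y);
  int (*fullrhs)(struct InnerStepper *s, double t, double y, double *f);
  int (*reset)(struct InnerStepper *s, double tR, double yR);
} InnerStepper;

typedef struct {
  double h;       /* fast step size */
  double forcing; /* slow forcing term r_i, set by the outer method */
} ToyState;

/* fast right-hand side plus frozen slow forcing: v' = -v + r */
static double f_fast(double t, double v, double r) { (void)t; return -v + r; }

static int toy_evolve(struct InnerStepper *s, double t0, double tf, double *y)
{
  ToyState *st = (ToyState *)s->content;
  for (double t = t0; t < tf - 1e-14; t += st->h)
    *y += st->h * f_fast(t, *y, st->forcing); /* explicit Euler subcycle */
  return 0;
}

static int toy_fullrhs(struct InnerStepper *s, double t, double y, double *f)
{
  ToyState *st = (ToyState *)s->content;
  *f = f_fast(t, y, st->forcing); /* used for slow dense output */
  return 0;
}

static int toy_reset(struct InnerStepper *s, double tR, double yR)
{
  (void)s; (void)tR; (void)yR; /* a real wrapper reinitializes the solver here */
  return 0;
}
```

The outer method only ever calls `evolve`, `fullrhs`, and `reset`, so swapping the inner Euler loop for a multistep or Runge–Kutta solver requires no change on the outer side; this is the decoupling that lets the example code substitute CVODE's BDF methods for the fast time scale.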

Table 2 shows various integrator statistics for each configuration. Since all three setups apply the same ImEx method at the slow time scale, the “slow” statistics are identical. Comparing the performance statistics from the fast integration, we see various tradeoffs between the methods. The DIRK method has the lowest error, significantly over-solving the problem, and takes the fewest steps (approximately one per fast solve), but it requires the most fast function evaluations due to the multiple nonlinear solves in each step. The ERK method takes more steps and has a larger error but requires fewer function evaluations, making it the most efficient for this reaction model. Finally, the BDF method gives a result close to the requested tolerance and takes the most steps (approximately six per fast solve) as it must bootstrap up from first order. However, the method has 24% fewer fast function evaluations than the DIRK method since only one nonlinear solve is required per step. Thus, for stiffer reaction mechanisms with a greater separation between the fast and slow time scales, the BDF method may become the best approach.

Table 2.
Stats                     MRI-ERK   MRI-DIRK   MRI-BDF
Steps slow                100       100        100
Steps fast                426       307        1,709
Failed steps fast         1         0          0
\(f^E\) evals             401       401        401
\(f^I\) evals             701       701        701
\(f^f\) evals             2,111     3,470      2,620
Slow NLS iters            300       300        300
Slow NLS fails            0         0          0
Slow LS setups            1         1          1
Slow \(J\) evals          1         1          1
Fast NLS iters            –         1,839      1,720
Fast NLS fails            –         0          0
Fast LS setups            –         304        1,636
Fast \(J\) evals          –         300        401
Error \(\times 10^{-4}\)  0.44      0.01       1.3
  • The statistics given are the number of slow time steps (Steps slow); number of successful fast time steps (Steps fast); number of failed fast time steps (Failed steps fast); number of slow explicit, slow implicit, and fast function evaluations (\(f^E\), \(f^I\), and \(f^f\) evals); number of nonlinear iterations and failures at the slow and fast time scales (NLS iters and fails); number of slow and fast linear solver setups (LS setups); and number of slow and fast Jacobian evaluations (\(J\) evals). Additionally, the maximum relative error at the final time compared to a reference solution is also given.

Table 2. Integrator Statistics Using the Default Third-order ImEx MRI Method Paired with Either an ERK Method, DIRK Method, or BDF Method for the Fast Time Scale Integrator


For any given application, the optimal choice of integrator family (explicit, ImEx, multirate), problem splittings (implicit-explicit and/or fast-slow), Runge–Kutta or MRI method coefficients, algebraic solvers, adaptivity methods, and solver parameters will obviously be problem specific. However, this sequence of examples illustrates that through ARKODE’s broad range of options and simple API, users may easily explore these choices to discover what works best for their application.

5.2 Large-scale 3D Demonstration Problem

As a final demonstration of ARKODE’s flexibility and performance, we consider the three-dimensional nonlinear inviscid compressible Euler equations, with advection and reaction of chemical species: (23) \(\begin{equation} \mathbf {w}_t = -\nabla \cdot F(\mathbf {w}) + R(\mathbf {w}), \end{equation}\) with the independent variables \((X,t) = (x,y,z,t) \in \Omega \times [0, t_f]\), and where the spatial domain is a three-dimensional cube, \(\Omega = [0, 1] \times [0, 1] \times [0, 1]\). The solution is given by \(\mathbf {w} = [ \rho \;\; \rho v_x \;\; \rho v_y \;\; \rho v_z \;\; e_t \;\; \mathbf {c} ]^T\), corresponding to the density; momentum in the x, y, and z directions; total energy per unit volume; and some number of chemical densities \(\mathbf {c}\in \mathbb {R}^{nchem}\) that are advected along with the fluid. The fluxes are given by (24) \(\begin{align} F_x(\mathbf {w}) &= \begin{bmatrix} \rho v_x & \rho v_x^2 + p & \rho v_x v_y & \rho v_x v_z & v_x (e_t+p) & \mathbf {c} v_x \end{bmatrix}^T, \end{align}\) (25) \(\begin{align} F_y(\mathbf {w}) &= \begin{bmatrix} \rho v_y & \rho v_x v_y & \rho v_y^2 + p & \rho v_y v_z & v_y (e_t+p) & \mathbf {c} v_y \end{bmatrix}^T, \end{align}\) (26) \(\begin{align} F_z(\mathbf {w}) &= \begin{bmatrix} \rho v_z & \rho v_x v_z & \rho v_y v_z & \rho v_z^2 + p & v_z (e_t+p) & \mathbf {c} v_z \end{bmatrix}^T. \end{align}\) The reaction mechanism is encoded in \(R(\mathbf {w})\), and the ideal gas equation of state gives \(p = (\gamma -1) (e_t - \frac{\rho }{2} (v_x^2 + v_y^2 + v_z^2))\), where \(\gamma = c_p/c_v\) is the constant ratio of specific heats for the gas.
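
As a concrete check of this closure, the pressure and the first five components of \(F_x\) from Equations (23)–(24) translate directly into code. The following is a self-contained sketch, not the demonstration code itself; the advected chemical components \(\mathbf{c} v_x\) are omitted for brevity.

```c
#include <assert.h>
#include <math.h>

/* Ideal gas equation of state:
   p = (gamma - 1) * (e_t - (rho/2)(vx^2 + vy^2 + vz^2)). */
static double pressure(double gamma, double rho, double vx, double vy,
                       double vz, double et)
{
  return (gamma - 1.0) * (et - 0.5 * rho * (vx * vx + vy * vy + vz * vz));
}

/* First five components of the x-directional flux F_x(w) in Equation (24);
   the chemical-density components c*vx are omitted here. */
static void flux_x(double gamma, double rho, double vx, double vy, double vz,
                   double et, double F[5])
{
  double p = pressure(gamma, rho, vx, vy, vz, et);
  F[0] = rho * vx;            /* mass flux            */
  F[1] = rho * vx * vx + p;   /* x-momentum flux      */
  F[2] = rho * vx * vy;       /* y-momentum flux      */
  F[3] = rho * vx * vz;       /* z-momentum flux      */
  F[4] = vx * (et + p);       /* total energy flux    */
}
```

\(F_y\) and \(F_z\) follow by the obvious permutation of the velocity components, as in Equations (25)–(26).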

We discretize this problem using the method of lines, where we first semi-discretize in space using a regular finite volume grid with dimensions \(n_x \times n_y \times n_z\), with fluxes at cell faces calculated using a fifth-order FD-WENO reconstruction [56]. MPI parallelization is achieved using a standard 3D domain decomposition of \(\Omega\). We organize the corresponding spatially discretized solution vector using the SUNDIALS MPIManyVector [18] that wraps node-local vectors for each of the fields, \(\rho , \ldots , \mathbf {c}\), to create the overall state vector, \(\mathbf {w}\), and provides the requisite MPI functionality for coordinating vector operations among the subvectors. The hydrodynamic fields (\(\rho\), \(\rho \mathbf {v}\), and \(e_t\)) are stored in CPU memory, and the chemical densities, \(\mathbf {c}\), are stored in GPU memory. After spatial semi-discretization, we are faced with a large IVP system: (27) \(\begin{equation} \mathbf {w}^{\prime }(t) = f_1(\mathbf {w}) + f_2(\mathbf {w}), \quad \mathbf {w}(t_0)=\mathbf {w}_0, \end{equation}\) where \(f_1(\mathbf {w})\) and \(f_2(\mathbf {w})\) contain the spatially discretized forms of \(-\nabla \cdot F(\mathbf {w})\) and \(R(\mathbf {w})\), respectively. For additional information on this test problem or our computational implementation, please see the technical report [47] and the GitHub repository [46, v4.0].

In the results that follow, we simulate the advection and reaction of a low-density primordial gas, present in models of the early universe [1, 21, 39, 61]. This gas consists of 10 advected species (i.e., \(nchem=10\)), corresponding to various ionization states of atomic and molecular Hydrogen and Helium, free electrons, and the internal gas energy. For this setup the reaction mechanism is considerably stiffer than the advection, so we consider two forms for temporal evolution:

(a)

\(\mathop {}\mathopen {}\mathcal {O}\mathopen {}(H^4)\), temporally adaptive, single-step, ImEx method from ARKStep (ARK4(3)7L[2]SA\(_1\) from [37]), where \(f_1\) is treated explicitly and \(f_2\) is treated implicitly, and

(b)

\(\mathop {}\mathopen {}\mathcal {O}\mathopen {}(H^3)\) explicit MIS method from MRIStep (using the “RK3” coefficients from [38]), where \(f_1\) is treated at the slow time scale using fixed step sizes, and \(f_2\) is treated at the fast time scale using the DIRK portion of the \(\mathop {}\mathopen {}\mathcal {O}\mathopen {}(H^4)\), adaptive, DIRK method from (a).

For (a) we use SUNDIALS’s modified Newton solver to handle the global nonlinear algebraic systems arising at each implicit stage of each time step. Since only \(f_2\) is treated implicitly and the reactions are purely local in space, the Newton linear systems are block-diagonal. As such, we provide a custom SUNLinearSolver implementation that solves each MPI rank-local linear system independently. The portion of the Jacobian matrix on each rank, \(J_p\), is itself block-diagonal, i.e., \(\begin{equation*} J_p = \begin{bmatrix} J_{p,1,1,1} & & & \\ & J_{p,2,1,1} & & \\ & & \ddots & \\ & & & J_{p,n_{xloc},n_{yloc},n_{zloc}} \end{bmatrix}, \end{equation*}\) with each cell-local block \(J_{p,i,j,k} \in \mathbb {R}^{10\times 10}\). We further leverage this structure by solving each \(J_p x_p = b_p\) linear system using the SUNDIALS SUNLinearSolver implementation that interfaces to the GPU-enabled, batched, dense LU solver routines from MAGMA [60].
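
The rank-local solve strategy above amounts to factoring each small dense block independently, which is exactly the structure that MAGMA's batched LU exploits on the GPU. The following is a minimal CPU sketch of that idea (Gaussian elimination with partial pivoting applied block by block), not the MAGMA or SUNDIALS code.

```c
#include <assert.h>
#include <math.h>

/* Solve one dense n-by-n system A x = b in place (row-major A, solution
   overwrites b) via Gaussian elimination with partial pivoting. */
static int solve_dense(int n, double *A, double *b)
{
  for (int k = 0; k < n; k++) {
    int piv = k; /* partial pivoting: largest magnitude entry in column k */
    for (int i = k + 1; i < n; i++)
      if (fabs(A[i * n + k]) > fabs(A[piv * n + k])) piv = i;
    if (fabs(A[piv * n + k]) < 1e-14) return 1; /* singular block */
    if (piv != k) {
      for (int j = 0; j < n; j++) {
        double t = A[k * n + j];
        A[k * n + j] = A[piv * n + j];
        A[piv * n + j] = t;
      }
      double t = b[k]; b[k] = b[piv]; b[piv] = t;
    }
    for (int i = k + 1; i < n; i++) { /* eliminate below the pivot */
      double m = A[i * n + k] / A[k * n + k];
      for (int j = k; j < n; j++) A[i * n + j] -= m * A[k * n + j];
      b[i] -= m * b[k];
    }
  }
  for (int i = n - 1; i >= 0; i--) { /* back substitution */
    for (int j = i + 1; j < n; j++) b[i] -= A[i * n + j] * b[j];
    b[i] /= A[i * n + i];
  }
  return 0;
}

/* Batched solve: nblk independent n-by-n blocks stored contiguously,
   mirroring the block-diagonal structure of J_p. Each block is fully
   independent, so on a GPU these solves proceed in parallel. */
static int solve_batched(int nblk, int n, double *A, double *b)
{
  for (int p = 0; p < nblk; p++)
    if (solve_dense(n, A + p * n * n, b + p * n)) return 1;
  return 0;
}
```

For the demonstration problem each block is the \(10\times 10\) cell-local Jacobian \(J_{p,i,j,k}\), and the batch size is the number of cells owned by the rank.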

The multirate approach (b) can leverage the structure of \(f_2\) at a higher level. Since the MRI method applied to this problem evolves “fast” sub-problems of the form (28) \(\begin{equation} v^{\prime }(t) = f_2(t,v) + r_i(t), \quad i=2,\ldots ,s, \end{equation}\) and all MPI communication necessary to construct the forcing functions, \(r_i(t)\), has already been performed, each sub-problem in Equation (28) consists of \(n_x\,n_y\,n_z\) spatially decoupled fast IVPs. We construct a custom fast integrator that groups all the independent fast IVPs on an MPI rank together as a single system evolved using a rank-local ARKStep instance. The code for this custom integrator itself is minimal, primarily consisting of steps to access the local subvectors in w on a given MPI rank and wrapping them in MPI-unaware ManyVectors provided to the local ARKStep instance. The collection of independent local IVPs also leads to a block-diagonal Jacobian, and we again utilize the SUNDIALS interface to MAGMA for linear solves within the modified Newton iteration.

These tests use periodic boundary conditions and a “clumpy” initial background density field: \(\begin{equation*} \rho _b(X) = 100 \left(1 + \sum _{i=1}^{N_c} s_i \exp \left(-2 \left(\frac{\Vert X-X_i\Vert }{r_i}\right)^2\right)\right), \end{equation*}\) where \(s_i \in [0,5]\), \(r_i \in [3\Delta x,6\Delta x]\), \(X_i\in \Omega _p\) are uniformly distributed random values, and there are 10 “clumps” per MPI rank. The initial background temperature is \(T_b(X)= 10\), and the initial velocity field is identically zero. The initial conditions for density and temperature are obtained by adding a Gaussian bump to the background states: \(\begin{equation*} \rho _0(X) = \rho _b(X) + 500 \exp \left(-2 \left(\frac{\Vert X-0.5\Vert }{0.1}\right)^2 \right), \quad T_0(X) = T_b(X) + 50 \exp \left(-2\left(\frac{\Vert X-0.5\Vert }{0.1}\right)^2 \right), \end{equation*}\) causing a high-pressure front that emanates from the center of the domain. We initialize the chemical species to consist almost exclusively of neutral Hydrogen (76%) and Helium (24%), with trace amounts of ionized Hydrogen and Helium (0.0000000001%) outside of this high-pressure region. As the region expands and increases the surrounding temperature, the chemical concentrations in those cells rapidly transition to a different equilibrium, resulting in a chemical time scale that is multiple orders of magnitude faster than the spatial propagation of the high-pressure front.

In Figure 3 we present timing results for this problem run on Summit at the Oak Ridge Leadership Computing Facility. Each Summit node contains two IBM POWER9 processors and six NVIDIA Tesla V100 GPUs. The nodes are connected to a dual-rail EDR InfiniBand network in a non-blocking fat tree topology. We performed a weak-scaling simulation with:

  • a \(25 \times 25 \times 25\) spatial grid per MPI rank,

  • multirate integrators using a slow time step size satisfying the CFL constraint \(H \le 10\Delta x\),

  • fast multirate time step sizes and ImEx time step sizes chosen adaptively by the ARKStep integrator with relative and absolute tolerances of \(10^{-5}\) and \(10^{-9}\), respectively,

  • temporal evolution over \([0,5H]\), and

  • six MPI ranks per Summit node, with one NVIDIA Volta GPU tied to each MPI rank.

Fig. 3.

Fig. 3. Demonstration code scaling: simulation times (first row) and algorithmic scalability (second row). fAdv corresponds to evaluation of the advection operator and fChem to the chemistry operator. In the first row, MPI corresponds to the global synchronization times required following each rank-local solver component, and other contains all components that are not directly measured (e.g., time spent within the GPU-based MAGMA linear solver). In the second row, steps, slow steps, and fast steps show the total number of time steps taken at each level of each solver; LSetups shows the total number of linear solver setup calls; Jevals shows the total number of Jacobian evaluations; and Niters shows the total number of Newton iterations.

We note that both the ImEx and MRI approaches select time steps to accurately track the chemical evolution. In the ImEx case, the entire simulation evolves with a shared step that is therefore set by the chemistry in the “hardest” spatial cell. However, in the multirate case, only the fast time scale must track the chemistry, and that occurs on a rank-by-rank basis, allowing other MPI ranks and other physical processes to evolve with larger time steps.

The top row of Figure 3 shows the total simulation times for both configurations of the problem, while the second row shows the corresponding algorithmic scalability. Examining the second row, we see that for all solver statistics both configurations demonstrate ideal algorithmic scalability, implying that our algorithm choices map optimally to the underlying problem structure.

From the first row of Figure 3 two observations are immediately clear. First, the high cost associated with each evaluation of the advection operator (shown as fAdv) causes the MRI approach to easily outperform the ImEx method, due to its much smaller number of evaluations. Second, the primary runtime slowdown experienced within both approaches arises in the green MPI component. For the ImEx method, this component corresponds solely to a single MPI_Allreduce after each rank-local linear solve to determine the overall “success” or “failure” of the global linear solve. Thus, the MPI cost reflects the cost of global synchronization. Similarly, in the MRI case the MPI component measures an MPI_Allreduce that determines success of the fast local IVP solves. Hence, the fundamental scalability difference between the ImEx and MRI approaches results from the frequency of these synchronizations. For the ImEx method synchronizations occur following each linear solve, within each Newton iteration, within each implicit stage, within each time step. Since ARKStep’s temporal adaptivity tracked the (fast) reaction time scale, the simulation required a large number of synchronizations. On the other hand, with the MRI method these synchronizations occur at the slow time scale, i.e., considerably less often than in the ImEx case.

Taking a step back to focus on ARKODE as an infrastructure, we conclude that its algorithmic flexibility, and in particular its support for custom solver components and the new MRIStep multirate module, enables a rich exploration of the “solver space” for this and many problems.


6 CONCLUSIONS

ARKODE is a highly flexible, usable, and scalable infrastructure for one-step IVP solvers. It is currently used by a variety of applications with functionality that is rapidly expanding to encompass novel and flexible numerical methods for simulating complex phenomena. The current version of ARKODE is 5.5.1; it is contained within SUNDIALS version 6.5.1 and is widely available through distribution modes including Spack [16] and GitHub.

We have designed ARKODE to support a range of one-step methods, providing reusable infrastructure components that new methods may leverage to minimize the time between initial inception and public release in a high-quality open-source software library. By supplying a method-specific step function that propagates the IVP solution and provides an estimate of the local truncation error, ARKODE can wrap these methods to provide well-established usage modes that evolve an IVP over a time interval or return after each internal step. ARKODE also provides added features including high-order dense output, event handling, and support for inequality constraints on solution components.
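
The step-plus-error-estimate contract described above can be illustrated with the simplest embedded pair. The sketch below uses an explicit Euler / Heun pair (orders 1 and 2), where the difference between the two candidate updates serves as the local truncation error estimate that drives step acceptance and adaptivity; it is illustrative pseudostructure only, not ARKODE's internal interface.

```c
#include <assert.h>
#include <math.h>

typedef double (*RhsFn)(double t, double y);

/* One step of an embedded explicit Euler / Heun pair: returns the
   second-order (Heun) update and reports |Heun - Euler| as the local
   truncation error estimate, the quantity a step-size controller would
   compare against the tolerance to accept or reject the step. */
static double heun_step(RhsFn f, double t, double y, double h, double *lte)
{
  double k1 = f(t, y);
  double k2 = f(t + h, y + h * k1);
  double y_euler = y + h * k1;               /* 1st-order embedding */
  double y_heun  = y + 0.5 * h * (k1 + k2);  /* 2nd-order solution  */
  *lte = fabs(y_heun - y_euler);             /* error estimate      */
  return y_heun;
}

/* Example right-hand side y' = y for exercising the step function. */
static double f_exp(double t, double y) { (void)t; return y; }
```

A time-stepping module supplying such a step function inherits the surrounding machinery (error-controlled step selection, dense output, event handling) without reimplementing it.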

The existing ecosystem of ARKODE time-stepping modules allows users to explore various options for time integration with only minor modifications to their codes, allowing these algorithms to increase in complexity with their application. The current modules include a lean explicit Runge–Kutta time stepper; an additive Runge–Kutta time stepper for explicit, implicit, and ImEx integration; and a new multirate infinitesimal time stepper that supports explicit, implicit, and ImEx integration of the slow time scale while allowing for arbitrary integration methods at the fast time scale. Through its support for user-supplied components (e.g., Butcher tables, MRI coefficients, and algebraic solver modules) within its existing time-stepping modules, as well as its infrastructure support for generic time-stepping modules, ARKODE can serve as a robust platform for rapidly developing and testing “experimental” methods at all levels of the solver hierarchy.

As with most open-source packages, we continue to add functionality to ARKODE. Current enhancements include support for multirate temporal adaptivity within MRIStep, as well as new time-stepping modules for advanced one-step methods (positivity/structure preserving, etc.).


ACKNOWLEDGMENTS

The authors would like to thank Alan Hindmarsh, Cody Balos, Jean Sexton, Slaven Peles, Ting Yan, and John Loffeld for their software and mathematical contributions. We would also like to thank Gregg Hommes, Sylvie Aubry, Tom Arsenlis, Paul Ullrich, Jorge Guerra, Chris Vogl, Andrew Steyer, Mark Taylor, Darin Ernst, and Manaure Francisquez for their use and feedback on ARKODE, and their patience as we ironed out kinks and constructed improvements.

REFERENCES

[1] Abel, Tom, Anninos, Peter, Zhang, Yu, and Norman, Michael L. 1997. Modeling primordial gas in numerical cosmology. New Astronomy 2, 3 (1997), 181–207.
[2] Abhyankar, Shrirang, Brown, Jed, Constantinescu, Emil M., Ghosh, Debojyoti, Smith, Barry F., and Zhang, Hong. 2018. PETSc/TS: A Modern Scalable ODE/DAE Solver Library. arXiv:1806.01437 [math.NA].
[3] Anderson, Robert, Andrej, Julian, Barker, Andrew, Bramwell, Jamie, Camier, Jean-Sylvain, Cerveny, Jakub, Dobrev, Veselin, Dudouit, Yohann, Fisher, Aaron, Kolev, Tzanio, Pazner, Will, Stowell, Mark, Tomov, Vladimir, Akkerman, Ido, Dahm, Johann, Medina, David, and Zampini, Stefano. 2021. MFEM: A modular finite element methods library. Computers & Mathematics with Applications 81 (2021), 42–74.
[4] Balos, Cody J., Gardner, David J., Woodward, Carol S., and Reynolds, Daniel R. 2021. Enabling GPU accelerated computing in the SUNDIALS time integration library. Parallel Comput. 108 (Dec. 2021), 102836.
[5] Bogacki, Przemyslaw and Shampine, Lawrence F. 1989. A 3(2) pair of Runge-Kutta formulas. Appl. Math. Lett. 2 (1989), 321–325.
[6] Cash, Jeff R. and Karp, Alan H. 1990. A variable order Runge-Kutta method for initial value problems with rapidly varying right-hand sides. ACM Trans. Math. Software 16 (1990), 201–222.
[7] Cervi, Jessica and Spiteri, Raymond J. 2019. A comparison of fourth-order operator splitting methods for cardiac simulations. Applied Numerical Mathematics 145 (2019), 227–235.
[8] Chinomona, Rujeko and Reynolds, Daniel R. 2021. Implicit-explicit multirate infinitesimal GARK methods. SIAM J. Sci. Comput. 43, 5 (Jan. 2021), A3082–A3113.
[9] Constantinescu, Emil M. and Sandu, Adrian. 2013. Extrapolated multirate methods for differential equations with multiple time scales. J. Sci. Comput. 56, 1 (July 2013), 28–44.
[10] Cooper, Graeme J. and Sayfy, Ali. 1980. Additive methods for the numerical solution of ordinary differential equations. Math. Comp. 35 (1980), 1159–1172.
[11] DeWitt, Stephen, Fackler, Philip, Song, Younggil, Radhakrishnan, Balasubramaniam, and Gorti, Sarma. 2022. MEUMAPPS (C++ Version).
[12] Falgout, Robert D., et al. 2022. XBraid: Parallel multigrid in time. http://llnl.gov/casc/xbraid.
[13] Falgout, Robert D., Friedhoff, Stephanie, Kolev, Tzanio V., MacLachlan, Scott P., and Schroder, Jacob B. 2014. Parallel time integration with multigrid. SIAM Journal on Scientific Computing 36 (2014), C635–C661.
[14] Fish, Alex C. and Reynolds, Daniel R. 2023. Adaptive time step control for infinitesimal multirate methods. SIAM Journal on Scientific Computing 45, 2 (2023), A958–A984.
[15] Francisquez, Manaure, Ernst, Darin R., Reynolds, Daniel R., and Balos, Cody J. 2021. A 2D gyrofluid model for coupled toroidal ITG/ETG multiscale turbulence and its comparison to gyrokinetics. In Bulletin of the American Physical Society (63rd Annual Meeting of the APS Division of Plasma Physics), Vol. 66. American Physical Society, Pittsburgh, PA.
[16] Gamblin, Todd, LeGendre, Matthew P., Collette, Michael R., Lee, Gregory L., Moody, Adam, de Supinski, Bronis R., and Futral, W. Scott. 2015. The Spack package manager: Bringing order to HPC software chaos. In Supercomputing 2015 (SC'15). ACM/IEEE.
[17] Gardner, David J., Guerra, Jorge E., Hamon, François P., Reynolds, Daniel R., Ullrich, Paul A., and Woodward, Carol S. 2018. Implicit-explicit (IMEX) Runge-Kutta methods for non-hydrostatic atmospheric models. Geosci. Model Dev. 11, 4 (2018), 1497–1515.
[18] Gardner, David J., Reynolds, Daniel R., Woodward, Carol S., and Balos, Cody J. 2022. Enabling new flexibility in the SUNDIALS suite of nonlinear and differential/algebraic equation solvers. ACM Trans. Math. Software 48, 3 (2022), 1–24.
[19] Gardner, David J., Woodward, Carol S., Reynolds, Daniel R., Hommes, Gregg, Aubry, Sylvie, and Arsenlis, A. Tom. 2015. Implicit integration methods for dislocation dynamics. Modelling and Simulation in Materials Science and Engineering 23, 2 (2015), 025006.
[20] Gear, C. William and Wells, Daniel R. 1984. Multirate linear multistep methods. BIT 24, 4 (Dec. 1984), 484–502.
[21] Glover, Simon C. O. and Abel, Tom. 2008. Uncertainties in H2 and HD chemistry and cooling and their role in early structure formation. Monthly Notices of the Royal Astronomical Society 388, 4 (2008), 1627–1651.
[22] Günther, Michael and Sandu, Adrian. 2016. Multirate generalized additive Runge Kutta methods. Numer. Math. 133, 3 (July 2016), 497–524.
[23] Gustafsson, Kjell. 1991. Control theoretic techniques for stepsize selection in explicit Runge-Kutta methods. ACM Trans. Math. Software 17 (1991), 533–554.
[24] Gustafsson, Kjell. 1994. Control-theoretic techniques for stepsize selection in implicit Runge-Kutta methods. ACM Trans. Math. Software 20 (1994), 496–512.
[25] Hairer, Ernst, Nørsett, Syvert P., and Wanner, Gerhard. 2000. Solving Ordinary Differential Equations I – Nonstiff Problems (2nd ed.). Springer Series in Computational Mathematics, Vol. 8. Springer-Verlag, Berlin.
[26] Hairer, Ernst and Wanner, Gerhard. 2002. Solving Ordinary Differential Equations II – Stiff and Differential-Algebraic Problems (2nd ed.). Springer Series in Computational Mathematics, Vol. 14. Springer-Verlag, Berlin.
[27] Henry de Frahan, Marc T., Rood, Jon S., Day, Marc S., Sitaraman, Hariswaran, Yellapantula, Shashank, Perry, Bruce A., Grout, Ray W., Almgren, Ann, Zhang, Weiqun, Bell, John B., and Chen, Jacqueline H. 2022. PeleC: An adaptive mesh refinement solver for compressible reacting flows. International Journal of High Performance Computing Applications 0, 0 (2022), 10943420221121151.
[28] Hiebert, Kathie L. and Shampine, Lawrence F. 1980. Implicitly Defined Output Points for Solutions of ODEs. Technical Report SAND80-0180. Sandia National Laboratories.
[29] Higueras, Inmaculada. 2006. Strong stability for additive Runge-Kutta methods. SIAM J. Numer. Anal. 44, 4 (Jan. 2006), 1735–1758.
[30] Higueras, Inmaculada. 2009. Characterizing strong stability preserving additive Runge-Kutta methods. J. Sci. Comput. 39, 1 (April 2009), 115–128.
[31] Higueras, Inmaculada, Happenhofer, Natalie, Koch, Othmar, and Kupka, Friedrich. 2014. Optimized strong stability preserving IMEX Runge-Kutta methods. J. Comput. Appl. Math. 272 (Dec. 2014), 116–140.
[32] Hindmarsh, Alan C., Brown, Peter N., Grant, Keith E., Lee, Steven L., Serban, Radu, Shumaker, Dan E., and Woodward, Carol S. 2005. SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM Trans. Math. Software 31, 3 (2005), 363–396.
[33] Hindmarsh, Alan C. and Serban, Radu. 2012. User Documentation for CVODE v2.7.0. Technical Report UCRL-SM-208108. Lawrence Livermore National Laboratory.
[34] Kang, Shinhoo, Dener, Alp, Hamilton, Aidan, Zhang, Hong, Constantinescu, Emil M., and Jacob, Robert L. 2022. Multirate Partitioned Runge-Kutta Methods for Coupled Navier-Stokes Equations.
[35] Kennedy, Christopher A. and Carpenter, Mark H. 2003. Additive Runge-Kutta schemes for convection-diffusion-reaction equations. Appl. Numer. Math. 44 (2003), 139–181.
[36] Kennedy, Christopher A. and Carpenter, Mark H. 2003. Additive Runge-Kutta schemes for convection-diffusion-reaction equations. Appl. Numer. Math. 44, 1–2 (2003), 139–181.
[37] Kennedy, Christopher A. and Carpenter, Mark H. 2019. Higher-order additive Runge-Kutta schemes for ordinary differential equations. Appl. Numer. Math. 136 (2019), 183–205.
[38] Knoth, Oswald and Wolke, Ralf. 1998. Implicit-explicit Runge-Kutta methods for computing atmospheric reactive flows. Appl. Numer. Math. 28, 2 (1998), 327–341.
[39] Kreckel, H., Bruhns, H., Čížek, M., Glover, S. C. O., Miller, K. A., Urbain, X., and Savin, D. W. 2010. Experimental results for H2 formation from H− and H and implications for first star formation. Science 329, 5987 (2010), 69–71.
[40] Luan, Vu Thai, Chinomona, Rujeko, and Reynolds, Daniel R. 2020. A new class of high-order methods for multirate differential equations. SIAM J. Sci. Comput. 42, 2 (Jan. 2020), A1245–A1268.
[41] Ober, Curtis C. 2022. The Tempus Project Website. https://trilinos.github.io/tempus.html.
[42] Pareschi, Lorenzo and Russo, Giovanni. 2005. Implicit-explicit Runge-Kutta schemes and applications to hyperbolic systems with relaxation. J. Sci. Comput. 25, 1 (Oct. 2005), 129–155.
[43] Rackauckas, Christopher and Nie, Qing. 2017. DifferentialEquations.jl – A performant and feature-rich ecosystem for solving differential equations in Julia. Journal of Open Research Software 5, 1 (2017), 15.
[44] Radhakrishnan, Bala, Gorti, Sarma, and Babu, Suresh Sudharsanam. 2016. Phase field simulations of autocatalytic formation of alpha lamellar colonies in Ti-6Al-4V. Metallurgical and Materials Transactions A 47, 12 (2016), 6577–6592.
[45] Ranocha, Hendrik, Dalcin, Lisandro, Parsani, Matteo, and Ketcheson, David I. 2021. Optimized Runge-Kutta methods with automatic step size control for compressible computational fluid dynamics. Communications on Applied Mathematics and Computation 4 (2021), 1191–1228.
[46] Reynolds, Daniel R., Gardner, David J., and Balos, Cody J. 2022. SUNDIALS ManyVector+Multirate Demonstration Code. https://github.com/sundials-codes/sundials-manyvector-demo.
[47] Reynolds, Daniel R., Gardner, David J., Balos, Cody J., and Woodward, Carol S. 2019. SUNDIALS multiphysics+MPIManyVector performance testing. arXiv:1909.12966 [cs] (Sept. 2019).
[48] Reynolds, Daniel R., Gardner, David J., Woodward, Carol S., and Balos, Cody J. 2022. User Documentation for ARKODE. https://sundials.readthedocs.io/en/latest/arkode.
[49] Roberts, Steven, Loffeld, John, Sarshar, Arash, Woodward, Carol S., and Sandu, Adrian. 2021. Implicit multirate GARK methods. J. Sci. Comput. 87, 1 (2021), 1–32.
[50] Roberts, Steven, Sarshar, Arash, and Sandu, Adrian. 2020. Coupled multirate infinitesimal GARK schemes for stiff systems with multiple time scales. SIAM J. Sci. Comput. 42, 3 (2020), A1609–A1638.
[51] Sandu, Adrian. 2019. A class of multirate infinitesimal GARK methods. SIAM J. Numer. Anal. 57, 5 (Jan. 2019), 2300–2327.
[52] Schlegel, Martin, Knoth, Oswald, Arnold, Martin, and Wolke, Ralf. 2009. Multirate Runge-Kutta schemes for advection equations. J. Comput. Appl. Math. 226, 2 (April 2009), 345–357.
[53] Schlegel, Martin, Knoth, Oswald, Arnold, Martin, and Wolke, Ralf. 2009. Multirate Runge-Kutta schemes for advection equations. J. Comput. Appl. Math. 226 (2009), 345–357.
  54. [54] Schlegel Martin, Knoth Oswald, Arnold Martin, and Wolke Ralf. 2012. Implementation of multirate time integration methods for air pollution modelling. Geosci. Model Dev. 5 (2012), 13951405.Google ScholarGoogle ScholarCross RefCross Ref
  55. [55] Schlegel Martin, Knoth Oswald, Arnold Martin, and Wolke Ralf. 2012. Numerical solution of multiscale problems in atmospheric modeling. Appl. Numer. Math. 62 (2012), 15311542.Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. [56] Shu Chi-Wang. 2003. High-order finite difference and finite volume WENO schemes and discontinuous Galerkin methods for CFD. Int. J. Comput. Fluid Dyn. 17, 2 (2003), 107118. Google ScholarGoogle ScholarCross RefCross Ref
  57. [57] Söderlind Gustaf. 1998. The automatic control of numerical integration. CWI Quarterly 11 (1998), 5574.Google ScholarGoogle Scholar
  58. [58] Söderlind Gustaf. 2003. Digital filters in adaptive time-stepping. ACM Trans. Math. Software 29 (2003), 126.Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. [59] Söderlind Gustaf. 2006. Time-step selection algorithms: Adaptivity, control and signal processing. Appl. Numer. Math. 56 (2006), 488502.Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. [60] Tomov Stanimire, Dongarra Jack, and Baboulin Marc. 2010. Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Comput. 36, 5–6 (June2010), 232240. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. [61] Trevisan Cynthia S. and Tennyson Jonathan. 2002. Calculated rates for the electron impact dissociation of molecular hydrogen, deuterium and tritium. Plasma Physics and Controlled Fusion 44, 7 (June2002), 12631276. Google ScholarGoogle ScholarCross RefCross Ref
  62. [62] Group INdAM Bari Unit Project. 2013. Test Set for IVP Solvers. https://archimede.dm.uniba.it/testset/testsetiv psolvers.Google ScholarGoogle Scholar
  63. [63] Vogl Christopher J., Steyer Andrew, Reynolds Daniel R., Ullrich Paul A., and Woodward Carol S.. 2019. Evaluation of implicit-explicit additive Runge-Kutta integrators for the HOMME-NH dynamical core. J. Adv. Model. Earth Syst. 11, 12 (2019), 42284244.Google ScholarGoogle ScholarCross RefCross Ref
  64. [64] Wensch Jörg, Knoth Oswald, and Galant Alexander. 2009. Multirate infinitesimal step methods for atmospheric flow simulation. BIT Numer. Math. 49, 2 (June2009), 449473. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. [65] Zonneveld J. A.. 1963. Automatic Integration of Ordinary Differential Equations. Technical Report R743. Mathematisch Centrum, Postbus 4079, 1009AB Amsterdam.Google ScholarGoogle Scholar


Published in ACM Transactions on Mathematical Software, Volume 49, Issue 2 (June 2023), 275 pages. ISSN: 0098-3500; EISSN: 1557-7295; DOI: 10.1145/3604595. Publisher: Association for Computing Machinery, New York, NY, United States.

ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of the United States government. As such, the United States government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for government purposes only.

Publication History

  • Received: 27 May 2022
  • Revised: 26 March 2023
  • Accepted: 10 April 2023
  • Online AM: 25 April 2023
  • Published: 15 June 2023
