Adaptive time-stepping for transient solver using SUNDIALS #292
Conversation
…ve explicit RK integrator
Looking good. There are a few smaller structural issues, but the two bigger things I am thinking about:

a) Can the `RUNGE_KUTTA` option instead use the MFEM internal time integrators? That way the SUNDIALS connection is really only used for the adaptivity. This would slightly tidy the interface for adaptivity, along with making the non-adaptive path slightly more fully featured. There are some SDIRK options in there that would probably be good choices (see the sketch below).

b) Is there a way to compute `B` implicitly as part of this process?

I have some of my suggestions on hughcars/transient-adapt-dt if you want to take a look.
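A minimal sketch of what (a) might look like, assuming the PR's first-order operator implements ImplicitSolve(); the names here are illustrative, not from the PR:

// Drive the fixed-step path with one of MFEM's built-in SDIRK integrators so
// that SUNDIALS is only pulled in when adaptivity is requested. `op` stands in
// for the PR's first-order [E, dE/dt] TimeDependentOperator.
#include <mfem.hpp>
#include <memory>

std::unique_ptr<mfem::ODESolver> MakeFixedStepSolver(mfem::TimeDependentOperator &op)
{
  // L-stable, 3-stage, third-order SDIRK; other implicit choices live in
  // mfem/linalg/ode.hpp (SDIRK23Solver, SDIRK34Solver, BackwardEulerSolver, ...).
  std::unique_ptr<mfem::ODESolver> ode = std::make_unique<mfem::SDIRK33Solver>();
  ode->Init(op);
  return ode;
}

// Usage (fixed dt): auto ode = MakeFixedStepSolver(op);
//                   for (int step = 0; step < nsteps; step++) { ode->Step(u, t, dt); }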
En += E;
Curl->AddMult(En, B, -0.5 * dt);
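For context, assuming `En` holds $E^{n} + E^{n+1}$ after the `En += E;` line, these two lines amount to a trapezoidal update of Faraday's law $\partial B/\partial t = -\nabla \times E$ with the discrete curl:

$$
B^{n+1} = B^{n} - \frac{\Delta t}{2}\, \mathrm{Curl}\,\bigl(E^{n} + E^{n+1}\bigr).
$$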
Is there a better way to be computing `B` as part of this?
I don't think so. We solve the ODE system for `E` and `Edot`; I'm not sure how else to get `B`.
FYI you can, but you have to use the (more standard?) `[E, B]` linearization instead of the `[E, dE/dt]` one. This is described in Zhu and Cangellaris and should permit the same linear system for `E` after 2x2 block elimination.
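For reference, a schematic of that elimination (my notation, not from the PR; lossless case, with $M$ the mass matrix, $C$ the discrete curl, and $K = C^{\mathsf T} \mu^{-1} C$ the curl-curl stiffness). Backward Euler applied to the first-order `[E, B]` system

$$
\frac{\partial B}{\partial t} = -\,C E, \qquad
M \frac{\partial E}{\partial t} = C^{\mathsf T} \mu^{-1} B - J(t)
$$

gives

$$
B^{n+1} = B^{n} - \Delta t\, C E^{n+1}, \qquad
M E^{n+1} = M E^{n} + \Delta t \left( C^{\mathsf T} \mu^{-1} B^{n+1} - J^{n+1} \right),
$$

and eliminating $B^{n+1}$ leaves

$$
\left( M + \Delta t^{2} K \right) E^{n+1}
  = M E^{n} + \Delta t\, C^{\mathsf T} \mu^{-1} B^{n} - \Delta t\, J^{n+1},
$$

i.e. the same $M + \Delta t^{2} K$ type of system that the second-order `[E, dE/dt]` formulation ends up solving (damping/loss terms omitted for brevity).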
I meant there's no easy way in this `[E, dE/dt]` linearization. I actually tried the `[E, B]` formulation first, but I was seeing some weird things: small but persistent solution differences compared to the second-order ODE formulation, error vs. dt trends not matching the expected order of accuracy, and some stability issues on complex cases. Not sure if it was an issue with the `MixedVectorWeakCurl` operator, some BC consideration (?), or, most likely, me doing something wrong. We're thinking of moving forward with the `[E, dE/dt]` formulation for now, but I will try to revisit this later.
palace/models/timeoperator.cpp
Outdated
Vector du1, du2, rhs1, rhs2;
du1.UseDevice(true);
du2.UseDevice(true);
rhs1.UseDevice(true);
rhs2.UseDevice(true);
du.GetSubVector(idx1, du1);
du.GetSubVector(idx2, du2);
RHS.GetSubVector(idx1, rhs1);
RHS.GetSubVector(idx2, rhs2);
Why do you need `idx1` or `idx2` here rather than the approach taken in `ComplexVector` of using `ReadWrite()`?
du.ReadWrite();
Vector du1(du.GetData() + 0, size_E), du2(du.GetData() + size_E, size_E);
RHS.ReadWrite();
Vector rhs1(RHS.GetData() + 0, size_E), rhs2(RHS.GetData() + size_E, size_E);
Also probably worth making `RHS` into `rhs` or `rhsX` into `RHSX` for consistency.
I tried that first but I was getting errors like this when running on GPU:
Verification failed: (it != maps->memories.end()) is false:
--> host pointer is not registered: h_ptr = 0x2af94870
... in function: static void mfem::MemoryManager::CheckHostMemoryType_(mfem::MemoryType, void*, bool)
... in file: /data/home/simlap/palace/build/extern/mfem/general/mem_manager.cpp:1714
Then I saw this page https://mfem.org/gpu-support/ warning against `GetData()` on GPU.
u.Read();
u1.MakeRef(const_cast<Vector &>(u), 0, size_E);
u2.MakeRef(const_cast<Vector &>(u), size_E, size_E);
rhs.ReadWrite();
rhs1.MakeRef(rhs, 0, size_E);
rhs2.MakeRef(rhs, size_E, size_E);
Nice! The `const_cast` isn't my favorite here, as it means `u1` and `u2` aren't reflecting the safety constraints on `u`. I have a trick I think should work to address this using structured bindings:
diff --git a/palace/models/timeoperator.cpp b/palace/models/timeoperator.cpp
index ba5eb184..e12266bc 100644
--- a/palace/models/timeoperator.cpp
+++ b/palace/models/timeoperator.cpp
@@ -18,6 +18,28 @@ namespace palace
namespace
{
+// Helper method for assembling a pair of vectors as references to each half of a vector.
+std::pair<Vector, Vector> AssemblePairedRef(Vector &x, int size)
+{
+ Vector x1, x2;
+ x1.UseDevice(true);
+ x2.UseDevice(true);
+ x1.MakeRef(x, 0, size);
+ x2.MakeRef(x, size, size);
+ return {x1, x2};
+};
+
+// Helper method for assembling a pair of vectors as references to each half of a vector.
+std::pair<const Vector, const Vector> AssemblePairedRef(const Vector &x, int size)
+{
+ Vector x1, x2;
+ x1.UseDevice(true);
+ x2.UseDevice(true);
+ x1.MakeRef(const_cast<Vector &>(x), 0, size);
+ x2.MakeRef(const_cast<Vector &>(x), size, size);
+ return {x1, x2};
+};
+
class TimeDependentFirstOrderOperator : public mfem::TimeDependentOperator
{
public:
@@ -103,17 +125,8 @@ public:
// Form the RHS for the first-order ODE system
void FormRHS(const Vector &u, Vector &rhs) const
{
- Vector u1, u2, rhs1, rhs2;
- u1.UseDevice(true);
- u2.UseDevice(true);
- rhs1.UseDevice(true);
- rhs2.UseDevice(true);
- u.Read();
- u1.MakeRef(const_cast<Vector &>(u), 0, size_E);
- u2.MakeRef(const_cast<Vector &>(u), size_E, size_E);
- rhs.ReadWrite();
- rhs1.MakeRef(rhs, 0, size_E);
- rhs2.MakeRef(rhs, size_E, size_E);
+ const auto [u1, u2] = AssemblePairedRef(u, size_E);
+ auto [rhs1, rhs2] = AssemblePairedRef(rhs, size_E);
// u1 = Edot, u2 = E
// rhs1 = -(K * u2 + C * u1) - J(t)
@@ -140,17 +153,8 @@ public:
}
FormRHS(u, RHS);
- Vector du1, du2, RHS1, RHS2;
- du1.UseDevice(true);
- du2.UseDevice(true);
- RHS1.UseDevice(true);
- RHS2.UseDevice(true);
- du.ReadWrite();
- du1.MakeRef(du, 0, size_E);
- du2.MakeRef(du, size_E, size_E);
- RHS.ReadWrite();
- RHS1.MakeRef(RHS, 0, size_E);
- RHS2.MakeRef(RHS, size_E, size_E);
+ auto [du1, du2] = AssemblePairedRef(du, size_E);
+ auto [RHS1, RHS2] = AssemblePairedRef(RHS, size_E);
kspM->Mult(RHS1, du1);
du2 = RHS2;
@@ -171,17 +175,8 @@ public:
Mpi::Print("\n");
FormRHS(u, RHS);
- Vector k1, k2, RHS1, RHS2;
- k1.UseDevice(true);
- k2.UseDevice(true);
- RHS1.UseDevice(true);
- RHS2.UseDevice(true);
- k.ReadWrite();
- k1.MakeRef(k, 0, size_E);
- k2.MakeRef(k, size_E, size_E);
- RHS.ReadWrite();
- RHS1.MakeRef(RHS, 0, size_E);
- RHS2.MakeRef(RHS, size_E, size_E);
+ auto [k1, k2] = AssemblePairedRef(k, size_E);
+ auto [RHS1, RHS2] = AssemblePairedRef(RHS, size_E);
// A k1 = RHS1 - dt K RHS2
K->AddMult(RHS2, RHS1, -dt);
@@ -215,18 +210,11 @@ public:
// Solve (Mass - dt Jacobian) x = Mass b
int SUNImplicitSolve(const Vector &b, Vector &x, double tol) override
{
- Vector b1, b2, x1, x2, RHS1;
- b1.UseDevice(true);
- b2.UseDevice(true);
- x1.UseDevice(true);
- x2.UseDevice(true);
+ const auto [b1, b2] = AssemblePairedRef(b, size_E);
+ auto [x1, x2] = AssemblePairedRef(x, size_E);
+
+ Vector RHS1;
RHS1.UseDevice(true);
- b.Read();
- b1.MakeRef(const_cast<Vector &>(b), 0, size_E);
- b2.MakeRef(const_cast<Vector &>(b), size_E, size_E);
- x.ReadWrite();
- x1.MakeRef(x, 0, size_E);
- x2.MakeRef(x, size_E, size_E);
RHS.ReadWrite();
RHS1.MakeRef(RHS, 0, size_E);
If that works, we might actually want to use the same trick in other places; then the `const` violation will be confined to one relatively safe location. You should test this on your GPU builds though, as it's very plausible that the `Vector` constructors don't play too nicely with this.
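A quick aliasing check along these lines could confirm whether the `MakeRef` references survive the `std::pair` return (hypothetical test, not part of the PR, assuming the `AssemblePairedRef` helper from the diff above is visible in the translation unit):

#include <mfem.hpp>
#include <cassert>

int main()
{
  const int size_E = 4;
  mfem::Vector x(2 * size_E);
  x = 0.0;
  // Bind the two halves; if the pair members end up as deep copies instead of
  // aliases, the writes below never reach x and the assert fires.
  auto [x1, x2] = AssemblePairedRef(x, size_E);
  x1 = 1.0;
  x2 = 2.0;
  assert(x(0) == 1.0 && x(size_E) == 2.0);
  return 0;
}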
I updated the config file defaults and parameter checking so we can use appropriate constraints in the JSON script. I'm using some existence checks and a few `if`s in config.cpp to remove the need for the exhaustive if/else chains I had in iodata.cpp.
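A hypothetical illustration of that pattern (not the PR's actual code; the key names and values are made up here, assuming an nlohmann::json-style parsed config):

#include <nlohmann/json.hpp>
#include <stdexcept>

// Only enforce adaptivity-related constraints when the user actually selected
// an adaptive (SUNDIALS) integrator, instead of enumerating every combination
// in an exhaustive if/else chain.
void CheckTransientConfig(const nlohmann::json &transient)
{
  const bool adaptive = transient.contains("Type") &&
                        (transient["Type"] == "CVODE" || transient["Type"] == "ARKODE");
  if (!adaptive && transient.contains("RelTol"))
  {
    throw std::runtime_error(
        "\"RelTol\" is only supported for adaptive (SUNDIALS) transient solvers!");
  }
}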
Add adaptive time-stepping capability for transient simulations. The default transient solver type remains a fixed time-stepping integrator, but the user can now use SUNDIALS implicit multistep (CVODE) and Runge-Kutta (ARKODE) integrators if desired.