A new gauge fixing algorithm which returns the rotation field. #1481

Open
wants to merge 26 commits into base: develop

SaltyChiang
Contributor

@SaltyChiang SaltyChiang commented Jul 23, 2024

I implemented a new over-relaxation gauge-fixing algorithm. The main difference from the old implementation is that the new one returns the rotation field, which our workflow often requires.

The rotation field $g(x)$ is defined as follows:
$$U^\prime_\mu(x)=g(x)U_\mu(x)g^\dagger(x+\hat{\mu})$$
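
As a minimal illustration of this definition (a toy sketch, not QUDA code), the snippet below applies the rotation to a U(1) field on a small periodic 2D lattice, so each link and each $g(x)$ is a single complex phase rather than an SU(3) matrix. The lattice size and the helper names `shift`, `gauge_rotate`, and `plaquette` are hypothetical. It checks that the plaquette, which is gauge invariant, is unchanged by the rotation.

```python
import cmath
import random

L = 4  # toy lattice extent (hypothetical), periodic in both directions
DIRS = ((1, 0), (0, 1))
SITES = [(i, j) for i in range(L) for j in range(L)]


def shift(x, mu):
    """Return the site x + mu_hat with periodic wrap-around."""
    return ((x[0] + DIRS[mu][0]) % L, (x[1] + DIRS[mu][1]) % L)


def random_phase(rng):
    """Return a random U(1) element exp(i * theta)."""
    return cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi))


def gauge_rotate(U, g):
    """Apply U'_mu(x) = g(x) U_mu(x) g^dagger(x + mu_hat) to every link."""
    return {(x, mu): g[x] * u * g[shift(x, mu)].conjugate()
            for (x, mu), u in U.items()}


def plaquette(U, x):
    """U_0(x) U_1(x + 0_hat) U_0^dagger(x + 1_hat) U_1^dagger(x)."""
    return (U[(x, 0)] * U[(shift(x, 0), 1)]
            * U[(shift(x, 1), 0)].conjugate() * U[(x, 1)].conjugate())


rng = random.Random(0)
U = {(x, mu): random_phase(rng) for x in SITES for mu in (0, 1)}
g = {x: random_phase(rng) for x in SITES}

U_rot = gauge_rotate(U, g)

# Largest change of any plaquette under the rotation (rounding error only).
max_plaq_diff = max(abs(plaquette(U, x) - plaquette(U_rot, x)) for x in SITES)
# Largest change of any single link (generically of order one).
max_link_diff = max(abs(U[k] - U_rot[k]) for k in U)
print(max_plaq_diff, max_link_diff)
```

The links themselves change, while every plaquette agrees up to rounding error, which is the property that makes the rotation field a pure change of gauge.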

  • Add gaugeRotate to rotate a gauge field
    • Expose performGaugeRotateQuda in quda.h
  • Add gaugeFixingQuality to evaluate gauge fixing quality
    • Expose gaugeFixingQuality in quda.h
  • Add QudaGaugeFixParam to handle parameters for gauge fixing
    • Can this be used for FFT in the future?
  • Add performGaugeFixQuda in quda.h
    • Will not break the existing interface
  • Add gaugeShift to shift a gauge field
    • Expose gaugeShift in quda.h
    • The same feature as Adding gauge shift #1348
    • We actually want to shift the gauge field $U_\mu$ for a single value of $\mu$ rather than for all four
  • Add shift-only mode to covariant derivative
    • Add covdev_shift to QudaInvertParam to use the shift kernel
  • Enable both Ns=4 and Ns=1 for QUDA_COVDEV_DSLASH and QUDA_LAPLACE_DSLASH
    • Implemented by adding staggered to QudaInvertParam, which is only used with QUDA_COVDEV_DSLASH and QUDA_LAPLACE_DSLASH
  • Fix possible divergence if distance preconditioning is used
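
To make the rotation-field and quality items above concrete, here is a toy sketch of over-relaxed Landau gauge fixing for a U(1) field on a small periodic 2D lattice. It is not the algorithm in this PR and does not call QUDA: the lattice size, `OMEGA`, and the names `hop`, `quality`, and `overrelax_sweep` are hypothetical, and the (functional, theta) pair follows the usual Landau-gauge convention, which is only assumed to resemble what gaugeFixingQuality reports. The sweep accumulates the local rotations into `g`, so the fixed field can be rebuilt from the original one afterwards.

```python
import cmath
import random

L = 4        # toy lattice extent (hypothetical), periodic in both directions
OMEGA = 1.5  # over-relaxation parameter, illustrative value in (1, 2)
DIRS = ((1, 0), (0, 1))
SITES = [(i, j) for i in range(L) for j in range(L)]


def hop(x, mu, sign=1):
    """Return the site x + sign * mu_hat with periodic wrap-around."""
    return ((x[0] + sign * DIRS[mu][0]) % L, (x[1] + sign * DIRS[mu][1]) % L)


def quality(U):
    """Return (functional, theta): mean Re U and mean squared divergence of Im U."""
    functional = sum(u.real for u in U.values()) / len(U)
    theta = 0.0
    for x in SITES:
        div = sum(U[(x, mu)].imag - U[(hop(x, mu, -1), mu)].imag
                  for mu in (0, 1))
        theta += div * div
    return functional, theta / len(SITES)


def overrelax_sweep(U, g):
    """One sequential sweep; updates the links and the rotation field in place."""
    for x in SITES:
        # Sum of everything multiplying the local rotation r in sum Re[U].
        w = sum(U[(x, mu)] + U[(hop(x, mu, -1), mu)].conjugate()
                for mu in (0, 1))
        # The local maximizer is exp(-i * phase(w)); over-relax its angle.
        r = cmath.exp(-1j * OMEGA * cmath.phase(w))
        for mu in (0, 1):
            U[(x, mu)] *= r                           # link leaving x
            U[(hop(x, mu, -1), mu)] *= r.conjugate()  # link arriving at x
        g[x] *= r  # accumulate the rotation field


rng = random.Random(1)
U0 = {(x, mu): cmath.exp(1j * rng.uniform(-1.0, 1.0))
      for x in SITES for mu in (0, 1)}
U = dict(U0)
g = {x: 1 + 0j for x in SITES}

before = quality(U)
for _ in range(200):
    overrelax_sweep(U, g)
after = quality(U)

# The accumulated rotation field reproduces the fixed field from the original.
mismatch = max(abs(U[(x, mu)] - g[x] * U0[(x, mu)] * g[hop(x, mu)].conjugate())
               for x in SITES for mu in (0, 1))
print(before, after, mismatch)
```

Each local update can only raise the functional for an over-relaxation parameter between 0 and 2, so the sweep never makes the gauge condition worse in this toy setting.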

@SaltyChiang SaltyChiang requested review from a team as code owners July 23, 2024 12:25
@SaltyChiang
Contributor Author

SaltyChiang commented Jul 23, 2024

I'm wondering whether I should replace the old interface computeGaugeFixingOVRQuda or add a new interface function. Please let me know if you have a preference.

I tested the performance and didn't see any significant performance regression compared to the old implementation. But more testing is definitely needed.

@maddyscientist
Member

@Jenkins test this please

@maddyscientist
Member

Thanks for this PR @SaltyChiang. I don't think there should be any performance change, so I think we could just replace the old interface with the new one.

Can you go ahead and make this change to your PR? I think you'll need to update the MILC interface code (milc_interface.cpp) since that code calls the older interface, but that should be an easy change.

@SaltyChiang SaltyChiang marked this pull request as draft April 30, 2025 14:42
@SaltyChiang
Contributor Author

Some additional explanation about the distance preconditioning:
The source vector has a very small norm after reweighting, so I force the reweighted vector to be normalized. With a very large Nt, the value of cosh(alpha*(t-t0)) can also become too large or too small for fp32 (roughly >3.4e38 or <1.2e-38), which produces nan values, so I force the weighting function to return fp64 values.

@SaltyChiang SaltyChiang marked this pull request as ready for review May 9, 2025 09:17
@SaltyChiang
Contributor Author

The performance could be further improved by hiding communication time behind computation; the old version of gauge fixing does this by dividing the points into two parts called "Border" and "Int". The new gauge fixing cannot beat the old one in some situations (for example, when reunit_interval is not very small), so I decided not to remove the existing algorithm.

I did not write a kernel similar to the old one, since it's not good for readability. I think a similar optimization could be applied to the new implementation by using a special dslash kernel working on a special spinor with Ns=3 and Nc=3. I want to work on other topics and will leave the code here for now.

Member

@maddyscientist maddyscientist left a comment

Thanks for all this work @SaltyChiang. This looks like a great contribution.

The main request I have for you is that we would need some unit testing added for these new features.

  • For the new gauge fixing functionality, could you add a test for this in the gauge_alg_test?
  • Do you have any thoughts on how to test the shift-only covariant derivative?

include/quda.h Outdated
@@ -140,6 +142,8 @@ extern "C" {

int laplace3D; /**< omit this direction from laplace operator: x,y,z,t -> 0,1,2,3 (-1 is full 4D) */
int covdev_mu; /**< Apply forward/backward covariant derivative in direction mu(mu<=3)/mu-4(mu>3) */
bool covdev_shift; /**< Apply the shift instead of the covariant derivative */
bool staggered; /**< If the input field is staggered or not for Laplace and CovDev */
Member

Perhaps a more descriptive variable name is needed here, instead of just staggered?

Contributor Author

Both COVDEV and LAPLACE will use it, and I thought something like covdev_laplace_staggered/covdev_laplace_nspin looks terrible. Another choice is to use two variables like covdev_nspin and laplace_nspin. Which one do you prefer?

Contributor Author

@SaltyChiang SaltyChiang May 16, 2025

I now use laplace_nspin and covdev_nspin instead. They are initialized to 1 and 4, respectively, to preserve the previous behavior.

@maddyscientist
Member

> The performance could be further improved by hiding communication time during computation, and the old version of gauge fixing divided the points into two parts called "Border" and "Int" to implement it. The current performance of the gauge fixing cannot beat the old one in some situations (for example, reunit_interval not very small), so I decided not to remove the existing algorithm.
>
> I did not write a kernel similar to the old one, since it's not good for readability. I think a similar optimization could be applied to the new implementation by using a special dslash kernel working on a special spinor with Ns=3 and Nc=3. I want to work on other topics and will leave the code here for now.

Yes, the old version of the code that overlaps comms and compute, while efficient, is horrible to read. Fine to have both versions of the code for now, and for correctness testing, having the two versions is not a bad thing anyway. 😄

@SaltyChiang
Contributor Author

I noticed the name covdev_mu is also used in covdev_test.cpp but has a different meaning from that in QudaInvertParam (I added it in the previous PR). I think it looks a bit ambiguous. Maybe another name, such as --test-mu, is better?

@SaltyChiang
Contributor Author

@maddyscientist Tests for new gauge fixing and shift-only covdev are added.

  • The fp32 test causes nan values in the new gauge fixing algorithm. Using double versors is a workaround.
  • The shift-only covariant derivative is tested by comparing the shift with the normal covariant derivative on a unit gauge field, where the two results should be identical.
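
The idea behind the second test can be sketched on a periodic 1D toy lattice with U(1) links. This is not the QUDA test itself: the lattice size, the helper names `covdev` and `shift`, and the convention that the forward derivative is $U(x)\psi(x+1)$ and the backward one is $U^\dagger(x-1)\psi(x-1)$ are assumptions for illustration.

```python
import cmath
import random

N = 8  # sites on a periodic 1D toy lattice (hypothetical size)


def covdev(U, psi, forward=True):
    """Forward: U(x) psi(x+1). Backward: U^dagger(x-1) psi(x-1)."""
    if forward:
        return [U[x] * psi[(x + 1) % N] for x in range(N)]
    return [U[(x - 1) % N].conjugate() * psi[(x - 1) % N] for x in range(N)]


def shift(psi, forward=True):
    """Shift-only variant: the same neighbor, without the link factor."""
    step = 1 if forward else -1
    return [psi[(x + step) % N] for x in range(N)]


def max_abs_diff(a, b):
    """Largest site-wise difference between two fields."""
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(2)
psi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
unit_U = [1 + 0j] * N
random_U = [cmath.exp(1j * rng.uniform(0.5, 2.5)) for _ in range(N)]

# With a unit gauge field the two operators agree in both directions.
diff_unit = max(max_abs_diff(covdev(unit_U, psi, f), shift(psi, f))
                for f in (True, False))
# With non-trivial links they differ, so the comparison is meaningful.
diff_random = max_abs_diff(covdev(random_U, psi, True), shift(psi, True))
print(diff_unit, diff_random)
```

The second comparison is included only to show that the check is not vacuous: agreement on a unit field is specific to that field.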

2 participants