Skip to content

Use compiler builtins to detect "simple common cases" in pp_add, pp_subtract, and pp_multiply #23503

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 19, 2025

Conversation

t-a-k
Copy link
Contributor

@t-a-k t-a-k commented Jul 29, 2025

This will hopefully make the code faster and smaller, and make more cases to be handled as "simple common cases".

Note that this change uses HAS_BUILTIN_{ADD,SUB,MUL}_OVERFLOW macros which have already been defined in config.h but seem not to have been used by existing code.

The behavior should be the same before and after this change.


  • This set of changes requires a perldelta entry, and it is included.

@tonycoz
Copy link
Contributor

tonycoz commented Jul 30, 2025

This breaks Win32, which doesn't enable the __builtin_add_... etc builtins even for gcc.

I suspect it's due to long being 32-bits even on 64-bit Win32, but I haven't tried to debug it.

@t-a-k
Copy link
Contributor Author

t-a-k commented Jul 31, 2025

This breaks Win32, which doesn't enable the __builtin_add_... etc builtins even for gcc.

I suspect it's due to long being 32-bits even on 64-bit Win32, but I haven't tried to debug it.

Thank you for your comment. My patch has a fallback code (similar to the code used before this change) for compilers with no overflow-checking builtins, but it had a bug (I was able to reproduce similar symptoms by ./Configure ... -Duse64bitint -Ud_builtin_mul_overflow on 32-bit x86 Linux). I've pushed a commit to fix this.

inline.h Outdated
# ifndef IV_MUL_OVERFLOW_IS_EXPENSIVE
/* Strict overflow check for IV multiplication is generally expensive
* when IV is a multi-word integer. */
# define IV_MUL_OVERFLOW_IS_EXPENSIVE (IVSIZE > LONGSIZE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a reasonable test - if we enable the builtins for GCC on Win32 x86-64 this will be false even though it is a 64-bit platform.

Testing against PTRSIZE is probably better since that better matches the platform word size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. I've pushed a commit to change the test to against PTRSIZE.

inline.h Outdated

/* Define IV_*_OVERFLOW_IS_EXPENSIVE below to nonzero value
* if strict overflow checks are too expensive
* (for example, for CPUs that has no hardware overflow detection flag).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RISC V doesn't have a hardware overflow flag (or any classic carry, zero etc flags), so the compiler generates more complex code:

bool my_chk_add(long a, long b, long *result) {
    return __builtin_add_overflow(a, b, result);
}
; RISC-V
my_chk_add(long, long, long*):
        add     a5,a0,a1
        slt     a0,a5,a0
        slti    a1,a1,0
        sub     a0,a1,a0
        sd      a5,0(a2)
        snez    a0,a0
        ret
; amd64
"my_chk_add(long, long, long*)":
        add     rdi, rsi
        mov     QWORD PTR [rdx], rdi
        seto    al
        ret
; arm64
my_chk_add(long, long, long*):
        adds    x1, x0, x1
        str     x1, [x2]
        cset    w0, vs
        ret

(no action needed here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as RISC-V ISA is concerned, it seems to require only two (slt and slti) additional instructions to detect signed overflow. I think this is still faster than classic overflow estimations previously used in pp_add and pp_subtract.

# endif

# if defined(I_STDCKDINT) && !IV_ADD_SUB_OVERFLOW_IS_EXPENSIVE
/* XXX Preparation for upcoming C23, but I_STDCKDINT is not yet tested */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modern clang has stdckdint.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but it would require patches to Configure and so on (and I currently have no test environment with modern Clang or GCC 14+), so I intend to make a separate PR to enable it.

@tonycoz
Copy link
Contributor

tonycoz commented Aug 18, 2025

It looks reasonable to me.

3 commit messages indicate they need to be squashed

…vailable

This will hopefully make the code faster and smaller, and
make more cases to be handled as "simple common cases".

Note that this change uses HAS_BUILTIN_{ADD,SUB,MUL}_OVERFLOW macros
which have already been defined in config.h but seem not to have been
used by existing code.

t/op/64bitint.t: Add tests to exercise "simple common cases".
Note that these tests should pass even before this change.

Thanks to @tonycoz for advices to make this patch work better for
LLP64 platforms, especially Win32 x86-64.
@t-a-k t-a-k force-pushed the overflow-builtins branch from b2a723a to 61ea76f Compare August 18, 2025 18:02
@t-a-k
Copy link
Contributor Author

t-a-k commented Aug 18, 2025

It looks reasonable to me.

3 commit messages indicate they need to be squashed

Thank you for reviewing. I've squashed (and rebased) the commits.

t/op/64bitint.t Outdated
@@ -469,4 +469,30 @@ cmp_ok 0x3ffffffffffffffe % -0xc000000000000000, '==', -0x8000000000000002, 'mo
cmp_ok 0x3fffffffffffffff % -0xc000000000000000, '==', -0x8000000000000001, 'modulo is (IV_MIN-1)';
cmp_ok 0x4000000000000000 % -0xc000000000000000, '==', -0x8000000000000000, 'modulo is IV_MIN';

# Arithmetic close to IV overflow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this comment line and the one on line 482 below, consider adding [GH 23503] so that in the future we can look up the context of the code change more easily.

@khwilliamson khwilliamson merged commit c27ae6c into Perl:blead Aug 19, 2025
33 checks passed
@t-a-k t-a-k deleted the overflow-builtins branch August 20, 2025 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants