ARC-V RHX-100 upstream patch series #192

MichielDerhaeg · 2025-11-28T20:05:47Z

No description provided.

MichielDerhaeg

I tried to split the commits up into something sensible. I didn't check whether they can be built individually, though.

gcc/config/riscv/riscv.cc

gcc/config/riscv/riscv-protos.h

MichielDerhaeg · 2025-11-28T20:10:29Z

gcc/config/riscv/riscv.cc

    case SIGN_EXTRACT:
-      if (TARGET_XTHEADBB && outer_code == SET
+      if ((TARGET_ARCV_RHX100 || TARGET_XTHEADBB)
+	  && outer_code == SET


FYI, this was added for the bit-extract fusion.

MichielDerhaeg · 2025-11-28T20:11:10Z

gcc/config/riscv/riscv.md

+(define_insn "*zero_extract_fused"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(zero_extract:SI (match_operand:SI 1 "register_operand" "r")
+			 (match_operand 2 "const_int_operand")
+			 (match_operand 3 "const_int_operand")))]
+  "TARGET_ARCV_RHX100 && !TARGET_64BIT
+     && (INTVAL (operands[2]) > 1 || !TARGET_ZBS)"
+  {
+     int amount = INTVAL (operands[2]);
+     int end = INTVAL (operands[3]) + amount;
+     operands[2] = GEN_INT (BITS_PER_WORD - end);
+     operands[3] = GEN_INT (BITS_PER_WORD - amount);
+     return "slli\t%0,%1,%2\n\tsrli\t%0,%0,%3";
+  }
+  [(set_attr "type" "alu_fused")]
+)


As far as I can tell, this fusion was never implemented as a define_insn_and_split. It might not be trivial to force these exact instructions after a split.
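For reviewers checking the operand rewriting in `*zero_extract_fused`, here is a minimal C sketch of the same arithmetic on a 32-bit word. It assumes little-endian bit numbering (bit 0 is the least significant bit) and `BITS_PER_WORD == 32`; the function names are made up for illustration and do not exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the pattern's output: shift left by (32 - end), then
   logically shift right by (32 - width).  Hypothetical helper.  */
uint32_t
extract_via_shifts (uint32_t x, unsigned width, unsigned pos)
{
  unsigned end = pos + width;
  assert (width >= 1 && end <= 32);
  uint32_t t = x << (32 - end);   /* slli: discard bits above the field */
  return t >> (32 - width);       /* srli: discard bits below, zero-fill */
}

/* Reference definition: WIDTH bits of X starting at bit POS.  */
uint32_t
extract_via_mask (uint32_t x, unsigned width, unsigned pos)
{
  uint32_t mask = width == 32 ? UINT32_MAX : ((UINT32_C (1) << width) - 1);
  return (x >> pos) & mask;
}
```

Both shift counts stay in the range 0..31 for any valid field, so neither shift is out of range for a 32-bit operand.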


libstdc++-v3/ChangeLog: * include/bits/atomic_wait.h (__detail::__atomic_eq): Use std::addressof instead of &. * include/std/atomic (atomic::wait, atomic::notify_one) (atomic::notify_all): Likewise. Reviewed-by: Patrick Palka <[email protected]>

gcc/algol68/ChangeLog PR algol68/123007 * a68-lang.cc (a68_type_for_size): Handle intTI_type_node.

Implement the forwarding performed by std::bind via deducing this when available, instead of needing 4 operator() overloads. Using deducing this here is more complicated than in other standard call wrappers because std::bind is not really "perfect forwarding": it doesn't consider value category, and along with const-ness it also forwards volatile-ness (until C++20). The old implementation suffers from the same problem that other pre-C++23 SFINAE-friendly call wrappers have which is solved by using deducing this (see p5.5 of the deducing this paper P0847R7). PR libstdc++/80564 libstdc++-v3/ChangeLog: * include/std/functional (__cv_like): New. (_Bind::_Res_type): Don't define when not needed. (_Bind::__dependent): Likewise. (_Bind::_Res_type_cv): Likewise. (_Bind::operator()) [_GLIBCXX_EXPLICIT_THIS_PARAMETER]: Define as two instead of four overloads using deducing this. * testsuite/20_util/bind/cv_quals_2.cc: Ignore SFINAE diagnostics inside headers. * testsuite/20_util/bind/ref_neg.cc: Likewise. * testsuite/20_util/bind/80564.cc: New test. Reviewed-by: Tomasz Kamiński <[email protected]> Reviewed-by: Jonathan Wakely <[email protected]>

Starting with r16-4438-ga93f80feeef744, the edge sorting order was switched to lowest execution frequency first. But the "bbro" optimization pass chooses the first edge as a fallthrough. Thus the most unlikely branches were optimized to fallthroughs. Fix by restoring the sorting order prior to r16-4438-ga93f80feeef744. Now the branches most likely to be executed are picked as fallthroughs. There are no regressions for C and C++ on x86_64-pc-linux-gnu. The new tests fail for the respective targets without this patch, and pass with it. PR rtl-optimization/122675 gcc/ChangeLog: * bb-reorder.cc (edge_order): Fix BB edge ordering to be descending. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr122675-1.c: New test. * gcc.target/i386/pr122675-1.c: New test. * gcc.target/riscv/pr122675-1.c: New test. Signed-off-by: Dimitar Dimitrov <[email protected]>
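To illustrate why the sort direction matters when the first edge is taken as the fallthrough, here is a small sketch. `struct toy_edge` and both functions are hypothetical stand-ins, not the types or code in bb-reorder.cc.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a CFG edge; not GCC's `edge' type.  */
struct toy_edge
{
  int dest_bb;       /* index of the destination block */
  int probability;   /* how likely the edge is taken, arbitrary units */
};

/* qsort comparator: most likely edge first (descending order).  */
int
cmp_edge_descending (const void *a, const void *b)
{
  const struct toy_edge *ea = a;
  const struct toy_edge *eb = b;
  return (eb->probability > ea->probability)
	 - (eb->probability < ea->probability);
}

/* Sort the successor edges and return the destination of the first
   one, which a pass like the one described would use as fallthrough.
   N must be at least 1.  */
int
pick_fallthrough (struct toy_edge *edges, size_t n)
{
  qsort (edges, n, sizeof *edges, cmp_edge_descending);
  return edges[0].dest_bb;
}
```

With an ascending comparator the same selection would return the least likely successor, which is the regression the commit message describes.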

…fix option From: Mark Zhuang <[email protected]> The previous commit added --default-prefix to handle non-default git prefix configurations, but this option is not available in older git versions. This patch adds a compatibility check. contrib/ChangeLog: * prepare-commit-msg: check --default-prefix

2025-12-06 Paul Thomas <[email protected]> gcc/fortran PR fortran/122578 * primary.cc (gfc_match_varspec): Try to resolve a typebound generic procedure selector expression to provide the associate name with a type. Also, resolve component calls. In both cases, make a copy of the selector expression to guard against changes made by gfc_resolve_expr. gcc/testsuite PR fortran/122578 * gfortran.dg/pdt_72.f03: New test.

2025-12-06 Paul Thomas <[email protected]> gcc/fortran PR fortran/122669 * resolve.cc (resolve_allocate_deallocate): Mold expressions with an array reference and a constant size must be resolved for each allocate object. gcc/testsuite PR fortran/122669 * gfortran.dg/pdt_73.f03: New test.

2025-12-06 Paul Thomas <[email protected]> gcc/fortran PR fortran/122670 * decl.cc (gfc_get_pdt_instance): Ensure that, in an interface body, PDT instances are imported implicitly if the template has been explicitly imported. * module.cc (read_module): If a PDT template appears in a use only statement, implicitly add the instances as well. gcc/testsuite PR fortran/122670 * gfortran.dg/pdt_74.f03: New test.

2025-12-06 Paul Thomas <[email protected]> gcc/fortran PR fortran/122693 * array.cc (gfc_match_array_constructor): Stash and restore gfc_current_ns after the call to 'gfc_match_type_spec'. gcc/testsuite PR fortran/122693 * gfortran.dg/pdt_75.f03: New test.

This has been discussed in the 1/9 Reflection thread, but doesn't depend on reflection in any way. cp_parser_std_attribute calls lookup_attribute_spec as: const attribute_spec *as = lookup_attribute_spec (TREE_PURPOSE (attribute)); that is, with a TREE_LIST whose TREE_VALUE is the attribute name and whose TREE_PURPOSE is the attribute namespace. c_parser_std_attribute does the same. For attribute_takes_identifier_p they do: else if (attr_ns == gnu_identifier && attribute_takes_identifier_p (attr_id)) and bool takes_identifier = (ns != NULL_TREE && strcmp (IDENTIFIER_POINTER (ns), "gnu") == 0 && attribute_takes_identifier_p (name)); when handling std attributes (for GNU attributes they just call those with the IDENTIFIER_NODE name). is_late_template_attribute and tsubst_attribute pass just get_attribute_name to these functions though, so they handle attributes in all namespaces as GNU attributes only. This means that lookup_attribute_spec can return NULL or find a different attribute if the attribute is not from gnu:: or, say, a standard attribute mapped to gnu::, and attribute_takes_identifier_p can return true even for attributes for which it shouldn't. I thought about changing attribute_takes_identifier_p to optionally take a TREE_LIST, but that would mean handling it in the target hooks too, and they only care about GNU attributes right now, so given the above parser.cc/c-parser.cc snippets, the following patch just follows what they do. 2025-12-06 Jakub Jelinek <[email protected]> * decl2.cc (is_late_template_attribute): Call lookup_attribute_spec on TREE_PURPOSE (attr) rather than name. Only call attribute_takes_identifier_p if get_attribute_namespace (attr) is gnu_identifier. * pt.cc (tsubst_attribute): Only call attribute_takes_identifier_p if get_attribute_namespace (t) is gnu_identifier.

This is another thing discussed in the 1/9 Reflection thread, also not dependent on reflection. decl_attributes calls simple_cst_equal on TREE_VALUEs of the current and preexisting attributes, but that is just a small part of how attribute values should be compared. The following patch fixes that. 2025-12-06 Jakub Jelinek <[email protected]> * attribs.cc (decl_attributes): Use attribute_value_equal to compare attribute values instead of simple_cst_equal.

compile-std1.C was breaking on arm-eabi because these interfaces aren't declared. So for exporting let's check the same macros that control declaring them. libstdc++-v3/ChangeLog: * src/c++23/std.cc.in: Add more #if.

2025-12-06 Paul Thomas <[email protected]> gcc/testsuite PR fortran/103414 * gfortran.dg/pdt_76.f03: New test.

Just a minor update to Dimitar's patch for the RISC-V testcase. The cfi directives are not emitted for the -elf configurations causing the new test to fail. The cfi directives (and associated labels) don't seem relevant to the test at hand, so this just drops them. Pushing to the trunk. PR rtl-optimization/122675 gcc/testsuite * gcc.target/riscv/pr122675-1.c: Adjust expected output.

If the reducer is a function and the accumulator type isn't constrained, at runtime the reduction will likely raise a Constraint_Error since the reducer is repeatedly assigned to the accumulator variable (likely changing its length). However, if the reducer is a procedure, no such assignment occurs, and thus the runtime error only depends on the reducer logic. This patch prevents the spurious warning in that case. gcc/ada/ * sem_attr.adb (Resolve_Attribute): Check if the reducer is a procedure before giving the warning.

When computing an address plus a large offset on riscv64 with a PC-relative sequence, we may hit the range limit for auipc and get a relocation overflow, where on riscv32 the computation wraps around. Since -mcmodel=medany requires the entire program to fit in a 2GiB address range, a +/-1GiB+ offset added to an in-range symbol in a barely-fitting program is more likely than not to be out-of-range. Since such large constants are unlikely to come up by chance, separate them from the symbol so as to avoid the relocation overflow. for gcc/ChangeLog PR target/91420 * config/riscv/riscv.cc (riscv_symbolic_constant_p): Require offsets smaller than +/- 1GiB for PCREL symbols. for gcc/testsuite/ChangeLog PR target/91420 * gcc.target/riscv/pr91420.c: New.
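The range restriction described above can be sketched as a simple predicate. This is an illustration only: `pcrel_offset_ok` and `ONE_GIB` are hypothetical names, and the strict inequalities at the boundaries are an assumption, not necessarily what riscv_symbolic_constant_p implements.

```c
#include <stdbool.h>
#include <stdint.h>

#define ONE_GIB (INT64_C (1) << 30)

/* Hypothetical helper: accept a constant offset on a PC-relative
   symbol only if its magnitude is below 1GiB.  Larger offsets would
   be kept separate from the symbol to avoid relocation overflow.  */
bool
pcrel_offset_ok (int64_t offset)
{
  return offset > -ONE_GIB && offset < ONE_GIB;
}
```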

Since we may delete stores that are found to be redundant in postreload cse, we need cselib to invalidate argument stores at calls, and to that end we need CALL_INSN_FUNCTION_USAGE to mention all MEM stack space that may be legitimately modified by a const/pure callee, i.e., all arguments passed to it on the stack. When ACCUMULATE_OUTGOING_ARGS, each on-stack argument gets its own usage information, but when it's not, each argument is pushed incrementally, without precomputed stack slots. Since we only mentioned such precomputed stack slots in CALL_INSN_FUNCTION_USAGE, non-ACCUMULATE_OUTGOING_ARGS configurations miss the stack usage data, and cselib fails to invalidate the stores. Stores in such slots are anonymous, and they often invalidate other anonymous slots, even part of the same object, but as the testcase demonstrates, we may occasionally be unlucky that consecutive calls have the stores to multi-word objects reordered by scheduling in such a way that the last store for the first call survives the call in the cselib tables, and then it is found to be redundant with the first store for the subsequent call, as in the testcase. So, if we haven't preallocated outgoing arguments for a call (which would give us preassigned stack slots), and we have used any stack space, add function call usage covering the entire stack range where arguments were stored. for gcc/ChangeLog PR rtl-optimization/122947 * calls.cc (expand_call): Add stack function usage in non-ACCUMULATE_OUTGOING_ARGS configurations. for gcc/testsuite/ChangeLog PR rtl-optimization/122947 * gcc.dg/pr122947.c: New.

Rework dump_cselib_table to not crash when cselib_preserved_hash_table is not allocated, and to remove the extraneous indirection from dump_cselib_val that made it inconvenient to call from a debugger. for gcc/ChangeLog * cselib.cc (dump_cselib_val): Split out of and rename to... (dump_cselib_val_ptr): ... this. (dump_cselib_table): Adjust. Skip cselib_preserved_hash_table when not allocated.

Volatile memory can be used as source operand for any operations. Add -ffuse-ops-with-volatile-access to fuse operations with volatile memory reference and update simplify_binary_operation_1 to keep PLUS for 2 volatile memory references. On x86, this optimizes extern volatile int bar; int foo (int z) { z *= 123; return bar + z; } into foo: imull $123, %edi, %eax addl bar(%rip), %eax ret and compile extern volatile unsigned char u8; void test (void) { u8 = u8 + u8; u8 = u8 - u8; } into test: movzbl u8(%rip), %eax addb %al, u8(%rip) movzbl u8(%rip), %eax subb u8(%rip), %al movb %al, u8(%rip) ret Tested with Linux kernel 6.17.9 on Intel Core i7-1195G7. gcc/ PR target/122343 * common.opt: Add -ffuse-ops-with-volatile-access. * common.opt.urls: Regenerated. * recog.cc (general_operand): Allow volatile memory reference if -ffuse-ops-with-volatile-access is enabled. * simplify-rtx.cc (simplify_binary_operation_1): Keep PLUS for 2 volatile memory references. * doc/invoke.texi: Document -ffuse-ops-with-volatile-access. gcc/testsuite/ PR target/122343 * gcc.target/i386/20040112-1.c: Add -fomit-frame-pointer and use check-function-bodies to check for loop. * gcc.target/i386/avx-ne-convert-1.c: Compile with -fno-fuse-ops-with-volatile-access. * gcc.target/i386/avx10_2-bf16-1.c: Likewise. * gcc.target/i386/avx10_2-convert-1.c: Likewise. * gcc.target/i386/avx10_2-satcvt-1.c: Likewise. * gcc.target/i386/avx512bf16-vcvtneps2bf16-1.c: Likewise. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: Likewise. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: Likewise. * gcc.target/i386/avx512bitalg-vpshufbitqmb.c: Likewise. * gcc.target/i386/avx512bw-vpcmpb-1.c: Likewise. * gcc.target/i386/avx512bw-vpcmpub-1.c: Likewise. * gcc.target/i386/avx512bw-vpcmpuw-1.c: Likewise. * gcc.target/i386/avx512bw-vpcmpw-1.c: Likewise. * gcc.target/i386/avx512dq-vcvtps2qq-1.c: Likewise. * gcc.target/i386/avx512dq-vcvtps2uqq-1.c: Likewise. * gcc.target/i386/avx512dq-vcvtqq2pd-1.c: Likewise. 
* gcc.target/i386/avx512dq-vcvtqq2ps-1.c: Likewise. * gcc.target/i386/avx512dq-vcvttps2qq-1.c: Likewise. * gcc.target/i386/avx512dq-vcvttps2uqq-1.c: Likewise. * gcc.target/i386/avx512dq-vcvtuqq2pd-1.c: Likewise. * gcc.target/i386/avx512dq-vcvtuqq2ps-1.c: Likewise. * gcc.target/i386/avx512dq-vextractf32x8-1.c: Likewise. * gcc.target/i386/avx512dq-vextractf64x2-1.c: Likewise. * gcc.target/i386/avx512dq-vextracti64x2-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclasssd-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclassss-1.c: Likewise. * gcc.target/i386/avx512dq-vpmullq-1.c: Likewise. * gcc.target/i386/avx512dq-vpmullq-3.c: Likewise. * gcc.target/i386/avx512f-pr100267-1.c: Likewise. * gcc.target/i386/avx512f-vcmppd-1.c: Likewise. * gcc.target/i386/avx512f-vcmpps-1.c: Likewise. * gcc.target/i386/avx512f-vcvtps2pd-1.c: Likewise. * gcc.target/i386/avx512f-vcvtsd2si-1.c: Likewise. * gcc.target/i386/avx512f-vcvtsd2si64-1.c: Likewise. * gcc.target/i386/avx512f-vcvtsd2usi-1.c: Likewise. * gcc.target/i386/avx512f-vcvtsd2usi64-1.c: Likewise. * gcc.target/i386/avx512f-vcvtsi2ss-1.c: Likewise. * gcc.target/i386/avx512f-vcvtss2si-1.c: Likewise. * gcc.target/i386/avx512f-vcvtss2si64-1.c: Likewise. * gcc.target/i386/avx512f-vcvtss2usi-1.c: Likewise. * gcc.target/i386/avx512f-vcvtss2usi64-1.c: Likewise. * gcc.target/i386/avx512f-vcvttsd2si-1.c: Likewise. * gcc.target/i386/avx512f-vcvttsd2si64-1.c: Likewise. * gcc.target/i386/avx512f-vcvttsd2usi-1.c: Likewise. * gcc.target/i386/avx512f-vcvttsd2usi64-1.c: Likewise. * gcc.target/i386/avx512f-vcvttss2si-1.c: Likewise. * gcc.target/i386/avx512f-vcvttss2si64-1.c: Likewise. * gcc.target/i386/avx512f-vcvttss2usi-1.c: Likewise. * gcc.target/i386/avx512f-vcvttss2usi64-1.c: Likewise. * gcc.target/i386/avx512f-vextractf32x4-1.c: Likewise. * gcc.target/i386/avx512f-vextractf64x4-1.c: Likewise. 
* gcc.target/i386/avx512f-vextracti64x4-1.c: Likewise. * gcc.target/i386/avx512f-vmovapd-1.c: Likewise. * gcc.target/i386/avx512f-vmovaps-1.c: Likewise. * gcc.target/i386/avx512f-vmovdqa64-1.c: Likewise. * gcc.target/i386/avx512f-vpandnq-1.c: Likewise. * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise. * gcc.target/i386/avx512f-vpbroadcastq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpd-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpeqq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpequq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpged-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpgeq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpgeud-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpgeuq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpled-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpleq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpleud-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpleuq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpltd-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpltq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpltud-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpltuq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpneqd-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpneqq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpnequd-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpnequq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpq-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpud-1.c: Likewise. * gcc.target/i386/avx512f-vpcmpuq-1.c: Likewise. * gcc.target/i386/avx512f-vrndscalepd-1.c: Likewise. * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise. * gcc.target/i386/avx512fp16-complex-fma.c: Likewise. * gcc.target/i386/avx512fp16-vaddph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Likewise. 
* gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Likewise. * gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfmaddcph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfpclassph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Likewise. * gcc.target/i386/avx512fp16-vmulph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vrcpph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vrsqrtph-1a.c: Likewise. * gcc.target/i386/avx512fp16-vsqrtph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Likewise. 
* gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vfcmaddcph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vfmaddcph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Likewise. * gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Likewise. * gcc.target/i386/avx512vl-pr100267-1.c: Likewise. * gcc.target/i386/avx512vl-vcmppd-1.c: Likewise. * gcc.target/i386/avx512vl-vcmpps-1.c: Likewise. * gcc.target/i386/avx512vl-vcvtpd2ps-1.c: Likewise. * gcc.target/i386/avx512vl-vcvtpd2udq-1.c: Likewise. * gcc.target/i386/avx512vl-vcvttpd2udq-1.c: Likewise. * gcc.target/i386/avx512vl-vcvttps2udq-1.c: Likewise. * gcc.target/i386/avx512vl-vextractf32x4-1.c: Likewise. * gcc.target/i386/avx512vl-vmovapd-1.c: Likewise. * gcc.target/i386/avx512vl-vmovaps-1.c: Likewise. * gcc.target/i386/avx512vl-vmovdqa64-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpd-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpequq-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpq-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpud-1.c: Likewise. * gcc.target/i386/avx512vl-vpcmpuq-1.c: Likewise. * gcc.target/i386/pr122343-1a.c: New test. * gcc.target/i386/pr122343-1b.c: Likewise. * gcc.target/i386/pr122343-2a.c: Likewise. * gcc.target/i386/pr122343-2b.c: Likewise. * gcc.target/i386/pr122343-3.c: Likewise. * gcc.target/i386/pr122343-4a.c: Likewise. * gcc.target/i386/pr122343-4b.c: Likewise. * gcc.target/i386/pr122343-5a.c: Likewise. * gcc.target/i386/pr122343-5b.c: Likewise. 
* gcc.target/i386/pr122343-6a.c: Likewise. * gcc.target/i386/pr122343-6b.c: Likewise. * gcc.target/i386/pr122343-7.c: Likewise. Signed-off-by: H.J. Lu <[email protected]>
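For reference, the two source examples from the -ffuse-ops-with-volatile-access commit message above, written out as a compilable unit. The commit message declares `bar` and `u8` as extern; they are defined here so the unit links on its own, and the second function is renamed from `test` to `test_u8` purely for this sketch.

```c
/* Defined here (rather than declared extern) so the example is
   self-contained.  */
volatile int bar;
volatile unsigned char u8;

/* One volatile load that the option lets the compiler use directly as
   a memory operand of the add.  */
int
foo (int z)
{
  z *= 123;
  return bar + z;
}

/* Each mention of u8 on the right-hand side is a separate volatile
   read, so two loads per statement must remain.  */
void
test_u8 (void)
{
  u8 = u8 + u8;
  u8 = u8 - u8;
}
```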

Back in r78875 mrs added cpp_get_path/dir accessors for _cpp_file in order to interface with the darwin framework system. But now I notice that the latter duplicates the better-named _cpp_get_file_dir, and I'm inclined to rename the former to match. Perhaps we should drop the initial underscore since these are no longer internal interfaces; OTOH, _cpp_hashnode_value and _cpp_backup_tokens still have the initial underscore in cpplib.h. libcpp/ChangeLog: * include/cpplib.h (cpp_get_path, cpp_get_dir): Remove. (_cpp_get_file_path, _cpp_get_file_name, _cpp_get_file_stat) (_cpp_get_file_dir): Move prototypes from... * internal.h: ...here. * files.cc (_cpp_get_file_path): Rename from... (cpp_get_path): ...this. (cpp_get_dir): Remove. gcc/ChangeLog: * config/darwin-c.cc (find_subframework_header): Use _cpp_get_file_*.

gcc/analyzer/ChangeLog: * kf.cc (register_known_functions): Remove duplicate calls to register_atomic_builtins and register_varargs_builtins. Signed-off-by: David Malcolm <[email protected]>

This was reported as a regression in GCC 14: the compiler resolves Accum_Type to Positive for a reduction expression whose "expected subtype" is Positive, which means that 0 cannot be used as initial value in the expression: Sum : Positive := V'Reduce ("+", 0); without always raising Constraint_Error at run time. That's not the intent according to T. Taft in https://forum.ada-lang.io/t/regression-in-gnat-14/890 so this changes the resolution to use the base type (Integer) instead. gcc/ada/ PR ada/115349 * sem_attr.adb (Resolve_Attribute) <Attribute_Reduce>: Use the base type as Accum_Type if the reducer is an operator from Standard and the type is numeric. Use the type of the first operand for other operators. Streamline the error message given for limited types. gcc/testsuite/ * gnat.dg/reduce3.adb: New test.

Don't allow 2 volatile memory references in *<avx512>_cmp<mode>3_dup_op so that gcc.target/i386/avx2-vpcmpeqq-1.c will generate 2 loads when -march=cascadelake is used. PR target/122343 * config/i386/sse.md (*<avx512>_cmp<mode>3_dup_op): Don't allow 2 volatile memory references. Signed-off-by: H.J. Lu <[email protected]>

When -march=cascadelake is added, we generate vmovdqa x(%rip), %ymm0 vpcmpq $1, x(%rip), %ymm0, %k0 vpmovm2q %k0, %ymm0 vmovdqa %ymm0, x(%rip) instead of vmovdqa x(%rip), %ymm1 vmovdqa x(%rip), %ymm0 vpcmpgtq %ymm1, %ymm0, %ymm0 vmovdqa %ymm0, x(%rip) Compile avx2-vpcmpgtq-1.c with -fno-fuse-ops-with-volatile-access to generate vpcmpgtq instead of vpcmpq. PR target/122343 * gcc.target/i386/avx2-vpcmpgtq-1.c: Compile with -fno-fuse-ops-with-volatile-access. Signed-off-by: H.J. Lu <[email protected]>

…d [PR122868] As Richi suggested, this moves the check into the loop so we check every load. I had initially not done this because I figured the loads would be treated as a group anyway and the group would be valid or not as a whole. But invariant loads can form a group in which not all the loads are within the range of the known bounds. gcc/ChangeLog: PR tree-optimization/122868 * tree-vect-stmts.cc (vectorizable_load): Move check for invariant loads down into the loop.

The Adv. SIMD boolean reduction patterns were accidentally overriding one of the input arguments. This fixes it and removes unneeded intermediate moves around the subreg type castings. gcc/ChangeLog: PR target/123026 * config/aarch64/aarch64-simd.md (reduc_sbool_ior_scal_<mode>, reduc_sbool_and_scal_<mode>): Fix tmp operands[1] override. gcc/testsuite/ChangeLog: PR target/123026 * gcc.target/aarch64/pr123026.c: New test.

When we have a speculated edge but have folded the call to __builtin_unreachable (), trying to update the cgraph ICEs in resolve_speculation because there's no symtab node for __builtin_unreachable (). Reject this resolving attempt, similarly to when the callee's decl is NULL or not semantically equivalent. I only have an LTRANS unit as a testcase. PR ipa/122456 * cgraph.cc (cgraph_edge::resolve_speculation): Handle a NULL symtab_node::get (callee_decl).

…ling gcc/ChangeLog: * haifa-sched.cc (choose_ready): Don't require dfa_lookahead <= 0 to schedule SCHED_GROUP_P insns first.

This patch enables dispatch scheduling for the NVIDIA Olympus core. The dispatch constraints are based on the Olympus CPU Core Software Optimization Guide (https://docs.nvidia.com/olympus-cpu-core-software-optimization-guide-dp12531-001v0-7.pdf). The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. OK for trunk? Signed-off-by: Jennifer Schmitz <[email protected]> gcc/ * config/aarch64/aarch64.md: Include olympus.md. * config/aarch64/olympus.md: New file. * config/aarch64/tuning_models/olympus.h: Add dispatch constraints and enable dispatch scheduling.

Add a new target instruction. Hardware-assisted sanitizers on architectures providing instructions to tag/untag memory can then make use of this new instruction pattern. For example, the memtag-stack sanitizer uses these instructions to tag and untag a memory granule. gcc/ * target-insns.def (tag_memory): New target instruction. * doc/md.texi (tag_memory): Add documentation. Signed-off-by: Claudiu Zissulescu <[email protected]>

Add a new target instruction used by hardware-assisted sanitizers on architectures providing memory-tagging instructions. This instruction is used to compute and assign tags at a fixed offset from a tagged base address. For example, on AArch64 this pattern instantiates the `addg` instruction. gcc/ * target-insns.def (compose_tag): New target instruction. * doc/md.texi (compose_tag): Add documentation. Signed-off-by: Claudiu Zissulescu <[email protected]>

Add new command line option -fsanitize=memtag-stack with the following new param: --param memtag-instrument-alloca [0,1] (default 1) to use MTE insns for enabling dynamic checking of stack allocas. Along with the new SANITIZE_MEMTAG_STACK, define a SANITIZE_MEMTAG which will be set if any kind of memtag sanitizer is in effect (e.g., later we may add -fsanitize=memtag-globals). Add errors to convey that memtag sanitizer does not work with hwaddress and address sanitizers. Also error out if memtag ISA extension is not enabled. MEMTAG sanitizer will use the HWASAN machinery, but with a few differences: - The tags are always generated at runtime by the hardware, so -fsanitize=memtag-stack enforces --param hwasan-random-frame-tag=1. Add documentation in gcc/doc/invoke.texi. gcc/ * builtins.def: Adjust the macro to include the new SANITIZE_MEMTAG_STACK. * flag-types.h (enum sanitize_code): Add new enumerator for SANITIZE_MEMTAG and SANITIZE_MEMTAG_STACK. * opts.cc (finish_options): memtag-stack sanitizer conflicts with hwaddress and address sanitizers. (sanitizer_opts): Add new memtag-stack sanitizer. (parse_sanitizer_options): memtag-stack sanitizer cannot recover. * params.opt: Add new params for memtag-stack sanitizer. * doc/invoke.texi: Update documentation. Signed-off-by: Claudiu Zissulescu <[email protected]> Co-authored-by: Claudiu Zissulescu <[email protected]>

Memory tagging is used for detecting memory safety bugs. On AArch64, the memory tagging extension (MTE) helps in reducing the overheads of memory tagging: - CPU: MTE instructions for efficiently tagging and untagging memory. - Memory: New memory type, Normal Tagged Memory, added to the Arm Architecture. The MEMory TAGging (MEMTAG) sanitizer uses the same infrastructure as HWASAN. MEMTAG and HWASAN are both hardware-assisted solutions, and rely on the same sanitizer machinery in parts. So, define new constructs that allow MEMTAG and HWASAN to share the infrastructure: - hwassist_sanitize_p () is true when either SANITIZE_MEMTAG or SANITIZE_HWASAN is true. - hwassist_sanitize_stack_p () is true when hwassist_sanitize_p () is true and stack variables are to be sanitized. MEMTAG and HWASAN do have differences, however, and hence the need to conditionalize using memtag_sanitize_p () in the relevant places. E.g., - Instead of generating the libcall __hwasan_tag_memory, MEMTAG needs to invoke the target-specific hook TARGET_MEMTAG_TAG_MEMORY to tag memory. A similar approach is used for handle_builtin_alloca, where target hooks are used instead of doing the gimple transformations. - Add a new internal function HWASAN_ALLOCA_POISON to handle dynamically allocated stack when MEMTAG sanitizer is enabled. At expansion, this in turn allows invoking target hooks to increment the tag, and using the generated tag to finally tag the dynamically allocated memory. The usual pattern: irg x0, x0, x0 subg x0, x0, #16, #0 creates a tag in x0 and so on. For alloca, we need to apply the generated tag to the new sp. In the absence of an extract-tag insn, the implementation in expand_HWASAN_ALLOCA_POISON resorts to invoking irg again. gcc/ * asan.cc (handle_builtin_stack_restore): Accommodate MEMTAG sanitizer. (handle_builtin_alloca): Expand differently if MEMTAG sanitizer. (get_mem_refs_of_builtin_call): Include MEMTAG along with HWASAN. (memtag_sanitize_stack_p): New definition. 
(memtag_sanitize_allocas_p): Likewise. (memtag_memintrin): Likewise. (hwassist_sanitize_p): Likewise. (hwassist_sanitize_stack_p): Likewise. (report_error_func): Include MEMTAG along with HWASAN. (build_check_stmt): Likewise. (instrument_derefs): MEMTAG too does not deal with globals yet. (instrument_builtin_call): Include MEMTAG along with HWASAN. (maybe_instrument_call): Likewise. (asan_expand_mark_ifn): Likewise. (asan_expand_check_ifn): Likewise. (asan_expand_poison_ifn): Expand differently if MEMTAG sanitizer. (asan_instrument): Include MEMTAG along with HWASAN. (hwasan_emit_prologue): Expand differently if MEMTAG sanitizer. (hwasan_emit_untag_frame): Likewise. * asan.h (memtag_sanitize_stack_p): New declaration. (memtag_sanitize_allocas_p): Likewise. (hwassist_sanitize_p): Likewise. (hwassist_sanitize_stack_p): Likewise. (asan_sanitize_use_after_scope): Include MEMTAG along with HWASAN. * cfgexpand.cc (align_local_variable): Likewise. (expand_one_stack_var_at): Likewise. (expand_stack_vars): Likewise. (expand_one_stack_var_1): Likewise. (init_vars_expansion): Likewise. (expand_used_vars): Likewise. (pass_expand::execute): Likewise. * gimplify.cc (asan_poison_variable): Likewise. * internal-fn.cc (expand_HWASAN_ALLOCA_POISON): New definition. (expand_HWASAN_ALLOCA_UNPOISON): Expand differently if MEMTAG sanitizer. (expand_HWASAN_MARK): Likewise. * internal-fn.def (HWASAN_ALLOCA_POISON): Define new. * params.opt: Document new param. * sanopt.cc (pass_sanopt::execute): Include MEMTAG along with HWASAN. * gcc.cc (sanitize_spec_function): Add check for memtag-stack. * doc/tm.texi: Regenerate. * target.def (extract_tag): Update documentation. (add_tag): Likewise. (insert_random_tag): Likewise. Co-authored-by: Indu Bhagat <[email protected]> Signed-off-by: Claudiu Zissulescu <[email protected]>

MEMTAG sanitizer, which is based on the HWASAN sanitizer, will invoke the target-specific hooks to create a random tag, add tag to memory address, and finally tag and untag memory. Implement the target hooks to emit MTE instructions if MEMTAG sanitizer is in effect. Continue to use the default target hook if HWASAN is being used.

The following target hooks are implemented:
- TARGET_MEMTAG_INSERT_RANDOM_TAG
- TARGET_MEMTAG_ADD_TAG
- TARGET_MEMTAG_EXTRACT_TAG

Apart from the target-specific hooks, set the following to values defined by the Memory Tagging Extension (MTE) in aarch64:
- TARGET_MEMTAG_TAG_BITSIZE
- TARGET_MEMTAG_GRANULE_SIZE

The following instructions were (re-)defined:
- addg/subg: used by the TARGET_MEMTAG_ADD_TAG and TARGET_MEMTAG_COMPOSE_OFFSET_TAG hooks.
- stg/st2g: used to tag/untag a memory granule.
- tag_memory: a target-specific instruction that emits MTE instructions to tag/untag memory of a given size.
- compose_tag: a target-specific instruction that computes a tagged address as an offset from a base (tagged) address.
- gmi: used for randomizing the inserted tag.
- irg: likewise.

gcc/
	* config/aarch64/aarch64.md (addg): Update pattern to use addg/subg instructions.
	(stg): Update pattern.
	(st2g): New pattern.
	(tag_memory): Likewise.
	(compose_tag): Likewise.
	(irg): Update pattern to accept xzr register.
	(gmi): Likewise.
	(UNSPECV_TAG_SPACE): Define.
	* config/aarch64/aarch64.cc (AARCH64_MEMTAG_GRANULE_SIZE): Define.
	(AARCH64_MEMTAG_TAG_BITSIZE): Likewise.
	(aarch64_override_options_internal): Error out if MTE instructions are not available.
	(aarch64_post_cfi_startproc): Emit .cfi_mte_tagged_frame.
	(aarch64_can_tag_addresses): Add MEMTAG specific handling.
	(aarch64_memtag_tag_bitsize): New function.
	(aarch64_memtag_granule_size): Likewise.
	(aarch64_memtag_insert_random_tag): Likewise.
	(aarch64_memtag_add_tag): Likewise.
	(aarch64_memtag_extract_tag): Likewise.
	(aarch64_granule16_memory_address_p): Likewise.
	(aarch64_emit_stxg_insn): Likewise.
	(aarch64_memtag_tag_memory_via_loop): New definition.
	(aarch64_expand_tag_memory): Likewise.
	(aarch64_check_memtag_ops): Likewise.
	(TARGET_MEMTAG_TAG_BITSIZE): Likewise.
	(TARGET_MEMTAG_GRANULE_SIZE): Likewise.
	(TARGET_MEMTAG_INSERT_RANDOM_TAG): Likewise.
	(TARGET_MEMTAG_ADD_TAG): Likewise.
	(TARGET_MEMTAG_EXTRACT_TAG): Likewise.
	* config/aarch64/aarch64-builtins.cc (aarch64_expand_builtin_memtag): Update set tag builtin logic.
	* config/aarch64/aarch64-linux.h: Pass memtag-stack sanitizer specific options to the linker.
	* config/aarch64/aarch64-protos.h (aarch64_granule16_memory_address_p): New prototype.
	(aarch64_check_memtag_ops): Likewise.
	(aarch64_expand_tag_memory): Likewise.
	* config/aarch64/constraints.md (Umg): New memory constraint.
	(Uag): New constraint.
	(Ung): Likewise.
	* config/aarch64/predicates.md (aarch64_memtag_tag_offset): Refactor it.
	(aarch64_granule16_imm6): Rename from aarch64_granule16_uimm6 and refactor it.
	(aarch64_granule16_memory_operand): New constraint.
	* config/aarch64/iterators.md (MTE_PP): New code iterator to be used for mte instructions.
	(stg_ops): New code attributes.
	(st2g_ops): Likewise.
	(mte_name): Likewise.
	* config/aarch64/aarch64.opt (aarch64-tag-memory-loop-threshold): New parameter.
	* doc/invoke.texi: Update documentation.

gcc/testsuite:
	* gcc.target/aarch64/acle/memtag_1.c: Update test.

Co-authored-by: Indu Bhagat <[email protected]>
Signed-off-by: Claudiu Zissulescu <[email protected]>
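To make the tag arithmetic behind these hooks concrete, here is a minimal sketch in plain C. It assumes the MTE layout from the Arm architecture (a 4-bit logical tag in address bits [59:56] and a 16-byte granule); the commit message itself does not spell out these numbers. The `sketch_` names are hypothetical, and the add helper ignores the tag-exclusion mask that real `addg`/`irg` hardware applies.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG_BITS 4u   /* assumed MTE logical tag width */
#define SKETCH_TAG_SHIFT 56u /* assumed: tag lives in address bits [59:56] */
#define SKETCH_GRANULE 16u   /* assumed bytes covered by one allocation tag */

/* Roughly what an "extract tag" hook computes.  */
static unsigned
sketch_extract_tag (uint64_t addr)
{
  return (unsigned) ((addr >> SKETCH_TAG_SHIFT)
		     & ((1u << SKETCH_TAG_BITS) - 1));
}

/* Replace the tag field of ADDR with TAG (only the low 4 bits are used).  */
static uint64_t
sketch_set_tag (uint64_t addr, unsigned tag)
{
  uint64_t mask = (uint64_t) ((1u << SKETCH_TAG_BITS) - 1) << SKETCH_TAG_SHIFT;
  return (addr & ~mask) | ((uint64_t) (tag & 0xfu) << SKETCH_TAG_SHIFT);
}

/* Roughly what an "add tag" hook computes: the tag wraps modulo 16.  */
static uint64_t
sketch_add_tag (uint64_t addr, unsigned tag_offset)
{
  return sketch_set_tag (addr, sketch_extract_tag (addr) + tag_offset);
}

/* Number of granules (and thus tag stores) needed to cover BYTES.  */
static size_t
sketch_granules_for (size_t bytes)
{
  return (bytes + SKETCH_GRANULE - 1) / SKETCH_GRANULE;
}
```

The granule count is what would decide between emitting a short straight-line `stg`/`st2g` sequence and a loop, which is presumably the purpose of the new loop-threshold parameter.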

Add basic tests for memtag-stack sanitizer. Memtag stack sanitizer uses target hooks to emit AArch64 specific MTE instructions.

gcc/testsuite:
	* gcc.target/aarch64/memtag/alloca-1.c: New test.
	* gcc.target/aarch64/memtag/alloca-2.c: New test.
	* gcc.target/aarch64/memtag/alloca-3.c: New test.
	* gcc.target/aarch64/memtag/arguments-1.c: New test.
	* gcc.target/aarch64/memtag/arguments-2.c: New test.
	* gcc.target/aarch64/memtag/arguments-3.c: New test.
	* gcc.target/aarch64/memtag/arguments-4.c: New test.
	* gcc.target/aarch64/memtag/arguments.c: New test.
	* gcc.target/aarch64/memtag/basic-1.c: New test.
	* gcc.target/aarch64/memtag/basic-3.c: New test.
	* gcc.target/aarch64/memtag/basic-struct.c: New test.
	* gcc.target/aarch64/memtag/large-array.c: New test.
	* gcc.target/aarch64/memtag/local-no-escape.c: New test.
	* gcc.target/aarch64/memtag/memtag.exp: New file.
	* gcc.target/aarch64/memtag/no-sanitize-attribute.c: New test.
	* gcc.target/aarch64/memtag/value-init.c: New test.
	* gcc.target/aarch64/memtag/vararray-gimple.c: New test.
	* gcc.target/aarch64/memtag/vararray.c: New test.
	* gcc.target/aarch64/memtag/zero-init.c: New test.
	* gcc.target/aarch64/memtag/texec-1.c: New test.
	* gcc.target/aarch64/memtag/texec-2.c: New test.
	* gcc.target/aarch64/memtag/texec-3.c: New test.
	* gcc.target/aarch64/memtag/vla-1.c: New test.
	* gcc.target/aarch64/memtag/vla-2.c: New test.
	* lib/target-supports.exp (check_effective_target_aarch64_mte): New function.

Co-authored-by: Indu Bhagat <[email protected]>
Signed-off-by: Claudiu Zissulescu <[email protected]>

MichielDerhaeg · 2025-12-16T10:32:28Z

gcc/config/riscv/riscv.md

+	emit_insn (gen_mulsi3 (operands[4], operands[1], operands[2]));
+	emit_insn (gen_addsi3 (operands[0], operands[3], operands[4]));


Ah, this is also wrong.

Thanks. Fixed.

Interesting: with this fix (b4ce3f9) I get a regression of 0.5%.

…s with as The gcc.target/i386/shift-gf2p8affine-2.c test FAILs on Solaris with the native assembler: FAIL: gcc.target/i386/shift-gf2p8affine-2.c (test for excess errors) UNRESOLVED: gcc.target/i386/shift-gf2p8affine-2.c compilation failed to produce executable Excess errors: Assembler: shift-gf2p8affine-2.c "/var/tmp//ccZMQ1Ad.s", line 30 : Illegal mnemonic Near line: " vgf2p8affineqb $0, %zmm1, %zmm0, %zmm0" "/var/tmp//ccZMQ1Ad.s", line 30 : Syntax error Thus this patch only runs the test when gas is in use. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2025-12-15 Rainer Orth <[email protected]> gcc/testsuite: * gcc.target/i386/shift-gf2p8affine-2.c: Skip on Solaris without gas.

The following works around SRA not being able to decompose an aggregate copy of std::complex because with x87 math ld/st pairs are not bit-preserving by adding -msse -mfpmath=sse. This avoids spurious failures of the testcase. PR testsuite/123137 * g++.dg/vect/pr64410.cc: Add -mfpmath=sse -msse on x86.

As a result of the automatic replacement by commit 4dd1398, there are several code fragments that receive the return value of end_sequence() and immediately use it as the return value of the function itself.

	rtx_insn *insn;
	...
	insn = end_sequence ();
	return insn;

It is clear that in such cases, it would be more natural to pass the return value of end_sequence() directly to the return statement without passing it through a variable. Applying this patch naturally does not change any functionality.

gcc/ChangeLog:
	* config/xtensa/xtensa.cc (xtensa_expand_block_set_libcall, xtensa_expand_block_set_unrolled_loop, xtensa_expand_block_set_small_loop, xtensa_call_tls_desc): Change the return statement to pass the return value of end_sequence() directly without going through a variable, and remove the definition of that variable.

In the expansion of cstoresi4 insn patterns, LT[U] comparisons where the second operand is an integer constant are canonicalized to LE[U] ones with one less than the original.

	/* example */
	int test0(int a) {
	  return a < 100;
	}
	unsigned int test1(unsigned int a) {
	  return a <= 100u;
	}
	void test2(int a[], int b) {
	  int i;
	  for (i = 0; i < 16; ++i)
	    a[i] = (a[i] <= b);
	}

	;; before (TARGET_SALT)
	test0:
		entry	sp, 32
		movi	a8, 0x63
		salt	a2, a8, a2
		addi.n	a2, a2, -1	;; unwanted inverting
		neg	a2, a2		;;
		retw.n
	test1:
		entry	sp, 32
		movi	a8, 0x64
		saltu	a2, a8, a2
		addi.n	a2, a2, -1	;; unwanted inverting
		neg	a2, a2		;;
		retw.n
	test2:
		entry	sp, 32
		movi.n	a9, 0x10
		loop	a9, .L5_LEND
	.L5:
		l32i.n	a8, a2, 0
		salt	a8, a3, a8
		addi.n	a8, a8, -1	;; immediate cannot be hoisted out
		neg	a8, a8
		s32i.n	a8, a2, 0
		addi.n	a2, a2, 4
	.L5_LEND:
		retw.n

This patch reverts such canonicalization by adding 1 to the comparison value and then converting it back from LE[U] to LT[U], which better matches the output machine instructions. This patch also makes it easier to benefit from other optimizations such as CSE, constant propagation, or loop-invariant hoisting by XORing the result with a register that has a value of 1, rather than subtracting 1 and then negating the sign to invert the truth of the result.

	;; after (TARGET_SALT)
	test0:
		entry	sp, 32
		movi	a8, 0x64
		salt	a2, a2, a8
		retw.n
	test1:
		entry	sp, 32
		movi	a8, 0x65
		saltu	a2, a2, a8
		retw.n
	test2:
		entry	sp, 32
		movi.n	a10, 1		;; hoisted out
		movi.n	a9, 0x10
		loop	a9, .L5_LEND
	.L5:
		l32i.n	a8, a2, 0
		salt	a8, a3, a8
		xor	a8, a8, a10
		s32i.n	a8, a2, 0
		addi.n	a2, a2, 4
	.L5_LEND:
		retw.n

gcc/ChangeLog:
	* config/xtensa/xtensa.cc (xtensa_expand_scc_SALT): New sub-function that emits the SALT/SALTU instructions.
	(xtensa_expand_scc): Change the part related to the SALT/SALTU instructions to a call to the above sub-function.
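The equivalences this commit message relies on can be checked with a small C model. The helper names are hypothetical, and `sketch_salt` only models the "set if less than" behavior that the listings imply (destination = first source < second source); it is not taken from the Xtensa backend.

```c
/* Model of SALT as the listings use it: result is 1 when S < T.  */
static int
sketch_salt (int s, int t)
{
  return s < t;
}

/* "After" form for a constant: a < c is a single SALT.  */
static int
sketch_lt_after (int a, int c)
{
  return sketch_salt (a, c);
}

/* "Before" form: the canonicalized a <= c - 1, computed as the inverse
   of (c - 1 < a) with an addi/neg pair.  */
static int
sketch_le_before (int a, int c_minus_1)
{
  int r = sketch_salt (c_minus_1, a); /* salt:   1 when a > c - 1 */
  r = r - 1;                          /* addi.n: now 0 or -1 */
  return -r;                          /* neg:    now 0 or 1 */
}

/* "After" form for a register operand: a <= b as (b < a) XOR 1, where
   the constant 1 can live in a register hoisted out of a loop.  */
static int
sketch_le_after_xor (int a, int b)
{
  return sketch_salt (b, a) ^ 1;
}
```

For every `a`, `sketch_lt_after (a, 100)` and `sketch_le_before (a, 99)` agree, which is why the shorter sequence can replace the longer one in `test0`.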

Signed-off-by: Mohammad-Reza Nabipoor <[email protected]> gcc/algol68/ChangeLog * a68-imports.cc (dump_encoded_mode): Replace "basic" with "string".

This commit introduces two new utility functions that replace some ad-hoc infrastructure in the scanner. Signed-off-by: Jose E. Marchesi <[email protected]> gcc/algol68/ChangeLog * a68.h: Prototypes for a68_get_file_size and a68_file_read. * a68-parser-scanner.cc (a68_file_size): New function. (a68_file_read): Renamed from io_read. (get_source_size): Deleted function. (include_files): Use a68_file_size and a68_file_read.

This commit adds support for two new command-line options for the Algol 68 front-end:

	-fmodules-map=<string>
	-fmodules-map-file=<filename>

These options are used in order to specify a mapping from module indicants to file basenames. The compiler will base its search for the modules on these basenames rather than on the default scheme of deriving the basename from the module indicant.

Signed-off-by: Jose E. Marchesi <[email protected]>

gcc/algol68/ChangeLog
	* lang.opt (-fmodules-map): New option.
	(-fmodules-map-file): Likewise.
	* a68.h: Add prototype for a68_process_module_map.
	* a68-imports.cc (SKIP_WHITESPACES): Define.
	(PARSE_BASENAME): Likewise.
	(PARSE_INDICANT): Likewise.
	(a68_process_module_map): New function.
	* a68-lang.cc: (a68_init): Move initialization of A68_MODULE_FILES from there...
	(a68_init_options): to here.
	(a68_handle_option): Handle OPT_fmodules_map and OPT_fmodules_map_.
	* a68-parser-pragmat.cc (handle_access_in_pragmat): Normalize module indicants to upper case.
	* ga68.texi (Module search options): New section.

Signed-off-by: Jose E. Marchesi <[email protected]> gcc/ChangeLog * common.opt.urls: Regenerate. gcc/algol68/ChangeLog * lang.opt.urls: Regenerate.

This patch introduces the pipeline description for the Synopsys RMX-100 series processor to the RISC-V GCC backend. The RMX-100 has a short, three-stage, in-order execution pipeline with configurable multiply unit options.

The option -mmpy-option was added to control which version of the MPY unit the core has and what the latency of multiply instructions should be, similar to the ARCv2 cores (see gcc/config/arc/arc.opt:60).

gcc/ChangeLog:
	* config/riscv/riscv-cores.def (RISCV_TUNE): Add arc-v-rmx-100-series.
	* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): Add arcv_rmx100.
	(enum arcv_mpy_option_enum): New enum for ARC-V multiply options.
	* config/riscv/riscv-protos.h (arcv_mpy_1c_bypass_p): New declaration.
	(arcv_mpy_2c_bypass_p): New declaration.
	(arcv_mpy_10c_bypass_p): New declaration.
	* config/riscv/riscv.cc (arcv_mpy_1c_bypass_p): New function.
	(arcv_mpy_2c_bypass_p): New function.
	(arcv_mpy_10c_bypass_p): New function.
	* config/riscv/riscv.md: Add arcv_rmx100.
	* config/riscv/riscv.opt: New option for RMX-100 multiply unit configuration.
	* doc/riscv-mtune.texi: Document arc-v-rmx-100-series.
	* config/riscv/arcv-rmx100.md: New file.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Signed-off-by: Luis Silva <[email protected]>

This patch introduces the pipeline description for the Synopsys RHX-100 series processor to the RISC-V GCC backend. The RHX-100 features a 10-stage, dual-issue, in-order execution pipeline architecture. It has support for instruction fusion, which will be addressed by subsequent patches.

gcc/ChangeLog:
	* config/riscv/riscv-cores.def (RISCV_TUNE): Add arc-v-rhx-100-series.
	* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): Add arcv_rhx100.
	* config/riscv/riscv.cc (enum riscv_fusion_pairs): Add RISCV_FUSE_ARCV.
	* config/riscv/riscv.md: Add arcv_rhx100 to tune attribute.
	* doc/riscv-mtune.texi: Add RHX-100 documentation.
	* config/riscv/arcv-rhx100.md: New file.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Signed-off-by: Luis Silva <[email protected]>

This patch implements instruction fusion support for the Synopsys RHX-100 processor by adding the arcv_macro_fusion_pair_p function and supporting infrastructure.

The implementation supports fusion of several instruction patterns:
- multiply-add sequences,
- shift-based bit extraction,
- load-immediate with conditional branches,
- adjacent memory operations,
- memory operations with arithmetic instructions,
- memory operations with LUI instructions, and
- load-immediate with store operations.

A new arcv.cc file is added to contain ARC-V specific optimizations, and the existing multiply bypass functions are moved from riscv.cc to this new file for better organization.

gcc/ChangeLog:
	* config.gcc: Add arcv.o to extra_objs.
	* config/riscv/riscv-protos.h (arcv_macro_fusion_pair_p): New declaration.
	(arcv_sched_fusion_priority): New declaration.
	(arcv_can_issue_more_p): New declaration.
	(arcv_sched_variable_issue): New declaration.
	(arcv_sched_init): New declaration.
	(arcv_sched_reorder2): New declaration.
	(arcv_sched_adjust_priority): New declaration.
	(arcv_sched_adjust_cost): New declaration.
	* config/riscv/riscv.cc (arcv_mpy_1c_bypass_p): Move to arcv.cc.
	(arcv_mpy_2c_bypass_p): Move to arcv.cc.
	(arcv_mpy_10c_bypass_p): Move to arcv.cc.
	(riscv_macro_fusion_pair_p): New function.
	* config/riscv/t-riscv: Add arcv.o build rule.
	* config/riscv/arcv.cc: New file.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Signed-off-by: Luis Silva <[email protected]>

…eries.

This patch implements the TARGET_SCHED_FUSION_PRIORITY hook for the Synopsys RHX-100 processor to improve instruction scheduling by prioritizing fusible memory operations.

The implementation analyzes load and store instructions to extract base registers and offsets, then assigns scheduling priorities based on several factors: access width (wider accesses get higher priority), base register number, and memory offset values. Instructions with adjacent addresses are grouped together to enable better fusion opportunities.

gcc/ChangeLog:
	* config/riscv/arcv.cc (arcv_fusion_load_store): New function.
	(arcv_sched_fusion_priority): New function.
	* config/riscv/riscv.cc (riscv_sched_fusion_priority): New function.
	(TARGET_SCHED_FUSION_PRIORITY): Define hook.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Signed-off-by: Luis Silva <[email protected]>
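As a rough illustration of the ordering criteria listed in this commit message (width first, then base register, then offset), the sketch below sorts already-decoded memory accesses and tests two of them for adjacency. The struct and function names are hypothetical, and this is not the code in arcv.cc: the real hook works on RTL insns and reports its result through priority values rather than a comparator.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical decoded form of a load or store.  */
struct sketch_access
{
  int width;      /* access size in bytes */
  int base_regno; /* base register number */
  long offset;    /* constant offset from the base register */
};

/* qsort comparator: wider accesses first, then grouped by base register,
   then by ascending offset so neighbouring addresses end up adjacent.  */
static int
sketch_access_cmp (const void *pa, const void *pb)
{
  const struct sketch_access *a = pa;
  const struct sketch_access *b = pb;

  if (a->width != b->width)
    return b->width - a->width;
  if (a->base_regno != b->base_regno)
    return a->base_regno - b->base_regno;
  return (a->offset > b->offset) - (a->offset < b->offset);
}

/* True when B accesses the memory immediately following A.  */
static bool
sketch_adjacent_p (const struct sketch_access *a,
		   const struct sketch_access *b)
{
  return a->width == b->width
	 && a->base_regno == b->base_regno
	 && b->offset - a->offset == a->width;
}
```

After sorting, candidates for adjacent-access fusion sit next to each other, so a single linear pass with `sketch_adjacent_p` finds them.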

This patch implements instruction scheduling support for the dual-issue Synopsys RHX-100 processor by adding scheduler hooks and state tracking for the two execution pipes.

The implementation tracks ALU pipe and memory pipe usage to maximize dual-issue opportunities. It includes reordering logic to promote fusion of adjacent memory operations and other instruction pairs that can execute simultaneously on the RHX-100's dual-issue architecture.

The scheduler prioritizes fused instruction pairs and adjusts costs to improve scheduling decisions. Memory operations are directed to the appropriate pipe while arithmetic operations utilize the ALU pipe, enabling optimal utilization of both execution units.

New TARGET_SCHED hooks are implemented including ADJUST_PRIORITY, REORDER2, and enhanced VARIABLE_ISSUE handling specifically for the RHX-100 microarchitecture.

gcc/ChangeLog:
	* config/riscv/arcv.cc (struct arcv_sched_state): New struct.
	(arcv_sched_init): New function.
	(arcv_sched_reorder2): New function.
	(arcv_sched_adjust_priority): New function.
	(arcv_sched_adjust_cost): New function.
	(arcv_can_issue_more_p): New function.
	(arcv_sched_variable_issue): New function.
	* config/riscv/riscv.cc (riscv_fusion_enabled_p): Add forward declaration.
	(riscv_sched_init): Add call to arcv_sched_init.
	(riscv_sched_variable_issue): Add ARC-V-specific handling.
	(riscv_sched_adjust_cost): Add ARC-V-specific cost adjustment and fix parameter names.
	(riscv_sched_adjust_priority): New function.
	(riscv_sched_reorder2): New function.
	(TARGET_SCHED_ADJUST_PRIORITY): Define hook.
	(TARGET_SCHED_REORDER2): Define hook.
	* config/riscv/riscv.h (TARGET_ARCV_RHX100): New macro.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Co-authored-by: Alex Turjan <[email protected]>
Signed-off-by: Luis Silva <[email protected]>
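The per-cycle pipe tracking described in this commit message might look roughly like the sketch below. It makes a strong simplifying assumption that is not confirmed by the patch: at most one instruction per pipe per cycle. All names are hypothetical and unrelated to the real `struct arcv_sched_state`.

```c
#include <stdbool.h>

/* Hypothetical pipe classification of an instruction.  */
enum sketch_pipe
{
  SKETCH_PIPE_ALU,
  SKETCH_PIPE_MEM
};

/* Hypothetical per-cycle issue state: which pipes are already taken.  */
struct sketch_sched_state
{
  bool alu_busy;
  bool mem_busy;
};

/* Reset at the start of every cycle, as a scheduler init or
   cycle-advance step would.  */
static void
sketch_new_cycle (struct sketch_sched_state *s)
{
  s->alu_busy = false;
  s->mem_busy = false;
}

/* Try to issue an instruction on PIPE in the current cycle.  Returns
   false when that pipe is already occupied, meaning the instruction
   has to wait for the next cycle.  */
static bool
sketch_try_issue (struct sketch_sched_state *s, enum sketch_pipe pipe)
{
  bool *busy = (pipe == SKETCH_PIPE_ALU) ? &s->alu_busy : &s->mem_busy;

  if (*busy)
    return false;
  *busy = true;
  return true;
}
```

Under this model an ALU instruction and a memory instruction can share a cycle while two memory instructions cannot, which is the kind of pairing a reorder hook would try to arrange.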

…axt fusion.

This patch adds instruction patterns to support fusion of multiply-add sequences and bit extraction operations for the Synopsys RHX-100 processor.

The multiply-add fusion supports both signed and unsigned 16-bit operands expanded to 32-bit multiply-accumulate operations. The implementation generates separate multiply and add instructions that can be fused by the processor hardware.

The bit extraction fusion implements zero extraction using shift-left followed by shift-right operations, which can be fused into a single micro-operation.

New instruction types "imul_fused" and "alu_fused" are added to the scheduling model to handle these fused operations.

Test cases are included to verify the correct generation of fusible instruction sequences for multiply-add, bit extraction, and load-immediate with conditional branch patterns.

gcc/ChangeLog:
	* config/riscv/arcv-rhx100.md (arcv_rhx100_imul_fused): New reservation.
	(arcv_rhx100_alu_fused): New reservation.
	* config/riscv/iterators.md (is_zero_extract): New code attribute.
	* config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ARCV_RHX100 support for SIGN_EXTRACT.
	* config/riscv/riscv.md: Add imul_fused and alu_fused to type attribute.
	(umaddhisi4): New expand.
	(madd_split): New insn_and_split.
	(madd_split_extended): New insn_and_split.
	(*zero_extract_fused): New insn.

gcc/testsuite/ChangeLog:
	* gcc.target/riscv/arcv-fusion-limm-condbr.c: New test.
	* gcc.target/riscv/arcv-fusion-madd.c: New test.
	* gcc.target/riscv/arcv-fusion-xbfu.c: New test.

Authored-by: Artemiy Volkov <[email protected]>
Co-authored-by: Michiel Derhaeg <[email protected]>
Signed-off-by: Luis Silva <[email protected]>
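The arithmetic behind the two fused sequences can be modelled in portable C. The shift amounts follow the `*zero_extract_fused` pattern quoted earlier in this thread (`slli` by 32 - (pos + size), then `srli` by 32 - size, for RV32); the function names are hypothetical and the code is only a model of the emitted instructions, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of a zero_extract: SIZE bits of X starting at
   bit POS, zero-extended.  */
static uint32_t
sketch_zero_extract (uint32_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 32);
  uint32_t mask = size == 32 ? UINT32_MAX : (UINT32_C (1) << size) - 1;
  return (x >> pos) & mask;
}

/* The same value computed as the slli/srli pair would compute it.  */
static uint32_t
sketch_slli_srli (uint32_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 32);
  unsigned end = pos + size;
  uint32_t t = x << (32 - end); /* slli: drop the bits above the field */
  return t >> (32 - size);      /* srli: move the field down to bit 0 */
}

/* Unsigned 16x16 -> 32 multiply-add, as a mul followed by an add.  */
static uint32_t
sketch_umaddhisi (uint16_t a, uint16_t b, uint32_t acc)
{
  uint32_t product = (uint32_t) a * b; /* mul */
  return product + acc;                /* add */
}
```

Both shift amounts stay in the range 0..31 whenever `size >= 1`, so the pair is always encodable as immediate shifts.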

GCC Administrator and others added 29 commits December 6, 2025 00:16

Daily bump.

2e61e52

a68: handle intTI_type_node in a68_type_for_size

a1d895c

gcc/algol68/ChangeLog PR algol68/123007 * a68-lang.cc (a68_type_for_size): Handle intTI_type_node.

libstdc++: add more #if to std.cc

da97de4

compile-std1.C was breaking on arm-eabi because these interfaces aren't declared. So for exporting let's check the same macros that control declaring them. libstdc++-v3/ChangeLog: * src/c++23/std.cc.in: Add more #if.

Fortran: [PDT] Verify problems with error recovery have gone [PR103414]

c8450ff

2025-12-06 Paul Thomas <[email protected]> gcc/testsuite PR fortran/103414 * gfortran.dg/pdt_76.f03: New test.

Daily bump.

c70bf3e

analyzer: remove duplicated registration of builtins

01d4414

gcc/analyzer/ChangeLog: * kf.cc (register_known_functions): Remove duplicate calls to register_atomic_builtins and register_varargs_builtins. Signed-off-by: David Malcolm <[email protected]>

Daily bump.

886a4bd

rguenth and others added 9 commits December 16, 2025 08:30

Haifa scheduler: Prevent splitting of fusion pairs in dispatch schedu…

40d0f79

…ling gcc/Changelog * haifa-sched.cc (choose_ready): Don't require dfa_lookahead <= 0 to schedule SCHED_GROUP_P insns first.

luismgsilva force-pushed the michiel/fusion-trunk-3 branch from 89415d4 to b4ce3f9 Compare December 16, 2025 11:34

rorth and others added 8 commits December 16, 2025 13:02

a68: fix dump of encoded string mode

0959845

Signed-off-by: Mohammad-Reza Nabipoor <[email protected]> gcc/algol68/ChangeLog * a68-imports.cc (dump_encoded_mode): Replace "basic" with "string".

Regenerate opt-urls

7132e97

Signed-off-by: Jose E. Marchesi <[email protected]> gcc/ChangeLog * common.opt.urls: Regenerate. gcc/algol68/ChangeLog * lang.opt.urls: Regenerate.

luismgsilva force-pushed the michiel/fusion-trunk-3 branch 4 times, most recently from 10f17dd to a33ab39 Compare December 16, 2025 14:15

artemiy-volkov and others added 6 commits December 16, 2025 15:28

luismgsilva force-pushed the michiel/fusion-trunk-3 branch from a33ab39 to 1f24597 Compare December 16, 2025 15:31

20 participants