Description
When writing optimization-friendly code, it sometimes seems like a good idea to unroll branches by hand: perform a multi-step computation optimistically while recording possible failures of the intermediate steps as booleans, then combine those failure flags with bitwise logic when the validity of the result is finally checked. Here's an example of how it might work with checked arithmetic, inspired by some code in std::hash:
fn calculate_size(elem_size: usize,
                  length: usize,
                  offset: usize)
                  -> Option<usize> {
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    if oflo1 | oflo2 {
        None
    } else {
        Some(acc)
    }
}
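For contrast, here is roughly what the same computation looks like when written with the checked_* methods instead of accumulating overflow flags. This is only a comparison sketch (the name calculate_size_checked is mine), not the actual std::hash code:

fn calculate_size_checked(elem_size: usize,
                          length: usize,
                          offset: usize)
                          -> Option<usize> {
    // Each step branches on overflow as soon as it happens,
    // instead of collecting flags for a single check at the end.
    elem_size.checked_mul(length)
             .and_then(|acc| acc.checked_add(offset))
}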
However, in optimized code generation, at least on x86-64, the bitwise OR of booleans is sometimes decomposed back into a series of checks and branches, defeating the whole purpose. Here's a condensed benchmark comparing boolean OR with integer bitwise OR, where the result of each is used as the condition for a branch.
#![feature(test)]
extern crate test;

use test::Bencher;

#[inline(never)]
fn or_bools(a: bool, b: bool, c: bool) -> Option<u64> {
    if a | b | c { Some(1) } else { None }
}

#[inline(never)]
fn or_bytes(a: u8, b: u8, c: u8) -> Option<u64> {
    if (a | b | c) != 0 { Some(1) } else { None }
}

#[bench]
fn bench_or_bools(b: &mut Bencher) {
    const DATA: [(bool, bool, bool); 4]
        = [(false, false, false),
           (true , false, false),
           (false, true , false),
           (false, false, true )];
    b.iter(|| {
        for i in 0 .. 4 {
            let (a, b, c) = DATA[i];
            test::black_box(or_bools(a, b, c));
        }
    })
}

#[bench]
fn bench_or_bytes(b: &mut Bencher) {
    const DATA: [(u8, u8, u8); 4]
        = [(0u8, 0u8, 0u8),
           (1u8, 0u8, 0u8),
           (0u8, 1u8, 0u8),
           (0u8, 0u8, 1u8)];
    b.iter(|| {
        for i in 0 .. 4 {
            let (a, b, c) = DATA[i];
            test::black_box(or_bytes(a, b, c));
        }
    })
}
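If the integer OR indeed stays branchless, the original function could be rewritten along the same lines by widening the overflow flags before combining them. This is a hypothetical sketch (the cast-based workaround and the name calculate_size_bytes are illustrative, not from std::hash), and whether it actually avoids the extra branches would have to be verified against the generated assembly:

fn calculate_size_bytes(elem_size: usize,
                        length: usize,
                        offset: usize)
                        -> Option<usize> {
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    // Widen the flags to u8 so the OR operates on integers, mirroring or_bytes above.
    if (oflo1 as u8 | oflo2 as u8) != 0 {
        None
    } else {
        Some(acc)
    }
}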
The de-optimization looks like the work of LLVM, as the IR for or_bools preserves the original intent:
; Function Attrs: noinline norecurse nounwind uwtable
define internal fastcc void @_ZN8or_bools20h51bacbaed15b22f4gaaE(%"2.core::option::Option<u64>"* noalias nocapture dereferenceable(16), i1 zeroext, i1 zeroext, i1 zeroext) unnamed_addr #0 {
entry-block:
  %4 = or i1 %1, %2
  %5 = or i1 %4, %3
  %6 = bitcast %"2.core::option::Option<u64>"* %0 to i8*
  br i1 %5, label %then-block-26-, label %else-block

then-block-26-:                                   ; preds = %entry-block
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %6, i8* nonnull bitcast ({ i64, i64, [0 x i8] }* @const5784 to i8*), i64 16, i32 8, i1 false)
  br label %join

else-block:                                       ; preds = %entry-block
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %6, i8* nonnull bitcast ({ i64, [8 x i8] }* @const5785 to i8*), i64 16, i32 8, i1 false)
  br label %join

join:                                             ; preds = %else-block, %then-block-26-
  ret void
}
Activity
mzabaluev commented on Mar 21, 2016
This probably should have been filed straight for LLVM, but I'm intimidated by their Bugzilla.
eefriedman commented on Mar 24, 2016
Long discussion on LLVM bugtracker: https://llvm.org/bugs/show_bug.cgi?id=23827 .
mzabaluev commented on Mar 24, 2016
@eefriedman, thanks. As I understood it, the relative performance of branched vs. branchless code for evaluating a boolean-ish expression tree depends on the predictability of the branch predicates, and branched code may be better when: 1) the branches are highly predictable; 2) the earlier ones tend to be taken often, short-circuiting the evaluation. In the specific case of checked arithmetic, the branches are very predictable (no overflows occur normally), but an evaluation normally has to check all of the predicates. I guess it's rather a matter of hinting the compiler about the probability of branches, support for which is yet to be implemented (#26179).
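For illustration, the kind of hint I have in mind might look something like this with the unstable likely/unlikely intrinsics (nightly-only; the name calculate_size_hinted and the exact spelling of the intrinsic call are just assumptions on my part):

#![feature(core_intrinsics)]

fn calculate_size_hinted(elem_size: usize,
                         length: usize,
                         offset: usize)
                         -> Option<usize> {
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    // Tell the optimizer that the overflow case is the cold path.
    if unsafe { std::intrinsics::unlikely(oflo1 | oflo2) } {
        None
    } else {
        Some(acc)
    }
}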
Mark-Simulacrum commented on May 2, 2017
The LLVM bug was closed as invalid, and the unstable expect/likely intrinsics are now implemented, judging by the tracking issue. I'm going to close this; please ask us to reopen if you disagree.