Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests contain non-zero timers #548

Open
ricea opened this issue Oct 20, 2016 · 2 comments
Open

Tests contain non-zero timers #548

ricea opened this issue Oct 20, 2016 · 2 comments

Comments

@ricea
Copy link
Collaborator

ricea commented Oct 20, 2016

Short timers make tests flaky. From the Testharness.js API Documentation:

Note that timeouts generally need to be a few seconds long in order to produce stable results in all test environments.

Long timeouts slow down all developers by making the the tests take longer to run. They also waste resources on automatic testing infrastructure.

Mostly timers are being used to ensure that the implementation never takes some action that it shouldn't. For example (from writable-streams/start.js):

promise_test(() => {
  const ws = recordingWritableStream({
    start() {
      return Promise.reject();
    }
  });

  // Wait and verify that write or close aren't called.
  return delay(100)
      .then(() => assert_array_equals(ws.events, [], 'write and close should not be called'));
}, 'underlying sink\'s write or close should not be invoked if the promise returned by start is rejected');

This test never[1] flakes on a correct implementation. Suppose hypothetically that there existed an implementation which called close() here after a asynchronously performing an action which takes an arbitrary amount of time. Then this test would be flaky on that implementation: it would fail when the action took <100ms, and pass otherwise.

I propose that we make the simplifying assumption that no reasonable implementation will do the wrong thing after a delay. We can then disregard the above hypothetical implementation as being unreasonable, and so consider its false positive on the above test to be acceptable.

I furthermore propose that we consider two times around the event loop to be long enough to see whether an implementation is going to do the wrong thing or not. My reasoning is that the implementation may run a callback after the user code, which then posts another async callback which finally does the wrong thing.

I think we should have a function flushAsyncEvents() which is equivalent to delay(0).then(delay(0)). The above test would then look like

promise_test(() => {
  const ws = recordingWritableStream({
    start() {
      return Promise.reject();
    }
  });

  // Verify that write or close aren't called.
  return flushAsyncEvents()
      .then(() => assert_array_equals(ws.events, [], 'write and close should not be called'));
}, 'underlying sink\'s write or close should not be invoked if the promise returned by start is rejected');

[1] Test frameworks usually have a hard timeout, which means that even passing tests will fail occasionally. The longer a test takes, the more likely it is to hit the hard timeout. This is seen in practice on the Blink layout tests.

@domenic
Copy link
Member

domenic commented Oct 20, 2016

This sounds good to me. Minor correction: delay(0).then(() => delay(0)).

It doesn't take care of some remaining cases where we're actually using the timers for sequencing, but we can have a separate discussion about those.

ricea added a commit to ricea/streams that referenced this issue Nov 9, 2016
Add a util function flushAsyncEvents() which spins the event loop to
ensure that the implementation doesn't do something it shouldn't
asynchronously.

Use this function to replace trivial uses of delay(100) and similar  to
improve test determinism.

This change relates to whatwg#548.

Tests which use multiple delays to express ordering requirements have
not been changed. Those tests require individual analysis.
ricea added a commit that referenced this issue Nov 11, 2016
Add a util function flushAsyncEvents() which spins the event loop to
ensure that the implementation doesn't do something it shouldn't
asynchronously.

Use this function to replace trivial uses of delay(100) and similar  to
improve test determinism.

This change relates to #548.

Some non-zero delays were just there to trigger asynchronous
behaviour. Replace these with delay(0).

Tests which use multiple delays to express ordering requirements have
not been changed. Those tests require individual analysis.
ricea added a commit to ricea/web-platform-tests that referenced this issue Apr 24, 2017
Test 'Piping to a WritableStream that does not consume the writes fast
enough exerts backpressure on the ReadableStream' in
streams/piping/flow-control.js is flaky on Chromium
memory sanitizer bots due to its use of timers.

Change to use an interleaved step design. Each step does not start until the
previous one has completed.

Relevant to whatwg/streams#548.
ricea added a commit to web-platform-tests/wpt that referenced this issue Apr 25, 2017
Test 'Piping to a WritableStream that does not consume the writes fast
enough exerts backpressure on the ReadableStream' in
streams/piping/flow-control.js is flaky on Chromium
memory sanitizer bots due to its use of timers.

Change to use an interleaved step design. Each step does not start until the
previous one has completed.

Relevant to whatwg/streams#548.
@ricea
Copy link
Collaborator Author

ricea commented Nov 10, 2017

Some notes from recently porting some tests to not use non-zero delays:

If there is only one delay() in the test, then it's probably only being used to test async behaviour, and can just be replaced by delay(0). If there are negative assertions (assertions that something has not happened) in the test then you may need to use flushAsyncEvents().

Even many tests that use delay(n) for sequencing can be easily rewritten without non-zero delays. Here's four examples, only one of which results in a non-deterministic ordering:

// Example (1)
delay(10).then(a);
delay(20).then(b);

// Example (2)
delay(10).then(a);
someOtherEvent.then(() => delay(20)).then(b);

// Example (3)
someOtherEvent.then(() => delay(10)).then(a);
delay(20).then(b);

// Example (4)
delay(10).then(a);
delay(10).then(b);

In examples (1), (2) and (4), a happens deterministically before b, and so they can be rewritten using chained promises. In example (3) the order may be nondeterministic.

In the general case, tests with non-zero delays implicitly test the ordering of events and should be rewritten to explicitly test ordering instead. Usually the recording stream test helpers in streams/resources/recording-streams.js are the right tools for this.

The tricky part is identifying those implicitly tested ordering expectations. It's easy to rewrite the test so it looks the same and passes, but accidentally reduce the power of the test in identifying incorrect implementations.

ricea added a commit to web-platform-tests/wpt that referenced this issue Nov 15, 2017
Some tests for TransformStream used the delay() function with non-zero
delays. Mostly this was done to make things happen asynchronously in a
particular order, and has been replaced with equivalent code using
promise chains.

The test "TransformStream flush is called after all queued writes
finish, once the writable is closed" was passing accidentally as it was
using test() where it should have been using promise_test(). In addition
to removing non-zero delays from the test, also fix it to use
promise_test() and remove the backpressure that was causing transform()
to never be called.

See whatwg/streams#548 for background on this
clean-up.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants