Add Temporal half-rounding boundary tests across all units by MidnightDesign · Pull Request #4996 · tc39/test262

MidnightDesign · 2026-03-16T14:27:50Z

Note

This PR was drafted with the help of Claude Code. Apologies if that's not welcome here — happy to revise anything by hand.

Context

We're building temporal-php, a PHP 8.4 port of the TC39 Temporal API. We run the test262 suite as part of our CI (transpiled to PHP) and also use Infection for mutation testing. Infection systematically modifies source code and checks whether the test suite catches each mutation. Out of 11,983 mutations, several hundred escaped — and many of those escaped because the upstream test262 data doesn't exercise certain code paths. This PR adds tests to close the most impactful gaps we found.

Summary

Eight new test files across four Temporal types, all exercising the exact 0.5 fractional boundary in RoundRelativeDuration:

PlainDate (since/ and until/): years (183/366 = 0.5 in a leap year) and months (14/28 = 0.5 in February).
PlainDateTime and ZonedDateTime (since/ and until/): years, months, weeks (3.5/7), days (12/24), hours (30/60), minutes (30/60), seconds (500/1000 ms), milliseconds (500/1000 µs), and microseconds (500/1000 ns).
PlainYearMonth (since/ and until/): years only (months always produce exact results for this type). Uses June-starting dates because RoundRelativeDuration converts month remainders to days: Jun–Nov = 183 days in the 366-day span crossing Feb 29, 2020.
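The day counts quoted in this list can be checked with plain date arithmetic, without Temporal. The sketch below is illustrative only: `daysBetween` is a hypothetical helper, and 2019 as the non-leap year for the 28-day February is an assumption, since the list does not name a year for that case.

```javascript
// Illustrative check of the day counts quoted above; not part of the PR's tests.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: whole days between two ISO dates, computed in UTC
// so that daylight-saving shifts cannot affect the count.
function daysBetween(startIso, endIso) {
  const start = Date.parse(startIso + "T00:00:00Z");
  const end = Date.parse(endIso + "T00:00:00Z");
  return (end - start) / MS_PER_DAY;
}

// PlainYearMonth years case: Jun-Nov 2019 inside the Jun 2019 - Jun 2020 span.
const remainderDays = daysBetween("2019-06-01", "2019-12-01");
const yearSpanDays = daysBetween("2019-06-01", "2020-06-01"); // crosses Feb 29, 2020
const yearFraction = remainderDays / yearSpanDays;

// PlainDate months case: 14 days into a 28-day February (year is an assumption).
const februaryDays = daysBetween("2019-02-01", "2019-03-01");
const monthFraction = 14 / februaryDays;

console.log(remainderDays, yearSpanDays, yearFraction); // 183 366 0.5
console.log(februaryDays, monthFraction); // 28 0.5
```

Because 183/366 and 14/28 are both exactly representable as 0.5 in floating point, a strict equality check on the fraction is safe here.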

Each unit is tested with both an odd integer part (e.g. 1.5) and an even integer part (e.g. 2.5) to distinguish halfEven from halfExpand. Rounding modes with identical outcomes are looped to keep tests concise. The until tests cover the positive direction; the since tests cover the negative direction (where halfExpand and halfCeil diverge). Each scenario includes a .total() assertion to verify the duration is exactly on the 0.5 boundary.
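To make the odd/even and positive/negative choices concrete, here is a minimal sketch of how the nine rounding modes behave on an exact tie. It rounds plain numbers to integers and is not the spec's RoundRelativeDuration; `roundWithMode` and `MODES` are hypothetical names introduced only for this illustration.

```javascript
// Illustrative only: integer rounding under the nine rounding-mode names.
const MODES = ["ceil", "floor", "trunc", "expand",
               "halfCeil", "halfFloor", "halfTrunc", "halfExpand", "halfEven"];

function roundWithMode(x, mode) {
  if (!MODES.includes(mode)) throw new RangeError(`unknown rounding mode: ${mode}`);
  const lower = Math.floor(x);
  const upper = Math.ceil(x);
  if (lower === upper) return x; // already an integer
  const towardZero = x < 0 ? upper : lower;
  const awayFromZero = x < 0 ? lower : upper;
  const even = lower % 2 === 0 ? lower : upper;
  const byMode = {
    ceil: upper, floor: lower, trunc: towardZero, expand: awayFromZero,
    halfCeil: upper, halfFloor: lower, halfTrunc: towardZero,
    halfExpand: awayFromZero, halfEven: even,
  };
  // Directed modes ignore how close x is to either neighbour.
  if (!mode.startsWith("half")) return byMode[mode];
  // half-* modes go to the nearest integer and consult the mode only on a tie.
  const fraction = x - lower;
  if (fraction < 0.5) return lower;
  if (fraction > 0.5) return upper;
  return byMode[mode];
}

for (const value of [1.5, 2.5, -1.5, -2.5]) {
  const row = MODES.map((mode) => `${mode}=${roundWithMode(value, mode)}`).join(" ");
  console.log(`${value}: ${row}`);
}
```

In this sketch, 1.5 rounds to 2 under both halfEven and halfExpand, while 2.5 gives 2 and 3 respectively, which is why an even integer part is needed. At -1.5, halfExpand gives -2 and halfCeil gives -1, which is the divergence the since tests rely on.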

How the gaps were found

Infection rewrites code like swapping halfExpand match arms, etc. If no test fails, the mutant "escapes." We traced ~80 escaped mutants back to the fact that all half-* rounding modes produce the same output with the current test data (no value near the 0.5 boundary), so entire match arms can be deleted or swapped without detection.
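The escape mechanism can be sketched in a few lines. This is a simplified illustration, not Infection's output or the temporal-php code: both functions and `killsMutant` are hypothetical, and the mutant stands in for a swapped match arm by breaking ties toward zero instead of away from zero.

```javascript
// Illustrative only: why a mutant survives when no input sits on a tie.

// Original behaviour: round to nearest, ties away from zero (halfExpand).
function roundHalfExpand(x) {
  return Math.sign(x) * Math.floor(Math.abs(x) + 0.5);
}

// Hypothetical mutant: round to nearest, ties toward zero (halfTrunc).
function roundHalfExpandMutant(x) {
  return Math.sign(x) * Math.ceil(Math.abs(x) - 0.5);
}

// A test suite "kills" the mutant if any input makes the two disagree.
function killsMutant(inputs) {
  return inputs.some((x) => roundHalfExpand(x) !== roundHalfExpandMutant(x));
}

console.log(killsMutant([31.97]));            // false: the mutant escapes
console.log(killsMutant([31.97, 1.5, -1.5])); // true: a tie exposes it
```

With only an input like 31.97 the two functions agree on 32, so the swap goes unnoticed. Adding a value that lies exactly on the 0.5 boundary is enough to tell them apart.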

All expected values were verified against V8's Temporal implementation via test262-harness with esvu-installed V8 (d8).

…and dayOfWeek

The existing rounding mode tests for PlainDate.prototype.since() and PlainDate.prototype.until() use dates that produce ~31.97 months of difference, well above the 0.5 boundary. All half-* rounding modes produce identical results, making it impossible to distinguish halfExpand from halfTrunc, or halfEven from halfCeil.

These new tests use dates that produce exactly 0.5 fractional progress (183/366 days in a leap year), causing all nine rounding modes to produce distinct result patterns. The 2.5-year case specifically distinguishes halfEven (rounds to nearest even integer 2) from halfExpand (rounds away from zero to 3).

Also adds:
- inLeapYear century-year tests (1700, 1800, 1900, 2100, 2200) exercising the 100/400 rule that the basic test does not cover
- dayOfWeek tests across all 12 months of a year, since the basic test only checks 7 consecutive days within a single month

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove unrelated dayOfWeek and inLeapYear tests. Add RoundRelativeDuration spec references to info blocks. Extend half-boundary coverage to PlainDateTime, PlainYearMonth, and ZonedDateTime (until + since).

PlainYearMonth uses June-starting dates because RoundRelativeDuration converts month remainders to days: Jun-Nov = 183 days in a 366-day year span (Jun 2019 - Jun 2020 crossing Feb 29), giving exactly 183/366 = 0.5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ptomato

Thanks! This looks good at first glance.

A couple questions and comments:

* Would you mind sharing approximately the process you used for prompting Claude Code and turning the output into this PR? As the technology is new we are still finding our way around. Thank you for disclosing it up front, by the way.

* Did the mutation tool only find a lack of test coverage when rounding to years, or is the coverage for units such as months also lacking? It might make sense to find similar pairs of objects for other units. (If you do that, it probably makes sense to loop over rounding modes that have the same outcome in each case, to prevent the tests from getting overly long.)

* To make sure we are testing what we expect to be testing, it might be helpful to add assertions at the beginning such as `assert.sameValue(earlier1.since(later).total({ unit: "years", relativeTo: earlier1 }), -1.5, "duration is on a 0.5 boundary");`

test/built-ins/Temporal/PlainDate/prototype/since/roundingmode-half-boundary.js

MidnightDesign · 2026-03-18T17:01:21Z

Would you mind sharing approximately the process you used for prompting Claude Code and turning the output into this PR?

I told it to write a minimal transpiler for the syntax used in the Temporal tests and it came up with this. It grew as it implemented more classes and methods that required different syntax to be converted.

Example conversion: JS -> PHP

Then I told it to start implementing classes and methods one by one against the test suite.

At one point it was pretty much done and all that was left were concepts that are untranslatable, like JS Symbol.

I was wondering whether there was any dead or untested code in there (which there shouldn't be if test262 is exhaustive). I fired up my go-to technique for that, mutation testing, using Infection. To my surprise it actually flagged some mutants, specifically pointing out that the rounding modes were pretty much uncovered. I then told it in a different session (directly in the test262 repo) to double and triple check, and it was positive that that's an actual gap in the test suite. I told it to add the tests, it did, verified them against V8, and I let it open the PR, making sure to include the fact that this was written by an agent.

As the technology is new we are still finding our way around. Thank you for disclosing it up front, by the way.

Yeah of course, that was important to me. We're all still figuring out how to handle this stuff and I thought it would be important to be upfront with that.

Did the mutation tool only find a lack of test coverage when rounding to years, or is the coverage for units such as months also lacking? It might make sense to find similar pairs of objects for other units. (If you do that, it probably makes sense to loop over rounding modes that have the same outcome in each case, to prevent the tests from getting overly long.)

Claude Code did not flag other units by itself specifically, and I didn't ask. Will check tomorrow.

To make sure we are testing what we expect to be testing, it might be helpful to add assertions at the beginning such as assert.sameValue(earlier1.since(later).total({ unit: "years", relativeTo: earlier1 }), -1.5, "duration is on a 0.5 boundary");

Great idea, I will add that.

@ptomato Thanks for taking the time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Clarifies the half-* rounding mode assertion messages by using standard tie-breaking terminology for consistency with non-half mode phrasing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

linusg · 2026-03-18T18:00:00Z

The copyright hallucination makes me very skeptical of this contribution.

* We ought to reject it as a violation of the AI policy alone

* Even if there was an exception, it clearly wasn't sanity checked

MidnightDesign · 2026-03-18T20:27:14Z

The copyright hallucination makes me very skeptical of this contribution.

I checked the code, not the boilerplate-looking legal header.

We ought to reject it as a violation of the AI policy alone

Absolutely fair. I'm skeptical of AI-written code that I haven't generated myself either. But given the radical shift in the last couple of months I think these kinds of policies will need to be updated.

What a weird point in time; people still see anything involving AIs instinctively as low-quality (myself included) while even the greatest skeptics have started using it all the time.

Personally, I'm in favor of being transparent with the use of AI and allowing it in a limited capacity instead of banning it outright (and people using it anyway without telling).

ptomato · 2026-03-20T15:17:54Z

The copyright hallucination makes me very skeptical of this contribution.

* We ought to reject it as a violation of the [AI policy](https://github.com/tc39/how-we-work/blob/main/AI_POLICY.md) alone

* Even if there was an exception, it clearly wasn't sanity checked

I'm comfortable with this contribution.

I read the AI policy as "don't copy-paste LLM-generated text into discussions that TC39 delegates have to waste time reading", not "don't accept any LLM-assisted PRs" (and from what I caught of the discussion when the policy was adopted, that's how it was intended.)

Furthermore, I believe I can vouch for this PR being (1) correct and (2) filling an actual coverage gap, because I checked those things. And the modifications I suggested (looping over rounding modes that expect the same result, asserting that total() is a floating point that's exactly x.5) will make the tests even easier to verify for correctness at a glance.

I overlooked the copyright line because I hadn't met Rudi Theunissen and I just assumed that was MidnightDesign's real name 😄 I can understand others' skepticism after seeing that, but I'm pretty confident these are useful tests.

@ptomato

Adds assert.sameValue checks using .until().total() at the start of each test to prove the test data produces exactly x.5 years, as suggested by @ptomato. Uses .until().total() rather than .since().total() for the since tests because .total() with a negative duration traverses backward from relativeTo into a non-leap year (2018, 365 days), yielding 183/365 ≈ 0.5014 instead of 183/366 = 0.5. Verified against V8 14.8.37 via test262-harness. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MidnightDesign · 2026-03-25T08:43:46Z

Added .total() boundary assertions as suggested — see d3274e1.

Note: the since test files use .until().total() (rather than .since().total()) for the boundary check because .total() with a negative duration walks backward from relativeTo into a non-leap year (2018, 365 days), giving 183/365 ≈ 0.5014 instead of 183/366 = 0.5. Using .until().total() walks forward into the leap year and produces the exact 0.5 value.
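The two fractions in the note above can be reproduced with simple arithmetic. The sketch below only recomputes the numbers quoted in the note; it does not verify how .total() actually walks from relativeTo. `daysInCalendarYear` is a hypothetical helper, and aligning the one-year window to January 1 is an assumption made purely to obtain the 365-day and 366-day lengths.

```javascript
// Illustrative only: reproduces the 183/366 and 183/365 figures from the note.

// Hypothetical helper: length in days of the window from Jan 1 of `year`
// to Jan 1 of the following year, computed in UTC.
function daysInCalendarYear(year) {
  return (Date.UTC(year + 1, 0, 1) - Date.UTC(year, 0, 1)) / 86_400_000;
}

const halfYearDays = 183;

// Window inside the leap year 2020 (366 days): exactly on the boundary.
const leapYearFraction = halfYearDays / daysInCalendarYear(2020);

// Window inside the non-leap year 2018 (365 days): slightly past the boundary.
const nonLeapYearFraction = halfYearDays / daysInCalendarYear(2018);

console.log(leapYearFraction);               // 0.5
console.log(nonLeapYearFraction.toFixed(4)); // "0.5014"
```

If the note's description is right, the same 183-day remainder is an exact tie only when it is measured against the 366-day window, which is why the choice of direction matters for the boundary assertion.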

All 8 files verified against V8 14.8.37 via test262-harness (16/16 pass).

(This comment was AI-generated — I don't fully understand the .total() directionality issue, so please flag it if the reasoning is off.)

Extend roundingmode-half-boundary.js tests to cover all units where an exact 0.5 fractional boundary can be achieved, not just years. Loop over rounding modes with identical outcomes to keep tests concise.

Units by type:
- PlainDate: years, months
- PlainDateTime/ZonedDateTime: years, months, weeks, days, hours, minutes, seconds, milliseconds, microseconds
- PlainYearMonth: years (months always exact for this type)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MidnightDesign · 2026-03-25T09:53:19Z

@ptomato Added tests for all units in ec3a307.

I've uploaded the full assistant session since you were interested in it in your first message here: https://gist.github.com/MidnightDesign/a9fd0fffff198d6c59929e64062855a5

ctcpip · 2026-03-25T13:56:37Z

@MidnightDesign Thank you for your contribution here! 🙏 Please be mindful of our AI policy. Specifically:

Prose contributions and comments must be your own writing, not the product of large language models (LLMs) or other tools.

We ask that contributors understand, and be able to explain, what they are posting. That includes code, of course. It also includes comments and posts, which should be your own writing and not the output of AI/LLMs.

MidnightDesign · 2026-03-26T06:11:29Z

@ctcpip Thank you, I will be mindful of that in the future.

MidnightDesign requested a review from a team as a code owner March 16, 2026 14:27

MidnightDesign changed the title ~~Add Temporal PlainDate tests for half-rounding boundary, century leap years, and dayOfWeek coverage~~ Add Temporal half-rounding boundary tests for PlainDate, PlainDateTime, PlainYearMonth, and ZonedDateTime Mar 16, 2026

MidnightDesign added 2 commits March 16, 2026 15:52

Merge branch 'main' into temporal-rounding-half-boundary

86246eb

Merge branch 'main' into temporal-rounding-half-boundary

e98298c

ptomato reviewed Mar 18, 2026


Ms2ger reviewed Mar 18, 2026


MidnightDesign and others added 2 commits March 18, 2026 18:09

Update copyright name to Rudolph Gottesheim

76e77c3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Use "breaks ties" instead of "rounds 0.5" in assertion messages

a3cfa92

Clarifies the half-* rounding mode assertion messages by using standard tie-breaking terminology for consistency with non-half mode phrasing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge branch 'main' into temporal-rounding-half-boundary

4dee97b

MidnightDesign changed the title ~~Add Temporal half-rounding boundary tests for PlainDate, PlainDateTime, PlainYearMonth, and ZonedDateTime~~ Add Temporal half-rounding boundary tests across all units Mar 25, 2026

Add Temporal half-rounding boundary tests across all units #4996
MidnightDesign wants to merge 9 commits into tc39:main from MidnightDesign:temporal-rounding-half-boundary

6 participants
