chore(suspect flags): Include filtered flag in output #95007

aayush-se · 2025-07-07T22:24:08Z

Performs a filtering step prior to RRF:
- Normalizes the KL and entropy scores with a Box-Cox transform, then keeps the scores whose z-score is >= the threshold
  - The threshold is currently 1.5 but can be adjusted if RRF turns out to be over- or under-filtering
Exposes whether each flag has been filtered out as a boolean in the JSON response, so that all flags remain visible on the frontend if required
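
The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (`boxcox`, `z_scores`, `mark_filtered`), the single-score input, the fixed lambda, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
import statistics

Z_THRESHOLD = 1.5  # value quoted in the description; meant to be adjustable


def boxcox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if lam == 0.0:
        return [math.log(v) for v in values]
    return [(v**lam - 1) / lam for v in values]


def z_scores(values):
    """Standardize values; population standard deviation is an assumption."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [0.0 for _ in values]
    return [(v - mean) / stdev for v in values]


def mark_filtered(scores, lam=0.0, threshold=Z_THRESHOLD):
    """Map each flag to a boolean: True means the flag was filtered out.

    `scores` is a hypothetical {flag_name: positive_score} mapping.
    """
    flags = list(scores)
    z = z_scores(boxcox([scores[f] for f in flags], lam))
    return {flag: zi < threshold for flag, zi in zip(flags, z)}
```

Every flag stays in the returned mapping, which mirrors the idea of exposing the filtered state as a boolean rather than dropping flags from the response.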

TODO:

Update frontend to use these filtered scores when sorting by Heuristic + RRF or RRF

codecov · 2025-07-07T23:15:12Z

Codecov Report

Attention: Patch coverage is 97.26027% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/sentry/seer/math.py	92.30%	4 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master   #95007       +/-   ##
===========================================
+ Coverage   38.17%   87.82%   +49.65%     
===========================================
  Files        9858    10436      +578     
  Lines      556058   604299    +48241     
  Branches    23550    23550               
===========================================
+ Hits       212265   530718   +318453     
+ Misses     343426    73214   -270212     
  Partials      367      367

cursor

Bug: Box-Cox Lambda Calculation Inconsistency

The boxcox_transform function calculates the optimal lambda parameter using the original values via _boxcox_normmax, even when it internally shifts non-positive values to shifted_values for the actual transformation. This leads to an inconsistency where the optimal lambda is determined for a different dataset than what is ultimately transformed, potentially yielding suboptimal results. Furthermore, _boxcox_normmax duplicates the shifting logic, which is inefficient and brittle.

src/sentry/seer/math.py#L109-L129

sentry/src/sentry/seer/math.py

Lines 109 to 129 in aa96455

        shifted_values = values

    if lambda_param is not None:
        if lambda_param == 0.0:
            transformed = [math.log(max(v, 1e-10)) for v in shifted_values]
        else:
            transformed = [
                (pow(max(v, 1e-10), lambda_param) - 1) / lambda_param for v in shifted_values
            ]
        return transformed, lambda_param

    optimal_lambda = _boxcox_normmax(values)

    if optimal_lambda == 0.0:
        transformed = [math.log(max(v, 1e-10)) for v in shifted_values]
    else:
        transformed = [
            (pow(max(v, 1e-10), optimal_lambda) - 1) / optimal_lambda for v in shifted_values
        ]

    return transformed, optimal_lambda

trillville · 2025-07-08T16:57:02Z

src/sentry/seer/math.py

+            ]
+        return transformed, lambda_param
+
+    optimal_lambda = _boxcox_normmax(values)


should this be _boxcox_normmax(shifted_values)?
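
To illustrate why the estimator should see the same data that gets transformed, here is a hedged sketch. `shift_to_positive` and `estimate_lambda` are hypothetical stand-ins (the real shifting rule and `_boxcox_normmax` are not shown in this thread); the estimator is a plain grid search over the standard Box-Cox log-likelihood, which is only defined for strictly positive inputs.

```python
import math


def shift_to_positive(values, eps=1e-10):
    """Hypothetical shift: move the data so its minimum is strictly positive."""
    lowest = min(values)
    if lowest > 0:
        return list(values)
    offset = eps - lowest
    return [v + offset for v in values]


def boxcox_llf(lam, values):
    """Box-Cox log-likelihood for strictly positive values."""
    n = len(values)
    logs = [math.log(v) for v in values]  # raises ValueError if any v <= 0
    if lam == 0.0:
        transformed = logs
    else:
        transformed = [(v**lam - 1) / lam for v in values]
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    if variance <= 0:
        return float("-inf")
    return (lam - 1) * sum(logs) - n / 2 * math.log(variance)


def estimate_lambda(values, lo=-2.0, hi=2.0, steps=81):
    """Grid-search stand-in for _boxcox_normmax (an assumption, not the PR code)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: boxcox_llf(lam, values))


values = [0.0, 1.0, 2.0, 5.0]
shifted = shift_to_positive(values)
lam = estimate_lambda(shifted)  # lambda fitted on the data that is transformed
```

In this sketch, passing the unshifted `values` to `estimate_lambda` fails outright because of the zero, which makes the mismatch obvious; an estimator that shifts internally would instead hide it.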

ram-senth · 2025-07-08T17:12:47Z

src/sentry/seer/math.py

+
+    optimal_lambda = _boxcox_normmax(values)
+
+    if optimal_lambda == 0.0:


Minor one: it looks like you can avoid the code duplication by initializing lambda_param first, e.g. lambda_param = _boxcox_normmax(values) if lambda_param is None else lambda_param, so that a single transform branch handles both cases.
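
A rough sketch of that single-path shape, under assumptions: the `estimator` parameter is a hypothetical stand-in for `_boxcox_normmax`, the shifting step is elided, and `boxcox_apply` is an illustrative helper name rather than anything in the PR.

```python
import math


def boxcox_apply(values, lam):
    """Apply Box-Cox with a known lambda, clipping at 1e-10 as in the snippet."""
    clipped = [max(v, 1e-10) for v in values]
    if lam == 0.0:
        return [math.log(v) for v in clipped]
    return [(pow(v, lam) - 1) / lam for v in clipped]


def boxcox_transform(values, lambda_param=None, estimator=lambda shifted: 0.0):
    """Resolve lambda once, then run one transform branch.

    `estimator` is a placeholder for the real lambda estimator; the default
    simply returns 0.0 (log transform) so the example stays self-contained.
    """
    shifted_values = list(values)  # shifting of non-positive values elided
    if lambda_param is None:
        lambda_param = estimator(shifted_values)
    return boxcox_apply(shifted_values, lambda_param), lambda_param
```

Resolving `lambda_param` up front means the log and power branches appear once, and the estimator naturally receives the same `shifted_values` that are transformed.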

aayush-se added 3 commits July 7, 2025 10:59

initial functions for filtering

50c773d

make more consistent with scipy implementation

07ad2f4

update return

d26e4b3

github-actions bot added the Scope: Backend label Jul 7, 2025

types and tests

74fc10b

vercel bot deployed to Preview July 7, 2025 23:02 View deployment

typo

1fd3aff

aayush-se marked this pull request as ready for review July 7, 2025 23:06

aayush-se requested review from a team as code owners July 7, 2025 23:06

aayush-se requested review from trillville and ram-senth July 7, 2025 23:06

vercel bot deployed to Preview July 7, 2025 23:06 View deployment

Ensure 0 is handled and update tests

18f9df8

vercel bot deployed to Preview July 7, 2025 23:51 View deployment

aayush-se marked this pull request as draft July 7, 2025 23:56

use the correct values for z score calculation

271bb90

vercel bot deployed to Preview July 8, 2025 00:34 View deployment

aayush-se marked this pull request as ready for review July 8, 2025 00:39

bugs

aa96455

vercel bot deployed to Preview July 8, 2025 04:33 View deployment

cursor bot reviewed Jul 8, 2025

trillville reviewed Jul 8, 2025

trillville approved these changes Jul 8, 2025

ram-senth reviewed Jul 8, 2025

update using shifted values and clean up boxcox function

93d2f05

vercel bot deployed to Preview July 8, 2025 18:37 View deployment

aayush-se merged commit eb87fa0 into master Jul 8, 2025
64 of 65 checks passed

aayush-se deleted the suspect-flags/rrf-filtering branch July 8, 2025 19:09
