wip - initial draft results forest #7398
base: master
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@              Coverage Diff              @@
##            master     #7398       +/-   ##
=============================================
- Coverage    42.75%    39.29%    -3.47%
=============================================
  Files         1591       111     -1480
  Lines       146192     13180   -133012
=============================================
- Hits         62507      5179    -57328
+ Misses       78345      7615    -70730
+ Partials      5340       386     -4954
Flags with carried forward coverage won't be shown.
Looks very good. I think this is going very much in the right direction. One high-level comment:
- I would suggest that you consolidate all the threading and orchestration in the Engine. The Engine decides when something is happening, and the ResultsForest implements what is happening (see the sketch below).
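To make the suggested split concrete, here is a minimal, hypothetical sketch (none of these names are from the PR; the actual Engine and ResultsForest APIs may differ, and the SignalerContext usage simply mirrors the ctx.Throw calls in the diff below): the Engine owns the notifier loop and the worker goroutines, while the ResultsForest only exposes synchronous operations.

// Hypothetical sketch only: the Engine decides *when* work happens,
// the ResultsForest implements *what* happens.
type pendingProvider interface {
	// NextPending returns a container whose pipeline is ready to run, if any.
	// Purely synchronous; no goroutines are spawned inside the forest.
	NextPending() (*ExecutionResultContainer, bool)
}

func (e *Engine) loop(ctx irrecoverable.SignalerContext) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-e.notifier.Channel():
			// drain all currently runnable pipelines
			for {
				container, ok := e.forest.NextPending()
				if !ok {
					break
				}
				e.startPipeline(ctx, container) // goroutine management lives in the Engine
			}
		}
	}
}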
wg := sync.WaitGroup{}
ch := f.notifier.Channel()
for {
	select {
	case <-ctx.Done():
		// wait for all pipelines to gracefully shutdown
		wg.Wait()
		return

	case <-ch:
		runningCount := uint(0)

		// Inspect all vertices in the tree forming from the latest persisted sealed result.
		// Count all running pipelines, and start new ones up to the maxRunningCount.
		// This will iterate over at most 2*maxRunningCount vertices.
		f.visitAllAncestorsBFS(f.latestPersistedSealedResult.ResultID(), func(container *ExecutionResultContainer) bool {
			state := container.pipeline.GetState()

			switch state {
			case pipeline.StateCanceled:
				// TODO: free the pipeline's resources (not necessarily here)
				return true

			case pipeline.StateComplete:
				if err := f.processCompleted(container.resultID); err != nil {
					ctx.Throw(err)
				}
				return true

			case pipeline.StatePending:
				wg.Add(1)
				go func() {
					defer wg.Done()

					core := pipeline.NewCore()
					err := container.pipeline.Run(ctx, core)
					if err != nil && !errors.Is(err, context.Canceled) {
						ctx.Throw(fmt.Errorf("pipeline execution failed (result: %s): %w", container.resultID, err))
					}
				}()
			}

			runningCount++
			return runningCount < f.maxRunningCount
		})
	}
}
Two high-level comments:
- that is thread maintenance logic. I feel it complicates the ResultsForest because it creates two domains: some work that the results forest is internally doing, vs. work that is done by external threads of the results forest.
- From my perspective, just encapsulating the business logic by itself provides a really nice abstraction of "elementary self-contained operations". In my head, it is the responsibility of the ResultsForest to provide those operations, while it is the responsibility of the Engine to provide the orchestration and worker threads so that the ResultsForest (in conjunction with the ForestLoader) is doing its job over time (see the sketch below).
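One way to read "elementary self-contained operations" (a hypothetical sketch; the method name CollectRunnable is not from the PR): the forest exposes a synchronous method that only inspects state and hands back work, leaving goroutine and WaitGroup management entirely to the Engine.

// Hypothetical sketch only: a synchronous forest operation with no internal threading.
// CollectRunnable walks the ancestors of the latest persisted sealed result and returns
// pipelines in StatePending, up to the given limit, counting already-running ones.
func (f *ResultsForest) CollectRunnable(limit uint) []*ExecutionResultContainer {
	f.mu.Lock()
	defer f.mu.Unlock()

	var runnable []*ExecutionResultContainer
	runningCount := uint(0)
	f.visitAllAncestorsBFS(f.latestPersistedSealedResult.ResultID(), func(container *ExecutionResultContainer) bool {
		switch container.pipeline.GetState() {
		case pipeline.StatePending:
			runnable = append(runnable, container)
		case pipeline.StateCanceled, pipeline.StateComplete:
			return true // not running and nothing to start; keep visiting
		}
		runningCount++
		return runningCount < limit
	})
	return runnable
}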
case <-ch:
	runningCount := uint(0)

	// Inspect all vertices in the tree forming from the latest persisted sealed result.
I am wondering what you think about the alternative approach:
- overall, I would prefer if that thread and work management lived in the Engine
- assume that when we call ResultsForest.AddResult we create a new task and add it to some queue. Then we have a worker pool with 20 workers pulling elements off that queue (sketched below).
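A rough sketch of that alternative (the task type and the Engine fields and methods shown here are hypothetical, not the PR's API): AddResult enqueues a task, and the Engine runs a fixed-size worker pool that drains the queue.

// Hypothetical sketch only: AddResult feeds a queue; the Engine owns the worker pool.
type pipelineTask struct {
	resultID flow.Identifier
	height   uint64 // useful later if the queue becomes height-ordered
}

func (e *Engine) startWorkerPool(ctx irrecoverable.SignalerContext, workers int) {
	for i := 0; i < workers; i++ { // e.g. 20 workers, as suggested
		e.wg.Add(1)
		go func() {
			defer e.wg.Done()
			for {
				select {
				case <-ctx.Done():
					return
				case task := <-e.taskQueue: // buffered channel fed by ResultsForest.AddResult
					e.runPipeline(ctx, task)
				}
			}
		}()
	}
}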
I'll look into this. I'd have to use a priority queue based on height, but that should be simple enough.
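For reference, a height-ordered priority queue is straightforward with the standard library's container/heap. This is a self-contained sketch: the heightTask type is a stand-in for whatever the real task would carry, and access would need to be guarded by a mutex since the heap itself is not thread-safe.

package main

import (
	"container/heap"
	"fmt"
)

// heightTask stands in for a queued pipeline task.
type heightTask struct {
	resultID string
	height   uint64
}

// taskHeap is a min-heap ordered by block height, so workers always pull the
// lowest-height pending result first.
type taskHeap []heightTask

func (h taskHeap) Len() int           { return len(h) }
func (h taskHeap) Less(i, j int) bool { return h[i].height < h[j].height }
func (h taskHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *taskHeap) Push(x any)        { *h = append(*h, x.(heightTask)) }
func (h *taskHeap) Pop() any {
	old := *h
	n := len(old)
	t := old[n-1]
	*h = old[:n-1]
	return t
}

func main() {
	h := &taskHeap{{"c", 30}, {"a", 10}, {"b", 20}}
	heap.Init(h)
	heap.Push(h, heightTask{"d", 5})
	for h.Len() > 0 {
		fmt.Println(heap.Pop(h).(heightTask).resultID) // prints d, a, b, c
	}
}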
Run(context.Context, Core) error
GetState() State
SetSealed()
OnParentStateUpdated(State)
This is essentially carrying over information from one pipeline to another. I think that would be better placed in the ResultsForest. I guess all that is really relevant here is whether this pipeline is cancelled?
This is essentially carrying over information from one pipeline to another. I think that would be better placed in the ResultsForest.
This is called from the ResultsForest. I'm not sure what you mean.
I guess all that is really relevant here is whether this pipeline is cancelled?
Also whether the parent is in an active state, which is required to start downloading.
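A hypothetical sketch of that gating (the Pipeline fields, the transition logic, and the canStartDownload helper are assumptions, not the PR's actual implementation; only the method names and the three states come from the diff): the child records its parent's state and only begins downloading once the parent has progressed beyond pending, while a canceled parent cancels the child.

// Hypothetical sketch only: how a pipeline could react to its parent's state updates.
func (p *Pipeline) OnParentStateUpdated(parentState State) {
	p.mu.Lock()
	defer p.mu.Unlock()

	switch parentState {
	case StateCanceled:
		// the parent's fork is abandoned, so this pipeline can never be sealed; cancel it too
		p.state = StateCanceled
	default:
		// remember the parent's state; Run consults it before starting the download step
		p.parentState = parentState
	}
}

// canStartDownload reports whether the parent has progressed far enough for this
// pipeline to begin downloading execution data.
func (p *Pipeline) canStartDownload() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.parentState != StatePending && p.parentState != StateCanceled
}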
func (f *ResultsForest) processCompleted(resultID flow.Identifier) error {
	f.mu.Lock()
	defer f.mu.Unlock()

	// first, ensure that the result ID is in the forest, otherwise the forest is in an inconsistent state
	container, ok := f.getContainer(resultID)
	if !ok {
		return fmt.Errorf("state update from unknown result vertex %s", resultID)
	}

	// next, ensure that this result descends from the latest persisted sealed result, otherwise
	// the forest is in an inconsistent state since persisting must be done sequentially
	if container.resultID != f.latestPersistedSealedResult.ResultID() {
		return fmt.Errorf("result %s does not match the latest persisted sealed result %s", container.resultID, f.latestPersistedSealedResult.ResultID())
	}

	// finally, prune the forest up to the latest persisted result's block view
	latestPersistedView := container.blockHeader.View
	err := f.pruneUpToView(latestPersistedView)
	if err != nil {
		return fmt.Errorf("failed to prune results forest (view: %d): %w", latestPersistedView, err)
	}
can we do this right after calling Run on the pipeline? 👇
flow-go/engine/access/ingestion2/results_forest.go
Lines 131 to 135 in 391632a
core := pipeline.NewCore()
err := container.pipeline.Run(ctx, core)
if err != nil && !errors.Is(err, context.Canceled) {
	ctx.Throw(fmt.Errorf("pipeline execution failed (result: %s): %w", container.resultID, err))
}
yea, that's a good point
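A hypothetical sketch of that adjustment, folded into the goroutine from the diff above (the StateComplete check and the exact error-handling shape are assumptions): processCompleted is invoked as soon as the pipeline finishes, instead of waiting for the next notifier tick.

// Hypothetical sketch only: prune immediately after the pipeline finishes.
go func() {
	defer wg.Done()

	core := pipeline.NewCore()
	if err := container.pipeline.Run(ctx, core); err != nil {
		if !errors.Is(err, context.Canceled) {
			ctx.Throw(fmt.Errorf("pipeline execution failed (result: %s): %w", container.resultID, err))
		}
		return
	}

	// only a fully completed pipeline should trigger pruning of the forest
	if container.pipeline.GetState() == pipeline.StateComplete {
		if err := f.processCompleted(container.resultID); err != nil {
			ctx.Throw(err)
		}
	}
}()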
Co-authored-by: Alexander Hentschel <[email protected]>
This is an early draft of the results forest and new access ingestion engine.