@gaogaotiantian
Contributor

What changes were proposed in this pull request?

Enable Codecov test report upload on all workflows in apache/spark, namely all scheduled tests plus the main commit build.

Why are the changes needed?

After experimenting with Codecov, we have a better understanding of how it works. It uses flags to categorize tests from different environments, which helps us with test analysis. Our scheduled tests are sometimes unstable, and this could help us find the unstable tests.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

This is a CI change, so we have to wait for CI.

Was this patch authored or co-authored using generative AI tooling?

No

@gaogaotiantian gaogaotiantian marked this pull request as ready for review January 9, 2026 21:23
@github-actions

github-actions bot commented Jan 9, 2026

JIRA Issue Information

=== Test SPARK-54889 ===
Summary: Upload test result to codecov.io to have a test dashboard
Assignee: Tian Gao
Status: Resolved
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the INFRA label Jan 9, 2026
@gaogaotiantian
Contributor Author

@zhengruifeng the CI actually passed after a re-run (there were some glitches). I don't know why GitHub did not show it correctly.

@zhengruifeng
Contributor

I remember that workflows with Codecov are much slower, so I am not sure whether we want to enable it on all workflows.

@HyukjinKwon
Member

Yeah, we can't; it becomes much, much slower and unstable.

@gaogaotiantian
Contributor Author

I believe you are thinking about "coverage run", where we instrument the unit tests so we can generate coverage data. This is a different thing. We do not change how we run tests at all. The only thing we do is upload the already generated test reports (not coverage reports) to codecov.io.

Codecov.io is a website for visualizing data. It is often used to visualize coverage data, which is generated by coverage runs (we do that daily). In this case, we are using a different feature to visualize test results, it's not related to coverage at all.

@HyukjinKwon
Member

> In this case, we are using a different feature to visualize test results, it's not related to coverage at all.

What are we uploading?

@HyukjinKwon
Member

Do you have an example to show? What would it look like?

@gaogaotiantian
Contributor Author

We are uploading the XML files generated by each unit test. Note that we are already uploading those to GitHub artifacts in another action. The dashboard would look like https://app.codecov.io/gh/apache/spark/tests - it's not the most polished dashboard, but it's something. The cost is really low, and we at least get some test analysis (I need to work on it a bit more to have better data, but this is a start).
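
For readers unfamiliar with this kind of report, here is a rough sketch of the sort of information such XML files carry and how results grouped by flag could surface unstable tests. It assumes the reports follow the common JUnit-style XML layout, which this thread does not confirm. The sample report, the flag names, and the helpers `summarize_report` and `find_unstable` are all hypothetical, and this is not how Codecov itself processes uploads.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample report in the common JUnit-style XML layout (an assumption).
SAMPLE_REPORT = """<testsuite name="example.tests" tests="3" failures="1" errors="0" skipped="1">
  <testcase classname="example.tests.FooTests" name="test_ok" time="0.12"/>
  <testcase classname="example.tests.FooTests" name="test_broken" time="0.30">
    <failure message="assertion failed">AssertionError</failure>
  </testcase>
  <testcase classname="example.tests.FooTests" name="test_skipped" time="0.00">
    <skipped message="not supported"/>
  </testcase>
</testsuite>
"""


def summarize_report(xml_text):
    """Return {outcome: [test ids]} for one JUnit-style XML report."""
    root = ET.fromstring(xml_text)
    outcomes = defaultdict(list)
    # iter() finds <testcase> elements whether the root is <testsuite> or <testsuites>.
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname')}.{case.get('name')}"
        if case.find("failure") is not None or case.find("error") is not None:
            outcomes["failed"].append(test_id)
        elif case.find("skipped") is not None:
            outcomes["skipped"].append(test_id)
        else:
            outcomes["passed"].append(test_id)
    return dict(outcomes)


def find_unstable(runs):
    """Given {flag: [summary, ...]}, return tests that both passed and failed under the same flag."""
    unstable = {}
    for flag, summaries in runs.items():
        passed, failed = set(), set()
        for summary in summaries:
            passed.update(summary.get("passed", []))
            failed.update(summary.get("failed", []))
        both = sorted(passed & failed)
        if both:
            unstable[flag] = both
    return unstable


if __name__ == "__main__":
    for outcome, tests in sorted(summarize_report(SAMPLE_REPORT).items()):
        print(outcome, len(tests))
```

A test that shows up as both passed and failed under the same flag across several runs is a candidate for the "unstable" category mentioned in the PR description; a real analysis would also need to account for code changes between runs.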
