Conversation


@dongjoon-hyun dongjoon-hyun commented Nov 13, 2025

What changes were proposed in this pull request?

This PR is a follow-up of #52894 to fix `connectutils.py` to import `pb2` conditionally.
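
For context, the conditional-import guard described here can be sketched as follows. This is an illustrative pattern, not the actual `connectutils.py` code; the helper name `optional_import` is invented for this example.

```python
import importlib


def optional_import(name):
    """Import a module if available; return (module, error_message)."""
    try:
        return importlib.import_module(name), None
    except ImportError as exc:
        return None, str(exc)


# Guard the optional grpc dependency: only touch the generated
# pyspark.sql.connect.proto modules when grpc itself is importable,
# since importing them unconditionally raises ModuleNotFoundError
# in environments without grpc installed.
grpc, grpc_error = optional_import("grpc")
if grpc is not None:
    pb2, _ = optional_import("pyspark.sql.connect.proto")
else:
    pb2 = None  # tests depending on pb2 should be skipped, citing grpc_error
```

Test suites typically pair such a guard with a skip marker (e.g. skipping Connect tests when `pb2 is None`) so that classic-only CI environments pass without the optional dependency.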

Why are the changes needed?

Currently, Python CIs are broken like the following.

```
  File "/__w/spark/spark/python/pyspark/testing/connectutils.py", line 26, in <module>
    import pyspark.sql.connect.proto as pb2
  File "/__w/spark/spark/python/pyspark/sql/connect/proto/__init__.py", line 18, in <module>
    from pyspark.sql.connect.proto.base_pb2_grpc import *
  File "/__w/spark/spark/python/pyspark/sql/connect/proto/base_pb2_grpc.py", line 19, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'
```

Does this PR introduce any user-facing change?

No behavior change. We had been importing `pyspark.sql.connect` conditionally before #52894.

How was this patch tested?

Pass the CIs and a manual test.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun

Could you review this follow-up PR, @zhengruifeng ?

@dongjoon-hyun

Could you review this PR when you have some time, @peter-toth ?

@dongjoon-hyun

Thank you, @peter-toth !

@dongjoon-hyun

Thank you, @hvanhovell !

dongjoon-hyun added a commit that referenced this pull request Nov 13, 2025
… conditionally

### What changes were proposed in this pull request?

This PR is a follow-up of the following to fix `connectutils.py` to import `pb2` conditionally.
- #52894

### Why are the changes needed?

Currently, Python CIs are broken like the following.
- https://github.com/apache/spark/actions/workflows/build_python_3.11_classic_only.yml
    - https://github.com/apache/spark/actions/runs/19316448951/job/55248810741
- https://github.com/apache/spark/actions/workflows/build_python_3.12.yml
    - https://github.com/apache/spark/actions/runs/19275741458/job/55212353468

```
  File "/__w/spark/spark/python/pyspark/testing/connectutils.py", line 26, in <module>
    import pyspark.sql.connect.proto as pb2
  File "/__w/spark/spark/python/pyspark/sql/connect/proto/__init__.py", line 18, in <module>
    from pyspark.sql.connect.proto.base_pb2_grpc import *
  File "/__w/spark/spark/python/pyspark/sql/connect/proto/base_pb2_grpc.py", line 19, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'
```

### Does this PR introduce _any_ user-facing change?

No behavior change. We had been importing `pyspark.sql.connect` conditionally before #52894.

### How was this patch tested?

Pass the CIs and a manual test.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53037 from dongjoon-hyun/SPARK-54194.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 63bcc87)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun

For the record, CIs are recovered.

