Regression: pandas.read_parquet hangs when using filprofiler 2022.09.0 #415

kdebrab · 2022-09-20T16:22:25Z

I hope the following is sufficient for reproducing the issue.

Writing with df.to_parquet works fine; the code hangs when reading the data back with pd.read_parquet. The parquet engine used is pyarrow. No error is raised; the Docker container simply hangs forever.

python: 3.10.7
OS: Linux
pandas: 1.4.4
numpy: 1.23.3
pyarrow: 9.0.0

Disabling filprofiler (I use the API, toggled by an environment variable, as documented in https://pythonspeed.com/fil/docs/api.html#using-the-python-api) resolves the issue. Reverting to filprofiler 2022.06.0 (with everything else exactly the same) also resolves the issue.
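
For reference, the setup described above might look roughly like the sketch below. This is not the code from the affected project. The environment variable name `USE_FIL`, the helper names `maybe_profile` and `parquet_round_trip`, the sample data, and the file paths are all hypothetical. `filprofiler.api.profile` is the entry point described in the Fil API docs linked above.

```python
import os

# Hypothetical environment variable; the real name is not given in this report.
FIL_ENV_VAR = "USE_FIL"


def maybe_profile(func, output_dir):
    """Run func under Fil only when the environment variable is set to "1"."""
    if os.environ.get(FIL_ENV_VAR) != "1":
        return func()
    # Imported lazily so the script still runs where filprofiler is not installed.
    from filprofiler.api import profile
    return profile(func, output_dir)


def parquet_round_trip(path):
    """Write a small DataFrame to parquet and read it back (needs pandas and pyarrow)."""
    import pandas as pd

    df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    df.to_parquet(path, engine="pyarrow")    # writing is reported to work
    return pd.read_parquet(path, engine="pyarrow")  # reading is where the hang is reported


# Example call (paths are placeholders):
# maybe_profile(lambda: parquet_round_trip("example.parquet"), "fil-result")
```

With the variable unset, the round trip runs without Fil, which matches the "disabling filprofiler" case above. Whether this small DataFrame triggers the hang is not known.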


itamarst · 2022-09-20T16:26:08Z

Thanks for the detailed bug report. I will try to reproduce, and if I fail I will ask for more details.

itamarst · 2022-09-25T22:07:02Z

Hi, I am unable to reproduce this with a random parquet file I have lying around. Could you share a minimal reproducer if you can make one? Python script + parquet file, ideally.

itamarst · 2022-09-28T16:49:28Z

@kdebrab just checking again, would love to get this fixed.

itamarst · 2022-10-25T13:51:31Z

@kdebrab could you provide a reproducer?

itamarst added the bug (Something isn't working) and NEXT labels · Sep 20, 2022
