problems with profiling column names with period in name in pyspark #1532

Open
1 task
jamie256 opened this issue Jun 12, 2024 · 2 comments

Comments

@jamie256
Contributor

Description

Filing on behalf of a user report.

A PySpark DataFrame named df has column names that contain a period:
"model1.Category", "model1.input_temp"

Profiling it with

collect_column_profile_views(df)

fails with a 'No column named "model1.Category"' error.

Contributor

This issue is stale. Remove stale label or it will be closed next week.

@richard-rogers
Contributor

I wasn't able to work around this with backtick escapes in collect_column_profile_views(), but renaming the columns works: input_df.toDF(*(c.replace('.', '_') for c in input_df.columns)). We might want to maintain a map so we can restore the original column names.
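
The rename-and-restore idea could look roughly like the sketch below. It is only an illustration, not code from whylogs: make_safe_names and restore_names are hypothetical helper names, and the sketch assumes collect_column_profile_views() returns a dict keyed by column name. The helpers work on plain lists of names, so they run without Spark. A suffix is added if a renamed column would collide with an existing one.

```python
from typing import Dict, Iterable, List, Tuple


def make_safe_names(
    columns: Iterable[str], sep: str = "_"
) -> Tuple[List[str], Dict[str, str]]:
    """Replace '.' in column names; return (safe_names, safe_to_original)."""
    originals = list(columns)
    # Names without a period are kept unchanged, so reserve them up front.
    taken = {c for c in originals if "." not in c}
    safe_names: List[str] = []
    safe_to_original: Dict[str, str] = {}
    for original in originals:
        if "." not in original:
            safe = original
        else:
            candidate = original.replace(".", sep)
            safe = candidate
            suffix = 1
            # Avoid clashing with an existing or already renamed column.
            while safe in taken:
                safe = f"{candidate}{sep}{suffix}"
                suffix += 1
            taken.add(safe)
        safe_names.append(safe)
        safe_to_original[safe] = original
    return safe_names, safe_to_original


def restore_names(
    keyed_by_safe: Dict[str, object], safe_to_original: Dict[str, str]
) -> Dict[str, object]:
    """Re-key a dict of per-column results by the original column names."""
    return {safe_to_original.get(k, k): v for k, v in keyed_by_safe.items()}


# Hypothetical usage with PySpark and whylogs (not executed here):
#   safe_names, safe_to_original = make_safe_names(input_df.columns)
#   views = collect_column_profile_views(input_df.toDF(*safe_names))
#   views = restore_names(views, safe_to_original)
```

Keeping the map keyed by the safe name means the restore step is a single dict lookup, and columns that never contained a period pass through untouched.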
