Add support for automatic column statistics generation in DSF catalog constructs #349

Open
vgkowski opened this issue Jan 15, 2024 · 0 comments
Assignees
Labels
enhancement Enhancement in existing feature

Comments

@vgkowski
Contributor

AWS-native services, including EMR, Athena, Glue, and Redshift, provide better query performance when Glue column statistics are available:
https://docs.aws.amazon.com/glue/latest/dg/column-statistics.html

DSF should provide an option (with a default value) to enable column statistics generation at the database level when creating a DatabaseCatalog.

DSF should create the gold layer of the DataLakeCatalog with column statistics enabled.
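
To make the request more concrete, here is a minimal sketch of what "column statistics at the database level" could translate to against the Glue API: list every table in a database and start one column statistics task run per table. This is not how DSF implements or would implement it. The function name `start_column_statistics_for_database` and its parameters are hypothetical; the only assumption about `glue` is that it behaves like `boto3.client("glue")`, whose `get_tables` and `start_column_statistics_task_run` operations are used here. The IAM role passed in is assumed to already have the permissions Glue needs to read the data and write statistics.

```python
from typing import Any, Dict


def start_column_statistics_for_database(
    glue: Any, database: str, role_arn: str
) -> Dict[str, str]:
    """Start one Glue column statistics task run per table in `database`.

    `glue` is expected to behave like boto3.client("glue") (assumption).
    Returns a mapping of table name to the task run ID returned by Glue.
    """
    run_ids: Dict[str, str] = {}
    kwargs: Dict[str, Any] = {"DatabaseName": database}

    while True:
        # get_tables is paginated: follow NextToken until it is absent.
        page = glue.get_tables(**kwargs)
        for table in page.get("TableList", []):
            # Without ColumnNameList, the task run targets all columns.
            response = glue.start_column_statistics_task_run(
                DatabaseName=database,
                TableName=table["Name"],
                Role=role_arn,
            )
            run_ids[table["Name"]] = response["ColumnStatisticsTaskRunId"]

        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token

    return run_ids
```

With a real client this would be called as `start_column_statistics_for_database(boto3.client("glue"), "gold", role_arn)`. In a construct, the same logic would more likely live behind a scheduled trigger or custom resource so that statistics are refreshed as tables change, but that design is left open here.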

@github-project-automation github-project-automation bot moved this to Backlog in DSF roadmap Jan 15, 2024
@lmouhib lmouhib added the enhancement Enhancement in existing feature label Jan 18, 2024
@lmouhib lmouhib moved this from Backlog to Todo in DSF roadmap Apr 30, 2024
@vgkowski vgkowski assigned vgkowski and lmouhib and unassigned vgkowski Apr 30, 2024
@vgkowski vgkowski moved this from Todo to In Progress in DSF roadmap Apr 30, 2024
Projects
Status: In Progress
Development

No branches or pull requests

2 participants