Add option to DataLakeCatalog/DataCatalogDatabase that crawlers use manually defined table #490

Open
karnik opened this issue Mar 19, 2024 · 3 comments
Labels
enhancement Enhancement in existing feature

Comments

@karnik

karnik commented Mar 19, 2024

DataLakeCatalog/DataCatalogDatabase should offer an option to pass manually defined tables to the crawler as parameters. Several use cases require a manually created catalog table:

  • You want to choose the catalog table name yourself instead of relying on the catalog table naming algorithm
  • You want to reuse the table later in the stack (e.g., in a Lambda function that queries the table)

See: https://docs.aws.amazon.com/glue/latest/dg/tables-described.html#update-manual-tables
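To make the request concrete, here is a minimal sketch of how manually defined tables could be passed through to the crawler. It is not the DSF API: `CatalogDatabaseOptions`, `table_names`, `location_prefix`, and `crawler_targets` are hypothetical names. The only things taken from AWS are the `S3Targets` and `CatalogTargets` shapes of the `Targets` property on the CloudFormation `AWS::Glue::Crawler` resource.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogDatabaseOptions:
    """Hypothetical options; not the current DataCatalogDatabase props."""
    database_name: str
    location_prefix: str  # S3 path crawled when no tables are given
    table_names: List[str] = field(default_factory=list)  # manually defined tables


def crawler_targets(opts: CatalogDatabaseOptions) -> dict:
    """Build the Targets property of an AWS::Glue::Crawler resource."""
    if opts.table_names:
        # Crawl only the manually defined catalog tables.
        return {
            "CatalogTargets": [
                {"DatabaseName": opts.database_name, "Tables": list(opts.table_names)}
            ]
        }
    # Default: let the crawler discover and name tables from the S3 path.
    return {"S3Targets": [{"Path": opts.location_prefix}]}
```

With this shape, the table name is chosen by the caller, and the same name can be handed to other resources in the stack.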

@github-project-automation github-project-automation bot moved this to Backlog in DSF roadmap Mar 19, 2024
@karnik karnik changed the title Add option to DataLakeCatalog/DataCatalogDatabase that crawlers will use existing table Add option to DataLakeCatalog/DataCatalogDatabase that crawlers will manully defined table Mar 19, 2024
@karnik karnik changed the title Add option to DataLakeCatalog/DataCatalogDatabase that crawlers will manully defined table Add option to DataLakeCatalog/DataCatalogDatabase that crawlers use manually defined table Mar 19, 2024
@vgkowski
Contributor

Good point! My suggestion is we add a crawler mode parameter with 3 different values:

  • Can create tables, update schemas and add partitions
  • Can only update schemas and add partitions
  • Can only add partitions

What do you think?
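A rough sketch of what such a mode parameter might look like follows. The enum member names, the `target_kind` key, and the mapping itself are assumptions for illustration, not an agreed design. The `UpdateBehavior` and `DeleteBehavior` values are the ones defined for `SchemaChangePolicy` on `AWS::Glue::Crawler`; whether `LOG` still lets the crawler add partitions as intended should be checked against the Glue documentation.

```python
from enum import Enum


class CrawlerMode(Enum):
    """Hypothetical names for the three proposed values."""
    CREATE_UPDATE_PARTITION = "create_update_partition"  # create tables, update schemas, add partitions
    UPDATE_PARTITION = "update_partition"                # update schemas, add partitions
    PARTITION_ONLY = "partition_only"                    # add partitions only


def crawler_settings(mode: CrawlerMode) -> dict:
    """Map a mode to illustrative crawler settings (assumed mapping)."""
    creates_tables = mode is CrawlerMode.CREATE_UPDATE_PARTITION
    updates_schema = mode is not CrawlerMode.PARTITION_ONLY
    return {
        # Assumption: pointing the crawler at existing catalog tables,
        # rather than S3 paths, is what prevents new tables being created.
        "target_kind": "S3Targets" if creates_tables else "CatalogTargets",
        "schema_change_policy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE" if updates_schema else "LOG",
            # Assumption: crawlers on catalog targets only log deletions.
            "DeleteBehavior": "DEPRECATE_IN_DATABASE" if creates_tables else "LOG",
        },
    }
```

Keeping the mode as a single enum, rather than three independent booleans, rules out combinations the proposal does not list, such as creating tables without adding partitions.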

@karnik
Author

karnik commented Mar 20, 2024

Sounds great. How do you want to pass the table definition?

@lmouhib lmouhib added the enhancement Enhancement in existing feature label Apr 1, 2024
@vgkowski
Contributor

Do you want to be able to filter tables by S3 prefix?
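For illustration only, filtering by S3 prefix could mean something like the sketch below: given each table's S3 location, keep the tables stored under a prefix. The function name and the name-to-location mapping are hypothetical; nothing here reflects an existing DSF or Glue call.

```python
from typing import Dict, List


def _with_slash(path: str) -> str:
    """Ensure a trailing slash so prefixes match whole path segments."""
    return path if path.endswith("/") else path + "/"


def tables_under_prefix(table_locations: Dict[str, str], prefix: str) -> List[str]:
    """Return the sorted names of tables whose S3 location lies under prefix."""
    # Normalising both sides keeps "s3://bucket/sales" from also
    # matching "s3://bucket/sales_archive/".
    normalized = _with_slash(prefix)
    return sorted(
        name
        for name, location in table_locations.items()
        if _with_slash(location).startswith(normalized)
    )
```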

Projects
Status: Backlog
Development

No branches or pull requests

3 participants