Things not yet implemented.
Most features of the CLI have yet to be implemented.
- Packaging
  - Downloadable/installable self-contained binary
- Improved S3CopyArc
  - Object renaming
  - Basic static partitioning
  - Include/exclude predicates
- Custom Lambda-based arc workloads
- High-frequency S3 listener boundary
  - For aggregating objects that arrive within a lot interval
  - https://docs.clusterless.io/reference/1.0-wip/components/aws-core-s3-put-listener-boundary.html
- Native resources and workloads
  - AWS Glue database and catalog updates
  - AWS Athena CTAS/INSERT INTO queries (for chaining SQL)
  - AWS SageMaker training/validation
- Common data processing workloads
  - Data reformatting (from text/json to binary/parquet): https://github.com/ClusterlessHQ/tessellate
  - Dynamic data repartitioning (partitions based on data like timestamps)
  - Predicate/duplicate index creation and data filtering
- Join Barrier implementations
- Scheduled arc executions
  - Some arcs may need to run periodically
- Parallelized workloads
  - Workloads can be parallelized on source partitions
- Pluggable modules for providing third-party services
- LocalStack support for faster testing of AWS scenarios
- Alternate substrates/providers
  - Azure
  - GCP
  - Digital Ocean
  - OCI
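
To make the "dynamic data repartitioning" item above more concrete, here is a minimal sketch of the general idea: the partition is derived from a value inside each record (here a timestamp) rather than from where the object arrived. This is not Clusterless or Tessellate code. The `timestamp` field name, the Hive-style `year=/month=/day=` layout, and the function names `partition_key` and `repartition` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(record: dict) -> str:
    """Derive a Hive-style partition path from the record's own timestamp."""
    # Assumption: each record carries an ISO-8601 "timestamp" field with a UTC offset.
    ts = datetime.fromisoformat(record["timestamp"]).astimezone(timezone.utc)
    return f"year={ts:%Y}/month={ts:%m}/day={ts:%d}"


def repartition(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by the partition that their data maps to."""
    partitions: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        partitions[partition_key(record)].append(record)
    return dict(partitions)


# Hypothetical input: records that arrived together but belong to different days.
records = [
    {"timestamp": "2023-02-27T23:59:00+00:00", "value": 1},
    {"timestamp": "2023-02-28T00:01:00+00:00", "value": 2},
    {"timestamp": "2023-02-27T08:30:00+00:00", "value": 3},
]
partitions = repartition(records)
```

A real workload would stream records from source objects and write each group to its own destination prefix. The in-memory grouping here only illustrates how the partition is chosen.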