Releases: dlt-hub/dlt
0.4.6
Core Library
- feat(airflow): expose the Airflow runner method to create custom DAGs by @IlyaFaer in #1014
- removes sql alchemy dependency and port parts of URL class by @rudolfix in #1028
- Parallelize decorator - run many regular generators in parallel by @steinitzu in #965
- Add main entry point to support calling dlt as python module by @sultaniman in #1023
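The "run many regular generators in parallel" entry above can be pictured with a small standard-library sketch. This is not dlt's implementation: `run_parallel` and `numbers` are hypothetical names, and the approach (one outstanding `next()` call per generator on a thread pool) is an assumption chosen for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Any, Iterable, Iterator


def run_parallel(generators: Iterable[Iterable[Any]], max_workers: int = 4) -> Iterator[Any]:
    """Yield items from several generators, advancing them on worker threads.

    Hypothetical helper, not part of dlt. Each generator has at most one
    pending next() call, so no generator is advanced by two threads at once.
    """
    sentinel = object()  # marks an exhausted generator
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(next, iter(gen), sentinel): None for gen in []}
        pending = {}
        for gen in generators:
            iterator = iter(gen)
            pending[pool.submit(next, iterator, sentinel)] = iterator
        while pending:
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                iterator = pending.pop(future)
                item = future.result()
                if item is not sentinel:
                    yield item
                    # Ask the same generator for its next item.
                    pending[pool.submit(next, iterator, sentinel)] = iterator


def numbers(start: int, count: int) -> Iterator[int]:
    """A regular (synchronous) generator used as sample input."""
    for value in range(start, start + count):
        yield value


if __name__ == "__main__":
    print(list(run_parallel([numbers(0, 3), numbers(10, 3)])))
```

Items from different generators may interleave in any order, while the order within each generator is preserved.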
Library Bugfixes
- fixes naive datetime bug in incremental by @rudolfix in #1020
- Import missing pyarrow compute for transforms on arrowitems by @sh-rp in #1010
- delete normalized package in case it already existed by @sh-rp in #1012
- fix(core): validation error with TTableHintTemplate by @IlyaFaer in #1039
- adds test case where payload data contains PUA unicode characters by @willi-mueller in #1053
- fix add_limit behavior in edge cases by @sh-rp in #1052
- adds row_order to Incremental - automatically stop taking data when out of range by @rudolfix in #1041
- Fix to serialize load metrics as list instead of a dictionary by @sultaniman in #1051
- fix import schema workflow by @sh-rp in #1013
- rollback all changes to live schemas when extraction fails by @sh-rp in #1013
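The row_order entry above ("automatically stop taking data when out of range") can be sketched in plain Python. This is an illustration of the idea only, not dlt's code: `take_in_range` and `counted` are hypothetical helpers, and the half-open range `[start, end)` is an assumption.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, Any]


def take_in_range(
    rows: Iterable[Row],
    cursor: str,
    start: Any,
    end: Any,
    row_order: Optional[str] = None,
) -> Iterator[Row]:
    """Yield rows whose cursor value lies in [start, end).

    Hypothetical helper. When the caller declares the row order, iteration
    stops as soon as a row proves that all remaining rows are out of range.
    """
    for row in rows:
        value = row[cursor]
        if start <= value < end:
            yield row
        elif row_order == "asc" and value >= end:
            break  # ascending input: every later row is also past the end
        elif row_order == "desc" and value < start:
            break  # descending input: every later row is also before the start


def counted(rows: Iterable[Row], seen: List[Any]) -> Iterator[Row]:
    """Record which rows were actually pulled from the source."""
    for row in rows:
        seen.append(row["id"])
        yield row


if __name__ == "__main__":
    pulled: List[Any] = []
    data = [{"id": i} for i in range(10)]
    print(list(take_in_range(counted(data, pulled), "id", 2, 5, row_order="asc")))
    print(pulled)  # shows how many rows were read before stopping
```

Without a declared order, the helper has to scan every row, because a later row could still fall inside the range.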
Docs
- Fix zendesk example test by @VioletM in #1027
- Edit arrow-pandas.md and fix a typo by @Bl3f in #1001
- Added info about file compression to filesystem docs by @dat-a-man in #975
- Update "create destination" docs with new file layouts by @steinitzu in #1032
- Docs update on how to set query limits. by @dat-a-man in #973
- Docs/Updated for slack alerts. by @dat-a-man in #1042
Verified Sources
- scrape web sites with spiders and Scrapy and send data to dlt @sultaniman dlt-hub/verified-sources#332
- sql_database recognizes end_value and row_order to return rows in range and optionally ordered. backfill and proper Airflow intervals support @rudolfix dlt-hub/verified-sources#388
Full Changelog: 0.4.5...0.4.6
0.4.5
Core Library
- enables google drive filesystem for sources and destinations (second one experimental, google drive listings are only eventually consistent!) by @IlyaFaer in #932
- creates parallel Airflow DAGs in airflow helper to allow many resources to be executed at once @IlyaFaer in #966
- 855 create bigquery adapter for dlt resources: easily configure partitions, clustering, data retention etc. by @Pipboyguy in #952 and https://dlthub.com/docs/dlt-ecosystem/destinations/bigquery#bigquery-adapter
- Use BIGNUMERIC for large decimals in bigquery by @steinitzu in #984
- Normalize keys for Google secrets config provider by @sultaniman in #963
- does not lowercase postgres and redshift database names by @rudolfix in #990
- Introduce hard_delete and dedup_sort columns hint for merge by @jorritsandbrink in #960 and https://dlthub.com/docs/general-usage/incremental-loading#delete-records
- adjustment of pua start in typed json encoding, pass through on decoding errors by @rudolfix in #974
- creates isolated parallel Airflow DAGs in airflow helper to execute resources parallel in isolated pipelines @IlyaFaer in #979
- Fix annotation processing and rebuilding, mark dataclass as complex by @sultaniman in #980
- allows async functions to be decorated with dlt.source by @rudolfix in #985
- allows right pipe operator to feed simple lists into a transformer @rudolfix in #985
- allows pendulum datetime as incremental cursor when loading arrow tables @rudolfix in #985
- enables Python 3.12 (mind that not all extras have python 3.12 libraries!) @rudolfix in #985
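The hard_delete and dedup_sort hints listed above can be pictured with a small in-memory merge. This is a conceptual sketch, not how dlt executes a merge in a destination: `merge_rows` is a hypothetical function, only the "keep the highest sort value" direction is shown, and "last row wins" when no sort column is given is an assumption.

```python
from typing import Any, Dict, Hashable, List, Optional

Row = Dict[str, Any]


def merge_rows(
    destination: Dict[Hashable, Row],
    incoming: List[Row],
    primary_key: str,
    dedup_sort: Optional[str] = None,
    hard_delete: Optional[str] = None,
) -> Dict[Hashable, Row]:
    """Merge incoming rows into a copy of the destination (hypothetical helper)."""
    # Step 1: deduplicate incoming rows per primary key.
    latest: Dict[Hashable, Row] = {}
    for row in incoming:
        key = row[primary_key]
        current = latest.get(key)
        if current is None or dedup_sort is None or row[dedup_sort] >= current[dedup_sort]:
            latest[key] = row

    # Step 2: apply the surviving rows, deleting the flagged ones.
    merged = dict(destination)
    for key, row in latest.items():
        if hard_delete is not None and row.get(hard_delete):
            merged.pop(key, None)  # flagged row removes the record
        else:
            merged[key] = row  # insert or replace
    return merged


if __name__ == "__main__":
    existing = {1: {"id": 1, "name": "a", "lsn": 1}, 2: {"id": 2, "name": "b", "lsn": 1}}
    changes = [
        {"id": 1, "name": "a2", "lsn": 3, "deleted": False},
        {"id": 1, "name": "a1", "lsn": 2, "deleted": False},
        {"id": 2, "name": "b", "lsn": 5, "deleted": True},
        {"id": 3, "name": "c", "lsn": 1, "deleted": False},
    ]
    print(merge_rows(existing, changes, "id", dedup_sort="lsn", hard_delete="deleted"))
```

The linked incremental-loading documentation describes the actual column hints and their accepted values.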
Docs
- docs(filesystem): include Google Drive into filesystem tutorial by @IlyaFaer in #962
- Fix typos/grammar in tutorial docs by @taljaards in #972
- add blog post observability by @adrianbr in #989
- Update arrow-pandas.md by @snehangsude in #992
- Clarify info about GoodData in modelling tools article by @mhauzirek in #956
- Fix small typings in contributing guide by @VioletM in #993
- Docs/google sheets update by @dat-a-man in #976
- Added "Incremental Configuration" section to SQL Databases documentat… by @dat-a-man in #977
Verified Sources
- Bing Webmaster source by @willi-mueller
New Contributors
- @taljaards made their first contribution in #972
- @mhauzirek made their first contribution in #956
- @snehangsude made their first contribution in #992
- @VioletM made their first contribution in #993
Full Changelog: 0.4.4...0.4.5
0.4.4
Core Library
- passes incremental from apply hints to resource function by @rudolfix in #953
- Handle UnionType when checking is_union_type and is_optional_type by @sultaniman in #951
- yanks orjson to <=0.3.10 by @rudolfix in #958
Docs
- Databricks workspace setup docs by @steinitzu in #949
Verified Source
- allows for table reflection at runtime, column selection and buffer control in sql_database @rudolfix (dlt-hub/verified-sources#351)
Full Changelog: 0.4.3...0.4.4
0.4.3
Core Library
- Databricks destination by @steinitzu and @phillem15 in #892
- Synapse destination by @jorritsandbrink in #900
- BigQuery Partitioning Improvements by @Pipboyguy in #887
- enable async generators as resources by @sh-rp in #905
- fix: use truthy value in ternary since 0 cause div by zero by @z3z1ma in #902
- feat(filesystem): add compression flag if the read file is GZ by @IlyaFaer in #912
- Enhancements in Filesystem Configuration by @Pipboyguy in #869
- add mark function to emit resource hints from decorated function by @rudolfix in #938
- handles nested Pydantic models when generating dlt schema by @sultaniman in #901
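The "use truthy value in ternary since 0 cause div by zero" fix above describes a common Python pitfall. The code touched by #902 is not shown in these notes, so the two functions below are hypothetical and only illustrate the pattern.

```python
from typing import Optional


def percent_done_buggy(count: int, total: Optional[int]) -> Optional[float]:
    # `is not None` lets total == 0 through, so the division raises.
    return count / total * 100 if total is not None else None


def percent_done(count: int, total: Optional[int]) -> Optional[float]:
    # Truthiness treats both None and 0 as "no usable total".
    return count / total * 100 if total else None


if __name__ == "__main__":
    print(percent_done(1, 4))
    print(percent_done(5, 0))
    try:
        percent_done_buggy(5, 0)
    except ZeroDivisionError as exc:
        print("buggy variant failed:", exc)
```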
Docs
- Restructure intro, getting started and tutorial by @burnash in #702
- Update the release instructions in CONTRIBUTING.md by @burnash in #867
- Add explicit sub section about streamlit under getting started by @sultaniman in #884
- Examples: google sheets by @AstrakhantsevaAA in #846
- Added URL-parser documentation by @dat-a-man in #909
Verified Sources
- feat(filesystem): implement a csv reader with duckdb engine @IlyaFaer dlt-hub/verified-sources#319
- fix(notion): define payload within the while-loop @glebzhidkov (dlt-hub/verified-sources#338)
- sql alchemy + connector x example @rudolfix (dlt-hub/verified-sources#334)
- Shopify: Standalone resource for partner API queries @steinitzu (dlt-hub/verified-sources#329)
- sql-database: detect precision and scale of supported column types @steinitzu (dlt-hub/verified-sources#324)
- feat(sources.kafka): implement Kafka source @IlyaFaer (dlt-hub/verified-sources#306)
New Contributors
- @Pipboyguy made their first contribution in #869
- @sultaniman made their first contribution in #883
Full Changelog: 0.4.2...0.4.3
0.4.2
Core Library
- Fix the data type used in the from_db_type() method from MsSqlTypeMapper by @jorritsandbrink in #863
- Use Secret Manager in CI by @AstrakhantsevaAA in #859
- Move destination adapters to dlt.destination.adapters by @rudolfix in #854
Docs
- Improve HubSpot source docs by @IlyaFaer in #864
- Add new topic to docs: Destination; improve Configuration docs by @rudolfix in #861
Full Changelog: 0.4.1...0.4.2
0.4.1
Major release
This is a major dlt release (as per our semantic versioning: https://github.com/dlt-hub/dlt?tab=readme-ov-file#adding-as-dependency). It brings several interesting new features, including schema evolution control, data contracts, deeper Pydantic integration, parametrized destinations, improvements to parallelism and data lineage, and many more.
There are no significant breaking changes, but minor ones exist; please refer to #763 for details.
Core Library
- Parametrized destinations - import destinations from dlt.destinations module and instantiate them by @steinitzu in #746
- schema and data contracts by @sh-rp in #594
- load package id in extract step by @rudolfix in #790
- named destinations: configure many destinations with different names by @sh-rp in #783
- rich tracing information from pipeline steps (extract, normalize, load) by @rudolfix in #801
- adds exception stack to pipeline trace by @rudolfix in #806
- fixed attribute check: getuid -> geteuid by @jorritsandbrink in #823
- allows to run parallel pipelines in separate threads by @rudolfix in #813
- 791 test mssql credentialspy is odbc driver 18 dependent by @jorritsandbrink in #834
- adds extract and normalize traces by @rudolfix in #839
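To give a feel for the "schema and data contracts" entry above, here is a conceptual sketch of how a column-level contract could treat a row that carries an unknown column. It is not dlt's implementation: `apply_column_contract` is a hypothetical function; the mode names (evolve, freeze, discard_row, discard_value) follow dlt's schema contract documentation.

```python
from typing import Any, Dict, Optional, Set

Row = Dict[str, Any]


def apply_column_contract(row: Row, known_columns: Set[str], mode: str) -> Optional[Row]:
    """Apply a column contract to one row (hypothetical helper).

    Returns the row to load, or None when the whole row is discarded.
    """
    unknown = [name for name in row if name not in known_columns]
    if not unknown:
        return row  # nothing new, the contract is not triggered
    if mode == "evolve":
        known_columns.update(unknown)  # accept and remember the new columns
        return row
    if mode == "freeze":
        raise ValueError(f"new columns not allowed: {unknown}")
    if mode == "discard_row":
        return None
    if mode == "discard_value":
        return {name: value for name, value in row.items() if name in known_columns}
    raise ValueError(f"unknown contract mode: {mode}")


if __name__ == "__main__":
    columns = {"id"}
    print(apply_column_contract({"id": 1, "extra": 2}, columns, "discard_value"))
    print(apply_column_contract({"id": 1, "extra": 2}, columns, "evolve"))
    print(columns)
```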
Plus some tooling changes
- introduce black formatting by @sh-rp in #583
- Fix: ensure accessor typing does not make static type checker error by @z3z1ma in #785
- Hot fix: add skipifgithubfork to nested_data example by @AstrakhantsevaAA in #802
- Fix Windows lint issue and implement CI lint matrix strategy by @jorritsandbrink in #827
Docs
- documents schema and data contract by @rudolfix in #782
- Added Kinesis documentation. by @dat-a-man in #804
- 788 clarify docs intro by @deanja in #797
- Fix links to source code by @AstrakhantsevaAA in #805
- Clarify docs dev process by @deanja in #809
- Qdrant ingestion pipeline example eg by @hibajamal in #775
- Personio doc: added more endpoints by @AstrakhantsevaAA in #829
New Contributors
- @deanja made their first contribution in #797
- @IlyaFaer made their first contribution in #820
- @jorritsandbrink made their first contribution in #823
Full Changelog: 0.3.25...0.4.1
0.4.1a2
🧪 pre-release of 0.4.x (do not use in production)
- fixed attribute check: getuid -> geteuid by @jorritsandbrink in #823
- allows to run parallel pipelines in separate threads by @rudolfix in #813
- parallel pipelines docs update: https://dlthub.com/devel/reference/performance#running-several-pipelines-in-parallel-in-single-process
New Contributors
- @IlyaFaer made their first contribution in #820
- @jorritsandbrink made their first contribution in #823
Full Changelog: 0.4.1a1...0.4.1a2
0.4.1a1
🧪 pre-release of 0.4.x (do not use in production)
- load_id is generated in extract step and carried till the end to improve data lineage by @rudolfix in #790
- added destination names, environment and ability to configure them by custom name by @sh-rp in #783
- step info (extract, normalize, load) contain list of load packages in traces by @rudolfix in #801
- adds exception traces to run trace by @rudolfix in #806
consult #763 for a list of major changes compared to 0.3.x version
Full Changelog: 0.4.1a0...0.4.1a1
0.4.1a0
🧪 pre-release of 0.4.x (do not use in production)
- Parametrized destinations by @steinitzu in #746
- schema contract by @sh-rp in #594
- source and schema changes by @sh-rp in #769
- introduce black formatting by @sh-rp in #583
- docs updates by @sh-rp in #784
- documents schema and data contract by @rudolfix in #782
- Fix: ensure accessor typing does not make static type checker error by @z3z1ma in #785
- prototype platform connection by @sh-rp in #727
Full Changelog: 0.3.24...0.4.1a0
0.3.25
Core Library
- Add authenticator to SnowflakeCredentials class by @gjdevincentis in #734
- Set port correctly in mssql connection string by @steinitzu in #731
- Empty rows fix by @steinitzu in #745
- Nit: Remove py.typed file that makes Pyright incessant by @z3z1ma in #732
- adds more name hashes to telemetry by @rudolfix in #764
- Autodetector for ISO date strings by @codingcyclist in #767
- pipeline: drop pending packages by @rudolfix in #771
Docs
- Copy improvements in the SQL Database verified source by @anuunchin in #749
- Docs: add Examples contributing doc by @AstrakhantsevaAA in #743
- Example: nested mongo data by @AstrakhantsevaAA in #737
- Documents how to use dbt wrapper without pipeline by @rudolfix in #733
New Contributors
- @gjdevincentis made their first contribution in #734
- @anuunchin made their first contribution in #749
Full Changelog: 0.3.24...0.3.25