v0.1.2 - 2019-09-11

@alecmocatta alecmocatta released this 11 Sep 00:09
6910ff5

Changes:

  • 6910ff5 v0.1.2
  • 18171b6 Merge pull request #4 from alecmocatta/granular-build
  • 700dce6 disable common_crawl example
  • c199389 fix common_crawl example depending on constellation
  • 7f1544f ignore parquet tests that relied on derive(Record)
  • 7ab8928 make constellation an optional dependency
  • b575b19 Merge remote-tracking branch 'template/master' into granular-build
  • f64bc9e Merge pull request #3 from alecmocatta/wip-label
  • 6093903 spaces rather than hyphens in wip label
  • 40c72d0 only build commoncrawl example when feature enabled
  • 54b96f5 conditionally build connectors
  • e9777bd Merge pull request #3 from alecmocatta/warc-parser
  • 8110789 Merge warc_nom_parser
  • 10399b2 Merge pull request #1 from alecmocatta/wip
  • a9a61cf retry all aws calls on http error
  • 6978df0 set empty features for ci doc build
  • 157ee87 fix parquet test failure
  • 33ac751 disable postgres test until we install it on CI
  • 2c29261 fix up crate structure
  • 062685c Merge amadeus-parquet
  • 0c0fd44 use a single refcounted bufreader
  • 26d4344 retry s3 read on http failure
  • af0a57a fix crashes in tokio and hyper by switching to shared threadpool
  • e77abbf working with parquet directory locally
  • 79eb81c working with parquet directory on s3
  • 037b6ab single parquet file on s3 working
  • 8f21199 introduce genericness over filesystem
  • 63134f9 fix temporary hangs in the thread and process pools; make the pools async; simplify the SerDe traits with ProcessSend
  • 8b9862a derive(Data) working in main crate
  • f55da91 build fixes
  • 5996e29 fmt
  • 6247b61 fix up docs
  • 9a6458f rename
  • 19f8675 organise into multiple crates
  • 8c943e4 Merge remote-tracking branch 'upstream/master'
  • 0f59a42 Merge branch 'dependabot/cargo/url-2.1'
  • 2319d8b impl Iterator returning WebpageOwned
  • 301850c clippy clean
  • afa2fa9 building again!
  • e2cbd18 avoid infinitely growing vecs in benchmarks
  • 318edf2 use oxidised flate
  • cdcf32c clippy clean derive
  • 85de9eb fix Cargo.toml
  • 6d719a3 fix build
  • 47524fd rustfmt
  • 171b446 Merge remote-tracking branch 'template/master'
  • 393bb02 Revert "Remove benchmarks, to be re-added in a separate PR"
  • c89a3c9 Revert "Remove get_row_iter argument"
  • 4982b27 Revert "Revert "Fix some UB; make clippy less unhappy; cleanup""
  • a6257fe rename crates
  • 4368e9a update dependencies
  • 92b6eb4 update version of parquet-format used
  • 6aa7858 remove arrow
  • 1370646 Revert "Remove #[derive(Record)] for another commit"
  • 209f126 strip down to just parquet
  • a3a857b fix tests
  • 6a724b3 Remove get_row_iter argument; document sound usage of unsafe
  • 0c3a6a2 Integrate @sunchao's feedback
  • 56cee06 Remove #[derive(Record)] for another commit
  • cef0a14 Don't rely on existential types; Docs now build; Doc improvements
  • ad20e15 Use a faster hashmap for Group, and wrap it in Arc rather than Rc so that Group, Value etc are Send; Return Err on overflowing Timestamp math
  • 58d3033 Implement From/Into<chrono::DateTime> for Timestamp
  • 8b119c3 Make derive macro more hygienic
  • b2109f2 Change derive const from DESERIALIZE -> RECORD
  • d2f5c37 Make RowIter own rather than borrow FileReader
  • 800ed3b Return errors thru iterator
  • b717fe7 Remove invalid map derived test
  • a1f6e60 More permissive with unknown LogicalTypes, Faster [u8; N]; Docs
  • f993f27 Document more and add invalid map test
  • 4f7fd4d Document various components
  • 66ffc7b Revert "Fix some UB; make clippy less unhappy; cleanup"
  • 9e8dc2f Avoid need for box_syntax feature
  • 73c3f03 List Rust <-> Parquet type correspondence
  • d10e56e Impl Default for Schemas where sensical; clean up interface
  • 91a5c1d Unbox result of FileReader::get_row_group so one can call get_row_iter on it
  • adfe532 Update time tests
  • ac567c6 Avoid resizing Group's Vec during read
  • d741e54 Clean up warnings
  • 6d0c451 Test parsing and printing of schemas
  • eb09562 Reflect upstream rustfmt config change
  • f66659d Better printing of schema
  • aae6114 Add projection tests using derive macro
  • bfaf031 Move Deserialize into record; minor cleanup
  • c5d168f Remove benchmarks, to be re-added in a separate PR
  • e3c81e2 Avoid Boxing all Value::Options
  • b55c0a1 Enable existential types; From and PartialEq for Value
  • 2b3355c Add benchmarks from original repo
  • 37d9c8e Remove unnecessary state from readers; Remove Root from Deserialize bounds so can deserialize foreign structs; Test #[derive(Deserialize)]; Better honour TripletIter API
  • f386de9 Fix derive macro on structs with type parameters
  • f07d038 Initial derive macro implementation
  • fd19a87 Add license headers
  • cb813e7 Avoid stack overflow by boxing RleDecoder's array
  • d4633b4 Implement Date, Time, Decimal, Bson, Json, Enum
  • 99d2d44 Make Value::as_* take &self; Add Value::into_*
  • 2ed208d Enable typed reading of records
  • 6f071a5 Nom upgrade 4 (#8)
  • 02e54e9 Fix misspelling of "parsing" in README (#7)
  • affa53f ARROW-6130: [Release] Use 0.15.0 as the next release [ #5007 ]
  • 9a1cba0 ARROW-6069: [Rust] [Parquet] Add converter. [ #4997 ]
  • 7a7e1c7 Update url requirement from 1.7 to 2.1
  • b818eba Merge pull request #2 from alecmocatta/auto-releases
  • 2608725 Add endpoint parameter for automating releases
  • 63f2da3 ARROW-4365: [Rust] [Parquet] Implement arrow record reader. [ #4292 ]
  • d3fd8fa ARROW-6047: [Rust] Rust nightly 1.38.0 builds failing [ #4954 ]
  • d1f50da Merge pull request #1 from alecmocatta/alecmocatta-patch-1
  • 4899781 Update README.md
  • 5b514df disable broken targets
  • d3324eb add azure pipeline, mergify, rustfmt, readme
  • f0b6f82 cargo init --lib
  • b53f630 ARROW-5788: [Rust] Use both "path" and "version" for internal dependencies [ #4873 ]
  • 639e4e0 ARROW-5753: [Rust] Fix test failure in CI code coverage [ #4748 ]
  • c4a0e89 ARROW-5792: [Rust] Add TypeVisitor for parquet type. [ #4766 ]
  • 277bd77 [Release] Update versions for 1.0.0-SNAPSHOT
  • 3ab3e96 [Release] Update versions for 0.14.0
  • 0b743f8 ARROW-5755: [Rust] [Parquet] Derive clone for Type. [ #4719 ]
  • b0407af ARROW-5721: [Rust] Move array related code into a separate module [ #4687 ]
  • e18d922 ARROW-5455: [Rust] Build broken by 2019-05-30 Rust nightly [ #4429 ]
  • 344265c ARROW-5371: [Release] Add tests for dev/release/00-prepare.sh [ #4343 ]
  • acfa308 PARQUET-1402: [C++] Parquet files with dictionary page offset as 0 is not readable [ #4359 ]
  • c96f24c ARROW-5317: [Rust] [Parquet] impl IntoIterator for SerializedFileReader [ #4323 ]
  • d883712 ARROW-5281: [Rust] Extract DataPageBuilder to test common [ #4269 ]
  • 47d1701 ARROW-5217: [Rust] [DataFusion] Fix failing tests [ #4217 ]
  • b0b7ab9 ARROW-5189: [Rust] [Parquet] Format / display individual fields within a parquet row [ #4174 ]
  • 21a6232 ARROW-5184: [Rust] Broken links and other documentation warnings [ #4172 ]
  • 86addef ARROW-4467: [Rust] [DataFusion] Create a REPL & Dockerfile for DataFusion [ #4147 ]
  • 34befd5 ARROW-5129: [Rust] Column writer bug: check dictionary encoder when adding a new data page [ #4152 ]
  • b925ce8 ARROW-5162: [Rust] [Parquet] Rename mod reader to arrow. [ #4145 ]
  • 9abec21 ARROW-5127: [Rust] [Parquet] Add page iterator. [ #4136 ]
  • 28f951d ARROW-5126: [Rust] [Parquet] Convert parquet column desc to arrow data type [ #4117 ]
  • c6f10a3 ARROW-5053: [Rust] [DataFusion] Use ARROW_TEST_DATA env var [ #4068 ]
  • e3eefb9 [Release] Update versions for 0.14.0-SNAPSHOT
  • ae9349f [Release] Update versions for 0.13.0
  • be9b0c1 ARROW-4908: [Rust] [DataFusion] Add support for date/time parquet types encoded as INT32/INT64 [ #3940 ]
  • 9add112 ARROW-4466: [Rust] [DataFusion] Add support for Parquet data source [ #3851, #2 ]
  • 72f03e8 ARROW-3954: [Rust] Add Slice to Array and ArrayData [ #3856 ]
  • 82c098e Refactor: par_iter; move bounds to iterator adapters rather than reducers
  • 6b7b3cb ARROW-4071: [Rust] Add rustfmt as a pre-commit hook [ #3753 ]
  • fabe106 ARROW-4072: [Rust] Set default value for PARQUET_TEST_DATA [ #3783 ]
  • a7f7e57 ARROW-4678: [Rust] Minimize unstable feature usage [ #3764 ]
  • 190b55a ARROW-4634: [Rust] [Parquet] Reorganize test_common [ #3710 ]
  • f94ec1f ARROW-4680: [CI] [Rust] Travis CI builds fail with latest Rust 1.34.0… [ #3756 ]
  • e07bcf1 impl Source for Json
  • 8c93d8e postgres working with Data; pre Data::schema
  • bfbd5c7 parquet, csv, json, nascent postgres
  • da7f371 ARROW-4525: [Rust] [Parquet] Enable conversion of ArrowError to ParquetError [ #3603 ]
  • 3fc43b5 ARROW-4476: [Rust] [DataFusion] Update README to cover DataFusion and new testing git submodule [ #3558 ]
  • db6de40 ARROW-4061: [Rust] [Parquet] Implement spaced version for non-diction… [ #3510 ]
  • 5967e24 ARROW-4263: [Rust] Donate DataFusion [ #3399 ]
  • 09fb56a ARROW-4459: [Testing] Add arrow-testing repo as submodule [ #3547 ]
  • 96e9377 Namespace all imports in the derive macro
  • 6975a9f Data trait, parquet and serde working
  • 7898332 pre cleanup
  • 0242fe5 ARROW-4393: [Rust] coding style: apply 90 characters per line limit [ #3501 ]
  • 60611ef PARQUET-1494: [C++] Recognize statistics built with UNSIGNED sort order by parquet-mr 1.10.0 onwards [ #3441 ]
  • 63c5c4b ARROW-4305: [Rust] Fix parquet version number in README [ #3437 ]
  • ca10537 [Release] Update versions for 0.13.0-SNAPSHOT
  • 3b3b145 ARROW-4271: [Rust] Move Parquet specific info to Parquet Readme [ #3412 ]
  • fa27998 move parquet stuff into parquet lib
  • 51fbf1b [Release] Update versions for 0.12.0
  • a1b4251 Schema: Debug + DebugType
  • 40e4622 ARROW-4060: [Rust] Add parquet arrow converter. [ #3279 ]
  • 6baef5e ARROW-4188: [Rust] Move Rust README to top level rust directory [ #3342 ]
  • 0998f82 ARROW-4151: [Rust] Restructure project directories [ #3325 ]
  • 7fb6b59 ARROW-4171: [Rust] fix parquet crate release version [ #3324 ]
  • 168f838 ARROW-4160: [Rust] Add README and executable files to parquet [ #3314 ]
  • a280e0d ARROW-4137: [Rust] Move parquet code into a separate crate [ #3291 ]
  • bc5b960 ARROW-4080: [Rust] Improving lengthy build times in Appveyor [ #3231 ]
  • 967a891 PARQUET-1481: [C++] Throw exception when encountering bad Thrift metadata in RecordReader [ #3242 ]
  • 2ab0c14 ARROW-2560: [Rust] The Rust README should include Rust-specific information on contributing [ #3210 ]
  • f24ee4e ARROW-4028: [Rust] Merge parquet-rs codebase [ #3050 ]
  • 6f8d4db dry-er
  • 0e8864e Value working, fixed timestamps
  • 4af7527 staticify
  • 06140e1 ARROW-3880: [Rust] Implement simple math operations for numeric arrays [ #3033 ]
  • cfbc823 ARROW-3885: [Rust] Release prepare step should increment Rust version [ #3096 ]
  • 1ca033f ARROW-3952: [Rust] Upgrade to Rust 2018 Edition [ #3119 ]
  • 00bc2cf pre struct
  • da061b1 parquet: elevate field names
  • 3149d4d parquet initial
  • e6637bb ARROW-3883: [Rust] Update README [ #3105 ]
  • e478358 minor cleanup
  • b4fd966 impl Clone for Cloudfront::Error
  • 726e871 add cloudfront logs support
  • f418ae6 ARROW-3855: [Rust] Schema/Field/Datatype now have derived serde traits [ #3016 ]
  • b15ceda add short-circuiting any and all methods, although they don't currently pre-empt already running tasks
  • 548e298 add common_crawl source, bump to 2018 edition
  • 0d77ea9 add more methods, incl some streaming algos
  • 3a81d98 ARROW-3726: [Rust] Add CSV reader with example [ #2992 ]
  • 693e1cf ARROW-3796: [Rust] Add Example for PrimitiveArrayBuilder [ #2969 ]
  • 3d14530 ARROW-3601: [Rust] Add instructions for publishing to crates.io [ #2823 ]
  • 0c646d9 ARROW-3664: [Rust] Add benchmark for PrimitiveArrayBuilder [ #2903 ]
  • 315e1bb minor cleanup
  • c038dbc cleanup
  • a1f2b16 initial commit
  • 06f83ca PARQUET-1160: [C++] Implement BYTE_ARRAY-backed Decimal reads [ #2646 ]
  • cf398c0 ARROW-3075: [C++] Incorporate parquet-cpp codebase into Arrow C++ build
  • c5a3f28 ARROW-2583: [Rust] Buffer should be typeless [ #2330 ]
  • e908bf3 ARROW-3035 [Rust] Examples in README.md do not run [ #2418 ]
  • 55ac71d streaming parser
  • 3bcedf6 ARROW-2908: [Rust] Update version to 0.10.0 [ #2321 ]
  • e20358f ARROW-2557: [Rust] Add badge for code coverage in README [ #2014 ]
  • ea75dde ARROW-2420: [Rust] Fix major memory bug and add benches [ #1860 ]
  • 2ef5b24 ARROW-2398: [Rust] Create Builder for building buffers directly in aligned memory [ #1838 ]
  • 59eb0b2 ARROW-2385: [Rust] implement to_json for DataType and Field [ #1829 ]
  • 43e47e4 [Rust] Update READMEs to add Rust libraries link and to remove out-of-data comment about memory alignment (#1817)
  • efe0480 ARROW-2361: [Rust] Starting point for a native Rust implementation of Arrow [ #1804 ]
  • e7b2d75 faster
  • 45df83e 1.0.1
  • af5d8d2 Update to nom 2.0 (#6)
  • da49a56 [wip] Streaming parsing (#5)
  • 09612d0 Upgrade from 0.3 to 1.2 (#4)
  • 5c1d6c5 Return IResult::Incomplete if the input buffer doesn't include a full record. (#3)
  • fea3ee8 Markdown cleanup
  • aba7c5b Markdown cleanup
  • a9f85ec Build button!
  • 941920d Clean up toml and test config
  • 0c34659 Update version to 1.0.0
  • 372bfc1 Update to build without warnings on Rust 1.8.0 (#2)
  • b45c44f Location updates
  • 9027e6b Best i can do with docs
  • 2a9904e Update doc
  • 551461d Light docs
  • dd14e88 Remove prints
  • e61d7c2 Works
  • 8a6320d Single test is working
  • 90f7e34 Fix warnings
  • d8cfc02 Format and shrink warc
  • b5312f8 Move the test file
  • bfe80f4 Test pass
  • a4729be Move to [u8]s
  • 53f8716 Merge pull request #1 from mbrubeck/sample-fix
  • 63dec8b Read test input as bytes instead of strings
  • 396fff4 Utf8 error again
  • 7edd899 IAH was not readable
  • 128b702 Parsed truncated bbc
  • e727c37 Move test file
  • 2348bd9 nom parser

This list of changes was auto generated.