[!NOTE]
This repo is archived. Please check out the Bolt project, which covers the same requirements.
ByteDance Bolt Parquet Reader
ByteDance’s next generation universal high-performance Parquet Reader.
Design Goal
The Bolt Parquet Reader is a native Parquet reader written in Rust.
The design supports streaming reads: a large batch is read as a sequence of smaller batches, which lowers peak memory usage. With less memory held per read, more reads can run concurrently, which increases overall parallelism.
In addition, the Bolt Parquet Reader is designed around filter push-down strategies and range-based selection. Because rows are read consecutively, operating on ranges of selected rows instead of individual rows reduces unnecessary branching.
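As an illustration of the streaming idea, here is a minimal sketch that uses only the Rust standard library. It is not the Bolt Parquet Reader API: `BatchRanges` and `stream_sum` are hypothetical names, and an in-memory slice stands in for an encoded Parquet column. It shows how splitting a read into fixed-size batches bounds the number of rows materialized at any one time.

```rust
/// Hypothetical helper (not part of the Bolt API): yields half-open row
/// ranges `[start, end)` of at most `batch_size` rows each.
pub struct BatchRanges {
    next_row: usize,
    total_rows: usize,
    batch_size: usize,
}

impl BatchRanges {
    pub fn new(total_rows: usize, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self { next_row: 0, total_rows, batch_size }
    }
}

impl Iterator for BatchRanges {
    type Item = std::ops::Range<usize>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.next_row >= self.total_rows {
            return None;
        }
        let start = self.next_row;
        // The last batch may be shorter than `batch_size`.
        let end = (start + self.batch_size).min(self.total_rows);
        self.next_row = end;
        Some(start..end)
    }
}

/// Stand-in for decoding: sums a column batch by batch and returns
/// `(sum, peak_rows_held)`. Only one batch is materialized at a time.
pub fn stream_sum(column: &[i64], batch_size: usize) -> (i64, usize) {
    let mut sum = 0i64;
    let mut peak = 0usize;
    for range in BatchRanges::new(column.len(), batch_size) {
        // Copying the slice simulates decoding one small batch into memory.
        let batch: Vec<i64> = column[range].to_vec();
        peak = peak.max(batch.len());
        sum += batch.iter().sum::<i64>();
    }
    (sum, peak)
}
```

In this sketch, `stream_sum(&column, 4)` on a ten-row column processes the ranges `0..4`, `4..8`, and `8..10`, so at most four rows are held at once regardless of the column length.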
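One way to picture range-based selection is the sketch below. It is an assumption about the general technique, not the project's actual implementation: `selection_ranges` and `take_ranges` are hypothetical names, and plain slices stand in for decoded columns. The filter is evaluated once on the filter column and turned into runs of consecutive selected rows; other columns are then copied one run at a time, without a per-row branch.

```rust
use std::ops::Range;

/// Hypothetical helper: evaluates `pred` on the filter column and returns
/// the selected rows as maximal runs of consecutive indices.
pub fn selection_ranges<T>(values: &[T], pred: impl Fn(&T) -> bool) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut run_start: Option<usize> = None;
    for (i, v) in values.iter().enumerate() {
        match (pred(v), run_start) {
            // A selected row opens a new run.
            (true, None) => run_start = Some(i),
            // A rejected row closes the current run.
            (false, Some(start)) => {
                ranges.push(start..i);
                run_start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the end of the column.
    if let Some(start) = run_start {
        ranges.push(start..values.len());
    }
    ranges
}

/// Hypothetical helper: materializes only the selected rows of another
/// column by copying whole runs instead of testing each row.
pub fn take_ranges<T: Clone>(values: &[T], ranges: &[Range<usize>]) -> Vec<T> {
    let selected = ranges.iter().map(|r| r.len()).sum::<usize>();
    let mut out = Vec::with_capacity(selected);
    for r in ranges {
        out.extend_from_slice(&values[r.clone()]);
    }
    out
}
```

For example, filtering the column `[1, 7, 8, 2, 9, 9, 3]` with `|v| *v > 5` gives the ranges `1..3` and `4..6`, and `take_ranges` then copies two contiguous slices from any other column of the same length.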
Roadmap
This project is under active development. Contributions are very welcome.
How to Compile
1. Pull the code
2. Prepare the Rust environment
3. Compile and Execute
License
The Bolt Parquet Reader is licensed under Apache 2.0.
During development, we drew heavily on the Rust Arrow 2 implementation and would like to express our appreciation to its authors.