This Week in Databend #98
PsiACE, Jun 18, 2023
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. It is an open source alternative to Snowflake, and is also available in the cloud: https://app.databend.com
What's On In Databend
Stay connected with the latest news about Databend.
Background Service
FuseTable, Databend's internal storage format, is similar to Apache Iceberg: it is a log-structured table that requires regular compaction, re-clustering, and vacuuming to merge small data chunks. These processes involve sorting the data by the cluster key or vacuuming unneeded branches.
Previously, different drivers were used for these tasks, which added complexity to the infrastructure: additional services had to be deployed and maintained to trigger driver events. To simplify this, we have introduced a Background Service that allows Databend to run as a one-shot background job or as a daemon that runs cron jobs. These jobs can trigger table maintenance tasks such as automatic compaction, vacuuming, and re-clustering based on criteria, without additional maintenance.
This implementation includes:
- Complete metasrv schema definitions, including background_job and background_tasks.
- APIs for updating and maintaining background_job and background_task state on meta-service.
- Simplified job scheduler implementation which supports the one_shot, interval, and cron job types.
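As a rough illustration of the job types listed above, here is a minimal Python sketch of how a scheduler might decide whether a one_shot or interval job is due. It is not Databend's implementation: the `BackgroundJob` record, its field names, and the `is_due` and `run_due_jobs` helpers are all hypothetical, and cron evaluation is left out because it needs a cron-expression parser.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class JobType(Enum):
    ONE_SHOT = "one_shot"
    INTERVAL = "interval"
    CRON = "cron"


@dataclass
class BackgroundJob:
    # Hypothetical job record; the fields are illustrative, not Databend's schema.
    name: str
    job_type: JobType
    interval_secs: Optional[float] = None  # only used by INTERVAL jobs
    last_run_at: Optional[float] = None    # seconds since epoch, None if never run


def is_due(job: BackgroundJob, now: float) -> bool:
    """Return True if the job should be triggered at time `now`."""
    if job.job_type is JobType.ONE_SHOT:
        # A one-shot job runs exactly once.
        return job.last_run_at is None
    if job.job_type is JobType.INTERVAL:
        if job.interval_secs is None:
            raise ValueError("interval job needs interval_secs")
        if job.last_run_at is None:
            return True
        return now - job.last_run_at >= job.interval_secs
    # Cron jobs need a cron-expression parser, which this sketch omits.
    raise NotImplementedError("cron evaluation is not part of this sketch")


def run_due_jobs(
    jobs: List[BackgroundJob],
    now: float,
    trigger: Callable[[BackgroundJob], None],
) -> List[str]:
    """Trigger every due job, record its run time, and return the triggered names."""
    ran = []
    for job in jobs:
        if is_due(job, now):
            trigger(job)
            job.last_run_at = now
            ran.append(job.name)
    return ran
```

In this sketch, `trigger` stands in for whatever starts the actual maintenance task (compaction, vacuum, or re-clustering); a daemon would call `run_due_jobs` in a loop, while a one-shot run would call it once and exit.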
If you are interested in learning more, please check out the resources listed below:
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
IceLake - Pure Rust Iceberg Implementation
Iceberg, an open table format for analytics, lacks a mature Rust binding, making integration with databases like Databend difficult. IceLake aims to fill this gap and build an open ecosystem that:
- Users can read/write iceberg tables from ANY storage services like s3, gcs, azblob, hdfs and so on.
- ANY Databases can integrate with icelake to facilitate reading and writing of iceberg tables.
- Provides NATIVE support for transmuting to and from arrow.
- Provides bindings so that other languages can work with iceberg tables powered by Rust core.
If you are interested in learning more, please check out the resources listed below:
Highlights
We have also made these improvements to Databend that we hope you will find helpful:
- Added support for MERGE JOIN.
- Added support for column positions in the CSV format.
- Read Docs | Computed Columns to understand how to use computed columns and the trade-offs when choosing which type to adopt.
- Read Docs | Subquery-Based Deletions to learn how to use subquery operators and comparison operators to achieve the desired deletion.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Support VALIDATION_MODE for COPY INTO
We hope to support VALIDATION_MODE for COPY INTO in the following modes:
- RETURN_ERRORS: This mode validates the data and returns all errors.
- RETURN_<number>_ROWS: This mode validates <number> rows of data. If there are no errors, it returns the loaded information. If any errors are encountered, it throws them.
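To make the difference between the two modes concrete, here is a small Python sketch of the behaviour described above. It is only a model of the proposal, not Databend code: `validate_rows` and the `validate_row` callback are hypothetical names, and returning the checked rows is a stand-in for "the loaded information".

```python
import re
from typing import Callable, List, Optional, Sequence, Tuple


def validate_rows(
    rows: Sequence[tuple],
    validate_row: Callable[[tuple], Optional[str]],
    mode: str,
):
    """Model the proposed VALIDATION_MODE semantics on an in-memory list of rows.

    `validate_row` returns None for a valid row, or an error message string.
    """
    if mode == "RETURN_ERRORS":
        # Validate everything and collect all errors instead of stopping early.
        errors: List[Tuple[int, str]] = []
        for index, row in enumerate(rows):
            error = validate_row(row)
            if error is not None:
                errors.append((index, error))
        return errors

    match = re.fullmatch(r"RETURN_(\d+)_ROWS", mode)
    if match:
        # Validate only the first <number> rows; raise on the first error.
        limit = int(match.group(1))
        checked = list(rows[:limit])
        for index, row in enumerate(checked):
            error = validate_row(row)
            if error is not None:
                raise ValueError(f"row {index}: {error}")
        # Assumption: the checked rows stand in for "the loaded information".
        return checked

    raise ValueError(f"unsupported VALIDATION_MODE: {mode}")
```

With a validator that rejects non-numeric values in the second column, RETURN_ERRORS would report every bad row with its index, whereas RETURN_2_ROWS would raise as soon as one of the first two rows fails.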
Issue #11582 | Feature: copy support VALIDATION_MODE
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.
New Contributors
We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.
- @jonahgao made their first contribution in #11718. Fixed column types of MySQLClient.
- @akoshchiy made their first contribution in #11783. Updated MACOSX_DEPLOYMENT_TARGET value.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.1.56-nightly...v1.1.64-nightly