This Week in Databend #136
PsiACE, Mar 18, 2024
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
What's New
Stay informed about the latest features of Databend.
Understanding Tasks and Notifications in Databend
Databend now supports a comprehensive mechanism for tasks and notifications.
Tasks are executed according to a schedule or based on a DAG of tasks, executing specified SQL statements. With notification integrations, notifications can be sent to external messaging services.
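CREATE TASK IF NOT EXISTS mytask
WAREHOUSE = 'mywh'
SCHEDULE = 30 SECOND
ERROR_INTEGRATION = 'myerror'
AS
-- The outer BEGIN ... END delimits the task body.
BEGIN
    -- The inner BEGIN; ... COMMIT; wraps both statements in one transaction.
    BEGIN;
    INSERT INTO mytable(ts) VALUES(CURRENT_TIMESTAMP);
    -- Keep only the rows from the last five minutes.
    DELETE FROM mytable WHERE ts < DATEADD(MINUTE, -5, CURRENT_TIMESTAMP());
    COMMIT;
END;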
The above example defines a task named mytask that runs every 30 seconds on the mywh compute cluster. The task executes a multi-statement transaction that includes an INSERT and a DELETE statement. Errors are reported through the notification integration named myerror.
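The DAG case mentioned earlier can be sketched in the same style. This is a minimal illustration rather than a verified setup: it assumes CREATE TASK accepts an AFTER clause naming a predecessor task, and the task name mytask_stats and the table mytable_stats are hypothetical names introduced only for this example.

```sql
-- Hypothetical downstream task: declared to run after mytask rather than on its own schedule.
-- Assumes a table mytable_stats(checked_at TIMESTAMP, row_count UInt64) already exists.
CREATE TASK IF NOT EXISTS mytask_stats
WAREHOUSE = 'mywh'
AFTER 'mytask'
AS
INSERT INTO mytable_stats
SELECT CURRENT_TIMESTAMP(), COUNT(*) FROM mytable;
```

With this layout, mytask keeps its schedule and acts as the root of the graph, while mytask_stats is chained to it through its dependency.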
The mechanisms related to tasks and notifications are ready to use out of the box in Databend Cloud. If you would like to learn more, please contact the Databend team or refer to the resources listed below:
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Databend vs. Snowflake: Data Ingestion Benchmark
We conducted four specific benchmarks to evaluate Databend Cloud versus Snowflake:
- TPC-H SF100 Dataset Loading: Focuses on loading performance and cost for a large-scale dataset (100GB, ~600 million rows).
- ClickBench Hits Dataset Loading: Tests efficiency in loading a wide-table dataset (76GB, ~100 million rows, 105 columns), emphasizing challenges associated with high column counts.
- 1-Second Freshness: Measures the platforms' ability to ingest data within a strict 1-second freshness requirement.
- 5-Second Freshness: Compares the platforms' data ingestion capabilities under a 5-second freshness constraint.
Data Loading Benchmark
Freshness Benchmark
See the following documentation to learn more about low-cost, high-performance data ingestion with Databend Cloud.
Highlights
We have also made these improvements to Databend that we hope you will find helpful:
- Added support for spill in CROSS JOIN.
- Added support for spill in the new aggregate hash table.
- Added support for refreshing inverted indexes.
- Added more function aliases for time and date related functions to support more data analysis tools.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Adding support for TOP K Syntax
Databend plans to support the SELECT TOP syntax.
The following SELECT TOP statement:
select TOP 4 c1 from testable ORDER BY c1;
is equivalent to the following SELECT ... LIMIT statement:
select c1 from testable order by c1 limit 4;
This is a good first issue, aimed at guiding everyone interested in Rust and Databend to participate.
Issue #14972 | Feature: top k syntax support
Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.
New Contributors
We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.
- @suimenno3002 implemented support for histogram aggregate functions, #14839.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.371-nightly...v1.2.378-nightly