This Week in Databend #86
PsiACE, Mar 24, 2023
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
What's On In Databend
Stay connected with the latest news about Databend.
FlightSQL Handler in Progress
Flight SQL is an innovative open database protocol that caters to modern architectures. It boasts a columnar-oriented design and provides seamless support for parallel processing of data partitions.
The benefits of supporting FlightSQL include reducing serialization and deserialization during query execution, as well as easily supporting SDKs in different languages using predefined `*.proto` files.
We're currently engaged in developing support for the FlightSQL Handler. If you're interested, refer to the following links:
Natural Language to SQL
By integrating with popular AI services, Databend now provides an efficient built-in solution: the `AI_TO_SQL` function.
With this function, instructions written in natural language can be converted into SQL query statements aligned with table schema. With just a few modifications (or possibly none at all), it can be put into production.
SELECT * FROM ai_to_sql(
'List the total amount spent by users from the USA who are older than 30 years, grouped by their names, along with the number of orders they made in 2022',
'<openai-api-key>');
*************************** 1. row ***************************
database: openai
generated_sql: SELECT name, SUM(price) AS total_spent, COUNT(order_id) AS total_orders
FROM users
JOIN orders ON users.id = orders.user_id
WHERE country = 'USA' AND age > 30 AND order_date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY name;
The function is now available on both Databend and Databend Cloud. To learn more about how it works, refer to the following links:
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Vector Similarity Calculation in Databend
Databend has added a new function called `cosine_distance`. It takes two input vectors, `from` and `to`, and returns the cosine distance between them:
select cosine_distance([3.0, 45.0, 7.0, 2.0, 5.0, 20.0, 13.0, 12.0], [2.0, 54.0, 13.0, 15.0, 22.0, 34.0, 50.0, 1.0]) as sim
----
0.1264193
The Rust implementation performs the calculation efficiently by using the `ArrayView` type from the ndarray crate:
pub fn cosine_distance(from: &[f32], to: &[f32]) -> Result<f32> {
    // Both vectors must have the same number of dimensions.
    if from.len() != to.len() {
        return Err(ErrorCode::InvalidArgument(format!(
            "Vector length not equal: {:} != {:}",
            from.len(),
            to.len(),
        )));
    }

    let a = ArrayView::from(from);
    let b = ArrayView::from(to);
    // Squared norms of each vector.
    let aa_sum = (&a * &a).sum();
    let bb_sum = (&b * &b).sum();

    // Distance is 1 - cosine similarity, which matches the 0.1264193 result above.
    Ok(1.0 - (&a * &b).sum() / ((aa_sum).sqrt() * (bb_sum).sqrt()))
}
Do you remember how to register scalar functions in Databend? You can check Doc | How to Write a Scalar Function and PR | #10737 to verify your answer.
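To sanity-check the SQL result above by hand, here is a minimal, dependency-free sketch of the same calculation. It is not Databend's implementation: it assumes cosine distance is defined as 1 minus cosine similarity, uses plain iterators instead of `ArrayView`, and returns `Option` rather than Databend's `Result`/`ErrorCode` types. The zero-norm check is an addition for illustration only.

```rust
/// Cosine distance between two equal-length vectors: 1 - cos(angle).
/// Returns None when the lengths differ or either vector has zero norm
/// (the zero-norm guard is an assumption added for this sketch).
fn cosine_distance(from: &[f32], to: &[f32]) -> Option<f32> {
    if from.len() != to.len() {
        return None;
    }
    // Dot product and Euclidean norms of the two vectors.
    let dot: f32 = from.iter().zip(to).map(|(a, b)| a * b).sum();
    let norm_from: f32 = from.iter().map(|a| a * a).sum::<f32>().sqrt();
    let norm_to: f32 = to.iter().map(|b| b * b).sum::<f32>().sqrt();
    if norm_from == 0.0 || norm_to == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_from * norm_to))
}

fn main() {
    // The same vectors used in the SQL example.
    let from = [3.0, 45.0, 7.0, 2.0, 5.0, 20.0, 13.0, 12.0];
    let to = [2.0, 54.0, 13.0, 15.0, 22.0, 34.0, 50.0, 1.0];
    let d = cosine_distance(&from, &to).expect("equal-length, non-zero vectors");
    println!("{d:.4}");
}
```

For these inputs the dot product is 4009 and the squared norms are 2825 and 7455, so the similarity is roughly 0.8736 and the distance roughly 0.1264, in line with the query output.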
Highlights
Here are some noteworthy items; perhaps you can find something that interests you.
- Learn how to monitor Databend using Prometheus and Grafana: Doc | Monitor - Prometheus & Grafana
- Metabase Databend Driver helps you connect Databend to Metabase and dashboard your data: Doc | Integrations - Metabase
- Databend now supports `PIVOT`, `UNPIVOT`, `GROUP BY CUBE` and `GROUP BY ROLLUP` query syntax. For more information, please see PR #10676 and #10601.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Enable `-Zgitoxide` to Speed up Git Dependencies Download
Enabling `-Zgitoxide` can speed up the download of crates that rely on Git dependencies.
This feature integrates cargo with gitoxide, a pure Rust implementation of Git that is idiomatic, lean, fast, and safe.
Issue #10466 | CI: Enable -Zgitoxide
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.
New Contributors
We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.
- @SkyFan2002 made their first contribution in #10656. This pull request aimed to resolve inconsistent results caused by variations in column name case while executing SQL statements with `EXCLUDE`.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.0.22-nightly...v1.0.33-nightly