We're excited to bring you the latest updates, new features, and improvements for Dec 2024 in Databend! We hope these enhancements are helpful, and we look forward to your feedback.
- Introducing Temp Tables
- New Data Type: Interval
- Support for User-Defined Aggregate Functions
- New Tutorial: Automating JSON Log Loading with Vector
- Enhanced Support for Parquet File Handling
- New Conversion Functions: TO_BINARY & TRY_TO_BINARY
- New String Function: JARO_WINKLER
- Introducing Private Cloud Management Platform
Introducing Temp Tables
You can now create a temporary table that is automatically dropped at the end of the session. A temporary table is visible only within the session that created it and is automatically dropped, with all data vacuumed, when the session ends. To create a temp table, use the CREATE TEMP TABLE command.
CREATE [ OR REPLACE ] { TEMPORARY | TEMP } TABLE
[ IF NOT EXISTS ]
[ <database_name>. ]<table_name>
...
New Data Type: Interval
The INTERVAL data type represents a duration of time, allowing precise manipulation and storage of time intervals across various units.
-- Create a table with one INTERVAL column
CREATE OR REPLACE TABLE intervals (duration INTERVAL);
-- Insert different types of INTERVAL data
INSERT INTO intervals VALUES
('1 year 2 months ago'), -- Natural language format with 'ago' (negative interval)
('1 year 2 months'), -- Natural language format without 'ago' (positive interval)
('1000000'), -- Positive numeric value interpreted as microseconds
('-1000000'); -- Negative numeric value interpreted as microseconds
-- Query the table to see the results
SELECT * FROM intervals;
┌──────────────────────────┐
│ duration │
├──────────────────────────┤
│ -1 year -2 months │
│ 1 year 2 months │
│ 0:00:01 │
│ -1 month -1 day -0:00:01 │
└──────────────────────────┘
Support for User-Defined Aggregate Functions
You can now define your own aggregate functions with Python and JavaScript.
CREATE or REPLACE FUNCTION weighted_avg (INT, INT) STATE {sum INT, weight INT} RETURNS FLOAT
LANGUAGE javascript AS $$
export function create_state() {
return {sum: 0, weight: 0};
}
export function accumulate(state, value, weight) {
state.sum += value * weight;
state.weight += weight;
return state;
}
export function retract(state, value, weight) {
state.sum -= value * weight;
state.weight -= weight;
return state;
}
export function merge(state1, state2) {
state1.sum += state2.sum;
state1.weight += state2.weight;
return state1;
}
export function finish(state) {
return state.sum / state.weight;
}
$$;
New Tutorial: Automating JSON Log Loading with Vector
In the tutorial, we simulate generating logs locally, collect them using Vector, store them in S3, and automate their ingestion into Databend Cloud using scheduled tasks.
To follow along, click here to access the full tutorial and start setting up your log ingestion pipeline.
Enhanced Support for Parquet File Handling
We’ve introduced new parameters to provide greater flexibility and precision when working with Parquet files.
Introducing the CASE_SENSITIVE Parameter
The CASE_SENSITIVE parameter determines whether column names in the queried Parquet files are treated with case sensitivity:
- (default): Column names are treated as case-insensitive, meaning
CASE_SENSITIVE => false
andb
are considered the same.B
- : Column names are treated as case-sensitive, meaning only exact matches (including case) are valid. For example, querying
CASE_SENSITIVE => true
will succeed if the column in the file is namedB
, but not if it is namedB
.b
For example, if you have a column named
MinTemp
CASE_SENSITIVE
false
SELECT MinTemp FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);
SELECT MINTEMP FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);
SELECT mintemp FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);
When
CASE_SENSITIVE
true
SELECT `MinTemp` FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>true);
New COPY INTO Option: COLUMN_MATCH_MODE
COLUMN_MATCH_MODE determines if column name matching during COPY INTO is case-sensitive or case-insensitive (default). For the full list of COPY INTO options, see copyOptions.
New Conversion Functions: TO_BINARY & TRY_TO_BINARY
TO_BINARY converts supported data types, including string, variant, bitmap, geometry, and geography, into their binary representation (hex format).
SELECT TO_BINARY('Databend');
┌───────────────────────┐
│ to_binary('Databend') │
├───────────────────────┤
│ 4461746162656E64 │
└───────────────────────┘
SELECT TO_BINARY(PARSE_JSON('{"key":"value", "number":123}')) AS binary_variant;
┌──────────────────────────────────────────────────────────────────────── ──┐
│ binary_variant │
├──────────────────────────────────────────────────────────────────────────┤
│ 40000002100000031000000610000005200000026B65796E756D62657276616C7565507B │
└──────────────────────────────────────────────────────────────────────────┘
SELECT TO_BINARY(TO_BITMAP('10,20,30')) AS binary_bitmap;
┌──────────────────────────────────────────────────────────────────────┐
│ binary_bitmap │
├──────────────────────────────────────────────────────────────────────┤
│ 0100000000000000000000003A3000000100000000000200100000000A0014001E00 │
└──────────────────────────────────────────────────────────────────────┘
SELECT TO_BINARY(ST_GEOMETRYFROMWKT('SRID=4326;POINT(1.0 2.0)')) AS binary_geometry;
┌────────────────────────────────────────────────────┐
│ binary_geometry │
├────────────────────────────────────────────────────┤
│ 0101000020E6100000000000000000F03F0000000000000040 │
└────────────────────────────────────────────────────┘
SELECT TO_BINARY(ST_GEOGRAPHYFROMEWKT('SRID=4326;POINT(-122.35 37.55)')) AS binary_geography;
┌────────────────────────────────────────────────────┐
│ binary_geography │
├────────────────────────────────────────────────────┤
│ 0101000020E61000006666666666965EC06666666666C64240 │
└───────────────────────────────────────────────── ───┘
TRY_TO_BINARY is an enhanced version of TO_BINARY that converts an input expression to a binary value, returning
NULL
SELECT TRY_TO_BINARY(PARSE_JSON(NULL)) AS binary_variant_invalid_json;
┌─────────────────────────────┐
│ binary_variant_invalid_json │
├─────────────────────────────┤
│ NULL │
└─────────────────────────────┘
New String Function: JARO_WINKLER
JARO_WINKLER calculates the Jaro-Winkler distance between two strings. It is commonly used for measuring the similarity between strings, with values ranging from 0.0 (completely dissimilar) to 1.0 (identical strings).
SELECT JARO_WINKLER('databend', 'Databend') AS similarity;
┌────────────────────┐
│ similarity │
├────────────────────┤
│ 0.9166666666666666 │
└────────────────────┘
SELECT JARO_WINKLER('databend', 'database') AS similarity;
┌────────────┐
│ similarity │
├────────────┤
│ 0.9 │
└────────────┘
Introducing Private Cloud Management Platform
Databend now supports private cloud deployments with the addition of a Private Cloud Management Platform, enabling seamless monitoring and management of Databend node statuses.
Subscribe to our newsletter
Stay informed on feature releases, product roadmap, support, and cloud offerings!