Blog

Databend Monthly (Dec 2024)

avatarEricDec 25, 2024
Databend Monthly (Dec 2024)

We're excited to bring you the latest updates, new features, and improvements for Dec 2024 in Databend! We hope these enhancements are helpful, and we look forward to your feedback.

Introducing Temp Tables

You can now create a temporary table that is automatically dropped at the end of the session. A temporary table is visible only within the session that created it and is automatically dropped, with all data vacuumed, when the session ends. To create a temp table, use the CREATE TEMP TABLE command.

Syntax
CREATE [ OR REPLACE ] { TEMPORARY | TEMP } TABLE 
[ IF NOT EXISTS ]
[ <database_name>. ]<table_name>
...

New Data Type: Interval

The INTERVAL data type represents a duration of time, allowing precise manipulation and storage of time intervals across various units.

-- Create a table with one INTERVAL column
CREATE OR REPLACE TABLE intervals (duration INTERVAL);

-- Insert different types of INTERVAL data
INSERT INTO intervals VALUES
('1 year 2 months ago'), -- Natural language format with 'ago' (negative interval)
('1 year 2 months'), -- Natural language format without 'ago' (positive interval)
('1000000'), -- Positive numeric value interpreted as microseconds
('-1000000'); -- Negative numeric value interpreted as microseconds

-- Query the table to see the results
SELECT * FROM intervals;

┌──────────────────────────┐
│ duration │
├──────────────────────────┤
-1 year -2 months │
1 year 2 months │
0:00:01
-1 month -1 day -0:00:01
└──────────────────────────┘

Support for User-Defined Aggregate Functions

You can now define your own aggregate functions with Python and JavaScript.

CREATE or REPLACE FUNCTION weighted_avg (INT, INT) STATE {sum INT, weight INT} RETURNS FLOAT
LANGUAGE javascript AS $$
export function create_state() {
return {sum: 0, weight: 0};
}
export function accumulate(state, value, weight) {
state.sum += value * weight;
state.weight += weight;
return state;
}
export function retract(state, value, weight) {
state.sum -= value * weight;
state.weight -= weight;
return state;
}
export function merge(state1, state2) {
state1.sum += state2.sum;
state1.weight += state2.weight;
return state1;
}
export function finish(state) {
return state.sum / state.weight;
}
$$;

New Tutorial: Automating JSON Log Loading with Vector

In the tutorial, we simulate generating logs locally, collect them using Vector, store them in S3, and automate their ingestion into Databend Cloud using scheduled tasks.

Automating JSON Log Loading with Vector

To follow along, click here to access the full tutorial and start setting up your log ingestion pipeline.

Enhanced Support for Parquet File Handling

We’ve introduced new parameters to provide greater flexibility and precision when working with Parquet files.

Introducing the CASE_SENSITIVE Parameter

The CASE_SENSITIVE parameter determines whether column names in the queried Parquet files are treated with case sensitivity:

  • CASE_SENSITIVE => false
    (default): Column names are treated as case-insensitive, meaning
    b
    and
    B
    are considered the same.
  • CASE_SENSITIVE => true
    : Column names are treated as case-sensitive, meaning only exact matches (including case) are valid. For example, querying
    B
    will succeed if the column in the file is named
    B
    , but not if it is named
    b
    .

For example, if you have a column named

MinTemp
in a Parquet file, you can query it using one of the following statements when
CASE_SENSITIVE
is set to
false
:

SELECT MinTemp FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);

SELECT MINTEMP FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);

SELECT mintemp FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>false);

When

CASE_SENSITIVE
is set to
true
, you must use the exact column name as it appears in the file:

SELECT `MinTemp` FROM '@mystage/weather.parquet'(CASE_SENSITIVE=>true);

New COPY INTO Option: COLUMN_MATCH_MODE

COLUMN_MATCH_MODE determines if column name matching during COPY INTO is case-sensitive or case-insensitive (default). For the full list of COPY INTO options, see copyOptions.

New Conversion Functions: TO_BINARY & TRY_TO_BINARY

TO_BINARY converts supported data types, including string, variant, bitmap, geometry, and geography, into their binary representation (hex format).

SELECT TO_BINARY('Databend');

┌───────────────────────┐
│ to_binary('Databend')
├───────────────────────┤
4461746162656E64 │
└───────────────────────┘

SELECT TO_BINARY(PARSE_JSON('{"key":"value", "number":123}')) AS binary_variant;

┌──────────────────────────────────────────────────────────────────────────┐
│ binary_variant │
├──────────────────────────────────────────────────────────────────────────┤
40000002100000031000000610000005200000026B65796E756D62657276616C7565507B │
└──────────────────────────────────────────────────────────────────────────┘

SELECT TO_BINARY(TO_BITMAP('10,20,30')) AS binary_bitmap;

┌──────────────────────────────────────────────────────────────────────┐
│ binary_bitmap │
├──────────────────────────────────────────────────────────────────────┤
0100000000000000000000003A3000000100000000000200100000000A0014001E00 │
└──────────────────────────────────────────────────────────────────────┘

SELECT TO_BINARY(ST_GEOMETRYFROMWKT('SRID=4326;POINT(1.0 2.0)')) AS binary_geometry;

┌────────────────────────────────────────────────────┐
│ binary_geometry │
├────────────────────────────────────────────────────┤
0101000020E6100000000000000000F03F0000000000000040 │
└────────────────────────────────────────────────────┘

SELECT TO_BINARY(ST_GEOGRAPHYFROMEWKT('SRID=4326;POINT(-122.35 37.55)')) AS binary_geography;

┌────────────────────────────────────────────────────┐
│ binary_geography │
├────────────────────────────────────────────────────┤
0101000020E61000006666666666965EC06666666666C64240 │
└────────────────────────────────────────────────────┘

TRY_TO_BINARY is an enhanced version of TO_BINARY that converts an input expression to a binary value, returning

NULL
if the conversion fails instead of raising an error.

SELECT TRY_TO_BINARY(PARSE_JSON(NULL)) AS binary_variant_invalid_json;

┌─────────────────────────────┐
│ binary_variant_invalid_json │
├─────────────────────────────┤
NULL
└─────────────────────────────┘

New String Function: JARO_WINKLER

JARO_WINKLER calculates the Jaro-Winkler distance between two strings. It is commonly used for measuring the similarity between strings, with values ranging from 0.0 (completely dissimilar) to 1.0 (identical strings).

SELECT JARO_WINKLER('databend', 'Databend') AS similarity;

┌────────────────────┐
│ similarity │
├────────────────────┤
0.9166666666666666
└────────────────────┘

SELECT JARO_WINKLER('databend', 'database') AS similarity;

┌────────────┐
│ similarity │
├────────────┤
0.9
└────────────┘

Introducing Private Cloud Management Platform

Databend now supports private cloud deployments with the addition of a Private Cloud Management Platform, enabling seamless monitoring and management of Databend node statuses.

Private Cloud Management Platform

Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!