Under the Hood: GPT-4-Powered QA at Databend
BohuTANG, Dec 4, 2023
Quality is foundational in the database industry. Databend is used in a wide range of applications, particularly in the finance sector, so the accuracy of query results is of the utmost importance to us. Maintaining product quality during rapid iteration is therefore a significant challenge.
As the Databend open-source community evolves rapidly, each new feature and each optimization of an existing one brings new testing challenges. We are committed to rigorous testing on every code update to ensure stability and catch potential issues early.
Databend is a modern cloud data warehouse that serves your massive-scale analytics needs at low cost and complexity. It is an open-source alternative to Snowflake and is also available in the cloud: https://app.databend.com.
Databend Testing Methods
To ensure the stability and reliability of the software, Databend employs testing methods that cover various aspects from code-level to system-level:
- Unit Tests
  - Unit tests are the foundation: they verify basic code functionality and logic.
  - They run automatically before each code submission so that potential issues are identified promptly.
- SQL Logic Tests
  - Databend incorporates extensive SQL logic tests from DuckDB, CockroachDB, and PostgreSQL, covering a wide range of SQL scenarios.
  - These tests help discover and fix potential issues, ensuring the accuracy of SQL queries.
- Compatibility Tests
  - Compatibility tests ensure backward compatibility with older versions, which allows a smooth transition to newer Databend versions and preserves business continuity.
- Perf Tests
  - Performance tests use the ClickBench hits dataset and TPCH-SF100 as benchmarks to validate that each version meets performance expectations.
- Longrun Tests
  - Longrun tests focus on the long-term effects of operations such as data writes, updates, and merges. They monitor CPU and memory stability to ensure Databend's long-term operational stability and reliability.
Except for Longrun Tests, all tests execute with each GitHub Pull Request to guarantee adherence to quality standards.
Integrating GPT-4 for New Models
The Databend team continuously seeks innovation. Recently, GPT-4 has been introduced to further enhance the testing process.
Dual-Slit Detection Model
For modifications involving core paths, we employ a dual-slit detection model for verification. This method validates a change by running the same SQL against the current Pull Request (PR) build and the main branch build and comparing the result sets. If the results are consistent, the change is considered safe. The quality of the SQL statements used for this validation is crucial, and this is precisely where we rely on GPT-4.
First, we guide GPT-4 to infer a random data generation method from the requirements, as demonstrated in setup.sql. Then, building on this data, GPT-4 generates the SQL statements used for validation, such as check.sql. These validation statements can be adjusted for different scenarios.
Next, we execute these SQL statements on both versions of Databend to verify the consistency of the result sets.
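The comparison step can be sketched roughly as follows. This is a minimal illustration, not Databend's actual test harness: two in-memory SQLite databases stand in for the main and PR builds, and the names `make_sqlite_runner`, `diff_result_sets`, `SETUP`, and `CHECKS` are hypothetical. In practice each runner would send SQL to a running Databend instance instead.

```python
import sqlite3
from typing import Callable, Iterable, List, Tuple

Row = Tuple
Runner = Callable[[str], List[Row]]


def make_sqlite_runner(setup_sql: str) -> Runner:
    """Stand-in for one build under test (assumption: SQLite replaces Databend here)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)  # plays the role of setup.sql

    def run(sql: str) -> List[Row]:
        return conn.execute(sql).fetchall()

    return run


def diff_result_sets(main: Runner, pr: Runner, checks: Iterable[str]) -> List[str]:
    """Return the check statements whose result sets differ between the two builds."""
    mismatches = []
    for sql in checks:
        # Sort by repr so row order is ignored and NULLs compare without errors.
        if sorted(main(sql), key=repr) != sorted(pr(sql), key=repr):
            mismatches.append(sql)
    return mismatches


# Hypothetical stand-ins for setup.sql and check.sql.
SETUP = """
CREATE TABLE t (id INTEGER, amount INTEGER);
INSERT INTO t VALUES (1, 10), (2, 20), (3, NULL);
"""
CHECKS = [
    "SELECT COUNT(*) FROM t",
    "SELECT SUM(amount) FROM t WHERE id > 1",
]

main_runner = make_sqlite_runner(SETUP)
pr_runner = make_sqlite_runner(SETUP)
mismatches = diff_result_sets(main_runner, pr_runner, CHECKS)
```

An empty `mismatches` list means the two builds agree on every check statement. Sorting the rows before comparing is a design choice that suits queries without an `ORDER BY`; checks that assert on ordering would need to compare the rows as returned.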
Result Set Correctness Model
To ensure the correctness of Databend's result sets, we use Snowflake as the reference. This method comprises three steps, each driven by its own SQL file:
File Name | Description |
---|---|
setup.sql | Builds tables and imports random datasets separately on Databend and Snowflake. |
action.sql | Executes data change operations, such as Replace/Merge, separately on Databend and Snowflake. |
check.sql | Executes and verifies the results separately on Databend and Snowflake. |
These SQL statements are generated by GPT-4 from the data schema in setup.sql, which makes them more complex and more randomized, and therefore more effective at detecting potential issues.
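The three-phase flow can be sketched as below. This is an illustration under stated assumptions rather than the real pipeline: both engines are in-memory SQLite databases standing in for Databend and Snowflake, SQLite's `INSERT OR REPLACE` stands in for the Replace/Merge operations, and `run_case` and the `*_SQL` constants are hypothetical names.

```python
import sqlite3
from typing import Dict, List, Tuple


def run_case(conn: sqlite3.Connection, setup: str, action: str,
             checks: List[str]) -> List[List[Tuple]]:
    """Run the three phases in order on one engine and return each check's rows."""
    conn.executescript(setup)   # setup.sql: build tables and load data
    conn.executescript(action)  # action.sql: apply data changes
    # check.sql: collect results, sorted so row order does not affect comparison
    return [sorted(conn.execute(q).fetchall(), key=repr) for q in checks]


# Hypothetical stand-ins for the three files.
SETUP_SQL = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER);
INSERT INTO orders VALUES (1, 5), (2, 7);
"""
ACTION_SQL = "INSERT OR REPLACE INTO orders VALUES (2, 9), (3, 1);"
CHECK_SQL = [
    "SELECT COUNT(*), SUM(qty) FROM orders",
    "SELECT id, qty FROM orders",
]

# Assumption: two SQLite connections play the candidate and reference engines.
engines: Dict[str, sqlite3.Connection] = {
    "candidate": sqlite3.connect(":memory:"),
    "reference": sqlite3.connect(":memory:"),
}
results = {
    name: run_case(conn, SETUP_SQL, ACTION_SQL, CHECK_SQL)
    for name, conn in engines.items()
}
consistent = results["candidate"] == results["reference"]
```

Running the same files separately on each engine keeps the two sides independent, so any divergence in `results` points to a difference in how the engines handled the setup, the data change, or the check query itself.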
Summary
The introduction of GPT-4 has significantly advanced Databend's testing process. Additional test sets have been released in the Databend Wizard project. With these GPT-4-generated testing models, Databend's quality and stability have taken a substantial leap forward, reaffirming that technology is the primary driver of productivity.