Leveraging Randomized Testing with SQLsmith to Uncover Bugs
b41shOct 24, 2023
Testing plays a crucial role in the development and maintenance of database systems. It helps verify functionality correctness and proactively identify potential issues to maintain performance and stability. Databend's CI already supports these test types:
- Unit Test: A unit test validates the functionality of minimum testable code units like functions and modules, and ensure they execute successfully and return the expected results.
- SQL Logic Test: A SQL logic test verifies SQL syntax and logic correctness using test cases, covering various scenarios.
- Performance Test: A performance test validates performance impact of new features and optimizations. This helps prevent performance regressions.
While these tests ensure functional correctness and stability during rapid development, they have limitations. Handwritten SQL queries are often simplistic and lack coverage of complex real-world scenarios, edge cases, and exceptions.
Introducing SQLsmith
SQLsmith is a randomized SQL query generator that produces a high volume of diverse test cases to simulate real-world variability. Compared to other testing approaches, SQLsmith improves test coverage to uncover more potential issues and bugs.
Implementing SQLsmith for Databend
The original SQLsmith for PostgreSQL fuzz testing was inspired by Csmith. Several well-known open-source databases have adapted SQLsmith for their own use, such as CockroachDB, TiDB, and RisingWave.
These open source SQLsmiths use diverse languages (C++, Go, Rust) with different syntax support suitable for their domains. We couldn't directly use them and had to build our own SQLsmith in Rust to fully support Databend's syntax and features.
SQLsmith has three main components:
- SQL Generator: Generates abstract syntax trees (ASTs) of various types
- SQL Reducer: Simplifies complex SQLs to pinpoint bugs
- Runner: Executes SQLs and records errors
SQL Generator
The SQL Generator randomly generates ASTs including:
- Data Types: Basic types, Nested types like ,
Array
,Map
etc.Tuple
- DDLs: ,
CREATE
,ALTER
etc. to create and modify test tables.DROP TABLE
- DMLs: ,
INSERT
,UPDATE
,DELETE
etc. to populate test data.MERGE
- Queries: ,
WITH
,SELECT
,JOIN
,SubQuery
etc.ORDER BY
andWITH
can generate complex queries.SubQuery
- Expressions: ,
Column
,Literal
andScalar Function
etc. Expressions can be nested.Aggregate Function
By randomly generating AST components, it covers all possible SQL syntax. Controlled recursion also creates complex nested SQLs to find obscure bugs. But depth is capped to avoid unexecutable SQLs.
SQL Reducer
Since generated SQLs can be very complex with the bug trigger buried, simplifying the original SQL makes bug isolation easier.
The SQL Reducer iteratively removes AST components like
WITH
SubQuery
Expression
Runner
The SQL Runner executes test SQL queries using Databend, and in case of failure, it proceeds with the following error-handling steps:
- Check for expected errors like syntax, semantics etc.
- Verify if it's a known issue or unimplemented feature.
- Call the SQL Reducer to generate the smallest reproducible SQL.
- Log errors and simplified SQLs.
The Runner is now integrated into Databend's CI, running with each release to log errors for further analysis.
Impact
After a month of running, SQLsmith has uncovered 50+ bugs in Databend, including:
- Internal logic errors (17 bugs)
- Invalid function/expression checks (12 bugs)
- Missing semantic checks (9 bugs)
- Improper /
unwrap
handling (7 bugs)unreachable
- Parser failures (3 bugs)
- Failed casts across types (4 bugs)
- Parquet I/O errors (1 bug)
Key Lessons Learned
Analyzing the uncovered bugs led to some key learnings to avoid common pitfalls:
- Thoroughly validate function parameters and edge cases. Main cases include:
- For parameters, check for empty strings.
String
- For parameters, check for large number values.
Int
- When supporting any parameter type, consider unusual types like ,
Null
,EmptyArray
etc.Bitmap
- When only supporting specific types, see if other types can auto-convert or check and return errors early.
- For
- Use unwrap judiciously. Explicitly handle and
Result
instead of assuming success.Option
- Understand SQL semantic rules during development e.g. constraints around ,
GROUP BY
etc. Perform semantic checks in the Binder phase to prevent runtime errors.ORDER BY
- Add more unit test cases for critical modules to prevent bugs from internal logic errors.
What's Next
SQLsmith helps enhance Databend's stability and reliability by uncovering hidden issues. Upcoming improvements include:
- Add support for more SQL features like , Computed Column etc.
UNION
- Introduce more configurations like expression nesting depths, query complexity.
- Improve SQL Reducer.
- Optimize query executions and result analysis.
Subscribe to our newsletter
Stay informed on feature releases, product roadmap, support, and cloud offerings!