Cassandra-SQL provides extensive PostgreSQL-compatible SQL support in KV mode. This document details supported SQL syntax and gaps compared to PostgreSQL.
✅ Fully Supported
SELECT [DISTINCT] column_list
FROM table_name [alias]
[WHERE condition]
[GROUP BY column_list]
[HAVING condition]
[ORDER BY column_list [ASC|DESC]]
[LIMIT count] [OFFSET offset]Features:
- Column projection (
SELECT col1, col2) - Wildcard selection (
SELECT *) - Column aliases (
SELECT col AS alias) - Literal values (
SELECT 'constant' AS name) - Arithmetic expressions (
SELECT price * quantity AS total) - DISTINCT
- WHERE clause with complex predicates
- ORDER BY (single or multi-column)
- LIMIT and OFFSET
- Subqueries (see below)
Examples:
-- Basic SELECT
SELECT id, name, email FROM users WHERE age > 18;
-- With ORDER BY and LIMIT
SELECT * FROM products ORDER BY price DESC LIMIT 10;
-- With expressions
SELECT id, price * 1.1 AS price_with_tax FROM products;
-- With DISTINCT
SELECT DISTINCT category FROM products;✅ Fully Supported
SELECT columns
FROM table1 [alias1]
[INNER | LEFT | RIGHT | FULL | CROSS] JOIN table2 [alias2]
ON join_condition
[WHERE condition]
[ORDER BY columns]
[LIMIT count]Supported JOIN Types:
INNER JOIN- Returns matching rows from both tablesLEFT JOIN(LEFT OUTER JOIN) - All rows from left table + matching from rightRIGHT JOIN(RIGHT OUTER JOIN) - All rows from right table + matching from leftFULL JOIN(FULL OUTER JOIN) - All rows from both tablesCROSS JOIN- Cartesian product
Multi-way JOINs (3+ tables):
SELECT *
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
INNER JOIN products p ON o.product_id = p.id
WHERE o.status = 'shipped';Examples:
-- INNER JOIN
SELECT u.name, o.order_date
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
-- LEFT JOIN
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.name;
-- FULL JOIN
SELECT *
FROM table_a a
FULL JOIN table_b b ON a.id = b.id;
-- CROSS JOIN
SELECT *
FROM colors
CROSS JOIN sizes;✅ Fully Supported
SELECT aggregate_functions
FROM table_name
[WHERE condition]
[GROUP BY column_list]
[HAVING condition]Aggregate Functions:
COUNT(*),COUNT(column),COUNT(DISTINCT column)SUM(column)AVG(column)MIN(column)MAX(column)
Examples:
-- Simple aggregation
SELECT COUNT(*) FROM users;
-- GROUP BY
SELECT category, COUNT(*), AVG(price)
FROM products
GROUP BY category;
-- HAVING clause
SELECT category, COUNT(*) AS count
FROM products
GROUP BY category
HAVING COUNT(*) > 10;
-- Multiple aggregates
SELECT
COUNT(*) AS total_orders,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_order_value
FROM orders
WHERE status = 'completed';✅ Fully Supported
Types:
- Scalar subqueries (returns single value)
- IN subqueries
- EXISTS subqueries
- Correlated subqueries
Examples:
-- IN subquery
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders WHERE total > 1000);
-- EXISTS subquery
SELECT * FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id
);
-- Correlated subquery
SELECT u.name,
(SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) AS order_count
FROM users u;
-- Subquery in FROM clause
SELECT category, avg_price
FROM (
SELECT category, AVG(price) AS avg_price
FROM products
GROUP BY category
) AS subquery
WHERE avg_price > 100;✅ Fully Supported
SELECT column,
window_function() OVER (
[PARTITION BY partition_columns]
[ORDER BY order_columns]
[ROWS|RANGE frame_clause]
)
FROM table_nameSupported Functions:
ROW_NUMBER()- Sequential number within partitionRANK()- Rank with gapsDENSE_RANK()- Rank without gapsLAG(column, offset)- Previous row valueLEAD(column, offset)- Next row valueFIRST_VALUE(column)- First value in windowLAST_VALUE(column)- Last value in window
Examples:
-- ROW_NUMBER
SELECT id, name, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- PARTITION BY
SELECT department, name, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
-- LAG/LEAD
SELECT date, revenue,
LAG(revenue, 1) OVER (ORDER BY date) AS prev_revenue,
LEAD(revenue, 1) OVER (ORDER BY date) AS next_revenue
FROM daily_sales;✅ Fully Supported
SELECT columns FROM table1
UNION [ALL]
SELECT columns FROM table2
[ORDER BY columns]
[LIMIT count]Examples:
-- UNION (removes duplicates)
SELECT name FROM customers
UNION
SELECT name FROM suppliers;
-- UNION ALL (keeps duplicates)
SELECT product_id FROM orders_2023
UNION ALL
SELECT product_id FROM orders_2024;
-- With ORDER BY and LIMIT
SELECT name, 'customer' AS type FROM customers
UNION ALL
SELECT name, 'supplier' AS type FROM suppliers
ORDER BY name
LIMIT 100;✅ Fully Supported
INSERT INTO table_name [(column_list)]
VALUES (value_list) [, (value_list), ...]Features:
- Single row insert
- Multi-row insert
- Auto-increment columns (SERIAL, GENERATED AS IDENTITY)
- DEFAULT values
- NULL handling
Examples:
-- Basic INSERT
INSERT INTO users (name, email, age)
VALUES ('John Doe', 'john@example.com', 30);
-- Multi-row INSERT
INSERT INTO products (name, price)
VALUES
('Product A', 19.99),
('Product B', 29.99),
('Product C', 39.99);
-- With auto-increment
INSERT INTO orders (user_id, total)
VALUES (1, 99.99); -- id is auto-generated
-- With DEFAULT
INSERT INTO users (name, email, created_at)
VALUES ('Jane', 'jane@example.com', DEFAULT);✅ Fully Supported
UPDATE table_name
SET column = value [, column = value, ...]
[WHERE condition]Features:
- Single/multiple column updates
- Arithmetic expressions (
balance = balance + 100) - WHERE clause filtering
- Automatic index maintenance
Examples:
-- Basic UPDATE
UPDATE users SET age = 31 WHERE id = 1;
-- Multiple columns
UPDATE products
SET price = 29.99, stock = 100
WHERE id = 5;
-- Arithmetic expression
UPDATE accounts
SET balance = balance + 100
WHERE id = 123;
-- Complex WHERE
UPDATE orders
SET status = 'shipped'
WHERE status = 'pending' AND created_at < '2024-01-01';✅ Fully Supported
DELETE FROM table_name
[WHERE condition]Features:
- WHERE clause filtering
- Automatic index cleanup
- Transactional (ACID)
Examples:
-- Delete specific rows
DELETE FROM users WHERE age < 18;
-- Delete with complex condition
DELETE FROM orders
WHERE status = 'cancelled' AND created_at < '2023-01-01';
-- Delete all rows (use TRUNCATE for better performance)
DELETE FROM temp_table;✅ Fully Supported
CREATE TABLE [IF NOT EXISTS] table_name (
column_name data_type [constraints],
...
[PRIMARY KEY (column_list)]
)Supported Data Types:
INT,INTEGER- 32-bit integerBIGINT- 64-bit integerSMALLINT- 16-bit integerSERIAL- Auto-incrementing integerBIGSERIAL- Auto-incrementing bigintTEXT,VARCHAR(n)- Variable-length stringDOUBLE PRECISION,DOUBLE,FLOAT- Double precision floatBOOLEAN,BOOL- BooleanJSON,JSONB- JSON dataUUID- UUIDTIMESTAMP- Timestamp (future)DATE- Date (future)
Constraints:
PRIMARY KEY- Single or compositeNOT NULL- Non-nullable columnGENERATED ALWAYS AS IDENTITY- Auto-increment (PostgreSQL 10+)GENERATED BY DEFAULT AS IDENTITY- Auto-increment with override
Examples:
-- Basic table
CREATE TABLE users (
id INT PRIMARY KEY,
name TEXT NOT NULL,
email TEXT,
age INT
);
-- With auto-increment
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
total DOUBLE PRECISION,
status TEXT
);
-- Composite primary key
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- With GENERATED AS IDENTITY
CREATE TABLE products (
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL,
price DOUBLE PRECISION
);
-- IF NOT EXISTS
CREATE TABLE IF NOT EXISTS logs (
id BIGSERIAL PRIMARY KEY,
message TEXT,
created_at TIMESTAMP
);Hidden rowid Column:
If no PRIMARY KEY is specified, Cassandra-SQL automatically adds a hidden rowid column (similar to PostgreSQL's ctid).
✅ Fully Supported
CREATE [UNIQUE] INDEX [IF NOT EXISTS] index_name
ON table_name (column_list)Features:
- Single-column indexes
- Multi-column (composite) indexes
- UNIQUE indexes (future)
- Automatic index maintenance
Examples:
-- Single-column index
CREATE INDEX idx_users_email ON users (email);
-- Multi-column index
CREATE INDEX idx_orders_user_status ON orders (user_id, status);
-- IF NOT EXISTS
CREATE INDEX IF NOT EXISTS idx_products_category ON products (category);✅ Fully Supported
DROP TABLE [IF EXISTS] table_name [, table_name, ...]Features:
- Single or multiple tables
- IF EXISTS clause
- Lazy deletion (VacuumJob cleans up data)
Examples:
-- Drop single table
DROP TABLE users;
-- Drop multiple tables
DROP TABLE orders, order_items;
-- IF EXISTS
DROP TABLE IF EXISTS temp_table;✅ Fully Supported
DROP INDEX [IF EXISTS] index_name✅ Fully Supported
TRUNCATE TABLE table_nameFast O(1) operation using table versioning. Data is marked as invisible and cleaned up by VacuumJob.
ALTER TABLE table_name
ADD PRIMARY KEY (column_list)Supported:
ADD PRIMARY KEY- Add primary key constraint
Not Yet Supported:
ADD COLUMN- Add new columnDROP COLUMN- Remove columnALTER COLUMN- Modify column typeADD CONSTRAINT- Add constraints (FOREIGN KEY, CHECK, etc.)
✅ Fully Supported
BEGIN [TRANSACTION]
START TRANSACTION
COMMIT [TRANSACTION]
ROLLBACK [TRANSACTION]
ABORTFeatures:
- ACID transactions
- Snapshot isolation
- Optimistic concurrency control
- Automatic conflict detection
Examples:
-- Transfer money between accounts
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- Rollback on error
BEGIN;
INSERT INTO users (name, email) VALUES ('Test', 'test@example.com');
-- ... error occurs ...
ROLLBACK;✅ Fully Supported
EXPLAIN [ANALYZE] queryShows query execution plan and cost estimates. With ANALYZE, also executes the query and shows actual statistics.
Examples:
-- Show query plan
EXPLAIN SELECT * FROM users WHERE age > 18;
-- Show plan with execution statistics
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';✅ Supported (No-op)
VACUUM
ANALYZEThese commands are no-ops in KV mode. Background jobs handle cleanup and statistics collection automatically.
COPY table_name FROM STDIN
COPY table_name TO STDOUTBasic support for bulk data loading/export.
✅ Supported
\dt- List tables\di- List indexes\d table_name- Describe table\l- List databases\c database_name- Connect to database\q- Quit
✅ Fully Supported
- Comparison:
=,!=,<>,<,<=,>,>= - Logical:
AND,OR,NOT - Pattern matching:
LIKE,NOT LIKE - Range:
BETWEEN,NOT BETWEEN - Set membership:
IN,NOT IN - NULL testing:
IS NULL,IS NOT NULL
Examples:
-- Comparison
SELECT * FROM users WHERE age >= 18;
-- Logical operators
SELECT * FROM products WHERE price > 10 AND price < 100;
-- LIKE
SELECT * FROM users WHERE email LIKE '%@gmail.com';
-- BETWEEN
SELECT * FROM orders WHERE total BETWEEN 100 AND 500;
-- IN
SELECT * FROM users WHERE status IN ('active', 'pending');
-- IS NULL
SELECT * FROM users WHERE deleted_at IS NULL;
-- Complex conditions
SELECT * FROM orders
WHERE (status = 'pending' OR status = 'processing')
AND total > 100
AND created_at >= '2024-01-01';| Feature | Status | Notes |
|---|---|---|
| Basic SELECT | ✅ | Full support |
| JOINs (all types) | ✅ | INNER, LEFT, RIGHT, FULL, CROSS |
| Aggregations | ✅ | COUNT, SUM, AVG, MIN, MAX |
| GROUP BY / HAVING | ✅ | Full support |
| Subqueries | ✅ | Scalar, IN, EXISTS, correlated |
| Window functions | ✅ | ROW_NUMBER, RANK, LAG, LEAD, etc. |
| UNION / UNION ALL | ✅ | Full support |
| INSERT / UPDATE / DELETE | ✅ | Full support |
| Transactions | ✅ | ACID with snapshot isolation |
| CREATE/DROP TABLE | ✅ | Full support |
| CREATE/DROP INDEX | ✅ | Single and multi-column |
| ORDER BY / LIMIT / OFFSET | ✅ | Full support |
| DISTINCT | ✅ | Full support |
| Auto-increment | ✅ | SERIAL, GENERATED AS IDENTITY |
| Feature | Status | Limitations |
|---|---|---|
| ALTER TABLE | Only ADD PRIMARY KEY supported | |
| Data types | Missing TIMESTAMP, DATE, INTERVAL, ARRAY | |
| COPY | Basic support only | |
| Constraints | No FOREIGN KEY, CHECK, UNIQUE yet | |
| Functions | Limited built-in functions | |
| CASE expressions | Not yet implemented | |
| CTEs (WITH) | Not yet implemented | |
| Recursive queries | Not yet implemented |
| Feature | Status | Priority |
|---|---|---|
| FOREIGN KEY constraints | ❌ | High |
| CHECK constraints | ❌ | Medium |
| UNIQUE constraints | ❌ | High |
| DEFAULT values (complex) | ❌ | Medium |
| Views | ❌ | High |
| Materialized views | ❌ | Medium |
| Triggers | ❌ | Low |
| Stored procedures | ❌ | Low |
| User-defined functions | ❌ | Low |
| CTEs (WITH clause) | ❌ | High |
| CASE expressions | ❌ | High |
| CAST / type conversion | ❌ | Medium |
| String functions | ❌ | Medium |
| Date/time functions | ❌ | High |
| Math functions | ❌ | Medium |
| ARRAY data type | ❌ | Medium |
| TIMESTAMP / DATE types | ❌ | High |
| INTERVAL type | ❌ | Medium |
| Sequences (CREATE SEQUENCE) | ❌ | Medium |
| UPSERT (INSERT ... ON CONFLICT) | ❌ | High |
| RETURNING clause | ❌ | Medium |
| Lateral joins | ❌ | Low |
| Recursive CTEs | ❌ | Low |
| Table partitioning | ❌ | Low |
| Inheritance | ❌ | Low |
- Index scans: Automatically used when beneficial
- Predicate pushdown: WHERE clauses pushed to storage layer
- Projection pushdown: Only requested columns retrieved
- JOIN optimization: Cost-based join order selection
- LIMIT pushdown: Early termination for LIMIT queries
- Full table scans on large tables: Use indexes
- SELECT * on wide tables: Project only needed columns
- Unbounded queries: Always use LIMIT for large result sets
- Complex subqueries in WHERE: Consider JOINs instead
- High-contention transactions: Can cause conflicts
Most PostgreSQL queries work as-is:
-- These work identically in Cassandra-SQL
SELECT * FROM users WHERE age > 18 ORDER BY name LIMIT 10;
INSERT INTO orders (user_id, total) VALUES (1, 99.99);
UPDATE products SET price = price * 1.1 WHERE category = 'electronics';
DELETE FROM logs WHERE created_at < '2023-01-01';-- PostgreSQL: RETURNING clause
INSERT INTO users (name) VALUES ('John') RETURNING id;
-- Cassandra-SQL: Query after insert
INSERT INTO users (name) VALUES ('John');
SELECT id FROM users WHERE name = 'John' ORDER BY id DESC LIMIT 1;
-- PostgreSQL: UPSERT
INSERT INTO users (id, name) VALUES (1, 'John')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
-- Cassandra-SQL: Separate UPDATE or INSERT
-- (Check existence first, then INSERT or UPDATE)
-- PostgreSQL: CTEs
WITH recent_orders AS (
SELECT * FROM orders WHERE created_at > '2024-01-01'
)
SELECT * FROM recent_orders WHERE total > 100;
-- Cassandra-SQL: Use subquery
SELECT * FROM (
SELECT * FROM orders WHERE created_at > '2024-01-01'
) AS recent_orders WHERE total > 100;Use the provided test scripts:
# Run SQL compatibility tests
./gradlew test --tests "*KvQueryExecutor*"
# Run JOIN tests
./gradlew test --tests "*JoinTest*"
# Run transaction tests
./gradlew test --tests "*TransactionTest*"See FEATURE_ROADMAP.md for planned features and priorities.