---
name: matlab-use-duckdb
description: "Generates MATLAB code for DuckDB database operations using Database Toolbox. Use when connecting to DuckDB (in-memory or file-based), querying CSV/Parquet/JSON files with SQL, creating development databases, preprocessing out-of-memory data, using duckdb(), installing DuckDB extensions, or using DuckDB as an analytical engine in MATLAB."
license: MathWorks BSD-3-Clause
metadata:
  author: MathWorks
  version: "1.0"
---
# MATLAB Database Toolbox Interface to DuckDB
Use when working with DuckDB databases from MATLAB using Database Toolbox. DuckDB is an embedded analytical database engine that ships with Database Toolbox starting in R2026a. It enables SQL-based analytics on files, out-of-memory data preprocessing, and portable development databases — all without external database server configuration.
## When to Use This Skill
- Connecting to a DuckDB database (in-memory or file-based)
- Creating a new DuckDB database file for development workflows
- Querying CSV, Parquet, or JSON files directly with SQL
- Preprocessing large data that doesn't fit in memory before importing into MATLAB
- Using DuckDB as an analytical engine for filtering, aggregation, joins, or sorting
- Installing and using DuckDB extensions
- User mentions keywords: DuckDB, duckdb, analytical engine, embedded database, parquet, CSV analytics, in-memory database, portable database, development database, out-of-memory preprocessing
## When NOT to Use
- Connecting to MySQL, PostgreSQL, SQLite, or other external databases — use their native interfaces or JDBC/ODBC
- Data fits in memory and only needs standard MATLAB operations — use `readtable`/`readmatrix` directly
- Object-relational mapping — use ORM (`ormread`/`ormwrite` with `Mappable` classes)
- MongoDB, Cassandra, or Neo4j — use their dedicated Database Toolbox interfaces
## What Is DuckDB and Why Does Database Toolbox Ship It?
DuckDB is an embedded, serverless analytical database engine. Unlike MySQL or PostgreSQL, it requires no server, no configuration, and runs in-process within MATLAB.
**Why it ships with Database Toolbox (R2026a+):**
- Zero-config database — `conn = duckdb()` gives you a full SQL engine instantly.
- Analytical engine for files — Query CSV, Parquet, and JSON files directly with SQL without loading them into memory.
- Out-of-memory preprocessing — Filter, aggregate, join, and sort datasets larger than memory, then bring only results into MATLAB.
- Portable development databases — `.duckdb` or `.db` files work on any machine with Database Toolbox. No database setup needed.
- AI agent advantage — An agent's SQL knowledge directly translates to powerful analytical queries.
DuckDB does NOT replace MATLAB's file I/O (`readtable`, etc.). It is a performant alternative when data exceeds memory or when SQL operations are more natural than MATLAB table operations.
## Critical Rules
### Connection
- ALWAYS use `duckdb()` to connect — not `database()`, not JDBC, not ODBC.
- ALWAYS verify with `isopen(conn)` and close with `close(conn)`.
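The connection rules above can be sketched as follows; the `onCleanup` guard is an optional MATLAB idiom for guaranteeing cleanup, not part of the DuckDB interface:

```matlab
% Open an in-memory DuckDB connection
conn = duckdb();
cleaner = onCleanup(@() close(conn));   % closes the connection even if an error is thrown

% Verify the connection before using it
if ~isopen(conn)
    error("duckdb:connect", "DuckDB connection failed to open");
end

answer = fetch(conn, "SELECT 42 AS answer");
```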
### API Surface
- All standard functions work: `sqlread`, `fetch`, `execute`, `sqlwrite`, `sqlfind`, `sqlinnerjoin`, `sqlouterjoin`, `commit`, `rollback`.
- DuckDB does NOT support `databasePreparedStatement`. Use `execute` or `sqlwrite` instead.
- Use `ExcludeDuplicates` via `databaseImportOptions` when reading from database tables (with `sqlread`). For direct file queries (`read_csv`/`read_parquet` via `fetch`), use `SELECT DISTINCT` in SQL.
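A hedged sketch of the two deduplication routes described above; the `sales` table, `sales.csv` file, and `region` column are hypothetical names:

```matlab
conn = duckdb("dev.duckdb");

% Route 1: reading a database table — dedupe via import options
opts = databaseImportOptions(conn, "sales");
opts.ExcludeDuplicates = true;
salesTbl = sqlread(conn, "sales", opts);

% Route 2: querying a file directly — dedupe in the SQL itself
uniqueRegions = fetch(conn, "SELECT DISTINCT region FROM read_csv('sales.csv')");

close(conn);
```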
### File Queries
- ALWAYS use `fetch` (not `sqlread`) for file queries — they require SQL syntax like `SELECT * FROM read_csv('file.csv')`.
- ALWAYS use single quotes for file paths inside SQL: `read_csv('data.csv')`.
## Decision Framework
### Which connection mode should I use?
| Goal | Connection | Why |
|---|---|---|
| Analytical queries on files | `duckdb()` | No persistence needed; query files directly |
| Temporary workspace | `duckdb()` | Fast, discarded on close |
| Portable development database | `duckdb("mydata.duckdb")` | Creates a `.duckdb` or `.db` file; works on any machine |
| Open existing database | `duckdb("existing.db")` | Read/write access to a pre-existing `.db` or `.duckdb` file |
| Read-only shared database | `duckdb("shared.duckdb", ReadOnly=true)` | Prevents accidental writes |
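The connection modes in the table can be sketched as follows; the file names are placeholders:

```matlab
% In-memory: fast scratch workspace, discarded on close
memConn = duckdb();

% File-based: creates dev.duckdb if it does not exist, opens it otherwise
fileConn = duckdb("dev.duckdb");

% Read-only: safe for databases shared with other users or processes
roConn = duckdb("shared.duckdb", ReadOnly=true);

close(memConn); close(fileConn); close(roConn);
```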
### When should I use DuckDB vs. MATLAB file I/O?
| Scenario | Recommendation |
|---|---|
| Small data, simple operations | `readtable` / `readmatrix` |
| Data exceeds memory, needs filtering/aggregation | DuckDB (preprocess in SQL, analyze in MATLAB) |
| Query across multiple CSV/Parquet files | DuckDB with glob patterns |
| Portable development database | DuckDB file-based connection |
| MATLAB-specific analysis (signal processing, ML) | Preprocess in DuckDB, analyze in MATLAB |
## Common Patterns
### Pattern 1: Analytical Engine on Files

```matlab
conn = duckdb();
result = fetch(conn, "SELECT region, SUM(revenue) as total " + ...
    "FROM read_parquet('sales.parquet') " + ...
    "GROUP BY region ORDER BY total DESC");
close(conn);
```
### Pattern 2: Out-of-Memory Preprocessing

```matlab
conn = duckdb();
summary = fetch(conn, "SELECT date, AVG(value) as avg_val " + ...
    "FROM read_csv('huge_dataset.csv') " + ...
    "WHERE status = 'valid' " + ...
    "GROUP BY date ORDER BY date");
close(conn);
% summary fits in memory — continue with MATLAB analysis
```
### Pattern 3: Development Database

```matlab
conn = duckdb("dev.duckdb");
sqlwrite(conn, "experiments", experimentData);
rf = rowfilter("score");
results = sqlread(conn, "experiments", RowFilter=rf.score > 0.8);
close(conn);
```
### Pattern 4: Multi-File Query with Glob

```matlab
conn = duckdb();
data = fetch(conn, "SELECT * FROM read_parquet('data/year=2024/*.parquet') " + ...
    "WHERE category = 'A'");
close(conn);
```
### Pattern 5: Extensions

```matlab
conn = duckdb();
execute(conn, "INSTALL httpfs");
execute(conn, "LOAD httpfs");
data = fetch(conn, "SELECT * FROM read_parquet('https://example.com/data.parquet') LIMIT 1000");
close(conn);
```
For detailed examples, see:
- File analytics and out-of-memory preprocessing: `reference/cards/file-analytics.md`
- Development database workflows: `reference/cards/development-database.md`
- DuckDB extensions: `reference/cards/extensions.md`
## Common Mistakes
```matlab
% WRONG — using database() or JDBC to connect to DuckDB
conn = database("", "", "", "org.duckdb.DuckDBDriver", "jdbc:duckdb:");
% CORRECT
conn = duckdb();

% WRONG — using sqlread for file queries (it expects a table name)
data = sqlread(conn, "read_csv('data.csv')");
% CORRECT — use fetch with SQL
data = fetch(conn, "SELECT * FROM read_csv('data.csv')");

% WRONG — double quotes for file paths in SQL
data = fetch(conn, "SELECT * FROM read_csv(""data.csv"")");
% CORRECT — single quotes
data = fetch(conn, "SELECT * FROM read_csv('data.csv')");

% WRONG — loading a huge file into MATLAB and then filtering
data = readtable("huge.parquet"); filtered = data(data.val > 100, :);
% CORRECT — let DuckDB filter on disk
conn = duckdb();
filtered = fetch(conn, "SELECT * FROM read_parquet('huge.parquet') WHERE val > 100");
close(conn);

% WRONG — using databasePreparedStatement (not supported)
pstmt = databasePreparedStatement(conn, "INSERT INTO t VALUES(?, ?)");
% CORRECT — use sqlwrite
sqlwrite(conn, "t", data);
```
## Checklist
Before finalizing DuckDB code, verify:
- Connected with `duckdb()` or `duckdb("file.duckdb")` / `duckdb("file.db")` — not `database()` or JDBC
- `isopen(conn)` checked after connection
- File queries use `fetch` with SQL (not `sqlread`)
- File paths in SQL use single quotes
- Out-of-memory data preprocessed in DuckDB before importing to MATLAB
- No `databasePreparedStatement` usage (not supported)
- `close(conn)` called when done
## Troubleshooting
**Issue:** `duckdb` function not found
- Solution: Requires R2026a+ with Database Toolbox. Check with `ver('database')`.
**Issue:** `sqlread` errors with a file query
- Solution: Use `fetch(conn, "SELECT * FROM read_csv('file.csv')")` — `sqlread` expects table names only.
**Issue:** Permission denied on a ReadOnly connection
- Solution: Reconnect without `ReadOnly`: `conn = duckdb("file.duckdb")`.
**Issue:** Out of memory when querying a large file
- Solution: Add `WHERE`, `GROUP BY`, `LIMIT`, or aggregation in SQL to reduce the result size before it enters MATLAB.
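For example, a sketch that pushes the reduction into SQL so only the summarized result reaches MATLAB; `huge.parquet` and its columns are hypothetical:

```matlab
conn = duckdb();

% Filter, aggregate, and cap the row count on disk rather than in MATLAB memory
daily = fetch(conn, "SELECT date, COUNT(*) AS n, AVG(value) AS avg_val " + ...
    "FROM read_parquet('huge.parquet') " + ...
    "WHERE value > 0 " + ...
    "GROUP BY date " + ...
    "ORDER BY date " + ...
    "LIMIT 1000");

close(conn);
```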
**Issue:** File path not found in `read_csv`/`read_parquet`
- Solution: Paths are relative to `pwd`. Use absolute paths or verify with `dir('file.csv')`.
**Issue:** Extension install fails
- Solution: Requires internet access for the first install (cached afterward). See https://duckdb.org/docs/current/core_extensions/overview.
**Issue:** Type mismatch on `sqlwrite`
- Solution: DuckDB supports rich types (ARRAY, LIST, STRUCT, MAP). Use `sqlfind(conn, "tableName")` to check column types.
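A minimal sketch of inspecting a table's metadata before writing to it, assuming an `experiments` table (as in Pattern 3) already exists in the database file:

```matlab
conn = duckdb("dev.duckdb");

% sqlfind returns a table of matching database tables, including column information
info = sqlfind(conn, "experiments");
disp(info)

close(conn);
```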
Copyright 2026 The MathWorks, Inc.