Use SQLite Databases From The Filesystem
Problem
You need to read and work with data in SQLite databases stored on your your filesystem.
Solution
Use DuckDB’s ATTACH
and USE
to work directy with SQLite databases.
Discussion
Via the sqlite
extension, DuckDB can work with any existing SQLite databases you may have. You should read the DuckDB documentation on SQLite as there are a few caveats around types and other database characteristics due the the way DuckDB was designed.
We’ll show how to work with SQLite tables by using the classic “chinook” database. For those not familiar with chinook, it’s a database with eleven tables that model a record store. These are the tables:
employees
table stores employees data such as employee id, last name, first name, etc. It also has a field named ReportsTo to specify who reports to whom.customers
table stores customers data.invoices
&invoice_items
tables: these two tables store invoice data. Theinvoices
table stores invoice header data and theinvoice_items
table stores the invoice line items data.artists
table stores artists data. It is a simple table that contains only the artist id and name.albums
table stores data about a list of tracks. Each album belongs to one artist. However, one artist may have multiple albums.media_types
table stores media types such as MPEG audio and AAC audio files.genres
table stores music types such as rock, jazz, metal, etc.tracks
table stores the data of songs. Each track belongs to one album.playlists
&playlist_track
tables: playlists table store data about playlists. Each playlist contains a list of tracks. Each track may belong to multiple playlists. The relationship between the playlists table and tracks table is many-to-many. The playlist_track table is used to reflect this relationship.
To access the copy of chinook that ships with this book, we’ll use the ATTACH
statement which adds a new database file to the catalog, and USE
statment which will let us switch the default schema
duckdb -markdown <<EOF
ATTACH 'data/chinook.db';
USE chinook;
SHOW TABLES;
EOF
name |
---|
albums |
artists |
customers |
employees |
genres |
invoice_items |
invoices |
media_types |
playlist_track |
playlists |
sqlite_sequence |
sqlite_stat1 |
tracks |
Because this is a frequent use-case for DuckDB, we can also open SQLite databases directly. Let’s do that and find the top 10 best-selling tracks:
duckdb data/chinook.db -markdown <<EOF
FROM
tracks t,
invoice_items i
SELECT
t.Name AS Track,
SUM(i.Quantity) AS "Total Sold"
WHERE
t.TrackId = i.TrackId
GROUP BY t.Name
ORDER BY "Total Sold" DESC
LIMIT 10
EOF
Track | Total Sold |
---|---|
The Trooper | 5 |
Hallowed Be Thy Name | 4 |
Untitled | 4 |
The Number Of The Beast | 4 |
Eruption | 4 |
Sure Know Something | 4 |
Good Golly Miss Molly | 3 |
Heaven Can Wait | 3 |
Flying High Again | 3 |
Sweetest Thing | 3 |