Use SQLite Databases From The Filesystem

Problem

You need to read and work with data in SQLite databases stored on your your filesystem.

Solution

Use DuckDB’s ATTACH and USE to work directy with SQLite databases.

Discussion

Via the sqlite extension, DuckDB can work with any existing SQLite databases you may have. You should read the DuckDB documentation on SQLite as there are a few caveats around types and other database characteristics due the the way DuckDB was designed.

We’ll show how to work with SQLite tables by using the classic “chinook” database. For those not familiar with chinook, it’s a database with eleven tables that model a record store. These are the tables:

  • employees table stores employees data such as employee id, last name, first name, etc. It also has a field named ReportsTo to specify who reports to whom.
  • customers table stores customers data.
  • invoices & invoice_items tables: these two tables store invoice data. The invoices table stores invoice header data and the invoice_items table stores the invoice line items data.
  • artists table stores artists data. It is a simple table that contains only the artist id and name.
  • albums table stores data about a list of tracks. Each album belongs to one artist. However, one artist may have multiple albums.
  • media_types table stores media types such as MPEG audio and AAC audio files.
  • genres table stores music types such as rock, jazz, metal, etc.
  • tracks table stores the data of songs. Each track belongs to one album.
  • playlists & playlist_track tables: playlists table store data about playlists. Each playlist contains a list of tracks. Each track may belong to multiple playlists. The relationship between the playlists table and tracks table is many-to-many. The playlist_track table is used to reflect this relationship.

To access the copy of chinook that ships with this book, we’ll use the ATTACH statement which adds a new database file to the catalog, and USE statment which will let us switch the default schema

duckdb -markdown <<EOF
ATTACH 'data/chinook.db';
USE chinook;
SHOW TABLES;
EOF
name
albums
artists
customers
employees
genres
invoice_items
invoices
media_types
playlist_track
playlists
sqlite_sequence
sqlite_stat1
tracks

Because this is a frequent use-case for DuckDB, we can also open SQLite databases directly. Let’s do that and find the top 10 best-selling tracks:

duckdb data/chinook.db -markdown <<EOF
FROM 
  tracks t, 
  invoice_items i
SELECT 
  t.Name AS Track, 
  SUM(i.Quantity) AS "Total Sold"
WHERE 
  t.TrackId = i.TrackId
GROUP BY t.Name
ORDER BY "Total Sold" DESC
LIMIT 10
EOF
Track Total Sold
The Trooper 5
Hallowed Be Thy Name 4
Untitled 4
The Number Of The Beast 4
Eruption 4
Sure Know Something 4
Good Golly Miss Molly 3
Heaven Can Wait 3
Flying High Again 3
Sweetest Thing 3