Output CSV From DuckDB

Problem

You need the output from DuckDB to be a CSV, not some silly ASCII or fancy Markdown table.

Solution

Use one of the built-in methods to output resultsets to CSV

Discussion

There are a few ways to output CSV from DuckDB. The easiest is to use the -csv command line flag.

Let’s say you need the output of the number of commits per author this month as a CSV for some daily automation process that sends an electric shock to the three lowest performing coders (NOTE: git commits are not a sign of productivity, capability, or anything save for the fact that you know how to run git commit):

cd ~/Development/clones/rstudio
git pull > /dev/null # get the latest commits
git log --pretty=format:%as,%an | \
  duckdb -csv -c "
FROM read_csv('/dev/stdin', 
  auto_detect = false, 
  columns = { 
    'day': 'DATE',
    'author': 'VARCHAR'
  }
)
SELECT
  strftime(date_trunc('month', day), '%Y %B') AS month,  
  author,
  COUNT(*) AS n_commits_this_month
WHERE
  date_trunc('month', day) = date_trunc('month', now()::DATE)
GROUP BY 1, 2
ORDER BY 3
LIMIT 3
"
month,author,n_commits_this_month
"2024 March","Adam Foryś",1
"2024 March","Tim Mok",1
"2024 March","Mervin Fansler",1

The output can be redirected to a file or piped into another process.

Another way we can output CSV from DuckDB gives us more granular control over the output. It works by using the COPY statement.

cd ~/Development/clones/rstudio
git pull > /dev/null # get the latest commits
git log --pretty=format:%as,%an | \
  duckdb -c "
COPY (FROM read_csv('/dev/stdin', 
  auto_detect = false, 
  columns = { 
    'day': 'DATE',
    'author': 'VARCHAR'
  }
)
SELECT
  strftime(date_trunc('month', day), '%Y %B') AS month,  
  author,
  COUNT(*) AS n_commits_this_month
WHERE
  date_trunc('month', day) = date_trunc('month', now()::DATE)
GROUP BY 1, 2
ORDER BY 3
LIMIT 3)
TO '/dev/stdout' (FORMAT CSV)
"
month,author,n_commits_this_month
2024 March,Melissa Barca,1
2024 March,Pete,1
2024 March,Tim Mok,1

You can, of course, substitute a real filespec vs. use stdout.