Query for Go Dependency Licenses

Fetch the license of all your Go mod dependencies

Yesterday, I saw this thread on HN (an open-source project includes a dependency with a license that forbids its use). It got me thinking about how the query used in this post could be amended to download the licenses of all declared dependencies in a Go codebase (using go.mod). Here’s what I came up with.

Identify the dependencies

SELECT
json_extract(value, '$.mod.path')
FROM files f, json_each(go_mod_to_json(contents), '$.require') j
WHERE f.path = 'go.mod'
view raw go_mod_require.sql hosted with ❤ by GitHub

This will use the built-in JSON functions and go_mod_to_json to list all the required dependencies in a go.mod file. Here are the first 5 from the AskGit codebase:

+-----------------------------------+
| JSON_EXTRACT(VALUE, '$.MOD.PATH') |
+-----------------------------------+
| github.com/BurntSushi/toml        |
+-----------------------------------+
| github.com/DATA-DOG/go-sqlmock    |
+-----------------------------------+
| github.com/ProtonMail/go-crypto   |
+-----------------------------------+
| github.com/asaskevich/govalidator |
+-----------------------------------+
| github.com/augmentable-dev/vtab   |
+-----------------------------------+

Fetch each dependency

Next, we’ll want to grab the LICENSE file from every dependency:

SELECT
str_split(mod, '/', 0) || '/' || str_split(mod, '/', 1) || '/' || str_split(mod, '/', 2) AS repo, -- this is a bit whacky, need to get the repo part of the module path
contents -- the actual LICENSE file
FROM
(select json_extract(value, '$.mod.path') AS mod from files f, json_each(go_mod_to_json(contents), '$.require') j WHERE f.path = 'go.mod') AS deps,
files('https://' || repo) -- this will clone every every repo to a temp directory
WHERE repo IS NOT NULL
AND path = 'LICENSE'
AND repo LIKE 'github.com%' -- was failing on some non-git dependencies, shortcut for now, but limits to modules only on GitHub
view raw go_mod_licenses.sql hosted with ❤ by GitHub

It looks like a lot 🙂, but we’re extending the previous query to:

  1. Extract the git repo part of a Go mod path

  2. Join with the files table to clone every dependency to a tmp directory

  3. Pull out the contents of the LICENSE file (if it exists)

Currently, this will only work with git-hosted modules (so we add a filter to only select ones hosted on GitHub, for now).

Results

And there you have it, the contents of the LICENSE file for every dependency required in a go.mod file (at least the ones on GitHub).

You can see the output for the AskGit project here.

The content of licenses is pretty verbose - a next step would be to identify the type of license and just return that (i.e. MIT vs Apache 2.0) rather than the full text…