Which Go Modules Does Google Use?

Finding the most frequently used go mod dependencies in public Google repositories with AskGit

The Google GitHub org has over 2k public repositories. At the time of writing, 200+ are Golang codebases (at least according to GitHub’s primary_language field), and 125 of those repos have a go.mod file.

Can we compose a query that finds the most commonly declared dependencies in these 125 Go projects? Absolutely! Brace yourself for some SQL 😃.

First, let’s use the AskGit export command to save every public Go repo in the Google org, along with a copy of its go.mod file (if it exists) into a SQLite database file.

SELECT
name, primary_language,
github_repo_file_content('google', name, 'go.mod') as go_mod
FROM github_org_repos('google')
WHERE primary_language = 'Go'

We’re making use of the github_org_repos table-valued function and the github_repo_file_content scalar function to retrieve what we want from the GitHub API alone (no repos are cloned to disk).

This can take some time, so go ahead and make a coffee.

The end result is a database file which we can attach the SQLite shell to:

$ sqlite3 google_go_mod
sqlite> .load .build/libgaskgit.so 

Here we’re loading AskGit, compiled as a run-time loadable extension, so that we can access some AskGit-specific functions within our SQLite shell. [1]

SELECT
count(*),
json_extract(value, '$.mod.path') AS dep
FROM
google_go_mod,
json_each(go_mod_to_json(go_mod), '$.require')
GROUP BY dep ORDER BY count(*) DESC;

This query makes use of the SQLite JSON functions and the AskGit go_mod_to_json helper (which parses a go.mod file into JSON) to give us an ordered list of the most frequently declared (public) go mod dependencies.
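If you want to see the json_each / json_extract / GROUP BY part working without building AskGit, here is a minimal sketch in Python using the standard-library sqlite3 module. It makes a few assumptions: the JSON shape ({"require": [{"mod": {"path": ...}}]}) is inferred from the paths used in the query rather than taken from AskGit's docs, the repos table and the fake_go_mod_json helper are made up for illustration, and your Python's SQLite build needs the JSON functions available.

```python
import json
import sqlite3


def fake_go_mod_json(paths):
    """Hypothetical stand-in for go_mod_to_json output.

    The shape is inferred from the '$.require' and '$.mod.path' paths
    in the query; it is an assumption, not AskGit's documented format.
    """
    return json.dumps({"require": [{"mod": {"path": p}} for p in paths]})


def count_dependencies(rows):
    """Count how many repos declare each dependency.

    rows: iterable of (repo_name, go_mod_json) pairs.
    Returns a list of (dependency, count), most common first.
    """
    conn = sqlite3.connect(":memory:")
    # 'repos' is an illustrative table name, not the one AskGit exports.
    conn.execute("CREATE TABLE repos (name TEXT, go_mod_json TEXT)")
    conn.executemany("INSERT INTO repos VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT json_extract(value, '$.mod.path') AS dep, count(*) AS n
        FROM repos, json_each(go_mod_json, '$.require')
        GROUP BY dep
        ORDER BY n DESC, dep
        """
    ).fetchall()
    conn.close()
    return result


# Made-up sample data, only to exercise the query.
SAMPLE_ROWS = [
    ("repo-a", fake_go_mod_json(["github.com/google/go-cmp", "golang.org/x/sys"])),
    ("repo-b", fake_go_mod_json(["github.com/google/go-cmp", "golang.org/x/net"])),
    ("repo-c", fake_go_mod_json(["github.com/google/go-cmp"])),
]

if __name__ == "__main__":
    for dep, n in count_dependencies(SAMPLE_ROWS):
        print(n, dep)
```

json_each yields one row per element of the require array, so joining it against the table "explodes" each repo into one row per declared dependency, and the GROUP BY then counts repos per dependency.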

The Top 10

  1. github.com/google/go-cmp 47

  2. golang.org/x/sys 41

  3. golang.org/x/net 39

  4. github.com/golang/protobuf 36

  5. google.golang.org/grpc 30

  6. golang.org/x/oauth2 28

  7. google.golang.org/genproto 26

  8. google.golang.org/protobuf 25

  9. github.com/golang/glog 24

  10. golang.org/x/crypto 22

The full results can be found here.

So What?

To be honest, I don’t know 😃. This information alone may not be too useful, beyond just as a point of interest (I guess Google uses go-cmp for tests across many of its repos, and a fair amount of grpc/protobuf - are we that surprised?)

This example, however, could be extended into your own org to help you understand dependency usage across your internal (or public) projects. The same principles could be applied to package.json files for JavaScript projects, or Cargo.toml for Rust.
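As a rough illustration of the package.json case, here is a small sketch in plain Python (no AskGit involved). It assumes you already have the contents of each package.json as a string, for example fetched the same way as the go.mod files above; the count_npm_dependencies name and the sample manifests are hypothetical.

```python
import json
from collections import Counter


def count_npm_dependencies(package_json_texts):
    """Count how many projects declare each npm package.

    package_json_texts: iterable of package.json file contents (strings).
    Looks at the standard "dependencies" and "devDependencies" fields.
    """
    counts = Counter()
    for text in package_json_texts:
        manifest = json.loads(text)
        # Use a set so a package listed in both sections counts once per project.
        declared = set(manifest.get("dependencies", {}))
        declared |= set(manifest.get("devDependencies", {}))
        counts.update(declared)
    return counts


# Made-up manifests, only to exercise the function.
SAMPLE_MANIFESTS = [
    '{"name": "app-one", "dependencies": {"react": "^17.0.0", "lodash": "^4.17.0"}}',
    '{"name": "app-two", "dependencies": {"react": "^17.0.0"}, "devDependencies": {"jest": "^27.0.0"}}',
    '{"name": "app-three", "devDependencies": {"jest": "^27.0.0", "react": "^17.0.0"}}',
]

if __name__ == "__main__":
    for package, n in count_npm_dependencies(SAMPLE_MANIFESTS).most_common():
        print(n, package)
```

Whether to count devDependencies alongside dependencies is a judgment call; splitting them into two counters may be more useful if you care about what actually ships.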

You could dig deeper to see declared dependencies changing over time or over team. Does a particular library get used more over time? Less? Does a particular team or set of codebases use a dependency more than others?

Ultimately, you’ll want to tailor the queries above to your use case to get more interesting results, but I think scratching the surface with public Google repos is still pretty cool 😎.

If you haven’t already, please go ahead and subscribe to stay up to date with AskGit use-cases and new features 🚀

[1] Note that the SQLite shell shipped on Macs does not permit loading run-time extensions. See here for more info.