Which Go Modules Does Google Use?
Finding the most frequently used go mod dependencies in public Google repositories with AskGit
The Google GitHub org has over 2k public repositories. At the time of writing, 200+ are Golang codebases (at least according to GitHub’s primary_language field), and 125 of those repos have a go.mod file.
Can we compose a query that finds the most commonly declared dependencies in these 125 Go projects? Absolutely! Brace yourself for some SQL 😃.
First, let’s use the AskGit export command to save every public Go repo in the Google org, along with a copy of its go.mod file (if it exists), into a SQLite database file.
SELECT
  name, primary_language,
  github_repo_file_content('google', name, 'go.mod') AS go_mod
FROM github_org_repos('google')
WHERE primary_language = 'Go'
We’re making use of the github_org_repos table-valued function and the github_repo_file_content scalar function to retrieve what we want from the GitHub API alone (no repos are cloned to disk).
This can take some time, so go ahead and make a coffee ☕
The end result is a database file which we can attach the SQLite shell to:
$ sqlite3 google_go_mod
sqlite> .load .build/libgaskgit.so
Here we’re mounting AskGit compiled as a run-time loadable extension so that we can access some AskGit-specific functions within our SQLite shell.
SELECT
  count(*),
  json_extract(value, '$.mod.path') AS dep
FROM
  google_go_mod,
  json_each(go_mod_to_json(go_mod), '$.require')
GROUP BY dep ORDER BY count(*) DESC;
This query makes use of the SQLite JSON functions and the AskGit go_mod_to_json helper (which parses a go.mod file into JSON) to give us an ordered list of the most frequently declared (public) go mod dependencies.
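To see what the json_each / json_extract step does without AskGit installed, here is a minimal Python sketch using the standard sqlite3 module. It assumes only what the query above implies about the JSON shape (a require array whose entries carry mod.path). The sample repos, the go_mod_json column, and the version field are made up for illustration, and the JSON is stored pre-built because go_mod_to_json only exists in the AskGit extension. It also assumes your SQLite build includes the JSON functions.

```python
import json
import sqlite3

# Hypothetical, hand-written data standing in for go_mod_to_json(go_mod) output.
# Only the require[].mod.path shape is taken from the query in the article.
SAMPLE_REPOS = {
    "repo-a": ["github.com/google/go-cmp", "golang.org/x/sys"],
    "repo-b": ["github.com/google/go-cmp", "golang.org/x/net"],
    "repo-c": ["github.com/google/go-cmp", "golang.org/x/sys"],
}


def count_dependencies(repos):
    """Return (dependency, count) rows, most frequently required first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE google_go_mod (name TEXT, go_mod_json TEXT)")
    for name, deps in repos.items():
        # The "version" value is a placeholder; the query never reads it.
        doc = {"require": [{"mod": {"path": d, "version": "v0.0.0"}} for d in deps]}
        conn.execute(
            "INSERT INTO google_go_mod VALUES (?, ?)", (name, json.dumps(doc))
        )
    # json_each yields one row per element of the require array;
    # json_extract then pulls the module path out of each element.
    rows = conn.execute(
        """
        SELECT json_extract(value, '$.mod.path') AS dep, count(*) AS n
        FROM google_go_mod, json_each(go_mod_json, '$.require')
        GROUP BY dep
        ORDER BY n DESC, dep
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for dep, n in count_dependencies(SAMPLE_REPOS):
        print(n, dep)
```

The comma join against json_each is what fans each repo out into one row per required module, so the GROUP BY counts how many go.mod files declare each path.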
The Top 10
Dependency                      Count
github.com/google/go-cmp        47
golang.org/x/sys                41
golang.org/x/net                39
github.com/golang/protobuf      36
google.golang.org/grpc          30
golang.org/x/oauth2             28
google.golang.org/genproto      26
google.golang.org/protobuf      25
github.com/golang/glog          24
golang.org/x/crypto             22
The full results can be found here.
So What?
To be honest, I don’t know 😃. This information alone may not be too useful beyond being a point of interest (I guess Google uses go-cmp for tests across many of its repos, and a fair amount of grpc/protobuf - are we that surprised?).
This example, however, could be extended into your own org to help you understand dependency usage across your internal (or public) projects. The same principles could be applied to package.json files for JavaScript projects, or Cargo.toml for Rust.
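As a rough illustration of the package.json idea, the sketch below tallies declared dependencies across several manifests in plain Python. The manifests are inline, made-up examples; in practice you would feed in file contents fetched per repo. Counting both dependencies and devDependencies, and counting each package at most once per repo, are choices of this sketch rather than anything AskGit prescribes.

```python
import json
from collections import Counter

# Hypothetical package.json contents keyed by repo name (illustrative only).
MANIFESTS = {
    "web-app": '{"dependencies": {"react": "^18.0.0", "lodash": "^4.17.21"}}',
    "cli-tool": '{"dependencies": {"lodash": "^4.17.21"},'
                ' "devDependencies": {"jest": "^29.0.0"}}',
    "docs-site": '{"dependencies": {"react": "^18.0.0", "lodash": "^4.17.21"}}',
    "legacy": "not valid json",  # unparseable manifests are skipped
}


def count_npm_dependencies(manifests):
    """Count how many repos declare each package in their package.json."""
    counts = Counter()
    for text in manifests.values():
        try:
            pkg = json.loads(text)
        except json.JSONDecodeError:
            continue
        if not isinstance(pkg, dict):
            continue
        names = set()  # a set, so a package counts once per repo
        for section in ("dependencies", "devDependencies"):
            deps = pkg.get(section)
            if isinstance(deps, dict):
                names.update(deps)
        counts.update(names)
    return counts


if __name__ == "__main__":
    for name, n in count_npm_dependencies(MANIFESTS).most_common():
        print(n, name)
```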
You could dig deeper to see declared dependencies changing over time or over team. Does a particular library get used more over time? Less? Does a particular team or set of codebases use a dependency more than others?
Ultimately, you’ll want to tailor the queries above to your use case to get more interesting results - but I think scratching the surface with public Google repos is still pretty cool 😎.
If you haven’t already, please go ahead and subscribe to stay up to date with AskGit use-cases and new features 🚀