Identifying Recently Active Git Contributors 📅 📈
Using a bit of SQL to find the most *recently* active git contributors to an open-source codebase
The other day, I had a need to contact the maintainer(s) of an open-source project. There was no contact info and no project website. Opening an issue felt like a bit much, but could’ve worked as a last resort.
I wanted to reach the “core” set of maintainers, without knowing explicitly who they were (no CONTRIBUTORS file or explicit contact list). Looking at the commit log yielded some recent committers, but it’s hard to differentiate a “core maintainer” from a “passer-by” (someone fixing a typo, patching their specific bug, etc: contributing but not actively maintaining).
I wanted to know who the most recent active contributors were - not necessarily the most active overall, but the most active within the last N days. The original authors may have the most commits + LOC, but could be long gone from the project. This
askgit query helped achieve that:
SELECT author_name, count(*) FROM commits WHERE author_when > date('now', '-90 days') AND parents < 2 GROUP BY author_name ORDER BY count(*) DESC
This produced a ranked list of contributors in the past 90 days by commit count (excluding merge commits).
The GitHub contributor insights page offers a really nice way to surface the same information visually.
Needless to say, I was able to find my “core maintainers” and reach them via Twitter DM 🙂.
Narrowing the Search
This was a pretty simple use case for AskGit, with a nice visual alternative in the GitHub UI (though the SQL version would be necessary for any repo not on GitHub).
A further step could be looking specifically for contributors to certain sections of a codebase. Maybe I don’t care about overall contribution - just contribution to specific features (by limiting the query to only commits that changed certain files).
SELECT author_name, count(DISTINCT hash) FROM commits, stats('', commits.hash) WHERE author_when > date('now', '-120 days') AND parents < 2 AND file_path LIKE 'go.mod' GROUP BY author_name ORDER BY count(*) DESC
should show me only the contributors who updated the
go.mod file. The
LIKE clause could be modified to filter by certain file types or paths (depending on the structure of the repo).
LIKE ‘*.go‘ or
The built-in language-detection functions could also be useful to examine only test or documentation files (who most recently maintains the docs for this project?).
I hope this is useful! If you want to stay up to date with more queries and use cases, go ahead and subscribe if you haven’t already 🙂
As usual, you can try out queries on our demo site.