PHPUnit nests testsuite nodes inside each other, which is not part of either of the official JUnit specifications. Check Run Reporter has supported PHPUnit for some time, but due to a misunderstanding of what the count values on each testsuite node represented, results were doubled: each testsuite node has, for example, a tests attribute that identifies how many tests are nested under that node. I had assumed this number counted only direct testcases, not direct and descendant testcases, so when Check Run Reporter iterated over the subsuites, it inadvertently double counted all nested suites. This has now been fixed.
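To illustrate the bug, here's a minimal sketch (the suite names and counts are made up, and this isn't Check Run Reporter's actual parser) showing why summing each node's tests attribute double counts nested suites:

```python
import xml.etree.ElementTree as ET

# A PHPUnit-style report: the outer testsuite's ``tests`` attribute counts
# direct *and* descendant testcases, so its children must not be re-counted.
REPORT = """\
<testsuites>
  <testsuite name="All" tests="3">
    <testsuite name="Unit" tests="2">
      <testcase name="a"/>
      <testcase name="b"/>
    </testsuite>
    <testsuite name="Integration" tests="1">
      <testcase name="c"/>
    </testsuite>
  </testsuite>
</testsuites>
"""

def naive_total(root):
    # Summing ``tests`` over every testsuite node counts nested suites twice.
    return sum(int(suite.get("tests", 0)) for suite in root.iter("testsuite"))

def correct_total(root):
    # Counting testcase nodes directly avoids trusting nested summaries.
    return sum(1 for _ in root.iter("testcase"))

root = ET.fromstring(REPORT)
print(naive_total(root))    # 6: the "All" suite's descendants get added again
print(correct_total(root))  # 3
```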

Expect a longer post somewhere going into more detail, but thanks to mounting costs and the instability and impending retirement of Aurora Serverless v1, Check Run Reporter is now powered by Lambda and DynamoDB instead of EC2, ELB, RDS, and Elasticache!

In the short run, this probably won't mean any obvious changes (well, there have been some UI tweaks because I dumped NextJS in favor of Remix, but that's a story for a proper blog post), but things should be a lot more stable. Under the old architecture, EC2 and/or Aurora Serverless would just cause hosts to stop responding, but not in a way that health checks detected. You should no longer see gateway timeouts when making submissions.

In the medium term, I'm going to take a bit of a break from hacking on this project all the time. I've been working on it non-stop (in ways that you couldn't see) for far too long now, but in the not-too-distant future, I intend to add several more features that take advantage of stored results, including ones that help with flaky tests.

Split tests across multiple executors.

Using historical timing data, customers on the persistent plan can efficiently split their tests across multiple hosts.

See the docs for more info.
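The split itself can be sketched with a classic greedy balancing heuristic: assign the longest-running files first, always to the currently lightest executor. This illustrates the general idea, not Check Run Reporter's actual algorithm, and the file names and timings below are hypothetical.

```python
def split_by_timing(durations, executors):
    """Assign test files to executors, longest first, each one going to
    the bucket with the smallest running total (greedy LPT heuristic)."""
    buckets = [{"files": [], "total": 0.0} for _ in range(executors)]
    for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
        lightest = min(buckets, key=lambda bucket: bucket["total"])
        lightest["files"].append(name)
        lightest["total"] += seconds
    return buckets

# Hypothetical historical timings, in seconds.
timings = {"slow_spec": 90, "api_spec": 60, "ui_spec": 45, "unit_spec": 15}
for index, bucket in enumerate(split_by_timing(timings, 2)):
    print(index, bucket["files"], bucket["total"])  # both executors get ~105s
```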


It's been a very long time coming, but persistent results are finally here!

Coming Soon

For now, persistent results means you'll be able to look at past check runs rather than just relying on the GitHub UI. In the very near future, however, persistence will also enable flaky test detection, test splitting across executors, and the ability to temporarily mute tests that are known to be flaky.


There were some issues with the settings forms that prevented some of their changes from being persisted. Specifically, the "Reset to Defaults" button did not, in fact, reset to defaults. It does now.


  • Forms are much more modular and testable, so issues like the broken reset button above shouldn't come back.
  • Email automation has been moved out of code to a dedicated email automation service. Their workflow is way better than anything that I was able to write in code in the time I had available and gives me just the right amount of reporting without being a full-fledged CRM that requires months of config only to learn it still doesn't work. I'm looking at you, FreshWorks.

Coming Soon!

You may have noticed this changelog has been rather sparse. That's not for lack of progress; the big feature I've been working on is, well, really big (at least in terms of the architectural shift required to make it work). Historically, Check Run Reporter was completely ephemeral. Result files may be stored for a day or two in S3 (there's an expiration set on them in case the processing job fails to delete them) and, since the pipeline is somewhat event driven, data sits for a few minutes in SQS, but otherwise, no test data is retained.

In the coming weeks, expect to see a new pricing tier for persistent results that enables features like historical reporting, flaky test detection, and multi-node test balancing. Result history will roll out to a few beta users first, and the other features will follow shortly thereafter.


Since its inception, Check Run Reporter has had a page for changing permissions for each of your installations. It...didn't work well and, truth be told, I'm pretty sure it's been entirely broken for quite some time.

I didn't fix it because it's only ever been used twice.

That page really only existed to meet a general requirement for building GitHub applications: administrators needed access controls. Since everything has been rendered on GitHub, there really hasn't been anything to which access needed to be controlled.

That all changes soon. There are some exciting new features on the horizon that really will require access control and other per-installation and per-repository configuration, so the settings system has been completely revamped. You can now configure access control at an installation level that can be inherited or overridden for each of your repositories. More importantly, I can add new settings without doing a database migration.

New Pages

With the new settings system, I've created new pages for each installation and each repository. The old permissions page has been removed in favor of per-installation and per-repository settings forms. The separate Account and Repositories pages have been merged into a single Installations and Repositories page. You can get your repo token from the specific page for that repository.

I removed Google Analytics from the site. I hadn't been using it, and I'm as upset by cookie banners as everyone else. I've still got one other cookie to remove, and then the consent banner can go away.

The GitHub Action has been completely rewritten in TypeScript instead of Bash, so it now additionally supports Windows and Mac agents.

It's been a long time coming, but we've added support for Checkstyle reports! Checkstyle is the JUnit of style reporting, except it's actually standardized.

Checkstyle reports are a new feature, so we expect edge cases to crop up as it gets used across various CI systems. As always, please let us know if you run into trouble.

The docs were a little ambiguous about how to set your repo's token. Now, as long as you manage to get your repo's token somewhere in the auth header, it should be accepted. All of the following should work:

curl --user token:YOUR_ID
curl --user YOUR_ID:
curl --user :YOUR_ID
curl --user token:"token:YOUR_ID"
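One way such lenient parsing might work (a hypothetical sketch, not the service's actual implementation): decode the Basic auth credentials, treat the literal word "token" as a placeholder, and strip an optional "token:" prefix from whichever part carries the credential.

```python
import base64

def extract_token(authorization):
    """Pull the repo token out of a Basic auth header, wherever it landed."""
    # "Basic <base64(user:password)>" -> decode and split on the first colon.
    encoded = authorization.split(" ", 1)[1]
    user, _, password = base64.b64decode(encoded).decode().partition(":")
    # Ignore the literal placeholder "token", then strip an optional
    # "token:" prefix from whichever part actually holds the credential.
    candidates = [part for part in (user, password) if part and part != "token"]
    return candidates[0].removeprefix("token:") if candidates else None

# The four curl invocations above decode to one of these credential strings;
# every one of them yields YOUR_ID.
for cred in ("token:YOUR_ID", "YOUR_ID:", ":YOUR_ID", "token:token:YOUR_ID"):
    header = "Basic " + base64.b64encode(cred.encode()).decode()
    print(cred, "->", extract_token(header))
```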
  • swiftlint reported errors when it should have reported failures.
  • typescript reported errors when it should have reported failures.
  • eslint was parsed as swiftlint.
  • junit sometimes showed passing tests when they should have failed.

    Previously, we had been relying on the root testsuite's summary attributes to determine whether the overall check run had passed or failed, but many reporters don't properly populate those attributes. Now, we take the greater of a given summary value and the count of testcase nodes under that testsuite.
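That "trust the summary only when it agrees with what we can count" approach can be sketched like this (a hypothetical helper, assuming JUnit-style XML, not our actual code):

```python
import xml.etree.ElementTree as ET

def failure_count(suite):
    """Trust the suite's ``failures`` attribute only when it is at least as
    large as what we can count ourselves; some reporters omit or zero it."""
    reported = int(suite.get("failures", 0))
    counted = sum(
        1 for case in suite.iter("testcase") if case.find("failure") is not None
    )
    return max(reported, counted)

# A report whose root summary wrongly claims zero failures.
suite = ET.fromstring(
    '<testsuite name="s" tests="2" failures="0">'
    '  <testcase name="ok"/>'
    '  <testcase name="broken"><failure message="boom"/></testcase>'
    "</testsuite>"
)
print(failure_count(suite))  # 1, despite failures="0" in the summary
```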