<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://mathieu.fenniak.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://mathieu.fenniak.net/" rel="alternate" type="text/html" /><updated>2026-03-10T18:19:28+00:00</updated><id>https://mathieu.fenniak.net/feed.xml</id><title type="html">Mathieu Fenniak</title><subtitle>Constantly learning new technology.</subtitle><entry><title type="html">testtrim: The Testing Tool That Couldn’t Test Itself (Until Now)</title><link href="https://mathieu.fenniak.net/testtrim-2025-01-nested-syscall-tracing/" rel="alternate" type="text/html" title="testtrim: The Testing Tool That Couldn’t Test Itself (Until Now)" /><published>2025-01-24T07:00:00+00:00</published><updated>2025-01-24T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/testtrim-nested-syscall-tracing</id><content type="html" xml:base="https://mathieu.fenniak.net/testtrim-2025-01-nested-syscall-tracing/"><![CDATA[<p>Today, we’re going to deep-dive into the kind of thing you can only “invest” time on if you’re a single engineer working on a project with no supervision.  I just finished a crazy complicated development effort in my project, <a href="https://codeberg.org/testtrim/testtrim/">testtrim</a>, and all I want to do is talk about how surprised I am that it actually worked.</p>

<p>I’ve also published this article in video form, if you’re more inclined to that format:</p>

<iframe width="560" height="315" src="https://www.youtube.com/embed/uMrJAZY-bPA?si=JeCt6bj7XoqeH3oR" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>

<h1 id="context-whats-testtrim-again">Context: What’s testtrim, again?</h1>

<p>testtrim is an experimental project to optimize execution of software tests.</p>

<p>It works by:</p>
<ul>
  <li>running your tests once, and</li>
  <li>analyzing test dependencies (via code coverage + syscall tracing).</li>
</ul>

<p>Then, when any future software changes occur, it can use that analysis to select which tests need to be run to cover the software change.</p>

<p>I went into more <a href="https://mathieu.fenniak.net/introducing-testtrim/">detail on the code coverage side in an earlier post</a>, but today it’s all about syscall tracing.</p>

<h1 id="syscall-tracing">syscall Tracing</h1>

<p>There are two types of dependencies that I want testtrim to find that require syscall tracing:</p>
<ol>
  <li>reading files that are part of the repository, and</li>
  <li>accessing resources over the network.</li>
</ol>

<p>If a test does either of those things, it requires some special handling to determine when the test needs to be run again.  For example, if a file referenced by a test (eg. <code class="language-plaintext highlighter-rouge">test_data/Fibonacci_sequence.txt</code>) changes in the future, it would make a lot of sense to run the referencing test (eg. <code class="language-plaintext highlighter-rouge">test_fibonacci_sequence</code>).</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/test-case-with-file.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/test-case-with-file-small.png" alt="Screenshot of code showing a unit test that refers to test data in a file; a red arrow highlights the filename and a green arrow highlights the test name" /></a></p>

<p>The tool that I always reach for when I need to do syscall tracing is called <code class="language-plaintext highlighter-rouge">strace</code>.  It’s a superpower to be able to use <code class="language-plaintext highlighter-rouge">strace</code> effectively, but it’s super easy to start with.  You take a command that you want to run, and you throw <code class="language-plaintext highlighter-rouge">strace</code> at the beginning, and you get a record of all the syscalls that the command made:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">echo</span> <span class="s2">"This is some text."</span>
This is some text.

<span class="nv">$ </span>strace <span class="nb">echo</span> <span class="s2">"This is some text."</span>
... snip output ...
write<span class="o">(</span>1, <span class="s2">"This is some text.</span><span class="se">\n</span><span class="s2">"</span>, 19<span class="o">)</span>    <span class="o">=</span> 19
close<span class="o">(</span>1<span class="o">)</span>                                <span class="o">=</span> 0
close<span class="o">(</span>2<span class="o">)</span>                                <span class="o">=</span> 0
exit_group<span class="o">(</span>0<span class="o">)</span>                           <span class="o">=</span> ?
+++ exited with 0 +++
</code></pre></div></div>

<p>It took a couple weeks to develop the first draft of syscall tracing in testtrim, using strace.  It was pretty straightforward engineering:</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/testtrim-strace-simple.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/testtrim-strace-simple-small.png" alt="Flow control diagram of testtrim running strace, as described in the text in detail below." /></a></p>

<p>When you run testtrim from the command line:</p>
<ol>
  <li><strong>Discover Tests</strong>: testtrim finds all the tests in your project (<code class="language-plaintext highlighter-rouge">Discover Tests</code>)</li>
  <li><strong>For Each Test, in Parallel…</strong>:
    <ol>
      <li>It runs each test under <code class="language-plaintext highlighter-rouge">strace</code>, with options including an output file (<code class="language-plaintext highlighter-rouge">--output</code>), and including all child processes (<code class="language-plaintext highlighter-rouge">--follow-forks</code>).</li>
      <li><strong>Open fixture.txt</strong>: When the test process (eg. pid 1234) performs a syscall like opening the file “fixture.txt”…</li>
      <li><code class="language-plaintext highlighter-rouge">strace</code> writes a record to the output file.</li>
      <li><strong>Read Trace Data</strong>: After the test is completed, testtrim reads the data file and parses it,</li>
      <li>… from which it can interpret the syscalls into dependencies that are part of that test - like in this case, that “test-case” requires “fixture.txt”.</li>
    </ol>
  </li>
</ol>
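<p>To make the “Read Trace Data” step concrete, here’s a minimal sketch of pulling file dependencies out of strace output.  (Python for illustration - testtrim itself is written in Rust - and the regex and file names are simplified examples, not testtrim’s actual parser.)</p>

```python
import re

# Matches openat() records in strace output, e.g.:
#   openat(AT_FDCWD, "fixture.txt", O_RDONLY) = 3
# Illustrative only: real strace output has many more variants to handle.
OPENAT_RE = re.compile(r'openat\([^,]+, "([^"]+)"[^)]*\)\s*=\s*(-?\d+)')

def file_dependencies(trace_output: str) -> set:
    """Collect paths that were successfully opened during a traced test."""
    deps = set()
    for line in trace_output.splitlines():
        m = OPENAT_RE.search(line)
        if m and int(m.group(2)) >= 0:  # a negative return means the open failed
            deps.add(m.group(1))
    return deps

trace = (
    'openat(AT_FDCWD, "fixture.txt", O_RDONLY) = 3\n'
    'openat(AT_FDCWD, "missing.txt", O_RDONLY) = -1 ENOENT (No such file or directory)\n'
    'write(1, "hello", 5) = 5\n'
)
print(file_dependencies(trace))  # {'fixture.txt'}
```

<p>Any path that resolves inside the repository then becomes a dependency of the test that opened it.</p>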

<p>If a file referenced by a test is part of the project’s repo, then it will become a dependency – and in the future, if it’s changed, then “test-case” needs to be rerun.</p>

<h1 id="but-nested-syscall-tracing">But, Nested syscall Tracing</h1>

<p>But, here’s the catch: I want to use testtrim to test testtrim.  I’ve always been a fan of “dogfooding” – although there are cases where it doesn’t make sense.  But as a software developer working on a software development tool, it’s a pretty natural fit!</p>

<p>testtrim has integration tests which use miniature test projects to verify that testtrim works.  Once syscall tracing was introduced, these tests prevented testtrim from running on its own project because you can’t use <code class="language-plaintext highlighter-rouge">strace</code> on a process that is already being traced.  (It would work if you weren’t using <code class="language-plaintext highlighter-rouge">--follow-forks</code>, but then the trace would be arbitrarily incomplete.)</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/testtrim-strace-nested-broken.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/testtrim-strace-nested-broken-small.png" alt="Flow control diagram highlighting that strace runs under strace, which isn't permitted with the --follow-forks option." /></a></p>

<p>Honestly, understanding the roadblock this posed was a really disappointing moment for me.  syscall tracing worked - my miniature test projects showed it - and there was apparently nothing blocking me from using it on other projects.  But it was always a goal of mine for testtrim to work on itself.  In addition to being emotionally disappointing, I think it was also important to the project to be able to use testtrim on itself:</p>
<ul>
  <li>That would allow me to work out kinks and bugs involved in real-world usage.</li>
  <li>It would generate the dataset that would prove (or disprove 🤷) its own value in test target reduction.</li>
  <li>And I didn’t want to be a hypocrite trying to promote something that I didn’t even use myself.</li>
</ul>

<p><strong>It had to be solved.</strong></p>

<h1 id="nested-tracing-v1-what-could-go-wrong">Nested Tracing v1: What could go wrong?</h1>

<p>After a bit of disappointment, I had a lightbulb moment: I control both of these programs.  Can I just make them cooperate?  The “tracer” knows everything that all of these processes are doing - it’s just a simple matter of getting the data from testtrim over to the integration test.</p>

<p>So, first pass… ah, I was so naive…</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/single-file-maybe.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/single-file-maybe-small.png" alt="Flow control diagram of testtrim running strace, as described in the text in detail below." /></a></p>

<ol>
  <li>Inject an environment variable when running a test child that contains the path to the strace output file.</li>
  <li>In the test, recognize the environment variable and:
    <ol>
      <li>Do not use <code class="language-plaintext highlighter-rouge">strace</code>.</li>
      <li>Read the strace output file contained in the environment variable.</li>
      <li>Filter out the file to just the process IDs that are relevant to the completed test.</li>
    </ol>
  </li>
</ol>

<p><strong>This does somewhat work</strong> - both tracing layers are able to identify the information they need.</p>

<p>But, if you can see the problem with this approach, then you are very, very clever.  I, on the other hand, did not.</p>

<p>You see, every time the child goes to read data from the strace output file, it has to perform a <code class="language-plaintext highlighter-rouge">read</code> syscall… which goes back to the parent’s strace… which gets written to the output file… which then the child has to read…</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/single-file-broken.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/single-file-broken-small.png" alt="Flow control diagram showing the nearly infinite loop of syscall output." /></a></p>

<p><strong>sigh</strong></p>

<p>It’s not quite an infinite loop.  (Well, it was when I first ran the code, of course.)  But there is an exit condition that can be added in: the integration test can stop reading the trace file once it finds the process exit record for its test case.  For example, in the diagram above the test was process 4321, and so process 1234 can stop when it hits the exit record for 4321.  So not infinite, but incredibly magnified.</p>
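<p>As a sketch of that exit condition (Python for illustration; the record format is simplified from strace’s real merged output):</p>

```python
import re

# In a merged multi-process trace, strace emits a record like
# "4321 +++ exited with 0 +++" when a traced PID terminates.
EXIT_RE = re.compile(r'^(\d+)\s+\+\+\+ exited with \d+ \+\+\+')

def read_until_exit(lines, test_pid: int) -> list:
    """Consume trace lines, stopping once test_pid's exit record appears."""
    consumed = []
    for line in lines:
        consumed.append(line)
        m = EXIT_RE.match(line)
        if m and int(m.group(1)) == test_pid:
            break  # the test is done; stop before re-reading our own reads
    return consumed

trace = [
    '4321 openat(AT_FDCWD, "fixture.txt", O_RDONLY) = 3',
    "4321 +++ exited with 0 +++",
    "1234 read(5, ...) = 512",  # our own read of the trace file: never reached
]
print(read_until_exit(trace, 4321))
```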

<p>The performance of this solution was very bad - the more tests that needed to be run in a subprocess, the more times it was re-reading the syscall data, which included every syscall involved in the last re-read of the data.  When combined with multiple parallel tests running, there’s a geometric growth in strace output that needs to be read.</p>

<p>So, I came up with an alternative that seemed far more promising… for a while.</p>

<h1 id="nested-tracing-v2-this-will-work-for-sure">Nested Tracing v2: This Will Work For Sure!</h1>

<p>strace has an option called <code class="language-plaintext highlighter-rouge">--output-separately</code>.  This changes the output location to a directory rather than a file, and outputs each process that was traced into its own file.</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/multiple-file-maybe.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/multiple-file-maybe-small.png" alt="Flow control diagram of testtrim running strace with `--output-separately`, as described below." /></a></p>

<p>This would work with testtrim wonderfully:</p>
<ol>
  <li>Use <code class="language-plaintext highlighter-rouge">--output-separately</code> to write one file per process.</li>
  <li>Pass the location of the files to the child process.</li>
<li>When the integration test’s test process completes, it can read just the child PID’s file – which belongs to a terminated subprocess, avoiding the syscall amplification.</li>
</ol>

<p><strong>There is, of course, a complication.</strong>  It seems small, and then it becomes really messy.</p>

<p>We can’t just read the file with the child’s PID; we also need to read the files for any subprocesses of that process, and for subprocesses of those processes.</p>

<p>Like I said, seems simple, right?  Read the child PID’s file, and since it is a record of syscalls, simply note any processes that it spawned and read those files as well.  Repeat as necessary.</p>
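<p>The recursive walk itself really is simple, assuming each per-PID file has been parsed into the list of child PIDs it spawned (a Python sketch with made-up PIDs; testtrim’s implementation is Rust):</p>

```python
def transitive_pids(root: int, children: dict) -> set:
    """Collect root plus every PID it (transitively) spawned.

    `children` maps a PID to the PIDs returned by its clone/fork/vfork
    syscalls, as recovered from that PID's strace output file.
    """
    seen = set()
    stack = [root]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        stack.extend(children.get(pid, []))
    return seen

# 4321 spawned 4322 and 4400; 4400 spawned 4500.
spawns = {4321: [4322, 4400], 4400: [4500]}
print(sorted(transitive_pids(4321, spawns)))  # [4321, 4322, 4400, 4500]
```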

<p>This seemed great!  For a bit.</p>

<p>But the subtle problem is that processes aren’t independent.  What software developers would usually call a “process” is a program execution which is operating independently; and we’d use the term “thread” to refer to multiple concurrent streams of execution within a process.  But in the Linux kernel, and in <code class="language-plaintext highlighter-rouge">strace</code>’s output, a thread is a process.  <code class="language-plaintext highlighter-rouge">strace</code> outputs each thread to a separate output file… but they have shared state that is relevant to our syscall tracing.</p>

<p>To keep things simple, we’ll use the current working directory as an example.  Let’s say pid 4321 and pid 4322 are threads, and our <code class="language-plaintext highlighter-rouge">strace</code> output shows:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cat </span>test-case.strace.4321
execve<span class="o">(</span><span class="s2">"test-executable"</span>, <span class="o">[</span><span class="s2">"test-case"</span><span class="o">])</span> <span class="o">=</span> ?
clone<span class="o">(</span>... CLONE_THREAD, ...<span class="o">)</span> <span class="o">=</span> 4322
...snip...
openat<span class="o">(</span>AT_FDCWD, <span class="s2">"fixture.txt"</span>, ...<span class="o">)</span> <span class="o">=</span> ...
...snip...
+++ exited with 0 +++

<span class="nv">$ </span><span class="nb">cat </span>test-case.strace.4322
...snip...
chdir<span class="o">(</span><span class="s2">"../test_data"</span><span class="o">)</span>
...snip...
+++ exited with 0 +++
</code></pre></div></div>

<p>We know:</p>
<ul>
  <li>PID 4321 opened the relative file path <code class="language-plaintext highlighter-rouge">fixture.txt</code></li>
  <li>PID 4322 changed the current working directory to <code class="language-plaintext highlighter-rouge">../test_data</code></li>
</ul>

<p>But we don’t know <strong>what order</strong> those two events happened in.  The order of them completely changes which file was accessed by the test.</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/multiple-file-broken.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/multiple-file-broken-small.png" alt="Flow control diagram of testtrim running strace with `--output-separately`, as described below." /></a></p>

<p>After I started sending <code class="language-plaintext highlighter-rouge">strace</code> data to separate files, I no longer knew the order of the events in them.  And if threads can affect shared state, like the current working directory of the process, or share file descriptors between themselves, the effective events of the process are lost.</p>
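<p>The ambiguity is easy to demonstrate: replaying the same two records in the two possible orders resolves two different files (a Python sketch; the paths are illustrative):</p>

```python
import posixpath

def resolve_opens(events, cwd="/repo/tests"):
    """Replay chdir/openat events in order, resolving relative opens."""
    opened = []
    for kind, path in events:
        if kind == "chdir":
            cwd = posixpath.normpath(posixpath.join(cwd, path))
        elif kind == "openat":
            opened.append(posixpath.normpath(posixpath.join(cwd, path)))
    return opened

open_then_chdir = [("openat", "fixture.txt"), ("chdir", "../test_data")]
chdir_then_open = [("chdir", "../test_data"), ("openat", "fixture.txt")]
print(resolve_opens(open_then_chdir))  # ['/repo/tests/fixture.txt']
print(resolve_opens(chdir_then_open))  # ['/repo/test_data/fixture.txt']
```

<p>Same two records, two different dependencies - which is exactly the information the per-PID files throw away.</p>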

<p>I took a brief look at using strace’s timing outputs to provide ordering, but first, it seemed sketchy in terms of accuracy; and second, the performance of strace gets much, much worse.  That performance drop makes sense - it needs to constantly read the system clock.</p>

<h1 id="nested-tracing-v3-syscall-tracing-sans-syscalls">Nested Tracing v3: syscall Tracing Sans syscalls?</h1>

<p>Getting to a solution that met all the criteria required a little backtracking, one simple tweak, and one complicated tweak.</p>

<p>First, I reverted to the single output stream, rather than <code class="language-plaintext highlighter-rouge">--output-separately</code> - other than timestamps, this was the only way I was going to get an ordered view of the syscalls.</p>

<p>The simple tweak was to start to stream the data from strace into the parent process through a FIFO pipe, which allowed the parent process to parse and understand the data asynchronously while waiting for the subprocess to finish.  Now that data was live in the parent process, I just needed to get access to that event stream from the child… without using syscalls.  (Well, syscalls to get things “started” were fine, but there had to be no syscalls involved in reading data.)</p>

<p><a href="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/nested-pipe-complete.png"><img src="https://mathieu.fenniak.net/assets/testtrim-syscall-tracing/nested-pipe-complete-small.png" alt="Flow control diagram of testtrim running with a shared memory buffer, as described below." /></a></p>

<p>Simplifying a few details, the architecture ends up looking like this:</p>
<ol>
  <li>The testtrim process opens a UNIX socket, and passes the address down to subprocesses through an environment variable.</li>
  <li>When an integration test wants to start a test-case with nested tracing, it connects to the testtrim process, and sends a request to subscribe to the PID of the child process.</li>
  <li>testtrim starts a shared-memory buffer and begins to store any syscalls from the PID of the child.  This is where the live connection through the pipe to strace matters - we’re able to instantly connect into that data stream and copy it into the buffer.</li>
  <li>testtrim sends back to the test process an address for the shared memory buffer.</li>
  <li>On an ongoing basis, testtrim writes syscall data to the shared buffer, and the test process reads it out.  A ring buffer is used so that we can have a fixed memory allocation which supports the continuous stream between the two processes.</li>
</ol>
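<p>As a rough single-process sketch of the ring-buffer idea (Python for illustration; the real buffer lives in shared memory between two processes, needs cross-process synchronization, and is written in Rust):</p>

```python
class RingBuffer:
    """Fixed-size byte buffer: a writer appends, a reader drains.

    A sketch of the fixed-allocation stream between testtrim and the test
    process; cross-process synchronization and blocking are elided here.
    """

    def __init__(self, capacity: int):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.read_pos = 0   # absolute offsets; taken modulo capacity to index
        self.write_pos = 0

    def write(self, data: bytes) -> int:
        """Append as many bytes as currently fit; return how many were written."""
        free = self.capacity - (self.write_pos - self.read_pos)
        data = data[:free]
        for b in data:
            self.buf[self.write_pos % self.capacity] = b
            self.write_pos += 1
        return len(data)

    def read(self, n: int) -> bytes:
        """Drain up to n available bytes."""
        n = min(n, self.write_pos - self.read_pos)
        out = bytes(self.buf[(self.read_pos + i) % self.capacity] for i in range(n))
        self.read_pos += n
        return out

rb = RingBuffer(8)
rb.write(b"openat(")  # 7 bytes buffered
print(rb.read(4))     # b'open'
```

<p>Because the allocation is fixed, reading and writing never have to grow memory or touch the filesystem - which is the whole point of the exercise.</p>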

<p>There are a few tricky parts of the implementation:</p>
<ul>
  <li>When a child process starts a test, we can’t miss syscalls - so we need to synchronize starting up the child process, subscribing to the events, receiving acknowledgement that we subscribed, and then allowing the child process to actually start.</li>
  <li>We can’t guarantee the order of syscalls coming out of strace: a child process’s <code class="language-plaintext highlighter-rouge">clone</code> or <code class="language-plaintext highlighter-rouge">fork</code> or whatever might begin in the syscall stream, and then syscalls from that child process would appear, and then the finish of the clone/fork/etc could be emitted.  So testtrim needs to occasionally “lightly reorder” the syscalls during process startup, just to ensure that subprocesses are recognized correctly.</li>
<li>When a test process terminates, we need to make sure that all the syscalls are delivered to the child process, and none are still in-memory.  To accomplish this, the testtrim process keeps an eye out for process exit messages and sends an EOF marker into the shared memory buffer.</li>
</ul>

<h1 id="conclusion">Conclusion</h1>

<p>Does it work?</p>

<p><strong>Yes.</strong></p>

<p>It took more than a few passes to get the synchronization logic correct, and with this complexity there might still be a surprise or two to learn.  But, it functions!</p>

<p>With this design, nested strace data collection is possible, and the order of syscalls is preserved such that we can recognize dependencies between threads like changing the working directory on one thread, and opening a file on another.</p>

<p>This has been the final piece in the puzzle to getting testtrim to be able to run on its own project, in its own CI.  It’s a shame that the nested logic will never be useful for anything else, but it’s pretty cool.</p>

<p>There’s one more neat thing that testtrim is doing with syscall tracing which is more generally applicable to other test projects - but that seems like a topic for another post.  Until then, hope you’ve enjoyed this deep-dive!</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Today, we’re going to deep-dive into the kind of thing you can only “invest” time on if you’re a single engineer working on a project with no supervision. I just finished a crazy complicated development effort in my project, testtrim, and all I want to do is talk about how surprised I am that it actually worked.]]></summary></entry><entry><title type="html">testtrim: Almost “Self-Hosted”</title><link href="https://mathieu.fenniak.net/testtrim-2024-12-almost-self-hosted/" rel="alternate" type="text/html" title="testtrim: Almost “Self-Hosted”" /><published>2024-12-21T07:00:00+00:00</published><updated>2024-12-21T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/testtrim-almost-self-hosted</id><content type="html" xml:base="https://mathieu.fenniak.net/testtrim-2024-12-almost-self-hosted/"><![CDATA[<p>I’ve been working on a project called testtrim, which targets the execution of software automated tests based upon previous code-coverage data and git changes.  The concept <a href="/introducing-testtrim/">was introduced</a> in October 2024, and over the past two months it has made some great progress towards its most interesting goal.  It is getting dangerously close to a major milestone of being “self-hosted” – in this context that means being used as the engine to run its own tests in its own continuous integration system.  Today I’ll review the goals and challenges currently on the table, and then make an inventory of all the improvements in the tool since the concept introduction.</p>

<p>Table of contents:</p>
<ul>
  <li><a href="#current-project-goals">Current Project Goals</a></li>
  <li><a href="#new-features-since-last-update">New Features Since Last Update</a>
    <ul>
      <li><a href="#syscall-tracing">syscall Tracing</a></li>
      <li><a href="#external-dependency-tracing">External Dependency Tracing</a></li>
      <li><a href="#go--net-platforms">Go &amp; .NET Platforms</a></li>
      <li><a href="#remote-api-server--client">Remote API Server &amp; Client</a></li>
      <li><a href="#history-simulation">History Simulation</a></li>
      <li><a href="#minor-features">Minor Features</a></li>
    </ul>
  </li>
  <li><a href="#testtrim-on-testtrim-self-hosting">testtrim on testtrim (“Self Hosting”)</a>
    <ul>
      <li><a href="#real-world-network-tracing">Real-world Network Tracing</a></li>
      <li><a href="#recursive-strace-failure">Recursive strace Failure</a></li>
    </ul>
  </li>
  <li><a href="#whats-the-future">What’s the future?</a></li>
  <li><a href="#follow-along-share-your-thoughts">Follow along, share your thoughts!</a></li>
</ul>

<h2 id="current-project-goals">Current Project Goals</h2>

<p>The core goal of the project is: <strong>determine whether this will work for large-scale projects</strong>.</p>

<p>In short, the concept is:</p>
<ol>
<li>Just like you would do to report test coverage (eg. “75% of our code is tested!”), run your tests with a coverage tool.  But rather than running the entire test suite and reporting generalized coverage, run each individual test to get the coverage for each test.</li>
  <li>Invert the data; change “test case touches code” into “code was touched by test case”, and then store it into a database.</li>
  <li>Look at a source control diff since the last time you did #2 to find out what changes occurred to the code, then look them up in the database to see what test cases need to be run.</li>
</ol>

<p>It has been trivial to prove that this will work on microscopic test projects, but real-world projects come with a lot of complexity which muddies the issue:</p>
<ul>
  <li>Tests which read data files, for example for test fixture data</li>
  <li>Tests which access network resources, which you might want to run always, or you might want to run never, or somewhere in-between</li>
  <li>Programming environments which allow you to embed data files into code (eg. in Rust, <code class="language-plaintext highlighter-rouge">include!()</code> and similar; in Go <code class="language-plaintext highlighter-rouge">//go:embed</code>; etc.)</li>
  <li>Upgrades of third-party dependencies</li>
</ul>

<p>These are just some of the most obvious problems.  In order to uncover more problems and develop solutions to them, testtrim has two medium-term goals:</p>
<ol>
<li>Run testtrim’s own test automation suite, in its own CI system, under testtrim.</li>
  <li>Incorporate testtrim into an Open Source product’s CI system.</li>
</ol>

<p>As these goals are tackled I expect to get some clarity on that core determination, and find out whether this concept will work, won’t work, or most likely, that it will work but has some limited applicability which I can begin to define.</p>

<h2 id="new-features-since-last-update">New Features Since Last Update</h2>

<h3 id="syscall-tracing">syscall Tracing</h3>

<p>On Linux, testtrim is capable of tracing a test with <code class="language-plaintext highlighter-rouge">strace</code> in order to identify all the system calls (syscalls) that are being made by a test.  After analyzing the output, testtrim identifies two things that add onto the code-coverage logic and could cause additional reasons for a test to run again in the future:</p>

<ul>
  <li>Opening local files within the repo
    <ul>
      <li>If <code class="language-plaintext highlighter-rouge">test_a</code> reads a local file <code class="language-plaintext highlighter-rouge">tests/fixture/data.txt</code> when the test is run, then in the future if the git data shows that <code class="language-plaintext highlighter-rouge">tests/fixture/data.txt</code> is modified, testtrim will rerun <code class="language-plaintext highlighter-rouge">test_a</code>.</li>
    </ul>
  </li>
  <li>Network access
    <ul>
<li>If <code class="language-plaintext highlighter-rouge">test_a</code> opens a network connection to <code class="language-plaintext highlighter-rouge">10.1.1.1:5432</code> when the test is run, then in the future testtrim will always rerun <code class="language-plaintext highlighter-rouge">test_a</code> with the assumption that the network access might “be different now”.</li>
    </ul>
  </li>
</ul>

<p>However, the network access logic of “rerun every network test” is extremely conservative.  So, testtrim now supports <a href="https://codeberg.org/testtrim/testtrim#network-configuration">network configuration</a> to customize the behavior for network access; it is now possible to:</p>
<ul>
  <li>Ignore test network access – good for something like an internal test server.</li>
  <li>Rerun tests that access the network, only when other files change – good for something like “rerun all database tests whenever the database schema in these files change”.</li>
</ul>

<h3 id="external-dependency-tracing">External Dependency Tracing</h3>

<p>When you upgrade an external dependency, testtrim will identify which tests used that dependency and rerun those tests.  The modification to <code class="language-plaintext highlighter-rouge">Cargo.lock</code> or <code class="language-plaintext highlighter-rouge">go.mod</code> will be used to identify the changed module, and coverage-tracking being enabled across the dependencies will be used to find the relevant tests.</p>
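<p>As a sketch of the lock-file diff side (Python for illustration, with a deliberately simplified <code class="language-plaintext highlighter-rouge">{package: version}</code> model rather than the real <code class="language-plaintext highlighter-rouge">Cargo.lock</code> or <code class="language-plaintext highlighter-rouge">go.mod</code> formats):</p>

```python
def changed_dependencies(old_lock: dict, new_lock: dict) -> set:
    """Return package names added, removed, or re-versioned between two
    lock-file snapshots, each modeled as {package: version}."""
    names = old_lock.keys() | new_lock.keys()
    return {n for n in names if old_lock.get(n) != new_lock.get(n)}

old = {"serde": "1.0.200", "tokio": "1.38.0"}
new = {"serde": "1.0.210", "tokio": "1.38.0", "regex": "1.10.5"}
print(sorted(changed_dependencies(old, new)))  # ['regex', 'serde']
```

<p>The changed package names then feed into the same lookup as file changes: find the tests whose coverage touched those dependencies, and rerun them.</p>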

<h3 id="go--net-platforms">Go &amp; .NET Platforms</h3>

<p>testtrim is a project written in Rust, and so it targeted Rust as a first platform to operate on.  I added some abstractions early in the development to support multiple platforms in the future, but I didn’t have any idea whether those abstractions would allow other platforms to actually be implemented.</p>

<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th style="text-align: center">Rust</th>
      <th style="text-align: center">Go</th>
      <th style="text-align: center">.NET (C#, etc.)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>File-based coverage tracking (ie. changes that will affect tests are tracked on a file-by-file basis; the least granular but simplest approach)</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td>Function-based coverage tracking (Only theorized, not implemented at all yet)</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
    </tr>
    <tr>
      <td>External dependency change tracking</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">❌</td>
    </tr>
    <tr>
      <td>syscall tracking for file &amp; network tracking</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">❌</td>
    </tr>
    <tr>
      <td>Embedded file tracking (ie. if a file embeds another file, changes to either will trigger related tests)</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">❌</td>
    </tr>
    <tr>
      <td>Performance</td>
      <td style="text-align: center">👍</td>
      <td style="text-align: center">OK</td>
      <td style="text-align: center">Mega-👎</td>
    </tr>
    <tr>
      <td>Test self-tracing (ie. modifications to tests are traced by coverage)</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">✅</td>
    </tr>
  </tbody>
</table>

<p>The Go platform is basically feature-for-feature comparable with Rust.  There is one major exception: Go doesn’t perform coverage tracing on test files (<code class="language-plaintext highlighter-rouge">*_test.go</code>).  testtrim tries to cover this up with some file heuristics and regexes, but it does leave a gap where functions in <code class="language-plaintext highlighter-rouge">*_test.go</code> that are reused in other tests won’t trigger their execution correctly.</p>

<p>The .NET platform ran into some very large limitations with external dependency tracking and has been frozen half-implemented for the time being.  Further developing it doesn’t get me any closer to the main “determine whether this will work for large-scale projects” goal, but I think it will be possible to improve it in the future at the right time.</p>

<h3 id="remote-api-server--client">Remote API Server &amp; Client</h3>

<p>In order to use testtrim in a CI environment, it will be necessary to have a persistent database that is accessible between CI invocations.  To support this, testtrim has an API server and client that exposes its coverage database.</p>

<p>The server is run with the <code class="language-plaintext highlighter-rouge">testtrim run-server</code> subcommand.</p>

<p>The client is used by setting the <code class="language-plaintext highlighter-rouge">TESTTRIM_DATABASE_URL</code> environment variable to the http or https address of the server.</p>

<h3 id="history-simulation">History Simulation</h3>

<p>I’ve been exploring how effective testtrim is by taking Open Source projects and running testtrim against their recent commits.  In other words: go back 100 commits, then run testtrim on each subsequent commit to see how many tests it would run.</p>

<p>The <code class="language-plaintext highlighter-rouge">testtrim simulate-history</code> subcommand automates this process and outputs a CSV file with statistical information about the simulation.</p>

<p>For example, as part of developing the Go platform I completed a simulation of an Open Source project called <a href="https://github.com/uber-go/zap">zap</a>.  The <a href="https://docs.google.com/spreadsheets/d/1PQ3kXqmS8lTQAlvsmjT96Imkz1DX2olSTmnSnGNqA5A/edit?usp=sharing">simulation results</a> are in line with other Rust projects, suggesting about 90% of the tests are skipped on average.</p>

<table>
  <thead>
    <tr>
      <th># of Commits</th>
      <th>100</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td># of Commits Successful Build &amp; Test</td>
      <td>100</td>
    </tr>
    <tr>
      <td># of Commits with Ancestor; testtrim Worked</td>
      <td>99</td>
    </tr>
    <tr>
      <td>Average Tests to Run per Commit:</td>
      <td>10.1%</td>
    </tr>
    <tr>
      <td>Median Tests to Run per Commit:</td>
      <td>0.4%</td>
    </tr>
    <tr>
      <td>P90 Tests to Run per Commit:</td>
      <td>36.6%</td>
    </tr>
    <tr>
      <td> </td>
      <td> </td>
    </tr>
    <tr>
      <td>Average # Files Changed per Commit in Range:</td>
      <td>3.97</td>
    </tr>
    <tr>
      <td>Average Ext. Dep. Changed per Commit in Range:</td>
      <td>1.25</td>
    </tr>
  </tbody>
</table>
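The gap between the average (10.1%) and the median (0.4%) reflects a skewed distribution: most commits need almost no tests rerun, while a few need many.  The summary rows themselves are ordinary statistics over the per-commit “percent of tests to run” values; a rough sketch of how they can be computed (illustrative only, not testtrim’s actual reporting code):

```rust
/// Mean of the per-commit "percent of tests to run" values (0.0–100.0).
fn average(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Nearest-rank style percentile: p = 0.5 gives the median, p = 0.9 gives P90.
fn percentile(values: &[f64], p: f64) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let idx = ((sorted.len() - 1) as f64 * p).round() as usize;
    sorted[idx]
}
```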

<h3 id="minor-features">Minor Features</h3>

<ul>
  <li>Embedded files:
    <ul>
      <li>When files are embedded in the source code (Rust: <code class="language-plaintext highlighter-rouge">include!(), include_str!(), include_bytes!()</code>, Go: <code class="language-plaintext highlighter-rouge">//go:embed</code>), any modifications to those files will be treated as a modification to the embedding file.  This means that related tests should be rerun automatically.</li>
    </ul>
  </li>
  <li>Test coverage tags:
    <ul>
      <li>When running tests, the <code class="language-plaintext highlighter-rouge">--tags</code> command-line option can be used to differentiate coverage, allowing the same project to be tested in different configurations.  For example, you could run tests with a tag <code class="language-plaintext highlighter-rouge">database=postgresql</code>, and later tests with a tag <code class="language-plaintext highlighter-rouge">database=mysql</code>, and the two tags would have coverage maps tracked separately.  This would allow a change that only affects one codepath to trigger only tests that are related to that codepath.</li>
      <li>An automatic <code class="language-plaintext highlighter-rouge">platform</code> tag is enabled by default, eg. <code class="language-plaintext highlighter-rouge">x86_64-unknown-linux-gnu</code>.  If a project is multiplatform but uses conditional compilation, this will prevent corrupted coverage data from leaking between the two platforms.</li>
    </ul>
  </li>
  <li>Transparency into test targeting:
    <ul>
      <li>When <code class="language-plaintext highlighter-rouge">get-test-identifiers</code> is run, testtrim outputs each test that it plans to run for the current repository, as well as <strong>why</strong> it will run.  It might run because: there’s no coverage map to calculate results on, the test is new, a file has changed, a network dependency is present, an external dependency of the project has changed, etc.</li>
    </ul>
  </li>
  <li>Parallel test execution:
    <ul>
      <li>For the Go and Rust projects, test execution is parallelized to improve performance.</li>
    </ul>
  </li>
  <li>Release tooling:
    <ul>
      <li>In order to run testtrim in a CI environment and host a testtrim API server, I need to have a released version of testtrim.  testtrim now has a fully automated release process with automatic changelogs, and publishes an OCI container for running the API server.</li>
    </ul>
  </li>
</ul>
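The coverage-tag separation above can be pictured as keying the stored coverage data by the full tag set, so that differently tagged runs never mix.  A minimal Rust sketch — hypothetical data structures, not testtrim’s actual schema:

```rust
use std::collections::{HashMap, HashSet};

/// A tag set such as [("database", "postgresql"), ("platform", "x86_64-unknown-linux-gnu")].
type TagSet = Vec<(String, String)>;

/// Hypothetical coverage store keyed by the full tag set; coverage recorded
/// under one tag set is invisible to lookups under any other tag set.
#[derive(Default)]
struct CoverageStore {
    by_tags: HashMap<TagSet, HashMap<String, HashSet<String>>>, // tags -> test -> files touched
}

impl CoverageStore {
    fn record(&mut self, tags: &TagSet, test: &str, file: &str) {
        self.by_tags
            .entry(tags.clone())
            .or_default()
            .entry(test.to_string())
            .or_default()
            .insert(file.to_string());
    }
}
```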

<h2 id="testtrim-on-testtrim-self-hosting">testtrim on testtrim (“Self Hosting”)</h2>

<p>A lot of issues have been resolved on the pathway to getting <a href="https://codeberg.org/testtrim/testtrim/pulls/170">testtrim to run on itself</a>, but a couple of complex issues remain:</p>

<h3 id="real-world-network-tracing">Real-world Network Tracing</h3>

<p>testtrim runs its tests under <code class="language-plaintext highlighter-rouge">strace</code> and captures the network access that each test performs.  It’s very common for an automated test to reach out to a database, or to a locally hosted version of the application, for example.</p>

<pre class="mermaid">
flowchart LR
    localhost:5432[(PostgreSQL on localhost:5432)]
    localhost:8443[(API on localhost:8443)]
    testtrim--&gt;test_coverage_db
    test_coverage_db--&gt;localhost:8443
    test_coverage_db--&gt;localhost:5432
</pre>

<p>testtrim has new capabilities to match against these network access patterns and apply rules to them.</p>
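A rule engine along these lines might look roughly like the following — a hypothetical sketch, not testtrim’s actual policy format:

```rust
use std::net::SocketAddr;

/// Hypothetical rule shapes: decide whether a traced connect() address
/// should be recorded as a test dependency.
enum Rule {
    IgnorePort(u16), // e.g. 53, so DNS lookups aren't treated as dependencies
    IgnoreLoopback,  // e.g. a database the test harness hosts itself
}

/// An address counts as a dependency unless some rule says to ignore it.
fn is_dependency(addr: &SocketAddr, rules: &[Rule]) -> bool {
    !rules.iter().any(|rule| match rule {
        Rule::IgnorePort(port) => addr.port() == *port,
        Rule::IgnoreLoopback => addr.ip().is_loopback(),
    })
}
```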

<p>When running testtrim-on-testtrim, most of the tests had network access patterns that were easy to match and apply the right logic to.  But one end-to-end integration test exposed some complications.  This is trimmed output from the <code class="language-plaintext highlighter-rouge">get-test-identifiers</code> command, which in testtrim shows you which tests will be run and why; all the easily handled entries have been removed:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RustTestIdentifier { test_src_path: "tests/linearcommits_filecoverage_tests.rs", test_name: "linearcommits_filecoverage::dotnet_test::dotnet_linearcommits_filecoverage" }
  ...
	NetworkDependency(Inet([::ffff:152.199.4.184]:443))
	NetworkDependency(Inet([::ffff:192.229.211.108]:80))
	NetworkDependency(Inet([::ffff:23.217.131.226]:80))
	NetworkDependency(Inet([2001:67c:1401:20f0::1]:443))
	NetworkDependency(Inet(10.89.0.1:53))
	NetworkDependency(Inet(217.197.91.145:443))
  ...
</code></pre></div></div>

<p>The remaining network dependencies fall into three categories:</p>
<ul>
  <li>Downloading from nuget.org during the .NET build process.  This specific test runs a .NET build with an external dependency that needs to be downloaded – so there’s network access on every run, but it’s downloading a file that doesn’t change between test runs.</li>
  <li>Downloading a test repository (dotnet-test-specimen) from codeberg.org to run the .NET project against.  This is somewhat the “subject under test” and might be kinda fair – but again, I don’t want this test to rerun just because it performs this download.</li>
  <li>Accessing a CRL/CA endpoint to check whether any https access is subject to a certificate revocation list.  There is extremely little reason to rerun tests because of this dependency – but I need to identify how to ignore it accurately.</li>
</ul>

<p>Boiling down these examples a little gives two problems:</p>
<ul>
  <li>Sometimes a test might download “static content”.  You don’t want the test to rerun just because it does this download; you want it to rerun if the content changes.
    <ul>
      <li><code class="language-plaintext highlighter-rouge">nixpkgs</code> has an approach to external dependencies that might be an inspiration for a solution here: download the dependencies outside of the test, hash them, use the hashes as a test input that testtrim traces, and update the tests to use the downloaded local copies.  It’s a great outcome for test quality – the tests get the right external content, they rerun when it changes, and they accomplish the same goal – but it has the downside of drastically changing how a test is run.  Maybe with tooling that does the heavy lifting it would be feasible?</li>
    </ul>
  </li>
  <li>When a test accesses a network resource, the syscall tracing doesn’t capture the DNS resolution involved, just the raw socket connect syscalls with their IP addresses.  As a result it’s hard to create useful long-term network policies.
    <ul>
      <li>I’d love to be able to intercept the DNS requests a process makes and match the network policies on DNS names.  With modern Linux systems that use caching daemons (<code class="language-plaintext highlighter-rouge">nscd</code>) I’m not sure if I can capture these requests through an <code class="language-plaintext highlighter-rouge">strace</code>…</li>
    </ul>
  </li>
</ul>
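The nixpkgs-inspired idea — fingerprint pre-downloaded content so the fingerprint, not the network access, is what testtrim traces — can be sketched in a few lines.  This is a hypothetical illustration, not an implemented testtrim feature; a real implementation would use a cryptographic hash such as SHA-256 (as nixpkgs does), since <code class="language-plaintext highlighter-rouge">DefaultHasher</code> isn’t stable across Rust releases:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fingerprint content that was downloaded *before* the test runs.  If the
/// fingerprint is written to a file the test reads, testtrim's file tracing
/// sees it, and the test reruns exactly when the remote content changes.
fn content_fingerprint(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}
```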

<h3 id="recursive-strace-failure">Recursive strace Failure</h3>

<p>When testtrim runs on a project, it runs each test independently to collect its coverage data.  Each of those tests is also run under <code class="language-plaintext highlighter-rouge">strace</code> in order to capture the network and filesystem access that the test performs.</p>

<p>Imagine that you’re running testtrim on a project called “Test Project”, which has a test case “ABC”.  After discovering all the tests in the project and evaluating what needs to be run, testtrim will eventually run the test “ABC”, wrapped in <code class="language-plaintext highlighter-rouge">strace</code>:</p>

<pre class="mermaid">
sequenceDiagram
    testtrim-&gt;&gt;strace: Run Test ABC
    strace-&gt;&gt;Test Project: Run Test ABC
    Test Project-&gt;&gt;strace: Test Results w/ Coverage
    strace-&gt;&gt;testtrim: Test Results w/ Coverage &amp; syscalls
</pre>

<p>However, when I’m attempting to run testtrim on testtrim, it has a small number of test cases which validate the strace functionality.  That means that after discovering all the tests in the target project (testtrim), testtrim will try to run a subprocess to execute that test under strace.  (<code class="language-plaintext highlighter-rouge">testtrim(cli)</code> here indicates the testtrim command-line that is being executed, and <code class="language-plaintext highlighter-rouge">testtrim(sut)</code> indicates testtrim in the context of being a subject-under-test):</p>

<pre class="mermaid">
sequenceDiagram
    testtrim(cli)-&gt;&gt;strace(1): Run Test 'e2e-test'
    strace(1)-&gt;&gt;testtrim(sut): Run Test 'e2e-test'
    testtrim(sut)-&gt;&gt;strace(2): Test uses `strace`...
    strace(2)--xtesttrim(sut): PTRACE_TRACEME:<br />Operation not permitted
</pre>

<p>This failure occurs because <code class="language-plaintext highlighter-rouge">strace(1)</code> is running the test process with the <code class="language-plaintext highlighter-rouge">--follow-forks</code> option to capture information from all subprocesses.  Since the subprocess is already being traced, a second, nested invocation of strace fails.</p>

<p>This is an outstanding issue that prevents testtrim-on-testtrim, but I think it’s a pretty narrow problem that doesn’t generalize very well to other projects.  That said, I still want to get it resolved.  There are a few approaches that I have in mind:</p>

<ul>
  <li>It would be possible to mark some tests as exempt from syscall tracing.  The affected tests wouldn’t be reexecuted when they should be, but it’s an option.</li>
  <li>It might be possible for the subprocess to recognize that it is being traced already by testtrim, and then work with the tracing data generated by it.  This would look something like <code class="language-plaintext highlighter-rouge">testtrim(cli)</code> setting an environment variable <code class="language-plaintext highlighter-rouge">TESTTRIM_STRACE_OUTPUT</code>, and then when <code class="language-plaintext highlighter-rouge">testtrim(sut)</code> starts its test it won’t start a new strace, but instead will read the data file at <code class="language-plaintext highlighter-rouge">TESTTRIM_STRACE_OUTPUT</code> for its syscalls.  It’s a pretty ugly solution.</li>
  <li>testtrim could have a new capability added to skip tests.  Although this doesn’t truly solve the problem, it might be a reasonable workaround – I want testtrim to run in its own CI, but I also can’t trust testtrim to test testtrim, so it might make sense for testtrim to rerun its own test suite unconditionally.</li>
  <li>testtrim could use some compile-time flags to remove the tests, rather than skip tests, during the CI.</li>
</ul>
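A building block for any of these options: on Linux, a process can at least detect that a tracer is already attached by reading the <code class="language-plaintext highlighter-rouge">TracerPid</code> field of <code class="language-plaintext highlighter-rouge">/proc/self/status</code> (0 means “not traced”).  A minimal sketch, not something testtrim does today:

```rust
use std::fs;

/// Parse the TracerPid field out of /proc/[pid]/status content.
fn tracer_pid(status: &str) -> Option<u32> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("TracerPid:"))
        .and_then(|pid| pid.trim().parse().ok())
}

/// True when a tracer (e.g. an outer strace) is already attached to this
/// process.  Linux-only: relies on the /proc filesystem.
fn already_traced() -> std::io::Result<bool> {
    let status = fs::read_to_string("/proc/self/status")?;
    Ok(tracer_pid(&status).unwrap_or(0) != 0)
}
```

A test binary that checked this at startup could decide to skip spawning its own nested strace and fall back to one of the workarounds above.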

<p>It’s an interesting problem, and one I’m a little stumped on – beyond “don’t run these tests.”</p>

<h2 id="whats-the-future">What’s the future?</h2>

<p>Short term:</p>
<ul>
  <li>Finish integrating testtrim into its own CI.
    <ul>
      <li>Discover new problems from this.  Fix them.</li>
    </ul>
  </li>
  <li>Choose a target Open Source project to integrate testtrim into.
    <ul>
      <li>Discover new problems from this.  Fix them.</li>
    </ul>
  </li>
</ul>

<p>Medium-term:</p>
<ul>
  <li>testtrim should be usable on developer workstations to reduce test cycle time.  But there are some obvious blockers:
    <ul>
      <li>Syscall tracing is only supported on Linux.</li>
      <li>Other capabilities that might be platform-dependent (eg. file pathing related work) haven’t been tested extensively outside of Linux.</li>
    </ul>
  </li>
  <li>Evaluate alternatives to having testtrim be a test executor.
    <ul>
      <li>Every platform has basic or advanced tools for test execution; for example, Rust has a great <code class="language-plaintext highlighter-rouge">cargo test</code> capability, but also an advanced alternative, <a href="https://nexte.st/">cargo-nextest</a>, with great value-add.  I want testtrim to be able to focus on the challenging problem of fast, accurate test impact analysis and test selection; I don’t want to rebuild all the other capabilities of a test executor.  Reconciling these two paths will be important to making a tool that can be a value-add to an ecosystem.</li>
    </ul>
  </li>
</ul>

<p>Longer-term:</p>
<ul>
  <li>Distributed testing – eg. web application testing – I have a theory on how to do this with Open Telemetry tracing, but no design, no implementation today.</li>
</ul>

<h2 id="follow-along-share-your-thoughts">Follow along, share your thoughts!</h2>

<p>testtrim is under heavy development, and work will continue in 2025… until we see whether it will really be practical.  It’s an Open Source project, and I’d love to get feedback from people on the concept, and share in the development as it goes.</p>

<ul>
  <li><a href="https://codeberg.org/testtrim/testtrim">Source Repo</a></li>
  <li><a href="https://www.youtube.com/@mfenniak">YouTube Channel for Updates</a></li>
  <li><a href="https://yyc.bike/@mfenniak">Mastodon for Updates / Feedback</a></li>
  <li><a href="https://bsky.app/profile/mathieu.fenniak.net">Bluesky for Updates / Feedback</a></li>
  <li><a href="https://mathieu.fenniak.net/feed.xml">Blog RSS for Updates</a></li>
</ul>

<script type="module">
  import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
  mermaid.initialize({ startOnLoad: true });
</script>]]></content><author><name></name></author><summary type="html"><![CDATA[I’ve been working on a project called testtrim, which targets the execution of software automated tests based upon previous code-coverage data and git changes. The concept was introduced in October 2024, and over the past two months it has made some great progress towards its most interesting goal. It is getting dangerously close to a major milestone of being “self-hosted” – in this context that means being used as the engine to run its own tests in its own continuous integration system. Today I’ll review the goals and challenges currently on the table, and then make an inventory of all the improvements in the tool since the concept introduction.]]></summary></entry><entry><title type="html">Introducing testtrim: Coverage-based Test Targeting</title><link href="https://mathieu.fenniak.net/introducing-testtrim/" rel="alternate" type="text/html" title="Introducing testtrim: Coverage-based Test Targeting" /><published>2024-10-16T07:00:00+00:00</published><updated>2024-10-16T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/introducing-testtrim</id><content type="html" xml:base="https://mathieu.fenniak.net/introducing-testtrim/"><![CDATA[<p>I’ve been working on a project called testtrim, which targets software automated tests for execution based upon previous code-coverage data and git changes.  It’s in early development, but it’s looking quite promising with evaluations showing that on-average 90% of tests can be safely skipped with this strategy.</p>

<p>If you’re more inclined to follow this kind of content in a video format, I’ve also published an introductory video (in two formats):</p>

<p><a href="https://youtu.be/wNPeTxf3xFw">Short Introduction Video, 10 minutes</a></p>

<p><a href="https://youtu.be/YQKc58dTR1M">Deep-dive Introduction Video, 37 minutes</a></p>

<h2 id="problem">Problem</h2>

<p>The longer a good software project lives, the longer its automated testing solution will take to run.  This is especially true if the engineers responsible aren’t able to dedicate a lot of time to carefully managing their test suite, pruning it like a Japanese garden into a state of zen minimalism.  In my experience, nobody has time for that.</p>

<p>Long automated testing cycles can be managed pretty effectively; they can be parallelized, they can be run on the fastest hardware, changes can be batched together, they can be moved around to different stages of the release cycle to get early likely-failure feedback, and of course, they can be <code class="language-plaintext highlighter-rouge">[Ignored]</code> (or <code class="language-plaintext highlighter-rouge">.skip</code> or <code class="language-plaintext highlighter-rouge">#[ignore]</code> or whatever.)</p>

<p>But can they be made more efficient by only running the things that are relevant to test the change being made?</p>

<h2 id="testtrims-equation">testtrim’s Equation</h2>

<ol>
  <li>
    <p>Just as you would to report test coverage (eg. “75% of our code is tested!”), run the tests under a coverage tool.  But rather than running the entire test suite and reporting aggregate coverage, run each test individually to get per-test coverage.</p>
  </li>
  <li>
    <p>Invert the data; change “test case touches code” into “code was touched by test case”, and then store it into a database.</p>
  </li>
  <li>
    <p>Look at a source control diff since the last time you did #2 to find out what changes occurred to the code, then look them up in the database to see what test cases need to be run.</p>
  </li>
</ol>

<p>This is the core concept behind <a href="https://codeberg.org/testtrim/testtrim">testtrim</a>.</p>
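The three steps above boil down to a small amount of data manipulation.  Here’s a sketch in Rust with hypothetical in-memory data structures (testtrim’s real storage is a database):

```rust
use std::collections::{HashMap, HashSet};

/// Step 2: invert "test -> files it touched" into "file -> tests that touched it".
fn invert_coverage(
    coverage: &HashMap<String, HashSet<String>>,
) -> HashMap<String, HashSet<String>> {
    let mut by_file: HashMap<String, HashSet<String>> = HashMap::new();
    for (test, files) in coverage {
        for file in files {
            by_file
                .entry(file.clone())
                .or_default()
                .insert(test.clone());
        }
    }
    by_file
}

/// Step 3: given the files changed in a source control diff, look up which
/// tests need to be rerun.
fn tests_to_run(
    by_file: &HashMap<String, HashSet<String>>,
    changed_files: &[&str],
) -> HashSet<String> {
    changed_files
        .iter()
        .filter_map(|f| by_file.get(*f))
        .flatten()
        .cloned()
        .collect()
}
```

Everything hard about the real tool — per-test coverage collection, external dependencies, network access — is about making those two maps accurate.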

<h2 id="how-well-does-it-work">How well does it work?</h2>

<p>It’s early days for testtrim.</p>

<p>To evaluate how well it worked, I took an Open Source project (<a href="https://github.com/alacritty/alacritty">alacritty</a>), and I ran the last 100 commits through testtrim.</p>

<table>
  <thead>
    <tr>
      <th># Commits:</th>
      <th style="text-align: right">100</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td># Commits Successful Build &amp; Test:</td>
      <td style="text-align: right">66</td>
    </tr>
    <tr>
      <td># Commits Appropriate for Analysis:</td>
      <td style="text-align: right">53</td>
    </tr>
    <tr>
      <td>Average Tests to Run per Commit:</td>
      <td style="text-align: right"><strong>8.4%</strong></td>
    </tr>
    <tr>
      <td>Median Tests to Run per Commit:</td>
      <td style="text-align: right"><strong>0.0%</strong></td>
    </tr>
    <tr>
      <td>P90 Tests to Run per Commit:</td>
      <td style="text-align: right"><strong>38.9%</strong></td>
    </tr>
  </tbody>
</table>

<p>For each commit, testtrim identified that an average of only 8.4% of tests needed to be executed to fully test the code change that was being made.</p>

<p>I could list a dozen reasons why this analysis isn’t generalizable… and so I will:</p>

<ul>
  <li>This analysis didn’t include changes to <code class="language-plaintext highlighter-rouge">Cargo.lock</code> files (eg. references to external dependencies).  I’ve added that capability to testtrim, but I haven’t redone this analysis yet.</li>
  <li>alacritty is a pretty mature project that isn’t undergoing rapid development.</li>
  <li>This specific measurement didn’t take into account new tests being added to the repo during this period.</li>
  <li>alacritty has some tests that work through reading static files; those commits were removed from the analysis, possibly lowering the numbers.</li>
  <li>There’s no evidence that a test on a single project will generalize to lots of other projects.</li>
  <li>The only guarantee of correctness in this analysis is my own eyeballing of the changes and proposed tests.</li>
</ul>

<p>But on the other hand, this was just based upon the simple heuristic of “this test touched a file, and that file changed, therefore rerun this test.”  I think that could become a lot more sophisticated with more work in the future as well.</p>

<p>I think it’s <strong>promising</strong>, but not <strong>promised</strong>.</p>

<h2 id="whats-the-future">What’s the future?</h2>

<ul>
  <li><strong>Small scale technical items</strong> – these prevent testtrim from working well, even in the scope of Rust local apps which are presumably going to be its jam since that’s the only thing it supports today:
    <ul>
      <li>Test execution performance / parallelism – testtrim doesn’t run tests in parallel, so it’s slower than using <code class="language-plaintext highlighter-rouge">cargo test</code> even when it runs fewer tests</li>
      <li>Doesn’t work on all Rust codebases – possibly indicative of environmental differences between <code class="language-plaintext highlighter-rouge">cargo test</code> and testtrim, but could be other issues</li>
    </ul>
  </li>
  <li><strong>Larger requirements</strong> – things that would really unlock value
    <ul>
      <li>Multiple platforms – eg. JavaScript, Java, C#, etc.</li>
      <li>Distributed testing – eg. web application testing – I have a theory on how to do this with Open Telemetry tracing, but no implementation today</li>
    </ul>
  </li>
  <li><strong>Project risks</strong> – aside from technical work
    <ul>
      <li>How effective can it really be? – a few data points aren’t quite enough to be confident</li>
      <li>How effective does it have to be for people to want to pick it up and use it? (Cost/Benefit)
        <ul>
          <li>Even as an Open Source project (which is the plan, today!), it still has costs to implement: <strong>complexity</strong>, engineering time, operational costs, and maintenance costs</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<h2 id="follow-along-share-your-thoughts">Follow along, share your thoughts!</h2>

<p>testtrim is under heavy development, and I expect it to move quickly through the end of 2024 and early 2025 into… well, we’ll see.  It’s an Open Source project, and I’d love to get feedback from people on the concept, and share in the development as it goes.</p>

<ul>
  <li><a href="https://codeberg.org/testtrim/testtrim">Source Repo</a></li>
  <li><a href="https://www.youtube.com/@mfenniak">YouTube Channel for Updates</a></li>
  <li><a href="https://yyc.bike/@mfenniak">Mastodon for Updates / Feedback</a></li>
  <li><a href="https://bsky.app/profile/mathieu.fenniak.net">Bluesky for Updates / Feedback</a></li>
  <li><a href="https://mathieu.fenniak.net/feed.xml">Blog RSS for Updates</a></li>
</ul>]]></content><author><name></name></author><summary type="html"><![CDATA[I’ve been working on a project called testtrim, which targets software automated tests for execution based upon previous code-coverage data and git changes. It’s in early development, but it’s looking quite promising with evaluations showing that on-average 90% of tests can be safely skipped with this strategy.]]></summary></entry><entry><title type="html">Summer of Fun, 2024</title><link href="https://mathieu.fenniak.net/summer-of-fun-2024/" rel="alternate" type="text/html" title="Summer of Fun, 2024" /><published>2024-09-22T07:00:00+00:00</published><updated>2024-09-22T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/summer-of-fun</id><content type="html" xml:base="https://mathieu.fenniak.net/summer-of-fun-2024/"><![CDATA[<p>What does a software developer do with 3 months of spare time? It’s time to reflect on the most fun part of any vacation: the software development. What did I fiddle with while I had time off?</p>

<h2 id="pixelperfectpi">pixelperfectpi</h2>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240922_103527144.jpg"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240922_103527144-small.jpg" alt="A panel of LEDs, displaying a UV index in the top left corner reading 2 with a small bar chart out of 10, the time 12:35, and the text Tue 2p Paris to Calgary" class="float-image" /></a></p>

<p>The first target was my LED matrix clock.  There were some things that needed tweaking since I wasn’t in Calgary.  The weather forecast was read from Environment Canada’s XML feeds – no good in Paris.  So, I rebuilt the weather feature based upon Home Assistant weather components, with the data sent to an MQTT broker that the clock reads from.  There are plenty of HA weather components for different services, so it’s easy to adapt this to any new location.  I also added a UV index display.</p>

<p><a href="https://github.com/mfenniak/pixelperfectpi/compare/40383af796900b3fcac19b6a15185a6da5752fd6...d05d6e39217fa17b9aa11ce8bb2a3d6e10256713">https://github.com/mfenniak/pixelperfectpi/compare/40383af796900b3fcac19b6a15185a6da5752fd6…d05d6e39217fa17b9aa11ce8bb2a3d6e10256713</a></p>

<p>Along the way, I ripped out the usage of the dependency_injector library since it’s no longer supported upstream. I like having a DI-based design… but I will admit that for a Python app, and a small Python app at that, it’s probably not super useful.</p>

<p><a href="https://github.com/mfenniak/pixelperfectpi/commit/07e3c04d06a59e8d840598ce0b6a25a555a1473a">https://github.com/mfenniak/pixelperfectpi/commit/07e3c04d06a59e8d840598ce0b6a25a555a1473a</a></p>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240922_103507394.jpg"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240922_103507394-small.jpg" alt="A panel of LEDs, displaying 18 degrees in the top-left, 12:35 in the top-right, and a table of with four hourly columns 1pm, 2pm, 3pm, 4pm, and the temperatures 18 deg, 19 deg, 18 deg, 19 deg under the hour columns." class="float-image" /></a></p>

<p>Now that I had a better source of weather data, I really wanted to display an hourly forecast.  The clock is very small, so I was limited in the display options… but also, all the layout to-date had been shitty pixel hard-coding.  So the next improvement was reimplementing the clock’s layout using <a href="https://stretchable.readthedocs.io/en/latest/">stretchable</a>, a flexbox layout calculator, which required touching every component and changing how it handled layout and drawing.</p>

<p><a href="https://github.com/mfenniak/pixelperfectpi/compare/f7924871c88e9c5ec2c172594cee71ef9c8d1389...e51e97c1ca4b09c0f00cafbed3d4cecbd9f48878">https://github.com/mfenniak/pixelperfectpi/compare/f7924871c88e9c5ec2c172594cee71ef9c8d1389…e51e97c1ca4b09c0f00cafbed3d4cecbd9f48878</a></p>

<p>The hourly weather output isn’t super amazing… but it’s useful enough. One last tweak to the clock was adding a display of the time at home. <a href="https://github.com/mfenniak/pixelperfectpi/commit/fbfabe331141d247c6021068989cf4607a089fd1">https://github.com/mfenniak/pixelperfectpi/commit/fbfabe331141d247c6021068989cf4607a089fd1</a></p>

<h2 id="crowd-control-video-game">Crowd Control Video Game</h2>

<p>Having my clock all settled, I started getting interested in the idea of a video game with the <a href="https://bevyengine.org/">Bevy engine</a>. I had built up a <a href="https://www.youtube.com/playlist?list=PL4cUxeGkcC9iHCXBpxbdsOByZ55Ez4bgF">Godot tutorial game</a> earlier in the year, and I rewrote that game with Bevy to learn the basics.  Not really too exciting, but it set the foundation to learn how to code with Bevy’s ECS model.</p>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240729_130148152.jpg"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240729_130148152-small.jpg" alt="Outdoor photograph of steel barricades and a pedestrian bridge.  The barricades are decorated with pink and blue &quot;Paris 2024&quot; signage, and people are navigating the barricades in the background." class="float-image" /></a></p>

<p>As we traveled around France and attended events during the Paris 2024 Olympics, I was exposed to good, bad, and complicated crowd-control mechanisms: barriers, volunteers, police, signage &amp; lack of signage, security, etc.  All of these contributed to holding safe events where crowds were efficiently moved from place to place with limited time &amp; risk.  And it’s all thrown into chaos if someone rides a bike in the wrong place.</p>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-09-22T13:10:42,960941745+02:00.png"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-09-22T13:10:42,960941745+02:00-small.png" alt="Screenshot of a cyan background, on which small character sprites are randomly distributed.  The sprites have green, blue, and red lines emanating from them towards points on the right side of the map." class="float-image" /></a></p>

<p>Well, that sounds like it could be a video game to me.</p>

<p>Many of you reading this will immediately think about the complex part of building this – pathfinding. Video game pathfinding is difficult, especially if you consider that every actor in this system would have a different goal, a different level of patience, a different level of comfort getting close to others, shoving others, moving a barrier, ignoring a sign…</p>

<p>Anyway, I hacked away at a proof of concept with Bevy.  Just enough to see: if I have a “crowd generator” on one side of a map and a variety of “targets” on the other, can I impact the efficiency of moving people just by moving barriers?  The answer is yes, but making the pathfinding efficient is hard. 😝  So it’s a fun idea that I might revisit in the future, but I put it aside.</p>

<h2 id="hyprland">hyprland</h2>

<p>At this point, I decided to sabotage my own productivity for a bit.  It’s completely rational, though.  You see, I’ve been using a KDE Plasma 5 desktop for a few years, and I’ve become very accustomed to the <a href="https://github.com/Bismuth-Forge/bismuth">bismuth</a> window-tiling plugin.  But it isn’t supported in Plasma 6, and while there is an autotiler for Plasma 6, <a href="https://github.com/zeroxoneafour/polonium">polonium</a>, I don’t find it to be nearly the same experience.</p>

<p>So I thought this was a good opportunity to give a solid try to <a href="https://hyprland.org/">hyprland</a>, a tiling window compositor.  Well… let’s say it’s a throwback to my early Linux desktop days.  Other than composing windows, everything else you might want in a desktop environment is something to piece together.  So, a week later, I’ve configured hyprland, bemoji, wl-clipboard, wtype, cliphist, rofi, grimblast, waybar, dunst, hyprlock, hypridle… and I feel like I have a usable system.</p>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-09-22T13:19:09,857260585+02:00.png"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-09-22T13:19:09,857260585+02:00-small.png" alt="Screenshot of status widgets on a desktop monitor, displaying the current battery life at 100%, currently connected bluetooth devices, volume levels, WiFi network, Discord and Signal icons, and the time." class="float-image" /></a></p>

<p>What did I gain out of this? I’ve made the move to Wayland, and I’m not blocked on that Plasma major upgrade.</p>

<p>But I also love the multi-monitor support of hyprland. I have a set of workspaces dedicated to each monitor. When I disconnect the 2nd monitor, the workspaces automatically collapse back to the 1st monitor, and when I reconnect it, they pop back.</p>
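<p>A sketch of how that workspace-to-monitor binding can look in <code class="language-plaintext highlighter-rouge">hyprland.conf</code> (the monitor names here are placeholders; check <code class="language-plaintext highlighter-rouge">hyprctl monitors</code> for yours):</p>

```ini
# Pin workspaces 1-5 to the laptop panel, 6-10 to the external monitor.
# When DP-1 disconnects, hyprland moves its workspaces to the remaining
# monitor, and moves them back on reconnect.
workspace = 1, monitor:eDP-1, default:true
workspace = 2, monitor:eDP-1
workspace = 6, monitor:DP-1, default:true
workspace = 7, monitor:DP-1
```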

<p>I’m not sold on this for the long term, but I’ve been pretty happy with it for a little over a month now. We’ll see what the maintenance looks like!</p>

<h2 id="zenbook-duo-fixes">Zenbook Duo Fixes</h2>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240213_180856860.jpg"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/PXL_20240213_180856860-small.jpg" alt="Photograph of a laptop with two displays; the second display is underneath where the keyboard normally sits.  In this photograph the keyboard is detached and sitting on a desk in front of the two fully visible displays." class="float-image" /></a></p>

<p>Speaking of the dual-screen laptop, the Asus Zenbook Duo UX8406MA, one of my goals during the vacation was to make some Linux compatibility improvements for it. This sorta happened; I fixed two things, kinda.</p>

<h3 id="secondary-display">Secondary Display</h3>

<p>The first was that I, along with others, experienced a regression in June where the secondary display stopped working, which I reported upstream to the <a href="https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11488">i915 team</a>. In July an engineer provided a patch which fixed the problem.</p>

<p>However, in August, the secondary display stopped working again. I started to bisect the kernel to find the issue, but made a grave error - I assumed that the current kernel was bad and my last kernel was good. This seemed to make sense - after all, it was working and then broken, right?</p>

<p>Each bisect step required having my desktop at home build the kernel remotely, and then downloading it. After dozens of iterations I eventually found that reverting the “broken commit” didn’t change anything. Whoops.</p>

<p>So I changed my approach to bisecting nixpkgs, and eventually <a href="https://discourse.nixos.org/t/asus-zenbook-duo-2024-ux8406ma-nixos/39792/101?u=mfenniak">found that linux-firmware 20240811 was the true cause of the secondary display breaking</a>. Reverting this upgrade restored the secondary display again. The faulty change was later reverted out of linux-firmware, and the 20240909 release is back to working.</p>

<p>So, I didn’t really fix anything here, but I did help identify some problems.</p>

<h3 id="rfkill">rfkill</h3>

<p>Another problem with the UX8406MA is that connecting the keyboard to the laptop caused the wifi to shut off. I <em>think</em> the cause is that the keyboard, while turning off its Bluetooth on USB connect, happens to send a wireless-disable keypress? But that’s a bit of a guess.</p>

<p>Regardless of the cause, I was able to create a kernel patch that detected and discarded these keypresses from this specific keyboard, which otherwise doesn’t have a wireless disable button.</p>

<p>This is the first time I’ve ever sent a Linux kernel patch upstream, and I’m very pleased that the platform maintainer integrated it in just a few days with one quick review cycle. <a href="https://github.com/torvalds/linux/commit/9286dfd5735b9cceb6a14bdf15e13400ccb60fe7">https://github.com/torvalds/linux/commit/9286dfd5735b9cceb6a14bdf15e13400ccb60fe7</a>  I’m a Linux kernel contributor!</p>

<p>This summer was great. One more stretch of hacking left…</p>

<h2 id="testtrim">testtrim</h2>

<p><a href="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-08-30T18:05:35,487039257+02:00.png"><img src="https://mathieu.fenniak.net/assets/summer-of-fun-2024/2024-08-30T18:05:35,487039257+02:00-small.png" alt="Screenshot showing a table and chart of testing results; the chart is a histogram indicating the percentage of tests that need to be run for every commit, and it is heavily weighted in the 0-10% bucket." class="float-image" /></a></p>

<p>As September started to tick on, which I think is summer but my wife says is fall (and she’s technically correct, the best kind of correct), I started to experiment on a project that I’m calling testtrim.</p>

<p>The goal of testtrim is: run <em>only</em> the automated tests that are required to test the changes being made to software. It works by running each test in an individually instrumented environment, and then using the code coverage map that this generates to figure out what code each test runs.</p>

<p>I’m going to share a lot more about this when I get home and spend some more time working on it. Right now though, it looks very promising. I think it has quite a complex future in front of it… but a lot of potential value.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[What does a software developer do with 3 months of spare time? It’s time to reflect on the most fun part of any vacation: the software development. What did I fiddle with while I had time off?]]></summary></entry><entry><title type="html">In-Floor Heating Thermostat with Home Assistant and Shelly</title><link href="https://mathieu.fenniak.net/smart-thermostat-w-ha-and-shelly/" rel="alternate" type="text/html" title="In-Floor Heating Thermostat with Home Assistant and Shelly" /><published>2023-11-01T07:00:00+00:00</published><updated>2023-11-01T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/infloor-thermostat</id><content type="html" xml:base="https://mathieu.fenniak.net/smart-thermostat-w-ha-and-shelly/"><![CDATA[<p>One of my pet peeves is devices in the house that don’t track time accurately, or, don’t deal with Daylight Saving Time automatically.  Well, I now have one fewer of those in my house, as I’ve replaced this “Easy Heat FTS-1” thermostat for electric in-floor heating with a <a href="https://www.home-assistant.io/">Home Assistant</a> powered thermostat.</p>

<p><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/IMG_20200124_220222.jpg"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/IMG_20200124_220222-small.jpg" alt="A beige thermostat on a wall, displaying the current time, the number &quot;7.0&quot; indicating the relative temperature, and a few manual control buttons to program the device." class="float-image" /></a></p>

<p>The original thermostat was a low-frills device.  It allowed for two schedules to be configured, one for weekdays, and one for weekends, with the floor heat being set to four different temperatures at four different times.  The temperatures were not measured in degrees, but rather, in unlabeled numbers from 1 to 10.  The thermostat’s control interface was… better to not use it.  And it lost time constantly, requiring manual adjustment every few months.</p>

<p class="clear"><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/PXL_20231030_234233096.jpg"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/PXL_20231030_234233096-small.jpg" alt="A beige thermostat pulled out of the wall; black and white wires run into it.  At the top of the device, bottom of the photo, a black and white small gauge wire run in from the thermistor as a temperature sensor.  In the middle, heavier wires carry the AC load into the resistive heating of the floor.  And at the bottom, black and white wires provide the AC power input." class="float-image" /></a></p>

<p>On the “backend”, the thermostat was a simple device.  It had a 120V AC input for power, and a 120V AC output for the resistive heating of the floor.  A thermistor input was used to measure the temperature of the floor; a thermistor is a resistor whose resistance changes with temperature.</p>

<p>So, things that needed to be done:</p>
<ul>
  <li>Figure out what I was going to replace this with that could both control the 120V AC output and measure the thermistor input.</li>
  <li>Rip out the old thermostat and install and wire the new device.</li>
  <li>Integrate it with Home Assistant for remote control.</li>
  <li>Test that I could control the 120V AC output; at this point it was pretty much a win because a good scheduling configuration would be fine.</li>
  <li>Bonus points: figure out how to measure the thermistor input and integrate that into Home Assistant.</li>
</ul>

<h2 class="clear" id="hardware-hunt">Hardware Hunt</h2>

<p>My usual go-to for such projects would be an ESP32, flashed with ESPHome for its versatility and ease of use.  Everything on my TODO list would be pretty straightforward to implement with it… but, usually when I work with an ESP32 I’m working with low-voltage DC, not high-voltage AC.  I wanted something that was designed to be installed in a wall gang box, and that could handle the high-voltage AC safely.  The resistive heating of the floor is about a 5A, 600W load; it would be easy to create a fire hazard if I didn’t do this right.</p>

<p>After some research, I opted for the <a href="https://kb.shelly.cloud/knowledge-base/shelly-plus-1">Shelly Plus 1</a>, a compact Wi-Fi &amp; Bluetooth smart switch that’s designed precisely for such scenarios.  I was originally thinking of going with a switch that I would flash with ESPHome… while I think that’s possible still with the Shelly Plus 1, I decided it might make more sense to just try it out-of-the-box first.</p>

<p>Additionally, to manage the temperature sensing with the thermistor, the <a href="https://kb.shelly.cloud/knowledge-base/shelly-plus-add-on">Shelly Plus Add-on</a> seemed like the perfect addition to the Plus 1.  It’s obviously designed for this exact usage; it has a voltage divider circuit that’s perfect for reading the thermistor, and it snaps right onto the Plus 1.</p>

<h2 id="physical-installation">Physical Installation</h2>

<p><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/PXL_20231101_021931098.jpg"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/PXL_20231101_021931098-small.jpg" alt="Wires from the wall connect to a small blue and black box." class="float-image" /></a></p>

<p>With the Shelly Plus 1 and Add-on in hand, the next step was installation.</p>

<p>The AC power wiring was quite straightforward; my input AC power was split between the “L” input to power the Shelly Plus 1, and the “I” input as an input for the relay.  The output AC to the floor heating was wired to the “O” output of the relay.  The neutrals from the input, the floor, and the Shelly Plus 1 were all connected together.</p>

<p>The thermistor was a bit more confusing.  Shelly’s documentation for the Plus Add-on had a perfectly correct diagram, but, I screwed it up on the first pass due to my lack of familiarity with the voltage divider circuit.  Once I figured it out, it was easy to fix:</p>
<ul>
  <li>“VREF + R1 Out” was connected into a wire connector.</li>
  <li>One wire from the wire connector was connected to the “ANALOG IN”.</li>
  <li>The other wire from the wire connector was connected to the thermistor.</li>
  <li>The other end of the thermistor was connected to the “GND”.</li>
</ul>

<h2 id="software-set-up">Software Set-up</h2>

<p>It’s hardly worth talking about the initial connection of the Shelly to Home Assistant; at least in my experience it was virtually automatic.  I connected to the Shelly via its builtin Wi-Fi, and configured its connection to my home Wi-Fi.  Home Assistant picked it up immediately, and I was able to control the relay output from Home Assistant.</p>

<p>At that point, I was able to write automations for Home Assistant to turn the floor heat on and off.  With some timing settings, this probably would have been perfectly fine… but I was going for those bonus points.</p>

<p>Getting the thermistor read wasn’t exactly challenging… it was just undocumented.  That’s really the primary reason I thought I’d put this blog post up; otherwise this was just a matter of reading the documentation and following the instructions.</p>

<p>So, here were the steps I took to get the thermistor read:</p>

<p><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_202652.png"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_202652-small.png" alt="Shelly web UI showing a &quot;Voltmeter&quot; configured in the &quot;Add-on&quot; section, reading a voltage of 4.89 V." class="float-image" /></a></p>

<ol>
  <li>
    <p>I configured the Shelly Plus Add-on as a voltmeter.  This was done via the Shelly Web UI, which I accessed through my local Wi-Fi address, but it’s also possible to do it through the Shelly App.  There aren’t a lot of settings involved here, so it’s not exactly complicated… but I wasn’t sure whether I should be using the “Analog Input” (since I’m wired to the “ANALOG IN”, you know), or the Voltmeter, and eventually figured out that the Voltmeter setting was what I needed for the next steps.</p>
  </li>
  <li>
    <p>After 5-10 minutes, the Voltmeter began to appear in Home Assistant.</p>
  </li>
  <li>
    <p>I configured a template sensor in Home Assistant to convert the voltage into a temperature.  This was a bit of a challenge, because I didn’t have the thermistor’s specifications.  What I did know was: the reference resistor in the Shelly Plus Add-on’s voltage divider circuit is 10k Ohm, and the reference voltage is 10 V.  Theoretically I either needed a B-calibration value and a reference temperature for the thermistor, or, to measure a few temperatures and calculate the appropriate calibration values.  Well, I couldn’t find the right documentation, and changing the temperature of a floor by a huge amount and measuring it consistently is pretty impractical, so I chose reasonable approximate values for those, and extracted the NTC resistance-to-temperature calculations from ESPHome.</p>

    <p>The values <code class="language-plaintext highlighter-rouge">ref_voltage</code> and <code class="language-plaintext highlighter-rouge">ref_resistor</code> are used to calculate the resistance of the thermistor; they refer to the voltage on the circuit and the pulldown resistor in the Shelly Plus Add-On.</p>

    <p>The values <code class="language-plaintext highlighter-rouge">b_constant</code>, <code class="language-plaintext highlighter-rouge">ref_temp</code>, and <code class="language-plaintext highlighter-rouge">ref_resistence</code> are used to calculate the temperature from the resistance; they refer to the B-calibration value of the thermistor, and the reference temperature and reference resistance at which the B-calibration value was measured.</p>

    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="na">template</span><span class="pi">:</span>
 <span class="pi">-</span> <span class="na">sensor</span><span class="pi">:</span>
     <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Bathroom</span><span class="nv"> </span><span class="s">Floor</span><span class="nv"> </span><span class="s">Temperature"</span>
         <span class="na">unit_of_measurement</span><span class="pi">:</span> <span class="s2">"</span><span class="s">°C"</span>
         <span class="na">state</span><span class="pi">:</span> <span class="pi">&gt;</span>
         <span class="s">{% set ref_voltage = 10 %}</span>
         <span class="s">{% set ref_resistor = 10000 %}</span>
         <span class="s">{% set b_constant = 3950 %}</span>
         <span class="s">{% set ref_temp = 25 %}</span>
         <span class="s">{% set ref_resistence = 10000 %}</span>
         <span class="s">{% set resistence = (</span>
             <span class="s">(states.sensor.shellyplus1_b8d61a8aaa94_voltmeter.state|float) * ref_resistor)</span>
             <span class="s">/</span>
             <span class="s">(ref_voltage - (states.sensor.shellyplus1_b8d61a8aaa94_voltmeter.state|float))</span>
         <span class="s">%}</span>
         <span class="s">{% set t0 = ref_temp + 273.15 %}</span>
         <span class="s">{% set a = (1/t0)-(1/b_constant)*log(ref_resistence) %}</span>
         <span class="s">{% set b = (1/b_constant) %}</span>
         <span class="s">{% set c = 0 %}</span>
         <span class="s">{% set lr = log(resistence) %}</span>
         <span class="s">{% set v = a + (b * lr) + (c * lr * lr * lr) %}</span>
         <span class="s">{% set temp = (1 / v) - 273.15 %}</span>
         <span class="s">{{ temp|round(2) }}</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>Then it was possible to configure a thermostat in Home Assistant based upon the temperature calculation, which turns the Shelly switch on and off based upon the temperature.</p>

    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="na">climate</span><span class="pi">:</span>
 <span class="pi">-</span> <span class="na">platform</span><span class="pi">:</span> <span class="s">generic_thermostat</span>
     <span class="s">unique_id</span><span class="err">:</span> <span class="s">bathroom_floor</span>
     <span class="s">name</span><span class="err">:</span> <span class="s">Bathroom Floor</span>
     <span class="s">heater</span><span class="err">:</span> <span class="s">switch.shellyplus1_b8d61a8aaa94_switch_0</span>
     <span class="s">ac_mode</span><span class="err">:</span> <span class="no">false</span>
     <span class="s">target_sensor</span><span class="err">:</span> <span class="s">sensor.bathroom_floor_temperature</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>And finally, I’ve reached nirvana: Home Assistant automations turn the thermostat heating on and off at the right times.</p>
  </li>
</ol>
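<p>To sanity-check the template sensor’s math outside of Home Assistant, here is the same B-parameter calculation as a small Python sketch.  The constants mirror the Jinja variables above; the calibration numbers remain the same reasonable approximations, not measured specs.</p>

```python
import math

REF_VOLTAGE = 10.0         # V, voltage divider supply
REF_RESISTOR = 10_000.0    # ohm, pulldown resistor in the Add-on
B_CONSTANT = 3950.0        # assumed thermistor B value
REF_TEMP = 25.0            # deg C, temperature at which REF_RESISTANCE applies
REF_RESISTANCE = 10_000.0  # ohm, assumed thermistor resistance at REF_TEMP

def floor_temperature(voltage: float) -> float:
    """Convert the Add-on voltmeter reading to deg C via the B-parameter equation."""
    # Invert the voltage divider to recover the thermistor's resistance.
    resistance = (voltage * REF_RESISTOR) / (REF_VOLTAGE - voltage)
    # B-parameter model: 1/T = 1/T0 + (1/B) * ln(R/R0)
    t0 = REF_TEMP + 273.15
    a = (1 / t0) - (1 / B_CONSTANT) * math.log(REF_RESISTANCE)
    inv_t = a + math.log(resistance) / B_CONSTANT
    return (1 / inv_t) - 273.15

print(round(floor_temperature(4.89), 2))  # → 25.99 (the 4.89 V reading from the screenshot)
```

A nice property for spot-checking: at exactly 5 V the divider implies the thermistor is at its 10k reference resistance, so the function returns exactly 25 °C.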

<p><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_204153.png"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_204153-small.png" alt="Shelly web UI showing a &quot;Voltmeter&quot; configured in the &quot;Add-on&quot; section, reading a voltage of 4.89 V." /></a></p>

<p><a href="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_203932.png"><img src="https://mathieu.fenniak.net/assets/ha-shelly-addon-plus/Screenshot_20231102_203932.png" alt="Home Assistant web UI showing an automation to turn off the bathroom floor at 7am." /></a></p>

<p>Now, the manual time adjustments are a thing of the past.  My thermostat knows what time it is, all year round… and actually technically it doesn’t care because Home Assistant knows the same thing and is remotely controlling it.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[One of my pet peeves is devices in the house that don’t track time accurately, or, don’t deal with Daylight Saving Time automatically. Well, I now have one fewer of those in my house, as I’ve replaced this “Easy Heat FTS-1” thermostat for electric in-floor heating with a Home Assistant powered thermostat.]]></summary></entry><entry><title type="html">Bisecting the Linux Kernel with NixOS</title><link href="https://mathieu.fenniak.net/bisecting-the-linux-kernel-with-nixos/" rel="alternate" type="text/html" title="Bisecting the Linux Kernel with NixOS" /><published>2023-06-14T07:00:00+00:00</published><updated>2023-06-14T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/bisecting</id><content type="html" xml:base="https://mathieu.fenniak.net/bisecting-the-linux-kernel-with-nixos/"><![CDATA[<p>Recently my kernel started to panic every time I awoke my monitors from sleep.  This seemed to be a regression; it worked one day, then I received a kernel upgrade from upstream, and the next time I was operating my machine it would crash when I came back to it.</p>

<p>After being annoyed for a bit, I realized this was a great time to learn how to bisect the linux kernel, find the problem, and either report it upstream, or, patch it out of my kernel!  I thought this would be useful to someone else in the future, so here we are.</p>

<p><strong>Step #1:</strong> Clone the Kernel; I grabbed Linus’ tree from <a href="https://github.com/torvalds/linux">https://github.com/torvalds/linux</a> with <code class="language-plaintext highlighter-rouge">git clone git@github.com:torvalds/linux.git</code></p>

<p><strong>Step #2:</strong> Start a bisect.</p>

<p>If you’re not familiar with a bisect, it’s a process by which you tell git, “this commit was fine”, and “this commit was broken”, and it will help you test the commits in-between to find the one that introduced the problem.</p>

<p>You start this by running <code class="language-plaintext highlighter-rouge">git bisect start</code>, and then you provide a tag or commit ID for the good and the bad kernel with <code class="language-plaintext highlighter-rouge">git bisect good ...</code> and <code class="language-plaintext highlighter-rouge">git bisect bad ...</code>.</p>

<p>I knew my issue didn’t occur on the 5.15 kernel series, but did start with my NixOS upgrade to 6.1.  But I didn’t know precisely where, so I aimed a little broader… I figured an extra test or two would be better than missing the problem. 😬</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git bisect start
git bisect good v5.15
git bisect bad master 
</code></pre></div></div>

<p><strong>Step #3:</strong> Replace your kernel with that version</p>

<p>In an ideal world, I would have been able to test this in a VM.  But it was a graphics problem with my video card and connected monitors, so I went straight for testing this on my desktop to ensure it was easy to reproduce and accurate.</p>

<p>Testing a mid-release kernel with NixOS is pretty easy!  All you have to do is override your kernel package, and NixOS will handle building it for you… here’s an example from my bisect:</p>

<div class="language-nix highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">boot</span><span class="o">.</span><span class="nv">kernelPackages</span> <span class="o">=</span> <span class="nv">pkgs</span><span class="o">.</span><span class="nv">linuxPackagesFor</span> <span class="p">(</span><span class="nv">pkgs</span><span class="o">.</span><span class="nv">linux_6_2</span><span class="o">.</span><span class="nv">override</span> <span class="p">{</span> <span class="c"># (#4)</span>
  <span class="nv">argsOverride</span> <span class="o">=</span> <span class="kr">rec</span> <span class="p">{</span>
    <span class="nv">src</span> <span class="o">=</span> <span class="nv">pkgs</span><span class="o">.</span><span class="nv">fetchFromGitHub</span> <span class="p">{</span>
      <span class="nv">owner</span> <span class="o">=</span> <span class="s2">"torvalds"</span><span class="p">;</span>
      <span class="nv">repo</span> <span class="o">=</span> <span class="s2">"linux"</span><span class="p">;</span>
      <span class="c"># (#1) -&gt; put the bisect revision here</span>
      <span class="nv">rev</span> <span class="o">=</span> <span class="s2">"7484a5bc153e81a1740c06ce037fd55b7638335c"</span><span class="p">;</span>
      <span class="c"># (#2) -&gt; clear the sha; run a build, get the sha, populate the sha</span>
      <span class="nv">sha256</span> <span class="o">=</span> <span class="s2">"sha256-nr7CbJO6kQiJHJIh7vypDjmUJ5LA9v9VDz6ayzBh7nI="</span><span class="p">;</span>
    <span class="p">};</span>
    <span class="nv">dontStrip</span> <span class="o">=</span> <span class="kc">true</span><span class="p">;</span>
    <span class="c"># (#3) `head Makefile` from the kernel and put the right version numbers here</span>
    <span class="nv">version</span> <span class="o">=</span> <span class="s2">"6.2.0"</span><span class="p">;</span>
    <span class="nv">modDirVersion</span> <span class="o">=</span> <span class="s2">"6.2.0-rc2"</span><span class="p">;</span>
  <span class="p">};</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Getting this defined requires a couple of intermediate steps…</p>

<p>Step #3.1 – put the version that <code class="language-plaintext highlighter-rouge">git bisect</code> asked me to test in (#1)</p>

<p>Step #3.2 – clear out <code class="language-plaintext highlighter-rouge">sha256</code></p>

<p>Step #3.3 – run a <code class="language-plaintext highlighter-rouge">nixos-rebuild boot</code>; the build will fail with a hash mismatch error that reports the actual sha256</p>

<p>Step #3.4 – grab the sha256 reported by the failed build and put it into the <code class="language-plaintext highlighter-rouge">sha256</code> field (#2)</p>

<p>Step #3.5 – make sure the major version matches at (#3) and (#4)</p>

<p>Then run <code class="language-plaintext highlighter-rouge">nixos-rebuild boot</code>.</p>

<p><strong>Step #4:</strong>  Test!</p>

<p>Reboot into the new kernel, and test whatever is broken.  For me, a simple test protocol worked: <code class="language-plaintext highlighter-rouge">xset dpms force off</code> to blank my screens, wait 30 seconds, and then wake them.  If my kernel panicked, it was a fail.</p>

<p><strong>Step #5:</strong> Repeat the bisect</p>

<p>Go into the linux source tree and run <code class="language-plaintext highlighter-rouge">git bisect good</code> or <code class="language-plaintext highlighter-rouge">git bisect bad</code> depending on whether the test succeeded.  Return to step #3.</p>

<p><strong>Step #6:</strong> Revert it!</p>

<p>For my case, I eventually found a single commit that introduced the problem, and I was able to revert it from my local kernel.  This involves leaving a kernel patch in my NixOS config like this:</p>

<div class="language-nix highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nv">boot</span><span class="o">.</span><span class="nv">kernelPatches</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
      <span class="nv">patch</span> <span class="o">=</span> <span class="sx">./revert-bb2ff6c27b.patch</span><span class="p">;</span>
      <span class="nv">name</span> <span class="o">=</span> <span class="s2">"revert-bb2ff6c27b"</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">];</span>
</code></pre></div></div>

<p>Obviously this isn’t a great long-term solution; it gets my desktop stable right now and I’m OK with that.  The best follow-up would be to create a detailed bug report for the impacted kernel module, the steps to reproduce, and the regressing commit.  The next best might just be to check occasionally if I can disable the revert and if things still work.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Recently my kernel started to panic every time I awoke my monitors from sleep. This seemed to be a regression; it worked one day, then I received a kernel upgrade from upstream, and the next time I was operating my machine it would crash when I came back to it.]]></summary></entry><entry><title type="html">ABTraceTogether Android Background Capability Analysis</title><link href="https://mathieu.fenniak.net/abtracetogether-android-background-capability-analysis/" rel="alternate" type="text/html" title="ABTraceTogether Android Background Capability Analysis" /><published>2020-11-14T07:00:00+00:00</published><updated>2020-11-14T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/abtracetogether-android-background-capability-analysis</id><content type="html" xml:base="https://mathieu.fenniak.net/abtracetogether-android-background-capability-analysis/"><![CDATA[<p>The ABTraceTogether COVID contact tracing app also doesn’t work well on Android devices.  Let’s dig in and see.</p>

<p>This research was originally published on Google Docs (<a href="https://docs.google.com/document/d/1UFQDo9oNF771NgMLCn9Pzy3BZ7PfOFmdSdtpPzSo4nM/edit#heading=h.qjzwlgfl7h1l">ABTraceTogether Android Background Capability Analysis</a>) on November 14, 2020, with a short summary described on Twitter.  The tweets and research notes are archived and reproduced here with some formatting edits.</p>

<h1 id="executive-summary">Executive Summary</h1>

<p>(Reproduced from <a href="https://twitter.com/mfenniak/status/1327750843001520128">Twitter</a>)</p>

<p>Today I continued testing ABTraceTogether, focusing on the exposures collected on an Android.  There is some defect: it failed to record an Android in close proximity.  I don’t know whether it’s a flaw related to one of my devices, or a general problem.</p>

<p>Throughout my testing, I had an iPhone and two Android devices in physical proximity.  The iPhone was recorded by the Android pretty consistently.  Overnight, a second iPhone nearby was also recorded. A good start!</p>

<p>But the Android-to-Android data was so sparse that I revised my test protocol a few times to try to figure it out.  No overnight data recorded.  8 hours of the phones snuggled, and they didn’t see each other at all?</p>

<p>I guessed that maybe ABTraceTogether detected that they had registered with the same phone number, and that blocked recording. I didn’t see that in the source code, but, maybe? I borrowed a friend’s spare phone number (thanks Mike). No difference in another hour of testing.</p>

<p>I found that one of my Android devices had automatically configured the ABTraceTogether app to “Intelligent Control” battery optimization. Well, that could have some impact, right? So, I tweaked that to “Don’t Optimize”. Another hour of testing, and no exposures recorded.</p>

<p>I added a third Android device to the test, and I started getting a few Android-to-Android traces. So; maybe one Android device just didn’t work? Despite saying “ABTraceTogether is scanning to keep you safe!” all the time?</p>

<p>Another theory… Android-to-Android only stores data a few times for an exposure (3), and then ignores new records. But… only Android/Android, no other combo? And that won’t allow exposures to age out of the DB accurately. And… the Open Source repo doesn’t do that.</p>

<p>So, I’m stumped for now. I have a few old Android devices; maybe I’ll dust off more and see if this is a single device problem. Unluckily it’s my primary day-to-day phone. 😰</p>

<p>Which I guess is still an interesting result; ABTraceTogether can appear to be working, and not work.</p>

<h1 id="overview">Overview</h1>

<p>(Special thanks to my collaborator and iPhone contributor: Amanda Fenniak 😘, and my collaborator and phone number contributor: Mike Roest 😎)</p>

<p>On November 11th, an analysis of ABTraceTogether’s background contact tracing capabilities was conducted with a specific focus on iPhone interactions (<a href="../abtracetogether-iphone-background-capability-analysis/">https://mathieu.fenniak.net/abtracetogether-iphone-background-capability-analysis/</a>).  This allowed investigation of all data that was collected by an iPhone, but not the data that was collected by an Android phone in proximity to other phones.</p>

<p>In this document, testing and analysis is performed on the data collected from an Android phone using ABTraceTogether.</p>

<h1 id="test-protocol">Test Protocol</h1>

<ul>
  <li>ABTraceTogether was installed on two Android phones, and one iPhone
    <ul>
      <li>ABTraceTogether was run, other applications were opened so that ABTraceTogether was in the “background”, and then the phones were locked</li>
    </ul>
  </li>
  <li>All phones were kept within close proximity for a 12 hour period; less than 1 meter separation</li>
  <li>Data collected from one Android was analyzed
    <ul>
      <li>See Appendix for more detailed information on the data collection procedure</li>
    </ul>
  </li>
</ul>

<h1 id="test-details">Test Details</h1>

<p>Dates and times are in Mountain Standard Time</p>

<ul>
  <li>iPhone #1:
    <ul>
      <li>iPhone 11 Pro</li>
      <li>iOS 14.2</li>
      <li>ABTraceTogether version 1.4.0 installed from the Apple App Store; registered with phone number #1</li>
      <li>iPhone #1 was a nearby phone during the test, but it is not the iPhone that was specifically kept in physical proximity; it would have varied between 3m and 20m distance throughout the test</li>
    </ul>
  </li>
  <li>iPhone #2:
    <ul>
      <li>iPhone 6s</li>
      <li>iOS 14.1</li>
      <li>ABTraceTogether version 1.4.0 installed from the Apple App Store; registered with phone number #2</li>
      <li>Kept within 1m of Android #2 throughout test</li>
    </ul>
  </li>
  <li>Android #1:
    <ul>
      <li>OnePlus 7T</li>
      <li>OxygenOS (Android) 10.0.14.HD65AA</li>
      <li>ABTraceTogether version 1.4.0 installed from the Google Play Store; registered with phone number #3</li>
      <li>Kept within 1m of Android #2 throughout test</li>
    </ul>
  </li>
  <li>Android #2:
    <ul>
      <li>Google Nexus 5</li>
      <li>LineageOS 17.1; Android 10</li>
      <li>ABTraceTogether version 1.4.0 installed from the Google Play Store
        <ul>
          <li>Originally registered with phone number #3</li>
          <li>Registered with phone number #4 in test #2 and later</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Android #3:
    <ul>
      <li>Introduced only in later tests</li>
      <li>Amazon Fire HD 10 (9th generation)</li>
      <li>FireOS 7.3.1.6 (Android)</li>
      <li>ABTraceTogether version 1.4.0 installed from the Google Play Store; registered with phone number #3</li>
    </ul>
  </li>
  <li>2020-11-13
    <ul>
      <li>6:52pm
        <ul>
          <li>Android #2, Android #1, and iPhone #2 have been placed in close physical proximity, had the ABTraceTogether app run, and then put into the background by hitting the home button within the past 2 minutes.</li>
          <li>All phones are running on battery power.</li>
        </ul>
      </li>
      <li>~10:00pm
        <ul>
          <li>iPhone #1 &amp; Android #1 are connected to power</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>2020-11-14
    <ul>
      <li>5:30am
        <ul>
          <li>iPhone #1 &amp; Android #1 are disconnected from power</li>
        </ul>
      </li>
      <li>6:27am
        <ul>
          <li>Android #2 is connected to power</li>
          <li>Android #2’s ABTraceTogether database is cloned for analysis</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Test period #1: 2020-11-13 6:52pm -&gt; 2020-11-14 6:27am</li>
  <li>Following the completion of the test, data extraction, and analysis above, a second test was conducted to rule out a possible explanation for unexpected results
    <ul>
      <li>Possible flaw in testing: Android #1 and Android #2 were registered to ABTraceTogether with the same phone number</li>
      <li>ABTraceTogether was deleted from Android #2</li>
      <li>ABTraceTogether was installed in Android #2, and re-registered with a fourth phone number, distinct from iPhone #1, #2, and Android #1</li>
      <li>All devices were charged to &gt;90% battery before test period #2 began</li>
    </ul>
  </li>
  <li>2020-11-14
    <ul>
      <li>10:49am
        <ul>
          <li>Test phones Android #1, Android #2, and iPhone #2 are placed in close proximity</li>
          <li>ABTraceTogether is run in the foreground on the three test phones</li>
          <li>Continual interactions are required on Android devices to keep screen &amp; app active</li>
        </ul>
      </li>
      <li>11:00am
        <ul>
          <li>ABTraceTogether is sent to the background on the three test phones by hitting the home button, and locking the phones</li>
        </ul>
      </li>
      <li>12:07pm
        <ul>
          <li>End of test period; data to be collected at 12:55pm</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Test period #2: 2020-11-14 10:49am -&gt; 12:07pm</li>
  <li>Following the completion of the test, data extraction, and analysis above, a third test was conducted to rule out a possible explanation for unexpected results
    <ul>
      <li>Android #1 was explicitly configured so that ABTraceTogether app was set to “Don’t Optimize” for battery options</li>
      <li>Android #3 was introduced into the test environment</li>
    </ul>
  </li>
  <li>2020-11-14
    <ul>
      <li>1:46pm
        <ul>
          <li>Android #3: ABTraceTogether installed and registered</li>
        </ul>
      </li>
      <li>1:48pm
        <ul>
          <li>ABTraceTogether is sent to the background on the four test devices by hitting the home button, and locking the phones</li>
        </ul>
      </li>
      <li>2:34pm
        <ul>
          <li>End of test period; data collected immediately</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Test period #3: 2020-11-14 1:48pm -&gt; 2:34pm</li>
</ul>

<h1 id="data--analysis">Data &amp; Analysis</h1>

<ul>
  <li>Raw data from both data sets is available in the below spreadsheet
    <ul>
      <li>Original analysis format was <a href="https://docs.google.com/spreadsheets/d/1h9ESJH05I0WDGfEqNMqfoG04mTj4N56jpL8AGPcL6VM/edit?usp=sharing">Google Sheets</a></li>
      <li>An archive of the same spreadsheet is available in a <a href="../assets/abtracetogether/2020-11-14%20-%20ABTraceTogether%20Android%20Background%20Capabilities%20Analysis.xlsx">Microsoft Excel</a> and <a href="../assets/abtracetogether/2020-11-14%20-%20ABTraceTogether%20Android%20Background%20Capabilities%20Analysis.ods">OpenDocument</a> format.</li>
    </ul>
  </li>
  <li>ABTraceTogether Android collects all exposures in a table named record_table with the following fields:
    <ul>
      <li>id INTEGER PRIMARY KEY
        <ul>
          <li>Analysis: sequentially incrementing primary key of the record</li>
        </ul>
      </li>
      <li>timestamp INTEGER
        <ul>
          <li>Analysis: epoch-milliseconds; that is, a timestamp in UTC represented by the number of milliseconds past since midnight 1970-01-01</li>
        </ul>
      </li>
      <li>v INTEGER
        <ul>
          <li>Analysis: always contains the value 2</li>
        </ul>
      </li>
      <li>msg VARCHAR
        <ul>
          <li>Data: 61 bytes of BASE64 encoded data; repeated multiple times for each exposure</li>
          <li>Analysis: This is the exposure ID of the target device; there seems to be no confidential information in the base64 decoded data</li>
          <li>The data published in this spreadsheet has been trimmed to the first 30 characters in the unlikely event that confidential or sensitive information is contained in the token</li>
        </ul>
      </li>
      <li>org TEXT
        <ul>
          <li>Data: String “CA_AB”</li>
          <li>Analysis: Possible future expansion to multiple regions, or integration with other OpenTrace-supported applications</li>
        </ul>
      </li>
      <li>modelC VARCHAR
        <ul>
          <li>Data: “iPhone” or “Android”</li>
          <li>Analysis: In Bluetooth Low Energy, “central” devices scan for and connect to advertising “peripheral” devices.  The “C” suffix suggests that this field records which type of device was the “central” device in this exposure.</li>
        </ul>
      </li>
      <li>modelP VARCHAR
        <ul>
          <li>Data: “iPhone” or “Android”</li>
          <li>Analysis: As per field modelC, this is believed to be the model of the peripheral device.</li>
        </ul>
      </li>
      <li>rssi INTEGER
        <ul>
          <li>Analysis: measurement of the received signal strength indicator, likely in dBm; the values range from -100 to -12</li>
        </ul>
      </li>
      <li>txPower INTEGER
        <ul>
          <li>Data: NULL</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
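<p>As a sketch of how this schema can be worked with, the minimal example below reconstructs the table and computes the per-msg “time between exposure” statistics used in the analysis that follows.  The DDL is an approximation from the observed fields (not the app’s actual schema), and the sample rows and msg token are purely illustrative.</p>

```python
import sqlite3
from datetime import datetime, timezone

# Approximate reconstruction of record_table from the observed fields;
# this is not the app's actual DDL.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE record_table (
        id INTEGER PRIMARY KEY,
        timestamp INTEGER,   -- epoch-milliseconds, UTC
        v INTEGER,           -- always 2 in the observed data
        msg VARCHAR,         -- base64 exposure ID of the target device
        org TEXT,            -- "CA_AB"
        modelC VARCHAR,      -- central device: "iPhone" or "Android"
        modelP VARCHAR,      -- peripheral device: "iPhone" or "Android"
        rssi INTEGER,        -- received signal strength, likely dBm
        txPower INTEGER      -- NULL in the observed data
    )""")

# Illustrative sample rows, 10 and 25 minutes apart, starting at
# 2020-11-13 6:52pm MST expressed as epoch-milliseconds UTC.
rows = [
    (1, 1605318720000, 2, "illustrative/token/AAAAAAAAAAA", "CA_AB", "iPhone", "Android", -58, None),
    (2, 1605319320000, 2, "illustrative/token/AAAAAAAAAAA", "CA_AB", "iPhone", "Android", -64, None),
    (3, 1605320820000, 2, "illustrative/token/AAAAAAAAAAA", "CA_AB", "iPhone", "Android", -71, None),
]
con.executemany("INSERT INTO record_table VALUES (?,?,?,?,?,?,?,?,?)", rows)

# "Time between exposure" statistics for a single msg value.
ts = [t for (t,) in con.execute(
    "SELECT timestamp FROM record_table WHERE msg = ? ORDER BY timestamp",
    (rows[0][3],))]
gaps_min = [(b - a) / 60000 for a, b in zip(ts, ts[1:])]
print("first exposure (UTC):", datetime.fromtimestamp(ts[0] / 1000, tz=timezone.utc))
print("average gap:", sum(gaps_min) / len(gaps_min), "min; maximum gap:", max(gaps_min), "min")
```

<p>The same kind of query, run against the database extracted as described in the Appendix, would produce the per-message statistics reported for each test period below.</p>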

<h2 id="test-period-1">Test Period #1</h2>

<ul>
  <li>Data collected
    <ul>
      <li>339 exposures were collected</li>
      <li>Msg: avuP1VvHjDRWCuoVFoAAdTOW2TACdp
        <ul>
          <li>58 exposures</li>
          <li>iPhone central device, Android #2 is peripheral</li>
          <li>Exposures recorded between 7:07pm and 5:51am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 11 minutes</li>
              <li>Maximum: 67 minutes</li>
            </ul>
          </li>
          <li>RSSI: -100 to -58</li>
        </ul>
      </li>
      <li>Msg: KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA
        <ul>
          <li>67 exposures</li>
          <li>iPhone central device, Android #2 is peripheral</li>
          <li>Exposures recorded between 6:57pm and 5:51am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 9 minutes</li>
              <li>Maximum: 26 minutes</li>
            </ul>
          </li>
          <li>RSSI: -70 to -15</li>
        </ul>
      </li>
      <li>Msg: odarSPFz7coBgKtf0mmLFRw7ODGt5O
        <ul>
          <li>131 exposures</li>
          <li>Android #2 central device, iPhone ? is peripheral</li>
          <li>Exposures recorded between 8:56pm and 5:51am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 4 minutes</li>
              <li>Maximum: 18 minutes</li>
            </ul>
          </li>
          <li>RSSI: -47 to -18</li>
        </ul>
      </li>
      <li>Msg: TgT0cEg2mZm6YMd5ejPMv7KRpDVzoK
        <ul>
          <li>42 exposures</li>
          <li>Android #2 is central device, iPhone ? is peripheral</li>
          <li>Exposures recorded between 9:10pm and 5:51am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 12 minutes</li>
              <li>Maximum: 71 minutes</li>
            </ul>
          </li>
          <li>RSSI: -93 to -60</li>
        </ul>
      </li>
      <li>Msg: U00y/jHqnbDh7+wIkl4zCY+D1C99e3
        <ul>
          <li>33 exposures</li>
          <li>Android #2 is central device, iPhone ? is peripheral</li>
          <li>Exposures recorded between 6:35pm and 8:52pm; includes time before the test began</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 4 minutes</li>
              <li>Maximum: 15 minutes</li>
            </ul>
          </li>
          <li>RSSI: -96 to -12</li>
        </ul>
      </li>
      <li>Msg: wOPJa8pn/iB8wN6Gawd6olHKeHREfy
        <ul>
          <li>1 exposure</li>
          <li>Android - Android; cannot distinguish the central or peripheral</li>
          <li>Occurred at 6:49pm; this is before all phones were in the background configuration at the start of the test</li>
          <li>RSSI: -89</li>
        </ul>
      </li>
      <li>Msg: IHGhpOV5XP0sqGfXVlKIaEjzQm+j8z
        <ul>
          <li>2 exposures</li>
          <li>Android - Android; cannot distinguish the central or peripheral</li>
          <li>All exposures occurred at 6:59pm</li>
          <li>RSSI: -68 to -47</li>
        </ul>
      </li>
      <li>Msg: jhWnG4TnzYe1XOqVej37NWKx2QTgK+
        <ul>
          <li>5 exposures</li>
          <li>Android #2 is central device, iPhone ? is peripheral</li>
          <li>Exposures recorded between 6:35pm and 8:45pm; includes time before the test began</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 32 minutes</li>
              <li>Maximum: 96 minutes</li>
            </ul>
          </li>
          <li>RSSI: -91 to -81</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Analysis
    <ul>
      <li>On iPhone / Android interactions:
        <ul>
          <li>TgT0cEg2mZm6YMd5ejPMv7KRpDVzoK &amp; avuP1VvHjDRWCuoVFoAAdTOW2TACdp
            <ul>
              <li>One of these messages was Android central, other iPhone central</li>
              <li>Messages had similar start &amp; end times
                <ul>
                  <li>7:07pm to 5:51am</li>
                  <li>9:10pm to 5:51am</li>
                </ul>
              </li>
              <li>Messages had similar RSSI ranges
                <ul>
                  <li>-93 to -60</li>
                  <li>-100 to -58</li>
                </ul>
              </li>
              <li>Given the long-term overnight exposure, this is likely to be either iPhone #1 or iPhone #2 communicating with Android #2</li>
              <li>Considering the lower RSSI values, which would correlate with greater physical distance between the phones, it is plausible that this is iPhone #1 to Android #2</li>
            </ul>
          </li>
          <li>KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA &amp; odarSPFz7coBgKtf0mmLFRw7ODGt5O
            <ul>
              <li>One of these messages was Android central, other iPhone central</li>
              <li>Messages had similar start &amp; end times
                <ul>
                  <li>6:57pm to 5:51am</li>
                  <li>8:56pm to 5:51am</li>
                </ul>
              </li>
              <li>Messages had similar RSSI ranges
                <ul>
                  <li>-70 to -15</li>
                  <li>-47 to -18</li>
                </ul>
              </li>
              <li>Given the long-term overnight exposure, this is likely to be either iPhone #1 or iPhone #2 communicating with Android #2</li>
              <li>Considering the higher RSSI values, which would correlate with shorter physical distance between the phones, it is plausible that this is iPhone #2 to Android #2</li>
            </ul>
          </li>
          <li><strong>Note</strong>: Although the above message pairs look like they may be mismatched based upon the timestamps, the RSSI signal strengths match better, and all the exposures that started in the 7pm timeframe had iPhone central devices; these messages need to be paired central/peripheral, which they can’t be if the pairing is adjusted.  Regardless, the results of the analysis are not impacted by which specific messages identify a device.</li>
          <li>U00y/jHqnbDh7+wIkl4zCY+D1C99e3
            <ul>
              <li>Unidentified iPhone peripheral between 6:35pm and 8:52pm</li>
              <li>RSSI values were surprisingly strong for an unidentified phone</li>
              <li>8:52pm roughly corresponds to Android #2 being moved within the home test environment; my best explanation is that this was a neighbour’s iPhone and we were in close physical proximity, but this explanation seems questionable</li>
            </ul>
          </li>
          <li>jhWnG4TnzYe1XOqVej37NWKx2QTgK+
            <ul>
              <li>Unidentified iPhone peripheral between 6:35pm and 8:45pm</li>
              <li>RSSI values are weaker than U00y</li>
              <li>If it weren’t for this device also showing an Android central, it would be likely that this is the inverse relationship of the U00y message above; considering both show the Android central device, this is unexplained</li>
            </ul>
          </li>
        </ul>
      </li>
      <li>On Android / Android interactions:
        <ul>
          <li>Messages wOPJa8pn/iB8wN6Gawd6olHKeHREfy &amp; IHGhpOV5XP0sqGfXVlKIaEjzQm+j8z
            <ul>
              <li>Both occurred in the same time window, 6:49pm - 6:59pm</li>
              <li>Only 3 exposures occurred, all within a small time period</li>
              <li>RSSI values are not strongly correlated between the exposures, but the number of exposures is so low that this can’t be correlated well</li>
            </ul>
          </li>
        </ul>
      </li>
      <li>Strong interactions between iPhone and Android devices were recorded throughout the experiment, with a high probability that a 10-minute physical exposure would result in a recorded ABTraceTogether exposure</li>
      <li>Android-Android interactions appear to be effectively absent from the dataset; despite 12 hours of physical exposure, only 10 minutes of exposures between the devices were recorded
        <ul>
          <li>Part of this exposure recording also falls outside the “background” time window of the test; exposures occurred at 6:49pm while the test was being configured, when either device may have been running the application in the foreground</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<h2 id="test-period-2">Test Period #2</h2>

<ul>
  <li>Data collected
    <ul>
      <li>52 exposures were collected
        <ul>
          <li>23 exposures were within the test time period; the remaining exposures occurred and were recorded while all devices were being recharged between tests</li>
          <li>Non-eligible exposures are coloured grey in the spreadsheet</li>
          <li>Test period #2: 2020-11-14 10:49am -&gt; 12:07pm</li>
        </ul>
      </li>
      <li>Msg: h5Cjy32U67/0fnQeeq05K+NStEElVG
        <ul>
          <li>19 exposures</li>
          <li>Android #2 central device, iPhone #2 is peripheral</li>
          <li>Exposures recorded between 10:52am and 11:55am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 3 minutes</li>
              <li>Maximum: 10 minutes</li>
            </ul>
          </li>
          <li>RSSI: -47 to -28</li>
        </ul>
      </li>
      <li>Msg: KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA
        <ul>
          <li>9 exposures</li>
          <li>iPhone central device, Android #2 is peripheral</li>
          <li>Exposures recorded between 10:50am and 11:49am</li>
          <li>Time between exposure:
            <ul>
              <li>Average: 7 minutes</li>
              <li>Maximum: 11 minutes</li>
            </ul>
          </li>
          <li>RSSI: -77 to -62</li>
        </ul>
      </li>
      <li>Analysis
        <ul>
          <li>On iPhone / Android interactions:
            <ul>
              <li>KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA &amp; h5Cjy32U67/0fnQeeq05K+NStEElVG
                <ul>
                  <li>One of these messages was Android central, other iPhone central</li>
                  <li>Messages had similar start &amp; end times
                    <ul>
                      <li>10:52am and 11:55am</li>
                      <li>10:50am and 11:49am</li>
                    </ul>
                  </li>
                  <li>When iPhone is central device, the msg code KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA is the same as the code previously identified as iPhone #2; this is consistent with iPhone #2 being the only iPhone present during test period #2</li>
                </ul>
              </li>
            </ul>
          </li>
          <li>On Android / Android interactions:
            <ul>
              <li>No Android/Android interactions were recorded during the test period, including both the foreground and background execution time</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<h2 id="test-period-3">Test Period #3</h2>

<ul>
  <li>Data collected
    <ul>
      <li>An additional 8 exposures were collected on top of Test Period #2
        <ul>
          <li>Non-eligible exposures from outside of this test period are coloured grey in the spreadsheet</li>
          <li>Test period #3: 2020-11-14 1:48pm -&gt; 2:34pm</li>
        </ul>
      </li>
      <li>Msg: mIT1/+vA/XhDRFt51WqXtw0pFOy0bK
        <ul>
          <li>3 exposures</li>
          <li>Android - Android; cannot distinguish the central or peripheral</li>
          <li>Exposures recorded between 1:55pm and 1:59pm</li>
          <li>Average time between exposures: 1 minute; however, very few exposures were captured in this time window</li>
          <li>RSSI: -61 to -61</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Msg: KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA
    <ul>
      <li>5 exposures</li>
      <li>iPhone central device, Android #2 is peripheral</li>
      <li>Exposures recorded between 1:55pm and 2:28pm</li>
      <li>Average time between exposures: 8 minutes; however, very few exposures were captured in this time window</li>
      <li>RSSI: -53 to -50</li>
    </ul>
  </li>
  <li>Analysis
    <ul>
      <li>On iPhone / Android interactions:
        <ul>
          <li>KPZfd0ip8lGWMSUA/IdTeB/ZBZfpGA
            <ul>
              <li>Consistent with Test Period #2, this appears to be iPhone #2 being recorded from Android #2</li>
            </ul>
          </li>
        </ul>
      </li>
      <li>On Android / Android interactions:
        <ul>
          <li>A new Android/Android interaction was recorded during the test period</li>
          <li>Data is inconclusive as to whether Android #2 was the central or peripheral device</li>
          <li>Data is inconclusive as to whether the other partner in the device was Android #1 or Android #3</li>
          <li>Regardless of whether Android #1 or #3 was the pair device, the absence of an additional signal indicates that one of the two Android devices did not record within the 46-minute test window</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<h2 id="multi-test-analysis">Multi-Test Analysis</h2>

<ul>
  <li>ABTraceTogether failed to record exposures from a nearby Android device during the initial 12-hour test period.</li>
  <li>During a shorter, 78-minute Android-Android test, ABTraceTogether again failed to record exposures from the Android device.
    <ul>
      <li>Android #2 was reconfigured to a distinct phone number from Android #1.</li>
      <li>This change appears to have had no impact.</li>
    </ul>
  </li>
  <li>During a third, 46-minute Android-Android test, only one of the two Android devices was detected.
    <ul>
      <li>Android #1 had been reconfigured to disable battery optimization on ABTraceTogether.</li>
      <li>Android #3 was introduced into the experiment.</li>
      <li>One of these two changes caused exposures to be recorded, but only to one device.</li>
      <li>Zero Android-to-Android captures would have indicated that Android #2 is unable to receive Android exposures correctly.</li>
      <li>Two Android-to-Android captures would have indicated that the battery optimization on Android #1 had been successful, and Android #3 was working as expected.</li>
      <li>One Android-to-Android capture could theoretically be explained by either change; however, attributing it to the battery optimization of Android #1 would be the more complex explanation, as it would indicate that Android #3 was not working properly for a new, unexplained reason.</li>
      <li>It isn’t clear why not all test devices are being recognized.</li>
    </ul>
  </li>
</ul>

<h1 id="conclusion">Conclusion</h1>

<p><img src="../assets/abtracetogether/image5.png" alt="Screen capture of an Android device displaying &quot;ABTraceTogether is scanning to keep you safe!&quot;" /></p>

<p>From the perspective of an Android device in close proximity to iPhone devices, ABTraceTogether effectively records the presence of an iPhone.  A ten-minute physical exposure would very likely result in a recorded ABTraceTogether exposure.</p>

<p>There appears to be a condition in which Android-to-Android exposures will fail to be recorded during continuous physical exposure.  However, experimentation has only been able to demonstrate the absence of the expected functionality, not the cause of the failure.</p>

<p>ABTraceTogether displays a message on Android continuously that reads “ABTraceTogether is scanning to keep you safe!” and “Restart phone if this notification disappears.”  The condition in which Android/Android exposure was not functioning occurred despite this comforting message on all Android devices.</p>

<h2 id="follow-up-work">Follow-Up Work</h2>

<p>The results of analyzing Android-recorded data are not consistent with the observed over-the-air Bluetooth traffic generated by ABTraceTogether on an Android.  Further theories on the possible fault in Android/Android tracing are welcome; however, it is not yet clear what future tests are required.</p>

<p>The experimental testing occurring with phones in an uncontrolled environment has resulted in aberrant, unexpected exposures.  A radio frequency isolating environment would be a great addition to the test configuration, allowing for the removal of all unexpected exposures, and simplifying the analysis of the recorded data.  A short experiment with three phones in a microwave, which is a 2.4GHz isolating container, resulted in zero communication between the three devices.  Further experimentation with an environment that isolates the devices but does not interfere with their communication would be warranted.</p>

<h1 id="appendix">Appendix</h1>

<h2 id="collecting-android-exposure-database">Collecting Android Exposure Database</h2>

<p>ABTraceTogether on an Android stores data in an app-specific file location, which is not typically readable by an external tool.  In order to collect the Android database, it was necessary to configure a device with root-level administrative access.</p>

<p>A Nexus 5 device was chosen from a library of unused Android phones.  The following steps were required to make the device able to run ABTraceTogether and to provide access to the recording database:</p>

<ul>
  <li>Bootloader unlocked – this allowed the installation of a custom recovery tool</li>
  <li>Installation of TWRP – this is a flexible recovery operating system for Android devices</li>
  <li>Repartitioning of Nexus 5 system &amp; data partitions
    <ul>
      <li>Additional storage was required on the system partition</li>
      <li>The Nexus 5 system partition was expanded by 1 GB, by reallocating the data partition space</li>
    </ul>
  </li>
  <li>Installation of LineageOS 17.1 built for Nexus 5 via TWRP</li>
  <li>Installation of Magisk (<a href="https://github.com/topjohnwu/Magisk">https://github.com/topjohnwu/Magisk</a>) via TWRP</li>
  <li>Installation of OpenGApps via TWRP
    <ul>
      <li>OpenGApps was found to be required to supply Google services, which were required for the successful registration of the ABTraceTogether application</li>
      <li>“Nano” installation was used</li>
    </ul>
  </li>
  <li>Android system setup</li>
  <li>Configuration of Developer Tools via Android Settings</li>
  <li>Remote root access by enabling USB debugging and root debugging via Developer Tools</li>
  <li>Installation of ABTraceTogether via Google Play Store</li>
  <li>Registration of ABTraceTogether</li>
</ul>

<p>Following all that messy work, the ABTraceTogether exposure database was retrieved by enabling USB debugging on the Android device and using the <code>adb</code> tool to extract the DB with the command: <code>adb pull /data/data/ca.albertahealthservices.contacttracing/databases/record_database .</code></p>
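<p>Once pulled, the file can be sanity-checked by reading the fixed 100-byte SQLite header directly.  The sketch below first creates a stand-in <code>record_database</code> file, since the real file can only come from the <code>adb pull</code> step above; against the real file, only the inspection half applies.</p>

```python
import sqlite3
import struct

# Stand-in only: in the real workflow, "record_database" is the file
# produced by `adb pull`, and this creation step is skipped.
db_path = "record_database"
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE IF NOT EXISTS record_table (id INTEGER PRIMARY KEY)")
con.execute("PRAGMA user_version = 1")
con.commit()
con.close()

# Every SQLite 3.x database starts with a 100-byte header: bytes 0-15 are
# the magic string, and bytes 60-63 hold user_version (big-endian).
with open(db_path, "rb") as f:
    header = f.read(100)

is_sqlite3 = header[:16] == b"SQLite format 3\x00"
user_version = struct.unpack(">I", header[60:64])[0]
print("SQLite 3.x:", is_sqlite3, "| user_version:", user_version)
```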

<p>The database extracted was a SQLite 3.x database, user version 1, last written using SQLite version 3022000.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[The ABTraceTogether COVID contact tracing app also doesn’t work well on Android devices. Let’s dig in and see.]]></summary></entry><entry><title type="html">ABTraceTogether iPhone Background Capability Analysis</title><link href="https://mathieu.fenniak.net/abtracetogether-iphone-background-capability-analysis/" rel="alternate" type="text/html" title="ABTraceTogether iPhone Background Capability Analysis" /><published>2020-11-11T07:00:00+00:00</published><updated>2020-11-11T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/abtracetogether-iphone-background-capability-analysis%20copy</id><content type="html" xml:base="https://mathieu.fenniak.net/abtracetogether-iphone-background-capability-analysis/"><![CDATA[<p>The Government of Alberta advertises that ABTraceTogether, their COVID contact tracing app, works on iPhones when run in the background.  They are incorrect.</p>

<p>This research was originally published on Google Docs (<a href="https://docs.google.com/document/d/1zm7zskUOI6ie2rBLl6dAq07_cQeMVoDhpGUunKZ33CM/edit#heading=h.qjzwlgfl7h1l">ABTraceTogether iPhone Background Capability Analysis</a>) on November 11, 2020, with a short summary described on Twitter.  The tweets and research notes are archived and reproduced here with some formatting edits.</p>

<h1 id="executive-summary">Executive Summary</h1>

<p>(Reproduced from <a href="https://twitter.com/mfenniak/status/1326563045892501505">Twitter</a>)</p>

<p>On the weekend, I tweeted about how ABTraceTogether doesn’t work correctly between iPhones.  Steve Buick, press secretary to the Minister of Health of Alberta, replied on Twitter and told me that I was wrong.  So, I tested it.</p>

<p>It doesn’t work in the background, iPhone/iPhone. Doesn’t work consistently iPhone/Android, either.</p>

<p>iPhone/iPhone: In over 12 hours of close exposure between two iPhones with ABTraceTogether in the background, zero records of exposure were recorded. After bringing the app into the foreground on one iPhone, exposure records began being collected immediately.</p>

<p>Android/iPhone: While some exposure was recorded while both devices were in the background, it did not occur consistently w/ physical exposure. One hour of physical exposure is more than sufficient to contract COVID-19, but did not trigger ABTraceTogether’s exposure recording.</p>

<h1 id="overview">Overview</h1>

<p>(Special thanks to my collaborator and iPhone contributor: Amanda Fenniak 😘)</p>

<p>ABTraceTogether is an application distributed by the Government of Alberta for the purpose of aiding Alberta Health Services contact tracers in reaching out to potential subjects of COVID-19 exposure. ABTraceTogether is based on a Singaporean application called TraceTogether, which was built upon the OpenTrace protocol for Bluetooth exposure notification.</p>

<p>Concerns have been raised over the effectiveness of ABTraceTogether at detecting exposure when the application is running in the “background” on an iOS device, such as an iPhone.</p>

<p>In this document, a real-world test of the exposure notification capability of an iPhone is performed and analyzed.</p>

<h1 id="background">Background</h1>

<p>When ABTraceTogether was launched, it was clearly communicated that the application did not work in the “background”. For example, Global News reported May 22, 2020, that “An update to Apple’s iOS will allow the Alberta government’s ABTraceTogether app to operate in the background and when the iPhone is locked. However, the ABTraceTogether app is not using the software update yet.” (<a href="https://globalnews.ca/news/6969842/ios-update-abtracetogether-app-apple-alberta-coronavirus/">https://globalnews.ca/news/6969842/ios-update-abtracetogether-app-apple-alberta-coronavirus/</a>)</p>

<p>On September 24, 2020, an update to ABTraceTogether was launched, which included in the release notes “added capability to do contact tracing while the application is in the background” (<a href="https://apps.apple.com/ca/app/abtracetogether/id1508213665">https://apps.apple.com/ca/app/abtracetogether/id1508213665</a>).</p>

<p><img src="../assets/abtracetogether/image2.png" alt="Screen capture of the Version History from ABTraceTogether sourced from the Apple App store, indicating an update in version 1.4.0 that &quot;added capability to do contact tracing while the application is in the background&quot;" /></p>

<p>However, doubts arose that this was an accurate description of the capabilities. The Singaporean application TraceTogether also had an update to allow background utilization, but it included the notes that “Except for very limited circumstances, iOS apps are still unable to exchange signals with other iOS apps if both apps are in the background.” (<a href="https://support.tracetogether.gov.sg/hc/en-sg/articles/360050556334-I-see-COVID-19-Exposure-Notifications-in-my-phone-s-operating-system-Do-I-need-to-turn-it-on-for-my-TraceTogether-App-to-work-in-the-background-">https://support.tracetogether.gov.sg/hc/en-sg/articles/360050556334-I-see-COVID-19-Exposure-Notifications-in-my-phone-s-operating-system-Do-I-need-to-turn-it-on-for-my-TraceTogether-App-to-work-in-the-background-</a>)</p>

<p><img src="../assets/abtracetogether/image4.png" alt="Screen capture of the release notes from the Singaporean application, noting that &quot;Except for very limited circumstances, iOS apps are still unable to exchange signals with other iOS apps if both apps are in the background&quot;" /></p>

<p>Based upon a review of public literature, it seems unlikely that ABTraceTogether can work correctly as expected. Reference literature:</p>

<ul>
  <li><a href="https://medium.com/kinandcartacreated/why-bespoke-contact-tracing-apps-dont-work-so-well-on-ios-df0dabd95d42">https://medium.com/kinandcartacreated/why-bespoke-contact-tracing-apps-dont-work-so-well-on-ios-df0dabd95d42</a></li>
  <li><a href="https://github.com/opentrace-community/opentrace-ios/issues/4">https://github.com/opentrace-community/opentrace-ios/issues/4</a></li>
</ul>

<p>There is sufficient reason to believe that Apple enforces background restrictions that would prevent ABTraceTogether from performing the type of Bluetooth scanning required for the OpenTrace protocol to work correctly.</p>

<p>Experimental testing will resolve whether the ABTraceTogether application has capabilities beyond those documented in the TraceTogether application.</p>

<h1 id="test-protocol">Test Protocol</h1>

<ul>
  <li>ABTraceTogether was installed on two iPhones</li>
  <li>The iPhones were exposed to each other with ABTraceTogether in the “foreground” for a period of time
    <ul>
      <li>Foreground: Both phones were left unlocked, with the ABTraceTogether application running</li>
    </ul>
  </li>
  <li>Other apps on the phones were used, to make ABTraceTogether run in the “background” for an extended period of time
    <ul>
      <li>ABTraceTogether was not closed via the App Switcher</li>
    </ul>
  </li>
  <li>Data collected from one of the phones was analyzed
    <ul>
      <li>See Appendix for more detailed information on the data collection procedure</li>
    </ul>
  </li>
</ul>

<h1 id="test-details">Test Details</h1>

<ul>
  <li>iPhone #1:
    <ul>
      <li>iPhone 11 Pro</li>
      <li>iOS 14.2</li>
      <li>ABTraceTogether version 1.4.0 installed from the Apple App Store</li>
    </ul>
  </li>
  <li>iPhone #2:
    <ul>
      <li>iPhone 6s</li>
      <li>iOS 14.1</li>
      <li>ABTraceTogether version 1.4.0 installed from the Apple App Store</li>
    </ul>
  </li>
  <li>Android #1:
    <ul>
      <li>OnePlus 7T</li>
      <li>OxygenOS (Android) 10.0.14.HD65AA</li>
      <li>ABTraceTogether version 1.4.0 installed from the Google Play Store</li>
    </ul>
  </li>
  <li>After the installation, the two iPhones were left with ABTraceTogether running in the foreground between 7:22am and 8:00am on November 10th</li>
  <li>Other applications were used on both iPhones; the two iPhones were physically within 2 meters of each other throughout November 10th between 8am and 6pm</li>
  <li>Android #1 entered physical proximity of both iPhones around 8am for less than 10 minutes</li>
  <li>Android #1 entered physical proximity of both iPhones around 12pm for about 1 hour</li>
  <li>Android #1 entered physical proximity of both iPhones around 5pm for about 1 hour</li>
  <li>Between 6pm and the next morning around 7am, iPhone #1 and iPhone #2 were physically separated</li>
  <li>The next morning around 6:30am, iPhone #1, iPhone #2, and Android #1 were all physically reunited</li>
  <li>Test data was collected from iPhone #2 at 6:49am by performing a backup to a macOS device, extracting the SQLite tracer.sqlite DB from the application, and exporting all data to a spreadsheet for analysis and reformatting</li>
  <li>Between 7:39am and 7:50am (Nov 11) iPhone #1 started running ABTraceTogether in the foreground</li>
  <li>A second test data set was collected from iPhone #2 at 7:50am, following the same procedure as the first test data set.</li>
</ul>

<h1 id="data--analysis">Data &amp; Analysis</h1>

<ul>
  <li>Raw data from both data sets was imported into a spreadsheet for analysis
    <ul>
      <li>Original analysis format was <a href="https://docs.google.com/spreadsheets/d/1c7y8jlrZgyt2QCadNvdLBt6_w7YT1lnfS3XkmoaaS9Q/edit#gid=959594780">Google Sheets</a></li>
      <li>An archive of the same spreadsheet is available in a <a href="../assets/abtracetogether/2020-11-11%20-%20ABTraceTogether%20iPhone%20Background%20Capabilities%20Analysis.xlsx">Microsoft Excel</a> and <a href="../assets/abtracetogether/2020-11-11%20-%20ABTraceTogether%20iPhone%20Background%20Capabilities%20Analysis.ods">OpenDocument</a> format.</li>
    </ul>
  </li>
  <li>ABTraceTogether collects all exposures in a table named ZENCOUNTER with the following fields:
    <ul>
      <li>Z_PK INTEGER PRIMARY KEY
        <ul>
          <li>Analysis: sequentially incrementing primary key of the record</li>
        </ul>
      </li>
      <li>Z_ENT INTEGER
        <ul>
          <li>Analysis: always contains the value 1</li>
        </ul>
      </li>
      <li>Z_OPT INTEGER
        <ul>
          <li>Analysis: always contains the value 1</li>
        </ul>
      </li>
      <li>ZV INTEGER
        <ul>
          <li>Analysis: Contains the value 1 for an initial record “Scanning started”, and contains the value 2 for all exposure records</li>
        </ul>
      </li>
      <li>ZRSSI FLOAT
        <ul>
          <li>Analysis: measurement of the received signal strength indicator, likely in dBm; the values range from -49 to -101</li>
        </ul>
      </li>
      <li>ZTIMESTAMP TIMESTAMP
        <ul>
          <li>Analysis: Cocoa date-time value, which is measured in seconds since midnight, January 1, 2001, in UTC</li>
        </ul>
      </li>
      <li>ZTXPOWER FLOAT
        <ul>
          <li>Data: NULL, 8, 12, or -1</li>
          <li>Analysis: Not clear</li>
        </ul>
      </li>
      <li>ZMODELC VARCHAR
        <ul>
          <li>Data: “iPhone” or “Android”</li>
          <li>Analysis: In Bluetooth Low Energy, a “central” device scans for and initiates connections, while a “peripheral” device advertises and accepts them. The “C” suffix suggests which type of device acted as the “central” in this exposure.</li>
        </ul>
      </li>
      <li>ZMODELP VARCHAR
        <ul>
          <li>Data: “iPhone” or “Android”</li>
          <li>Analysis: As per field ZMODELC, this is believed to be the model of the peripheral device.</li>
        </ul>
      </li>
      <li>ZMSG VARCHAR
        <ul>
          <li>Data: 61 bytes of BASE64 encoded data; repeated multiple times for each exposure</li>
          <li>Analysis: This is the exposure ID of the target device; there seems to be no confidential information in the base64 decoded data</li>
          <li>The data published in this spreadsheet has been trimmed to the first 30 characters in the unlikely event that confidential or sensitive information is contained in the token</li>
        </ul>
      </li>
      <li>ZORG VARCHAR
        <ul>
          <li>Data: String “CA_AB”</li>
          <li>Analysis: Possible future expansion to multiple regions, or integration with other OpenTrace-supported applications</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Data collected:
    <ul>
      <li>47 exposures were collected between 7:22am Nov 10 and 6:49am Nov 11</li>
      <li>Three unique ZMSGs were found between iPhone devices
        <ul>
          <li>One ZMSG occurred 28 times</li>
          <li>One ZMSG occurred 13 times</li>
          <li>One ZMSG occurred 2 times</li>
        </ul>
      </li>
      <li>Two unique ZMSGs were found between iPhone and Android devices
        <ul>
          <li>One ZMSG occurred 1 time</li>
          <li>One ZMSG occurred 2 times</li>
        </ul>
      </li>
      <li>An additional 16 exposures were collected between 6:49am Nov 11 and 7:50am Nov 11</li>
    </ul>
  </li>
  <li>Analysis:
    <ul>
      <li>The three iPhone exposures are believed to be:
        <ul>
          <li>iPhone #1 -&gt; iPhone #2</li>
          <li>iPhone #2 -&gt; iPhone #2</li>
          <li>An unidentified iPhone -&gt; iPhone #2 (2 times)
            <ul>
              <li>These rarer iPhone-to-iPhone connections had much lower RSSIs, -101 and -98, compared to the RSSIs around -68 for the more common exposures</li>
            </ul>
          </li>
        </ul>
      </li>
      <li>The two Android exposures are believed to be:
        <ul>
          <li>Android #1 -&gt; iPhone #2</li>
          <li>An unidentified Android -&gt; iPhone #2</li>
        </ul>
      </li>
      <li>In test data set #1, all iPhone to iPhone exposures occurred while iPhone #2 had the ABTraceTogether application running in the foreground
        <ul>
          <li>44 iPhone/iPhone exposures occurred in the 30 minutes that the app was running in the foreground on both devices</li>
          <li>0 iPhone/iPhone exposures occurred in the 23 hours that the app was running in the background on both devices</li>
        </ul>
      </li>
      <li>In test data set #2, 16 additional exposures occurred
        <ul>
          <li>2 iPhone/iPhone exposures occurred while both devices were believed to have the app running in the background</li>
          <li>14 exposures occurred in the 10 minutes that the app was running in the foreground on at least one iPhone device</li>
        </ul>
      </li>
      <li>Android/iPhone exposures were also recorded rarely, considering the proximity of the devices; during two 1-hour exposure windows on Nov 10, no exposures were recorded on the iPhone</li>
    </ul>
  </li>
</ul>
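<p>The schema analysis above can be applied with a short script. The following is a minimal sketch (the helper names are my own; the table, columns, and Cocoa epoch interpretation are those described above) that reads <code>tracer.sqlite</code> and converts each ZTIMESTAMP to a UTC datetime:</p>

```python
import sqlite3
from datetime import datetime, timezone

# Seconds between the Unix epoch (1970-01-01 UTC) and the Cocoa/Core Data
# reference date (2001-01-01 UTC).
COCOA_EPOCH_OFFSET = 978307200


def cocoa_to_utc(ztimestamp):
    """Convert a Core Data ZTIMESTAMP value to an aware UTC datetime."""
    return datetime.fromtimestamp(ztimestamp + COCOA_EPOCH_OFFSET, tz=timezone.utc)


def read_encounters(db_path="tracer.sqlite"):
    """Return (utc_time, rssi, central_model, peripheral_model) tuples
    for every record in the ZENCOUNTER table."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT ZTIMESTAMP, ZRSSI, ZMODELC, ZMODELP FROM ZENCOUNTER"
        ).fetchall()
    finally:
        conn.close()
    return [(cocoa_to_utc(ts), rssi, c, p) for ts, rssi, c, p in rows]
```

<p>Bucketing the converted timestamps against the foreground/background windows in the Test Details makes the gaps in recording straightforward to spot.</p>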

<h1 id="conclusion">Conclusion</h1>

<p>ABTraceTogether does not effectively record iPhone-to-iPhone exposures while both devices have ABTraceTogether running in the background. In over 12 hours of close exposure between two iPhones with ABTraceTogether in the background, zero records of exposure were recorded. After bringing the app into the foreground on an iPhone, exposure records began being collected immediately.</p>

<p>Android/iPhone exposure recording is also quite inconsistent. While some exposures were recorded while both devices were in the background, recording did not consistently coincide with prolonged physical exposure. One hour of physical proximity is more than sufficient to contract COVID-19, yet it did not trigger ABTraceTogether’s exposure recording.</p>

<h1 id="follow-up-work">Follow-Up Work</h1>

<p>One iPhone/iPhone exposure was recorded when neither device is believed to have had the app in the foreground. A testing error likely occurred, as this abnormal finding is inconsistent with many hours of previous data collection. Continued data collection will be performed to identify the likelihood of this occurring again.</p>

<ul>
  <li>Update 2020-11-11 @ 11:05am – Twitter user @ZiadFazel has offered a theory that this exposure may have occurred because one iPhone was plugged in, allowing the power-saving features of Apple’s platform to be disabled. It is true that one phone was plugged in; iPhone #2 was prep’d for another data backup during this time window. This theory is unproven, but a plausible explanation for the background exposure in this small time window.</li>
  <li>Update 2020-11-12 @ 6:00am – An analysis of an additional 24 hours of exposure data from iPhone #2 has shown results consistent with the first 24 hours of data; no iPhone to iPhone exposures were recorded despite iPhone #1 and #2 being in close proximity, with the apps in the background. iPhone #2 to Android #1 exposures were observed in two roughly 10-minute windows, 12 hours apart, despite ongoing daily exposure, which continues to indicate that iPhone-Android exposure is also not consistent.</li>
</ul>

<p>Android/Android exposure notification could not be tested with the data source used in this testing. Future experimentation on the effectiveness of Android exposure detection is recommended, but a rooted Android device is likely required to collect an Android exposure database.</p>

<h1 id="appendix">Appendix</h1>

<h2 id="collecting-iphone-exposure-database">Collecting iPhone Exposure Database</h2>

<p>Using the software iMazing, a full iPhone backup was performed. Accessing tracer.sqlite from the application is trivial, as demonstrated in the below screenshot.</p>

<p><img src="../assets/abtracetogether/image3.png" alt="Screen capture of extracting the &quot;tracer.sqlite&quot; file from an iPhone using the tool iMazing" /></p>

<p>As with most Core Data based iOS applications, the data is stored in a SQLite database. This specific database was a SQLite 3.x database, last written using SQLite version 3032003.</p>
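<p>The numeric version stamp follows SQLite’s <code>SQLITE_VERSION_NUMBER</code> encoding (major * 1000000 + minor * 1000 + patch), so 3032003 corresponds to SQLite 3.32.3. A small helper (the function name is mine) decodes it:</p>

```python
def decode_sqlite_version(version_number):
    """Decode a SQLITE_VERSION_NUMBER value (major*1000000 + minor*1000 + patch)
    into a human-readable "major.minor.patch" string."""
    major, rest = divmod(version_number, 1_000_000)
    minor, patch = divmod(rest, 1_000)
    return f"{major}.{minor}.{patch}"


print(decode_sqlite_version(3032003))  # → 3.32.3, the stamp seen in tracer.sqlite
```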

<p>Data was exported from the SQLite database for analysis in Google Sheets by exporting the ZENCOUNTER table to CSV:</p>

<p><img src="../assets/abtracetogether/image1.png" alt="Screen capture of extracting the SQLite database &quot;tracer.sqlite&quot; to a CSV files" /></p>]]></content><author><name></name></author><summary type="html"><![CDATA[The Government of Alberta advertises that ABTraceTogether, their COVID contact tracing app, works on iPhones when run in the background. They are incorrect.]]></summary></entry><entry><title type="html">Is “404 Not Found” really a client error?</title><link href="https://mathieu.fenniak.net/is-404-not-found-really-a-client-error/" rel="alternate" type="text/html" title="Is “404 Not Found” really a client error?" /><published>2013-08-23T07:00:00+00:00</published><updated>2013-08-23T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/is-404-not-found-really-a-client-error</id><content type="html" xml:base="https://mathieu.fenniak.net/is-404-not-found-really-a-client-error/"><![CDATA[<p>A time-traveler is passing by 2013 and she opens a browser bookmark to <code class="language-plaintext highlighter-rouge">http://mars.gov/blog/**2056**/11/21/news.html</code>.</p>

<p>What HTTP status code does she get back in the response?</p>

<p>Well, it’s not going to be <code class="language-plaintext highlighter-rouge">200 OK</code>, because it wasn’t OK with the server. The server couldn’t find the article that the client requested, because it won’t be published for another 43 years.</p>

<p>“Couldn’t find the article” sounds like a <code class="language-plaintext highlighter-rouge">404 Not Found</code> status code. OK, very reasonable choice.</p>

<p>But, “<strong>The server</strong> couldn’t find the article” raises a bit of a doubt. A 404 is part of the 4xx-series status codes, which are all for client errors. Was this a client error, if it was the server’s fault for not finding the article? Shouldn’t she be getting <code class="language-plaintext highlighter-rouge">5xx Not Found</code>?</p>

<!--more-->

<p>HTTP error status codes fall into two broad categories: 4xx (400 - 499) and 5xx (500 - 599). Understanding these categories is more important than understanding the specific error codes, because the categories communicate a very important piece of information. 4xx status codes represent a problem with the HTTP request, whereas 5xx status codes represent a problem with the HTTP server.</p>

<p>However, the line can be blurry between these two categories, if you just consider them the difference between “HTTP request” and “HTTP server”. If resources can be created in the future, and they can, then the exact same HTTP request that was 404 today could be a 200 tomorrow. So, is it really a problem with the request?</p>

<p>To address this confusion, I like to add an additional rule to clarify between 4xx and 5xx. When servicing an HTTP request, was it possible to give a 200 OK response? Let’s say it was a bug-free application server, or possibly even a human being manually processing HTTP requests. If it is possible, given perfect software or infinite resources or a brilliant mind, to process the request, but an error occurred, then it should be a 5xx code response. On the other hand, if no amount of perfection, resources, or brilliance would have been capable of processing that request, then it is a 4xx error.</p>
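<p>To make that rule concrete, here is a toy request handler (the function, dates, and failure mode are invented for illustration). It returns a 4xx when no conceivable server could have satisfied the request, and a 5xx when a working server would have succeeded:</p>

```python
from datetime import date


def blog_status(requested: date, today: date, db_available: bool = True) -> int:
    """Pick a status code by asking: could a perfect, bug-free server
    have returned 200 OK for this request?"""
    if requested > today:
        # No amount of perfect software can serve an article from the
        # future; the request itself is unsatisfiable -> client error.
        return 404
    if not db_available:
        # A healthy server would have served this article; the failure
        # is on the server's side -> server error.
        return 503
    return 200


print(blog_status(date(2056, 11, 21), today=date(2013, 8, 23)))  # → 404
```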

<p>Maybe our time-traveler should get back a <code class="language-plaintext highlighter-rouge">505 HTTP Version Not Supported</code> response, because the server could never have understood her attempt to use the HTTP/9.7 protocol. But other than that, she should get a good old <code class="language-plaintext highlighter-rouge">404 Not Found</code>; even with bug-free flawless server software, it was impossible to process her request. A 4xx error code suggests that the HTTP client needs to do something differently; like time-travel forward 43 years.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[A time-traveler is passing by 2013 and she opens a browser bookmark to http://mars.gov/blog/**2056**/11/21/news.html. What HTTP status code does she get back from her response? Well, it’s not going to be 200 OK, because it wasn’t OK with the server. The server couldn’t find the article that the client requested, because it won’t be published for another 43 years. “Couldn’t find the article” sounds like a 404 Not Found status code. OK, very reasonable choice. But, “The server couldn’t find the article” raises a bit of a doubt. A 404 is part of the 4xx-series status codes, which are all for client errors. Was this a client error, if it was the server’s fault for not finding the article? 
Shouldn’t she be getting 5xx Not Found?]]></summary></entry><entry><title type="html">High-Level API Documentation Considerations</title><link href="https://mathieu.fenniak.net/high-level-api-documentation-considerations/" rel="alternate" type="text/html" title="High-Level API Documentation Considerations" /><published>2013-07-15T07:00:00+00:00</published><updated>2013-07-15T07:00:00+00:00</updated><id>https://mathieu.fenniak.net/high-level-api-documentation-considerations</id><content type="html" xml:base="https://mathieu.fenniak.net/high-level-api-documentation-considerations/"><![CDATA[<p>When you’re building a Web API, you’re likely going to need to figure out how to provide documentation to your end-users. If you didn’t think of that when you started your project, it can be a real deflating moment when it finally comes to mind. Documentation is boring and tedious. It’s nowhere near as fun as code.</p>

<p>Because documentation is boring, most software developers will quickly turn towards documentation tools. It’ll be so much faster to write documentation if we can just build or apply some software to do it, and building a software tool is much more fun than just sitting down and writing.</p>

<p>Documentation tools have advantages over hand-written documentation: they’re easier to keep up-to-date, they can produce output much faster, and they can provide much better coverage with less effort. But, <strong>before you go looking at API documentation tools</strong>, take a step back and figure out some high-level details about what your documentation looks like.</p>

<!--more-->

<h3 id="who-is-authoring-editing-and-maintaining-the-documentation-project-now-and-in-the-future"><strong>Who</strong> is authoring, editing, and maintaining the documentation project, now and in the future?</h3>

<p>This is the most important point to consider before you choose a documentation tool. Is your documentation going to be written by a technical writer, a developer, or a product owner? Is it going to be authored and edited in-house, or are you going to outsource to someone with more expertise?</p>

<p>If you have a technical writer involved in the documentation of your API, you need to consider the environment, systems, and processes that they’re used to working with. Maybe they’re not comfortable writing Javadoc comments; knowing that will definitely influence your tool selection.</p>

<p><em>Example:</em> A first-draft of documentation comes from the API developer, because they know what the API can do and how to describe it to other developers. After that, it’s usually rewritten and then maintained by a technical writer; that will influence my documentation tool search towards tools that a writer can work with.</p>

<h3 id="how-and-where-is-it-going-to-be-published"><strong>How</strong> and <strong>where</strong> is it going to be published?</h3>

<p>Is your documentation output part of your product, part of your marketing website, part of your support team’s website, or something completely independent? This will vary dramatically between smaller organizations and larger ones, but there are important considerations that need to be figured out.</p>

<p>If you’re a development shop that wants to do continuous deployment, do you want a technical writer to be pushing builds out when they update the documentation project?</p>

<p><em>Example:</em> In an organization with separated web responsibilities between marketing and product, documentation probably belongs hosted with the marketing web site. It doesn’t have the same availability and compliance requirements as product changes.</p>

<h3 id="when-is-it-going-to-be-updated-and-released"><strong>When</strong> is it going to be updated and released?</h3>

<p>Do you release your documentation at the same time as your software upgrades? I think that’s the “default” choice for many people because it allows them to have an integrated release process, but you’ll find that it creates some friction. It can be more difficult to update documentation to cover new capabilities when they aren’t available to be tested, screenshotted, or described by the doc authors.</p>

<p>If you have a rigorous change control process, do documentation changes really need to go through that rigmarole? Does every documentation update need a JIRA issue and a Q/A sign-off?</p>

<h3 id="what-formats-do-you-want-to-present-the-documentation-in"><strong>What</strong> formats do you want to present the documentation in?</h3>

<p>HTML, obviously. Virtually everyone wants HTML documentation.</p>

<p>But, do you also want a more standalone documentation format? It can be quite handy to have PDF output that can be e-mailed to a prospective client, or can be archived on a network folder somewhere.</p>

<hr />

<p>Once you’ve gathered all that information, you might start to have a picture of what you want your documentation tools to look like. Maybe it’s an integrated documentation tool, but maybe not.</p>

<p>Personally, I like <a href="http://sphinx-doc.org/">Sphinx</a> as a documentation tool. It can fit into a wide variety of workflows, it can skirt the edge between automatically-generated and manually-written documentation, and it can output in multiple formats including HTML and PDF. <a href="http://pythonhosted.org/sphinxcontrib-httpdomain/">sphinxcontrib-httpdomain</a> is an excellent contrib add-on specifically for documenting HTTP APIs.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[When you’re building a Web API, you’re likely going to need to figure out how to provide documentation to your end-users. If you didn’t think of that when you started your project, it can be a real deflating moment when it finally comes to mind. Documentation is boring and tedious. It’s nowhere near as fun as code. Because documentation is boring, most software developers will quickly turn towards documentation tools. It’ll be so much faster to write documentation if we can just build or apply some software to do it, and building a software tool is much more fun than just sitting down and writing. Documentation tools have advantages over hand-written documentation: they’re easier to keep up-to-date, they can produce output much faster, and they can provide much better coverage with less effort. But, before you go looking at API documentation tools, take a step back and figure out some high-level details about what your documentation looks like.]]></summary></entry></feed>