There is a tension between the requirement that (a) voter privacy should be strongly protected, and (b) cast vote records should be continuously exported in order to ensure that they can be immediately available in the case of hardware failure. For example, if each scanned ballot naively results in the creation of a new CVR file on the USB drive with a timestamp, the order of CVRs is obviously preserved on the USB drive, as the file creation and modification timestamps reveal the order in which those CVRs were stored.
To meet both requirements, VxScan performs some amount of "shuffling" every time a CVR is saved to the USB drive. Every time a ballot is cast and its CVR is saved to the disk, VxScan picks one or two CVRs already stored on the USB drive and updates their creation and modification time on the USB drive. This is the digital equivalent of taking one or two random ballots from a pile and bringing them to the top of the pile. Thus, if an attacker were to view the CVRs on the USB drive, they would not be able to determine the order in which those ballots were cast, because of this constant shuffling.
Recall from Cast Vote Records that a single CVR is structured, on disk, as a directory that contains the JSON data of the CVR and the ballot images for that CVR. The VxScan operation performed to move a single CVR "up to the top of the pile" involves four steps:
Copy the CVR directory contents:
cp -r 4d6a9dad-e6d6-4a29-89bc-9ab915012b73/ 4d6a9dad-e6d6-4a29-89bc-9ab915012b73-temp/
Once copying completes, mark the new directory as complete:
mv 4d6a9dad-e6d6-4a29-89bc-9ab915012b73-temp/ 4d6a9dad-e6d6-4a29-89bc-9ab915012b73-temp-complete/
Delete the old directory:
rm -r 4d6a9dad-e6d6-4a29-89bc-9ab915012b73/
Rename the new directory:
mv 4d6a9dad-e6d6-4a29-89bc-9ab915012b73-temp-complete/ 4d6a9dad-e6d6-4a29-89bc-9ab915012b73/
This four-step operation ensures that:
All metadata for that CVR (parent directory and children files) is updated, leaving no trace as to when that CVR was first saved to disk.
If a failure occurs at any point, it is possible to recover completely without losing any data.
Refer to the following code links for more details:
Our precinct scanners export cast vote records (CVRs) to USB drive continuously, as ballots are cast. Continuous export also allows for a speedy polls close, as cast vote records need not be exported all at once on polls close.
To perform continuous export while also keeping signatures, i.e., authenticity information, up-to-date, we have to be able to write both CVRs and authenticity information incrementally. (It isn’t enough to just hash and sign new data. We need an up-to-date root hash/signature.) We can use a Merkle tree structure to accomplish this efficiently.
Assuming every CVR has a random UUID, e.g. 4d6a9dad-e6d6-4a29-89bc-9ab915012b73, we can specifically use the following structure:
We don’t actually have to 1) use a nested directory structure on the USB or 2) store all intermediate hashes on the USB. We can store the structure and intermediate hashes on the machine in a database table, e.g.
-
-
-
Root hash
4
-
-
4 hash
4
4d
-
4d hash
4
4d
4d6a9dad-e6d6-4a29-89bc-9ab915012b73
4d6a9dad-e6d6-4a29-89bc-9ab915012b73 hash
…
Then on the USB, we can use a flat structure that’s much easier to reason about and iterate over:
The database table makes recomputing hashes easy, as retrieval of all hashes for a given ID prefix becomes a simple SQL query. This approach also decreases the chance that we accidentally depend on data written to the USB drive when computing hashes, protecting against compromised or faulty USB drives.
In the above examples, we’ve listed a root-hash.txt file and have signed that. In practice, we use a JSON file capable of storing other metadata and sign that:
Sample metadata.json:
Whenever a machine imports a CVR directory, it authenticates the CVRs by 1) verifying the signature on the metadata.json file and 2) recomputing the “castVoteRecordRootHash” in metadata.json from scratch to ensure that it’s correct.
Refer to the following code links for more details:
https://github.com/votingworks/vxsuite/blob/v4.0.0-release-branch/libs/auth/src/cast_vote_record_hashes.ts — CVR hashing logic, including the Merkle tree implementation
When a VxSuite machine exports data to a USB for another VxSuite machine to import, the first machine digitally signs that data so that the second machine can verify its authenticity. We use this mechanism in two places in particular:
To authenticate election definitions/packages — These configuration bundles are exported by VxAdmin and used to configure VxCentralScan, VxMark, and VxScan.
To authenticate cast vote records — These are exported by VxCentralScan and VxScan and imported by VxAdmin for tabulation.
The exporting machine digitally signs the following message using its TPM private key:
MESSAGE_FORMAT_VERSION + “//” + ARTIFACT_TYPE + “//” + ARTIFACT_CONTENTS
It then outputs the following to a .vxsig file:
SIGNATURE_LENGTH + SIGNATURE + SIGNING_MACHINE_CERTIFICATE
The importing machine parses the above and verifies that the signing machine certificate in the .vxsig file is a valid certificate that is signed by VotingWorks, using the VotingWorks CA certificate installed on every machine. The importing machine then extracts the exporting/signing machine’s public key from this certificate, reconstructs the message, and verifies the signature. After this, the importing machine performs artifact-specific authentication checks, e.g., that the signing machine cert is a VxAdmin cert if the artifact is an election package.
If verification fails on the importing machine, the importing machine will refuse to import the artifact. This provides protection against data tampering and/or corruption as data is transferred from one machine to another via USB drive.
Refer to the following code links for more details:
— VxSuite authentication lib, a good starting point for all things authentication
— Artifact authentication logic
— OpenSSL commands underlying various authentication and signing operations