Software Tools for Writing Reproducible Papers


This post is a ♯longread primarily intended for graduate students and postdocs, but should hopefully be accessible more broadly. Reading through the post should take about an hour, while following the instructions completely may take the better part of a day.

As an important caveat, much of what this post discusses is still experimental, such that you may run into minor issues in following the steps listed below. I apologize if this happens, and thank you for your patience.

In any case, if you find this post useful, please cite it in papers that you write using these tools; doing so helps me out and makes it easier for me to write more such advice in the future.

Finally, we note that we have not covered several very important tools here, such as ReproZip. This post is already over 6,000 words long, so we didn’t attempt to run through all possible tools. We encourage further exploration, rather than thinking of this post as definitive.

Thanks for reading! ♥


In my previous post, I detailed some of the ways our software tools and social structures encourage some actions and discourage others. Especially when it comes to tasks such as writing reproducible papers that both offer to significantly improve research culture, but are somewhat challening in their own right, it’s critical to ensure that we positively encourage doing things a bit better than we’ve done them before. That said, though my previous post spilled quite a few pixels on the what and the why of such encouragements, and of what support we need for reproducible research practices, I said very little about how one could practically do better.

This post tries to improve on that by offering a concrete and specific workflow that makes it somewhat easier to write the best papers we can. Importantly, in doing so, I will focus on a paper-writing process that I’ve developed for my own use and that works well for me— everyone approaches things differently, so you may disagree (perhaps even vehemently) with some of the choices I describe here. Even if so, however, I hope that in providing a specific set of software tools that work well together to support reproducible research, I can at least move the conversation forward and make my little corner of academia ever so slightly better.

Having said what my goals are with this post, it’s worth taking a moment to consider what technical goals we should strive for in developing and configuring software tools for use in our research. First and foremost, I have focused on tools that are cross-platform: it is not my place nor my desire to mandate what operating system any particular researcher should use. Moreover, we often have to collaborate with people that make dramatically different choices about their software environments. Thus, we must be careful what barriers to entry we establish when we use methodologies that do not port well to platforms other than our own.

Next, I have focused on tools which minimize the amount of closed-source software that is required to get research done. The conflict between closed-source software and reproducibility is obvious nearly to the point of being self-evident. Thus, without being purists about the issue, it is still useful to reduce our reliance on closed-source gatekeepers as much as is reasonable given other constraints.

The last and perhaps least obvious goal that I will adopt in this post is that each tool we develop or adopt here should be useful for more than a single purpose. Installing software introduces a new cognative load in understanding how it operates, and adds to the general maintenance cost we pay in doing research. While this can be mitigated in part with appropriate use of package management, we should also be careful that we justify each piece of our software infrastructure in terms of what benefits it provides to us. In this post, that means specifically that we will choose things that solve more than just the immediate problem at hand, but that support our research efforts more generally.

Without further ado, then, the rest of this post steps through one particular software stack for reproducible research in a piece by piece fashion. I have tried to keep this discussion detailed, but not esoteric, in the hopes of making an accessible description. In particular, I have not focused at all on how to develop scientific software of how to write reproducible code, but rather how to integrate such code into a high-quality manuscript. My advice is thus necessarily specific to what I know, quantum information, but should be readily adapted to other fields.

Following that, I’ll detail the following components of a software stack for writing reproducible research papers:

  • Command-line environment: PowerShell
  • TeX / LaTeX distribution: TeX Live and MiKTeX
  • Literate programming environment: Jupyter Notebook
  • Text editor: Visual Studio Code
  • LaTeX template: {revtex4-1}, {quantumarticle}, and {revquantum}
  • Project layout
  • Version control: Git
  • arXiv build management: PoShTeX

Command Line

Command-line interfaces and scripting languages provide a very powerful paradigm for automating disparate software tools into a single coherent process. For our purposes, we will need to automate running TeX, running literate scientific computing notebooks, and packaging the results for publication. Since these different software tools were not explicitly designed to work together, it will be easiest for our purposes to use a command-line scripting language to manage the entire process. There are many compelling options out there, including celebrated and versatile systems such as bash, tcsh, and zsh, as well as newer tools such as fish and xonsh. For this post, however, I will describe how to use Microsoft’s open-source PowerShell instead.

Microsoft offers PowerShell easy-to-install packages for Linux and macOS / OS X on at their GitHub repository. For most Windows users, we don’t need to install PowerShell, but we will need to install a package manager to help us install a couple things later. If you don’t already have Chocolatey, go on and install it now, following their instructions.

Similarly, we will use the package manager Homebrew for macOS / OS X. The quickest way to install it is to run the following command in Terminal:

$ /usr/bin/ruby -e "$(curl -fsSL"

Also, be sure to restart your Terminal window after the installation. Then, we install PowerShell with the following two commands:

$ brew tap caskroom/cask
$ brew cask install powershell

The first command installs the Homebrew Cask extension for programs distributed as binaries.

Aside: Why PowerShell?

As a brief aside, why PowerShell? One of the long-standing annoyances to getting anything working in a reliable and cross-platform manner is that Windows is just not like Linux or macOS / OS X. My interest here is not in saying that Windows is either good or bad, but to solve the problem of how to get things working across that divide in a reliable way. Certainly, scripting languages like bash have been ported to Windows and work well there, but they don’t tend to work in a way that plays well with native tools. For instance, it is difficult to get Cygwin Bash to reliably interoperate with commonly-used TeX distributions such as MiKTeX.

Many of these challenges arise from that bash and other such tools work by manipulating strings, rather than providing a typed programming environment. Thus, any cross-platform portability must eventually deal with annoyances such as / versus \ in file name paths, while leaving slashes invariant in cases such as TeX source.

By contrast, PowerShell can be used as a command-line REPL (read-evaluate-print loop) interface to the more structrued .NET programming environment. That way, OS-specific differences such as / versus \ can be handled as an API, rather than relying on string parsing for everything. Moreover, PowerShell comes pre-installed on most recent versions of Windows, making it easier to deal with the comaprative lack of package management on most Windows installations. (PowerShell even addresses this by providing some very nice package management features, which we will use in later sections.)

Since PowerShell has recently been open-sourced, we can readily rely on it for our purposes here.


For writing a reproducible scientific paper, there’s really no substitute still for TeX. Thus, if you don’t have TeX installed already, let’s go on and install that now.

(Linux only) TeX Live

We can use Ubuntu’s package manager to easily install TeX Live:

$ sudo apt install texlive

The process will be slightly different on other variants of Linux.

(Windows only) MiKTeX

Since we installed Chocolatey earlier, it’s quite straightforward to install MiKTeX. From an Administrator session of PowerShell (right-click on PowerShell in the Start menu, and press Run as administrator), run the following command:

PS> choco install miktex

(macOS / OS X only) MacTeX

Installing MacTeX is similarly straightforward using Homebrew Cask (which we should have installed earlier):

$ brew cask install mactex


Moving on, let’s take a few seconds to get Jupyter up and running. Put succiently, Jupyter is a powerful infrastructure fo scientific programming in a variety of different languages. Indeed, even the name points to the diversity of tools supported, as it originates from a portmanteau of Julia, Python and R. Jupyter goes well beyond these three examples, though, and supports a language-agnostic interface for programming in JavaScript, F#, and even MATLAB.

Of particular interest to us is the Jupyter Notebook functionality, previously known as IPython Notebook. This tool allows us to write literate documents that intersperse source code, explanations, mathematics, figures and plots. As such, Jupyter Notebook is ideal for providing lucid and readable explanations of numerical and experimental results, providing a way to clearly explain a reproducible project.

Though Jupyter is a language-independent framework, the code infrastructure itself is written in Python. Thus, the easiest way to get Jupyter in a cross-platform way is to install a distribution of Python, such as Anaconda, that incldues Jupyter as a package. Since we want to focus in this post on how to write papers rather than on the programming aspects, we won’t go into detail at the moment on how to use Jupyter; below, we suggest some resources for getting started with Jupyter as a programming tool. For now, we focus on getting Jupyter installed and running.

On Windows, we can again rely on Chocolatey:

PS> choco install anaconda3

On Linux and macOS / OS X, the process is not much more complicated.

To get started using Juyter Notebook, we suggest the following tutorial:


In keeping with our goals in the introduction, to actually write TeX source code, we don’t want a tool that works only for TeX. Rather, we want something general-purpose that is also useful for TeX. By doing so, we avoid the all-too-familiar workflow of using a specialized editor for each different part of a scientific project. This way, increased familiarity and proficiency with our software tools benefits us across the board.

With that in mind, we’ll follow the example of Visual Studio Code, an open-source and cross-platform text editing and development platform from Microsoft. Notably, many other good examples exist, such as Atom; we focus on VS Code here as an example rather than as a recommendation over other tools.

With that aside, let’s start by installing.

If you’re running on Ubuntu or macOS / OS X, let’s download Visual Studio Code from the VS Code website. Alternatively for macOS / OS X, you can use Homebrew Cask

$ brew cask install visual-studio-code

On Ubuntu, we only need to install VS Code manually the first time; after that, Code can be managed using Ubuntu Software Center in the same manner as built-in packages. Meanwhile, the macOS / OS X version is installed by dragging the downloaded app into Applications.

Once again, Chocolatey comes to the rescue for Windows users:

PS> choco install visualstudiocode

In any case, once we have VS Code installed, let’s install a few extensions that will make our lives much easier in the rest of this post. Thankfully, this is quite straightforward due to the use of extension packs. Roughly, an extension pack is a special kind of extension that does nothing on its own, but specifies a list of other extensions that should be installed. I maintain a rudimentary example of such for use in scientific computing that includes some useful extensions for our purposes here. To install it, press Ctrl+Shift+X (Windows and Linux) / ⌘+Shift+X (macOS / OS X) to open the Extensions panel, and search for cgranade.scicomp-extension-pack. Though the full functionality exposed by these extensions is beyond the scope of this post, we’ll explore some important parts as we discuss other parts of our software stack.

For the most part, the extensions installed by the Scientific Computing Extension Pack do not need any configuration. The exception is that for MiKTeX on Windows, the LaTeX Workshop extension needs to be configured to run texify instead of its default build engine of latexmk. To do so press Ctrl+Shift+P / ⌘+Shift+P and type “Settings” until you are offered “Preferences: Open User Settings.” Next, copy the following JavaScript Object Notation (JSON) code into your user settings:

"latex-workshop.latex.toolchain": [
        "command": "texify",
        "args": [

Getting forward and inverse search with SyncTeX working on Windows also takes a slight bit more work, as is documented on StackExchange.

This demonstrates one of the really neat features of modern editing platforms, by the way. Namely, it’s very easy to share human-readable configuration snippets with others, making it easier to build a common platform with collegues and collaborators.

LaTeX Template

With the slight caveat that this section is the most specific to quantum information processing, we next turn our attention to the raison d’être for this whole endeavor: our LaTeX manuscript itself. In doing so, we try to minimize the size of our initial template. By minimizing the amount of boilerplate, we reduce the extent to which we introduce bugs in creating new manuscripts. More importantly, though, keeping our template minimal reduces how much we have to understand in order to use and maintain it.

That said, we will generally have a lot of LaTeX code shared between projects. To keep our manuscript templates minimal and free of boilerplate, then, we can rely on LaTeX’s document class and package functionality to abstract away this shared code. For instance, the {revtex4-1} document class abstracts away much of the work involved in formatting a manuscript for physics papers, while the {quantumarticle} class does similar for the new Quantum journal. Similarly, my own {revquantum} package attempts to abstract away much of the LaTeX code that I carry from project to project.

Though {revquantum} can be downloaded from the Comprehensive TeX Archive Network (CTAN), it will be easier for us to use Git to download the latest version. We’ll install Git a bit later on in the post, so we’ll focus on the template for now and will install the required LaTeX packages once we have Git at our disposal.

Following that strategy, we can now write a very minimal LaTeX template:




\title{Example paper}

\author{Christopher Granade}





Note that this template strips down the preamble (that is, the part of the LaTeX document before \begin{document}) to just three lines:

  • \documentclass[aps,pra,twocolumn,notitlepage,superscriptaddress]{revtex4-1}: Declares the document class to be {revtex4-1} and specifies some reasonable default options. Note that if an option isn’t specified for the society, journal or font size, {revtex4-1} will raise warnings. Thus, by specifying a few options, we reduce the number of spurious warnings that we have to sort through.
  • \usepackage[pretty,strict]{revquantum}: Includes the {revquantum} package with modern typesetting options. The strict option instructs {revquantum} to promote package incompatability warnings to errors, such that the manuscript will refuse to compile if there are issues with {revtex4-1} compatability.
  • \newcommand{\figurefolder}{.}: We’ll see more about this in the rest of the post, but roughly this command lets us abstract away details of our project structure from our LaTeX source. That in turn will make it much easier to rearrange the project folder as need be, as only minimal changes will be required in the LaTeX source itself.

Project Layout

Now that we have a reasonable template in place for our paper, let’s proceed to make and layout a folder for our project. The project folder needs to have somewhere to store the TeX source we use in typesetting the paper, and will likely need somewhere to store figures as well. Assuming we have either numerics or an experiment in our paper, we will also need somewhere to put our Jupyter Notebooks and any other source files that they rely upon.

Putting these needs together, my projects often wind up looking something like this:

  • project/
    • tex/
      • project.tex: Main TeX source file.
      • project.bib: Bibliography for main TeX source.
      • revquantum.sty: A copy of the {revquantum} package. We will download and build {revquantum} later in this post.
    • fig/
      • *.pdf: PDF-formatted figures for use in the main body.
    • src/
      • project.ipynb: Main literate notebook for the project.
      • *.py: One or two miscellaneous Python modules needed for the main notebook.
    • data/: Folder for experimental data generated by src/project.ipynb.
      NB: do not use NumPy’s *.npz format for uploading data to arXiv as ancillary material, as this is not supported by arXiv. Consider using HDF5 or similar instead. If your data set is moderately-sized (> 6 MiB), then consider uploading to an external service such as Figshare instead of arXiv— this also makes it easier to use other data storage formats.
    • .gitignore: A list of files, folders, and patterns to be excluded from version control. Don’t worry about this for now, we’ll deal with it below.
    • Brief instructions on how to use the ancillary files provided with the paper.
    • environment.yml or requirements.txt: Software dependencies needed by the project. How these files work is fairly specific to programming in Python, so we won’t go into the details here, but they make it easy for both collaborators and readers to quickly set up the software environment they need to run your code. For more details, please see the documentation for pip and conda env.
    • Export-ArXiv.ps1: Build manifest for exporting the paper to an arXiv-formatted ZIP archive. Later in the post, we’ll detail what this file should contain and how to use it. For now, just make a blank text file with this name.

As with everything else in this post, you may want to layout your project differently than I’ve suggested here. For instance, we included a copy of revquantum.sty above, since that’s not part of the “standard” set of packages installed on arXiv; if you depend on any other custom packages or classes, those would also appear in your layout. If you have more than one or two other source files in src/, you may also want to make a package and distribute it separately from your paper. You may also have additional files that you need to include, such as instrument configuration files. It’s also common to have other folders like notes/ to keep rough or unfinished derivations, or refereeing/ to draft correspondance with referees and editors once you submit your paper.

What’s essential, though, is that no matter how you layout your project, you use the layout as consistantly as possible. Doing so ensures that a file you need can always be found in the least surprising place, minimizing the amount of thought you have to spare on searching your folder structure. Importantly, if this means you need to change your layout as your project itself changes, that is a fine and useful thing to do. Your layout should, above all else, work for you and your collaborators with as little friction and additional thought as is reasonable.

Version Control

Once you’ve made a project folder, we need to be able to track how it changes over time and to share it with collaborators. Though file-synchronization tools such as Dropbox, Google Drive, and OneDrive are commonly used for this task, they introduce a lot of additional maintenance costs that we would like to avoid. For instance, it’s very difficult to collaborate using such services— conflicting edits are normally applied with preference only to whichever edits happened last, making it easy to accidently lose important edits. Similarly, it’s hard to look at a source file and understand why a particular set of changes was made, such that it’s again too easy to accidentally undo edits from collaborators.

In keeping with the goals laid out at the start of the post, then, we’ll adopt distributed version control as a tool to enable collaboration and version tracking. In particular, we will use Git in no small part due to its popularity, such that we can build off a large set of community-developed tools and services. Git is a very useful tool in general, such that we again avoid overly-specialized software infrastructure.

I won’t lie: there is a learning curve to Git, such that initially it will take substantially longer to do research backed by Git than by file-synchronization tools. In fairly short order, however, learning Git pays for itself both by avoiding common pitfalls introduced by file-synchronization tools and by providing powerful automation for other tasks outside the synchronization model. Both the learning curve and the power of Git stem from the same source, in that Git is extremely reticent to erase any set of changes, no matter how insignificant. For instance, if two contradictory sets of changes are made to a file, Git will demand that you explicitly specify how to merge them, rather than automatically overwritting changes that may be significant.

We won’t cover how to use Git in this post, but rather will focus on how to install it and configure it for setting up a reproducible paper. In lieu, we recommend the following tutorials:

In following these tutorials, we recommend starting by using the command line as much as possible, as this helps build the volcabulary needed when working with graphical interfaces to Git.

In any case, let’s go on and install Git. We will install Secure Shell (SSH) while we’re at it, since this is a very common and powerful way of interfacing with Git hosting providers such as GitHub, Bitbucket, and GitLab. Notably, SSH is also very useful for other research tasks such as managing cluster resources and running Jupyter Notebooks on remote servers, such that in installing SSH we get access to another general-purpose tool.

On Windows, run the following in an Administrator PowerShell session:

PS> choco install git putty poshgit

If you haven’t already done so, you’ll need to set PuTTY to be the SSH implementation used by Git for Windows. From within PowerShell, run the following:

PS> [Environment]::SetEnvironmentVariable("GIT_SSH", (Get-Command plink.exe | Select-Object -ExpandProperty Definition), "User")

If this doesn’t work, it’s likely because Get-Command plink.exe was unable to find the plink.exe command that comes with PuTTY. This can happen, for instance, if the $Env:PATH environment variable was changed by choco install but not in your current PowerShell session. The easiest way to fix this is to close and re-open your PowerShell session.

Notice that we’ve also installed poshgit (short for PowerShell Git) with this command, as that handles a lot of nice Git-related tasks within PowerShell. To add posh-git to your prompt, please see the instructions provided with posh-git. One of the more useful effects of adding posh-git to your prompt is that posh-git will then look at your setting for $Env:GIT_SSH and automatically manage your PuTTY configuration for you.

On Ubuntu, run the following in your favorite shell:

$ sudo apt install ssh git

This may warn that some or all of the needed packages are already installed— if so, that’s fine.

On macOS / OS X, SSH is pre-installed by default. To install Git, run git at the terminal and follow the installation prompts. However, the versions of ssh and git distributed with macOS / OS X are often outdated. Homebrew to the rescue:

$ brew install ssh git

Note that posh-git also partially works on PowerShell for Linux and macOS / OS X, but does not yet properly handle setting command-line prompts.

Once everything is installed, simply run git init from within your project folder to turn your project into a Git repository. Use git add and git commit, either at the command line or using your editor’s Git support, to add your initial project folder to your local repository.

The next steps from here depend somewhat on which Git hosting provider you wish to use, but proceed roughly in four steps:

  • Create a new repository on your hosting provider.
  • Configure your SSH private and public keys for use with your hosting provider.
  • Add your hosting provider as a git remote to your local project.
  • Use git push to upload your local repository to the new remote.

Since the details depend on your choice of provider, we won’t detail them here, though some of the tutorials provided above may be useful. Rather, we suggest following documentation for the hosting provider of your choice in order to get up and running.

In any case, as promised above, we can now use Git to download and install the LaTeX packages that we require. To get {revquantum}, we’ll run the included PowerShell installer. Note that, due to a bug that installer (currently being fixed), this will fail unless you already have Pandoc installed. Thus, we’ll go on and work around that bug for now by installing Pandoc (besides, it’s useful in writing responses to referees, as I’ll discuss in a future post):

# macOS
$ brew install pandoc
# Ubuntu
$ sudo apt install pandoc
# Windows
PS> choco install pandoc

I sincerely apologize for this bug, and will have it fixed soon. In any case, and having apologized for introducing additional requirements, let’s go on and install the packages themselves:

PS> git clone
PS> cd revquantum
PS> # Only run the following on Windows, as Unblock-File isn't needed
PS> # on Linux and macOS / OS X.
PS> Unblock-File Install.ps1 # Marks that the installer is safe to run.
PS> ./Install.ps1

Installing the {quantumarticle} document class proceeds similarly:

PS> git clone
PS> cd quantum-journal
PS> # Only run the following on Windows, as Unblock-File isn't needed
PS> # on Linux and macOS / OS X.
PS> Unblock-File install.ps1 # NB: "install" is spelled with a lower-case i here!
PS> ./install.ps1

Note that in the above, we used HTTPS URLs instead of the typical SSH. This allows us to download from GitHub without having to set up our public keys first. Since at the moment, we’re only interested in downloading copies of {revquantum} and {quantumarticle}, rather than actively developing them, HTTPS URLs work fine. That said, for your own projects or for contributing changes to other projects, we recommend taking the time to set up SSH keys and using that instead.

Aside: Working with Git in VS Code

As another brief aside, it’s worth taking a moment to see how Git can help enable collaborative and reproducible work. The Scientific Computation Extension Pack for VS Code that we installed earlier includes the amazingly useful Git Extension Pack maintained by Don Jayamanne, which in turn augments the already powerful Git tools built into Code.

For instance, the Git History extension provides us with a nice visualization of the history of a Git repository. Press Ctrl/⌘+Shift+P, then type “log” until you are offered “Git: View History (git log).” Using this on the QInfer repository as an example, I am presented with a visual history of my local repository:

Clicking on any entry in the history visualization presents me with a summary of the changes introduced by that commit, and allows us to quickly make comparisons. This is invaluable in answering that age old question, “what the heck did my collaborators change in the draft this time?”

Somewhat related is the GitLens extension, which provides inline annotations about the history of a file while you edit it. By default, these annotations are only visible at the top of a section or other major division in a source file, keeping them unobtrusive during normal editing. If you temporarily want more information, however, press Alt+B to view “blame” information about a file. This will annotate each line with a short description of who edited it last, when they did so, and why.

The last VS Code extension we’ll consider for now is the Project Manager extension, which makes it easy to quickly switch between Git repositories and manage multiple research projects. To use it, we need to do a little bit of configuration first, and tell the extension where to find our projects. Add the following to your user settings, changing paths as appropriate to point to where you keep your research repositories:

"projectManager.git.baseFolders": [

Note that on Windows, you need to use \\ instead of \, since \ is an escape character. That is, \\ indicates that the next character is special, such that you need two backslashes to type the Windows path separator.

Once configured, press Alt+Shift+P to bring up a list of projects. If you don’t see anything at first, that’s normal; it can take a few moments for Project Manager to discover all of your repositories.

Aside: Viewing TeX Differences as PDFs (Linux and macOS / OS X only)

One very nice advantage of using Git to manage TeX projects is that we can use Git together with the excellent latexdiff tool to make PDFs annotated with changes between different versions of a project. Sadly, though latexdiff does run on Windows, it’s quite finnicky to use with MiKTeX. (Personally, I tend to find it easier to use the Linux instructions on Windows Subsystem for Linux, then run latexdiff from within Bash on Ubuntu on Windows.)

In any case, we will need two different programs to get up and running with PDF-rendered diffs. Sadly, both of these are somewhat more specialized than the other tools we’ve looked at, violating the goal that everything we install should also be of generic use. For that reason, and because of the Windows compatability issues noted above, we won’t depend on PDF-rendered diffs anywhere else in this post, and mention it here as a very nice aside.

That said, we will need by latexdiff itself, which compares changes between two different TeX source versions, and rcs-latexdiff, which interfaces between latexdiff and Git. To install latexdiff on Ubuntu, we can again rely on apt:

$ sudo apt install latexdiff

For macOS / OS X, the easiest way to install latexdiff is to use the package manager of MacTeX. Either use Tex Live Utiliy, a GUI program distributed with MacTeX or run the following command in a shell

$ sudo tlmgr update --self
$ sudo tlmgr install latexdiff

For rcs-latexdiff, I recommend the fork maintained by Ian Hincks. We can use the Python-specific package manager pip to automatically download Ian’s Git repository for rcs-latexdiff and run its installer:

$ pip install git+

Once you have latexdif and rcs-latexdiff installed, we can make very professional PDF renderings by calling rcs-latexdiff on different Git commits. For instance, if you have a Git tag for version 1 of an arXiv submission, and want to prepare a PDF of differences to send to editors when resubmitting, the following command often works:

$ rcs-latexdiff project_name.tex arXiv_v1 HEAD

Sometimes, you may have to specify options such as --exclude-section-titles due to compatability with the \section command in {revtex4-1} and {quantumarticle}.

arXiv Build Management

Ideally, you’ll upload your reproducible research paper to the arXiv once your project is at a point where you want to share it with the world. Doing so manually is, in a word, painful. In part, this pain originates from that arXiv uses a single automated process to prepare every manuscript submitted, such that arXiv must do something sensible for everyone. This translates in practice to that we need to ensure that our project folder matches the expectations encoded in their TeX processor, AutoTeX. These expectations work well for preparing manuscripts on arXiv, but are not quite what we want when we are writing a paper, so we have to contend with these conventions in uploading.

For instance, arXiv expects a single TeX file at the root directory of the uploaded project, and expects that any ancillary material (source code, small data sets, videos, etc.) are in a subfolder called anc/. Perhaps most difficult to contend with, though, is that arXiv currently only supports subfolders in a project if that project is uploaded as a ZIP file. This implies that if we want to upload even once ancillary file, which we certiantly will want to do for a reproducible paper, then we must upload our project as a ZIP file. Preparing this ZIP file is in principle easy, but if we do so manually, it’s all too easy to make mistakes.

To rectify this and to allow us to use our own project folder layouts, I’ve written a set of commands for PowerShell collectively called PoShTeX to manage building arXiv ZIP archives. From our perspective in writing a reproducible paper, we can rely on PowerShell’s built-in package management tools to automatically find and install PoShTeX if needed, such that our entire arXiv build process reduces to writing a single short manifest file then running it as a PowerShell command.

Let’s look at an example manifest. This particular example comes from an ongoing research project with Sarah Kaiser and Chris Ferrie.

#region Bootstrap PoShTeX
$modules = Get-Module -ListAvailable -Name posh-tex;
if (!$modules) {Install-Module posh-tex -Scope CurrentUser}
if (!($modules | ? {$_.Version -ge "0.1.5"})) {Update-Module posh-tex}
Import-Module posh-tex -Version "0.1.5"

Export-ArXivArchive @{
    ProjectName = "sgqt_mixed";
    TeXMain = "notes/sgqt_mixed.tex";
    RenewCommands = @{
        "figurefolder" = ".";
    AdditionalFiles = @{
        # TeX Stuff #
        "notes/revquantum.sty" = "/";
        "figures/*.pdf" = "figures/";
        # "notes/quantumarticle.cls" = $null

        # Theory and Experiment Support #
        "" = "anc/";
        "src/*.py" = "anc/";
        "src/*.yml" = "anc/";

        # Experimental Data #
        "data/*.hdf5" = "anc/";

        # Other Sources #
        # We include this build script itself to provide an example
        # of using PoShTeX in pactice.
        "Export-ArXiv.ps1" = $null;
    Notebooks = @(

Breaking it down a bit, the section of the manifest between #region and #endregion is responsible for ensuring PoShTeX is available, and installing it if not. This is the only “boilerplate” to the manifest, and should be copied literally into new manifest files, with a possible change to the version number "0.1.5" that is marked as required in our example.

The rest is a call to the PoShTeX command Export-ArXivArchive, which produces the actual ZIP given a description of the project. That description takes the form of a PowerShell hashtable, indicated by @{}. This is very similar to JavaScript or JSON objects, to Python dicts, etc. Key/value pairs in a PowerShell hashtable are separated by ;, such that each line of the argument to Export-ArXivArchive specifies a key in the manifest. These keys are documented more throughly on the PoShTeX documentation site, but let’s run through them a bit now. First is ProjectName, which is used to determine the name of the final ZIP file. Next is TeXMain, which specifies the path to the root of the TeX source that should be compiled to make the final arXiv-ready manuscript.

After that is the optional key RenewCommands, which allows us to specify another hashtable whose keys are LaTeX commands that should be changed when uploading to arXiv. In our case, we use this functionality to change the definition of \figurefolder such that we can reference figures from a TeX file that is in the root of the arXiv-ready archive rather than in tex/, as is in our project layout. This provides us a great deal of freedom in laying out our project folder, as we need not follow the same conventions in as required by arXiv’s AutoTeX processing.

The next key is AdditionalFiles, which specifies other files that should be included in the arXiv submission. This is useful for everything from figures and LaTeX class files through to providing ancillary material. Each key in AdditionalFiles specifies the name of a particular file, or a filename pattern which matches multiple files. The values associated with each such key specify where those files should be located in the final arXiv-ready archive. For instance, we’ve used AdditionalFiles to copy anything matching figures/*.pdf into the final archive. Since arXiv requires that all ancillary files be listed under the anc/ directory, we move things like, the instrument and environment descriptions src/*.yml, and the experimental data in to anc/.

Finally, the Notebooks option specifies any Jupyter Notebooks which should be included with the submission. Though these notebooks could also be included with the AdditionalFiles key, PoShTeX separates them out to allow passing the optional -RunNotebooks switch. If this switch is present before the manifest hashtable, then PoShTeX will rerun all notebooks before producing the ZIP file in order to regenerate figures, etc. for consistency.

Once the manifest file is written, it can be called by running it as a PowerShell command:

PS> ./Export-ArXiv.ps1

This will call LaTeX and friends, then produce the desired archive. Since we specified that the project was named sgqt_mixed with the ProjectName key, PoShTeX will save the archive to In doing so, PoShTeX will attach your bibliography as a *.bbl file rather than as a BibTeX database (*.bib), since arXiv does not support the *.bib*.bbl conversion process. PoShTeX will then check that your manuscript compiles without the biblography database by copying to a temporary folder and running LaTeX there without the aid of BibTeX.

Thus, it’s a good idea to check that the archive contains the files you expect it to by taking a quick look:

PS> ii

Here, ii is an alias for Invoke-Item, which launches its argument in the default program for that file type. In this way, ii is similar to Ubuntu’s xdg-open or macOS / OS X’s open command.

Once you’ve checked through that this is the archive you meant to produce, you can go on and upload it to arXiv to make your amazing and wonderful reproducible project available to the world.

Conclusions and Future Directions

In this post, we detailed a set of software tools for writing and publishing reproducible research papers. Though these tools make it much easier to write papers in a reproducible way, there’s always more that can be done. In that spirit, then, I’ll conclude by pointing to a few things that this stack doesn’t do yet, in the hopes of inspiring further efforts to improve the available tools for reproducible research.

  • Template generation: It’s a bit of a manual pain to set up a new project folder. Tools like Yeoman or Cookiecutter help with this by allowing the development of interactive code generators. A “reproducible arXiv paper” generator could go a long way towards improving practicality.
  • Automatic Inclusion of CTAN Dependencies: Currently, setting up a project directory includes the step of copying TeX dependencies into the project folder. Ideally, this could be done automatically by PoShTeX based on a specification of which dependencies need to be downloaded, a la pip’s use of requirements.txt.
  • arXiv Compatability Checking: Since arXiv stores each submission internally as a .tar.gz archive, which is inefficient for archives that themselves contain archives, arXiv recursively unpacks submissions. This in turn means that files based on the ZIP format, such as NumPy’s *.npz data storage format, are not supported by arXiv and should not be uploaded. Adding functionality to PoShTeX to check for this condition could be useful in preventing common problems.

In the meantime, even in lieu of these features, I hope that this description is a useful resource. ♥


What We Encourage


  • What we encourage and incentivize has consequences, both direct and unseen.
  • Those incentives often preclude progress on things we claim to value in academic research.
  • Better tools and better institutional support can help.
  • Hey, I made some crappy tools to point to a couple examples of incentives against reproducible research.


To a large extent, we do what we are encouraged to do. We are after all, more or less social agents, and respond the people and social systems around us as social agents. Examining what we are encouraged to do, and how we build incentive structures to reward some actions and people over others, is thus a matter of perpetual importance. To take one particularly stark example, I have argued here before that sexual harassment and discrimination are at least as pervasive in physics as in many other fields of research. One way to combat this, then, is to develop incentive structures that do not further reinforce inequality but instead make it more difficult to use scientific institutions to harass and attack. Thus, we can use tools such as codes of conduct, effective reporting at institutions, and backchannel communication to build better systems of incentives that encourage positive behavior. As a side note, this is especially critical as science funding is under imminent and unprecedented attack in the United States— we’ve seen that public outcry can have magnificent effects for protecting institutions and funding, such that we must ensure that we earn the public support we now so existentially require.

Having thus seen that systems of encouragement can be social, it’s also important to realize that encouragement can come as well in the form of technical constraints imposed by the systems that we build and interact with. To take one particular and relatively small example, the one-function-per-file design decision of MATLAB encourages users to write longer functions than they might otherwise in a different language. In turn, this makes it harder for MATLAB users to transfer skills to software development in other languages, discouraging diversity in software tools.

Even though the design decisions made by MATLAB can be often be circumvented by users who disagree, they has a significant impact on what kind of software design decisions get made. For instance, using MATLAB affects how code is shared as well as how APIs are designed. These decisions can in turn also compound with social encouragements; consider the magnification of this example, given the near-exclusive focus on proprietary tools such as MATLAB found in many undergraduate physics programs. That is, our approach to physics education encourages certain software development methodologies over others as an immediate consequence of teaching particular tools rather than methods.

The interaction between social and technical encouragements is widely recognized. In user interface and user experience design, for instance, the concept of an affordance is used to encourage a user to some set of interactions with a technical system. This understanding can be leveraged as a powerful tool for persuading users both to benefitial and malicious ends, as has been prominently argued for at least fifteen years. Critically reflecting on how social and technical encouragements combine to affect our behavior is accordingly a pressing issue across different disciplines.

State of the Art

Examining physics methodologies in detail from the perpsective of incentives, let’s then consider what kinds of encouragement we create with our design choices and with our research methodologies by examining a few relevant examples. Importantly, none of these examples represent criticism of individual users or research groups, but rather an examination of what incentives users and groups are under.

  • Experiment control and analysis: Current expeirmental methodology in physics often concerns itself with ad hoc control systems developed internally by graduate and undergraduate students in a single group. This discourages collaboration between groups using different internal systems, and more critically, can discourage challenging assumptions built in to early stages of experimental control systems. Efforts such as InstrumentKit, Qudi, and QCoDeS, Instrumental, SpanishAquisition, and Labscript have attempted to change these incentives by instead providing high-quality open source libraries that can work across groups, and hence remove disincentives that can impede collaboration. Collecting common functionality in this way also encourages contributing safety and correctness features that would otherwise be undervalued in a single ad hoc application.

  • Statistical analysis: Common software tools such as MATLAB, SciPy, and Mathematica all provide “curve fitting” functionality by default, and report frequentist metrics on the results produced by “curve fitting.” This ready availability encourages the use of “fitting” over other methodologies, even in cases where the theory behind “fitting” does not apply. Taken in conjunction with the reticence of reviewers to point out such methological problems, it isn’t a stretch to say that we directly encourage the publication of incorrect research results by failing to provide easy-to-use and principled statistical analysis tools that match experimental models.

  • Collaboration and version control: At least ideally, collaboration forms a cornerstone of scientific research. Accordingly, the writing and editing of research manuscripts is also ideally a collaborative effort. To support that, we often employ software tools such as Dropbox to synchronize changes between collaborators. The technical restrictions imposed by Dropbox, however, present their own encouragements and discouragements. For example, that Dropbox will overwrite changes made concurrently by two users discourages authors from editing on disjoint sections. Often, this is mitigated by social agreements such as “tokens,” where authors agree only to edit a work if they hold the “token” for the TeX source. That Dropbox does not present version history in a consise and readable form also discourages focusing editing efforts on recent changes. The eagerness of Dropbox to share temporary and editor-specific files also discourages the use of modern text editors such as Atom, Sublime Text and VS Code, each of which uses machine-specific metadata to store information such as undo history. Though these issues can somewhat be mitigated by using formal version control systems such as Git or Mercurial, that many students and postdocs have little to no opportunity to learn such tools presents a strong social discouragement against robust and efficient collaboration.

  • Plotting and accessibility: Approximately 10% of the population is at least mildly color-blind such that the use of colors in scientific visualizations can prevent a significant barrier to accessibility. Default color schemes in widely-used tools such as MATLAB discourage accessible design by forcing users to specify that each plot individually should use an accessible palette. Up until recently, the Matplotlib plotting library for Python has similarly discouraged accessible design, though other positive encouragements such as declarative style sheets have somewhat mitigated this.

Open Science

One critical component common to each of these examples is that they use the concept of open source software development to change the incentives for researchers. If it becomes easier — if we encourage — contributing to existing efforts instead of reinventing well-understood wheels, then not only does it become less expensive to do cutting-edge research, but it also becomes more practical to do reproducible research. As we can see from reproducibility crises in other fields, physics research likely cannot survive its current model of closed-source closed-data closed-analysis papers hidden behind increasingly-expensive paywalls. Perhaps most critically, the lack of reproducibility in scientific papers can also play a role in exacerbating public mistrust of science. I’ll return to the role that the arXiv plays in changing these incentives later, but suffice to say that open access is not sufficient to imply that a paper or other research output can be independently assessed for correctness. The gap between the openness we currently take as being standard in physics and the level of transparency that we need is thus not a trivial one.

In this discussion, of course, we also cannot afford to elide that open science in general and open source in particular also helps to remove elements of gatekeeping that can be used to preclude participation. In current practice, the expense of performing physics research, be it theory or experiment, naturally concentrates decision-making power in a smaller number of individuals. Though the effects of this concentration can be mitigated to some degree, it is also important to reduce the areas in which cognative biases can hide from critical examination. Reducing barriers to entry can provide a way to do so.

I posit that these self-same barriers to entry are, perhaps ironically, one of the largest impediments to the adoption of open science methodologies in physics. Take, for instance, the advice offered to me in response to a question about how to adopt better open science methods when posting to the arXiv:

For the sake of persistent archiving you should put all ancillary material at a sustainably archived website specifically designed to preserve scholarly communications material e.g. Zenodo or Dryad or Figshare

I agree with this advice entirely, and have in fact advocated this exact approach with both open-source efforts and reproducible papers. That said, let’s consider the incentive structure imposed by this approach. A student or postdoc taking responsibility for posting a reproducible work to the arXiv must then not only know how to do so directly, including how to write literate source code, but also about the existence of services such as Figshare. Moreover, a researcher must then manage their work’s presence on arXiv as well as additional services, increasing the complexity of maintaining an accessible research output.

Happily, however, arXiv itself supports including ancillary material such as source code and data along with a paper. Even here, though, the workflow for providing such ancillary material is more complicated than the comparable closed-science workflow. In particular, arXiv identifies ancillary material as being those files included in the anc/ subfolder of a submission package, but subfolders are only supported by uploading a ZIP archive of the entire submission. To make such a ZIP file with most existing tools, the original file must itself be in a folder called anc/, imposing a nontrivial technical restriction on project folder structures.

To some degree, encouragements of this kind that run counter to open science goals are inevitable. Indeed, there is a bit of a contradiction in that the very goal of reproducible research is to make it more difficult to publish in a way that is disporportionate between papers that are likely correct and those that cannot be as easily assessed for correctness. The natural tendency, then, is for reproducibility advocates to yell into the void or into the echo chamber about the existential importance of open science, and for the rest of the community to go forth with their business in the best way that they can given very limited resources.

In the same way as in dealing with discrimination by providing better tools that change incentive structures, we can break this circularity by providing better tools for writing reproducible papers. For instance, one very impressive tool to do precisely this is ReproZip, which works by leveraging modern software and hardware engineering technologies such as virtual machines. Indeed, ReproZip works great at making it easier to provide compact descriptions of reproducible research projects. This approach works especially well in combination with tools such as Jupyter Notebook, which allow for producing standardized and well-understood literate source code for a variety of different languages. Though Jupyter is perhaps most famous for its Python support, having grown out of the IPython project, it also supports Julia, R, F#, and even MATLAB when augmented with simple-to-install extensions.

Recently, I have started to experiment with other software tools for changing incentive structures, in the hopes of building on the backs of these efforts. My goals are of course more modest, in keeping with the resources available, but at the least I would like to propose ways of making tools such as Jupyter Notebook more accessible to arXiv users. In the rest of this post, I will detail two such (experimental!) efforts and how they are each motivated by reducing barriers to entry for reproducible physics research.


The first such effort is the {revquantum} style file for LaTeX, intended to make it easier to write reproducible and accessible papers in the {revtex4-1} and {quantumarticle} document classes. The {revquantum} package supports accessiblility in a number of different ways:

  • Semantic notation: {revquantum} provides commands such as \newoperator, which allow users to write in a more semantic fashion than worrying about implementation details. In that example, \newoperator{Tr} defines \Tr as \newcommand{\Tr}{\operatorname{Tr}}, resulting in more readable code than what might be produced directly, such as \mathrm{Tr}(\rho).

  • Accessible color palettes: {revquantum} also provides definitions for the Color Universal Design palette, such that users can quickly set colors for TikZ diagrams and other features in an accessible manner.

  • {listings} integration: The {listings} package provides invaluable tools for including source code along with research manuscripts. There are some subtle issues in integrating {listings} with {revtex4-1} and other such classes, however, such that {revquantum} provides a reasonable set of defaults that for use with {revtex4-1}.

  • Package compatibility warnings: There are a few commonly-used packages which are known to be incompatible with {revtex4-1}. These incompatibilities can cause difficult-to-diagnose errors, however, instead of immediately raising errors that identify the problem. To address this and prevent users from being discouraged against modular TeX design, {revquantum} will raise warnings or errors when incompatible packages are loaded so that users are immediately informed as to the issue.

  • Bibliography improvements: The default BibTeX style file provided with {revtex4-1} suppresses the titles of cited papers in order to match the formatting that might appear in PRL and similar journals. For deployment to the arXiv, though, where space constraints are much less of an issue, this design decision removes context from the reader and encourages more of a focus on where something is published than the content that is being cited. To address this and encourage a stronger focus on research content, {revquantum} sets a default configuration that provides more context in generated bibliographies. Moreover, {revquantum} changes citation URLs to default to HTTPS, encouraging better security practices.

  • Annotations for collaborative writing: Finally, {revquantum} provides new lightweight commands such as \todo and \citeneed for annotating remaining tasks in a collaborative document. These commands leave warnings in the log file, making them easy to identify in modern text editors that annotate warnings in the TeX source. Moreover, the [final] option ot {revquantum} promotes these warnings to errors, providing a safety check against leaving outstanding tasks.


The other effort I would like to detail a bit here is PoShTeX, a set of PowerShell tools for working with TeX-based research projects. In particular, PoShTeX provides the Export-ArXivArchive command, which makes a ZIP archive suitable for uploading to arXiv. In doing so, PoShTeX will rebuild the current TeX file, and will optionally re-run any nominated Jupyter Notebooks to ensure consistency with the figures and descriptions provided in the project’s main text. For example, a project might include a short Export-ArXiv.ps1 script to produce an arXiv-compatible archive:

#region Bootstrap PoShTeX
$modules = Get-Module -ListAvailable -Name posh-tex;
if (!$modules) {Install-Module posh-tex -Scope CurrentUser}
if (!($modules | ? {$_.Version -ge "0.1.4"})) {Update-Module posh-tex}
Import-Module posh-tex -Version "0.1.4"

Export-ArXivArchive -RunNotebooks @{
    ProjectName = "my-little-pauli";
    TeXMain = "my-little-pauli.tex";
    AdditionalFiles = @{
        "fig/*.pdf" = "fig/";
        "data/tomography-data.hdf5" = $null;
        "revquantum.sty" = $null;
    Notebooks = @(

This script will then install PoShTeX if it is not already available, and will then rerun the listed notebooks, compile the TeX output, and produce the final build file for uploading to arXiv. In this way, PoShTeX makes it very straightforward to include reproducible material.

The second major functionality provided by PoShTeX is that it makes it much easier to write reusable LaTeX resources such as style files by making it easy to write installers and produce CTAN-compatible archives. This is, for example, how {revquantum} is packaged and uploaded to CTAN. Thus, PoShTeX hopefully also encourages not only sharing the research code itself but also the LaTeX code used to produce legible and readable research documents.

By the way, why PowerShell? It may seem like an odd choice for encouraging open scientific research, but consider that a very significant majority of Windows users already have PowerShell, given its status as a built-in tool since Windows 7. Though bash enjoys even wider deployment on macOS / OS X and on Linux, the [recent availability of PowerShell as an open source tool] on non-Windows platforms changes this calculus dramatically. In particular, given the ease of installing PowerShell modules with the Install-Module command, PowerShell may offer a way forward to distribute cross-platform shell libraries in an easy-to-use way.

Concluding Remarks

In this post, I’ve described a few ways that what we encourage can have dramatic negative or hopefully positive effects on research culture and practice. These examples all speak to how thinking not only critically but also systematically can provide a path forward for doing the best science that we can.

Learning Multiexponential Models with QInfer


Randomized benchmarking (RB) is a useful tool for characterizing quantum information processing systems, and has seen a range of different extensions over the past few years. In addition to being useful, though, RB presents an interesting statistical problem: how should one infer the parameters of interest from RB data? Traditionally, this is done by thinking of data analysis in terms of "fitting" a curve, but this puts severe constraints on the kinds of experiments that can be performed.

In particular, most commonly-available fitting routines work based on least-squares estimation, and hence assume that measurements follow a "true" signal plus Gaussian noise. Justifying this assumption requires large amounts of experimental data, motivating a more rigorous statistical analysis. Using Bayesian inference allows for overcoming this challenge, while also providing a characterization of one's uncertianty and allowing for adaptivity. A ready-to-use implementation of Bayesian inference for RB is provided along with our QInfer software package, providing a useful alternative to traditional fitting.

This is even more critical for the recent extensions of RB, in that the models describing these extensions often require "fitting" to multiple exponential decays. Taking the Bayesian perspective allows us to more easily break the approximate degeneracy in multiexponential models, either by using prior information or experimental variety. In this post, we explain how to learn multiexponential models, using dihedral RB as an example. The single-qubit version of this protocol was recently proposed by Dugas et al. 10/brjj to allow for learning the fidelity of gates outside the Clifford group, and extends the RB model to include two exponential decays. Though for the sake of simplicity, we don't consider it here, the same ideas apply to the multi-qubit version of Cross et al. 10/brjk. In this post, I'll step through in detail (possibly annoyingly so) how to implement and use a QInfer model for dihedral RB, stopping at times along the way to explain some nice techniques for scientific computing in Python. I'll assume basic familiarity with Python and with randomized benchmarking.

Python Preamble

We start by turning off warnings in this post. While I don't recommend doing so in practice, there's a few superfulous warnings (mainly raised in plotting) that will be distracting from the main point of the post.

In [1]:
import warnings

Next, we use the division and print_function features to make this post more compatible between Python 2 and 3. We'll also enable inline plotting so that our figures appear nicely in the post itself, and will set the DPI for inline figures high enough to be readable on the web or in print.

In [2]:
from __future__ import division, print_function
%matplotlib inline
%config InlineBackend.print_figure_kwargs = {'dpi' : 150}

This done, we'll import the libraries we need. We'll mainly rely on QInfer and QuTiP, but we'll also use Matplotlib, Mpltools and Seaborn for plotting support.

In [3]:
import numpy as np
import pandas as pd
from itertools import product
In [4]:
import matplotlib.pyplot as plt
import seaborn as sbs
import mpltools.special
In [5]:
import qinfer as qi
import qutip as qt

Defining a Model and a Prior

With all of the boring distractions out of the way, we can now focus on defining a model for dihedral RB. As noted by Dugas et al. 10/brjj, similar to the standard randomized benchmarking model, the probability of getting a "+1" measurement in a dihedral RB experiment is given by \begin{align} \Pr(+1 | m, b_1, b_2) & = (-1)^{b_1} A p_0^m + \left((-1)^{b_1 + b_2} B_1 + (-1)^{b_2} B_2\right) p_1^m + C, \end{align} where $m$ is the length of the sequence being measured, $b_1, b_2 \in \{0, 1\}$ are bits describing the final measurement, and where $A, B_1, B_2, C$ are the state preparation and measurement (SPAM) constants analgous to $A$ and $B$ in traditional RB. Since this specifies the probability of each possible observation, we call the above function the likelihood functuion for dihedral RB.

Having specified the likelihood, we can think of it as specifying a six-dimensional model $\vec{x} = (p_0, p_1, A, B_1, B_2, C)$ with three experiment parameters $\vec{e} = (m, b_1, b_2)$. In particular, measurments in this model consist of:

  • Picking a dihedral gate sequence of length $m$,
  • Applying the sequence to a fixed initial state $\rho$,
  • Applying the inverse of the sequence, followed by $\sigma_x^{b_1} \sigma_z^{b_2}$,
  • Performing a fixed measurement effect $E$ on the final state.

We can encode this in QInfer as a two-outcome model, with the outcomes 0 and 1 corresponding to observing $\mathbb{1} - E$ and $E$ (respectively) in the final step of each measurement. To do so, we need to provide some metadata about the model (how many outcomes, how many model parameters, etc.), as well as two methods which specify what model parameters are valid, and what the likelihood function is. We provide each of these by making a new class with methods and properties corresponding to each part of our model.

To explain each piece of our new dihedral RB model at a time, we'll use a stupid trick to break up the class definition. Ordinarily, we would have to define the class all at once, which makes it harder to walk through the definition piece-by-piece. By making a class inherit from itself, however, we can circumvent this and explain each part of the class. To be perfectly clear, this trick should never be used in practice, as it results in some very subtle bugs and much less organized code. Rather, one should always define the entire class, and use a hierarchy of classes to break up definitions if needed. That said, our trick helps us at the moment, so we'll do something massively inadvisable and use it anyway. If you use the trick presented here in your own project, please don't say I didn't warn you.

With that caveat aside, we'll start by defining the metadata needed to tell QInfer what model parameters, experiment parameters and outcomes our new model will support.

In [6]:
class DihedralRBModel(qi.FiniteOutcomeModel):    
    def n_modelparams(self):
        return 6
    def modelparam_names(self):
        return ['p_0', 'p_1', 'A', 'B_1', 'B_2', 'C']        
    def expparams_dtype(self):
        return [('m', 'uint'), ('b_1', bool), ('b_2', bool)]
    def is_n_outcomes_constant(self):
        return True

    def n_outcomes(self, expparams):
        return 2

By inheriting from FiniteOutcomeModel, we tell QInfer that our model has a finite number of outcomes that are each labeled by nonnegative integers. In particular, by specifying is_n_outcomes_constant and n_outcomes, we tell QInfer that there are always two outcomes. This is not really a limitation, though, since we can use qi.BinomialModel to easily extend our two-outcome definition to include repeated measurements of the same $m$, $b_1$ and $b_2$.

Next, we'll use our trick to add a method which specifies which values of our model parameters $\vec{x}$ are allowable for dihedral RB. If you want to use this class in practice, just copy and paste the new method below into the definition above instead of using the trick, making one big class that has all of the metadata along with the next two methods.

In [7]:
class DihedralRBModel(DihedralRBModel):    
    def are_models_valid(self, modelparams):
        p_0, p_1, A, B_1, B_2, C = modelparams.T
        return np.all(sum((
                s_1 * A + s_1 * s_2 * B_1 + s_2 * B_2 + C <= 1,
                s_1 * A + s_1 * s_2 * B_1 + s_2 * B_2 + C >= 0,
                          s_1 * s_2 * B_1 + s_2 * B_2 + C <= 1,
                          s_1 * s_2 * B_1 + s_2 * B_2 + C >= 0,
                s_1 * A                               + C <= 1,
                s_1 * A                               + C >= 0,
            for s_1, s_2 in product([-1, +1], repeat=2)
        ), []) + [
                p_0 <= 1,
                p_0 >= 0,
                p_1 <= 1,
                p_1 >= 0

There's a fair bit going on here, so let's break it down a bit. First, note that we've used destructuring assignment to unpack the model parameters into convienent names. Destructuring assignment allows us to assign more than one variable at once by using an iterable collection of values. For instance:

In [8]:
zero, one, two = range(3)

In the context of NumPy arrays, we need to know that arrays iterate over their leftmost index. For a two dimensional array (that is, what's colloquially known as a matrix), destructing thus assigns each row to a different variable:

In [9]:
a, b = np.array([
    [10, 11],
    [12, 13]
[10 11]
[12 13]

In the context of are_models_valid, QInfer will call this method of our model with the argument modelparams being an array of shape (n_models, 6), where n_models is the number of models that QInfer is considering, and where 6 is derived from our earlier specification of n_modelparams. That is, QInfer represents each model vector $\vec{x}$ as a row of a matrix $X$. We thus need to destructure the transpose $X^{\mathrm{T}}$ instead to get each model parameter into its own variable. This is especially convienent in the context of broadcasting, as it means that we can combine these destructured model parameters using various arithmetic operators and NumPy functions while preserving the shape.

Next, we have used np.all([...], axis=0) to test many different conditions at once. This works because np.all reduces over the specified axis (here, 0 specifies the leftmost/outermost index) to return an array with one less axis. For instance:

In [10]:
    [True, False, True],
    [False, True, True],
], axis=0)
array([False, False,  True], dtype=bool)

Thus, np.all tells us that only the third "column" contains True for each row. We can then build a list of separate conditions and check each with np.all to efficiently test each model vector. To build up the list of conditions, we note that we need our likelihood to return a probability for each $b_1,b_2 \in \{0, 1\}$. We can represent this concept using a generator comprehension:

In [11]:
    [s_1, s_2] for s_1, s_2 in product([-1, +1], repeat=2)
<generator object <genexpr> at 0x000000000CBEC828>

For our example, this is basically a fancy way of saying something that iterates over each assignment of $b_1$ and $b_2$:

In [12]:
[[-1, -1], [-1, 1], [1, -1], [1, 1]]

Next, we've used sum (notably not np.sum, which acts only on arrays!) to concatenate the lists of conditions for each $b_1$ and $b_2$. This works because the builtin sum effectively just calls + a lot, which for lists gives the concatenation:

In [13]:
['a'] + ['b']
['a', 'b']

The only thing to watch out for with using sum to concatenate lists is that we need to tell it to start with the empty list []:

In [14]:
sum(([idx] * idx for idx in range(1, 4)), [])
[1, 2, 2, 3, 3, 3]

With all this out of the way, we can now much more easily see why are_models_valid works: it builds a list of conditions for each possible pair of bits, then concatenates them together, adds in a few other conditions that don't depend on the bits $b_1$ and $b_2$, then uses np.all to ensure that every condition holds. In this way, are_models_valid quickly tests that each model vector is guaranteed to return a likelihood between 0 and 1, as is required by the fact that likelihood functions describe probabilities.

With this done, we can now specify the likelihood function itself.

In [15]:
class DihedralRBModel(DihedralRBModel):    
    def likelihood(self, outcomes, modelparams, expparams):
        # The logic to count calls is defined in the superclass qi.Model,
        # which is what FiniteOutcomeModel and DihedralRBModel each
        # derive from in turn. Because of our stupid self-inheritence
        # trick, we don't use the super function built into Python
        # the "right" way; see
        # for how to do things correctly.
        super(type(self), self).likelihood(
            outcomes, modelparams, expparams
        # This indexing trick means that p_0, p_1, ... each have
        # shape (n_models, 1), making it easier to broadcast with the
        # experiment parameters below.
        p_0, p_1, A, B_1, B_2, C = (modelparams.T)[:, :, np.newaxis]
        m = expparams['m']
        s_1 = (-1) ** expparams['b_1']
        s_2 = (-1) ** expparams['b_2']
        # Allocating first serves to make sure that a shape mismatch later
        # will cause an error.
        pr1 = np.zeros((modelparams.shape[0], expparams.shape[0]))
        pr1[:, :] = (
            s_1 * A * p_0 ** m + ((s_1 * s_2) * B_1 + s_2 * B_2) * p_1 ** m + C
        # Now we concatenate over outcomes.
        return qi.FiniteOutcomeModel.pr0_to_likelihood_array(outcomes, 1 - pr1)

Let's break this one down a bit as well. First, we use super to ensure that QInfer can properly keep track of how many times the likelihood function is called--- this gives us a useful diagnostic when we're trying to figure out what sort of simulation costs we're incurring.

Next, we use the same unpacking technique, along with a slice using np.newaxis. When used in a slice, newaxis does just what it says on the tin: it adds a new axis to an array. For instance:

In [16]:
arr = np.empty((2, 3))
print(arr[:, :, np.newaxis].shape)
(2L, 3L)
(2L, 3L, 1L)

This is very convienent for broadcasting. In particular, QInfer expects that our likelihood function (at least for a two-outcome model, more on this later) to return an array $L_{jk}$, with $j$ indexing models and $k$ indexing experiments. With broadcasting, we can do this if each unpacked model parameter has shape (n_models, 1), so that the second index (of length 1) repeats over experiments. To accomplish this in a straightforward manner, we use transpose and newaxis to get an array of shape (6, n_models, 1), then unpack the leftmost index as before.

For experiment parameters, we can just pull each out by name to get an array of shape (n_experiments,). Importantly, this shape broadcasts with (n_models, 1) to give a shape (n_models, n_experiments), exactly as required. Let's see how this works in a quick example by defining two arrays of shape (3,) and (4,).

In [17]:
arr1 = np.array([1, 2, 3])

arr2 = np.array([10, 20, 30, 40])
In [18]:
print(arr1[:, np.newaxis].shape)
(3L, 1L)

Suppose we want to make a new array arr3 such that arr3[i, j] == arr1[i] + arr2[j] (that is, an outer product over + instead of *). We do this by adding a new axis to arr1 of length 1; this will then broadcast over the index of arr2 when added.

In [19]:
arr3 = arr1[:, np.newaxis] + arr2
(3L, 4L)
[[11 21 31 41]
 [12 22 32 42]
 [13 23 33 43]]

As we can see, elements of arr1 are repeated 3 times in order to make a (4, 3) shape array. Broadcasting similarly repeats arr2, even though we did not explicitly add a new axis to its left, as shapes are aligned at the right when finding which indices need repeated to match shapes.

Returning to likelihood, we ensure that the shape of our likelihood array is correct by pre-allocating an array pr1 to hold the probabilty of getting a 1 outcome for each model and experiment. We then assign the actual likelihood values as computed using our unpacked and reshaped model and epxeriment parameters into a slice of pr1, such that the assignment will cause an error if we make an indexing mistake, as opposed to proceeding silently with our mistake intact.

In doing so, note that our index gymnastics paid off; the final expression for pr1 reads pretty much exactly like the mathematical expression we're trying to implement. Broadcasting works, is fun, and helps you write efficient code.

Anyway, with pr1, we have one last step to do before returning. QInfer supports more than just two-outcome models, such that in general we need to care about which outcome the likelihood function is being evaluated for. This is handled by a third index, such that the likelihood function in general should return $L_{ijk}$, with $i$ indexing which outcome. Since this is redundant for two-outcome models, QInfer provides the pr0_to_likelihood_array function to handle two-outcome models. This takes the probability of zero, not one, so we complete our likelihood function by taking 1 - pr0.

Now that we have completed our model, let's go on and use it. As mentioned above, we will want to wrap our model in BinomialModel to handle the case of more than one measurement per sequence length, so let's do that now.

In [20]:
model = qi.BinomialModel(DihedralRBModel())

We can then use our new model's validity checking to define a prior that is supported only on model vectors that make sense (that is, that yield likelihoods in the range 0 to 1). This is done by the PostselectedDistribution provided with QInfer. Since we have more conditions than is usual, we allow for more iterations with the maxiters keyword argument.

In [21]:
prior = qi.PostselectedDistribution(qi.UniformDistribution(
    # We'll pick a prior on p₀ and p₁ consistent with
    # F approximately ≥ 0.8.
    [[0.8, 1]] * 2 +
    # We'll be as agnostic as possible with respect to
    # A, B₀, B₁ and C; most of the parameters in this
    # range will be invalid, but the posts
    [[-0.5, 0.5]] * 3 + [[-1, 1]]
), model, maxiters=10000)

Next, let's get an idea what this prior looks like. We can use Seaborn's PairGrid to plot estimated distributions from samples of the prior.

In [22]:
prior_samples = pd.DataFrame(prior.sample(n=500), columns=model.modelparam_names)
# Example from documentation at:
grid = sbs.PairGrid(prior_samples)
grid.map_upper(plt.scatter, s=1)
grid.map_lower(sbs.kdeplot, shade=False)
grid.map_diag(sbs.kdeplot, lw=3, legend=False)
<seaborn.axisgrid.PairGrid at 0xcbf3fd0>