Learning Multiexponential Models with QInfer

Introduction

Randomized benchmarking (RB) is a useful tool for characterizing quantum information processing systems, and has seen a range of different extensions over the past few years. In addition to being useful, though, RB presents an interesting statistical problem: how should one infer the parameters of interest from RB data? Traditionally, this is done by thinking of data analysis in terms of "fitting" a curve, but this puts severe constraints on the kinds of experiments that can be performed.

In particular, most commonly-available fitting routines work based on least-squares estimation, and hence assume that measurements follow a "true" signal plus Gaussian noise. Justifying this assumption requires large amounts of experimental data, motivating a more rigorous statistical analysis. Using Bayesian inference allows us to overcome this challenge, while also providing a characterization of one's uncertainty and allowing for adaptivity. A ready-to-use implementation of Bayesian inference for RB is provided along with our QInfer software package, providing a useful alternative to traditional fitting.

This is even more critical for the recent extensions of RB, in that the models describing these extensions often require "fitting" to multiple exponential decays. Taking the Bayesian perspective allows us to more easily break the approximate degeneracy in multiexponential models, either by using prior information or experimental variety. In this post, we explain how to learn multiexponential models, using dihedral RB as an example. The single-qubit version of this protocol was recently proposed by Dugas et al. 10/brjj to allow for learning the fidelity of gates outside the Clifford group, and extends the RB model to include two exponential decays. Though we don't consider it here for the sake of simplicity, the same ideas apply to the multi-qubit version of Cross et al. 10/brjk. Below, I'll step through in detail (possibly annoyingly so) how to implement and use a QInfer model for dihedral RB, stopping at times along the way to explain some nice techniques for scientific computing in Python. I'll assume basic familiarity with Python and with randomized benchmarking.

Python Preamble

We start by turning off warnings in this post. While I don't recommend doing so in practice, there are a few superfluous warnings (mainly raised in plotting) that would distract from the main point of the post.

In [1]:
import warnings
warnings.simplefilter('ignore')

Next, we use the division and print_function features to make this post more compatible between Python 2 and 3. We'll also enable inline plotting so that our figures appear nicely in the post itself, and will set the DPI for inline figures high enough to be readable on the web or in print.

In [2]:
from __future__ import division, print_function
%matplotlib inline
%config InlineBackend.print_figure_kwargs = {'dpi' : 150}

This done, we'll import the libraries we need. We'll mainly rely on QInfer and QuTiP, but we'll also use Matplotlib, Mpltools and Seaborn for plotting support.

In [3]:
import numpy as np
import pandas as pd
from itertools import product
In [4]:
import matplotlib.pyplot as plt
import seaborn as sbs
import mpltools.special
try:
    plt.style.use('ggplot-rq')
except:
    pass
In [5]:
import qinfer as qi
import qutip as qt

Defining a Model and a Prior

With all of the boring distractions out of the way, we can now focus on defining a model for dihedral RB. As noted by Dugas et al. 10/brjj, similar to the standard randomized benchmarking model, the probability of getting a "+1" measurement in a dihedral RB experiment is given by \begin{align} \Pr(+1 | m, b_1, b_2) & = (-1)^{b_1} A p_0^m + \left((-1)^{b_1 + b_2} B_1 + (-1)^{b_2} B_2\right) p_1^m + C, \end{align} where $m$ is the length of the sequence being measured, $b_1, b_2 \in \{0, 1\}$ are bits describing the final measurement, and where $A, B_1, B_2, C$ are the state preparation and measurement (SPAM) constants analogous to $A$ and $B$ in traditional RB. Since this specifies the probability of each possible observation, we call the above function the likelihood function for dihedral RB.
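
Before wrapping this likelihood in a QInfer model below, it may help to see it written out as a plain Python function. The following is a minimal sketch for scalar arguments (it is not part of the model class we define later); the argument names follow the notation above.

def pr_plus1(m, b_1, b_2, p_0, p_1, A, B_1, B_2, C):
    # The signs are determined by the measurement bits b_1 and b_2.
    s_1 = (-1) ** b_1
    s_2 = (-1) ** b_2
    return s_1 * A * p_0 ** m + (s_1 * s_2 * B_1 + s_2 * B_2) * p_1 ** m + C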

Having specified the likelihood, we can think of it as specifying a six-dimensional model $\vec{x} = (p_0, p_1, A, B_1, B_2, C)$ with three experiment parameters $\vec{e} = (m, b_1, b_2)$. In particular, measurements in this model consist of:

  • Picking a dihedral gate sequence of length $m$,
  • Applying the sequence to a fixed initial state $\rho$,
  • Applying the inverse of the sequence, followed by $\sigma_x^{b_1} \sigma_z^{b_2}$,
  • Performing a fixed measurement effect $E$ on the final state.

We can encode this in QInfer as a two-outcome model, with the outcomes 0 and 1 corresponding to observing $\mathbb{1} - E$ and $E$ (respectively) in the final step of each measurement. To do so, we need to provide some metadata about the model (how many outcomes, how many model parameters, etc.), as well as two methods which specify what model parameters are valid, and what the likelihood function is. We provide each of these by making a new class with methods and properties corresponding to each part of our model.

To explain each piece of our new dihedral RB model in turn, we'll use a stupid trick to break up the class definition. Ordinarily, we would have to define the class all at once, which makes it harder to walk through the definition piece-by-piece. By making a class inherit from itself, however, we can circumvent this and explain each part of the class. To be perfectly clear, this trick should never be used in practice, as it results in some very subtle bugs and much less organized code. Rather, one should always define the entire class, and use a hierarchy of classes to break up definitions if needed. That said, our trick helps us at the moment, so we'll do something massively inadvisable and use it anyway. If you use the trick presented here in your own project, please don't say I didn't warn you.

With that caveat aside, we'll start by defining the metadata needed to tell QInfer what model parameters, experiment parameters and outcomes our new model will support.

In [6]:
class DihedralRBModel(qi.FiniteOutcomeModel):    
    @property
    def n_modelparams(self):
        return 6
    
    @property
    def modelparam_names(self):
        return ['p_0', 'p_1', 'A', 'B_1', 'B_2', 'C']        
    
    @property
    def expparams_dtype(self):
        return [('m', 'uint'), ('b_1', bool), ('b_2', bool)]
    
    @property
    def is_n_outcomes_constant(self):
        return True

    def n_outcomes(self, expparams):
        return 2

By inheriting from FiniteOutcomeModel, we tell QInfer that our model has a finite number of outcomes that are each labeled by nonnegative integers. In particular, by specifying is_n_outcomes_constant and n_outcomes, we tell QInfer that there are always two outcomes. This is not really a limitation, though, since we can use qi.BinomialModel to easily extend our two-outcome definition to include repeated measurements of the same $m$, $b_1$ and $b_2$.
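
As a rough sketch of what that wrapping does to the experiment parameters (the exact dtype is constructed by BinomialModel itself; the record below is made up for illustration), the wrapped model's experiments carry an extra n_meas field counting how many times each sequence choice is measured:

# Hypothetical experiment record for the wrapped model: the first three
# fields match expparams_dtype above, and n_meas is added by BinomialModel.
wrapped_dtype = [('m', 'uint'), ('b_1', bool), ('b_2', bool), ('n_meas', 'uint')]
expparams = np.array([(10, True, False, 25)], dtype=wrapped_dtype)
print(expparams['m'], expparams['n_meas'])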

Next, we'll use our trick to add a method which specifies which values of our model parameters $\vec{x}$ are allowable for dihedral RB. If you want to use this class in practice, just copy and paste the new method below into the definition above instead of using the trick, making one big class that has all of the metadata along with the next two methods.

In [7]:
class DihedralRBModel(DihedralRBModel):    
    def are_models_valid(self, modelparams):
        p_0, p_1, A, B_1, B_2, C = modelparams.T
        return np.all(sum((
            [
                s_1 * A + s_1 * s_2 * B_1 + s_2 * B_2 + C <= 1,
                s_1 * A + s_1 * s_2 * B_1 + s_2 * B_2 + C >= 0,
                    
                          s_1 * s_2 * B_1 + s_2 * B_2 + C <= 1,
                          s_1 * s_2 * B_1 + s_2 * B_2 + C >= 0,
                
                s_1 * A                               + C <= 1,
                s_1 * A                               + C >= 0,
            ]
            for s_1, s_2 in product([-1, +1], repeat=2)
        ), []) + [
                p_0 <= 1,
                p_0 >= 0,
                        
                p_1 <= 1,
                p_1 >= 0
            ],
        axis=0)

There's a fair bit going on here, so let's break it down a bit. First, note that we've used destructuring assignment to unpack the model parameters into convenient names. Destructuring assignment allows us to assign more than one variable at once by using an iterable collection of values. For instance:

In [8]:
zero, one, two = range(3)
print(zero)
print(one)
print(two)
0
1
2

In the context of NumPy arrays, we need to know that arrays iterate over their leftmost index. For a two-dimensional array (that is, what's colloquially known as a matrix), destructuring thus assigns each row to a different variable:

In [9]:
a, b = np.array([
    [10, 11],
    [12, 13]
])
print(a)
print(b)
[10 11]
[12 13]

In the context of are_models_valid, QInfer will call this method of our model with the argument modelparams being an array of shape (n_models, 6), where n_models is the number of models that QInfer is considering, and where 6 is derived from our earlier specification of n_modelparams. That is, QInfer represents each model vector $\vec{x}$ as a row of a matrix $X$. We thus need to destructure the transpose $X^{\mathrm{T}}$ instead to get each model parameter into its own variable. This is especially convenient in the context of broadcasting, as it means that we can combine these destructured model parameters using various arithmetic operators and NumPy functions while preserving the shape.
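
As a quick illustration with made-up numbers, destructuring the transpose of a small model-parameter array gives one array per model parameter, each with one entry per model:

# Two hypothetical model vectors (p_0, p_1, A, B_1, B_2, C), one per row.
X = np.array([
    [0.95, 0.97, 0.30, 0.05, 0.05, 0.50],
    [0.90, 0.92, 0.25, 0.10, 0.10, 0.45],
])
p_0, p_1, A, B_1, B_2, C = X.T
print(p_0)  # prints [ 0.95  0.9 ], one value of p_0 per model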

Next, we have used np.all([...], axis=0) to test many different conditions at once. This works because np.all reduces over the specified axis (here, 0 specifies the leftmost/outermost index) to return an array with one less axis. For instance:

In [10]:
np.all([
    [True, False, True],
    [False, True, True],
], axis=0)
Out[10]:
array([False, False,  True], dtype=bool)

Thus, np.all tells us that only the third "column" contains True for each row. We can then build a list of separate conditions and check each with np.all to efficiently test each model vector. To build up the list of conditions, we note that we need our likelihood to return a probability for each $b_1,b_2 \in \{0, 1\}$. We can represent this concept using a generator comprehension:

In [11]:
(
    [s_1, s_2] for s_1, s_2 in product([-1, +1], repeat=2)
)
Out[11]:
<generator object <genexpr> at 0x000000000CBEC828>

For our example, this is basically a fancy way of saying something that iterates over each assignment of $b_1$ and $b_2$:

In [12]:
list(_)
Out[12]:
[[-1, -1], [-1, 1], [1, -1], [1, 1]]

Next, we've used sum (notably not np.sum, which would try to add the conditions numerically rather than concatenate the lists!) to concatenate the lists of conditions for each $b_1$ and $b_2$. This works because the builtin sum effectively just calls + a lot, which for lists gives the concatenation:

In [13]:
['a'] + ['b']
Out[13]:
['a', 'b']

The only thing to watch out for with using sum to concatenate lists is that we need to tell it to start with the empty list []:

In [14]:
sum(([idx] * idx for idx in range(1, 4)), [])
Out[14]:
[1, 2, 2, 3, 3, 3]

With all this out of the way, we can now much more easily see why are_models_valid works: it builds a list of conditions for each possible pair of bits, then concatenates them together, adds in a few other conditions that don't depend on the bits $b_1$ and $b_2$, then uses np.all to ensure that every condition holds. In this way, are_models_valid quickly tests that each model vector is guaranteed to return a likelihood between 0 and 1, as is required by the fact that likelihood functions describe probabilities.
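
As a hand-worked sanity check (outside the class, using made-up SPAM parameters), we can evaluate the four sign combinations for a single parameter vector to see why a too-large amplitude gets rejected:

# Hypothetical SPAM parameters; p_0 and p_1 don't enter this particular condition.
A, B_1, B_2, C = 0.6, 0.3, 0.3, 0.5
extremes = [
    s_1 * A + s_1 * s_2 * B_1 + s_2 * B_2 + C
    for s_1, s_2 in product([-1, +1], repeat=2)
]
print(max(extremes))  # 1.7 > 1, so are_models_valid would reject this vector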

With this done, we can now specify the likelihood function itself.

In [15]:
class DihedralRBModel(DihedralRBModel):    
    def likelihood(self, outcomes, modelparams, expparams):
        # The logic to count calls is defined in the superclass qi.Model,
        # which is what FiniteOutcomeModel and DihedralRBModel each
        # derive from in turn. Because of our stupid self-inheritence
        # trick, we don't use the super function built into Python
        # the "right" way; see
        #    https://docs.python.org/2/library/functions.html#super
        # for how to do things correctly.
        super(type(self), self).likelihood(
            outcomes, modelparams, expparams
        )
        
        # This indexing trick means that p_0, p_1, ... each have
        # shape (n_models, 1), making it easier to broadcast with the
        # experiment parameters below.
        p_0, p_1, A, B_1, B_2, C = (modelparams.T)[:, :, np.newaxis]
        
        m = expparams['m']
        s_1 = (-1) ** expparams['b_1']
        s_2 = (-1) ** expparams['b_2']
        
        # Allocating first serves to make sure that a shape mismatch later
        # will cause an error.
        pr1 = np.zeros((modelparams.shape[0], expparams.shape[0]))
        pr1[:, :] = (
            s_1 * A * p_0 ** m + ((s_1 * s_2) * B_1 + s_2 * B_2) * p_1 ** m + C
        )
        
        # Now we concatenate over outcomes.
        return qi.FiniteOutcomeModel.pr0_to_likelihood_array(outcomes, 1 - pr1)

Let's break this one down a bit as well. First, we use super to ensure that QInfer can properly keep track of how many times the likelihood function is called--- this gives us a useful diagnostic when we're trying to figure out what sort of simulation costs we're incurring.

Next, we use the same unpacking technique, along with a slice using np.newaxis. When used in a slice, newaxis does just what it says on the tin: it adds a new axis to an array. For instance:

In [16]:
arr = np.empty((2, 3))
print(arr.shape)
print(arr[:, :, np.newaxis].shape)
(2L, 3L)
(2L, 3L, 1L)

This is very convenient for broadcasting. In particular, QInfer expects our likelihood function (at least for a two-outcome model, more on this later) to return an array $L_{jk}$, with $j$ indexing models and $k$ indexing experiments. With broadcasting, we can do this if each unpacked model parameter has shape (n_models, 1), so that the second index (of length 1) repeats over experiments. To accomplish this in a straightforward manner, we use transpose and newaxis to get an array of shape (6, n_models, 1), then unpack the leftmost index as before.

For experiment parameters, we can just pull each out by name to get an array of shape (n_experiments,). Importantly, this shape broadcasts with (n_models, 1) to give a shape (n_models, n_experiments), exactly as required. Let's see how this works in a quick example by defining two arrays of shape (3,) and (4,).

In [17]:
arr1 = np.array([1, 2, 3])
print(arr1.shape)

arr2 = np.array([10, 20, 30, 40])
print(arr2.shape)
(3L,)
(4L,)
In [18]:
print(arr1[:, np.newaxis].shape)
(3L, 1L)

Suppose we want to make a new array arr3 such that arr3[i, j] == arr1[i] + arr2[j] (that is, an outer product over + instead of *). We do this by adding a new axis to arr1 of length 1; this will then broadcast over the index of arr2 when added.

In [19]:
arr3 = arr1[:, np.newaxis] + arr2
print(arr3.shape)
print(arr3)
(3L, 4L)
[[11 21 31 41]
 [12 22 32 42]
 [13 23 33 43]]

As we can see, each element of arr1 is repeated 4 times in order to make a (3, 4) shape array. Broadcasting similarly repeats arr2, even though we did not explicitly add a new axis to its left, as shapes are aligned at the right when finding which indices need to be repeated to match shapes.

Returning to likelihood, we ensure that the shape of our likelihood array is correct by pre-allocating an array pr1 to hold the probability of getting a 1 outcome for each model and experiment. We then assign the actual likelihood values as computed using our unpacked and reshaped model and experiment parameters into a slice of pr1, such that the assignment will cause an error if we make an indexing mistake, as opposed to proceeding silently with our mistake intact.

In doing so, note that our index gymnastics paid off; the final expression for pr1 reads pretty much exactly like the mathematical expression we're trying to implement. Broadcasting works, is fun, and helps you write efficient code.

Anyway, with pr1, we have one last step to do before returning. QInfer supports more than just two-outcome models, such that in general we need to care about which outcome the likelihood function is being evaluated for. This is handled by a third index, such that the likelihood function in general should return $L_{ijk}$, with $i$ indexing which outcome. Since this is redundant for two-outcome models, QInfer provides the pr0_to_likelihood_array function to handle two-outcome models. This takes the probability of the 0 outcome, not the 1 outcome, so we complete our likelihood function by passing 1 - pr1.
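
To see concretely what that last call returns, here's a toy example (not part of the model) with two hypothetical models and one experiment; assuming the indexing convention described above, the result stacks $\Pr(0)$ and $1 - \Pr(0)$ along a new leftmost outcome axis:

pr0 = np.array([[0.25], [0.5]])  # shape (n_models, n_experiments) = (2, 1)
outcomes = np.array([0, 1])
L = qi.FiniteOutcomeModel.pr0_to_likelihood_array(outcomes, pr0)
print(L.shape)       # should be (2, 2, 1): outcome, model, experiment
print(L[0, :, 0])    # Pr(0) for each model
print(L[1, :, 0])    # Pr(1) for each model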

Now that we have completed our model, let's go on and use it. As mentioned above, we will want to wrap our model in BinomialModel to handle the case of more than one measurement per sequence length, so let's do that now.

In [20]:
model = qi.BinomialModel(DihedralRBModel())

We can then use our new model's validity checking to define a prior that is supported only on model vectors that make sense (that is, that yield likelihoods in the range 0 to 1). This is done by the PostselectedDistribution provided with QInfer. Since we have more conditions than is usual, we allow for more iterations with the maxiters keyword argument.

In [21]:
prior = qi.PostselectedDistribution(qi.UniformDistribution(
    # We'll pick a prior on p₀ and p₁ consistent with
    # F approximately ≥ 0.8.
    [[0.8, 1]] * 2 +
    # We'll be as agnostic as possible with respect to
    # A, B₁, B₂ and C; most of the parameters in this
    # range will be invalid, but the postselection
    # will discard those.
    [[-0.5, 0.5]] * 3 + [[-1, 1]]
), model, maxiters=10000)

Next, let's get an idea what this prior looks like. We can use Seaborn's PairGrid to plot estimated distributions from samples of the prior.

In [22]:
prior_samples = pd.DataFrame(prior.sample(n=500), columns=model.modelparam_names)
# Example from documentation at:
#     http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.PairGrid.html#seaborn.PairGrid
grid = sbs.PairGrid(prior_samples)
grid.map_upper(plt.scatter, s=1)
grid.map_lower(sbs.kdeplot, shade=False)
grid.map_diag(sbs.kdeplot, lw=3, legend=False)
Out[22]:
<seaborn.axisgrid.PairGrid at 0xcbf3fd0>

Importantly, we note that the $B_1$, $B_2$ and $C$ parameters are no longer independent once we include the validity checking. We may be able to deal with this in a more elegant fashion by reparameterizing the problem, but there's no real need in this case. QInfer handles the correlated prior returned by PostselectedDistribution just fine, and the validity checks are quite fast.

In [23]:
%timeit prior.sample(n=10000)
1 loop, best of 3: 322 ms per loop

Estimation with the Dihedral RB Model

Sampling from a prior is all well and good, but we defined our dihedral RB model so that we could learn from dihedral experiments. We start by defining a "true" set of model parameters that we'll use to generate data, and to compare our estimates to at the end. For this, we'll use QuTiP to implement the expressions given in Dugas et al. for an initial state and final measurement given by the magic state.

In [24]:
I, X, Y, Z = qt.qeye(2), qt.sigmax(), qt.sigmay(), qt.sigmaz()
SI, SX, SY, SZ = map(qt.to_super, [I, X, Y, Z])
error_channel = 0.98 * SI + 0.01 * SZ + 0.005 * (SY + SX)
initial_state = (I + (X + Y + Z) / np.sqrt(3)).unit()
final_meas = (I + (X + Y + Z) / np.sqrt(3)).unit()

We'll define $p_0$ and $p_1$ manually so that we don't have to define twirling in this post; for more detail, please see the traditional RB example provided with QInfer.

In [25]:
p_0 = 0.97
p_1 = 0.98
A   = np.real(0.5 * (final_meas * error_channel(Z)).tr() * (initial_state * Z).tr())
B_1 = np.real(0.5 * (final_meas * error_channel(Y)).tr() * (initial_state * Y).tr())
B_2 = np.real(0.5 * (final_meas * error_channel(X)).tr() * (initial_state * X).tr())
C   = np.real(0.5 * (final_meas * error_channel(I)).tr())

We then pack these parameters into an array and check that they're valid.

In [26]:
true_model_parameters = np.array([[
    # Set p₀ and p₁ to be a channel with noise
    # that's more strong in the Z direction,
    # and that has fidelity
    # F = ½ + ⅙ (p₀ + 2 p₁) ≈ 0.9916.
    p_0, p_1,
    A, B_1, B_2, C
]])
In [27]:
assert model.are_models_valid(true_model_parameters)

Importantly, this provides a very useful guard against mistakes in our definitions--- a badly normalized initial state, for instance, may cause this validity check to fail, such that we can notice and correct the problem.

At any rate, we now create the SMC updater as usual:

In [28]:
updater = qi.SMCUpdater(model, 12000, prior)

A quick check shows that our initial uncertainty is mostly uncorrelated, such that the correlations introduced by the boundaries in our prior are relatively small.

In [29]:
updater.plot_covariance(corr=True)

Next, we define a heuristic that we'll use to pick each tuple of $m$, $b_1$ and $b_2$. We'll choose $b_1$ and $b_2$ randomly for each set of sequences, and will pick $m \approx 1 / (1 - F)$, such that we choose longer sequences as necessary to refine our uncertainty in $p_0$ and $p_1$.

In [30]:
def dihedral_heuristic(updater, **kwargs):
    expparams = np.empty((1,), dtype=updater.model.expparams_dtype)
    
    for name, value in kwargs.items():
        expparams[name] = value
    
    def next_experiment():
        n_experiments_so_far = len(updater.data_record)
        p_0, p_1, A, B_1, B_2, C = updater.est_mean()
        F = 1 / 2 + (p_0 + 2 * p_1) / 6        
        
        expparams['m'] = max(2, np.floor(1 / (1 - F)))
        expparams['b_1'] = np.random.choice([True, False])
        expparams['b_2'] = np.random.choice([True, False])
        
        return expparams
        
    return next_experiment

Putting everything together, we can now run our typical updater loop.

In [31]:
heuristic = dihedral_heuristic(updater, n_meas=10)
for idx_experiment in range(1000):
    expparams = heuristic()
    datum = model.simulate_experiment(true_model_parameters, expparams)
    updater.update(datum, expparams)

Let's check and see how well we did.

In [32]:
fig, axes = plt.subplots(2, 3, figsize=(12, 9))
for idx, subplot_axes in enumerate(axes.flat):
    plt.sca(subplot_axes)
    updater.plot_posterior_marginal(idx_param=idx, res=100)
    ylim = plt.ylim()
    plt.vlines(true_model_parameters[0, idx], *ylim, label='True')
    plt.ylim(*ylim)
    
plt.sca(axes[0, 2])
plt.legend(['Posterior', 'True'], ncol=2, bbox_to_anchor=(1, 1.125))
Out[32]:
<matplotlib.legend.Legend at 0x1479bb38>

The posterior covariance is much more correlated than the prior, indicating that our uncertainty is being dominated by a combination of different model parameters.

In [33]:
updater.plot_covariance(corr=True)

To investigate this further, we can take the spectrum of the covariance, and look at the ratio of the covariance explained by each eigenvector.

In [34]:
cov = updater.est_covariance_mtx()
evals, eigvecs = np.linalg.eigh(cov)
evals /= np.sum(evals)
plt.bar(0.6 + np.arange(6), evals[::-1])
plt.gca().set_yscale('log')
plt.xlim(0.5, 6.5)
plt.xlabel('Eigenvector Index')
plt.ylabel('Ratio of Covariance Explained')
plt.xticks()
Out[34]:
(array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.]),
 <a list of 8 Text xticklabel objects>)

The dominant eigenvector then tells us about what parameters we should try to learn more about to most effectively improve our final estimates.

In [35]:
plt.bar(0.6 + np.arange(6), eigvecs[:, -1] / np.max(np.abs(eigvecs[:, -1])))
plt.xlim(0.5, 6.5)
plt.xticks(1 + np.arange(6), list(map("${}$".format, updater.model.modelparam_names)))
plt.xlabel('Model Parameter')
plt.ylabel('Normalized Overlap with Principal Eigenvector')
Out[35]:
<matplotlib.text.Text at 0x17f49240>

We can get a much more complete sense of our uncertainty by plotting the same slices through the posterior as we did above for the prior.

In [36]:
posterior_samples = pd.DataFrame(updater.sample(n=500), columns=model.modelparam_names)
# Example from documentation at:
#     http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.PairGrid.html#seaborn.PairGrid
grid = sbs.PairGrid(posterior_samples)
grid.map_upper(plt.scatter, s=1)
grid.map_lower(sbs.kdeplot, shade=False)
grid.map_diag(sbs.kdeplot, lw=3, legend=False)
Out[36]:
<seaborn.axisgrid.PairGrid at 0x12afca20>

Epilogue

We finish the post by exporting it to Jekyll:

In [37]:
!jupyter nbconvert --to html --template jekyll.tpl 2016-10-07-rb-multiexponential
[NbConvertApp] Converting notebook 2016-10-07-rb-multiexponential.ipynb to html
[NbConvertApp] Writing 1387701 bytes to 2016-10-07-rb-multiexponential.html

Documenting Long Classes in Jupyter Notebook

In an earlier post, I described my experiment with using Jupyter Notebook and Jekyll together to write blog posts. This turns out to be very convenient for writing about scientific software, as it allows me to make blog posts that interlace code, figures, mathematics and explanatory prose in a way that hopefully is helpful for my readers. My approach does, however, come with some annoying limitations. For instance, if I want to describe a Python class with several methods and properties, it is difficult to explain each part of that class independently. Though addressing this limitation is currently on the wishlist for the Jupyter Notebook project, the current version doesn't really have a "good" way of dealing with it.

In this post, I'll describe a "bad" way of describing long classes, by using inheritance to break up long definitions. I say that this is "bad" because it creates a needlessly complicated class hierarchy and code that is far from production-quality, but I think it's still a useful technique in that it allows for clearer explanations.

We begin by using print_function and the future library to provide better Python 2/3 compatibility.

In [1]:
from __future__ import print_function
from future.utils import with_metaclass

Next, to show that the inheritance trick works well for abstract classes, we import from the abc standard library package.

In [2]:
from abc import ABCMeta, abstractmethod

Getting to the example itself, suppose that you want to define a class with two methods, foo and bar. We represent this as an abstract base class (hence abc) with two abstract methods.

In [3]:
class AbstractBase(with_metaclass(ABCMeta, object)):
    @abstractmethod
    def foo(self, x):
        pass
    
    @abstractmethod
    def bar(self, y):
        pass

If we now try to describe an example concrete (that is, not abstract) class inheriting from AbstractBase, we must define both foo and bar. If we leave one out, then we get a TypeError when we try to instantiate the class.

In [4]:
class A(AbstractBase):
    def foo(self, x):
        return x
In [5]:
try:
    a = A()
except TypeError as ex:
    print(ex)
Can't instantiate abstract class A with abstract methods bar

Thankfully, we can use inheritance to provide the second method (in this case, bar) after we have defined our example class.

In [6]:
class A(A):
    def bar(self, y):
        return y ** 2
In [7]:
a = A()
In [8]:
a.bar(2)
Out[8]:
4

What's going on here? In particular, how can A inherit from itself? Remember that in Python, classes are themselves values that can be manipulated at runtime. Thus, if we define a class B, we can do things like print that new class out to the console.

In [9]:
class B(object):
    pass

print(B)
<class '__main__.B'>

As with any other Python value, we can ask for the id of a class. This returns a unique number that identifies that class, even if we assign it to different variables.

In [10]:
print(id(B))
C = B
print(id(C))
67899912
67899912

Notably, if we define a new class that is also called B, this "hides" the old definition of B and gives us a class with a different id.

In [11]:
class B(object):
    pass

print(id(B))
67900856

The old class still exists, however, such that we can get to it if we assigned it to a different variable.

In [12]:
print(id(C))
print(C)
67899912
<class '__main__.B'>

Thus, when we make a class that "inherits from itself," it doesn't really do that per se, but rather inherits from the old value of the variable that held that class. We can confirm this by looking at the special attribute __bases__, which returns a tuple of all base classes of a class.

In [13]:
class D(object):
    pass

print(id(D))

class D(D):
    pass

print(id(D))
print([id(base_class) for base_class in D.__bases__])
67901800
67882920
[67901800L]

Thus, the "old value" of our class still lives on, as referred to by the __bases__ attribute of our new class. Practically speaking, this is a terrible and confusing thing to do, but it has a very nice effect for our purposes, of letting us progressively add attributes such as methods and properties to a class.

In [14]:
class E(object):
    x = 'foo'
    
class E(E):
    y = 'bar'
    
print(E.x, E.y)
foo bar

In this way, self-inheritance can provide a useful technique for splitting up long class definitions. That said, it's a bad idea, so please don't use this in "actual" code, and if you do, don't blame me for the confusion that results.

We finish the post by exporting to Jekyll:

In [15]:
!jupyter nbconvert --to html --template jekyll.tpl 2016-10-07-documenting-long-classes-jupyter-notebook
[NbConvertApp] Converting notebook 2016-10-07-documenting-long-classes-jupyter-notebook.ipynb to html
[NbConvertApp] Writing 19416 bytes to 2016-10-07-documenting-long-classes-jupyter-notebook.html

Save Physics from Physicists: Bias and Extraordinary Claims

I am a physicist, in no small part because I believe that physics is a fascinating topic to study. These facts carry important implications, such as that I must also care about the process by which physics research is carried out, the people who perform research in physics, and what effects our research can have on the world. The latter of these underscores the sobering importance of the moral implications of physics research— we cannot afford to feign that our work is “neutral,” that it has no place in relation to the ethical calculus by which we understand how we must act with respect to each other.

I am continually exasperated by the wholly insufficient job that the culture of physics does at serving and supporting those furthering the cause of physics. This is to say nothing of how poorly the culture of physics ensures our relation to the rest of the world is in keeping with good ethics and morals. I often express this exasperation with what has almost become a kind of resigned catchphrase: “save physics from physicists.” Put differently, I contend that the culture of physics is acutely and dangerously broken in many ways. In this post, I will explore one of those ways in a vain attempt to provide context to that exasperation, with the naïve hope of turning my disappointment towards positive change. After all, I am a physicist of the precise sort that physics needs saving from.

I propose that a core tenet of the philosophy and practice of physics is that of criticality: we must be skeptical of new claims, we must reexamine old claims, we must break down our understanding of the world and problematize it at every level to develop new understandings. We apply this tenet quite well, modulo some notable exceptions, to some aspects of our research efforts. Where we fail, however, is in applying that criticality not only to our understanding of the world, but to the process by which we arrive at that understanding.

The myth behind this is that physicists are already rational, hence obviating the need for introspection in the practice of criticality. This myth creates no small room for occluding and even oppressing biases to hide, safe from challenge by any cultural examination.

To see an example of this in action, let us consider an important fact about our world: it is a fact, plain and simple, that there are fewer women than men currently engaged in the practice of physics research at academic institutions. Insofar as physics research is carried out by people, this fact is thus a highly relevant one to any search for truth, even if we are to set aside the massive ethical implications of this fact.

When presented with this fact, however, it is common to hear claims made to rationalize this fact away. There are many such claims, but in this post, I will focus on one particular example: the claim that perhaps women are simply “less interested in physics.” To take this claim as the default position, as anything other than the extraordinary claim that it is, is wholly and entirely incompatible with rationality and criticality. Indeed, let’s examine for a moment what one must also believe in order for the interest claim to be seen as a reasonable default:

  • Unlike every other aspect of human society, physics alone is isolated from the devastating effects of blatant and long-standing misogyny.
  • Despite the devastating misogyny that is endemic in educational systems the world over, we are able to assess in any fair manner how much gender inherently affects interest in physics.
  • As opposed to our best understanding of the impact of sexual dimorphism on every other field of study (to a very good approximation, none), physics alone has a biological component that can be understood through dimorphism.

In particular, the first subordinate claim further demands that we believe that the rampant and institutionalized sexual harassment and assault of physics students and researchers has no impact whatsoever on how women decide upon their careers. In 2016 and in the United States alone, we have seen three physicists commit grievous acts of sexual harassment against their students severe enough to draw the focus not only of the national news media, but of the United States Congress. (From the view of preserving grant money, it is a very bad thing for your field to be known to the Congress as being perfectly OK with not investigating criminal acts of harassment.) These are far from the only such incidents to occur in our field; our institutions continue to protect many more harassers in full knowledge.

Even if one is not familiar with these facts about physics, the point stands still more clearly. One should assume as a default that sexism and harassment are problems in physics, precisely because we are not more rational, nor immune to the endemic disregard for women that characterizes so many other fields of academic study. Indeed, why should physics be so dramatically different from biology, medicine, information security (content warning for explicit sexual content), anthropology, and philosophy, all of which have been shown to be perfectly comfortable with defending sexual harassers and punishing their targets?

If we are to adopt the myth of rationality, then, we are ignoring the massive body of evidence that says that our rationality is compromised; that we are biased. This bias does dramatic and immediate harm, and is directly responsible for driving people away from the research that they love. As reported by the Institute of Physics, endemic bias in physics erodes the quality of mentorship and supervision provided to female students, further cementing these biases. Even denying the existence and extent of sexism in physics is a key microaggression, and has demonstrable health impacts for physics students.

This harm is by no means limited to sexism, of course, but extends to wherever the culture of physics entrenches the biases that blind us to our moral duties. What else to call it but a moral crisis that 49% of transgender physicists are actively subjected to exclusionary behavior? What is it but an unforgivable failing of our civic duty to say nothing in the face of centuries of violent and oppressive racism?

How can we expect physics to survive as a discipline when we are so manifestly careless with respect to its practitioners? Unfortunately, the answer seems to be quite simple: we cannot. After all, the moral issues notwithstanding, ignoring endemic bias can be devastating to a community. To move forward, then, we must shed the belief that we are already rational and engage our research with the criticality that is morally demanded of us at every step. To do anything less is neither fair, effective, morally justifiable, nor sustainable.

QuTiP on Anaconda on Bash on Ubuntu on Windows

This post covers:

  • Installing Ubuntu on Windows.
  • Installing Ubuntu packages.
  • Installing Anaconda Python for Linux in Ubuntu on Windows.
  • Installing QuTiP in Ubuntu on Windows.

Introduction

Recently, Microsoft introduced Bash on Ubuntu on Windows, a full Ubuntu stack running on the Windows kernel instead of the Linux kernel. This opens an opportunity for scientific computing, as it allows Windows users to run the same versions of powerful tools such as Python that Linux users enjoy. While installing Ubuntu on Windows may seem like massive overkill at first blush, it goes a long way towards improving interoperability and accessibility for scientific computing on Windows. Moreover, Ubuntu on Windows is much lighter-weight than running a full virtual machine, and can easily interoperate with native Windows software by sharing common file systems. Using Ubuntu on Windows can also help lower the barrier to entry for a full Ubuntu Linux stack by making it much easier to learn and experiment with Linux software.

That said, Bash on Ubuntu on Windows is still a very new tool, and has a few limitations that can cause problems. In particular, Bash on Ubuntu on Windows only works on Windows 10 Anniversary Update (launched August 2016). In this post, I’ll detail how to get up and running with Bash on Ubuntu on Windows, using the example of the QuTiP library for quantum information.

In particular, this post proceeds in four parts, each concerning how to install a different part of the full stack, along with some background on what each different layer provides. As a result, this post runs a bit long. That said, the actual steps are fairly straightforward.

What is Ubuntu on Windows?

First, however, it helps to delve a little bit into what exactly Bash on Ubuntu on Windows is. Largely, Bash on Ubuntu on Windows is a complete installation of the Ubuntu Linux distribution, but running on the new Windows Subsystem for Linux (WSL) instead of Linux itself. WSL works by translating Linux operating system features into their equivalents under Windows, such that software designed for Linux can be run on WSL without modification. In particular, the Bash command-line that ships with Ubuntu works wonderfully on WSL, and forms the basis for Bash on Ubuntu on Windows.

For more detail, Microsoft’s WSL blog has a very nice introduction to the architecture underlying Ubuntu on Windows.

Installing Windows Subsystem for Linux

Since the Windows Subsystem for Linux is still in beta, we have to jump through a few hoops to enable WSL. Thankfully, these steps only need to be done once, and then we’ll have a working Ubuntu install that we can use for whatever additional software packages we like. Microsoft has provided a very complete guide to installing WSL; here, we’ll briefly summarize those instructions.

First, it is critical that your Windows 10 installation is fully up to date, and is running Anniversary Update. To check, go to Settings → System → About, and check that the build number is at least 14393. If you are running an earlier version of Windows 10, you can normally update to Anniversary Update by using Settings → Update and security → Windows Update to check for updates. Alternatively, you can manually upgrade to Anniversary Update by using the installer provided by Microsoft.

Once that’s done, we can enable beta features in Windows. This requires turning on Developer Mode under Settings → Update and Security → For developers. Ensure that the Developer Mode option is checked here; this may require rebooting. Next, if you aren’t already a member of the Windows Insider Program, you’ll need to join with your Microsoft account.

Now that we have access to beta features, we can enable Windows Subsystem for Linux. Press the Start button, and search for “Turn Windows features on or off.” From that dialog, check the box next to Windows Subsystem for Linux (Beta) and press OK. This will reboot your computer and install WSL.

Finally, we can install Ubuntu on Windows by running one last command. From the Start menu, run PowerShell, then type and run the following command:

PS> bash

You will be prompted to install Ubuntu on Windows, and to create a new user account for Ubuntu. This need not match your Windows account, as it exists only within your Ubuntu installation.

Installing Ubuntu Packages

Once we have WSL and Ubuntu on Windows installed, we can easily use Ubuntu’s built-in package manager to quickly add additional tools. In particular, QuTiP compiles code on the fly to help improve performance, such that we’ll need to install a compiler toolchain. Ubuntu uses the powerful apt program to manage software installations.

To install packages, first run Bash on Ubuntu on Windows from the Start Menu to bring up the Bash command line.

We will need GCC and GFortran, the C and Fortran compilers that come with Ubuntu. We’ll also install Git, as this will make it easy to later install pre-release versions of QuTiP. Finally, we also install the lsb package, which provides the Linux Standards Base expected by many common Linux software packages. We can install all four packages with the following command:

$ sudo apt-get install gcc gfortran git lsb

Breaking this down a bit, sudo causes the rest of the command to be run as the root user. In Ubuntu on Windows, this is the root user for your Ubuntu installation. Since WSL installs a separate copy of Ubuntu on Windows for each Windows user, the root Ubuntu user only has access to the resources your Windows account can access.

The rest of the command tells apt to install the aforementioned four packages, along with any other packages that they require. It should take just a few minutes to download and install everything.

Installing Anaconda

The next step is to install the Anaconda distribution of Python, as Anaconda provides a wide range of cutting-edge Python packages for scientific computing. This is not strictly required, as Ubuntu on Windows does ship with a perfectly fine Python installation.

That said, here I’ll focus on installing Anaconda 4.1.1 for Python 3.5; later versions can be found from Continuum Analytics as they are released. Importantly, we will be using the 64-bit Linux version of Anaconda. The Windows Subsystem for Linux allows us to run the Linux version from within Bash on Ubuntu on Windows.

We’ll use the wget command that comes with Ubuntu to download Anaconda. From Bash on Ubuntu on Windows, run the following command:

$ wget http://repo.continuum.io/archive/Anaconda3-4.1.1-Linux-x86_64.sh

Once this is done, we can run the Anaconda installer:

$ bash Anaconda3-4.1.1-Linux-x86_64.sh

The Anaconda installer will ask a sequence of questions. We’ll want the default answer (press ENTER) on each question but the last one:

Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/cgranade/.bashrc ? [yes|no]
[no] >>>

Type yes and press ENTER. Anaconda is now installed into Ubuntu on Windows. Quit and relaunch Bash for the changes to take effect. To confirm, run which python to see which version of Python will be run by Bash. You should see something like /home/cgranade/anaconda3/bin/python.

By default, Anaconda will use the Intel Math Kernel Library (MKL) to accelerate numeric computation in Python. Unfortunately, MKL is not yet fully compatible with WSL, so we’ll need to disable some MKL features. As reported by Evfro, MKL for WSL is incompatible with a feature known as kernel thread affinity, so we’ll disable that feature now.

$ printf "set KMP_AFFINITY=disabled\nexport KMP_AFFINITY" >> ~/.bashrc
$ source ~/.bashrc

Now that Anaconda is installed, we can use Python, IPython, and other powerful tools from within Ubuntu on Windows. In particular, we can use Jupyter Notebook by using the command

$ jupyter notebook --no-browser

This will start a server at http://localhost:8888 that can be used from your favorite browser on Windows. When you’re done, press Ctrl-C from within Bash on Ubuntu on Windows to stop the Notebook server.

Installing QuTiP

We’re almost there, as we now have installed Ubuntu on Windows, a compiler toolchain, and the Anaconda distribution. We’ll finish up by installing QuTiP itself. With everything else in place, this is quite straightforward. Anaconda ships with pip, an installer for Python packages.

$ pip install qutip

If you would prefer to install the latest development version, you can specify the QuTiP Git repository instead:

$ pip install git+https://github.com/qutip/qutip.git

In either case, QuTiP ships with a wide variety of tests that can be used to ensure that the install worked properly. We can run these using the qutip.testing.run() function:

$ ipython
Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul  2 2016, 17:53:06)
Type "copyright", "credits" or "license" for more information.

IPython 4.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import qutip.testing

In [2]: qutip.testing.run()

Depending on your system, these tests should take from ten to forty minutes to run. If all goes well, they should complete without any errors (some warnings and debug messages are normal, though), indicating that the install was successful and that you’re up and running with QuTiP. Enjoy!

Using SSH Keys in Visual Studio Code on Windows

Visual Studio Code is Microsoft’s open-source code editor for Windows, OS X and Linux. Nicely, VS Code has built-in support for Git and support for Python through an extension, making it useful for scientific development. Using VS Code on Windows is somewhat frustrating, however, if you want to work with a Git repository that was cloned using SSH. Thankfully, I found a workable solution using PuTTY and Git for Windows, such that VS Code transparently works with password-protected SSH keys. Below, I detail how I got it working in as much detail as is reasonable, but you may have already done some or even many of these steps. If so, the procedure is actually fairly simple, and consists of pointing Git (and hence VS Code) to use PuTTY and Pageant instead of the SSH version that ships with Git for Windows.

First, though, a disclaimer. These steps worked on my Windows 10 installation, but may not work on yours. If you find that this is the case, let me know, and I’ll try and update accordingly.

Step 0. Install Required Software

Before we get into things, we’ll need a bit of software. In particular, we’ll need:

  • Git for Windows,
  • OpenSSH, and
  • PuTTY.

WARNING: Do not install PuTTY from its official homepage, as this will download PuTTY over an insecure connection. This guide will cover how to download PuTTY securely.

For much of this, we can use the Chocolatey package manager for Windows to save some grief, so let’s start by installing that. If you already have Chocolatey, please skip this step. (If you aren’t sure, try running choco from PowerShell.) Run PowerShell as administrator, then run the following command to download and install Chocolatey:

PS> Set-ExecutionPolicy -Scope Process RemoteSigned
PS> iwr https://chocolatey.org/install.ps1 -UseBasicParsing | iex 

Once this is done, close and reopen PowerShell (again as administrator). This will make choco available as a command. Now we can use it to install Git and OpenSSH (as above, we will not install PuTTY using Chocolatey, as it will download PuTTY from its official homepage using an insecure connection). Run the following PowerShell commands to install Git and OpenSSH:

PS> choco install git
PS> choco install win32-openssh

We’ll finish up by downloading the version of PuTTY that ships with WinSCP, since that version is delivered via HTTPS and not insecure HTTP. In particular, use this link to download PuTTY, then run the installer once you’ve downloaded it.

Step 1. Setup Private Keys

Once everything is installed, we now need to make sure that you have an SSH private key and that this key is registered with your Git hosting service (for instance, GitHub or Bitbucket). If you already have keys and have registered them with your hosting provider, please skip on ahead.

In any case, to generate keys, we’ll again use PowerShell:

PS> ssh-keygen

Simply follow the prompts to make yourself a new public/private key pair, making sure to choose a long (~40 character) passphrase. This passphrase provides much of the entropy for your key, such that it should be much longer than a typical password. Never type your passphrase into a remote password prompt— the passphrase is used to unlock your key locally on your machine, and should never be sent over the network. If a website asks you for your SSH passphrase, you are probably being scammed.

By default, the new keys will be located in C:\Users\<username>\.ssh\id_rsa and C:\Users\<username>\.ssh\id_rsa.pub. As the names suggest, the first of these is the private key and should not be shared with anyone. The other is the public key, and serves to identify yourself to others. Follow the instructions for GitHub or Bitbucket (for Bitbucket, make sure to follow the Linux and OS X instructions, even from Windows) to upload your public key to your hosting provider.

Step 2. Set up SSH Agent

Next, we’ll make sure that your private key is setup in an SSH agent. This will securely remember your passphrase within a given session, so that you don’t have to type it in every time you use it. In particular, we’ll configure Pageant, since this is installed with PuTTY, and works well with a variety of command-line and GUI tools for Windows— most notably, with VS Code.

Pageant must be run at startup in order to be useful, so we’ll begin by adding it to the startup folder now. In Windows Explorer (Windows 8.1 and earlier) or in File Explorer (Windows 10 and later), go to the folder C:\Users\<username>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup. Right-click inside this folder and select New → Shortcut. From there, browse to C:\Program Files (x86)\PuTTY and select pageant.exe.

Next, we need to import your new key into PuTTY/Pageant. Run PuTTYgen from the Start Menu and select File → Load Key.... From there, navigate to C:\Users\<username>\.ssh\ and select id_rsa (the private key). You may have to drop down the file types selector in the dialog box to see this, as PuTTYgen defaults to filtering out everything but files ending in *.ppk. Once selected, you’ll be prompted by PuTTY to unlock your key by typing in your passphrase. Do so, and PuTTYgen will show the corresponding public key. Select File → Save private key to export your private key in PuTTY, rather than OpenSSH, format. I suggest saving it as id_rsa.ppk in the same folder as id_rsa, but this is up to you. Just be sure that to save it in a folder that only you can read, and that is not synchronized using Dropbox, OneDrive, Google Drive or similar.

Finally, run Pageant from the Start Menu (in the future, this will be handled automatically by the shortcut we created above). This will add a new icon to your system tray. It may be hidden by the arrow; if so, click the arrow to make all of the system tray icons visible. Right-click on Pageant and select Add Key. Browse to where you saved id_rsa.ppk and select it. You’ll be prompted to unlock your key. Upon doing so, your unlocked key will then be made available in Pageant until you log out or quit Pageant.

Step 3. Add SSH Server Fingerprints

Despite the name, this is a short step. Whenever you log into an SSH server, PuTTY will check that the server’s fingerprint is correct. This is a short cryptographic string identifying that server, such that checking the fingerprint helps against man-in-the-middle attacks. If you haven’t logged into a server with PuTTY before, however, it has no idea how to check the fingerprint, and will fail to login. Since VS Code ignores these errors, Git support will silently fail unless you first attempt to log into the SSH server offered by your Git host. To do so, we’ll use PowerShell one last time. Run one of the following commands below, depending on which hosting provider you use.

PS > & 'C:\Program Files (x86)\PuTTY\plink.exe' git@github.com
PS > & 'C:\Program Files (x86)\PuTTY\plink.exe' git@bitbucket.org

In either case, you’ll be prompted to add the server’s fingerprint to the registry. If you are confident that your traffic is not being intercepted, select y at this prompt. Neither GitHub nor Bitbucket actually allows interactive shell logins via SSH, so you’ll get an error, but this is OK: you’ve gotten far enough to see the server’s fingerprint, and that’s all we needed. To check, you can run the commands above again, and note that you are no longer prompted to add the fingerprint, but instead fail immediately.

Step 4. Configure Environment Variables

We’re almost done. All that’s left is to point Git for Windows at PuTTY and Pageant, rather than its own built-in SSH client. Since VS Code uses Git for Windows, this will ensure that VS Code does what we want.

Right-click on My Computer or This PC in Windows/File Explorer, and select Properties. From there, click Advanced system settings in the sidebar to the left. On the Advanced tab, press the Environment Variables... button at the bottom. Finally, click New... on the user variables pane (top), and add a new variable named GIT_SSH with value C:\Program Files (x86)\PuTTY\plink.exe. You may want to use Browse File... in this dialog box to make sure you get the path correct. Once done, press OK to add the variable, OK again to close the Environment Variables dialog, then OK a third time to close System Properties. Finally, close the System window.

If you have VS Code open already, close it and reopen VS Code to make sure it sees the new environment variable.

Conclusions

Congratulations, you should now have a full Git + SSH client toolchain working on Windows, and made visible to VS Code. Have fun!