matplotlib and multiple y-axis scales

This week I had to create a plot using two different scales in the same graph to show the evolution of two related, but not directly comparable, variables. This operation is described in this FAQ on the matplot lib website. Nonetheless I’d like to give a small step by step example…

Consider my input data of the form date release total broken outdated .

20110110T034549Z unstable 29989 133 3
20110210T034103Z wheezy 28900 8 0
20110210T034103Z unstable 30125 209 11
20110310T060132Z wheezy 29179 8 0
20110310T060132Z unstable 30230 945 28
20110410T040442Z wheezy 29487 8 0
20110410T040442Z unstable 31142 991 12
20110510T034745Z wheezy 30247 8 0
20110510T034745Z unstable 31867 610 31
20110610T041209Z wheezy 30328 9 0
20110610T041209Z unstable 32395 328 15
20110710T030855Z wheezy 31403 9 0

I want to create one graph containing three sub graphs, each one containing data for unstable and wheezy. For the sub graph plotting the total number of packages, since the data is kinda uniform, the plot is pretty and self explanatory. The problem arise if we compare the non installable packages in unstable and wheezy, since the data from unstable will squash the plot for wheezy, making it useless.

Below I’ve added the commented python code and the resulting graph. You can get the full source of this example here.

# plot two distribution with different scales
def plotmultiscale(dists,dist1,dist2,output) :

    fig = plt.figure()
# add the main title for the figure
    fig.suptitle("Evalution during wheezy release cycle")
# set date formatting. This is important to have dates pretty printed 
    fig.autofmt_xdate()

# we create the first sub graph, plot the two data sets and set the legend
    ax1 = fig.add_subplot(311,title='Total Packages vs Time')
    ax1.plot(dists[dist1]['date'],dists[dist1]['total'],'o-',label=dist1.capitalize())
    ax1.plot(dists[dist2]['date'],dists[dist2]['total'],'s-',label=dist2.capitalize())
    ax1.legend(loc='upper left')

# we need explicitly to remove the labels for the x axis  
    ax1.xaxis.set_visible(False)

# we add the second sub graph and plot the first data set
    ax2 = fig.add_subplot(312,title='Non-Installable Packages vs Time')
    ax2.plot(dists[dist1]['date'],dists[dist1]['broken'],'o-',label=dist1.capitalize())
    ax2.xaxis.set_visible(False)

# now the fun part. The function twinx() give us access to a second plot that 
# overlays the graph ax2 and shares the same X axis, but not the Y axis 
    ax22 = ax2.twinx()
# we plot the second data set
    ax22.plot(dists[dist2]['date'],dists[dist2]['broken'],'gs-',label=dist2.capitalize())
# and we set a nice limit for our data to make it prettier
    ax22.set_ylim(0, 20)

# we do the same for the third sub graph
    ax3 = fig.add_subplot(313,title='Outdated Packages vs Time')
    ax3.plot(dists[dist1]['date'],dists[dist1]['outdated'],'o-',label=dist1.capitalize())

    ax33 = ax3.twinx()
    ax33.plot(dists[dist2]['date'],dists[dist2]['outdated'],'gs-',label=dist2.capitalize())
    ax33.set_ylim(0, 10)

# this last function is necessary to reset the date formatting with 30 deg rotation
# that somehow we lost while using twinx() ...
    plt.setp(ax3.xaxis.get_majorticklabels(), rotation=30)

# And we save the result
    plt.savefig(output)

Aggregate Results