Figure and Animation Visualization in Python – Using Matplotlib and Seaborn

First of all, every language or tool can produce informative, accurate and good-looking figures. They are just tools; we should have a definite idea of what we want to present and how to achieve it. Most scientific figures are extremely boring. If we can make them appealing before they make the audience yawn, why not?

There are many tools that can be used to visualize our data:

  • Matlab
  • Mathematica
  • R
  • Latex
  • Origin
  • Postscript
  • Python

They are commonly used by physicists and other scientists. Surely, there are many other tools besides these, and I'd make the same comment: they are just tools, and what matters is what we think. Someone with good ideas can produce an effective figure even with PowerPoint or Excel. Still, a suitable tool can greatly reduce repetitive work.

Python is among our choices because of its abundant APIs and libraries. We can calculate the data and visualize it immediately. Besides that, it can produce animated movies. We have to admit that movies are often more expressive than figures, although we cannot insert them into our papers.

Matplotlib is a powerful Python plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. We can learn more about it at http://matplotlib.org/. FuncAnimation in Matplotlib is a simple tool that can be used to produce movies. Seaborn is another library, used to produce good-looking statistical figures.

This post summarizes my work on random walks, so the examples here are all related to that subject.

 

Matplotlib

 

If we have installed matplotlib, we can use it in Python with

import matplotlib.pyplot as plt

If we have not installed it, we can use Anaconda or Canopy on Windows, or use

sudo apt-get install python-matplotlib

on Linux (Debian or Ubuntu) to install it.
More about the installation can be found at http://matplotlib.org/users/installing.html
Depending on our program, we may also use numpy

import numpy as np

For a line plot, we can use

plt.plot(x,y)

Its arguments are described at http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot
The following code adds more details to the figure

plt.xlabel('') # add xlabel
plt.ylabel('') # add ylabel
plt.legend(loc='upper left') # add legend if we have more lines in one figure

Finally, we can display the figure with

plt.show()
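
Putting these calls together, a complete line plot might look like the minimal sketch below. The sine and cosine data, the helper name draw_line_plot and the off-screen Agg backend are my assumptions for illustration; they are not part of the random walk examples that follow.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def draw_line_plot():
    """Draw two labelled curves on one figure and return the figure."""
    x = np.linspace(0, 2 * np.pi, 100)  # illustrative data only
    fig = plt.figure()
    plt.plot(x, np.sin(x), label='sin(x)')
    plt.plot(x, np.cos(x), label='cos(x)')
    plt.xlabel('x')               # add xlabel
    plt.ylabel('y')               # add ylabel
    plt.legend(loc='upper left')  # the legend uses the label of each line
    return fig


fig = draw_line_plot()
```

With an interactive backend (drop the matplotlib.use line), plt.show() opens the figure in a window; with Agg, fig.savefig('line_plot.png') writes it to a file instead.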

Example 1

This example is a marker plot, in which we plot markers instead of a line with a small change to the arguments of plt.plot. It is produced by


plt.style.use('ggplot')
plt.plot(step,x2,'.',label='x^2')
plt.plot(step,y2,'.',label='y^2')
plt.xlabel('Number of Steps')
plt.ylabel('The average of x^2 or y^2')
plt.legend(loc='upper left')
plt.show()

I like ggplot2 in R, which has wonderful color matching, so I add this style with

plt.style.use('ggplot')

The lists step, x2 and y2 were calculated for a random walk. The complete program for the figure is


# -*- coding: utf-8 -*-

import random
import numpy as np
import matplotlib.pyplot as plt

# Parameters
nsample = 200
nstep=1000

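# Define function disstep(xdis2,ydis2,nsample,step): average x^2 and y^2 over nsample walks of `step` steps
# (xdis2 and ydis2 are reset to 0 inside, so the values passed in are ignored)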
def disstep(xdis2,ydis2,nsample,step):

    # Initiate variables
    ydis2=0
    xdis2=0
    
    # Let's walk
    for n in range(nsample):
        x=0
        y=0
        for m in range(step):
            walk=random.randrange(4)
            if walk == 0:
                x=x+1
            elif walk == 1:
                y=y+1
            elif walk == 2:
                x=x-1
            elif walk == 3:
                y=y-1
        xdis2=xdis2+x*x
        ydis2=ydis2+y*y
    xdis2=xdis2/nsample
    ydis2=ydis2/nsample

    return xdis2, ydis2

# Variables
step = []
x2 = []
y2 = []
ydis2=0
xdis2=0

# Calculate the average of x^2 and y^2 after the random walk
for n in range(nstep):
    step.append(n+1)
    x20,y20 = disstep(xdis2,ydis2,nsample,step[n])
    x2.append(x20)
    y2.append(y20)

# Plot x^2 and y^2 via step
plt.style.use('ggplot')
plt.plot(step,x2,'.',label='x^2')
plt.plot(step,y2,'.',label='y^2')
plt.xlabel('Number of Steps')
plt.ylabel('The average of x^2 or y^2')
plt.legend(loc='upper left')
plt.show()
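
The program above simulates a fresh set of walks for every step count, which takes a while in pure Python. As a possible alternative, here is a vectorized NumPy sketch. It is my own variant, not the code that produced the figure: the function name mean_square_displacement, the seed and the larger sample size are assumptions, and it reuses the same walks for every step count, so neighbouring points are correlated. For this walk, x changes by ±1 with total probability 1/2 at each step, so the expected value of x^2 after n steps is n/2, and the same holds for y^2.

```python
import numpy as np


def mean_square_displacement(nsample, nstep, seed=0):
    """Return <x^2> and <y^2> after 1..nstep steps, averaged over nsample walks."""
    rng = np.random.default_rng(seed)
    # One direction per walk and per step: 0 = +x, 1 = +y, 2 = -x, 3 = -y
    walk = rng.integers(0, 4, size=(nsample, nstep))
    dx = (walk == 0).astype(int) - (walk == 2).astype(int)
    dy = (walk == 1).astype(int) - (walk == 3).astype(int)
    # Cumulative sums give the position of every walk after every step
    x = np.cumsum(dx, axis=1)
    y = np.cumsum(dy, axis=1)
    # Average the squared displacement over the walks (axis 0)
    return (x ** 2).mean(axis=0), (y ** 2).mean(axis=0)


step = np.arange(1, 1001)
x2, y2 = mean_square_displacement(nsample=2000, nstep=1000)
```

The arrays step, x2 and y2 can be passed to the same plt.plot calls as before.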

Example 2

This example draws a scatter plot.

I have set the limits of the x and y axes with

plt.xlim(-0.02,1.02)
plt.ylim(-0.02,1.02)

To make the x and y axes have the same scale,

plt.gca().set_aspect('equal', adjustable='box')

is used.
The complete code is


# -*- coding: utf-8 -*-

import random
import numpy as np
import matplotlib.pyplot as plt

x=[]
y=[]
a=1000  # number of dots
for i in range(a):
    x.append(random.random())
    y.append(random.random())

plt.style.use('ggplot')
plt.plot(x,y,'.',markersize=20,alpha=0.5)
plt.xlim(-0.02,1.02)
plt.ylim(-0.02,1.02)
plt.gca().set_aspect('equal', adjustable='box')
plt.text(0,1.05,'Number of Dots = 1000')
plt.show()
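
The same figure can also be built through Matplotlib's object-oriented interface, where the settings are applied to an Axes object instead of going through plt. The sketch below is one way to do it. The helper name scatter_dots, the seed argument and the Agg backend are my assumptions, and I put the dot count in the title so that it is passed in only once.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def scatter_dots(ndots, seed=None):
    """Plot ndots uniformly random points in the unit square."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(ndots)]
    y = [rng.random() for _ in range(ndots)]
    fig, ax = plt.subplots()
    ax.plot(x, y, '.', markersize=20, alpha=0.5)
    ax.set_xlim(-0.02, 1.02)
    ax.set_ylim(-0.02, 1.02)
    ax.set_aspect('equal', adjustable='box')  # same scale on both axes
    ax.set_title('Number of Dots = {}'.format(ndots))
    return fig, ax


fig, ax = scatter_dots(1000, seed=1)
```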

Module matplotlib.animation

 

The main interfaces are TimedAnimation and FuncAnimation, which you can read more about at http://matplotlib.org/api/animation_api.html. The following is based on https://jakevdp.github.io/blog/2012/08/18/matplotlib-animation-tutorial/ and uses FuncAnimation.
First, import the libraries.


import random
import matplotlib.pyplot as plt
from matplotlib import animation 

Second, create a figure window, axes and an empty line.


fig = plt.figure()
plt.xlim(-0.02,1.02)
plt.ylim(-0.02,1.02)
plt.gca().set_aspect('equal', adjustable='box')
line, = plt.plot([],[],'.',markersize=20,alpha=0.5)

Third, define the function that produces the data of every frame and fills the empty line.


x=[]
y=[]
def adddot(i):
    x.append(random.random())
    y.append(random.random())
    line.set_data(x,y)
    return line,

Lastly, animate it with

anim = animation.FuncAnimation(fig,adddot,frames=500,interval=10,blit=True)
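
For reference, the four steps can be assembled into one script, sketched below with one addition of my own: an init function passed as init_func. According to the FuncAnimation documentation, when init_func is not given, the result of drawing the first frame is used as the clear frame, which here would add one extra dot. As far as I understand, init_func is also called again when the animation repeats, so clearing the lists there restarts each run from an empty figure. The Agg backend line is an assumption for running off-screen; remove it and call plt.show() to watch the animation.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend; remove for an interactive window
import matplotlib.pyplot as plt
from matplotlib import animation

# Figure, axes and an empty line to be filled frame by frame
fig, ax = plt.subplots()
ax.set_xlim(-0.02, 1.02)
ax.set_ylim(-0.02, 1.02)
ax.set_aspect('equal', adjustable='box')
line, = ax.plot([], [], '.', markersize=20, alpha=0.5)

x, y = [], []


def init():
    """Draw a clear frame: empty the lists and the line (my addition)."""
    del x[:]
    del y[:]
    line.set_data(x, y)
    return line,


def adddot(i):
    """Add one random dot for frame i and return the updated artist."""
    x.append(random.random())
    y.append(random.random())
    line.set_data(x, y)
    return line,


anim = animation.FuncAnimation(fig, adddot, init_func=init,
                               frames=500, interval=10, blit=True)
```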

In addition, if we want to export our movies, we should install ffmpeg or some other encoder. For how to install it, we can refer to http://adaptivesamples.com/how-to-install-ffmpeg-on-windows/
Then we can save the movie as an mp4 file with

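# Create the ffmpeg writer first; the encoder options belong to the writer, not to save()
FFwriter = animation.FFMpegWriter(fps=100, extra_args=['-vcodec', 'libx264'])
anim.save('animation.mp4', writer=FFwriter)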
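
If ffmpeg may not be installed on every machine, one option is to check for it and fall back to an animated GIF written with Pillow. This is only a sketch under that assumption; the helper name choose_writer is hypothetical, and the frame rate is the one used above.

```python
from matplotlib import animation


def choose_writer(fps=100):
    """Return (writer, file extension) for whichever encoder is available."""
    if animation.writers.is_available('ffmpeg'):
        writer = animation.FFMpegWriter(fps=fps, extra_args=['-vcodec', 'libx264'])
        return writer, '.mp4'
    # Fallback when ffmpeg is missing: animated GIF via Pillow
    return animation.PillowWriter(fps=fps), '.gif'


writer, extension = choose_writer()
# anim is the FuncAnimation object created earlier:
# anim.save('animation' + extension, writer=writer)
```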

Seaborn

 

Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics. We can find more at https://stanford.edu/~mwaskom/software/seaborn/
We have to install seaborn before the first use. If we have installed Anaconda, just open Anaconda Prompt and run

conda install seaborn

Example 3

 

This example demonstrates how to visualize a linear relationship as determined through regression. After importing the library with

import seaborn as sns
sns.set(color_codes=True)

we can plot a linear fit with a single line of code

sns.regplot(x=sstep, y=sx2);

The sstep and sx2 used here are pandas Series. The complete program for the figure is


# -*- coding: utf-8 -*-

import random
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
sns.set(color_codes=True)

# Parameters
nsample = 20
nstep=1000

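# Define function disstep(xdis2,ydis2,nsample,step): average x^2 and y^2 over nsample walks of `step` steps
# (xdis2 and ydis2 are reset to 0 inside, so the values passed in are ignored)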
def disstep(xdis2,ydis2,nsample,step):

    # Initiate variables
    ydis2=0
    xdis2=0
    
    # Let's walk
    for n in range(nsample):
        x=0
        y=0
        for m in range(step):
            walk=random.randrange(4)
            if walk == 0:
                x=x+1
            elif walk == 1:
                y=y+1
            elif walk == 2:
                x=x-1
            elif walk == 3:
                y=y-1
        xdis2=xdis2+x*x
        ydis2=ydis2+y*y
    xdis2=xdis2/nsample
    ydis2=ydis2/nsample

    return xdis2, ydis2

# Variables
step = []
x2 = []
y2 = []
ydis2=0
xdis2=0

# Calculate the average of x^2 and y^2 after the random walk
for n in range(nstep):
    step.append(n+1)
    x20,y20 = disstep(xdis2,ydis2,nsample,step[n])
    x2.append(x20)
    y2.append(y20)

# Plot x^2 and y^2 via step
sstep=pd.Series(step)
sx2=pd.Series(x2)
sy2=pd.Series(y2)
sns.regplot(x=sstep, y=sx2);
plt.xlim(0,1000)
plt.show()
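
sns.regplot draws the scatter and the fitted line, but it returns the Axes rather than the fit coefficients. If we want the numbers, for example to compare the slope with the value 1/2 expected for this walk, one option is an ordinary least-squares fit with np.polyfit. The sketch below uses synthetic data (a line of slope 1/2 plus Gaussian noise, my assumption) as a stand-in for the simulated step and x2; the helper name fit_line is hypothetical.

```python
import numpy as np


def fit_line(step, x2):
    """Least-squares straight line: x2 = slope * step + intercept."""
    slope, intercept = np.polyfit(step, x2, deg=1)
    return slope, intercept


# Synthetic stand-in for the simulated data: slope 1/2 plus noise
rng = np.random.default_rng(0)
step = np.arange(1, 1001)
x2 = 0.5 * step + rng.normal(scale=20.0, size=step.size)

slope, intercept = fit_line(step, x2)
print('slope = {:.3f}, intercept = {:.3f}'.format(slope, intercept))
```

With the real data, passing the step and x2 lists from the program above to fit_line works the same way.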

A more compact and expressive example can be found on the seaborn homepage


import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)

np.random.seed(sum(map(ord, "regression")))
tips = sns.load_dataset("tips")

sns.regplot(x="total_bill", y="tip", data=tips);

And the result is
