mrnoutahi Some thoughts on boring stuff, and bioinformatics

D3IpyPlus: Experimenting with IPython Notebook + D3Plus for interactive viz

D3IpyPlus is an attempt to incorporate D3 interactive visualizations into an IPython Notebook. The main objective is to take advantage of Jupyter’s interface, which allows manipulation of the DOM, to build interactive plots. D3IpyPlus can also be used to quickly and automatically generate, from Python, the JS/HTML code for an interactive data visualization.

Why?

Because IPython Notebook is really convenient. Since the notebook supports Markdown and HTML (through IPython.core.display) in addition to Python/Julia/R cells, it is suitable for writing a complete blog post or a web-oriented scientific essay with very little post-editing. It would therefore be nice to have a way to embed interactive graphs in it automatically, just by writing Python code. There are already several ways to perform such a task.

All of them are pretty good, and if you are reading this, you have probably already heard of them. Truthfully, you should try either Altair or HoloViews if you are looking for a well-maintained package that will prove useful in the long term. However, I like simplicity and full control over the packages I use and, more importantly, I like writing useless stuff.

D3IpyPlus’s plots are based on D3plus, which has a nice API and is much easier to work with than D3.js. The source is available at https://github.com/maclandrol/d3IpyPlus.

So what can you do with this thing?

The module contains the following methods:

  • from_csv and from_json for loading raw data as a pandas DataFrame
  • to_js to convert a Python type to a JavaScript type

and the following classes:

  • PyD3Plus, the super base class that interacts with the DOM
  • Plot, the generic plotting class that offers fine-grained control over most D3Plus plot types
  • ScatterPlot, LinePlot, BarPlot, BoxPlot, StackedArea, which are subclasses of Plot and are just syntactic sugar to simplify the API
  • TreeMap for plotting treemaps
  • _GeoMap and _GeoMap2 for geo data (cannot be displayed in the notebook).
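
To make the "syntactic sugar" remark concrete, here is a minimal sketch of how a thin subclass can pre-fill the plot type of a generic class. SketchPlot and SketchScatterPlot are made-up names for illustration; they are not the actual D3IpyPlus classes, whose internals may well differ.

```python
class SketchPlot:
    """Generic plot (hypothetical): a d3plus type plus arbitrary settings."""

    def __init__(self, plot_type, **settings):
        self.plot_type = plot_type
        self.settings = settings

    def options(self):
        # Merge the plot type and the user's settings into a single dict.
        opts = {"type": self.plot_type}
        opts.update(self.settings)
        return opts


class SketchScatterPlot(SketchPlot):
    """Sugar (hypothetical): fixes the type and names the required arguments."""

    def __init__(self, x, y, id, **settings):
        super().__init__("scatter", x=x, y=y, id=id, **settings)


generic = SketchPlot("scatter", x="value", y="weight", id="type")
sugar = SketchScatterPlot(x="value", y="weight", id="type")
print(generic.options() == sugar.options())
```

Both objects end up with the same options; the subclass only saves the caller from spelling out the type.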

A basic scatter plot

from D3IpyPlus import ScatterPlot

sample_data = [
    {"value": 100, "weight": .45, "type": "alpha"},
    {"value": 70, "weight": .60, "type": "beta"},
    {"value": 40, "weight": -.2, "type": "gamma"},
    {"value": 15, "weight": .1, "type": "delta"}
]

# you can pass a container_id parameter, that will correspond 
# to the id of the div to which your plot will be attached.
# Alternatively, a unique div id will be generated if the argument is missing.
scplot = ScatterPlot(x='value', y='weight', id='type', width=600,
                     size=10, container_id="scatterviz")
# The following will display the plot inside the notebook
scplot.draw(sample_data)

# You can also print the html source corresponding to the scatter plot above
print(scplot.dump_html(sample_data))

    <script src='http://www.d3plus.org/js/d3.js' type='text/javascript'></script>
    <script src='http://www.d3plus.org/js/d3plus.js' type='text/javascript'></script>
    <div id='scatterviz' ></div>
    <style>
    div#scatterviz{
       width: 600px;
       height: 400px;
    }
        
    </style>
    <script>
        
    (function (){
        
        var viz_data = [{"type": "alpha", "weight": 0.45, "value": 100}, {"type": "beta", "weight": 0.6, "value": 70}, {"type": "gamma", "weight": -0.2, "value": 40}, {"type": "delta", "weight": 0.1, "value": 15}];

        var visualization = d3plus.viz()
            .container('#scatterviz')
            .type('scatter')
            .color('type')
            .text('type')
            .y('weight')
            .x('value')
            .id('type')
            .size(10)
            .data(viz_data)
            .draw();

    })();
    
    </script>
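
For the curious, the dump above can be approximated with a few lines of plain Python. The sketch below is not how D3IpyPlus is implemented; render_d3plus_html is a hypothetical helper, and the "d3viz_" prefix for generated ids is an assumption. It only shows the general idea: serialize the data with json.dumps, pick or generate a div id, and chain one d3plus call per setting. The two script tags that load d3.js and d3plus.js are left out for brevity.

```python
import json
import uuid


def render_d3plus_html(data, plot_type, container_id=None,
                       width=600, height=400, **settings):
    """Return an HTML snippet drawing `data` with d3plus.viz() (hypothetical)."""
    # Generate an id when none is given, so several plots can share a page.
    if container_id is None:
        container_id = "d3viz_" + uuid.uuid4().hex[:8]

    # json.dumps yields literals that are also valid JavaScript
    # (True -> true, None -> null, dict -> object).
    calls = [".container(%s)" % json.dumps("#" + container_id),
             ".type(%s)" % json.dumps(plot_type)]
    for name, value in settings.items():
        calls.append(".%s(%s)" % (name, json.dumps(value)))
    calls.append(".data(viz_data)")
    calls.append(".draw();")
    chain = "\n        ".join(calls)

    template = """<div id='{cid}'></div>
<style>
div#{cid} {{ width: {w}px; height: {h}px; }}
</style>
<script>
(function () {{
    var viz_data = {data};
    var visualization = d3plus.viz()
        {chain}
}})();
</script>"""
    return template.format(cid=container_id, w=width, h=height,
                           data=json.dumps(data), chain=chain)


print(render_d3plus_html([{"value": 100, "type": "alpha"}], "scatter",
                         container_id="scatterviz",
                         x="value", id="type", size=10))
```

Inside a notebook, a string like this can be handed to IPython.display.HTML to render it in the output cell.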

A Tree Map example

Let’s make a tree map showing the import partners of Benin in 2016. The dataset (CSV) was downloaded from the Observatory of Economic Complexity and is available on GitHub.

Since D3IpyPlus supports pandas DataFrames as data input, we only have to load the CSV data and plug it directly into a TreeMap object. We can use the from_csv method provided by D3IpyPlus, which can take a function as input for data preprocessing.

   year country_origin_id country_destination_id  import_val  \
0  2016               BEN                    AGO    33188982   
1  2016               BEN                    BDI        8226   
2  2016               BEN                    BEN        1809   
3  2016               BEN                    BFA     1052344   
4  2016               BEN                    BWA        2597   

  country_destination_name country_destination_continent  
0                   Angola                        Africa  
1                  Burundi                        Africa  
2                    Benin                        Africa  
3             Burkina Faso                        Africa  
4                 Botswana                        Africa  
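
The "load, then preprocess with a function" pattern is easy to sketch with the standard library alone. The snippet below does not use D3IpyPlus's from_csv (its exact signature is not shown here); load_records and to_numbers are hypothetical names, and the five rows are simply copied from the table above. It returns a list of dicts, the same record format used in the scatter plot example. With pandas, the equivalent idea would be pandas.read_csv followed by the preprocessing function.

```python
import csv
import io

# Five rows copied from the table above; a real run would open the CSV file.
RAW_CSV = """year,country_origin_id,country_destination_id,import_val,country_destination_name,country_destination_continent
2016,BEN,AGO,33188982,Angola,Africa
2016,BEN,BDI,8226,Burundi,Africa
2016,BEN,BEN,1809,Benin,Africa
2016,BEN,BFA,1052344,Burkina Faso,Africa
2016,BEN,BWA,2597,Botswana,Africa
"""


def load_records(handle, preprocess=None):
    """Read CSV rows as dicts, then apply an optional preprocessing function."""
    rows = list(csv.DictReader(handle))
    return preprocess(rows) if preprocess else rows


def to_numbers(rows):
    """Cast the numeric columns, which csv always reads as strings."""
    return [dict(row, year=int(row["year"]), import_val=int(row["import_val"]))
            for row in rows]


records = load_records(io.StringIO(RAW_CSV), preprocess=to_numbers)
total = sum(row["import_val"] for row in records)
print(len(records), total)
```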

Now let’s make a tree map from that dataset. We will organize the visualization by continent and use each country’s full name in the tooltip.

from D3IpyPlus import TreeMap
# df is the DataFrame loaded from the CSV above
tmap = TreeMap(id=["country_destination_continent", "country_destination_name"],
               value="import_val", color="import_val", legend=True, width=700)
tmap.draw(df)
# Ta da!