by Mandi Cai
Charting the waters: between Bokeh and D3
There comes a time in the life of a budding “low-key but also high-key trying to become a front-end designer and developer” when they must enter the world of charting libraries.
Charting libraries produce data-driven visualizations. They are the reason you can quickly grasp trends in life expectancy on FiveThirtyEight or gauge the national sentiment about an upcoming presidential election (yikes) on The New York Times.
Think about the charts that you can create within Google Sheets, except now you have direct viewing and editing rights to the library that drives those charts. You are the master of the low-level building blocks that constitute a “chart”.
Recently, our team was tasked with building an interface that had to integrate a charting library. As a result, we had to choose a library that satisfied our specific use cases. If you weigh your needs correctly and choose a library that somehow satisfies all of them, life is golden.
However, libraries are never a one-size-fits-all kind of deal. More often than not, your initial assumption that a library is the perfect match will be incorrect because of unforeseen obstacles that pop up.
Perhaps you’re thinking: “What are those options?”, “How did you approach choosing an option?” or “Why did you feel stupid?” (the last one refers to the Slack message above).
Why We Tried Bokeh First
Our needs fell into two camps: speed and interactivity. Because we were handling larger quantities of data, our visualization had to be able to update at lightning speed (or at least at a speed that had no perceivable lag).
Our application also needed to support the interactivity that we envisioned for the user. In an ideal scenario, the library would already include some of these interactive functions that we could easily throw in, instead of having to write them from scratch.
You can also choose to use the Bokeh Server to handle streaming of your data. The Bokeh 0.12.13 documentation states: “This capability to synchronize between python and the browser is the main purpose of the Bokeh Server.”
Bokeh is magical for a lot of reasons. It can render using WebGL with an HTML5 Canvas fallback, provides several built-in tools to interact with charts, handles egregiously large data sets, and ultimately creates something that can go on the web immediately.
One drawback to Bokeh, however, is that it limits the degree of interactivity that a visualization can have. Bokeh enables you to “chart” in the more conventional sense: it offers a 2-D, grid-like canvas with axes as the baseline. And that’s okay, because often that’s what the user needs and wants. Experienced Bokeh users can make really beautiful things (see examples here).
But if I wanted to make a visualization that went outside of the conventional characteristics of a chart, such as simulating forces between atoms and dragging the atoms around, I don’t know how I would accomplish that in Bokeh.
[Figure: Bokeh bar chart using Python (via Jupyter Notebook)]
Finally, other charting libraries, like D3.js or three.js, have more documentation and more active usage than Bokeh. With more active contributors and users of a library comes a higher probability of finding the solution you need to fix a specific bug.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Why We Switched to D3
Our initial concern with D3 was that it would render our visualizations too slowly, given past experiences with rendering SVGs in the browser with larger quantities of data. We also knew that the learning curve for D3 was significantly steeper than Bokeh’s.
But we were still optimistic given D3’s popularity, the seemingly endless supply of beautifully documented D3 applications, and our “Get Sh*t Done” attitude … so we decided to give it a try anyway.
To our surprise, the D3 visualizations we created with our datasets were very buttery. We quickly realized that D3 is structured for quick rendering, even with the massive arrays we were passing into the library.
Instead of passing in data points one by one and generating the respective SVG element each time, which can be quite tedious, D3 allows you to bind your entire data set to your SVG elements before they exist. The elements are then generated rapid-fire and associated with their respective data points all in one go.
It’s like a chef in the kitchen who receives a list of orders all at once and can prep the food in an order that omits unnecessary wait time, rather than always waiting to receive the next order after preparing one dish.
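To make the “bind before they exist” idea a bit more concrete, here is a minimal sketch in plain JavaScript of what a keyed data join computes. It does not use D3 itself: `joinData`, the `key` callback, and the element objects are hypothetical names made up for illustration, and D3’s real selections operate on DOM nodes rather than plain objects.

```javascript
// Conceptual model of a keyed data join, written without D3.
// Elements are plain objects here; in D3 they would be DOM/SVG nodes.
function joinData(existingElements, data, key) {
  // Index the elements that already exist by the key of their bound datum.
  const byKey = new Map(existingElements.map((el) => [key(el.datum), el]));

  const enter = [];  // data points that have no element yet
  const update = []; // existing elements whose data point is still present

  for (const d of data) {
    const el = byKey.get(key(d));
    if (el) {
      el.datum = d; // rebind the fresh datum to the existing element
      update.push(el);
      byKey.delete(key(d));
    } else {
      enter.push(d);
    }
  }

  // Whatever is left over had no matching data point.
  const exit = [...byKey.values()];
  return { enter, update, exit };
}

// One circle already exists for "a"; the new data set has "a", "b", and "c".
const existing = [{ tag: "circle", datum: { id: "a", value: 1 } }];
const data = [
  { id: "a", value: 10 },
  { id: "b", value: 20 },
  { id: "c", value: 30 },
];

const result = joinData(existing, data, (d) => d.id);

// Create every missing element in a single pass over the enter group.
const created = result.enter.map((d) => ({ tag: "circle", datum: d }));
console.log(created.length, result.update.length, result.exit.length);
```

The point of the sketch is the chef’s trick: the whole order list is sorted into “new”, “still here”, and “gone” up front, so all the missing elements can be created in one pass instead of one at a time.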
The best part about D3 is that it offers ample opportunity for smooth interactions and transitions between data sets. Because our ultimate goal was and is to empower the user, we wanted to create a visualization that would invite an individual to engage with it.
Yes, the code is more complex compared to the code required for a Bokeh chart. It took more time and energy to pick up. But you have complete control over every small detail of your chart, and it’s all documented somewhere online (probably via the creator, Mike Bostock). That’s pretty great.
Lastly, there has been extensive usage of D3 in recent years to visualize the 2017 US elections, the movement of refugee populations, infant vaccination rates for WHO, and countless other trends and stories. As a result, D3 has garnered a significant amount of exposure and attention, which leads to more active users and new ways to use the library every day.
When choosing a library for the long haul and keeping in mind that your teammates will also need to learn it eventually, it is absolutely crucial to consider the library’s current and future community of contributors. A library with a continuously thriving community is ideal, and D3 seems to foster that type of community.
It is hard to understand how selections work, what .enter() and .exit() even mean, and the magic that just happens with one simple line of code (.transition() anybody?). But once you’ve wrapped your head around D3’s unique structure of assuming things exist before they exist, the possibilities are endless.
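Part of the “magic” behind a transition is plain interpolation: intermediate values are computed between an attribute’s start and end state and applied over time. Here is a rough sketch of that idea in plain JavaScript. The helpers below are hand-written stand-ins rather than D3 calls, and the easing curve, the step count, and the `frames` helper are assumptions chosen purely for illustration.

```javascript
// Hand-written stand-ins for illustration; this sketch does not call D3.
// Linear interpolation between two numbers for t in [0, 1].
const interpolateNumber = (a, b) => (t) => a * (1 - t) + b * t;

// Cubic in-out easing: slow start, fast middle, slow finish.
const easeCubicInOut = (t) =>
  t <= 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;

// Compute the value an attribute would take at each step of a transition.
function frames(start, end, steps, ease = easeCubicInOut) {
  const interpolate = interpolateNumber(start, end);
  const values = [];
  for (let i = 0; i <= steps; i++) {
    values.push(interpolate(ease(i / steps)));
  }
  return values;
}

// A bar growing from height 0 to height 100 in four steps.
const heights = frames(0, 100, 4);
console.log(heights); // [ 0, 6.25, 50, 93.75, 100 ]
```

In a real chart, each of those intermediate values would be written to the element on successive animation frames, which is what makes a bar appear to grow rather than jump.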
Ultimately, the benefits of D3 outweighed the effort and time of learning it, and we had a hunch that switching to D3 would be a good long-term investment.
So there you have it! We are still actively using and learning D3 as we integrate the library into our application and our team. But just because we are moving forward with D3 does not mean that we won’t use Bokeh for a different application in the future. There are pros and cons to every charting library, and it’s important to reflect constantly to determine whether you should continue with your current library or start exploring other options.
Before choosing a charting library, know your specific needs and don’t be afraid to dive headfirst into the uncharted waters of charting libraries with those needs in mind. If something does not work the first time around, try something new that seems promising.
It’s about exploring, documenting, and checking back in with yourself and your teammates to continue evolving the project in productive ways.
If you have any comments, corrections, suggestions, or just want to talk, feel free to e-mail me at firstname.lastname@example.org. You can find some of my work at http://mandilicai.com/.