Today’s blog is written by Elasticiti’s Mark Permann.
On July 9 and 10, 2016, I attended Data Visualization Camp at the United Nations, one of the many Open Camps conferences for open source technologies.
Over the last year I’ve focused on dashboard design and development, and I love the mix of creativity and analysis, design and development, aesthetics and metrics involved. Most of my experience is with Tableau, the well-known commercial visualization software, and I’m happy with it - there’s a large development community, support is fantastic, and the product is robust enough to build some amazingly flexible dashboards. But Data Viz Camp offered a live opportunity to survey what else is out there - in tech, definitely, but also in the form of different types of practitioners to hear from. It was a worthwhile weekend, and I want to share some highlights.
Diverse solutions for needs niche to common
There was a wide variety of tools presented, some from sources that surprised me. The American Museum of Natural History’s Eozin Che talked about the work behind the planetarium show Dark Universe, which is (of course! I realized only then) a visualization of truly big data from space exploration instruments. Partiview is a free version of the software used, and while most of us probably aren’t visualizing geospatial data, Partiview can make 3D scatterplots actually useful for needs like inspecting cluster algorithm results. An example from Optical Character Recognition used Partiview to display each handwritten number as its own data point in cluster space.
Partiview Viz of OCR Clustering
Day 2 keynote speaker Edward Tufte, whose scholarship in information design well predates today’s tools, also shared a tool of his own, Image Quilts. It’s a Chrome extension that allows you to manipulate Google Images search output, most notably to eliminate the white space - applying his maxim to minimize non-data ink.
Subatomic Particles Quilt
There were two presentations addressing a more common visualization: network graphs. I haven’t personally used these graphs yet, but have often seen them and thought “that looks cool, but how do I get insights out of that hairball?” Nick Fernandez, a postdoc at the Icahn School of Medicine, presented Clustergrammer, which displays networks as adjacency matrices. The ability to spread out the data, choose different sort orders and use color to encode additional information makes such matrices more useful to my eye.
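To make the adjacency matrix idea concrete, here is a minimal Python sketch. It is not Clustergrammer’s code; the function names and the tiny edge list are made up for illustration. It turns a list of connections into a matrix and reorders the rows and columns by how connected each node is, which is one example of a sort order.

```python
def adjacency_matrix(edges, order=None):
    """Return (labels, matrix) for an undirected edge list.

    `order` optionally fixes the row/column order; otherwise nodes
    are sorted alphabetically.
    """
    nodes = sorted({node for edge in edges for node in edge})
    labels = list(order) if order is not None else nodes
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, b in edges:
        # Undirected network: mark the connection in both directions.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return labels, matrix


def by_degree(edges):
    """Order nodes from most to fewest connections, ties alphabetical."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sorted(degree, key=lambda node: (-degree[node], node))


# Hypothetical four-node network, for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]
labels, matrix = adjacency_matrix(edges, order=by_degree(edges))

for label, row in zip(labels, matrix):
    print(label, row)
```

Sorting by connection count pushes the busiest nodes to the top left, so dense areas of the network show up as blocks rather than as a tangle of crossing lines.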
That said, I totally agree that Alicia Powers’ network graphs are an effective means of displaying and analyzing nutrition data - you know, like how you connect to lettuce? She used Neo4J to graph individuals, their meals, food and ingredient constituents and tell a convincing story that a hot dog with sauerkraut can be recommended for better nutrition! If that sounds incredible, I think you’ll find her talk entertaining and thought provoking.
“Ladies Who Lunch” in Neo4J
What do you do if you’re prototyping viz but the dataset isn’t ready yet? Matt Strom launched datumipsum.com to the rescue. Datum Ipsum lets you create real-looking data by adding Perlin noise (developed by Ken Perlin after his work on the movie Tron) to tweakable change signals; you can get it looking the way you like and export the data.
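For a feel of how noise plus a signal produces believable data, here is a rough Python sketch. It is not Datum Ipsum’s implementation: it uses a simplified one-dimensional gradient noise in the spirit of Perlin noise, and every name and parameter (make_noise_1d, fake_series, noise_scale, frequency) is an assumption made for illustration.

```python
import math
import random


def make_noise_1d(seed=0, size=256):
    """Build a simplified 1D gradient-noise function (Perlin-style)."""
    rng = random.Random(seed)
    # One random slope per integer lattice point.
    gradients = [rng.uniform(-1.0, 1.0) for _ in range(size)]

    def fade(t):
        # Smoothing curve 6t^5 - 15t^4 + 10t^3, flat at t=0 and t=1.
        return t * t * t * (t * (t * 6 - 15) + 10)

    def noise(x):
        x0 = math.floor(x)
        t = x - x0
        # Contribution of the lattice points to the left and right of x.
        left = gradients[x0 % size] * t
        right = gradients[(x0 + 1) % size] * (t - 1.0)
        # Blend smoothly between the two contributions.
        return left + (right - left) * fade(t)

    return noise


def fake_series(n_points=50, start=100.0, slope=2.0,
                noise_scale=15.0, frequency=0.2, seed=42):
    """Linear trend plus smooth noise: a stand-in for real-looking data."""
    noise = make_noise_1d(seed)
    return [start + slope * i + noise_scale * noise(i * frequency)
            for i in range(n_points)]


series = fake_series()
print([round(value, 1) for value in series[:10]])
```

Because the noise is smooth rather than purely random, neighboring points drift together, which is what makes the result read like a measured series. Changing slope, noise_scale or frequency is the kind of tweaking the tool exposes through its interface.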
Open source easier to use with Tiny Tools and Vega
D3’s gallery shows how powerful it is for developing browser viz, but it requires a lot of code to get even something basic on the screen - as Adam Pearce of the New York Times put it, 48 lines for a scatterplot. Adam shared several D3 tools that can shrink the verbosity down to 9 lines, reduce time spent on formatting, and enable annotations.
Practices and principles for design & development
The diversity of tech presented was equalled by the fields and approaches of the presenters themselves. Several were journalists, for whom a common goal was personalization: designing graphics that quickly connect the reader to the story. Nadja Popovich’s piece “Are you reflected in the new Congress” is a great example. It invites the reader to begin by filtering on multiple dimensions immediately to find “you”, inverting the usual story order of big picture first, then drill-down. It makes a lot of sense and parallels what I do with my Tableau business dashboards, putting the interactive controls at top left.
K.K. Rebecca Lai’s “Death in Syria” follows a big-picture-then-details order, but tackles the problem of connecting us to that big picture (in this case, 200,000 civilian deaths) by representing each with a small, slightly fuzzy marker; plotted together, the markers fill several inches of scrolling column space. She described the marker as a “dot” and said it was controversial to represent people as such, but frankly, I think she undersold their work using that term. Rebecca didn’t talk about how they arrived at the marker, but to me it reads as what it is: individuals seen from a great distance, pixelated but not pixels. Reading it on the web page, you can’t see it all at once, and you scroll and scroll till you find the end of it... It’s a triumph of information design, powerfully conveying the scale of a single number while doing so with great sensitivity.
A former journalist, Natalia Rodriguez, talked about the mindset shift she experienced in her current role working with scientists at the American Museum of Natural History: using visualization to explore and discover data before the story has been determined, to see before showing. I was charmed to hear her wish for museums to be at the forefront of innovation, because they’ve certainly inspired my own innovations. I found an echo of my own experience of the creative process - multiple possibilities, beginnings that aren’t pretty, middle stages of incomplete functioning - in the evolution on display in AMNH’s Hall of Invertebrate Origins. And a recent MoMA Jackson Pollock retrospective unconsciously influenced the development of a series ranking bump chart; “that looks like art!” were the first words uttered by a client upon seeing it.
A similar sentiment was expressed by Hermann Zschiegner, who said one of his “favorite things is going to the Met” before showing a cuneiform tablet to illustrate the long history and usefulness of data visualization. Hermann, founder of data viz agency TWO-N (who sponsored and organized Data Viz Camp), characterized their work as creating “story platforms” - software for clients to tell stories with their data, e.g. the Art Genome Browser. He described their development approach: get data first, quickly prototype, and iterate. “Agile is the only way to succeed,” he said; “we can’t just hand over design specs to developers”. Rather than following “a theory, we focus on being aware, open, curious” and engaged. I find that to match up pretty well with how we work at Elasticiti.
Being careful matters too, of course, and Elliot Noma reminded us that the choice of what to graph matters just as much as how. He conveyed this crisply by showing how the typical “hockey stick” growth chart on a linear scale can become a slowing-growth chart on a log scale, and a declining chart when the growth rate itself is plotted, all depictions of the same data.
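The effect is easy to reproduce with numbers. The short Python sketch below uses a made-up series (not Elliot’s data) whose period-over-period growth rate shrinks from 60% to 40%, and computes what each of the three charts would emphasize; the function name three_views is hypothetical.

```python
import math


def three_views(values):
    """Return the step-to-step change under three framings of one series."""
    pairs = list(zip(values, values[1:]))
    # Linear scale: how much the raw value rises each period.
    absolute_changes = [b - a for a, b in pairs]
    # Log scale: how much the plotted height rises each period.
    log_changes = [math.log10(b) - math.log10(a) for a, b in pairs]
    # Growth rate chart: percentage change each period.
    growth_rates = [(b - a) / a for a, b in pairs]
    return absolute_changes, log_changes, growth_rates


# Made-up data: start at 100 and grow by a shrinking percentage.
rates = [0.60, 0.55, 0.50, 0.45, 0.40]
values = [100.0]
for rate in rates:
    values.append(values[-1] * (1 + rate))

absolute_changes, log_changes, growth_rates = three_views(values)

print("values:          ", [round(v, 1) for v in values])
print("absolute changes:", [round(c, 1) for c in absolute_changes])
print("log10 changes:   ", [round(c, 3) for c in log_changes])
print("growth rates:    ", [round(g, 2) for g in growth_rates])
```

On a linear axis each step up is bigger than the last, which is the hockey stick. On a log axis each step is smaller than the last, so the curve flattens. Plot the growth rate directly and the line simply falls.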
Not surprisingly, former professor Tufte provided the most principled thinking about the opportunities and challenges of data visualization. He suggested we find a way to increase the data-ink ratio of linking lines by using words to draw them. He proclaimed the future of visualization to be ever-increasing throughput via video and 4K, and said that we should design up to the latest display standards to foster the abandonment of low-res tech. Perhaps most simply and compellingly, he proffered Google Maps as familiar evidence that there “is no relationship between the amount of information and the ability of comprehension... Clutter and overload aren’t inherent properties of information; they are failures of design.” That’s data visualization inspiration you can see every day.