Like many people in the weeks following the earthquake, I found myself trying to wrap my head around radiation data. I've been a big fan of data journalists/geeks for the past couple of years, and many of these people were doing great work after the quake. I tried to point them out on my own website by gathering their "quaketools," and by writing a short feature in The Japan Times about data visualization.

In the interests of learning how to do news on the web better, I make occasional attempts to dive into different sets of data myself, primarily using a few simple tools that are both free and not so hard to use. While none one of these techniques are earth-shattering, I figure if I share a few more people in Japan might start using them.

The two I want to mention, Google Spreadsheets and Fusion Tables, are used by reputable news orgs around the world (see The Guardian, The New York Times for starters). I really urge any Japan-based newsies out there to experiment with them as well, as a sort of introduction to working with data.

Connecting to CSV files with Google Spreadsheets

Function: =importData("URL")

  • Step 1: Get the CSV file URL available on TEPCO's website
  • Step 2: Import into a Google Spreadsheet using the function: =importData("http://www.tepco.co.jp/forecast/html/images/juyo-j.csv")
  • Step 3: Select menu INSERT > CHART, and customize according to your preference.

For those new to CSV (comma separated values) files think of them as files that house tables of data in plain text format, or a sort of alternative to a spreadsheet or a HTML table. Read about them on Wikipedia.

Advantages of publishing data in CSV (over PDF/XLS, usually the convention):

  • It's accessible. If your computer gets swept away by a tsunami, it's a lot easier to access a tiny plain text file via mobile device than a clunky PDF or Excel file.
  • Transfer of small CSV files is much easier in times of peak traffic (read 'emergencies')
  • Data can be f*cking hard to get out of a PDF. Seriously. I've had some luck with the pdf2txt command in Linux though. Zamzar can work occasionally too.

One week after the earthquake Japan's Local Authorities Systems Development Center recommended that organizations publish in CSV and HTML format rather than PDF and XLS. I wonder how many will listen?

One other happy by-product that happens when organizations publish in CSV is that you can connect a Google Spreadsheet to it directly. For example, I connected to the CSV file provided by TEPCO (see above) about electricity usage in Tokyo, and the result is an spreadsheet that is changes as the TEPCO data is updated every hour. Convert the spreadsheet to graph, and now you have a live graph posted on the web. What it lacks in prettiness it makes up for in auto-f**king-updating.

Daily Power Usage in Tokyo (source: TEPCO)

Note: I was having some display issues in Drupal with this, so it it doesn't show, refresh. If that fails see here. There were also some issues with Google Spreadsheets choking on Japanese Text, see raw data view.

HTML tables are just as awesome

Function: =ImportHtml("URL"; "table";1)

  • Step 1: Get the URL of the webpage that contains a table of data. In this case, we'll use some radiation data at http://ftp.jaist.ac.jp/pub/emergency/monitoring.tokyo-eiken.go.jp/monitoring/past_data.html
  • Step 2: Import into a Google Spreadsheet using the function: =ImportHtml("http://ftp.jaist.ac.jp/pub/emergency/monitoring.tokyo-eiken.go.jp/monitoring/past_data.html"; "table";2)
  • Step 3: Select menu INSERT > CHART, and customize according to your preference.
  • OR Step 4: Select menu INSERT > GADGET which is what I did to product the graph below.

Daily environmental radiation levels in Tokyo

*Note: As I mentioned above Google Spreadsheets doesn't appear to handle Japanese text (am I missing something?) so I did get some shitty '???' in my labels on top.

Mapping with Google Fusion Tables

This one is a little trickier but it's pretty powerful, particularly for making maps. I've yet to master Google Maps API sorcery, so Google Fusion tables allows me to covert basic spreadsheets that include location data into a very cool map. Fusion tables lets you designate certain colors to outlying data if you choose.

So if you look at the "top global internet speeds" map I made recently over on Penn-Olson, you'll see that regions with the highest internet speeds are colored green, with yellow and red showing progressively slower speeds. The original data comes from a table in a PDF recently published by Akamai. So Fusion Tables could be used to make a basic version of Haiyan Zhang's Geiger Map with little more knowledge than some spreadsheet-fu. I hear way more can be done with the Fusion Table API, but I haven't worked up the courage to dive into that rabbit hole yet.

Top Global Internet Speeds

Where to learn more

Those are just a few simple examples. Check the links below if you'd like to dive in further. Note that a few of these have a much higher degree of difficulty for the average bear, but I'm dropping them here for myself as much as you.

Oh, and Chris' Tools for News is awesome too. Go get lost in it.

In other news

I've not posted here in a while due to numerous reasons. The earthquake here in Japan was a big one as I quite literally did not move from the computer for a month, trying to gather information and share where possible (Storify, [Twitter][1c0]). I also started writing for Penn-Olson.com, and that occupies most of my time these days. It's a blast so far, as I get to write about China again and there's lots of room to experiment with new forms of publishing like the ones mentioned above.

I'll share more about that whole experience soon, but until then, feel free to follow the blog on Twitter or Facebook. My posts are here (RSS), though if you're interested in technology in Asia I encourage you to read them all.