This is my face most days...
Two years and three months later, I can confirm the well-known Seattle drear. It feels like it's cloudy more often than not, and it's nearly always wet. ALWAYS! Am I just suffering from confirmation bias, or is Seattle's weather always just generally crappy? Well, being the scientist that I am, I sought to answer these interesting and personally-relevant questions.
My initial idea was to gather a large amount of data on Seattle's rainfall over the past 10 years, do some basic statistical analysis on it, and plot the results. This happened to coincide with my first serious forays into learning Python, so it would be a good way to learn how to use Python to condense large data sets into easy-to-digest conclusions. I went to the obvious sites first: weather.com and weather.gov. I ended up dumping a few hours into looking through each, finding them frustrating to navigate, more frustrating to pull data from, and even more frustrating to manipulate that data with my fledgling knowledge of Python.
So, I abandoned those and did what any good scientist would do: complain about my data troubles to my friends on Facebook. My colleague Jim commented on that post with a link to an awesome wind map, noting that someone had found a way to use weather data, so I checked that out to distract myself for a little bit. While on the page, I noticed a link to the weather data behind its basic maps. Intrigued, I hit the link and found Weather Underground. They had data just how I wanted it! So, I stole all of Seattle's monthly data over the past 10 years and got to work.
Note: This is my first data visualization post of what I hope to be many more in the future (inspired by my good colleague and Red Army co-member Jim Davenport). I apologize in advance if the image format sucks with respect to my blog layout. Now, let's have a look at some data!
Here we have precipitation, cloud cover, and humidity data for Seattle over the past 10 years, up to and including November 27th, 2012. Each individual brick in the graphs represents a month of data. The bricks in the Precipitation plot represent cumulative rainfall over the month. The bricks in the other two plots show the average of the median cloud cover and median humidity over that month. I drew some lines in the leftmost plot to bracket the "nice season" in Seattle: May through September. Also, the cloud cover percentages are only good to roughly ±6%. For whatever reason, the source site lists cloud cover in eighths of the sky (e.g. 4/8, 7/8), so I converted them to percentages. Hence, error!
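For anyone curious how the bricks and that ±6% come about, here is a minimal sketch of the idea. It is not my actual script: the function names and the row layout (year, month, rain in inches, cloud cover in eighths) are made up for illustration, and I'm assuming one row per day. The ±6% is read here as half of one eighth of the sky, which is the most a value rounded to the nearest eighth can be off by.

```python
from collections import defaultdict
from statistics import mean


def oktas_to_percent(oktas):
    """Convert cloud cover in eighths of the sky (0-8) to a percentage."""
    if not 0 <= oktas <= 8:
        raise ValueError("cloud cover must be between 0 and 8 eighths")
    return oktas * 100.0 / 8.0


# One eighth of the sky is 12.5%, so a value rounded to the nearest
# eighth can be off by up to half of that.
HALF_STEP_PERCENT = 100.0 / 8.0 / 2.0


def monthly_bricks(daily_rows):
    """Collapse daily rows into one (total rain, mean cloud %) pair per month.

    daily_rows is a hypothetical layout: an iterable of
    (year, month, rain_inches, cloud_oktas) tuples, one per day.
    """
    rain = defaultdict(float)
    cloud = defaultdict(list)
    for year, month, rain_inches, cloud_oktas in daily_rows:
        rain[(year, month)] += rain_inches  # cumulative rainfall
        cloud[(year, month)].append(oktas_to_percent(cloud_oktas))
    return {key: (rain[key], mean(cloud[key])) for key in rain}


# Placeholder rows, NOT real Seattle data.
example_rows = [
    (2000, 1, 0.25, 4),
    (2000, 1, 0.50, 7),
    (2000, 2, 0.00, 8),
]
example_bricks = monthly_bricks(example_rows)
```

Each key in `example_bricks` is a (year, month) pair, and each value is what one brick in the Precipitation and Cloud Cover plots would be colored by.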
The lovely thing about this data is that it essentially speaks for itself. The truly dreary months are clearly October through April, where we get the worst of it all. Anyone living here for at least a year could tell you that. Worse yet, the averaged median cloud cover per month never dips below about 25%, and it only gets anywhere close to clear between July and September. This explains my Vitamin D deficiency last year! What's the silver lining here? If you ever want to visit Seattle to experience some of its real beauty, come in August!
Here's what's really interesting to me in this particular set of plots. Check out March 2010 in the left two. It was uncharacteristically nice for March in Seattle! Look at all of the other March data on both graphs. It's a local minimum in both rainfall and cloud cover! I got duped by Seattle!
After producing these pretty plots I began to wonder how the weather in Seattle compared to my home in NY. So, I altered my Python script a little bit (i.e. changed "Seattle" to "New York") and made the same three graphs for NYC (at JFK airport).
Well, would you look at that: the amount of rainfall in NYC appears to be roughly on par with that of Seattle. But wait! What do the numbers say? Between 2002 and 2012, there are only 3 years where Seattle's total rainfall exceeds New York's! WHAT!? It can't be! In 2006, 2010, and 2012, Seattle's rainfall exceeds NYC's by no more than 3.35 inches (2006). NYC dwarfs Seattle in all other years, hitting a maximum difference of 23.89 inches last year. Crazy. The cloud cover is better in NYC though, being more or less even on average throughout the year at ~60%. Lastly, as anyone who's lived in or near NYC can tell you, humidity in NYC is horrid over the summer. Had I gotten info from somewhere on Manhattan island, I'm absolutely certain that the humidity result would've been just disgusting.
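If you want to do that year-by-year comparison yourself, here's a rough sketch of how it could go. Again, this is an illustration and not my original code: the function names are invented, the input is assumed to be a dict of monthly totals keyed by (year, month), and the numbers below are placeholders, not the real Seattle or NYC values.

```python
from collections import defaultdict


def yearly_totals(monthly_rain):
    """Sum monthly rainfall, keyed by (year, month), into per-year totals."""
    totals = defaultdict(float)
    for (year, _month), inches in monthly_rain.items():
        totals[year] += inches
    return dict(totals)


def years_where_first_exceeds(first, second):
    """Return {year: difference in inches} for years where first > second.

    Only years present in both sets of yearly totals are compared.
    """
    shared_years = sorted(set(first) & set(second))
    return {
        year: round(first[year] - second[year], 2)
        for year in shared_years
        if first[year] > second[year]
    }


# Placeholder monthly totals in inches, NOT real data for either city.
city_a = {(2000, 1): 4.0, (2000, 2): 2.5, (2001, 1): 3.0}
city_b = {(2000, 1): 3.0, (2000, 2): 2.0, (2001, 1): 5.5}

wetter_years = years_where_first_exceeds(
    yearly_totals(city_a), yearly_totals(city_b)
)
```

Swapping the two arguments gives the years where the second city comes out wetter, along with the size of the gap.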
For no other reason than the fact that I could, I decided to query one more city. Here are some results for Los Angeles via LAX.
Well, that's all I've got for data today. If you want to play with the data yourself, or use my code to reproduce some of this stuff, shoot me an email. I'll send you my code and the .tar files for each city. My email's on here somewhere right? No? nhuntwalker -at- gmail -dot- com.
- Jim Davenport for inspiring me to try some data viz stuff and providing me with a gateway to my source data. Check his blog out. It's awesome!
- Yusra AlSayyad for her Python wizardry.
- Weather Underground for their massive amounts of data.
- Python, for existing and being free.
- AstroML: Machine Learning and Data Mining for Astronomy. It's an awesome package for Python consisting of various tools that we use to analyze and visualize data in astronomy. It was produced by Zeljko Ivezic (my advisor), Andrew Connolly, Jacob VanderPlas (former grad student, now postdoc), and Alex Gray.
- wget. This is a terminal command you can use to *ahem* borrow things from websites. Used extensively here.