In the previous post, we looked at using Datameer 2.0 to analyze some Federal Election Commission data. In this post, I will show some of the analysis I performed on that data, specifically using table joins, filters, sorts and a couple of functions, and then present a couple of infographics.
The problem statement for this exercise is to determine the top 15 Virginia candidates in terms of total campaign contributions and then show the source of those contributions (in-state vs. out-of-state).
As mentioned previously, I created separate workbooks for each of the data sets (committees, candidates and individual contributors). In the candidate data, I used the filter feature to create a worksheet of only Virginia candidates.
I created a new worksheet in order to join the Virginia candidates with the committees that are supporting them. I imported the filtered candidate data as well as the committee data. Using Datameer’s join capability, I was able to easily join the two data sets in order to create a new worksheet. The join feature allows the user to select the columns to join and also select which columns to include in the joined result. If you forget what type of join you want to use, a link to a wiki describing the joins is available.
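For readers who think in code, the join described above is roughly equivalent to the sketch below. This is not how Datameer implements it. The field names (`cand_id`, `cand_name`, `cmte_id`) and the sample rows are hypothetical, and I am assuming an inner join on a shared candidate ID.

```python
# Hypothetical records; field names and values are illustrative, not the FEC schema.
candidates = [
    {"cand_id": "C1", "cand_name": "Candidate A", "state": "VA"},
    {"cand_id": "C2", "cand_name": "Candidate B", "state": "VA"},
]
committees = [
    {"cmte_id": "M1", "cmte_name": "Committee One", "cand_id": "C1"},
    {"cmte_id": "M2", "cmte_name": "Committee Two", "cand_id": "C1"},
    {"cmte_id": "M3", "cmte_name": "Committee Three", "cand_id": "C9"},
]


def inner_join(left, right, key, keep):
    """Inner-join two lists of dicts on `key`, keeping only the columns in `keep`."""
    # Index the right-hand rows by join key so each lookup is cheap.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        # Rows without a match on the other side are dropped (inner join).
        for rrow in index.get(lrow[key], []):
            merged = {**rrow, **lrow}
            joined.append({col: merged[col] for col in keep})
    return joined


# Choose the join column and the columns to carry into the result.
candidate_committees = inner_join(
    candidates,
    committees,
    key="cand_id",
    keep=["cand_id", "cand_name", "cmte_id"],
)
```

Choosing `key` and `keep` mirrors the two choices the join dialog asks for: which columns to join on and which columns to include in the result.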
One of the impressive things about Datameer is that it shows you the sheet dependencies, that is, where the data came from. For Virginia Contributions, I joined the worksheet of candidates/committees with the individual campaign contributions. I then created two new filtered worksheets: one for contributors whose address lists VA as the state, and a second for contributors from all other states. An example of the dependencies is below:
Once that worksheet was created, I created a new worksheet and then used the Datameer groupby and groupbysum functions to total up the contributions to each candidate.
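Conceptually, the group-and-sum step works like the minimal sketch below. It only illustrates the idea, not Datameer's functions themselves; the field names (`cand_name`, `amount`) and the sample amounts are made up.

```python
from collections import defaultdict

# Hypothetical contribution rows; names and amounts are illustrative only.
contributions = [
    {"cand_name": "Candidate A", "amount": 500},
    {"cand_name": "Candidate B", "amount": 250},
    {"cand_name": "Candidate A", "amount": 1000},
]


def total_by_candidate(rows):
    """Group rows by candidate and sum the contribution amounts per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["cand_name"]] += row["amount"]
    return dict(totals)


totals = total_by_candidate(contributions)
```

The same function can be run once on the Virginia worksheet and once on the out-of-state worksheet to get two sets of per-candidate totals.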
After that, I joined the last two worksheets and filtered for the top 15 candidates by total contributions, which left me with this worksheet.
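The join-and-rank step can be sketched as follows. I am assuming here that the two worksheets being joined hold per-candidate totals for Virginia and non-Virginia contributors, and that a candidate missing from one side counts as zero; the names and amounts are hypothetical.

```python
# Hypothetical per-candidate totals from the two filtered worksheets.
in_state = {"Candidate A": 300, "Candidate B": 900, "Candidate C": 50}
out_of_state = {"Candidate A": 1200, "Candidate B": 100}


def top_n_by_total(in_state, out_of_state, n):
    """Combine both totals per candidate and return the n largest by overall total."""
    rows = []
    for name in set(in_state) | set(out_of_state):
        # Assumption: a candidate absent from one worksheet contributed 0 there.
        va = in_state.get(name, 0)
        other = out_of_state.get(name, 0)
        rows.append(
            {
                "cand_name": name,
                "in_state": va,
                "out_of_state": other,
                "total": va + other,
            }
        )
    # Sort by total, largest first; break ties by name so the order is stable.
    rows.sort(key=lambda r: (-r["total"], r["cand_name"]))
    return rows[:n]


# The post uses n=15; n=2 keeps the toy example small.
top = top_n_by_total(in_state, out_of_state, n=2)
```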
Infographics or Cool Charts
Once the data was available, it was time to create a bar chart of it. Datameer’s extensive library of widgets includes tables, graphs, charts, diagrams, maps, and tag clouds, which enable users to create simple dashboards or stunning business infographics and visualizations. Datameer’s WYSIWYG Editor speeds the creation of compelling and insightful business infographics. The Editor includes a graphics Inspector, a simple but powerful tool for configuring graphic and text elements, including colors and fonts. The Editor provides a real-time view of all graphic and text elements as they are created and edited, so the final visualizations are rendered exactly as intended.
For this case, I chose a rather simple bar chart. Once the chart is placed in the WYSIWYG editor, you simply drag the fields from the list of worksheets and the chart takes shape. There are widget settings for each type of display, allowing you to customize the look and layout of the final product. You can see the stacked bar chart below:
This graph shows the breakdown of contributions from Virginia addresses versus non-Virginia addresses. The total amount is blue, the amount from Virginia is orange, and the amount from outside Virginia is green. You can see that Cantor and Kaine have received a majority of their campaign contributions from outside of Virginia.
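The "majority from outside Virginia" reading of the chart comes down to a simple share calculation. Here is a small sketch with invented numbers; the field names and amounts are hypothetical and are not the actual figures for any candidate.

```python
def out_of_state_share(in_state, out_of_state):
    """Return the fraction of total contributions that came from outside the state."""
    total = in_state + out_of_state
    if total == 0:
        return 0.0  # avoid dividing by zero when a candidate has no contributions
    return out_of_state / total


# Hypothetical rows shaped like the top-15 worksheet.
rows = [
    {"cand_name": "Candidate A", "in_state": 300, "out_of_state": 1200},
    {"cand_name": "Candidate B", "in_state": 900, "out_of_state": 100},
]

# Candidates whose out-of-state share exceeds half of their total.
majority_out_of_state = [
    r["cand_name"]
    for r in rows
    if out_of_state_share(r["in_state"], r["out_of_state"]) > 0.5
]
```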
One of the last charts I used was a map. Data with latitude and longitude coordinates is automatically plotted on the map. In this case, I filtered the contributions down to those for one Virginia candidate (Tim Kaine) and geocoded a portion of the data. That information was then loaded into a worksheet and plotted on the map infographic shown below.
The data did not include any Virginia contributions, but at a glance you can see a large number of contributions to the Kaine campaign from the southern US.
Datameer 2.0 opens up data analytics to more people by making it relatively easy. I didn’t need to stand up a Hadoop cluster in order to get started. With a little bit of knowledge and a few questions, I was able to easily drill down into the data for answers.