Author: dave fauth

Extracting Insights from FBO.Gov data – Part 2

Earlier this year, the Sunlight Foundation filed a lawsuit under the Freedom of Information Act. The lawsuit requested solicitation and award notices from FBO.gov. In November, Sunlight received over a decade’s worth of information and posted it online for public download. I want to say a big thanks to Ginger McCall and Kaitlin Devine for […]

Extracting Insights from FBO.Gov data – Part 1

Earlier this year, the Sunlight Foundation filed a lawsuit under the Freedom of Information Act. The lawsuit requested solicitation and award notices from FBO.gov. In November, Sunlight received over a decade’s worth of information and posted it online for public download. I want to say a big […]

Hadoop to Neo4J

Leading up to GraphConnect NY, I was distracting myself from working on my talk by determining whether there was any way to import data directly from Hadoop into a graph database, specifically Neo4j. Previously, I had written some Pig jobs to output the data into various files and then used the Neo4j BatchInserter to load […]

Creating an Elasticsearch index of Congress Bills using Pig

Recently, Mortar worked to get CPython support for Pig committed into the Apache Pig trunk. This now allows users to take advantage of Hadoop with real Python: you focus on just the logic you need, and streaming Python takes care of all the plumbing. Shortly thereafter, Elasticsearch announced integration with Hadoop. “Using Elasticsearch […]
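The streaming Python support described above lets a plain Python function act as a Pig UDF. Below is a minimal sketch: the `pig_util`/`streaming_python` registration conventions are the real Apache Pig ones, but the title-normalizing function itself is a hypothetical example, not code from the original post.

```python
# Sketch of a CPython (streaming Python) UDF for Apache Pig.
# The pig_util import is only available when Pig runs the script,
# so we fall back to a no-op decorator for standalone testing.
try:
    from pig_util import outputSchema  # provided by Pig's streaming_python
except ImportError:
    def outputSchema(schema):
        def wrap(fn):
            return fn
        return wrap

@outputSchema('title:chararray')
def normalize_title(raw):
    """Lowercase a bill title and collapse runs of whitespace."""
    if raw is None:
        return None
    return ' '.join(raw.split()).lower()

# In a Pig script this would be registered and called like:
#   REGISTER 'udfs.py' USING streaming_python AS my_udfs;
#   bills = FOREACH raw_bills GENERATE my_udfs.normalize_title(title);
```

The appeal, as the post notes, is that the function body is ordinary Python; Pig handles serialization between the Hadoop job and the Python process.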

Health Insurance Marketplace Costs

Data.Healthcare.Gov released QHP cost information for various health care plans in states in the Federally-Facilitated and State-Partnership Marketplaces. The data is available in a variety of formats and lays out costs for various levels of health care plans (Gold, Silver, Bronze, and Catastrophic) for different categories. Premium Information: premium amounts do not include tax credits […]

Part 2 – Building an Enhanced DocGraph Dataset using Mortar (Hadoop) and Neo4J

In the last post, I talked about creating the enhanced DocGraph dataset using Mortar and Neo4J. Our data model looks like the following: Nodes: Organizations, Specialties, Providers, Locations, CountiesZip, Census. Relationships: Organizations -[:PARENT_OF]- Providers -[:SPECIALTY]- Specialties; Providers -[:LOCATED_IN]- Locations; Providers -[:REFERRED]- Providers; Counties -[:INCOME_IN]- CountiesZip; Locations -[:LOCATED_IN]- Locations. Each of the […]
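The relationship patterns in the excerpt's data model map naturally onto Cypher. The sketch below restates them as a query; the relationship directions and the `name` property are assumptions for illustration, not details from the post.

```cypher
// Sketch of the DocGraph model from the excerpt as a Cypher query:
// providers in an organization, their specialties, and who they referred.
// Directions and the {name: ...} property are assumed, not from the post.
MATCH (o:Organization {name: 'Example Org'})-[:PARENT_OF]->(p:Provider)
MATCH (p)-[:SPECIALTY]->(s:Specialty)
MATCH (p)-[:REFERRED]->(other:Provider)
RETURN p, s, other;
```

A query like this is the payoff of the graph model: referral chains and specialty lookups become pattern matches rather than multi-way joins.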

Building an Enhanced DocGraph Dataset using Mortar (Hadoop) and Neo4J

“The average doctor has likely never heard of Fred Trotter, but he has some provocative ideas about using physician data to change how healthcare gets delivered.” This was from a recent Gigaom article. You can read more details about DocGraph from Fred Trotter’s post. The basic data set is just three columns: two separate NPI […]

Recommender Tips, Mortar and DocGraph

Jonathan Packer wrote on Mortar’s blog about flexible recommender models. Jonathan articulates that “from a business perspective the two most salient advantages of graph-based models: flexibility and simplicity.” Some of the salient points made in the article are: graph-based models are modular and transparent; a simple graph-based model will allow you to build a viable recommender system […]

Chicago Sacred Heart Hospital – Medicare Kickback Scheme

According to an April 16, 2013 FBI press release, Chicago Sacred Heart Hospital Owner, Executive, and Four Doctors Arrested in Alleged Medicare Referral Kickback Conspiracy. From the press release: CHICAGO—The owner and another senior executive of Sacred Heart Hospital and four physicians affiliated with the west side facility were arrested today for allegedly conspiring to […]

DocGraph Analysis using Hadoop and D3.JS

Visualizing the DocGraph for Wyoming Medicare Providers. I have been participating in the DocGraph MedStartr project. After hearing about the project at GraphConnect 2012, I wanted to use this data to investigate additional capabilities of Hadoop and big data processing. You can read some excellent work already being done on this data here courtesy of Janos. […]