Increasing the Value of Data by Cutting Clutter and Duplication
This is a cross-post from HealthData.gov.
By Bruce Greenstein, HHS Chief Technology Officer; Mona Siddiqui, HHS Chief Data Officer; and Kate Appel, Operations Lead for HealthData.gov
HealthData.gov launched in 2011 with 30 data sets; as of 2017, more than 1,900 were available. We are proud of HHS' commitment to continuously and responsibly release publicly funded data.
The success of HealthData.gov should not be measured merely by the number of data sets published. Rather, the data available to the public needs to be discoverable, usable, timely, and high-quality.
Within HHS, we are also committed to making sure data is actionable and can inform decision-making across the Department. The Office of the CTO is leading an effort called the Data Insights Initiative to unlock, share and connect data in responsible ways to improve how HHS delivers on its mission.
Members of the Office of the CTO team have also traveled around the country, meeting with and listening to members of our community and asking: how can we make HealthData.gov more usable and deliver more value?
We listened, and today we are updating the technical platform of HealthData.gov to reduce clutter and duplication. While this will result in fewer data sets in total, we believe it will increase the overall quality and usability of the Department's premier open data website. During this update, we will also make changes to maintain website security, which is of the utmost priority to us.
We remain open to feedback and would like to hear from our users. Please send comments to email@example.com.
What data sets are affected
A quick note about where the data comes from: the data presented on HealthData.gov is often first shared by one of our partner government organizations at the federal and state levels. Our partners provide data on an open portal, and that data is brought together on HealthData.gov for your convenience and discovery.
The updated version of the website will scan data and will be pickier about what shows up when you search on HealthData.gov. In essence, the update will filter out data entries whose links to the data are broken.
During this update, data sets were removed under the following criteria:
- The data set was a duplicate entry: more than one instance of the data set existed on HealthData.gov.
- The data feed of a partner organization was changed, moved, removed, or was otherwise not discoverable after outreach to that organization. The data entry had a broken or empty link.
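For readers who maintain their own data catalogs, the two criteria above can be sketched in a few lines of Python. This is only an illustration, not a description of how the HealthData.gov platform works: the entry fields (title, url), the duplicate key, and the function names are all hypothetical, and the link check is deliberately offline (it tests only that a link is non-empty and shaped like a web address, not that it actually resolves).

```python
from urllib.parse import urlparse


def has_usable_link(url):
    """Offline check only: the link is non-empty and looks like an http(s) URL."""
    if not url or not url.strip():
        return False
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def prune_catalog(entries, link_ok=has_usable_link):
    """Split entries into (kept, removed); each removed item carries a reason."""
    kept, removed, seen = [], [], set()
    for entry in entries:
        title = (entry.get("title") or "").strip().lower()
        url = (entry.get("url") or "").strip()
        # Assumption: two entries are duplicates when title and link both match.
        key = (title, url)
        if not link_ok(url):
            removed.append((entry, "broken or empty link"))
        elif key in seen:
            removed.append((entry, "duplicate entry"))
        else:
            seen.add(key)
            kept.append(entry)
    return kept, removed


# Made-up sample entries, for illustration only.
catalog = [
    {"title": "Sample data set A", "url": "https://example.org/a.csv"},
    {"title": "sample data set a ", "url": "https://example.org/a.csv"},
    {"title": "Sample data set B", "url": ""},
]
kept, removed = prune_catalog(catalog)
for entry, reason in removed:
    print(f"removed {entry['title']!r}: {reason}")
```

The link check is passed in as a parameter so that a stricter test, such as one that actually requests each link, could be swapped in without changing the rest of the logic.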
As with any change to the site, we'd like to hear from our users. You can reach us at firstname.lastname@example.org.