Show how public data are being used in science so that the government can make more transparent public investments. By using automated NLP approaches, we enable government agencies and researchers to quickly find the information they need.
The Democratizing Data project is inspired by the 2018 Foundations for Evidence-based Policymaking Act. Its goal is to facilitate collaboration between federal agencies and the public in order to understand how government data assets are used. The intent is to engage the public by providing information about how the assets are used and by expanding the use of public data assets.
As an initial step in meeting that goal, the Search and Discovery Platform describes how datasets identified by federal agencies have been used in scientific research. It uses machine learning algorithms to search over 90 million documents and find how datasets are cited, in which publications they appear, and which topics they are used to study.
The platform gives users several ways to answer usage questions (a usage dashboard, Jupyter notebooks, and an API) and engages the community through workshops.
Dashboards mainly illustrate how data are being used for research, the primary topics and researchers, and the affiliated institutions.
The Jupyter Notebooks are structured to enable researchers to access a database that contains the metadata and perform their own queries.
For sophisticated users, the information can be accessed through an API.
User feedback on the tools and the usage measures is very important. Keep checking this site for current and future workshop information, as well as other events.
12:00 pm-1:00 pm EST | Virtual
An online USDA/ERS & NASS Democratizing Data Joint Info Session.
Using the retail scanner data made available by the Economic Research Service at the U.S. Department of Agriculture, Dr. Chen Zhen constructed panel price indexes that provide valuable insights into how changes in food prices can affect consumer behavior and ultimately impact public health outcomes.
Learn more by listening to the Show Us the Data Podcast, Episode 3.
Dr. Tiffany Oliver's work with the Survey of Earned Doctorates (SED) conducted by the National Center for Science and Engineering Statistics (NCSES) provides important insights into the experiences of Black women in STEM fields and their journey to earning a PhD.
Learn more by listening to the Show Us the Data Podcast, Episode 4.
A data ecosystem overview, presented at the COPAFS Quarterly meeting on March 3, 2023, showcased four agency use cases. The presenters were Peggy Carr (NCES), Kevin Barnes (NASS), Emilda Rivers (NCSES), and Spiro Stefanou (ERS). The panel was moderated by Nancy Potok, former Chief Statistician of the United States.
Learn more by viewing the COPAFS Search and Discovery Platform slide deck.
The Democratizing Data initiative is working with a number of government agencies to ensure that data are more effectively used for public decision-making. We are partnering with:
This project is a joint effort by: