Technical Report
This report presents a methodology for identifying dataset mentions in research publications across citation databases and for evaluating how coverage differs among those databases.
What is the Issue?
Federal datasets are widely used in research, but tracking their impact requires identifying where and how they are cited in publications. Tools like the Food and Agricultural Research Data Usage Dashboard rely on these mentions to measure reach, which in turn informs future data investments.
How to Use This Report
This report is preliminary. It provides an initial approach to characterizing mentions of food and agriculture research datasets in research papers indexed in three citation databases: Scopus, OpenAlex, and Dimensions. It includes procedures for:
- Identifying publication coverage across citation databases
- Cross-referencing publications between databases
- Analyzing research themes and institutional representation
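The first two procedures can be illustrated with a minimal sketch. It assumes publications are matched on DOI, which is one common matching key and not necessarily the one used in the report. The function names, the dictionary layout, and the sample DOIs are all hypothetical and exist only for illustration.

```python
# Illustrative sketch only: assumes each database export has been reduced to a
# list of DOI strings. All names and sample DOIs here are hypothetical.

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Lowercase a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip()


def coverage_report(records_by_db):
    """Compare DOI coverage across databases.

    records_by_db maps a database name to an iterable of raw DOI strings.
    Returns the size of the union, the DOIs found in every database, and
    the DOIs found in exactly one database (grouped by that database).
    """
    dois = {
        db: {normalize_doi(d) for d in raw if d and d.strip()}
        for db, raw in records_by_db.items()
    }
    all_dois = set().union(*dois.values())
    shared = set.intersection(*dois.values()) if dois else set()
    unique = {}
    for db, own in dois.items():
        others = set().union(*(s for name, s in dois.items() if name != db))
        unique[db] = sorted(own - others)
    return {"total": len(all_dois), "shared": sorted(shared), "unique": unique}


# Made-up DOIs written in the different forms an export might contain.
sample = {
    "Scopus": ["10.1234/example.001", "https://doi.org/10.1234/EXAMPLE.002"],
    "OpenAlex": ["https://doi.org/10.1234/example.001", "10.1234/example.003"],
    "Dimensions": ["doi:10.1234/example.001", "10.1234/example.002"],
}

report = coverage_report(sample)
print(report)
```

Normalizing before comparing matters because the same DOI may appear as a bare identifier in one export and as a resolver URL in another. Records without a DOI would need a fallback key such as title and year, which this sketch does not attempt.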
Stakeholder Applications
Research Evaluation
- Track and evaluate the use of public datasets in academic research
- Improve methods for measuring research impact and dataset reach
Strategic Planning
- Understand coverage differences across citation databases
- Inform decisions about data preservation, access, and investment
Note: The methods described can also be applied to other citation databases, such as Web of Science, Crossref, and Microsoft Academic.
Report Features
The report features these reusable components:
- Code Repository: Data cleaning and standardization tools
- Data Schemas: Structured schemas by citation database
- Standardized Institution Tables: Institution tables using IPEDS identifiers
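As a rough illustration of what a standardized institution table makes possible, the sketch below maps free-text affiliation strings to an identifier through a normalized name key. It is not the report's code: the function names, the institution names, and the identifier values are placeholders, not real IPEDS records.

```python
import re

# Illustrative sketch only: institution names and identifiers below are
# placeholders, not entries from the actual IPEDS tables.


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


# Hypothetical lookup table keyed on the normalized institution name.
IPEDS_LOOKUP = {
    normalize_name("Example State University"): "000001",
    normalize_name("Sample College of Agriculture"): "000002",
}


def match_institution(raw_affiliation, lookup=IPEDS_LOOKUP):
    """Return the identifier for an affiliation string, or None if unmatched."""
    return lookup.get(normalize_name(raw_affiliation))


print(match_institution("  EXAMPLE  State University. "))
print(match_institution("Unknown Institute"))
```

Exact matching on a normalized key is deliberately conservative. Affiliation strings that include departments, campuses, or abbreviations would go unmatched here and would need either manual review or fuzzy matching.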
How to Cite Our Work
Please use the following citation when referencing our methodology.