The Data Citation Index was launched in November 2012 and is designed to quantify the impact and reuse of research data. This is still quite a new area for many of us, and a recent working paper by Torres-Salinas et al. looks at the coverage of the DCI in terms of disciplines, document types and repositories. The authors' analysis estimates that:
- 80% of the records included in the index are classified as Science, 18% as Social Sciences and 2% as Humanities & Arts records. Engineering & Technology is almost non-existent, accounting for less than 0.1% of the total records.
- The DCI uses three document types: repositories, data sets and data studies. There are 96 data repositories, and the predominant type is the data set, which accounts for 94% of the entire database. The third document type, data studies, comprises around 6% of the total records included in the index.
- 64 of the 96 repositories included in the index contain at least 100 records. However, there is a very high concentration across just four repositories, which together account for 75% of all records from repositories in the DCI: Gene Expression Omnibus, UniProt Knowledgebase, PANGAEA and U.S. Census Bureau TIGER/Line Shapefiles.