A Directory of Datasets for Mining Software Repositories
Blog Article
The amount of software engineering data is constantly growing, as more and more developers employ online services to store their code, keep track of bugs, or even discuss issues. The data residing in these services can be mined to address different research challenges; therefore, certain initiatives have been established to encourage sharing the research datasets that collect this data. In this work, we investigate the effect of one such initiative; we create a directory that includes the papers and the corresponding datasets of the data track of the Mining Software Repositories (MSR) conference.
Specifically, our directory includes metadata and citation information for the papers of all data tracks throughout the last twelve years. We also annotate the datasets according to their data source and further assess their compliance with the FAIR principles. Using our directory, researchers can find useful datasets for their research, or even design methodologies for assessing dataset quality, especially in the software engineering domain.
Moreover, the directory can be used for analyzing the citations of data papers, especially with regard to different data categories, as well as for examining their FAIRness score throughout the years, along with its effect on the usage/citation of the datasets.