This project was part of the Chan Zuckerberg Initiative on "Mapping the Impact of Research Software in Science". We study the following questions:
- How are publications that do or do not mention software distributed across disciplines?
- How is different software used by researchers across their publications?
- What is the ‘proximity’ of scientific publications to the use of software? (ongoing)
We conduct a scientometric analysis of publications that mention software, matching software mentions with papers, authors, and disciplines.
- CZI mentions dataset (1.7 million PubMed IDs)
- [OpenAlex](https://openalex.org/)
- [SoftwareKG](https://zenodo.org/records/3715147)
- Google BigQuery (InSySPo project - Brazil)
- Databricks
- VOSviewer
- R
- Python
Match CZI and SoftwareKG software mentions with OpenAlex publications, using DOI and PMCID as identifiers.
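
The identifier-based matching can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual pipeline (which ran on BigQuery and Databricks): the record layout, the field names (`id`, `doi`, `pmcid`, `software`), and the sample values are all hypothetical. The sketch normalizes both identifiers, tries the DOI first, and falls back to the PMCID.

```python
def normalize_doi(doi):
    """Lowercase a DOI and strip common URL/scheme prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def normalize_pmcid(pmcid):
    """Return a PMCID in the form 'PMC1234567', or None if empty."""
    if not pmcid:
        return None
    # Keep only the last path segment in case the ID is given as a URL.
    pmcid = str(pmcid).strip().rstrip("/").rsplit("/", 1)[-1].upper()
    if not pmcid:
        return None
    return pmcid if pmcid.startswith("PMC") else "PMC" + pmcid


def match_mentions(mentions, works):
    """Attach a work ID to each mention: DOI first, then PMCID."""
    by_doi, by_pmcid = {}, {}
    for work in works:
        doi = normalize_doi(work.get("doi"))
        pmcid = normalize_pmcid(work.get("pmcid"))
        if doi:
            by_doi[doi] = work["id"]
        if pmcid:
            by_pmcid[pmcid] = work["id"]

    matched = []
    for mention in mentions:
        work_id = by_doi.get(normalize_doi(mention.get("doi")))
        if work_id is None:
            work_id = by_pmcid.get(normalize_pmcid(mention.get("pmcid")))
        # Unmatched mentions keep work_id=None so they can be inspected later.
        matched.append({**mention, "work_id": work_id})
    return matched


# Hypothetical sample records (not real data).
works = [
    {"id": "W1", "doi": "https://doi.org/10.1000/ABC", "pmcid": None},
    {"id": "W2", "doi": None, "pmcid": "PMC42"},
]
mentions = [
    {"software": "ImageJ", "doi": "10.1000/abc", "pmcid": None},
    {"software": "R", "doi": None, "pmcid": "42"},
    {"software": "SPSS", "doi": "10.9999/none", "pmcid": None},
]
matched = match_mentions(mentions, works)
```

Keeping unmatched mentions in the output, rather than dropping them, makes it easy to measure how much of each mentions dataset could be linked.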
Some software names in the CZI dataset were not disambiguated. We used fuzzy matching to identify similar software names and merged them before plotting our networks.
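
As an illustration of this kind of merging, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. The source does not say which fuzzy-matching method or threshold was used, so the similarity measure, the greedy grouping strategy, the `0.85` threshold, and the sample names are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge_similar_names(names, threshold=0.85):
    """Greedily map each name to the first earlier representative
    whose similarity reaches the threshold (assumed value)."""
    representatives = []
    mapping = {}
    for name in names:
        for rep in representatives:
            if similarity(name, rep) >= threshold:
                mapping[name] = rep
                break
        else:
            # No sufficiently similar representative: start a new group.
            representatives.append(name)
            mapping[name] = name
    return mapping


# Hypothetical sample names (not taken from the CZI dataset).
names = ["ImageJ", "Image J", "imagej", "SPSS", "GraphPad Prism"]
mapping = merge_similar_names(names)
```

Greedy grouping depends on input order and compares every name against every representative, so for a large set of names one would likely sort by mention frequency first and block candidates (for example by first letter) to limit comparisons.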
- Alexy Khrabrov
- Frank Krüger
- Fuqi Xu
- Huimin Xu
- Puyu Yang
- Rodrigo Costas
- Shahan Ali Memon