Analyse disciplinary differences of software mentions across different large scale software mention datasets

samemon/SoftwareImpactHackathon2023_DisciplinaryDifferences

Project name: Exploring Disciplinary Differences in Software Mentions

Project description

Project slides

This project was part of the Chan Zuckerberg Initiative on "Mapping the Impact of Research Software in Science". We study the following questions:

  • What is the distribution of publications mentioning (or not) software across disciplines?
  • How is different software used by researchers across their publications?
  • What is the ‘proximity’ of scientific publications to the use of software? (ongoing)

Methodology

We conduct a scientometric analysis of publications that mention software, linking software mentions to papers, authors, and disciplines.

Datasets

Software/Tools

  • Google BigQuery (InSySPo project - Brazil)
  • Databricks
  • VOSviewer
  • R
  • Python

Data collection

We match CZI software mentions and SoftwareKG mentions with OpenAlex publications, using the DOI and PMCID as identifiers.
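The matching step can be sketched as a two-key lookup: try the DOI first, then fall back to the PMCID. This is a minimal illustration and not the project's actual pipeline. The record fields (`software`, `doi`, `pmcid`, `work_id`, `discipline`), the sample values, and the helper names are all hypothetical, and the DOI normalization assumes DOIs may appear with or without a resolver prefix.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip a resolver prefix so equal DOIs compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def match_mentions(mentions, publications):
    """Attach a publication to each mention, trying DOI first and PMCID second."""
    by_doi, by_pmcid = {}, {}
    for pub in publications:
        doi = normalize_doi(pub.get("doi"))
        if doi:
            by_doi[doi] = pub
        if pub.get("pmcid"):
            by_pmcid[pub["pmcid"].upper()] = pub

    matched = []
    for mention in mentions:
        doi = normalize_doi(mention.get("doi"))
        pub = by_doi.get(doi) if doi else None
        if pub is None and mention.get("pmcid"):
            pub = by_pmcid.get(mention["pmcid"].upper())
        if pub is not None:  # unmatched mentions are dropped
            matched.append({
                "software": mention["software"],
                "work_id": pub["work_id"],
                "discipline": pub["discipline"],
            })
    return matched


# Hypothetical sample records (not real data from either dataset).
publications = [
    {"work_id": "W1", "doi": "https://doi.org/10.1000/ABC", "pmcid": "PMC111", "discipline": "Biology"},
    {"work_id": "W2", "doi": None, "pmcid": "PMC222", "discipline": "Physics"},
]
mentions = [
    {"software": "ImageJ", "doi": "10.1000/abc", "pmcid": None},
    {"software": "MATLAB", "doi": None, "pmcid": "pmc222"},
    {"software": "SPSS", "doi": "10.9999/none", "pmcid": None},
]
matched = match_mentions(mentions, publications)
```

At the scale of the full datasets, the same join would more plausibly run as a SQL query in BigQuery or Databricks; the dictionary lookup here only shows the logic.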

Software name disambiguation in CZI dataset

Some software names in the CZI dataset were not disambiguated. We used fuzzy matching to identify similar software names and merged them before plotting our networks.
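One way to do this kind of merging is sketched below. The README does not say which fuzzy-matching method or threshold was used, so the use of `difflib.SequenceMatcher` from the Python standard library, the 0.85 threshold, the greedy merge order, and the sample counts are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def similar(a, b, threshold=0.85):
    """True if the normalized names have a similarity ratio at or above the threshold."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def merge_names(counts, threshold=0.85):
    """Greedy merge: visit names from most to least frequent; each name joins
    the first canonical name it resembles, otherwise it becomes canonical."""
    canonical = {}  # canonical name -> merged mention count
    mapping = {}    # raw name -> canonical name
    for name, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        target = next((c for c in canonical if similar(name, c, threshold)), name)
        canonical[target] = canonical.get(target, 0) + n
        mapping[name] = target
    return mapping, canonical


# Hypothetical mention counts (not taken from the CZI dataset).
counts = {
    "ImageJ": 120, "Image J": 15, "imagej": 8,
    "GraphPad Prism": 90, "Graphpad prism": 12,
    "SPSS": 70,
}
mapping, canonical = merge_names(counts)
```

Visiting the most frequent spelling first makes it the canonical label for its group. The pairwise comparison is quadratic in the number of distinct names, so a real run over the full dataset would likely need blocking or a dedicated fuzzy-matching library.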

Findings

Top software per discipline

top software per discipline

Software mentions per discipline across time

software mentions across disciplines across time

Software mention networks

Using the CZI dataset (1.7 million publications)

software network mentions in CZI dataset

Using the SoftwareKG dataset

software network mentions in KG dataset
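The README does not describe how the networks above were constructed. A common construction for such maps is a co-mention network, in which two software packages are linked whenever the same publication mentions both. The sketch below assumes that construction; the function name and the sample input are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_mention_edges(paper_to_software):
    """Count, for every pair of software names, how many papers mention both."""
    edges = Counter()
    for software in paper_to_software.values():
        # Deduplicate within a paper and sort so each pair has one canonical order.
        for a, b in combinations(sorted(set(software)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical input: publication id -> software names mentioned in it.
paper_to_software = {
    "W1": ["R", "Python", "R"],
    "W2": ["Python", "R"],
    "W3": ["ImageJ"],
    "W4": ["ImageJ", "R"],
}
edges = co_mention_edges(paper_to_software)
```

The resulting weighted edge list could then be exported in whatever network format the visualization tool expects.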

Software network differences across contrasting disciplines

software mention networks comparison

Future work

Software dependency per domain

[Figure: software dependency per domain]

Software dependency domain comparison

[Figure: software dependency domain comparison]

Contributors

  • Alexy Khrabrov
  • Frank Krüger
  • Fuqi Xu
  • Huimin Xu
  • Puyu Yang
  • Rodrigo Costas
  • Shahan Ali Memon

Languages

  • Jupyter Notebook 80.8%
  • R 16.9%
  • Python 2.3%