Predatory journal checker (JabRef#10592)
* add PredatoryJournalRepository class

* Refactor PredatoryJournalRepository to match JournalAbbreviationRepository design

* Add PredatoryJournalChecker and PredatoryJournalLoader classes

* Add Integrity Message for en Resources

* Integrate PredatoryJournalChecker into IntegrityCheck

* Initialize PredatoryJournalRepository on Launch

* Add PredatoryJournalCheckerTest and more logging

* Refactor PredatoryJournalLoader to switch from temp dir to user's app data

* Add MV file generation for predatory journal lists

* update CHANGELOG.md

* Refactor and create own record class for journal information
move to own gradle task and file

* checkstyle, rename methods in gradle

* run rewrite

* fix test

* fix duplicate handling

* fix gradle task
fix zwsp in name

* more exception handling

* Make serializable to fix mvstore

Just use simpler Levenshtein distance, should be enough
add javadoc

* checkstyle

* fuck you checkstyle

* checkstyle

* refactor test

* use same copy behavior as for journal abbrevs

* make link elements non static

fix checkstyle

* fix static vars

* Split loader into crawler and "real" loader

* Checkstyle

* Fix variable type

---------

Co-authored-by: Siedlerchr <[email protected]>
Co-authored-by: Oliver Kopp <[email protected]>
3 people authored Dec 5, 2023
1 parent 67e25a0 commit d03a43c
Showing 19 changed files with 467 additions and 16 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv
- We added a button to let users reset the cite command to the default value. [#10569](https://github.com/JabRef/jabref/issues/10569)
- We added the option to use System Preference for Light/Dark Theme [#8729](https://github.com/JabRef/jabref/issues/8729).
- We added [scholar.archive.org](https://scholar.archive.org/) as a new fetcher. [#10498](https://github.com/JabRef/jabref/issues/10498)
- We integrated predatory journal checking into the Integrity Checker, based on [check-bib-for-predatory](https://github.com/CfKu/check-bib-for-predatory). [koppor#348](https://github.com/koppor/jabref/issues/348)
- We added a 'More options' section in the main table right click menu opening the preferences dialog. [#9432](https://github.com/JabRef/jabref/issues/9432)

### Changed
15 changes: 13 additions & 2 deletions build.gradle
@@ -327,8 +327,19 @@ tasks.register("generateJournalListMV", JavaExec) {
!file("build/resources/main/journals/journal-list.mv").exists()
}
}
jar.dependsOn "generateJournalListMV"
compileTestJava.dependsOn "generateJournalListMV"

tasks.register("generatePredatoryJournalListMV", JavaExec) {
group = "JabRef"
description = "Load predatory journal information from online sources to a H2 MVStore"
classpath = sourceSets.main.runtimeClasspath
mainClass = "org.jabref.cli.PredatoryJournalsMvGenerator"
onlyIf {
!file("build/resources/main/journals/predatory-journals.mv").exists()
}
}

jar.dependsOn("generateJournalListMV", "generatePredatoryJournalListMV")
compileTestJava.dependsOn("generateJournalListMV", "generatePredatoryJournalListMV")

tasks.register('generateCitaviSource', XjcTask) {
group = 'JabRef'
@@ -11,16 +11,16 @@ It is also possible to use IntelliJ's internal build and run system to launch Ja
Due to [IDEA-119280](https://youtrack.jetbrains.com/issue/IDEA-119280), it is a bit more work.

1. Navigate to **File > Settings... > Build, Execution, Deployment > Build Tools > Gradle**.
2. Change the setting "Build an run using:" to "IntelliJ IDEA".
2. Change the setting "Build and run using:" to "IntelliJ IDEA".
3. Navigate to **File > Settings... > Build, Execution, Deployment > Compiler > Java Compiler**.
4. Uncheck `--Use 'release' option for cross-compilation`.
5. **Build > Build Project**
6. Open the project view (<kbd>Alt</kbd>+<kbd>1</kbd>, on Mac <kbd>cmd</kbd>+<kbd>1</kbd>)
7. Copy all build resources to the folder of the build classes
1. Navigate to the folder `out/production/resources`
1. Navigate to the folder `build/resources/main`
2. Select all folders below (`bst`, `csl-locales`, ...)
3. Press <kbd>Ctrl</kbd>+<kbd>C</kbd> to mark them for copying
4. Select the folder `classes`
4. Select the folder `out/production/classes`
5. Press <kbd>Ctrl</kbd>+<kbd>V</kbd> to start the copy process
8. Locate the class `Launcher` (e.g., by <kbd>ctrl</kbd>+<kbd>N</kbd> and then typing `Launcher`). Press <kbd>Enter</kbd> to jump to that class.
<figure>
8 changes: 4 additions & 4 deletions src/main/java/org/jabref/cli/JournalListMvGenerator.java
@@ -37,15 +37,15 @@ public static void main(String[] args) throws IOException {

// we currently do not have good support for BibTeX strings
"journal_abbreviations_ieee_strings.csv"
);
);

Files.createDirectories(journalListMvFile.getParent());

try (DirectoryStream<Path> stream = Files.newDirectoryStream(abbreviationsDirectory, "*.csv");
MVStore store = new MVStore.Builder().
fileName(journalListMvFile.toString()).
compressHigh().
open()) {
fileName(journalListMvFile.toString()).
compressHigh().
open()) {
MVMap<String, Abbreviation> fullToAbbreviation = store.openMap("FullToAbbreviation");
stream.forEach(Unchecked.consumer(path -> {
String fileName = path.getFileName().toString();
3 changes: 3 additions & 0 deletions src/main/java/org/jabref/cli/Launcher.java
@@ -12,6 +12,7 @@
import org.jabref.gui.Globals;
import org.jabref.gui.MainApplication;
import org.jabref.logic.journals.JournalAbbreviationLoader;
import org.jabref.logic.journals.predatory.PredatoryJournalListLoader;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.net.ProxyAuthenticator;
import org.jabref.logic.net.ProxyPreferences;
@@ -169,6 +170,8 @@ private static void initGlobals(PreferencesService preferences) {
// Read list(s) of journal names and abbreviations
Globals.journalAbbreviationRepository = JournalAbbreviationLoader
.loadRepository(preferences.getJournalAbbreviationPreferences());
Globals.predatoryJournalRepository = PredatoryJournalListLoader
.loadRepository();

Globals.entryTypesManager = preferences.getCustomEntryTypesRepository();
Globals.protectedTermsLoader = new ProtectedTermsLoader(preferences.getProtectedTermsPreferences());
47 changes: 47 additions & 0 deletions src/main/java/org/jabref/cli/PredatoryJournalsMvGenerator.java
@@ -0,0 +1,47 @@
package org.jabref.cli;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

import org.jabref.logic.journals.predatory.PredatoryJournalInformation;
import org.jabref.logic.journals.predatory.PredatoryJournalListCrawler;

import org.h2.mvstore.MVMap;
import org.h2.mvstore.MVStore;

public class PredatoryJournalsMvGenerator {
public static void main(String[] args) throws IOException {
boolean verbose = (args.length == 1) && ("--verbose".equals(args[0]));

Path predatoryJournalsMvFile = Path.of("build", "resources", "main", "journals", "predatory-journals.mv");
Files.createDirectories(predatoryJournalsMvFile.getParent());

try (MVStore store = new MVStore.Builder()
.fileName(predatoryJournalsMvFile.toString())
.compressHigh()
.backgroundExceptionHandler((t, e) -> {
System.err.println("Exception occurred in Thread " + t + " with exception " + e);
e.printStackTrace();
})
.open()) {
MVMap<String, PredatoryJournalInformation> predatoryJournalsMap = store.openMap("PredatoryJournals");

PredatoryJournalListCrawler loader = new PredatoryJournalListCrawler();
Set<PredatoryJournalInformation> predatoryJournals = loader.loadFromOnlineSources();

var resultMap = predatoryJournals.stream().collect(Collectors.toMap(PredatoryJournalInformation::name, Function.identity(),
(predatoryJournalInformation, predatoryJournalInformation2) -> {
if (verbose) {
System.out.println("Double entry " + predatoryJournalInformation.name());
}
return predatoryJournalInformation2;
}));

predatoryJournalsMap.putAll(resultMap);
}
}
}
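The duplicate handling in the generator relies on the three-argument form of `Collectors.toMap`: without a merge function, `toMap` throws an `IllegalStateException` on the first repeated key. The following is a minimal sketch of that behavior, not JabRef code. `JournalInfo`, `DuplicateMergeSketch`, and the example URLs are hypothetical stand-ins for `PredatoryJournalInformation` and the crawled data.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for PredatoryJournalInformation (assumption, not the JabRef record).
record JournalInfo(String name, String abbr, String url) {
}

class DuplicateMergeSketch {

    // Indexes the journals by name. When two entries share a name, the merge
    // function keeps the later one, as the generator in the diff does.
    static Map<String, JournalInfo> byName(List<JournalInfo> journals) {
        return journals.stream()
                .collect(Collectors.toMap(JournalInfo::name, Function.identity(),
                        (first, second) -> second));
    }

    public static void main(String[] args) {
        List<JournalInfo> journals = List.of(
                new JournalInfo("Example Journal", "EJ", "https://example.org/a"),
                new JournalInfo("Example Journal", "EJ", "https://example.org/b"),
                new JournalInfo("Another Journal", "", "https://example.org/c"));

        Map<String, JournalInfo> result = byName(journals);
        System.out.println(result.size());                       // prints 2
        System.out.println(result.get("Example Journal").url()); // prints https://example.org/b
    }
}
```

Keeping the later entry is one possible policy. Keeping the first, or merging the URLs, would only require a different merge lambda.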
2 changes: 2 additions & 0 deletions src/main/java/org/jabref/gui/Globals.java
@@ -9,6 +9,7 @@
import org.jabref.gui.util.DefaultTaskExecutor;
import org.jabref.gui.util.TaskExecutor;
import org.jabref.logic.journals.JournalAbbreviationRepository;
import org.jabref.logic.journals.predatory.PredatoryJournalRepository;
import org.jabref.logic.protectedterms.ProtectedTermsLoader;
import org.jabref.logic.remote.RemotePreferences;
import org.jabref.logic.remote.server.RemoteListenerServerManager;
@@ -51,6 +52,7 @@ public class Globals {
* Only GUI code is allowed to access it, logic code should use dependency injection.
*/
public static JournalAbbreviationRepository journalAbbreviationRepository;
public static PredatoryJournalRepository predatoryJournalRepository;

/**
* This field is initialized upon startup.
1 change: 1 addition & 0 deletions src/main/java/org/jabref/gui/JabRefFrame.java
@@ -488,6 +488,7 @@ private void initLayout() {
taskExecutor,
dialogService,
Globals.journalAbbreviationRepository,
Globals.predatoryJournalRepository,
entryTypesManager,
undoManager,
Globals.getClipboardManager());
6 changes: 5 additions & 1 deletion src/main/java/org/jabref/gui/MainMenu.java
@@ -65,6 +65,7 @@
import org.jabref.logic.importer.IdFetcher;
import org.jabref.logic.importer.WebFetchers;
import org.jabref.logic.journals.JournalAbbreviationRepository;
import org.jabref.logic.journals.predatory.PredatoryJournalRepository;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.OS;
import org.jabref.model.entry.BibEntryTypesManager;
@@ -82,6 +83,7 @@ public class MainMenu extends MenuBar {
private final TaskExecutor taskExecutor;
private final DialogService dialogService;
private final JournalAbbreviationRepository abbreviationRepository;
private final PredatoryJournalRepository predatoryJournalRepository;
private final BibEntryTypesManager entryTypesManager;
private final UndoManager undoManager;
private final ClipBoardManager clipBoardManager;
@@ -95,6 +97,7 @@ public MainMenu(JabRefFrame frame,
TaskExecutor taskExecutor,
DialogService dialogService,
JournalAbbreviationRepository abbreviationRepository,
PredatoryJournalRepository predatoryJournalRepository,
BibEntryTypesManager entryTypesManager,
UndoManager undoManager,
ClipBoardManager clipBoardManager) {
@@ -107,6 +110,7 @@ public MainMenu(JabRefFrame frame,
this.taskExecutor = taskExecutor;
this.dialogService = dialogService;
this.abbreviationRepository = abbreviationRepository;
this.predatoryJournalRepository = predatoryJournalRepository;
this.entryTypesManager = entryTypesManager;
this.undoManager = undoManager;
this.clipBoardManager = clipBoardManager;
@@ -224,7 +228,7 @@ private void createMenu() {
quality.getItems().addAll(
factory.createMenuItem(StandardActions.FIND_DUPLICATES, new DuplicateSearch(frame, dialogService, stateManager, preferencesService, entryTypesManager, taskExecutor)),
factory.createMenuItem(StandardActions.MERGE_ENTRIES, new MergeEntriesAction(dialogService, stateManager, preferencesService)),
factory.createMenuItem(StandardActions.CHECK_INTEGRITY, new IntegrityCheckAction(frame, preferencesService, dialogService, stateManager, taskExecutor, abbreviationRepository)),
factory.createMenuItem(StandardActions.CHECK_INTEGRITY, new IntegrityCheckAction(frame, preferencesService, dialogService, stateManager, taskExecutor, abbreviationRepository, predatoryJournalRepository)),
factory.createMenuItem(StandardActions.CLEANUP_ENTRIES, new CleanupAction(frame, preferencesService, dialogService, stateManager, taskExecutor)),

new SeparatorMenuItem(),
@@ -14,6 +14,7 @@
import org.jabref.logic.integrity.IntegrityCheck;
import org.jabref.logic.integrity.IntegrityMessage;
import org.jabref.logic.journals.JournalAbbreviationRepository;
import org.jabref.logic.journals.predatory.PredatoryJournalRepository;
import org.jabref.logic.l10n.Localization;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;
@@ -29,19 +30,22 @@ public class IntegrityCheckAction extends SimpleCommand {
private final PreferencesService preferencesService;
private final StateManager stateManager;
private final JournalAbbreviationRepository abbreviationRepository;
private final PredatoryJournalRepository predatoryJournalRepository;

public IntegrityCheckAction(JabRefFrame frame,
PreferencesService preferencesService,
DialogService dialogService,
StateManager stateManager,
TaskExecutor taskExecutor,
JournalAbbreviationRepository abbreviationRepository) {
JournalAbbreviationRepository abbreviationRepository,
PredatoryJournalRepository predatoryJournalRepository) {
this.frame = frame;
this.stateManager = stateManager;
this.taskExecutor = taskExecutor;
this.preferencesService = preferencesService;
this.dialogService = dialogService;
this.abbreviationRepository = abbreviationRepository;
this.predatoryJournalRepository = predatoryJournalRepository;

this.executable.bind(needsDatabase(this.stateManager));
}
@@ -53,6 +57,7 @@ public void execute() {
preferencesService.getFilePreferences(),
preferencesService.getCitationKeyPatternPreferences(),
abbreviationRepository,
predatoryJournalRepository,
preferencesService.getEntryEditorPreferences().shouldAllowIntegerEditionBibtex());

Task<List<IntegrityMessage>> task = new Task<>() {
6 changes: 5 additions & 1 deletion src/main/java/org/jabref/logic/integrity/IntegrityCheck.java
@@ -6,6 +6,7 @@

import org.jabref.logic.citationkeypattern.CitationKeyPatternPreferences;
import org.jabref.logic.journals.JournalAbbreviationRepository;
import org.jabref.logic.journals.predatory.PredatoryJournalRepository;
import org.jabref.model.database.BibDatabase;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;
@@ -22,6 +23,7 @@ public IntegrityCheck(BibDatabaseContext bibDatabaseContext,
FilePreferences filePreferences,
CitationKeyPatternPreferences citationKeyPatternPreferences,
JournalAbbreviationRepository journalAbbreviationRepository,
PredatoryJournalRepository predatoryJournalRepository,
boolean allowIntegerEdition) {
this.bibDatabaseContext = bibDatabaseContext;

@@ -40,7 +42,9 @@ public IntegrityCheck(BibDatabaseContext bibDatabaseContext,
new CitationKeyDuplicationChecker(bibDatabaseContext.getDatabase()),
new AmpersandChecker(),
new LatexIntegrityChecker(),
new JournalInAbbreviationListChecker(StandardField.JOURNAL, journalAbbreviationRepository)
new JournalInAbbreviationListChecker(StandardField.JOURNAL, journalAbbreviationRepository),
new PredatoryJournalChecker(predatoryJournalRepository,
List.of(StandardField.JOURNAL, StandardField.PUBLISHER, StandardField.BOOKTITLE))
));
if (bibDatabaseContext.isBiblatexMode()) {
entryCheckers.add(new UTF8Checker(bibDatabaseContext.getMetaData().getEncoding().orElse(StandardCharsets.UTF_8)));
@@ -0,0 +1,29 @@
package org.jabref.logic.integrity;

import java.util.List;
import java.util.Objects;

import org.jabref.logic.journals.predatory.PredatoryJournalRepository;
import org.jabref.logic.l10n.Localization;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.Field;

public class PredatoryJournalChecker implements EntryChecker {

private final PredatoryJournalRepository predatoryJournalRepository;
private final List<Field> fieldNames;

public PredatoryJournalChecker(PredatoryJournalRepository predatoryJournalRepository, List<Field> fieldsToCheck) {
this.predatoryJournalRepository = Objects.requireNonNull(predatoryJournalRepository);
this.fieldNames = fieldsToCheck;
}

@Override
public List<IntegrityMessage> check(BibEntry entry) {
return entry.getFieldMap().entrySet().stream()
.filter(field -> fieldNames.contains(field.getKey()))
.filter(field -> predatoryJournalRepository.isKnownName(field.getValue()))
.map(field -> new IntegrityMessage(Localization.lang("Predatory journal %0 found", field.getValue()), entry, field.getKey()))
.toList();
}
}
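The checker above is a two-stage filter: keep only the configured fields, then keep only values the repository recognizes. A rough sketch of that pipeline with plain collections follows. Everything in it is hypothetical: `PredatoryCheckSketch`, string field names in place of `Field`, and a `Set` with exact matching in place of `PredatoryJournalRepository.isKnownName`, whose actual matching logic is not shown in this diff.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class PredatoryCheckSketch {

    // Returns one message per checked field whose value appears in knownNames.
    // Exact matching is an assumption; the real repository may match differently.
    static List<String> check(Map<String, String> fields,
                              List<String> fieldsToCheck,
                              Set<String> knownNames) {
        return fields.entrySet().stream()
                .filter(field -> fieldsToCheck.contains(field.getKey()))
                .filter(field -> knownNames.contains(field.getValue()))
                .map(field -> "Predatory journal " + field.getValue() + " found")
                .toList();
    }

    public static void main(String[] args) {
        Map<String, String> entry = Map.of(
                "journal", "Example Predatory Journal",
                "title", "Example Predatory Journal", // not a checked field, so ignored
                "publisher", "Some Publisher");

        List<String> messages = check(entry,
                List.of("journal", "publisher", "booktitle"),
                Set.of("Example Predatory Journal"));

        System.out.println(messages); // prints [Predatory journal Example Predatory Journal found]
    }
}
```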
@@ -0,0 +1,16 @@
package org.jabref.logic.journals.predatory;

import java.io.Serializable;

/**
* Represents predatory journal information
*
* @param name The full journal name
* @param abbr Abbreviation, if any
* @param url URL of the journal
*/
public record PredatoryJournalInformation(
String name,
String abbr,
String url) implements Serializable { // must implement Serializable, otherwise MVStore fails
}
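The `Serializable` requirement is presumably there because MVStore falls back to Java serialization for value types it does not handle natively. That is an assumption here: the commit only states that MVStore fails otherwise. The sketch below uses only the JDK to show that a record declared this way survives a serialization round trip. `SerializableJournal` and `SerializationRoundTripSketch` are hypothetical names, not JabRef classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for PredatoryJournalInformation (assumption, not the JabRef record).
record SerializableJournal(String name, String abbr, String url) implements Serializable {
}

class SerializationRoundTripSketch {

    // Writes the value with Java serialization and reads it back.
    // Checked exceptions are wrapped so callers need no throws clause.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SerializableJournal original =
                new SerializableJournal("Example Journal", "EJ", "https://example.org");
        Object copy = roundTrip(original);
        System.out.println(original.equals(copy)); // prints true
    }
}
```

Dropping `implements Serializable` from the record makes `writeObject` throw a `NotSerializableException`, which this sketch would surface as an `IllegalStateException`.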