Fix searching from creators, enable searching from new fields such as funding and add more normalized fields #685


@markusjt

I'm not sure why searching by creator isn't working. The query in searchkit was updated so that creators.name is explicitly included and boosted (it used to be just creators^2, when creators was simply a string):

      //create json body for ElasticSearchClient - search query
      mustQuery.push({
        simple_query_string: {
          query: q,
          lenient: true,
          default_operator: "AND",
          fields: [
            "titleStudy^4",
            "abstract^2",
            "creators.name^2",
            "keywords.term^1.5",
            "*"
          ],
          flags: "AND|OR|NOT|PHRASE|PRECEDENCE|PREFIX"
        }
      });

Searching for 'Lundby' still doesn't find this study, and the same goes for other creators. Sometimes a study appears to be found by creator name, but the match actually comes from the same name appearing in its related publications.
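One possible cause, offered as an assumption rather than a confirmed diagnosis: if creators is mapped as type nested (as funding is), a top-level simple_query_string does not reach fields inside the nested documents unless they are also included in the parent or root document, so the creators.name clause would have to be wrapped in a nested query. A minimal sketch of that idea, where buildSearchClause is a hypothetical helper name and the nested mapping of creators is assumed:

```javascript
// Hypothetical helper (not from the codebase). Assumes "creators" is mapped
// as type "nested", so creators.name must be queried through a nested query.
function buildSearchClause(q) {
  // Options shared by both clauses, copied from the existing query.
  const common = {
    query: q,
    lenient: true,
    default_operator: "AND",
    flags: "AND|OR|NOT|PHRASE|PRECEDENCE|PREFIX"
  };
  return {
    bool: {
      should: [
        {
          // Top-level fields, searched as before.
          simple_query_string: {
            ...common,
            fields: ["titleStudy^4", "abstract^2", "keywords.term^1.5", "*"]
          }
        },
        {
          // Creator names, searched inside the nested documents.
          nested: {
            path: "creators",
            query: {
              simple_query_string: { ...common, fields: ["creators.name^2"] }
            }
          }
        }
      ],
      // At least one of the two clauses has to match.
      minimum_should_match: 1
    }
  };
}

// Usage: push the combined clause instead of the bare simple_query_string.
const mustQuery = [];
mustQuery.push(buildSearchClause("Lundby"));
```

One trade-off of this shape: with default_operator "AND", a query that mixes a title word and a creator name would need all terms to match within a single clause, so it would likely find nothing. Copying creator names to a root-level search field would avoid that.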

I also didn't always consider search when making or reviewing changes. Katja has already tried searching by funder name and found that it doesn't work, so that should probably be made possible.

I assume we would need to change from:

"funding": {
  "type": "nested",
  "properties": {
    "agency": {
      "type": "keyword",
      "ignore_above": 256
    },
    "grantNumber": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},

to:

"funding": {
  "type": "nested",
  "properties": {
    "agency": {
      "type": "keyword",
      "ignore_above": 256,
      "copy_to": "fundingSearchField"
    },
    "grantNumber": {
      "type": "keyword",
      "ignore_above": 256,
      "copy_to": "fundingSearchField"
    }
  }
},
"fundingSearchField": {
  "type": "text",
  "analyzer": "pasc_standard_analyzer"
},
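The mapping change above can also be expressed as a small transformation, which may help if several nested fields need the same treatment. This is only a sketch: addCopyTo is a hypothetical helper, not something that exists in the project.

```javascript
// Hypothetical helper: returns a copy of a nested field mapping in which
// every sub-property copies its value to the given root-level search field.
// The input mapping is not modified.
function addCopyTo(nestedMapping, searchFieldName) {
  const properties = {};
  for (const [name, def] of Object.entries(nestedMapping.properties)) {
    properties[name] = { ...def, copy_to: searchFieldName };
  }
  return { ...nestedMapping, properties };
}

// Current funding mapping, as quoted above.
const funding = {
  type: "nested",
  properties: {
    agency: { type: "keyword", ignore_above: 256 },
    grantNumber: { type: "keyword", ignore_above: 256 }
  }
};

// Proposed mapping: funding with copy_to, plus the new text field.
const updated = {
  funding: addCopyTo(funding, "fundingSearchField"),
  fundingSearchField: { type: "text", analyzer: "pasc_standard_analyzer" }
};

console.log(JSON.stringify(updated, null, 2));
```

Since copy_to is applied at index time, existing documents would need to be reindexed before fundingSearchField contains anything.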

The same applies to any other fields we want to be findable through search but currently aren't. Series information and data kind should already be fine, since they already have a SearchField in the mappings. Of the new fields, that leaves only data access (Open/Restricted) and creator identifier. For data access, a filter would make more sense than free-text search, if we want to support searching by it at all. For creator identifier, I assume people would rather search by name than by e.g. ORCID iD.
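For illustration, a data access filter could be a term clause in the filter part of a bool query, so that it narrows the results without affecting scoring. The field name dataAccess and the helper buildDataAccessFilter are assumptions made for this sketch. Only the values Open and Restricted come from the discussion.

```javascript
// Hypothetical helper; "dataAccess" is an assumed field name and is assumed
// to be mapped as a keyword field holding "Open" or "Restricted".
function buildDataAccessFilter(value) {
  const allowed = ["Open", "Restricted"];
  if (!allowed.includes(value)) {
    throw new Error(`Unknown data access value: ${value}`);
  }
  return { term: { dataAccess: value } };
}

// Usage: combine the free-text query with the filter in one bool query.
const query = {
  bool: {
    must: [{ simple_query_string: { query: "survey", fields: ["*"] } }],
    filter: [buildDataAccessFilter("Open")]
  }
};
```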

@MortenSikt

I'm not sure how to fix this, but I agree that data access should be a filter in 3.8.

There might be interest in searching by ORCID, but I don't think that is as urgent as being able to search by creator name.

@markusjt markusjt self-assigned this Nov 13, 2024
@markusjt
Author

Okay, I'll look into fixing search by creator name while keeping the boost as it is now. I'll make funding and creator identifier searchable through a change in the mappings.

@markusjt markusjt added this to the 3.8.0 milestone Dec 19, 2024
@markusjt markusjt reopened this Jan 22, 2025
@markusjt markusjt changed the title from "Searching from creators is not working and searching from new fields such as funding" to "Fix searching from creators, enable searching from new fields such as funding and add more normalized fields" Jan 22, 2025
@markusjt
Author

Renamed to better reflect the changes, and reopened briefly to add one more normalized field.
