This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[PROTOTYPE - DO NOT MERGE] Opensearch prototype #8

Closed
wants to merge 3 commits into from

chouinar
Collaborator

This is very, very hacky and not even remotely production-ready. If you try to use it, it will almost certainly break in many ways.

We'll implement this in several pieces; this is just a proof-of-concept prototype to test how the search index works and to gauge the rough level of effort to use it.

Features

  • OpenSearch running locally
  • Search index can be populated by a script that loads records from the DB and converts them to JSON
  • API is able to construct queries and process results from the search index
  • Relevancy sorting (the scores are added to the opportunity title)
  • Very rudimentary weighting of the query text
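
As a rough illustration of the "load records from the DB, make them into JSON" step listed above, here is a minimal sketch that turns already-fetched rows into the newline-delimited body that the OpenSearch `_bulk` endpoint accepts. It is not the code from this PR: the function names, the `opportunity_id` key, and the shape of the input rows are all assumptions made for the example.

```python
import json
from datetime import date
from typing import Any, Iterable


def to_json_document(record: dict[str, Any]) -> dict[str, Any]:
    """Convert a DB row (as a dict) into a JSON-serializable document."""
    doc = {}
    for key, value in record.items():
        # Dates are not JSON-serializable by default, so store them as ISO strings.
        doc[key] = value.isoformat() if isinstance(value, date) else value
    return doc


def build_bulk_body(
    index_name: str,
    records: Iterable[dict[str, Any]],
    id_field: str = "opportunity_id",  # hypothetical primary-key column
) -> str:
    """Build a newline-delimited body for the _bulk endpoint."""
    lines = []
    for record in records:
        # Each document is preceded by an action line naming the index and ID.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": record[id_field]}}))
        lines.append(json.dumps(to_json_document(record)))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string could be sent to the local instance with any HTTP client or passed through a client library; which of those the real script uses is not shown here.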

Not implemented

  • Any meaningful configuration
  • Sorting in the search index in any way that isn't relevancy-based
  • Any features beyond our existing search experience

Additional information

Assuming I'm not missing any steps, you should be able to run this locally by running the following:

# Nuke your local DB to be safe
make db-recreate
make init
# Wait for the index to be ready (we'll add an automated wait later). The first startup is pretty slow (1-2 minutes). You'll know it's ready when this URL loads: http://localhost:5601/app/dev_tools#/console

# If you've set up poetry to work locally (this isn't the default), run:
poetry run python src/adapters/opensearch/populate_search_index.py
# Otherwise, modify 'populate_search_index.py': in the get_client function, change the host to "host.docker.internal", then run:
docker compose run --rm grants-api poetry run python src/adapters/opensearch/populate_search_index.py

# Start the API / see the logs
make run-logs
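
The manual "wait for the index to be ready" step above could eventually be automated with a small polling loop. The sketch below is one possible shape for it, using only the standard library; the function names, the 180-second limit, and the 5-second interval are assumptions, not anything that exists in this branch.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def url_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as "not ready".
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    max_wait_seconds: float = 180.0,  # assumed upper bound; first startup takes 1-2 minutes
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() repeatedly until it succeeds or the time budget is used up."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= max_wait_seconds:
            return False
        sleep(interval_seconds)
        waited += interval_seconds
```

For example, `wait_until_ready(lambda: url_is_up("http://localhost:5601/app/dev_tools#/console"))` would block until the dev tools console responds or about three minutes have passed. Passing the probe in as a callable keeps the loop easy to test without a running container.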

If you also run the frontend (npm run dev from the frontend folder) you should be able to use it. For example:
[Image: Screenshot 2024-05-15 at 12 53 26 PM]

If you'd like to try querying the search index directly, you can use the dev tools console at http://localhost:5601/app/dev_tools#/console. Below is a query you can test that contains all of the same filters as the API. It is probably too specific, so you'll likely get few or no results; I suggest removing a few filters to get more. Note how the query is structured: the clauses in the must section are used in scoring, while the clauses in the filter section don't affect scoring and only act as true filters on the data.

GET test-opportunity-index/_search
{
  "track_total_hits": true,
  "size": 25,
  "from": 0,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "research",
            "fields": [
              "agency^16",
              "opportunity_title^2",
              "opportunity_number^12",
              "summary.summary_description",
              "opportunity_assistance_listings.assistance_listing_number^10",
              "opportunity_assistance_listings.program_title^4"
            ],
            "type": "best_fields",
            "tie_breaker": 0.3
          }
        }
      ],
      "filter": [
        {
          "terms": {
            "agency.keyword": [
              "ARPAH",
              "HHS"
            ]
          }
        },
        {
          "terms": {
            "opportunity_status": [
              "forecasted",
              "posted"
            ]
          }
        },
        {
          "terms": {
            "summary.applicant_types": [
              "state_governments",
              "county_governments",
              "individuals"
            ]
          }
        },
        {
          "terms": {
            "summary.funding_categories": [
              "recovery_act",
              "arts",
              "natural_resources"
            ]
          }
        },
        {
          "terms": {
            "summary.funding_instruments": [
              "cooperative_agreement",
              "grant"
            ]
          }
        }
      ]
    }
  }
}
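
To show how an API layer might assemble a request like the one above, here is a minimal sketch that builds the same body from a query string and a mapping of filter fields to allowed values. The field list and boosts are copied from the example query; the function name, its parameters, and the decision to skip empty filters are assumptions, not the prototype's actual implementation.

```python
from typing import Any, Optional

# Fields and boosts copied from the example query above.
SEARCH_FIELDS = [
    "agency^16",
    "opportunity_title^2",
    "opportunity_number^12",
    "summary.summary_description",
    "opportunity_assistance_listings.assistance_listing_number^10",
    "opportunity_assistance_listings.program_title^4",
]


def build_search_query(
    query_text: Optional[str],
    filters: dict[str, list[str]],
    size: int = 25,
    offset: int = 0,
) -> dict[str, Any]:
    """Build a bool query: scored clauses go in "must", unscored ones in "filter"."""
    bool_query: dict[str, Any] = {}
    if query_text:
        # The text query contributes to the relevancy score.
        bool_query["must"] = [
            {
                "multi_match": {
                    "query": query_text,
                    "fields": SEARCH_FIELDS,
                    "type": "best_fields",
                    "tie_breaker": 0.3,
                }
            }
        ]
    # Each non-empty filter becomes a "terms" clause that narrows results without scoring.
    filter_clauses = [{"terms": {field: values}} for field, values in filters.items() if values]
    if filter_clauses:
        bool_query["filter"] = filter_clauses
    return {
        "track_total_hits": True,
        "size": size,
        "from": offset,
        "query": {"bool": bool_query},
    }
```

Calling `build_search_query("research", {"agency.keyword": ["ARPAH", "HHS"]})` produces a trimmed-down version of the request above with a single filter, which is a convenient way to loosen the query when it returns no results.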

@chouinar
Collaborator Author

chouinar commented Jun 7, 2024

Closing this. It was meant as a prototype and a way to share a rough idea of what we might do. Since I have either implemented the actual version or have a PR out for it, there's no need to keep this around anymore.

@chouinar chouinar closed this Jun 7, 2024