Commit 5649035 (1 parent: f2a4c9e)
Showing 4 changed files with 83 additions and 49 deletions.
File renamed without changes.
@@ -0,0 +1,42 @@
# Core - Amazon Product Search

## Installation

```shell
$ pyenv install 3.11.8
$ pyenv local 3.11.8
$ pip install poetry
$ poetry env use python
$ poetry install
```

The following libraries are necessary for Japanese text processing.

```shell
# For macOS
$ brew install mecab mecab-ipadic
$ poetry run python -m unidic download
```
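
Before moving on, it can help to confirm that the tools used in the installation steps are actually on your `PATH`. The snippet below is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and the list of tools checked (`pyenv`, `poetry`, `mecab`) is simply taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print the name of
# every given command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    # `command -v` is the POSIX way to test whether a command is available.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools referenced by the installation steps above.
missing=$(missing_tools pyenv poetry mecab)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All tools found."
fi
```

The check only looks for the executables; it does not verify versions or that the dictionaries were installed.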

## Dataset

Clone https://github.com/amazon-science/esci-data and copy the contents of `esci-data/shopping_queries_dataset/` into `amazon-product/search/data/raw/`. Then run the following command to preprocess the dataset.

```shell
$ poetry run inv data.merge-and-split
```
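
The clone-and-copy step described above can be sketched as a small shell helper. This is an illustration under assumptions, not a script shipped with the project: `copy_raw_data` is a hypothetical function name, and the paths in the usage comment are the ones given in the text, relative to wherever you keep your checkouts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): copy the contents of
# the dataset directory ($1) into the raw-data directory ($2), creating
# the destination if it does not exist yet.
copy_raw_data() {
  src=$1
  dest=$2
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  # "$src"/. copies the directory's contents (including dotfiles),
  # not the directory itself.
  mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}

# Usage (requires network access for the clone):
#   git clone https://github.com/amazon-science/esci-data
#   copy_raw_data esci-data/shopping_queries_dataset amazon-product/search/data/raw
```

Creating the destination with `mkdir -p` makes the copy safe to run on a fresh checkout where `data/raw/` may not exist yet.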

## Index Products

This project indexes products into search engines. To try it on your own machine, first launch Elasticsearch or Vespa locally, then run the document indexing pipeline against the created index. The example below uses Elasticsearch.

```shell
$ docker compose --profile elasticsearch up
$ poetry run inv es.create-index --index-name=products_jp
$ poetry run inv indexing.feed \
  --index-name=products_jp \
  --locale=jp \
  --dest=es \
  --dest-host=http://localhost:9200 \
  --nrows=10
```
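
Elasticsearch can take a little while to accept connections after the container starts, so creating the index immediately may fail. One way to handle this is to poll the HTTP endpoint first. The sketch below is not part of the repository: `wait_for_http` is a hypothetical helper, the retry counts are arbitrary defaults, and it assumes the endpoint at `http://localhost:9200` is reachable without authentication.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): poll a URL until it
# answers successfully or the attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s hides progress output; -f makes HTTP error statuses a non-zero exit.
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_http http://localhost:9200 \
#     && poetry run inv es.create-index --index-name=products_jp
#
# After feeding, Elasticsearch's count API reports how many documents
# the index holds:
#   curl -s http://localhost:9200/products_jp/_count
```

The function returns non-zero if the service never answers, so it can gate the later commands with `&&`.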
@@ -0,0 +1,24 @@
# Demo - Amazon Product Search

## Installation

```shell
$ pyenv install 3.11.8
$ pyenv local 3.11.8
$ pip install poetry
$ poetry env use python
$ poetry install
```

## Demo

Each command below launches a [Streamlit](https://streamlit.io/) demo app.

```shell
$ make run_eda
$ make run_tokenization
$ make run_es
$ make run_vespa
```

![](https://user-images.githubusercontent.com/883148/203654537-8b495c9c-f8af-4c3f-90f9-60edacf647b9.png)