This is a proof of concept for overhauling the Solr External File Field. The main goal is to speed up reading and matching external file field definitions.
Related issues are:
If you have Apache Maven installed, run:
$ mvn clean verify -DskipTests=true
If you have Docker installed, run:
$ docker run -it \
-v $(pwd):$(pwd) -w $(pwd) \
maven:3.8.4-jdk-11 mvn clean verify
Copy all files from target/classes into your Solr installation's solr-webapp/webapp/WEB-INF/classes directory and restart Solr.
😟 This is the rough part
- First, build a patched Solr Docker version
$ mvn clean package -DskipTests=true && \
docker build -t solr:8.11.1-patched .
- Then, open a second terminal and launch the patched Solr
$ docker run --rm -it -p 8983:8983 \
-v $(pwd)/target/solr:/var/solr/data \
solr:8.11.1-patched
- Afterwards, create a new core named eff-test
$ curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=eff-test&instanceDir=eff-test&config=solrconfig.xml&dataDir=data"
- Create some random boosts by executing the ExternalFileFieldIntegrationTest#generateBoosts JUnit test.
- Load some 8.8M documents by executing the ExternalFileFieldIntegrationTest#createTestDocuments JUnit test.
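For orientation, an external file field reads a plain text file with one key=value line per document, where the value is a float. The sketch below shows roughly what generating such a file could look like. It is not the code of the generateBoosts test: the class name BoostFileSketch, the file name external_boost, and the docN key scheme are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of this repository.
public class BoostFileSketch {

    /** Builds "key=value" lines with pseudo-random boosts in [0, 1). */
    public static List<String> randomBoostLines(int count, long seed) {
        Random random = new Random(seed); // seeded, so runs are repeatable
        List<String> lines = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // "doc" + i is an assumed key scheme; real keys must match the core's key field
            lines.add("doc" + i + "=" + random.nextFloat());
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name for a field called "boost"; adjust to your schema
        Path target = Path.of("external_boost");
        Files.write(target, randomBoostLines(10, 42L), StandardCharsets.UTF_8);
    }
}
```

Running main writes ten lines to external_boost in the working directory; for Solr to pick the file up, it would have to be placed in the core's data directory.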
On my M1 Mac with an NVMe SSD, loading took 30s with the unpatched, sequential implementation. With the patched, parallel loading, it takes about 15s.
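To illustrate the general idea behind parallel loading, here is a minimal sketch that parses key=value lines with a parallel stream. It is not the implementation used in this patch: the class name ParallelBoostParser and the choice of a ConcurrentHashMap as the result container are assumptions, and a real loader would also have to match keys against the index.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration, not the patched Solr code.
public class ParallelBoostParser {

    /**
     * Parses "key=value" lines in parallel. Lines without a key, without '=',
     * or with a non-numeric value are skipped. If a key occurs more than once,
     * which value wins is not deterministic.
     */
    public static Map<String, Float> parse(List<String> lines) {
        Map<String, Float> boosts = new ConcurrentHashMap<>();
        lines.parallelStream().forEach(line -> {
            int sep = line.lastIndexOf('=');
            if (sep <= 0) {
                return; // no delimiter or empty key
            }
            try {
                boosts.put(line.substring(0, sep),
                           Float.parseFloat(line.substring(sep + 1)));
            } catch (NumberFormatException e) {
                // malformed value: skip this line
            }
        });
        return boosts;
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = parse(List.of("doc1=1.5", "doc2=0.25", "broken"));
        System.out.println(boosts);
    }
}
```

Each line is parsed independently, which is why the work can be spread across cores. How much this helps in practice depends on how much of the total time is spent parsing rather than reading from disk.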