Eager

Biomedical text mining and question answering.

Modules

api
Interface definitions.
This package will likely be removed.
core
Common classes and utilities.
docs
Documentation.
elasticsearch (Not Used)
A placeholder project for eventual indexing and searching with ElasticSearch.
error
Error logging service. Other services can send their log messages here so that error messages are consolidated in a common location.
indexer
Standalone program for creating the Solr index of PubMed and PubMed Central.
nlp
Standalone service that uses Stanford CoreNLP to perform sentence splitting, tokenization, lemmatization, and part-of-speech tagging.
preprocess
Process PMC documents to:
1. Extract just the text content to a separate file.
2. Create LIF versions with sentence, token, lemma, and pos annotations.
3. Create text versions with stop words, punctuation, numbers, and symbols removed, ready to be processed with word2vec or doc2vec.
query
Query processors. These accept natural language from the user and convert it into a search engine query.
rabbitmq
RabbitMQ messaging services.
ranking
Document ranking algorithms.
retrieval
Standalone service for retrieving PubMed or PubMed Central documents.
scraper-pubmedmedline
Python script used to download and extract PubMed documents from the NIH FTP server.
solr
Solr configuration files.
test (To be removed)
Experimental programming. This module has nothing to do with actual testing.
upload
Upload service for loading JSON into Galaxy.
web
Spring Boot application that provides a web user interface and REST API.
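
Step 3 of the preprocess module (removing stop words, punctuation, numbers, and symbols before word2vec or doc2vec) can be illustrated with a minimal sketch. This is not the module's actual code: the stop-word list, the function name `clean_for_embedding`, and the decision to keep mixed alphanumeric tokens such as gene names are all assumptions made for illustration.

```python
import re

# Hypothetical stop-word list for illustration only; the list used by the
# preprocess module is not shown in this README.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "was", "were", "with"}

# Runs of letters and digits; punctuation and symbols act as separators
# and are therefore discarded.
TOKEN = re.compile(r"[a-z0-9]+")


def clean_for_embedding(text):
    """Return lowercase tokens without stop words or purely numeric tokens."""
    tokens = TOKEN.findall(text.lower())
    # Assumption: tokens mixing letters and digits (e.g. "brca1") are kept,
    # while tokens made only of digits (e.g. "12") are dropped.
    return [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]


if __name__ == "__main__":
    sentence = "The BRCA1 gene was mutated in 12% of the patients."
    print(" ".join(clean_for_embedding(sentence)))
```

A real pipeline would likely reuse the token annotations produced in step 2 rather than re-tokenizing with a regular expression; the regex here only keeps the sketch self-contained.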

Building

Running mvn install in the top-level project directory builds all of the Java/Groovy modules. Note that not all modules are Maven projects.

Building The Web Application

The web project includes a Makefile that can be used to generate the Docker image and push the image to docker.lappsgrid.org.

$> make clean
$> make 
$> make docker
$> make push

Because the web project is a Spring Boot application, it can be started by simply running the jar file:

$> java -Xmx8G -jar eager.jar

Note: In the (near) future, JMX capabilities will be added, which means the startup procedure will change considerably. Check for the presence of a startup.sh script in the root directory of the project.

Services

See the README.md file in each project for instructions on running that module.

The following modules are intended to be run as standalone services:

error Error logging service used to collect error messages in a single location.
nlp Stanford CoreNLP processing service.
retrieval Document retrieval service.
upload Galaxy upload service.

All of the above services use RabbitMQ as a message broker. The nlp project has an example Groovy script for submitting documents to the Stanford NLP service for processing.
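
For readers unfamiliar with the pattern, submitting a document to a RabbitMQ-backed service might look roughly like the sketch below. It is written in Python with the third-party pika client rather than Groovy, and it is not taken from the nlp project: the queue name `nlp.requests`, the JSON message layout, and the function names are hypothetical. Consult the nlp module's README.md and example script for the real queue names and message format.

```python
import json

# Hypothetical queue name; the real one is defined by the nlp module.
NLP_QUEUE = "nlp.requests"


def build_message(doc_id, text):
    """Serialize a document into a JSON message body (assumed layout)."""
    return json.dumps({"id": doc_id, "text": text}).encode("utf-8")


def submit(body, host="localhost", queue=NLP_QUEUE):
    """Publish one message to a RabbitMQ queue through the default exchange."""
    # Imported lazily so build_message() can be used without pika installed.
    import pika  # third-party client: pip install pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    try:
        channel = connection.channel()
        # Declaring the queue is idempotent if it already exists with the
        # same settings.
        channel.queue_declare(queue=queue)
        channel.basic_publish(exchange="", routing_key=queue, body=body)
    finally:
        connection.close()
```

With a broker running on localhost, `submit(build_message("doc-1", "Aspirin reduces fever."))` would place a single message on the assumed queue; the document identifier here is only a placeholder.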

Applications

The following modules contain standalone programs that are intended to be run from the command line.

indexer Creates the Solr index(es).
preprocess Extracts text from PMC documents and creates the LIF and word2vec/doc2vec-ready versions (see preprocess/README.md).
