Release DeepDive 0.0.3-alpha.1 · HazyResearch/deepdive

Changelog for version 0.0.3-alpha.1 (05/25/2014)

Updated example walkthrough and spouse_example code
Added the Python utility ddlib for text manipulation (requires adding it to PYTHONPATH; see its pydoc for usage)
Added utility script util/extractor_input_writer.py to sample extractor inputs
Updated nlp_extractor format (use sentence_offset, textual sentence_id)
Cleaned up unused datastore code
Updated templates
Bug fixes

Changelog for version 0.0.3-alpha (05/07/2014)

Non-backward-compatible syntax change: Developers must include an id column of type bigint in any table containing variables, but they MUST NOT use this column anywhere. This column is reserved for learning and inference, and all of its values will be erased and reassigned in the grounding phase.
Non-backward-compatible functionality change: DeepDive is no longer responsible for any automatic assignment of sequential variable IDs. You may use examples/spouse_example/scripts/fill_sequence.sh for this task.
Updated dependency requirement: requires JDK 7 or higher.
Added support for four new types of extractors. See the documentation for details.
Even faster factor graph grounding and serialization using better optimized SQL.
The previous default Java sampler is no longer supported. The C++ sampler is now the default sampler.
New configuration supported: pipeline.relearn_from skips extraction and grounding and performs only learning and inference, using the results of a previous run. Useful for tuning sampler arguments.
New configuration supported: inference.skip_learning to use weights learned in the last execution.
New configuration supported: inference.weight_table fixes factor weights to the values in a table and skips learning. The table specifies factor descriptions and their weights. It can hold the results of a previous DeepDive execution, manually assigned weights, or a combination of both. This is useful for learning once and reusing the learned model for later inference tasks.
Added support for manual holdout via a holdout query.
Updated spouse_example with implementations in different styles of extractors.
The nlp_extractor example has changed table requirements and usage. See the nlp_extractor documentation for details.
In the db.default configuration, users should define dbname, host, port and user. Any setting that is not defined falls back to the environment variable DBNAME, PGHOST, PGPORT or PGUSER, respectively.
Fixed all examples.
Updated documentation.
SQL query execution plans are now printed for extractor inputs.
Grounding, learning and inference are now skipped if no factors are active.
If using Greenplum, users should add a DISTRIBUTED BY clause to all CREATE TABLE commands. Do not use the variable id column as the distribution key, and do not use a distribution key whose values are not initially assigned.
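
The db.default fallback described in the changelog above can be illustrated with a short Python sketch. This is not DeepDive's own code: the function name resolve_db_settings and the dictionary-based config are hypothetical, chosen only to show the lookup order. Explicit settings win, and each missing setting is read from DBNAME, PGHOST, PGPORT or PGUSER.

```python
import os

# Mapping from db.default keys to the environment variables named in the changelog.
ENV_FALLBACKS = {
    "dbname": "DBNAME",
    "host": "PGHOST",
    "port": "PGPORT",
    "user": "PGUSER",
}


def resolve_db_settings(db_default, environ=None):
    """Return connection settings, preferring explicit config over the environment.

    Hypothetical helper for illustration; DeepDive's real resolution logic may differ.
    """
    environ = os.environ if environ is None else environ
    resolved = {}
    for key, env_name in ENV_FALLBACKS.items():
        value = db_default.get(key)
        if value is None:
            # Setting not defined in db.default: fall back to the environment.
            value = environ.get(env_name)
        resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Example values are made up for this sketch.
    config = {"dbname": "deepdive_spouse", "host": "localhost"}
    env = {"PGHOST": "other.example", "PGPORT": "5432", "PGUSER": "alice"}
    print(resolve_db_settings(config, env))
```

In this sketch a setting that is absent from both the config and the environment resolves to None, which leaves the caller to decide whether to raise an error or apply a default.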
