What is a SentenceSet? #205

tastyminerals · 2016-08-06T14:40:43Z

I do not understand the following SentenceSet paragraph:

A DataSet used for language modeling. 
Takes a sequence of words stored as a tensor of word IDs and a Tensor holding the start index of the sentence of its commensurate word id (the one at the same index). 
Unlike DataSets, for memory efficiency reasons, this class does not store its data in Views. 
However, the outputs of factory methods batch, sub, and index are Batches containing input and target ClassViews.
The returned batch:inputs() are filled according to Google 1-Billion Words guidelines.

So the words are stored as two tensors: a "tensor of word IDs" and a "tensor holding the start index of the sentence of its commensurate word id". What does the second one mean?
I am not even asking why the words are stored as two tensors in the first place, which is even more confusing. Can somebody explain how the words are actually stored?
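
One possible reading of the quoted paragraph, sketched in Python with plain lists rather than Torch tensors. The toy word IDs, the 0-based indexing, and the helper `context_for` are assumptions made for illustration; none of this is taken from the dp source.

```python
# Toy corpus: two sentences already mapped to integer word IDs (hypothetical IDs).
sentences = [[5, 9, 2], [7, 3]]

word_ids = []         # flat sequence of word IDs for the whole corpus
sentence_starts = []  # same length; entry i is the start index of the sentence holding word i

for sentence in sentences:
    start = len(word_ids)  # 0-based here; Torch/Lua tensors would be 1-based
    for word_id in sentence:
        word_ids.append(word_id)
        sentence_starts.append(start)


def context_for(i, context_size):
    """Return up to context_size word IDs preceding position i,
    without reaching back past the start of the sentence."""
    lo = max(sentence_starts[i], i - context_size)
    return word_ids[lo:i]


print(word_ids)           # [5, 9, 2, 7, 3]
print(sentence_starts)    # [0, 0, 0, 3, 3]
print(context_for(4, 2))  # [7]
```

Under this reading, "commensurate" just means "at the same index": `sentence_starts[i]` says where the sentence containing `word_ids[i]` begins, so a context window for any word can be cut off at its sentence boundary with a single lookup.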

tastyminerals changed the title ~~What is the DataSet?~~ What is a DataSet? Aug 6, 2016

tastyminerals changed the title ~~What is a DataSet?~~ What is a SentenceSet? Aug 6, 2016
