What do the first two columns in the data mean? #2

Open
cdj0311 opened this issue Sep 24, 2016 · 12 comments

Comments

@cdj0311

cdj0311 commented Sep 24, 2016

What do the first and second columns mean?

@snowlord

Judging from build_vocab(), they don't seem to be used.

@white127
Owner

white127 commented Oct 8, 2016

Some columns in the data are not used here and can be ignored.

@snowlord

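@white127 @cdj0311 From the validation function, my understanding is that in the first column 1 marks a positive example and 0 a negative example, and that the second column identifies a question. I'm not sure whether that is right. However, when I evaluate on test1.sample, top-1 precision is only 10%, not the expected 62%. Could you explain? Is the data in the test1.sample file wrong?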

@white127
Owner

@snowlord test1.sample contains only a very small sample of the data, so please don't use it for testing. It is best to use the full dataset.

@snowlord

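@white127 How should the sampling be done? Is it sampled from the train file? How many negative examples should be used? Or could you upload the full dataset you used?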

@white127
Owner

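@snowlord The main issue is that the sample contains very little data, so a model trained on it may perform noticeably worse. The ratio of positive to negative examples can be roughly 1:1. The full dataset is too large to upload conveniently.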
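
As an illustration of the roughly 1:1 ratio mentioned above, here is a minimal sketch that pairs each positive example with one randomly drawn negative. The function `sample_balanced` and its inputs `positives` and `answer_pool` are hypothetical names chosen for this example; they are not taken from the repository, and the actual sampling code may differ.

```python
import random

def sample_balanced(positives, answer_pool, seed=0):
    """Pair each positive (question, answer) with one random negative answer.

    `positives` and `answer_pool` are hypothetical inputs used only for
    illustration; they are not the repository's data structures.
    """
    rng = random.Random(seed)
    examples = []
    for question, good_answer in positives:
        # Candidate negatives: any pooled answer other than the correct one.
        candidates = [a for a in answer_pool if a != good_answer]
        bad_answer = rng.choice(candidates)
        examples.append((1, question, good_answer))  # positive example
        examples.append((0, question, bad_answer))   # negative example
    return examples

positives = [("q1", "a1"), ("q2", "a2")]
pool = ["a1", "a2", "a3", "a4"]
examples = sample_balanced(positives, pool)
```

Each question contributes exactly one positive and one negative row, which gives the 1:1 balance. The sketch assumes a single correct answer per question; with several correct answers, all of them would need to be excluded from the negative candidates.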

@ineWsTut

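My understanding of top-1 is this: for a given question with, say, 1 positive answer and 499 negative answers, you compute a similarity between the question and each answer and take the answer with the highest similarity. Is that right? I feel that accuracy measured this way could not reach 62%. Is my understanding wrong?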

@white127
Owner

Yes, that's right.

On Oct 18, 2016 at 8:40 PM, ssdf93 wrote:

My understanding of top-1 is this: for a given question with, say, 1 positive answer and 499 negative answers, you compute a similarity between the question and each answer and take the answer with the highest similarity. Is that right?
I feel that accuracy measured this way could not reach 62%. Is my understanding wrong?
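
The top-1 metric confirmed above can be sketched as follows. This is a minimal illustration, not the repository's validation code: it assumes each record is a hypothetical `(label, question_id, score)` tuple, where `label` is 1 for a correct answer and 0 for an incorrect one, and `score` is the model's question-answer similarity.

```python
from collections import defaultdict

def top1_precision(records):
    """Fraction of questions whose highest-scoring candidate is correct.

    `records` is an iterable of (label, question_id, score) tuples.
    The tuple layout is an assumption made for this example.
    """
    by_question = defaultdict(list)
    for label, question_id, score in records:
        by_question[question_id].append((score, label))
    if not by_question:
        return 0.0
    hits = 0
    for candidates in by_question.values():
        # Keep the candidate with the highest similarity score.
        best_score, best_label = max(candidates, key=lambda c: c[0])
        hits += best_label
    return hits / len(by_question)

records = [
    (1, "q1", 0.9), (0, "q1", 0.2), (0, "q1", 0.5),  # correct answer ranked first
    (1, "q2", 0.3), (0, "q2", 0.7),                  # a wrong answer ranked first
]
precision = top1_precision(records)
```

Here `precision` is 0.5, since one of the two questions has its correct answer ranked first. For scale, with 1 positive among 500 candidates, a random ranking would be right only about 0.2% of the time.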

@white127
Owner

LSTM-BASED DEEP LEARNING MODELS FOR NON-FACTOID ANSWER SELECTION

2016-10-19 20:57 GMT+08:00 cuixue wrote:

Which paper is this code for?

@highven

highven commented Nov 14, 2016

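Hello, I have a question that came up while reading your TensorFlow-based code.
Your source file insqa_train.py does not seem to use the vector representations in vectors.nobin.
The relevant excerpt is:
x_train_1, x_train_2, x_train_3 = insurance_qa_data_helpers.load_data_6(vocab, alist, raw, FLAGS.batch_size)
testList, vectors = insurance_qa_data_helpers.load_test_and_vectors()
vectors = ''  # vectors is overwritten here
What is the reason for this?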

@liuluyeah

Same question: vectors is overwritten. Is the vectors.nobin file used at all?
testList, vectors = insurance_qa_data_helpers.load_test_and_vectors()
vectors = ''

@white127
Owner

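I had forgotten about this. On this corpus, using pretrained word vectors or not doesn't seem to make much difference. Randomly initialized word vectors also give fairly good results.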

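
To make the two options in the answer above concrete, here is a minimal sketch of building an embedding table that is either fully random or partly seeded from pretrained vectors. The function `build_embeddings` and the `pretrained` dictionary are hypothetical; the sketch does not read vectors.nobin, whose format is not described in this thread, and it is not the repository's initialization code.

```python
import random

def build_embeddings(vocab, pretrained=None, dim=4, seed=0):
    """Return one vector per word in `vocab` (a list of words).

    Words found in `pretrained` (a hypothetical dict of word -> list of
    floats) reuse that vector; all other words get a random vector.
    """
    rng = random.Random(seed)
    pretrained = pretrained or {}
    table = []
    for word in vocab:
        if word in pretrained:
            # Copy so later updates do not mutate the caller's data.
            table.append(list(pretrained[word]))
        else:
            table.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return table

vocab = ["insurance", "policy", "claim"]
random_table = build_embeddings(vocab)
mixed_table = build_embeddings(vocab, pretrained={"policy": [0.1, 0.2, 0.3, 0.4]})
```

In a real model, either table would be the starting value of a trainable embedding matrix. Because training keeps updating the vectors, the initialization may matter less on some corpora, which is consistent with the owner's observation.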
