About the InsuranceQA training corpus conversion: negative sampling #25

Open
Fyweven opened this issue Oct 12, 2018 · 7 comments

Comments

Fyweven commented Oct 12, 2018

In the code, training batches are fetched through utils.gen_train_batch_qpn(train_data, FLAGS.batch_size).
But inside that function:

    import random
    import numpy as np

    def gen_train_batch_qpn(_data, batch_size):
        # Two independent draws of (s1, s2) pairs from the same data.
        psample = random.sample(_data, batch_size)
        nsample = random.sample(_data, batch_size)
        q = [s1 for s1, s2 in psample]
        qp = [s2 for s1, s2 in psample]
        qn = [s2 for s1, s2 in nsample]
        return np.array(q), np.array(qp), np.array(qn)

psample and nsample are obtained in exactly the same way??

zemu121 commented Oct 31, 2018

I have the same question. Have you figured it out?

zemu121 commented Oct 31, 2018

train_data only contains qp and no qn, right?

Author

Fyweven commented Oct 31, 2018

> train_data only contains qp and no qn, right?

It simply picks one at random from all the questions and uses it as the negative sample.
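To make this concrete, here is a small sketch on made-up toy data (the `toy_data` pairs and the seed are assumptions for illustration, not taken from the InsuranceQA corpus). It reuses the function quoted in the issue and prints each row of the batch, so you can see that q and qp come from the same pair while qn comes from an unrelated second draw.

```python
import random

import numpy as np


def gen_train_batch_qpn(_data, batch_size):
    # Same logic as the function quoted in the issue.
    psample = random.sample(_data, batch_size)
    nsample = random.sample(_data, batch_size)
    q = [s1 for s1, s2 in psample]
    qp = [s2 for s1, s2 in psample]
    qn = [s2 for s1, s2 in nsample]
    return np.array(q), np.array(qp), np.array(qn)


# Hypothetical toy corpus: question "q<i>" is paired with positive "a<i>".
toy_data = [(f"q{i}", f"a{i}") for i in range(1000)]

random.seed(0)
q, qp, qn = gen_train_batch_qpn(toy_data, 4)

# Each row: question, its own positive, and a second item drawn independently.
for question, positive, negative in zip(q, qp, qn):
    print(question, positive, negative)
```

In each printed row the first two columns share an index, and the third usually does not, which is what makes it act as a negative.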

zemu121 commented Oct 31, 2018

All the data in train are positive samples, and nsample is also drawn at random from train, so isn't qn actually a correct answer too?

Author

Fyweven commented Oct 31, 2018

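> All the data in train are positive samples, and nsample is also drawn at random from train, so isn't qn actually a correct answer too?

Not quite. qn is drawn from all the questions, so it is possible to sample the positive by accident, but the probability is very low. In the vast majority of cases it really is a negative.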
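As a rough check of "the probability is very low": if the data holds N distinct pairs, the two draws are independent, so position i of nsample matches position i of psample with probability about 1/N. The sketch below estimates that rate empirically and also shows one possible way to rule the collision out by redrawing. Both helper names (`estimate_collision_rate`, `gen_train_batch_qpn_no_collision`) and the `max_tries` parameter are hypothetical and not part of the repository. Note that the variant draws negatives with `random.choice`, which differs slightly from the original `random.sample` call.

```python
import random

import numpy as np


def estimate_collision_rate(data, batch_size, trials=2000):
    """Fraction of batch positions where the 'negative' is the positive pair."""
    hits = 0
    for _ in range(trials):
        psample = random.sample(data, batch_size)
        nsample = random.sample(data, batch_size)
        hits += sum(p == n for p, n in zip(psample, nsample))
    return hits / (trials * batch_size)


def gen_train_batch_qpn_no_collision(data, batch_size, max_tries=10):
    """Hypothetical variant: redraw a negative that equals the positive text."""
    psample = random.sample(data, batch_size)
    q, qp, qn = [], [], []
    for s1, s2 in psample:
        neg = random.choice(data)[1]
        tries = 0
        # Give up after max_tries so degenerate data cannot loop forever.
        while neg == s2 and tries < max_tries:
            neg = random.choice(data)[1]
            tries += 1
        q.append(s1)
        qp.append(s2)
        qn.append(neg)
    return np.array(q), np.array(qp), np.array(qn)


if __name__ == "__main__":
    toy_data = [(f"q{i}", f"a{i}") for i in range(100)]  # made-up pairs
    print("estimated collision rate:", estimate_collision_rate(toy_data, 8))
    print(gen_train_batch_qpn_no_collision(toy_data, 4))
```

This only guards against the negative being the exact same text as the positive. If one question has several valid paraphrases in the corpus, a different check (for example, grouping by question) would be needed.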

zemu121 commented Nov 2, 2018

I understand now. Thank you very much for your reply.

zemu121 commented Nov 6, 2018

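If the model were changed into a typical image-processing style model, that is, small sliding windows, several stacked convolutions, and max_pooling, do you think that would be feasible?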
