Prior answers – how it works in reality? #175

Open
pruzhinskaya opened this issue Apr 15, 2024 · 2 comments

Comments

@pruzhinskaya

The --prior_answers argument does not allow restarting an AAD run from the step where it was stopped.

  1. If we stop AAD at some step and then want to restart from that step, we are supposed to pass our previous answers via the --prior_answers argument.
    However, the objects proposed to the expert after the priors do not match the objects we would have been shown had the AAD run not been stopped.
    So running AAD with budget=100 is not the same as running AAD with budget=30 and 70 priors taken from the 100-budget run.

  2. Matwey's idea was that the cause is the prior_influence argument, so we tested this by setting it to 0.
    I used the same ~70 priors with --prior_influence 0 and ran AAD. The first outlier (544212400005781) is the same as in the run without --prior_influence, but the second outlier (623215300020957) was a new one.
    So we obtained three different versions of the output.

Below are the names of the data files on coin, for the record:

test_answers_aad_snad4_art_no.csv – the 69 labels at which my AAD run broke off

test2_answers_aad_snad4_art_no.csv – 35 labels that I assigned using those 69 as priors. This is the command I ran:

zwaad --random_seed 42 --budget 100 --oid /media/snad/data/features/snad4_art/sid_snad4_r_100.dat --feature /media/snad/data/features/snad4_art/feature_snad4_r_100.dat --feature-name /media/snad/data/features/snad4_art/feature_snad4_r_100.name --answers=/media/maria/data/aad/answers_aad_snad4_art_no.csv --anomalies=/media/maria/data/aad/anomalies_aad_snad4_art_no.csv --prior_answers=/media/maria/data/aad/answers_aad_snad4_art_no.csv aad

answers_aad_snad4_art_no.csv – 100 labels, without any priors. The first 69 match the first, interrupted run, but the remaining 31 do not overlap at all with the labels from the run with priors.

answers_aad_snad4_art_no_test_v2.csv – answers with --prior_influence 0
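
To put a number on "do not overlap", the answer files can be compared directly. The following is only a sketch under assumptions that are not confirmed by this issue: that each answers file is a CSV with one header row and the object ID in the first column, and that the resumed run's file holds only the newly assigned labels, not the priors. The helper names `read_oids` and `tail_overlap` are made up for illustration and are not part of zwad.

```python
import csv
from pathlib import Path


def read_oids(path, column=0):
    """Return object IDs in file order.

    Assumption: the file has a single header row and the OID sits in
    the given column (index 0 is a guess, adjust it to the real layout).
    """
    with Path(path).open(newline="") as fh:
        rows = list(csv.reader(fh))
    return [row[column] for row in rows[1:] if row]


def tail_overlap(full_run, resumed_run, n_priors):
    """Compare the objects the uninterrupted run proposed after step
    n_priors with the objects proposed by the run resumed from priors."""
    tail = full_run[n_priors:]
    shared = set(tail) & set(resumed_run)
    return {"tail": len(tail), "resumed": len(resumed_run), "shared": len(shared)}
```

For the files above this would be called as `tail_overlap(read_oids("answers_aad_snad4_art_no.csv"), read_oids("test2_answers_aad_snad4_art_no.csv"), n_priors=69)`. A `shared` count of 0 corresponds to the behaviour described here.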

@pruzhinskaya
Author

I ran the same test with another dataset. The result is the same: all the files are different.

zwaad --random_seed 42 --budget 10 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud10.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud10.csv aad

zwaad --random_seed 42 --budget 5 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud5.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud5.csv aad

zwaad --random_seed 42 --budget 5 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud5_priors.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud5_priors.csv --prior_answers=/media/maria/data/aad/test_answers_dr3_bud5.csv aad

zwaad --random_seed 42 --budget 5 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud5_priors_inf0.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud5_priors_inf0.csv --prior_answers=/media/maria/data/aad/test_answers_dr3_bud5.csv aad --prior_influence 0

test_answers_dr3_bud10.csv
test_answers_dr3_bud5.csv
test_answers_dr3_bud5_priors.csv
test_answers_dr3_bud5_priors_inf0.csv
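
The expectation being tested here is that the budget-10 sequence equals the budget-5 sequence followed by the sequence from the run resumed with priors. A minimal sketch of that check, operating on plain lists of object IDs, is given below. The function name `first_divergence` is hypothetical, and the sketch assumes the resumed run's answers contain only the new labels.

```python
def first_divergence(full_run, first_part, resumed_part):
    """Return the 0-based step at which first_part + resumed_part departs
    from full_run, or None if they agree over their common length."""
    stitched = list(first_part) + list(resumed_part)
    for step, (expected, got) in enumerate(zip(full_run, stitched)):
        if expected != got:
            return step
    return None
```

If resuming worked as intended, the function would return `None` for the budget-10 run against the two budget-5 runs. Any other value is the first step at which the resumed run proposed a different object.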

@pruzhinskaya
Author

I ran these three commands; the outputs are still different.

zwaad --random_seed 42 --budget 10 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud10_test2.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud10_test2.csv aad --prior_influence 0

zwaad --random_seed 42 --budget 5 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud5_test2.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud5_test2.csv aad --prior_influence 0

zwaad --random_seed 42 --budget 5 --n_jobs 20 --oid /media/snad/data/features/dr3/oid_718.dat --feature /media/snad/data/features/dr3/feature_718.dat --feature-name /media/snad/data/features/dr3/feature_718.name --answers=/media/maria/data/aad/test_answers_dr3_bud5_priors_inf0_test2.csv --anomalies=/media/maria/data/aad/test_anomalies_dr3_bud5_priors_inf_test20.csv --prior_answers=/media/maria/data/aad/test_answers_dr3_bud5_test2.csv aad --prior_influence 0
