Having trouble with chunk_10.zip #4

Open
tanxiaoman opened this issue Jun 5, 2024 · 2 comments

Comments

@tanxiaoman

Hi authors,

Thanks for your detailed codebase. I processed the raw data from the labeled_csv directory with the preprocessing code you provided, but I cannot reproduce the data in chunk_10.zip. Is the preprocessing code you uploaded different from the version you actually used? If so, could you update it to the correct version?

@BEbillionaireUSD
Owner

Hi, have you tested your processed data yet? Are the results very different from those in the paper? There may be slight differences after processing because of the parameter settings of the parsing tools and the way metrics are aligned with logs, but the overall results should be similar.

@tanxiaoman
Author

Thank you very much for your answer. Yes, I've tested it. The test result is: Test -- f1:0.3948,rc:0.4859,pc:0.3325.
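As a quick sanity check on the numbers above, F1 can be recomputed from the other two values. This is a minimal sketch that assumes `rc` is recall and `pc` is precision (the log does not spell this out) and that F1 is the usual harmonic mean; `f1_score` is a helper name introduced here for illustration, not a function from the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the reported test line; the rc/pc meaning is an assumption.
reported_f1 = 0.3948
recall = 0.4859      # "rc"
precision = 0.3325   # "pc"

recomputed = f1_score(precision, recall)
print(f"recomputed f1: {recomputed:.4f}")  # prints "recomputed f1: 0.3948"
```

Under those assumptions the three reported numbers are internally consistent, so the gap to the paper would come from the data or settings rather than from how the metric is computed.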
The parameter settings are as follows:
meta_info.json:
{
  "faults": [
    "2nd_namenode_killed",
    "2nd_namenode_suspended",
    "cpu_stress",
    "datanode_killed",
    "datanode_suspended",
    "io_stress",
    "lose_package",
    "master_killed",
    "master_suspended",
    "memory_stress",
    "mq_stress",
    "namenode_suspended",
    "net_delay",
    "net_killed",
    "nodemanager_killed",
    "nodemanager_suspended",
    "resourcemanager_suspended",
    "slow_stress",
    "vm_stress",
    "worker_killed",
    "worker_suspended"
  ],
  "metrics": [
    "system",
    "idle",
    "user",
    "iowait",
    "wkB_s",
    "util",
    "rkB_s",
    "commit",
    "memused",
    "rxkB_s",
    "txkB_s"
  ],
  "standards": [
    "normal1",
    "normal2",
    "normal3",
    "normal4",
    "normal5",
    "normal6",
    "normal7"
  ],
  "workloads": [
    "aggregation", "als", "bayes", "gbt", "gmm", "graph_pagerank", "join", "kmeans", "lda", "lr",
    "nweight", "pca", "repartition", "rf", "scan", "sort", "svd", "svm", "terasort", "web_pagerank", "wordcount"
  ]
}

seen_wks = ["join", "lda", "pagerank", "web_pagerank", "nweight", "wordcount"]
seen_faults = ['net_delay', 'datanode_killed', 'datanode_suspended', 'mq_stress', 'slow_stress', 'resourcemanager_suspended', 'vm_stress']

test_wks = ['bayes','gbt','sort','pca','svd']

The parameter settings come from the preprocessing code you provided. Is there a problem with them, or is there anything else I've overlooked?
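One way to sanity-check settings like these is to confirm that every name in `seen_wks`, `seen_faults`, and `test_wks` also appears in the corresponding `meta_info.json` list. The sketch below copies the lists from this comment; `unknown_names` is a hypothetical helper, not part of the repository. Whether an exact-name mismatch matters is an assumption: if the preprocessing code matches workloads by substring rather than by exact name, a flagged entry may still be intentional.

```python
def unknown_names(selected, known):
    """Return the names in `selected` that are absent from `known`, in order."""
    known_set = set(known)
    return [name for name in selected if name not in known_set]


# Lists copied from the settings posted in this comment.
workloads = [
    "aggregation", "als", "bayes", "gbt", "gmm", "graph_pagerank", "join",
    "kmeans", "lda", "lr", "nweight", "pca", "repartition", "rf", "scan",
    "sort", "svd", "svm", "terasort", "web_pagerank", "wordcount",
]
faults = [
    "2nd_namenode_killed", "2nd_namenode_suspended", "cpu_stress",
    "datanode_killed", "datanode_suspended", "io_stress", "lose_package",
    "master_killed", "master_suspended", "memory_stress", "mq_stress",
    "namenode_suspended", "net_delay", "net_killed", "nodemanager_killed",
    "nodemanager_suspended", "resourcemanager_suspended", "slow_stress",
    "vm_stress", "worker_killed", "worker_suspended",
]
seen_wks = ["join", "lda", "pagerank", "web_pagerank", "nweight", "wordcount"]
seen_faults = ["net_delay", "datanode_killed", "datanode_suspended", "mq_stress",
               "slow_stress", "resourcemanager_suspended", "vm_stress"]
test_wks = ["bayes", "gbt", "sort", "pca", "svd"]

checks = [
    ("seen_wks", seen_wks, workloads),
    ("seen_faults", seen_faults, faults),
    ("test_wks", test_wks, workloads),
]
for label, selected, known in checks:
    print(f"{label}: not in meta_info.json -> {unknown_names(selected, known)}")

# Seen and test workloads should normally be disjoint; report any overlap.
print("seen/test overlap:", sorted(set(seen_wks) & set(test_wks)))
```

With the lists as posted, the only name this flags is "pagerank" in `seen_wks`: `meta_info.json` lists "graph_pagerank" and "web_pagerank" but no plain "pagerank". That may or may not be related to the difference in results.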
