Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.lang.OutOfMemoryError: Java heap space #479

Open
arezki1990 opened this issue Nov 15, 2017 · 0 comments
Open

java.lang.OutOfMemoryError: Java heap space #479

arezki1990 opened this issue Nov 15, 2017 · 0 comments

Comments

@arezki1990
Copy link

i am trying to build (spark)a cube on kylin with a join between two huge tables , can you help me out with that please
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:471)
at java.util.IdentityHashMap.put(IdentityHashMap.java:440)
at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:476)
at org.apache.kylin.dict.TrieDictionaryBuilder.build(TrieDictionaryBuilder.java:418)
at org.apache.kylin.dict.lookup.SnapshotTable.takeSnapshot(SnapshotTable.java:98)
at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:139)
at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:287)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:87)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:49)
at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:66)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:62)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

this is my spark config

kylin.env.hadoop-conf-dir=/home/hadoop/apache-kylin-2.2.0-bin/hadoop-conf

Estimate the RDD partition numbers

#kylin.engine.spark.rdd-partition-cut-mb=10

Minimal partition numbers of rdd

#kylin.engine.spark.min-partition=1

Max partition numbers of rdd

#kylin.engine.spark.max-partition=5000

Spark conf (default is in spark/conf/spark-defaults.conf)

#kylin.engine.spark-conf.spark.master=yarn
##kylin.engine.spark-conf.spark.submit.deployMode=cluster
#kylin.engine.spark-conf.spark.yarn.queue=default
kylin.engine.spark-conf.spark.executor.memory=20G
kylin.engine.spark-conf.spark.executor.cores=8
kylin.engine.spark-conf.spark.executor.instances=6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant