[PLUGIN-1856] Error management for Wrangler plugin #726

Merged

@Amit-CloudSufi Amit-CloudSufi commented Jan 29, 2025

https://cdap.atlassian.net/browse/PLUGIN-1856

image

[
  {
    "stageName": "Wrangler",
    "errorCategory": "Plugin-'Wrangler'",
    "errorReason": "Error in stage 'Wrangler'. Format of output schema specified is invalid. Please check the format. com.google.gson.stream.MalformedJsonException: Unterminated object at line 3 column 4 path $.type",
    "errorMessage": "Error in stage 'Wrangler'. Format of output schema specified is invalid. Please check the format. com.google.gson.stream.MalformedJsonException: Unterminated object at line 3 column 4 path $.type",
    "errorType": "USER",
    "dependency": "false"
  }
]
2025-01-30 12:36:48,237 - ERROR [spark-submitter-phase-1-bfc7d018-ded8-11ef-8496-0000009adf90:o.a.s.i.i.SparkHadoopWriter@98] - Aborting job job_202501301236463168859500037448187_0005.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (192.168.1.100 executor driver): org.apache.spark.SparkException: Task failed while writing rows
	at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:163)
	at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$write$1(SparkHadoopWriter.scala:88)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException: Error when transforming stage Wrangler: com.google.common.util.concurrent.UncheckedExecutionException: io.cdap.cdap.api.exception.WrappedStageException: Stage 'Wrangler' encountered : io.cdap.cdap.api.exception.ProgramFailureException: Error in stage 'Wrangler'. Format of output schema specified is invalid. Please check the format. com.google.gson.stream.MalformedJsonException: Unterminated object at line 3 column 4 path $.type
	at io.cdap.cdap.etl.spark.function.TransformFunction.call(TransformFunction.java:61)
	at org.apache.spark.api.java.JavaRDDLike.$anonfun$flatMap$1(JavaRDDLike.scala:125)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
	at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$executeTask$1(SparkHadoopWriter.scala:136)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1538)
	at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:135)
	... 9 more
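
The trace above nests the `ProgramFailureException` under several wrapper exceptions (`SparkException`, `ExecutionException`, `UncheckedExecutionException`, `WrappedStageException`). As a rough illustration of how a nested exception can be located by walking the cause chain, here is a minimal sketch that uses only `java.lang.Throwable`. The class `CauseChainSketch` and its methods are hypothetical names made up for this example; they are not the plugin's `WranglerErrorUtil` and make no claim about how it is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Wrangler plugin.
final class CauseChainSketch {

  private CauseChainSketch() { }

  // Returns the throwables from outermost to innermost.
  // Stops early if a throwable repeats, to guard against cyclic cause chains.
  static List<Throwable> causalChain(Throwable top) {
    List<Throwable> chain = new ArrayList<>();
    Throwable current = top;
    while (current != null && !chain.contains(current)) {
      chain.add(current);
      current = current.getCause();
    }
    return chain;
  }

  // Returns the first throwable in the chain that is an instance of the given type,
  // or null when the chain contains none.
  static <T extends Throwable> T firstOfType(Throwable top, Class<T> type) {
    for (Throwable t : causalChain(top)) {
      if (type.isInstance(t)) {
        return type.cast(t);
      }
    }
    return null;
  }
}
```

With a helper like this, a caller could look up a specific exception type anywhere in the chain and fall back to a generic error reason when `firstOfType` returns `null`.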
Loading Directive exception:

image
image

Precondition exception:

Screenshot from 2025-02-10 10-45-02

Directive Parsing Exception

Screenshot from 2025-02-10 11-02-06

RecordConvertorException

Screenshot from 2025-02-10 13-05-49

@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from df84fa4 to b70d696 Compare January 29, 2025 08:25
@Amit-CloudSufi Amit-CloudSufi changed the title Error management for Wrangler plugin [PLUGIN-1856] Error management for Wrangler plugin Jan 29, 2025
@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from f48259e to 97fa44c Compare January 29, 2025 08:41
@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from a54195a to d46f401 Compare January 29, 2025 08:57
@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from 335bb68 to 1e04c2d Compare January 29, 2025 09:13
@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch 2 times, most recently from be7d0f8 to 645593d Compare January 29, 2025 09:35
@psainics psainics added the build Triggers unit test build label Jan 29, 2025

@psainics psainics left a comment

Please fix the unit tests.

@psainics

E2E fixed in #727!

@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from cb50c30 to 5465f7e Compare February 10, 2025 13:02
@@ -573,7 +619,9 @@ public Relation transform(RelationalTranformContext relationalTranformContext, R
         && checkPreconditionNotEmpty(true)) {

       if (!Feature.WRANGLER_PRECONDITION_SQL.isEnabled(relationalTranformContext)) {
-        throw new RuntimeException("SQL Precondition feature is not available");
+        String errorReason = "SQL Precondition feature is not available";
+        throw WranglerErrorUtil.getProgramFailureExceptionDetailsFromChain(null, errorReason,

similar comment here

Also, this method is private. How was it tested?

@itsankit-google itsankit-google left a comment

LGTM

Please squash commits before merge.

Please remember to cherry-pick the PR in release/4.11.

@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from 0293f11 to 7b4619d Compare February 12, 2025 05:19
@Amit-CloudSufi Amit-CloudSufi force-pushed the wranglerTransformPlugin branch from 7b4619d to f26ecd2 Compare February 12, 2025 09:15
@psainics psainics merged commit f9ece97 into data-integrations:develop Feb 12, 2025
5 checks passed
3 participants