Revert "Preserve case for RowType's field name and JSON content when … #21864

Closed
wants to merge 63 commits into from

Conversation

NikhilCollooru
Contributor

Reverts PR #21602.

== NO RELEASE NOTE ==

arhimondr and others added 30 commits January 24, 2024 12:58
When process is terminated with core wait for termination
The current join nodes are in presto-main. Moving them requires moving a
lot of objects.

Since we don't need to allow the connectors to optimize the joins for now,
we can just add an adapter node that will be converted back to the
internal node type when returned by connector optimizers.
Support for $path and $data_sequence_number hidden columns.

The first is generally useful and the second is a requirement for
implementing equality deletes as a join.
The current equality delete implementation applies deletes at the split
level.

Since equality deletes often apply to a lot of files, the current
implementation ends up opening the delete files #splits * #delete_files
times.

This commit implements equality deletes as a join. A connector optimizer
is added to apply the appropriate join(s).
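
To illustrate the cost difference described in this commit message, here is a minimal sketch in Python. It is not the Presto implementation: the in-memory `DATA_FILES` and `DELETE_FILES` tables, the `read_delete_file` helper, and the `id` key column are all hypothetical. The sketch also assumes that a delete only applies to rows from data files with a strictly smaller sequence number, which is one reason a per-row data sequence number is useful when the delete is expressed as a join.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for data files and equality delete files.
# "seq" plays the role of a data sequence number (an assumption of this sketch).
DATA_FILES = {
    "data-1": {"seq": 1, "rows": [{"id": 1}, {"id": 2}]},
    "data-2": {"seq": 1, "rows": [{"id": 3}, {"id": 4}]},
    "data-3": {"seq": 3, "rows": [{"id": 2}]},  # written after the delete
}
DELETE_FILES = {
    "delete-1": {"seq": 2, "keys": [2, 3]},
}

# Counts how many times each delete file is opened.
open_counts = defaultdict(int)


def read_delete_file(name):
    """Simulate opening a delete file and record the open."""
    open_counts[name] += 1
    return DELETE_FILES[name]


def scan_split_level():
    """Apply deletes per split: every split opens every delete file."""
    result = []
    for data in DATA_FILES.values():
        deletes = [read_delete_file(name) for name in DELETE_FILES]
        for row in data["rows"]:
            deleted = any(
                row["id"] in d["keys"] and data["seq"] < d["seq"]
                for d in deletes
            )
            if not deleted:
                result.append(row["id"])
    return result


def scan_as_join():
    """Read each delete file once, then anti-join data rows against it."""
    # Build side: key -> highest delete sequence number seen for that key.
    delete_seq = {}
    for name in DELETE_FILES:
        d = read_delete_file(name)
        for key in d["keys"]:
            delete_seq[key] = max(delete_seq.get(key, 0), d["seq"])
    # Probe side: drop a row only if a newer delete matches its key.
    result = []
    for data in DATA_FILES.values():
        for row in data["rows"]:
            key = row["id"]
            if not (key in delete_seq and data["seq"] < delete_seq[key]):
                result.append(key)
    return result
```

Both functions return the same rows, but the split-level scan opens each delete file once per split, while the join form opens each delete file once in total.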
…ffer manager

Add an HTTP request active check when fetching data from the output buffer manager
in Velox. The active check is based on whether the HTTP callstate has been destroyed
or the associated request has expired. This prevents an arbitrary output buffer
from loading data into a destination buffer which has set notify but whose associated
client request has expired. This helps accelerate the shuffle for queries with a scale
writer, which uses an arbitrary output buffer.

A unit test is added to verify this behavior.
Point the original endpoint to /v1/task/async
Add task runtime stats to collect memory reclaim stats during task execution.
This helps debug how much time is spent on task memory reclaim during a slow query
execution. If the memory reclaim is triggered by the task itself, the time is
also counted in the task's CPU execution time. If not, it is counted in the task's scheduled
time but not its execution time, as the task will be stopped before memory reclamation.
To reduce likelihood of running into port allocation conflicts
Partitions are a coordinator-only parameter and can therefore be ignored when serializing
…ck for too long

Also detect deadlock or starving and signal alerts if these happen.
- Implement HTTP request retries in PrestoSparkHttpTaskClient once and
  reuse them for all kinds of Task requests
- Do not retry PrestoException and invalid responses by the server as
  they usually indicate programming errors and not communication errors
  (rely on SimpleHttpResponseHandler to mimic Presto coordinator logic)
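
The retry policy in the two bullets above can be sketched as follows. This is an illustrative Python sketch, not the PrestoSparkHttpTaskClient code: `CommunicationError`, `ServerLogicError`, and `execute_with_retries` are hypothetical names, and the linear backoff is an assumption.

```python
import time


class CommunicationError(Exception):
    """Hypothetical stand-in for a transport-level failure such as a timeout."""


class ServerLogicError(Exception):
    """Hypothetical stand-in for a PrestoException or an invalid response."""


def execute_with_retries(request, max_attempts=3, backoff_seconds=0.0):
    """Call request() and retry only when it raises CommunicationError.

    ServerLogicError is deliberately not caught, so it propagates on its
    first occurrence instead of being retried.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except CommunicationError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(backoff_seconds * attempt)  # assumed linear backoff
```

Keeping the loop in one helper means every kind of task request shares the same retry behavior, which is the reuse the first bullet asks for.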
Avoid using scheduled executors to run callbacks as they may run out of
threads and have none left to schedule tasks
ScheduledThreadPoolExecutor is a fixed-size thread pool and it never
grows
Presto doesn't have PartitionedOutputNode and assigns its source node's plan
node id to PartitionedOutputOperator,
so the Presto UI expects PartitionedOutputOperator to have the same node id as its
source operator.
However, Velox has PartitionedOutputNode, whose plan node id is set to "root".

To comply with the original assumption, set the PartitionedOutputNode plan node id to
"root.<source plan node ID>" and parse it back to the source plan node ID in task
stats reporting.

Issue: #21741
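
The ID convention described in the commit message above amounts to a prefix round trip. A minimal Python sketch follows; the function names and the `ROOT_PREFIX` constant are hypothetical and are not taken from the Presto or Velox code.

```python
# Prefix assumed from the "root.<source plan node ID>" format in the commit message.
ROOT_PREFIX = "root."


def to_partitioned_output_id(source_plan_node_id):
    """Build the plan node ID assigned to the partitioned output node."""
    return ROOT_PREFIX + source_plan_node_id


def to_reported_plan_node_id(plan_node_id):
    """Map a plan node ID back to the source plan node ID for stats reporting.

    IDs without the prefix are returned unchanged.
    """
    if plan_node_id.startswith(ROOT_PREFIX):
        return plan_node_id[len(ROOT_PREFIX):]
    return plan_node_id
```

For example, a source plan node ID of `"5"` becomes `"root.5"` on the partitioned output node and is reported as `"5"` again in task stats.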
feilong-liu and others added 19 commits January 31, 2024 11:05
We want to use runtimestats in DirectoryLister implementation to
measure different runtime metrics
This was missed in the recent dependency update.
This has been moved to the corresponding Velox setup scripts
This was required for the antlr dependency which is now removed.
RE2 is installed in the Velox centos setup script
The six package is not used in Prestissimo
NikhilCollooru requested review from presto-oss and removed request for a team February 5, 2024 18:14
NikhilCollooru deleted the revert-21602-master branch February 5, 2024 18:15

github-actions bot commented Feb 5, 2024

Codenotify: Notifying subscribers in CODENOTIFY files for diff c192fea...25c5208.

Notify File(s)
@steveburnett presto-docs/src/main/sphinx/connector/iceberg.rst
presto-docs/src/main/sphinx/installation.rst
presto-docs/src/main/sphinx/installation/deploy-helm.rst
presto-docs/src/main/sphinx/language/types.rst

@sdruzkin
Collaborator

sdruzkin commented Feb 5, 2024

@NikhilCollooru looks like you reverted a bunch of other PRs with this 4 line PR. Was it intentional?

@ajaygeorge
Contributor

@sdruzkin doesn't look like this PR was merged eventually. Are you seeing your commits reverted on master?
