
Hive 29574 merge join poc #6456

Open
illiabarbashov-sketch wants to merge 2 commits into apache:master from illiabarbashov-sketch:HIVE-29574_merge_join_poc

@illiabarbashov-sketch
Contributor

What changes were proposed in this pull request?

  1. New Hive configuration properties: hive.merge.join.skew.threshold and hive.merge.join.skew.abort.
  2. A new SkewedMergeJoinMonitor class that reads those properties and either logs a skewed join event or aborts the job.
  3. Unit tests covering positive and negative cases.
  4. Query tests covering clientpositive and clientnegative cases.
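
To illustrate the log-or-abort behavior described in item 2, here is a minimal sketch. It is not the SkewedMergeJoinMonitor from this patch: the class name SkewMonitorSketch, its methods, and the exception type are hypothetical, and it assumes the threshold means "maximum rows per join key", which the description above does not confirm.

```java
import java.util.Objects;

/** Hypothetical sketch of a merge join skew monitor; not the class from this PR. */
final class SkewMonitorSketch {

    /** Thrown when a key exceeds the threshold and abort is enabled. */
    static final class SkewAbortException extends RuntimeException {
        SkewAbortException(String message) {
            super(message);
        }
    }

    // Assumed stand-in for hive.merge.join.skew.threshold (rows per key; <= 0 disables).
    private final long threshold;
    // Assumed stand-in for hive.merge.join.skew.abort.
    private final boolean abortOnSkew;

    private Object currentKey;
    private long rowsForCurrentKey;
    private boolean reported;
    private int skewedKeyCount;

    SkewMonitorSketch(long threshold, boolean abortOnSkew) {
        this.threshold = threshold;
        this.abortOnSkew = abortOnSkew;
    }

    /**
     * Call once per input row. Rows are assumed to arrive grouped by key,
     * as they do in a sort-merge join. Returns true only for the row that
     * first pushes the current key over the threshold.
     */
    boolean onRow(Object key) {
        if (rowsForCurrentKey == 0 || !Objects.equals(key, currentKey)) {
            // A new key group starts: reset the per-key state.
            currentKey = key;
            rowsForCurrentKey = 0;
            reported = false;
        }
        rowsForCurrentKey++;

        if (threshold <= 0 || reported || rowsForCurrentKey <= threshold) {
            return false;
        }

        reported = true;
        skewedKeyCount++;
        String message = "Skewed merge join key " + key
                + ": more than " + threshold + " rows";
        if (abortOnSkew) {
            throw new SkewAbortException(message);
        }
        // A real operator would use its logger; stderr keeps the sketch self-contained.
        System.err.println("WARN " + message);
        return true;
    }

    int getSkewedKeyCount() {
        return skewedKeyCount;
    }
}
```

Reporting each skewed key only once keeps the log readable when a single key carries millions of rows; whether the actual patch does the same is not stated here.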

Why are the changes needed?

  1. This feature adds observability to the merge join operator by flagging skewed keys.
  2. Users run into skewed merge joins where the job appears stuck, makes no progress, and gives no information about the cause. This feature adds a configuration option to abort the stuck job once the threshold is hit.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Unit tests
  • Query tests with clientpositive and clientnegative cases

@sonarqubecloud

2 participants