Add MigrationSequencer for jobs #3008

Merged
merged 106 commits into from
Nov 1, 2024
Changes from 105 commits
9539244
make simple_dependency_resolver available more broadly
ericvergnaud Oct 16, 2024
1740e53
build migration steps for workflow task
ericvergnaud Oct 16, 2024
186ea87
fix pylint warnings
ericvergnaud Oct 16, 2024
f01c4c0
fix pylint warnings
ericvergnaud Oct 16, 2024
6613d6e
add object name
ericvergnaud Oct 16, 2024
0252ed4
populate object owner
ericvergnaud Oct 16, 2024
9f5705a
be more defensive
ericvergnaud Oct 16, 2024
1cee0ce
move last_node_id to sequencer
ericvergnaud Oct 17, 2024
069b4d9
cherry-pick changes
ericvergnaud Oct 17, 2024
2472d35
use existing Ownership classes
ericvergnaud Oct 17, 2024
446fab5
fix merge issues
ericvergnaud Oct 17, 2024
a19f939
move package
ericvergnaud Oct 21, 2024
5bd3b67
improve assert style
ericvergnaud Oct 21, 2024
1087463
formatting
ericvergnaud Oct 21, 2024
77af278
make 'incoming' transient and improve comments
ericvergnaud Oct 21, 2024
8478d26
Lint unit test
JCZuurmond Oct 29, 2024
66e7122
Test all steps in sequence
JCZuurmond Oct 29, 2024
cfb555e
Sort step ids
JCZuurmond Oct 29, 2024
c5a27f4
Add docstring for migration sequencing
JCZuurmond Oct 29, 2024
e7ff130
Format
JCZuurmond Oct 29, 2024
ad8f457
Update docstring
JCZuurmond Oct 29, 2024
00f4dc1
Use priority queue
JCZuurmond Oct 29, 2024
08a676a
Add custom queue
JCZuurmond Oct 29, 2024
29555e3
Add source
JCZuurmond Oct 29, 2024
7c9e770
Fix implementation of custom queue
JCZuurmond Oct 29, 2024
e2be4a3
Update docs
JCZuurmond Oct 29, 2024
652d173
Comment counter
JCZuurmond Oct 29, 2024
d2505af
Add typehint for migration node id
JCZuurmond Oct 29, 2024
4aed3da
Add docstrings to MigrationNodes
JCZuurmond Oct 29, 2024
183eba5
Add TODO
JCZuurmond Oct 29, 2024
ace7ef2
Add docstrings to methods
JCZuurmond Oct 29, 2024
ee42ec3
Add docstrings on MigrationStep attributes
JCZuurmond Oct 29, 2024
3497b1c
Make types more consistent
JCZuurmond Oct 29, 2024
f2324e8
Make migration node frozen
JCZuurmond Oct 29, 2024
7fcaaee
Add generic to priorityqueue
JCZuurmond Oct 29, 2024
ad34395
Fix PriorityQueue type hints
JCZuurmond Oct 29, 2024
c048880
Update docs
JCZuurmond Oct 29, 2024
4ba7408
Resolve TODO
JCZuurmond Oct 29, 2024
9410c84
Add seen migration node set
JCZuurmond Oct 29, 2024
08dfe32
Use ordered steps to compute step number
JCZuurmond Oct 29, 2024
4b84652
Fix create node queue
JCZuurmond Oct 29, 2024
c227165
Remove update from queue
JCZuurmond Oct 29, 2024
0138906
Move mock admin locator to fixture
JCZuurmond Oct 29, 2024
74d115d
Use ResourceDoesNotExists
JCZuurmond Oct 29, 2024
12c6241
Introduce MaybeMigrationNode
JCZuurmond Oct 29, 2024
8b02db1
Test register job cluster
JCZuurmond Oct 29, 2024
f12c25b
Add happy path test for cluster
JCZuurmond Oct 29, 2024
346f253
Rewrite order of workflow task to job
JCZuurmond Oct 29, 2024
f301839
Make register workflow task hidden
JCZuurmond Oct 29, 2024
f1225be
Use itertools.counter
JCZuurmond Oct 29, 2024
a4ac8fa
Move register workflow job up
JCZuurmond Oct 29, 2024
0e07552
Rename to register job
JCZuurmond Oct 29, 2024
9b3b2e6
Make methods hidden
JCZuurmond Oct 29, 2024
23d158b
Rename protected access methods
JCZuurmond Oct 30, 2024
844cc7f
Test job referencing unknown job cluster
JCZuurmond Oct 30, 2024
8261f0f
Test job with existing cluster
JCZuurmond Oct 30, 2024
04d7fe5
Use MaybeMigrationNode when registering job
JCZuurmond Oct 30, 2024
136ee2e
Test register a job with non existing cluster
JCZuurmond Oct 30, 2024
8fc4437
Propagate cluster problems
JCZuurmond Oct 30, 2024
ab40144
Update condition
JCZuurmond Oct 30, 2024
5d93a77
Fix passing job node instead of job
JCZuurmond Oct 30, 2024
a6cbaad
Remove redundant import
JCZuurmond Oct 30, 2024
e8c64d3
fix testing dependency problem
JCZuurmond Oct 30, 2024
6a62ee0
Add docstring to registering job
JCZuurmond Oct 30, 2024
a29d130
Register job cluster
JCZuurmond Oct 30, 2024
6444363
Test register job with non-existing job cluster key
JCZuurmond Oct 30, 2024
e93db45
Test register existing job cluster
JCZuurmond Oct 30, 2024
a0b3a67
Fix missing parameter
JCZuurmond Oct 30, 2024
4869679
Handle non-existing job cluster
JCZuurmond Oct 31, 2024
bc6f7f2
Test for new cluster
JCZuurmond Oct 31, 2024
bea1eb1
Fix sequencing
JCZuurmond Oct 31, 2024
26b48c0
Fix parent object id
JCZuurmond Oct 31, 2024
04e2943
Update sequencing
JCZuurmond Oct 31, 2024
7e577e1
Add type hints for problems
JCZuurmond Oct 31, 2024
d431702
Explain why not handling new cluster
JCZuurmond Oct 31, 2024
5f2d07b
Format
JCZuurmond Oct 31, 2024
8f85d69
Sequence with new cluster
JCZuurmond Oct 31, 2024
d25664c
Add job cluster to nodes
JCZuurmond Oct 31, 2024
698e338
Make job cluster id unique
JCZuurmond Oct 31, 2024
5551c26
Make register signature similar
JCZuurmond Oct 31, 2024
cb22042
Explain outgoing references
JCZuurmond Oct 31, 2024
cfc23c9
Remove TODO
JCZuurmond Oct 31, 2024
6feebfa
Update docs and naming
JCZuurmond Oct 31, 2024
ae059a0
Update tests docs
JCZuurmond Oct 31, 2024
2b90951
Test sequence job with task referencing job cluster
JCZuurmond Oct 31, 2024
d453f53
Fix test sequence in docstring
JCZuurmond Oct 31, 2024
5524191
Add test when referencing non-existing cluster
JCZuurmond Oct 31, 2024
8d40f52
Remove unused fixtures
JCZuurmond Oct 31, 2024
6d757bb
Remove redundant comment and fix docstring
JCZuurmond Oct 31, 2024
a4ace82
Test task dependency
JCZuurmond Oct 31, 2024
742694f
Sequence tasks with dependencies
JCZuurmond Oct 31, 2024
f14a0f5
Add TODOs for register workflow task
JCZuurmond Oct 31, 2024
d093a47
Add todo for register cluster
JCZuurmond Oct 31, 2024
26f060b
Add migration sequencer to RuntimeContext
JCZuurmond Oct 31, 2024
418718f
Add integration test for simple job
JCZuurmond Oct 31, 2024
c55c821
Test sequencing job with task referencing existing cluster
JCZuurmond Oct 31, 2024
23bec09
Test sequencing job with task referencing non existing cluster
JCZuurmond Oct 31, 2024
b2c1671
Format
JCZuurmond Oct 31, 2024
0392cae
Test sequence job with task dependency
JCZuurmond Oct 31, 2024
0137e85
Test non-existing task dependency
JCZuurmond Oct 31, 2024
5278398
Return early when no job.settings
JCZuurmond Oct 31, 2024
df45c53
Fix typo's
JCZuurmond Oct 31, 2024
1ea02f7
Rename administrator locator
JCZuurmond Nov 1, 2024
d26916b
Do not expect cluster in sequence steps when not found
JCZuurmond Nov 1, 2024
ba8b4c6
Refactor register_job to register_jobs
JCZuurmond Nov 1, 2024
9842f64
Remove unused import
JCZuurmond Nov 1, 2024
24 changes: 14 additions & 10 deletions src/databricks/labs/ucx/assessment/clusters.py
@@ -49,6 +49,18 @@ class ClusterInfo:

     __id_attributes__: ClassVar[tuple[str, ...]] = ("cluster_id",)

+    @classmethod
+    def from_cluster_details(cls, details: ClusterDetails):
+        return ClusterInfo(
+            cluster_id=details.cluster_id if details.cluster_id else "",
+            cluster_name=details.cluster_name,
+            policy_id=details.policy_id,
+            spark_version=details.spark_version,
+            creator=details.creator_user_name or None,
+            success=1,
+            failures="[]",
+        )
+

 class CheckClusterMixin(CheckInitScriptMixin):
     _ws: WorkspaceClient
@@ -156,7 +168,7 @@ def _crawl(self) -> Iterable[ClusterInfo]:
         all_clusters = list(self._ws.clusters.list())
         return list(self._assess_clusters(all_clusters))

-    def _assess_clusters(self, all_clusters):
+    def _assess_clusters(self, all_clusters: Iterable[ClusterDetails]):
         for cluster in all_clusters:
             if cluster.cluster_source == ClusterSource.JOB:
                 continue
@@ -166,15 +178,7 @@ def _assess_clusters(self, all_clusters):
                     f"Cluster {cluster.cluster_id} have Unknown creator, it means that the original creator "
                     f"has been deleted and should be re-created"
                 )
-            cluster_info = ClusterInfo(
-                cluster_id=cluster.cluster_id if cluster.cluster_id else "",
-                cluster_name=cluster.cluster_name,
-                policy_id=cluster.policy_id,
-                spark_version=cluster.spark_version,
-                creator=creator,
-                success=1,
-                failures="[]",
-            )
+            cluster_info = ClusterInfo.from_cluster_details(cluster)
             failures = self._check_cluster_failures(cluster, "cluster")
             if len(failures) > 0:
                 cluster_info.success = 0
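For readers who want to try the `from_cluster_details` factory outside the repository, here is a minimal sketch of the same pattern. It is not the project's code: `FakeClusterDetails` is a hypothetical stand-in for the SDK's `ClusterDetails`, and the `ClusterInfo` below lists only the fields visible in the diff, with an assumed field order.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeClusterDetails:
    """Hypothetical stand-in for the SDK's ClusterDetails (assumption)."""

    cluster_id: str | None = None
    cluster_name: str | None = None
    policy_id: str | None = None
    spark_version: str | None = None
    creator_user_name: str | None = None


@dataclass
class ClusterInfo:
    """Simplified record; only the fields that appear in the diff."""

    cluster_id: str
    success: int
    failures: str
    cluster_name: str | None = None
    policy_id: str | None = None
    spark_version: str | None = None
    creator: str | None = None

    @classmethod
    def from_cluster_details(cls, details: FakeClusterDetails) -> ClusterInfo:
        # Mirrors the mapping in the diff: a missing cluster_id becomes "",
        # and an empty creator string is normalised to None.
        return cls(
            cluster_id=details.cluster_id if details.cluster_id else "",
            cluster_name=details.cluster_name,
            policy_id=details.policy_id,
            spark_version=details.spark_version,
            creator=details.creator_user_name or None,
            success=1,
            failures="[]",
        )


# A cluster with no id and an empty creator string.
info = ClusterInfo.from_cluster_details(FakeClusterDetails(creator_user_name=""))
```

Keeping the mapping in one classmethod lets the crawler and any other caller build the same record from a cluster without repeating the field list.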
24 changes: 13 additions & 11 deletions src/databricks/labs/ucx/assessment/jobs.py
@@ -21,6 +21,7 @@
     RunType,
     SparkJarTask,
     SqlTask,
+    Job,
 )

 from databricks.labs.ucx.assessment.clusters import CheckClusterMixin
@@ -43,6 +44,17 @@ class JobInfo:

     __id_attributes__: ClassVar[tuple[str, ...]] = ("job_id",)

+    @classmethod
+    def from_job(cls, job: Job):
+        job_name = job.settings.name if job.settings and job.settings.name else "Unknown"
+        return JobInfo(
+            job_id=str(job.job_id),
+            success=1,
+            failures="[]",
+            job_name=job_name,
+            creator=job.creator_user_name or None,
+        )
+

 class JobsMixin:
     @classmethod
@@ -127,17 +139,7 @@ def _prepare(all_jobs) -> tuple[dict[int, set[str]], dict[int, JobInfo]]:
             job_settings = job.settings
             if not job_settings:
                 continue
-            job_name = job_settings.name
-            if not job_name:
-                job_name = "Unknown"
-
-            job_details[job.job_id] = JobInfo(
-                job_id=str(job.job_id),
-                job_name=job_name,
-                creator=creator_user_name,
-                success=1,
-                failures="[]",
-            )
+            job_details[job.job_id] = JobInfo.from_job(job)
         return job_assessment, job_details

     def _try_fetch(self) -> Iterable[JobInfo]:
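The `JobInfo.from_job` factory in the jobs.py diff follows the same pattern, with a fallback name for jobs that have no settings or no name. A minimal sketch, assuming hypothetical `FakeJob` and `FakeJobSettings` stand-ins for the SDK's `Job` and `JobSettings` and a simplified `JobInfo`:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeJobSettings:
    """Hypothetical stand-in for the SDK's JobSettings (assumption)."""

    name: str | None = None


@dataclass
class FakeJob:
    """Hypothetical stand-in for the SDK's Job (assumption)."""

    job_id: int | None = None
    creator_user_name: str | None = None
    settings: FakeJobSettings | None = None


@dataclass
class JobInfo:
    """Simplified record; only the fields that appear in the diff."""

    job_id: str
    success: int
    failures: str
    job_name: str | None = None
    creator: str | None = None

    @classmethod
    def from_job(cls, job: FakeJob) -> JobInfo:
        # Same fallback as the diff: no settings or an empty name gives "Unknown".
        job_name = job.settings.name if job.settings and job.settings.name else "Unknown"
        return cls(
            job_id=str(job.job_id),
            success=1,
            failures="[]",
            job_name=job_name,
            creator=job.creator_user_name or None,
        )


unnamed = JobInfo.from_job(FakeJob(job_id=123))
named = JobInfo.from_job(
    FakeJob(job_id=7, creator_user_name="a@b.c", settings=FakeJobSettings(name="nightly"))
)
```

In the diff, `_prepare` still skips jobs without settings before calling `from_job`, so the `job.settings` check inside the factory mainly protects other callers.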