[Flink] Integrate open lineage #532

Merged (5 commits) on Sep 24, 2024

@moresun moresun commented Aug 26, 2024

1. Support lineage for Kafka and LakeSoul DataStream sources that sink into LakeSoul
2. Support lineage for LakeSoul DataStream sources that sink to Doris, LakeSoul, and Kafka
3. Support lineage for Flink SQL submission
4. Support YARN and K8s application mode

Close #167

@xuchen-plus changed the title from "support flink lineage" to "[Flink] Integrate open lineage" on Aug 27, 2024
@@ -359,6 +374,12 @@ SPDX-License-Identifier: Apache-2.0
<version>3.3.2</version>
<scope>${local.scope}</scope>
</dependency>
<dependency>

Review comment: This dependency is already included.

@@ -446,6 +467,7 @@ SPDX-License-Identifier: Apache-2.0
<include>com.ververica:flink-sql-connector-mysql-cdc</include>
<include>com.dmetasoul:lakesoul-common</include>
<include>com.dmetasoul:lakesoul-io-java</include>
<include>org.apache.flink:flink-connector-mongodb</include>

Review comment: Remove this line (the flink-connector-mongodb include).

@@ -35,14 +37,53 @@ public class ExecuteSql {

private static final String COMMENT_PATTERN = "(--.*)|(((\\/\\*)+?[\\w\\W]+?(\\*\\/)+))";



Review comment: Is this method still in use? If not, remove it.
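
For readers unfamiliar with the `COMMENT_PATTERN` constant in the `ExecuteSql` hunk above: it matches `--` line comments and `/* ... */` block comments. The following is a minimal sketch of how such a pattern can strip comments from a submitted SQL script. The class name `SqlCommentStripper` and the method `stripComments` are hypothetical and do not come from this PR; only the regex string is copied from the diff, and applying it through `replaceAll` is an assumption about how it is used.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; only the regex string is taken from the ExecuteSql diff.
class SqlCommentStripper {

    // Matches "-- ..." up to the end of the line, or a "/* ... */" block (lazy, may span lines).
    private static final String COMMENT_PATTERN = "(--.*)|(((\\/\\*)+?[\\w\\W]+?(\\*\\/)+))";

    private static final Pattern COMPILED = Pattern.compile(COMMENT_PATTERN);

    // Returns the script with every comment match replaced by the empty string.
    static String stripComments(String script) {
        return COMPILED.matcher(script).replaceAll("");
    }

    public static void main(String[] args) {
        String script = "SELECT 1; -- trailing comment\n/* block\n comment */ SELECT 2;";
        System.out.println(stripComments(script));
    }
}
```

Because the pattern is purely textual, a `--` or `/*` that appears inside a quoted string literal would also be removed; avoiding that would need a SQL-aware tokenizer.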

@xuchen-plus added the enhancement (New feature or request) and flink (flink support into lakesoul) labels on Aug 27, 2024
maosen and others added 5 commits September 24, 2024 12:40
…akesoul

2 support lineage for lakesoul datastream source to sink to doris, lakesoul and kafka
3 support lineage for flink sql submit
4 support yarn and k8s application mode

Signed-off-by: maosen <[email protected]>
@xuchen-plus xuchen-plus merged commit 5d0b522 into lakesoul-io:main Sep 24, 2024
14 checks passed
Ceng23333 pushed a commit to Ceng23333/LakeSoul that referenced this pull request Nov 13, 2024
* 1 support lineage for kafka, lakesoul datastream source to sink into lakesoul
2 support lineage for lakesoul datastream source to sink to doris, lakesoul and kafka
3 support lineage for flink sql submit
4 support yarn and k8s application mode

Signed-off-by: maosen <[email protected]>

* modify lineage dependency

Signed-off-by: maosen <[email protected]>

* add lakesoul lineage listener

Signed-off-by: maosen <[email protected]>

* remove redundant items

Signed-off-by: maosen <[email protected]>

* default to jdk 11 for build

Signed-off-by: chenxu <[email protected]>

---------

Signed-off-by: maosen <[email protected]>
Signed-off-by: chenxu <[email protected]>
Co-authored-by: maosen <[email protected]>
Co-authored-by: chenxu <[email protected]>
Ceng23333 pushed a commit to Ceng23333/LakeSoul that referenced this pull request Nov 13, 2024
MR-title: [Flink] Integrate open lineage (lakesoul-io#532) * 1 support lineage for kafka, lakesoul datastream source to sink into lakesoul 2 support lineage for lakesoul datastream source to sink to doris, lakesoul and kafka 3 support lineage for flink sql submit 4 support yarn ...
Created-by: hw_syl_cyh
Author-id: 7155560
MR-id: 7279028
Commit-by: hw_syl_cyh;ChenYunHey;zenghua;hw_syl_zenghua;maosen;fphantam;Ceng;hw_jnj_syl;Xu Chen;moresun
Merged-by: hw_syl_cyh
Description: merge "openlinage" into "merge_main"
[Flink] Integrate open lineage (lakesoul-io#532)

* 1 support lineage for kafka, lakesoul datastream source to sink into lakesoul
2 support lineage for lakesoul datastream source to sink to doris, lakesoul and kafka
3 support lineage for flink sql submit
4 support yarn and k8s application mode

Signed-off-by: maosen <[email protected]>

* modify lineage dependency

Signed-off-by: maosen <[email protected]>

* add lakesoul lineage listener

Signed-off-by: maosen <[email protected]>

* remove redundant items

Signed-off-by: maosen <[email protected]>

* default to jdk 11 for build

Signed-off-by: chenxu <[email protected]>

---------

Signed-off-by: maosen <[email protected]>
Signed-off-by: chenxu <[email protected]>
Co-authored-by: maosen <[email protected]>
Co-authored-by: chenxu <[email protected]>,
add kafka lineage into and out of lakesoul,
extract job name for batch schedule
change job runstate from running to completed,
[Flink] Fix arrow sink config (lakesoul-io#546)

* fix arrow sink config

Signed-off-by: chenxu <[email protected]>

* fix ut

Signed-off-by: chenxu <[email protected]>

---------

Signed-off-by: chenxu <[email protected]>
Co-authored-by: chenxu <[email protected]>,
add scheduleTime replacer in flink sql submitter,
fix clean task bug when path is null

Signed-off-by: fphantam <[email protected]>,
[NativeIO/Fix] Add error info of native writer && fix case of aux_sort_cols (lakesoul-io#547)

* add error info of native writer && fix case of aux_sort_cols

Signed-off-by: zenghua <[email protected]>

* fix clippy

Signed-off-by: zenghua <[email protected]>

* do cargo fmt

Signed-off-by: zenghua <[email protected]>

---------

Signed-off-by: zenghua <[email protected]>
Co-authored-by: zenghua <[email protected]>,
add LakeSoulLocalJavaWriter

Signed-off-by: zenghua <[email protected]>,
fix case of native reader timestamp convert

Signed-off-by: zenghua <[email protected]>,
support cdc table

Signed-off-by: zenghua <[email protected]>,
add debug log for LakeSoulLocalJavaWriter

Signed-off-by: zenghua <[email protected]>,
add compaction ut and update transactionCommit interface

Signed-off-by: zenghua <[email protected]>,
compact with file number limit

Signed-off-by: zenghua <[email protected]>,
[WIP]support compaction with changing bucket num

Signed-off-by: zenghua <[email protected]>,
fix changing bucket num

Signed-off-by: zenghua <[email protected]>,
update pg tableinfo with bucketnum,
update assertion of ut

Signed-off-by: zenghua <[email protected]>,
add CompactionTask parameter file_num_limit

Signed-off-by: zenghua <[email protected]>,
fix case of cleanOldCompaction

Signed-off-by: zenghua <[email protected]>,
continue compaction operation when newBucketNum exists

Signed-off-by: fphantam <[email protected]>,
fileNumLimit compaction do not filter 'delete'

Signed-off-by: fphantam <[email protected]>,
fix compaction filter bug

Signed-off-by: fphantam <[email protected]>,
optimize clean sql and compaction task add reconnect function

Signed-off-by: fphantam <[email protected]>,
add limit push down for table source for batch and stream, but it does not work with an ORDER BY clause because of Flink,
fix output,
update the build.yml,
update the build.yml,
add compaction paras of size limit

Signed-off-by: zenghua <[email protected]>,
fix LakeSoulFileWriter bucketId

Signed-off-by: zenghua <[email protected]>,
copy compacted file by fs directly instead of spark read&write

Signed-off-by: zenghua <[email protected]>,
fix ci

Signed-off-by: zenghua <[email protected]>,
compaction with file size condition in parallel

Signed-off-by: zenghua <[email protected]>,
..

Signed-off-by: ChenYunHey <[email protected]>

See merge request: 42b369588d84469d95d7b738fc58da8e/LakeSoul/for-nanhang!9
Development

Successfully merging this pull request may close these issues.

[Flink] Add support for generating OpenLineage event in Flink sink job