-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flink-27826] Support training very high dimensional logistic regression #237
Conversation
d57c3de
to
c8cf098
Compare
81b8391
to
3138661
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Left some comments below.
flink-ml-lib/src/main/java/org/apache/flink/ml/common/updater/ModelUpdater.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/updater/FTRL.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/updater/ModelUpdater.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/updater/ModelUpdater.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/flink/ml/classification/logisticregression/LogisticRegressionWithFtrl.java
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/ps/message/MessageType.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/ps/message/ZerosToPushM.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/ps/message/IndicesToPullM.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/ps/message/MessageUtils.java
Outdated
Show resolved
Hide resolved
flink-ml-lib/src/main/java/org/apache/flink/ml/common/ps/MirrorWorkerOperator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Left some comments below.
Hi, @zhipeng93 . As you have mentioned the proposed solution in this PR could be an alternative to implement GBDT in #210, can we make the target applications/scenarios clearer in the description of this PR? Although the description states the infrastructure can be reused by other ML algorithms, the infra codes seem not easy to be adapted. Many APIs/enums/codes in the infra codes are specifically designed/hard-coded for gradient-based algorithms, like If the PS infra is targeting to algorithms more than gradient-based algorithms, the infra APIs may need to reflect the considerations on those cases. |
This PR introduces too many changes (i.e., communication infra, vectors, algorithms) and it is hard to review. I will open another PR [1] to introduce the communication infra first and address the comments raised in this PR. [1] #251 |
What is the purpose of the change
This PR aims to (1) support training high dimensional logistic regression models with a parameter-server style infrasture, with flink-ml-iteration infra. (2) Abstract infrasture that could be reused by other machine learning algorithms.
Brief change log
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation