-
Notifications
You must be signed in to change notification settings - Fork 842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Phi3poc #2301
base: master
Are you sure you want to change the base?
[WIP] Phi3poc #2301
Conversation
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2301 +/- ##
==========================================
- Coverage 83.67% 83.41% -0.27%
==========================================
Files 331 331
Lines 17177 17177
Branches 1526 1526
==========================================
- Hits 14373 14328 -45
- Misses 2804 2849 +45 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
core/src/main/python/synapse/ml/llm/HuggingFaceCausallmTransform.py
Outdated
Show resolved
Hide resolved
core/src/main/python/synapse/ml/llm/HuggingFaceCausallmTransform.py
Outdated
Show resolved
Hide resolved
core/src/main/python/synapse/ml/llm/HuggingFaceCausallmTransform.py
Outdated
Show resolved
Hide resolved
self.config.update(kwargs) | ||
|
||
|
||
def camel_to_snake(text): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there might already be one in library to use
"output column", | ||
typeConverter=TypeConverters.toString, | ||
) | ||
modelParam = Param(Params._dummy(), "modelParam", "Model Parameters") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain difference between model params and other params (you can just link to other docs if easier)
typeConverter=TypeConverters.toString, | ||
) | ||
modelParam = Param(Params._dummy(), "modelParam", "Model Parameters") | ||
modelConfig = Param(Params._dummy(), "modelConfig", "Model configuration") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain difference between model config and other params (you can just link to other docs if easier)
useFabricLakehouse = Param( | ||
Params._dummy(), | ||
"useFabricLakehouse", | ||
"Use FabricLakehouse", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is for a local cache then you might be able to make the verbage generic like useLocalCache
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
return re.sub(r"(?<!^)(?=[A-Z])", "_", text).lower() | ||
|
||
|
||
class ComputableObject: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename to _BroadcastableModel
if value is not None: | ||
bc_computable = ComputableObject(value, self.getModelConfig()) | ||
sc = SparkSession.builder.getOrCreate().sparkContext | ||
self.bcObject = sc.broadcast(bc_computable) | ||
else: | ||
self.bcObject = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont do this here
if self.getCachePath(): | ||
bc_computable = ComputableObject(self.getCachePath(), self.getModelConfig()) | ||
sc = SparkSession.builder.getOrCreate().sparkContext | ||
self.bcObject = sc.broadcast(bc_computable) | ||
else: | ||
self.bcObject = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do this in transform
modelName = Param( | ||
Params._dummy(), | ||
"modelName", | ||
"model name", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to link to the list of models on huggingface
cachePath = Param( | ||
Params._dummy(), | ||
"cachePath", | ||
"cache path for the model. could be a lakehouse path", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should mention that this should be a shared location between the workers
deviceMap = Param( | ||
Params._dummy(), | ||
"deviceMap", | ||
"Specifies a model parameter for the device Map. For GPU usage with models such as Phi 3, set it to 'cuda'.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to mention phi3 specifically here
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
Briefly describe the changes included in this Pull Request.
How is this patch tested?
Does this PR change any dependencies?
Does this PR add a new feature? If so, have you added samples on website?
website/docs/documentation
folder.Make sure you choose the correct class
estimators/transformers
and namespace.DocTable
points to correct API link.yarn run start
to make sure the website renders correctly.<!--pytest-codeblocks:cont-->
before each python code blocks to enable auto-tests for python samples.WebsiteSamplesTests
job pass in the pipeline.