This repository has been archived by the owner on Mar 30, 2024. It is now read-only.
Add support for OpenChat 3.5/Zephyr 7B β, improve fallbacks of `repetition_penalty`, support multiple messages in request body (#82)
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
1 parent: d84b08e
commit: 0c6e6b4
Showing 7 changed files with 152 additions and 51 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -55,7 +55,7 @@ jobs: | |
push: true | ||
tags: | | ||
${{ env.REGISTRY }}/${{ env.REPO_ORG_NAME }}/${{ env.IMAGE_NAME }}:${{ github.sha }} | ||
-build-gptq-cuda12-image: | ||
+build-gptq-image: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Checkout | ||
|
@@ -74,7 +74,7 @@ jobs: | |
uses: docker/build-push-action@v4 | ||
with: | ||
context: . | ||
-file: ./Dockerfile.cuda12 | ||
+file: ./Dockerfile.gptq | ||
push: true | ||
tags: | | ||
${{ env.REGISTRY }}/${{ env.REPO_ORG_NAME }}/${{ env.IMAGE_NAME }}:${{ env.GPTQ_IMAGE_TAG }} | ||
|
@@ -124,7 +124,7 @@ jobs: | |
helm install $LLAMA_HELM_RELEASE_NAME -f values.yaml --namespace $HELM_NAMESPACE ./charts/ialacol | ||
echo "Wait for the pod to be ready, it takes about 36s to download a 1.93GB model (~50MB/s)" | ||
-sleep 40 | ||
+sleep 120 | ||
- if: always() | ||
run: | | ||
kubectl get pods -n $HELM_NAMESPACE | ||
|
@@ -215,7 +215,7 @@ jobs: | |
helm install $GPT_NEOX_HELM_RELEASE_NAME -f values.yaml --namespace $HELM_NAMESPACE ./charts/ialacol | ||
echo "Wait for the pod to be ready, it takes about 36s to download a 1.93GB model (~50MB/s)" | ||
-sleep 40 | ||
+sleep 120 | ||
- if: always() | ||
run: | | ||
kubectl get pods -n $HELM_NAMESPACE | ||
|
@@ -283,7 +283,7 @@ jobs: | |
helm install $STARCODER_HELM_RELEASE_NAME -f values.yaml --namespace $HELM_NAMESPACE ./charts/ialacol | ||
echo "Wait for the pod to be ready" | ||
-sleep 20 | ||
+sleep 120 | ||
- if: always() | ||
run: | | ||
kubectl get pods -n $HELM_NAMESPACE | ||
|
@@ -303,7 +303,7 @@ jobs: | |
kubectl logs --tail=200 --selector app.kubernetes.io/name=$STARCODER_HELM_RELEASE_NAME -n $HELM_NAMESPACE | ||
gptq-smoke-test: | ||
runs-on: ubuntu-latest | ||
-needs: build-gptq-cuda12-image | ||
+needs: build-gptq-image | ||
steps: | ||
- name: Create k8s Kind Cluster | ||
uses: helm/[email protected] | ||
|
@@ -323,7 +323,7 @@ jobs: | |
cat > values.yaml <<EOF | ||
replicas: 1 | ||
deployment: | ||
-image: ${{ env.REGISTRY }}/${{ env.REPO_ORG_NAME }}/${{ env.IMAGE_NAME }}:${{ ${{ env.GPTQ_IMAGE_TAG }} | ||
+image: ${{ env.REGISTRY }}/${{ env.REPO_ORG_NAME }}/${{ env.IMAGE_NAME }}:${{ env.GPTQ_IMAGE_TAG }} | ||
env: | ||
DEFAULT_MODEL_HG_REPO_ID: $GPTQ_MODEL_HG_REPO_ID | ||
DEFAULT_MODEL_HG_REPO_REVISION: $GPTQ_MODEL_HG_REVISION | ||
|
@@ -347,8 +347,8 @@ jobs: | |
EOF | ||
helm install $GPTQ_HELM_RELEASE_NAME -f values.yaml --namespace $HELM_NAMESPACE ./charts/ialacol | ||
-echo "Wait for the pod to be ready, it takes about 36s to download a 1.93GB model (~50MB/s)" | ||
-sleep 40 | ||
+echo "Wait for the pod to be ready, GPTQ image is around 1GB" | ||
+sleep 240 | ||
- if: always() | ||
run: | | ||
kubectl get pods -n $HELM_NAMESPACE | ||
|
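The hunks above replace each fixed `sleep` with a longer one (120 seconds, or 240 seconds for the GPTQ image). One possible alternative, which is not part of this commit, is to let `kubectl wait` block until the pod reports Ready and use the sleep duration only as an upper bound. The sketch below assumes that the GPTQ pods carry the label `app.kubernetes.io/name=$GPTQ_HELM_RELEASE_NAME`, by analogy with the selector that the existing `kubectl logs` step uses for the StarCoder release. The step name and the 240s timeout are illustrative choices.

```yaml
# Hypothetical replacement for the fixed `sleep 240` (not part of this commit).
- name: Wait for the GPTQ pod to become Ready  # step name is illustrative
  run: |
    # Returns as soon as the pod is Ready, or fails once the 240s timeout expires.
    # The label selector is an assumption modelled on the existing `kubectl logs` step.
    kubectl wait --for=condition=Ready pod \
      --selector app.kubernetes.io/name=$GPTQ_HELM_RELEASE_NAME \
      --namespace $HELM_NAMESPACE \
      --timeout=240s
```

If the pod has not been created yet when the command runs, `kubectl wait` may fail with a "no matching resources" error, so a short `sleep` before it may still be needed.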
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
apiVersion: v2 | ||
-appVersion: 0.12.0 | ||
+appVersion: 0.13.0 | ||
description: A Helm chart for ialacol | ||
name: ialacol | ||
type: application | ||
-version: 0.12.0 | ||
+version: 0.13.0 |