
libraries-json configuration not working #26

Open
JamesBorg opened this issue Nov 3, 2022 · 3 comments

Comments

@JamesBorg

I'm trying to have a library installed on the created cluster, but I'm running into the following error:

Error: {"error_code":"MALFORMED_REQUEST","message":"Could not parse request object: Expected 'START_OBJECT' not 'VALUE_STRING'\n at [Source: (ByteArrayInputStream); line: 1, column: 405]\n at [Source: java.io.ByteArrayInputStream@c6f971a; line: 1, column: 405]"}

Here is a copy of the workflow configuration:

name: Databricks notebook running test
on:
  workflow_dispatch:
  push:

env:
  DATABRICKS_HOST: https://******************.azuredatabricks.net
  NODE_TYPE_ID: Standard_NC6s_v3
  GITHUB_TOKEN: ${{ secrets.REPO_TOKEN }}

jobs:
  databricks_notebook_test:
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3
      - name: Generate AAD Token
        run: ./.github/workflows/scripts/generate-aad-token.sh ${{ secrets.AZURE_SP_TENANT_ID }} ${{ secrets.AZURE_SP_APPLICATION_ID }} ${{ secrets.AZURE_SP_CLIENT_SECRET }}
      - name: Train model
        uses: databricks/run-notebook@v0
        id: train
        with:
          local-notebook-path: notebooks/test.py
          git-commit: ${{ github.event.pull_request.head.sha || github.sha }}
          libraries-json: >
            [
              { "pypi": "accelerate" }
            ]
          new-cluster-json: >
            {
              "spark_version": "11.1.x-gpu-ml-scala2.12",
              "num_workers": 0,
              "spark_conf": {
                "spark.databricks.cluster.profile": "singleNode",
                "spark.master": "local[*, 4]",
                "spark.databricks.delta.preview.enabled": "true"
              },
              "node_type_id": "${{ env.NODE_TYPE_ID }}",
              "custom_tags": {
                "ResourceClass": "SingleNode"
              }
            }
          access-control-list-json: >
            [
              {
                "group_name": "users",
                "permission_level": "CAN_VIEW"
              }
            ]
          run-name: testing github triggering of databricks notebook

The workflow runs through fine with the libraries-json configuration removed (and the necessary library installed within the triggered notebook instead).

Is this a bug? Or am I misunderstanding how libraries-json can be used?

@JamesBorg
Author

Thanks to @vladimirk-db, who provided the solution.

The library entry needed to be modified to:

{ "pypi": { "package": "accelerate" } }
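For anyone landing here, this is how the corrected entry would look in the workflow above (a sketch based on the original configuration, with only the `pypi` value changed):

```yaml
          # Each pypi entry must be an object with a "package" field,
          # not a bare string, per the Databricks Libraries API schema.
          libraries-json: >
            [
              { "pypi": { "package": "accelerate" } }
            ]
```

The bare string form appears to be what triggers the `Expected 'START_OBJECT' not 'VALUE_STRING'` parse error in the `MALFORMED_REQUEST` response.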

Perhaps the README should be updated to reflect this?

@motya770

motya770 commented Sep 5, 2023

See also #46.

@benoitmiserez

Updated in #52
