Full preload example #1041

Closed
benjick opened this issue Nov 19, 2024 · 3 comments
Labels
question Further information is requested

Comments

@benjick
benjick commented Nov 19, 2024

Question

Hello!

I'm looking for a full "preload model" Node.js example.

Say I do this:

import { env } from '@huggingface/transformers';
env.allowRemoteModels = false;
env.localModelPath = '/path/to/local/models/';

How do I "get" the model to that path? I want to download it when building my Docker image.

@benjick benjick added the question Further information is requested label Nov 19, 2024
@benjick benjick closed this as completed Nov 19, 2024
@benjick benjick reopened this Nov 19, 2024
@emojiiii

@xenova
Collaborator

xenova commented Nov 26, 2024

@benjick You can save your model to the following path: '/path/to/local/models/<model_id>'
e.g., '/path/to/local/models/hf-internal-testing/tiny-random-LlamaForCausalLM'

The folder should have the same structure as the model repository on the Hugging Face Hub, e.g.,
[image]
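
To make the expected layout concrete, here is a minimal sketch that maps a model ID and file name to the local path described above. It assumes POSIX-style paths (as in a Linux Docker image). `localModelFile` is a hypothetical helper, not part of `@huggingface/transformers`, and the file names are simply the ones used by the download script later in this thread; other models may need different files.

```typescript
import * as path from "node:path";

// Hypothetical helper (not a library API): where a file from the Hub repo
// is expected to live under the local model path.
export function localModelFile(
  localModelPath: string,
  modelId: string,
  file: string,
): string {
  // path.posix keeps forward slashes regardless of the host OS.
  return path.posix.join(localModelPath, modelId, file);
}

// Example values taken from this thread; the file list is an assumption.
const files = ["config.json", "tokenizer.json", "onnx/model.onnx"];
for (const file of files) {
  console.log(
    localModelFile(
      "/path/to/local/models/",
      "hf-internal-testing/tiny-random-LlamaForCausalLM",
      file,
    ),
  );
}
```

Each printed path is `<localModelPath>/<model_id>/<file>`, with the `onnx/` subfolder preserved.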

@benjick
Author

benjick commented Nov 26, 2024

Thank you both. I ended up with this script:

import { createWriteStream } from "fs";
import fs from "fs/promises";
import path from "path";

import { env } from "./env.js";

const FILES_TO_DOWNLOAD = [
  "config.json",
  "tokenizer.json",
  "tokenizer_config.json",
  "onnx/model.onnx",
  "onnx/model_quantized.onnx",
] as const;

const downloadFileWithProgress = async ({
  modelId,
  baseFolder,
  filePath,
  onProgress,
}: {
  modelId: string;
  baseFolder: string;
  filePath: string;
  onProgress?: (progress: number) => void;
}) => {
  try {
    const fullPath = path.join(baseFolder, modelId, filePath);
    const tempPath = `${fullPath}.tmp`;

    try {
      await fs.access(fullPath);
      console.log(`${filePath} already exists ✅`);
      return fullPath;
    } catch {
      console.log(`Downloading ${filePath}... 📥`);
    }

    await fs.mkdir(path.dirname(fullPath), { recursive: true });
    const response = await fetch(
      `https://huggingface.co/${modelId}/resolve/main/${filePath}`,
    );

    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    const totalSize = Number(response.headers.get("content-length"));
    let loadedSize = 0;

    const writer = createWriteStream(tempPath);
    const reader = response.body!.getReader();

    while (true) {
      const { done, value } = await reader.read();

      if (done) break;

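      // Respect backpressure: wait for "drain" when the stream's buffer is full.
      if (!writer.write(value)) {
        await new Promise((resolve) => writer.once("drain", resolve));
      }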
      loadedSize += value.length;

      // content-length may be missing (totalSize is then 0); skip progress in that case.
      if (totalSize > 0) {
        onProgress?.((loadedSize / totalSize) * 100);
      }
    }

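    // Wait until the file is flushed, and surface write errors instead of hanging.
    await new Promise((resolve, reject) => {
      writer.on("finish", resolve);
      writer.on("error", reject);
      writer.end();
    });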
    await fs.rename(tempPath, fullPath);

    return fullPath;
  } catch (error) {
    console.error(`Failed to download ${filePath}:`, error);
    throw error;
  }
};

export const downloadModel = async () => {
  try {
    const start = new Date().getTime();
    const results: string[] = [];

    for (const file of FILES_TO_DOWNLOAD) {
      let lastProgressString = "";
      const filePath = await downloadFileWithProgress({
        modelId: env.MODEL,
        baseFolder: env.LOCAL_MODEL_PATH,
        filePath: file,
        onProgress: (progress) => {
          const progressString = `Downloading ${file}: ${progress.toFixed(0)}%`;
          if (progressString !== lastProgressString) {
            process.stdout.write(`\r${progressString}`);
            lastProgressString = progressString;
          }
        },
      });
      console.log(`\n${file} ready at: ${filePath} 🎉`);
      results.push(filePath);
    }

    const end = new Date().getTime();
    console.log(`\nAll files downloaded successfully! ✨`);
    console.log(`Time taken: ${end - start}ms`);
    return results;
  } catch (error) {
    console.error("Download failed", error);
    throw error;
  }
};

if (import.meta.url === new URL(process.argv[1], "file:").href) {
  void downloadModel();
}
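
Since the script above runs at image build time, it may be worth checking that every expected file is really on disk before the app starts with `env.allowRemoteModels = false`. The following is a minimal sketch under that assumption. `findMissingFiles` is a hypothetical helper, not part of `@huggingface/transformers`, and it only checks that files exist, not that they are complete or valid.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (not a library API): returns the entries of `files`
// that are not present under <baseFolder>/<modelId>/.
export function findMissingFiles(
  baseFolder: string,
  modelId: string,
  files: readonly string[],
): string[] {
  return files.filter(
    (file) => !fs.existsSync(path.join(baseFolder, modelId, file)),
  );
}
```

Calling it with `env.LOCAL_MODEL_PATH`, `env.MODEL`, and `FILES_TO_DOWNLOAD` after `downloadModel()` finishes, and exiting with a non-zero code when the result is non-empty, would make the Docker build fail early rather than at first inference.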

@benjick benjick closed this as completed Nov 26, 2024