
ISSUE: Import error during setup #14

Open
bgereke opened this issue Nov 13, 2024 · 1 comment
bgereke commented Nov 13, 2024

While attempting to follow the setup in the README, I am able to successfully run:

poetry poe local-infrastructure-up 

I can then access the ZenML dashboard; however, none of the pipelines show up there.

Attempting to run a pipeline like:

poetry poe run-digital-data-etl-paul

produces the following stack trace:

2024-11-13 15:34:17.646 | INFO | llm_engineering.settings:load_settings:94 - Loading settings from the ZenML secret store.
2024-11-13 15:34:17.882 | WARNING | llm_engineering.settings:load_settings:99 - Failed to load settings from the ZenML secret store. Defaulting to loading the settings from the '.env' file.
2024-11-13 15:34:17.934 | INFO | llm_engineering.infrastructure.db.mongo:new:20 - Connection to MongoDB with URI successful: mongodb://llm_engineering:[email protected]:27017
PyTorch version 2.5.1 available.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198 │
│ in _run_code:88 │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/tools/run.py:7 in │
│ │
│ 4 import click │
│ 5 from loguru import logger │
│ 6 │
│ ❱ 7 from llm_engineering import settings │
│ 8 from pipelines import ( │
│ 9 │ digital_data_etl, │
│ 10 │ end_to_end_data, │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/__init__.py:1 in │
│ │
│ ❱ 1 from llm_engineering import application, domain, infrastructure │
│ 2 from llm_engineering.settings import settings │
│ 3 │
│ 4 __all__ = ["settings", "application", "domain", "infrastructure"] │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/application/__init__.py:1 in │
│ │
│ │
│ ❱ 1 from . import utils │
│ 2 │
│ 3 __all__ = ["utils"] │
│ 4 │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/application/utils/__init__.py │
│ :2 in │
│ │
│ 1 from . import misc │
│ ❱ 2 from .split_user_full_name import split_user_full_name │
│ 3 │
│ 4 __all__ = ["misc", "split_user_full_name"] │
│ 5 │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/application/utils/split_user │
│ _full_name.py:1 in │
│ │
│ ❱ 1 from llm_engineering.domain.exceptions import ImproperlyConfigured │
│ 2 │
│ 3 │
│ 4 def split_user_full_name(user: str | None) -> tuple[str, str]: │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/domain/__init__.py:1 in │
│ │
│ │
│ ❱ 1 from . import base, chunks, cleaned_documents, dataset, documents, embedded_chunks, exce │
│ 2 │
│ 3 __all__ = [ │
│ 4 │ "base", │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/domain/base/__init__.py:2 in │
│ │
│ │
│ 1 from .nosql import NoSQLBaseDocument │
│ ❱ 2 from .vector import VectorBaseDocument │
│ 3 │
│ 4 __all__ = ["NoSQLBaseDocument", "VectorBaseDocument"] │
│ 5 │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/domain/base/vector.py:13 in │
│ │
│ │
│ 10 from qdrant_client.http.models import Distance, VectorParams │
│ 11 from qdrant_client.models import CollectionInfo, PointStruct, Record │
│ 12 │
│ ❱ 13 from llm_engineering.application.networks.embeddings import EmbeddingModelSingleton │
│ 14 from llm_engineering.domain.exceptions import ImproperlyConfigured │
│ 15 from llm_engineering.domain.types import DataCategory │
│ 16 from llm_engineering.infrastructure.db.qdrant import connection │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/application/networks/__init__ │
│ .py:1 in │
│ │
│ ❱ 1 from .embeddings import CrossEncoderModelSingleton, EmbeddingModelSingleton │
│ 2 │
│ 3 __all__ = ["EmbeddingModelSingleton", "CrossEncoderModelSingleton"] │
│ 4 │
│ │
│ /Users/briangereke/Projects/LLM-Engineers-Handbook/llm_engineering/application/networks/embeddin │
│ gs.py:8 in │
│ │
│ 5 import numpy as np │
│ 6 from loguru import logger │
│ 7 from numpy.typing import NDArray │
│ ❱ 8 from sentence_transformers.SentenceTransformer import SentenceTransformer │
│ 9 from sentence_transformers.cross_encoder import CrossEncoder │
│ 10 from transformers import AutoTokenizer │
│ 11 │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/__init__.py:14 in │
│ │
│ 11 │ export_optimized_onnx_model, │
│ 12 │ export_static_quantized_openvino_model, │
│ 13 ) │
│ ❱ 14 from sentence_transformers.cross_encoder.CrossEncoder import CrossEncoder │
│ 15 from sentence_transformers.datasets import ParallelSentencesDataset, SentencesDataset │
│ 16 from sentence_transformers.LoggingHandler import LoggingHandler │
│ 17 from sentence_transformers.model_card import SentenceTransformerModelCardData │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/cross_encoder/__init__.py:3 in │
│ │
│ 1 from __future__ import annotations │
│ 2 │
│ ❱ 3 from .CrossEncoder import CrossEncoder │
│ 4 │
│ 5 __all__ = ["CrossEncoder"] │
│ 6 │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py:18 in │
│ │
│ 15 from transformers.tokenization_utils_base import BatchEncoding │
│ 16 from transformers.utils import PushToHubMixin │
│ 17 │
│ ❱ 18 from sentence_transformers.evaluation.SentenceEvaluator import SentenceEvaluator │
│ 19 from sentence_transformers.readers import InputExample │
│ 20 from sentence_transformers.SentenceTransformer import SentenceTransformer │
│ 21 from sentence_transformers.util import fullname, get_device_name, import_from_string │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/evaluation/__init__.py:9 in │
│ │
│ 6 from .LabelAccuracyEvaluator import LabelAccuracyEvaluator │
│ 7 from .MSEEvaluator import MSEEvaluator │
│ 8 from .MSEEvaluatorFromDataFrame import MSEEvaluatorFromDataFrame │
│ ❱ 9 from .NanoBEIREvaluator import NanoBEIREvaluator │
│ 10 from .ParaphraseMiningEvaluator import ParaphraseMiningEvaluator │
│ 11 from .RerankingEvaluator import RerankingEvaluator │
│ 12 from .SentenceEvaluator import SentenceEvaluator │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/evaluation/NanoBEIREvaluator.py:11 in │
│ │
│ 8 from torch import Tensor │
│ 9 from tqdm import tqdm │
│ 10 │
│ ❱ 11 from sentence_transformers import SentenceTransformer │
│ 12 from sentence_transformers.evaluation.InformationRetrievalEvaluator import InformationRe │
│ 13 from sentence_transformers.evaluation.SentenceEvaluator import SentenceEvaluator │
│ 14 from sentence_transformers.similarity_functions import SimilarityFunction │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/SentenceTransformer.py:33 in │
│ │
│ 30 from transformers import is_torch_npu_available │
│ 31 from transformers.dynamic_module_utils import get_class_from_dynamic_module, get_relativ │
│ 32 │
│ ❱ 33 from sentence_transformers.model_card import SentenceTransformerModelCardData, generate │
│ 34 from sentence_transformers.similarity_functions import SimilarityFunction │
│ 35 │
│ 36 from . import MODEL_HUB_ORGANIZATION, __version__ │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/sentence_transformers/model_card.py:35 in │
│ │
│ 32 from sentence_transformers.util import fullname, is_accelerate_available, is_datasets_av │
│ 33 │
│ 34 if is_datasets_available(): │
│ ❱ 35 │ from datasets import Dataset, DatasetDict, IterableDataset, Value │
│ 36 │
│ 37 logger = logging.getLogger(__name__) │
│ 38 │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/__init__.py:17 in │
│ │
│ 14 │
│ 15 __version__ = "3.1.0" │
│ 16 │
│ ❱ 17 from .arrow_dataset import Dataset │
│ 18 from .arrow_reader import ReadInstruction │
│ 19 from .builder import ArrowBasedBuilder, BuilderConfig, DatasetBuilder, GeneratorBasedBui │
│ 20 from .combine import concatenate_datasets, interleave_datasets │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/arrow_dataset.py:76 in │
│ │
│ 73 from tqdm.contrib.concurrent import thread_map │
│ 74 │
│ 75 from . import config │
│ ❱ 76 from .arrow_reader import ArrowReader │
│ 77 from .arrow_writer import ArrowWriter, OptimizedTypedSequence │
│ 78 from .data_files import sanitize_patterns │
│ 79 from .download.streaming_download_manager import xgetsize │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/arrow_reader.py:30 in │
│ │
│ 27 import pyarrow.parquet as pq │
│ 28 from tqdm.contrib.concurrent import thread_map │
│ 29 │
│ ❱ 30 from .download.download_config import DownloadConfig # noqa: F401 │
│ 31 from .naming import _split_re, filenames_for_dataset_split │
│ 32 from .table import InMemoryTable, MemoryMappedTable, Table, concat_tables │
│ 33 from .utils import logging │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/download/__init__.py:9 in │
│ │
│ 6 ] │
│ 7 │
│ 8 from .download_config import DownloadConfig │
│ ❱ 9 from .download_manager import DownloadManager, DownloadMode │
│ 10 from .streaming_download_manager import StreamingDownloadManager │
│ 11 │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/download/download_manager.py:32 in │
│ │
│ 29 │
│ 30 from .. import config │
│ 31 from ..utils import tqdm as hf_tqdm │
│ ❱ 32 from ..utils.file_utils import ( │
│ 33 │ ArchiveIterable, │
│ 34 │ FilesIterable, │
│ 35 │ cached_path, │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/utils/file_utils.py:45 in │
│ │
│ 42 from ..filesystems import COMPRESSION_FILESYSTEMS │
│ 43 from . import _tqdm, logging │
│ 44 from ._filelock import FileLock │
│ ❱ 45 from .extract import ExtractManager │
│ 46 from .track import TrackedIterableFromGenerator │
│ 47 │
│ 48 │
│ │
│ /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/lib/pytho │
│ n3.11/site-packages/datasets/utils/extract.py:3 in │
│ │
│ 1 import bz2 │
│ 2 import gzip │
│ ❱ 3 import lzma │
│ 4 import os │
│ 5 import shutil │
│ 6 import struct │
│ │
│ /Users/briangereke/.pyenv/versions/3.11.8/lib/python3.11/lzma.py:27 in │
│ │
│ 24 import builtins │
│ 25 import io │
│ 26 import os │
│ ❱ 27 from _lzma import * │
│ 28 from _lzma import _encode_filter_properties, _decode_filter_properties │
│ 29 import _compression │
│ 30 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named '_lzma'


End of stack trace


This seems to be a dependency issue, potentially caused by tight version limits on sentence-transformers. Attempting to register a pipeline with the ZenML CLI produces a similar error importing '_lzma'. Here's some info on my environment:

poetry env info

Virtualenv
Python: 3.11.8
Implementation: CPython
Path: /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11
Executable: /Users/briangereke/Library/Caches/pypoetry/virtualenvs/llm-engineering-29meo7TT-py3.11/bin/python
Valid: True

Base
Platform: darwin
OS: posix
Python: 3.11.8
Path: /Users/briangereke/.pyenv/versions/3.11.8
Executable: /Users/briangereke/.pyenv/versions/3.11.8/bin/python3.11
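A quick check that the problem is the interpreter itself rather than the project's dependency pins (a minimal sketch, no project code involved): the bottom of the trace fails importing the stdlib `lzma` module, whose `_lzma` C extension only exists when Python was compiled against the xz headers.

```python
# If this raises ModuleNotFoundError: No module named '_lzma',
# the interpreter was built without xz support -- no amount of
# pip/poetry version pinning can fix that.
import lzma

data = b"example payload"
assert lzma.decompress(lzma.compress(data)) == data
print("interpreter has lzma support")
```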


bgereke commented Nov 14, 2024

It appears to be an issue with pyenv on macOS: https://stackoverflow.com/questions/59690698/modulenotfounderror-no-module-named-lzma-when-building-python-using-pyenv-on

This worked for me:

brew install xz
pyenv uninstall <desired-python-version>
pyenv install <desired-python-version>
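After the reinstall, a quick sanity check from inside the project env (run it with `poetry run python`). One caveat I'm not certain about: if Poetry's cached virtualenv still points at the old interpreter, it may need to be recreated (e.g. with `poetry env remove` followed by `poetry install`) before this passes.

```python
# Confirms which interpreter the env is using and that its _lzma
# extension now loads and round-trips data.
import sys
import lzma

print("interpreter:", sys.executable)
compressed = lzma.compress(b"ok")
assert lzma.decompress(compressed) == b"ok"
print("lzma loads and round-trips correctly")
```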

poetry poe run-digital-data-etl-paul now runs to completion, but pipelines don't show up in the ZenML dashboard until I run them. Maybe that's how it's supposed to work?
