feat(document-search): Allow to create DocumentSearch instances from config #117

GitHub Actions / JUnit Test Report failed Oct 7, 2024 in 0s

82 tests run, 76 passed, 4 skipped, 2 failed.

Annotations

Check failure on line 27 in packages/ragbits-document-search/tests/unit/test_document_search.py

@github-actions github-actions / JUnit Test Report

test_document_search

ValueError: UNSTRUCTURED_API_KEY environment variable is not set
Raw output
async def test_document_search():
        document_search = DocumentSearch(embedder=NoopEmbeddings(), vector_store=InMemoryVectorStore())
    
>       await document_search.ingest_document(
            DocumentMeta.create_text_document_from_literal("Name of Peppa's brother is George")
        )

packages/ragbits-document-search/tests/unit/test_document_search.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
packages/ragbits-document-search/src/ragbits/document_search/_main.py:119: in ingest_document
    elements = await document_processor.process(document_meta)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ragbits.document_search.ingestion.providers.unstructured.UnstructuredProvider object at 0x7fd882ea94b0>
document_meta = DocumentMeta(document_type=<DocumentType.TXT: 'txt'>, source=LocalFileSource(source_type='local_file', path=PosixPath('/tmp/tmp7w8yk0qr')))

    async def process(self, document_meta: DocumentMeta) -> list[Element]:
        """Process the document using the Unstructured API.
    
        Args:
            document_meta: The document to process.
    
        Returns:
            The list of elements extracted from the document.
    
        Raises:
            ValueError: If the UNSTRUCTURED_API_KEY or UNSTRUCTURED_API_URL environment variables are not set.
            DocumentTypeNotSupportedError: If the document type is not supported.
    
        """
        self.validate_document_type(document_meta.document_type)
        if (api_key := os.getenv(UNSTRUCTURED_API_KEY_ENV)) is None:
>           raise ValueError(f"{UNSTRUCTURED_API_KEY_ENV} environment variable is not set")
E           ValueError: UNSTRUCTURED_API_KEY environment variable is not set

packages/ragbits-document-search/src/ragbits/document_search/ingestion/providers/unstructured.py:79: ValueError

Check failure on line 42 in packages/ragbits-document-search/tests/unit/test_document_search.py

@github-actions github-actions / JUnit Test Report

test_document_search.test_document_search_from_config

ValueError: UNSTRUCTURED_API_KEY environment variable is not set
Raw output
async def test_document_search_from_config():
        document_search = DocumentSearch.from_config(CONFIG)
    
>       await document_search.ingest_document(
            DocumentMeta.create_text_document_from_literal("Name of Peppa's brother is George")
        )

packages/ragbits-document-search/tests/unit/test_document_search.py:42: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
packages/ragbits-document-search/src/ragbits/document_search/_main.py:119: in ingest_document
    elements = await document_processor.process(document_meta)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ragbits.document_search.ingestion.providers.unstructured.UnstructuredProvider object at 0x7fd88340f550>
document_meta = DocumentMeta(document_type=<DocumentType.TXT: 'txt'>, source=LocalFileSource(source_type='local_file', path=PosixPath('/tmp/tmpojvcopfz')))

    async def process(self, document_meta: DocumentMeta) -> list[Element]:
        """Process the document using the Unstructured API.
    
        Args:
            document_meta: The document to process.
    
        Returns:
            The list of elements extracted from the document.
    
        Raises:
            ValueError: If the UNSTRUCTURED_API_KEY or UNSTRUCTURED_API_URL environment variables are not set.
            DocumentTypeNotSupportedError: If the document type is not supported.
    
        """
        self.validate_document_type(document_meta.document_type)
        if (api_key := os.getenv(UNSTRUCTURED_API_KEY_ENV)) is None:
>           raise ValueError(f"{UNSTRUCTURED_API_KEY_ENV} environment variable is not set")
E           ValueError: UNSTRUCTURED_API_KEY environment variable is not set

packages/ragbits-document-search/src/ragbits/document_search/ingestion/providers/unstructured.py:79: ValueError
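Both failures come from the same environment check in `UnstructuredProvider.process`. The sketch below reproduces that check in isolation and shows one possible way a unit test could control the variable with the standard library's `unittest.mock.patch.dict`. The helper `read_api_key`, the two `check_*` functions, and the `"dummy-key"` value are hypothetical names introduced for illustration; they are not part of ragbits, and this is not necessarily how the PR resolves the failure.

```python
import os
from unittest import mock

# Same variable name that appears in the error message above.
UNSTRUCTURED_API_KEY_ENV = "UNSTRUCTURED_API_KEY"


def read_api_key() -> str:
    """Hypothetical stand-in for the environment check shown in the traceback."""
    if (api_key := os.getenv(UNSTRUCTURED_API_KEY_ENV)) is None:
        raise ValueError(f"{UNSTRUCTURED_API_KEY_ENV} environment variable is not set")
    return api_key


def check_with_key_unset() -> str:
    """Reproduce the CI failure: the variable is absent, so the check raises."""
    # patch.dict restores os.environ on exit, including any key removed inside the block.
    with mock.patch.dict(os.environ):
        os.environ.pop(UNSTRUCTURED_API_KEY_ENV, None)
        try:
            read_api_key()
        except ValueError as exc:
            return str(exc)
    return ""


def check_with_key_set() -> str:
    """Provide a placeholder value (an assumption) so the check passes."""
    with mock.patch.dict(os.environ, {UNSTRUCTURED_API_KEY_ENV: "dummy-key"}):
        return read_api_key()


if __name__ == "__main__":
    print(check_with_key_unset())
    print(check_with_key_set())
```

Setting a placeholder key only gets past this check; a real call to the Unstructured API would still need valid credentials, so a unit test might instead configure a processor that does not depend on the external service.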