chore: Bump the default split page concurrency (#122)
Verified that this shows a speedup by doing a local pip install and
running the following snippet before and after the change:

```
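import os
import time

from unstructured_client import UnstructuredClient
from unstructured_client.models import shared

# Placeholder configuration: these environment variable names are illustrative,
# point them at your own server URL and API key.
SERVER_URL = os.environ["UNSTRUCTURED_SERVER_URL"]
API_KEY = os.environ["UNSTRUCTURED_API_KEY"]

s = UnstructuredClient(
    server_url=SERVER_URL,
    api_key_auth=API_KEY,
)

filename = "../_sample_docs/layout-parser-paper.pdf"

with open(filename, "rb") as f:
    # Note that this currently only supports a single file
    files = shared.Files(
        content=f.read(),
        file_name=filename,
    )

req = shared.PartitionParameters(
    files=files,
    strategy="hi_res",
)

# Time a single partition request end to end
start_time = time.time()
resp = s.general.partition(req)
end_time = time.time()
print(f"Elapsed time: {end_time - start_time} seconds")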
```
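A single timed request can be noisy, so a before/after comparison like the one above may be easier to trust when each side is measured a few times. The helper below is a minimal sketch of that idea; `time_call` is a hypothetical name and is not part of `unstructured_client`.

```python
import statistics
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the median wall-clock duration of fn() over `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)
```

With the snippet above, this could be called as `time_call(lambda: s.general.partition(req))` once with the old default installed and once with the new one.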
awalker4 authored Jun 28, 2024
1 parent 854dfdf commit 1dd7794
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/unstructured_client/_hooks/custom/split_pdf_hook.py
@@ -36,7 +36,7 @@


 DEFAULT_STARTING_PAGE_NUMBER = 1
-DEFAULT_CONCURRENCY_LEVEL = 5
+DEFAULT_CONCURRENCY_LEVEL = 8
 MAX_CONCURRENCY_LEVEL = 15
 MIN_PAGES_PER_SPLIT = 2
 MAX_PAGES_PER_SPLIT = 20
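To illustrate what a concurrency level of this kind typically controls, here is a minimal sketch that caps the number of in-flight jobs with an `asyncio.Semaphore`. The two constants come from the diff above; `clamp_concurrency` and `run_bounded` are hypothetical helpers, and the clamping behavior is an assumption, not a description of how `split_pdf_hook.py` is actually written.

```python
import asyncio

# Values taken from the diff; everything else here is illustrative.
DEFAULT_CONCURRENCY_LEVEL = 8
MAX_CONCURRENCY_LEVEL = 15


def clamp_concurrency(requested=None):
    """Fall back to the default, and keep the value within [1, MAX] (assumed policy)."""
    if requested is None:
        return DEFAULT_CONCURRENCY_LEVEL
    return max(1, min(requested, MAX_CONCURRENCY_LEVEL))


async def run_bounded(jobs, limit):
    """Run zero-argument coroutine factories with at most `limit` in flight.

    Returns the results in input order plus the peak number of jobs
    that were running at the same time.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak
```

Under this model, raising the default from 5 to 8 lets more page-split requests overlap, which is consistent with the speedup reported in the commit message, as long as the server can absorb the extra parallel load.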
