vocab-mapper: feedback, ideas, next steps #164

Open
josephjclark opened this issue Jan 29, 2025 · 2 comments
josephjclark (Collaborator) commented Jan 29, 2025

Randomish thoughts:

  • we need to save the Sheets extension to a repo here on GitHub, and work out how to publish it outside of Sheets
  • work out how to add more logging so we can see more stuff from the workflow view. I think this means a) log more frequently in apollo itself, and b) work out how to get log lines into the job
  • work out how to feed errors back into the sheet
  • Instantly update the sheet columns with "loading..."
  • Add a streaming mode, which processes terms one at a time and emits them through a websocket to the caller. This lets us update the sheet in real time. Kind of hard but very achievable (see the sketch after this list).
  • Run faster: it currently takes 3 minutes to process results. Does this matter? Can we easily speed it up? What about streaming mode? vocab-mapper: use batching for better performance #165
  • When we're done, show a toast saying "we've finished!" in the sheet
  • Is there any way we can poll apollo to get a progress update for the job?
  • How can we limit calls, so that if the workflow is already running for that sheet, we return an error? Oh, collections!
  • Can users paste their own target values? We'd have to process and embed them. When do we remove them? Should we cache?
  • How do users control target data sets?
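
A minimal sketch of what streaming mode could look like on the apollo side, using Python's `websockets` library. Everything here is illustrative, not the real apollo API: `process_term` is a hypothetical stand-in for the actual search + selection pipeline, and the message shapes are assumptions.

```python
# Sketch only: `process_term` and the message shapes are assumptions.
import asyncio
import json

import websockets  # pip install websockets


async def process_term(term: str) -> dict:
    # Placeholder for the real search + LLM selection steps.
    await asyncio.sleep(1)
    return {"term": term, "match": "..."}


async def handler(ws):
    # Caller sends a JSON list of terms; we emit one result per term
    # as soon as it's ready, so the sheet can update in real time.
    terms = json.loads(await ws.recv())
    for term in terms:
        result = await process_term(term)
        await ws.send(json.dumps(result))
    await ws.send(json.dumps({"done": True}))


async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run forever


if __name__ == "__main__":
    asyncio.run(main())
```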
josephjclark (Collaborator, Author) commented:
Regarding logs:

Right now the workflow calls out to apollo via the REST API. It's a black box: all logs are invisible to Lightning (and therefore to the user).

How can we use websocket events in the http adaptor in a workflow to stream logs? Presumably we'd need special adaptor support for this?

The alternative would be an Apollo adaptor, which would be in a much better place to handle this use-case. It would almost be a copy-and-paste of the core CLI code.
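
For reference, a hypothetical client-side sketch of consuming apollo logs over a websocket in Python. The endpoint path and payload shape are assumptions for illustration, not apollo's actual protocol:

```python
# Hypothetical log consumer: the endpoint and payload shape are
# assumptions, not apollo's real protocol.
import asyncio
import json

import websockets  # pip install websockets


async def stream_logs(job_id: str):
    uri = f"ws://localhost:3000/ws/logs/{job_id}"  # assumed endpoint
    async with websockets.connect(uri) as ws:
        async for message in ws:
            event = json.loads(message)
            # Forward each log line to the caller (e.g. Lightning).
            print(f"[{event.get('level', 'info')}] {event.get('message')}")


if __name__ == "__main__":
    asyncio.run(stream_logs("example-job-id"))
```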

hanna-paasivirta (Contributor) commented Feb 12, 2025

  • Speed:

    • We have added some concurrent processing (vocab-mapper: use batching for better performance #165), but testing the ideal settings (fast while stable) will cost some time and API tokens. The settings for the different steps need to be tuned individually, and the logic could be improved.
    • With our Anthropic tokens/minute limit, we can only manage a maximum of two concurrent calls. I have only tested 25–50 inputs at a time, and we will probably hit the limit with more inputs. This means that with two users we would need to slow the pipeline down and wait for 60 seconds whenever limits are reached (see the throttling sketch after this list).
  • Cost & Speed: this first version maximised accuracy. We can probably make the mapper faster and cheaper with optimisations such as:

    • Combining the top-n and top-1 selection steps – this is the most obvious one, and could lower the cost by up to 40%. Since splitting them improved performance a little, I've kept them separate for now.
    • Limiting the total number of search results.
    • Integrating the batch API option for 50% cheaper 24-hour processing (rough sketch at the end of this comment).
    • I haven't implemented skipping user locked-in answers yet!
  • Performance:

    • The biggest bottleneck might be the search. If we could optimise it to return fewer specialised results for non-specialised inputs, that would make the selection step easier (and cheaper).
    • The vector search in particular favours long, specialised terms over simple ones. Different embeddings, a different search algorithm, or different text preprocessing might help here.
    • For keyword search, something rule-based could work (trim results by length relative to the length of the input?).
    • The LLM tends to pick specialised results. If this isn't caused by the search step, a different model or added reasoning steps might help.
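
A minimal sketch of the throttling described above, assuming the `anthropic` Python SDK: cap concurrency at two calls and pause for 60 seconds when a rate-limit error is raised. The model name, prompt, and limits are placeholders, not the real vocab-mapper settings.

```python
# Sketch: at most two concurrent Anthropic calls, with a 60s pause
# whenever the tokens/minute limit is hit. All settings are placeholders.
import asyncio

import anthropic  # pip install anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY
semaphore = asyncio.Semaphore(2)     # observed max before hitting limits


async def select_match(term: str) -> str:
    async with semaphore:
        while True:
            try:
                response = await client.messages.create(
                    model="claude-3-5-sonnet-latest",
                    max_tokens=256,
                    messages=[{"role": "user",
                               "content": f"Pick the best match for: {term}"}],
                )
                return response.content[0].text
            except anthropic.RateLimitError:
                # Tokens/minute exhausted: slow the whole pipeline down.
                await asyncio.sleep(60)


async def main():
    terms = ["fever", "headache", "cough"]
    results = await asyncio.gather(*(select_match(t) for t in terms))
    print(results)


asyncio.run(main())
```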

I tracked speed with LangSmith, but for cost optimisation it might work better with OpenAI.
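For the batch option, Anthropic's Message Batches API trades latency (up to 24 hours) for roughly half the price. A rough sketch, with placeholder prompts and IDs:

```python
# Sketch of submitting the selection step via Anthropic's Message
# Batches API. Model, prompts, and IDs are placeholders.
import anthropic  # pip install anthropic

client = anthropic.Anthropic()

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"term-{i}",
            "params": {
                "model": "claude-3-5-sonnet-latest",
                "max_tokens": 256,
                "messages": [{"role": "user",
                              "content": f"Pick the best match for: {term}"}],
            },
        }
        for i, term in enumerate(["fever", "headache", "cough"])
    ]
)

# Poll `processing_status` later, then fetch results with
# client.messages.batches.results(batch.id).
print(batch.id, batch.processing_status)
```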
