CI Integration #13

Open
tusharmath opened this issue Aug 17, 2024 · 6 comments

Comments

@tusharmath
Contributor

This is a neat library that does what it says. It would benefit immensely from some CI integration.

@jeremychone
Owner

First, thanks for the PR #14.

However, I have some questions about the net value of GitHub Actions for this library.

Context:

  • This library is single-authored, meaning a single person will merge the PRs to the repo.
  • For my personal workflow, I am not sure GitHub CI helps me anywhere. Running cargo test followed by a git merge and push is very efficient, and automating it further might lead to side effects (feel free to let me know if I missed anything).
  • I am always cautious about putting API keys on GitHub. So, if there is no clear and significant value, I tend to bypass this.

Now, where I could see value is for the contributors, so they can check that nothing breaks in their fork/PR branch.

But I do not think that GitHub Actions on the main genai repo, with its API secrets, will run on forks (I might be wrong). If I am wrong, I am even more uncomfortable having any fork/PR branch use the API keys.

I might be missing something obvious. Feel free to let me know.

@tusharmath
Contributor Author

I think it helps the developers contributing quite significantly. It gives them feedback on whether things are working or not, and it can automate releases for you. For example, we would love to see these commits released: v0.1.7...main

@jeremychone
Owner

Thanks for the response.

Yes, I can see the value for PRs, as devs could check that they haven't broken any adapters. When I do a PR to sea-query, for example, it's nice to see all tests pass.

Now, for personal productivity, this won't help me, as I never give my crates.io or other service credentials to third-party servers/services. Moreover, it's automated enough. "cargo test" ... "cargo publish" ... commit/push -wip. A few minutes once per release, which is at most weekly. So, there's no value in automating this more, only cost and risk.

Regarding PR contributions, I see the value. The issue is that I'm not super comfortable (yet) putting my AI provider keys there. Cohere, for example, has a pretty low limit on free access, and it will occasionally break the CI, making it hard for anyone to understand what's going on.

So, I think you have some good points, and I will definitely keep thinking about how I can add this. Thanks a lot for the PR; this is perfect.

Note 1: The commit about webc::Error was wrongly associated with this issue. Completely independent.

Note 2: I should have released the update for webc::Error earlier. It wasn't a process issue; I just didn't realize that when a PR is merged, it's typically nice to do a release soon after, as the PR author probably needs it. I will make sure to change my methodology there.

@InAnYan

InAnYan commented Oct 29, 2024

Regarding API keys: do you contact real APIs while testing genai?

But wouldn't it be simpler to run a mock webserver that implements the specific schema of a specific LLM adapter, and that's it? That way you wouldn't need any keys or remote connections.
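
For illustration, here is a minimal sketch of such a mock endpoint, using only the Rust standard library. The port, the OpenAI-style response body, and the lack of request parsing are all assumptions for the example, not anything from genai's actual test suite.

```rust
// Hypothetical mock "LLM endpoint": answers every request with a canned
// OpenAI-style chat completion body, so tests need no API key or network.
use std::io::{Read, Write};
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:3999")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        // Drain (part of) the request; a real mock would parse method, path, and body.
        let mut buf = [0u8; 4096];
        let _ = stream.read(&mut buf)?;
        let body = r#"{"choices":[{"message":{"role":"assistant","content":"mocked"}}]}"#;
        let response = format!(
            "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: {}\r\n\r\n{}",
            body.len(),
            body
        );
        stream.write_all(response.as_bytes())?;
    }
    Ok(())
}
```

A test would then point the adapter's base URL at this local address instead of the real provider.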

@InAnYan

InAnYan commented Oct 29, 2024

Regarding CI/CD: I understand that this is only your personal project. But just in case it grows in the future, it would be nice to have a small check for cargo test and cargo clippy. You can also check the formatting (something like the minimal workflow sketched below).
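
As a rough illustration of the kind of minimal check meant here, a hypothetical workflow (the file name, triggers, and steps are assumptions, not an existing workflow in this repo) could look like:

```yaml
# .github/workflows/ci.yml (hypothetical) -- no provider keys needed
name: ci
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo fmt --check
      - run: cargo clippy --all-targets -- -D warnings
      - run: cargo test
```

This assumes the tests that call real providers are excluded or gated (for example with `#[ignore]`), so the run stays key-free.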

@jeremychone
Owner

@InAnYan

I understand your perspective, but I differ a little, especially regarding this particular library.

From my experience, for libraries/applications, especially as they scale, it's better to favor a "real path test" rather than mocks, at least for core functions. Otherwise, we spend time writing and testing the mock logic rather than testing what matters, meaning what can/will break.

It's not an all-or-nothing approach; sometimes mocking parts of the subsystem dependencies can be very beneficial, and mocks can add real value. A good example is cloud applications or libraries that integrate with S3: using MinIO as a mock S3 is a low-cost, high-value solution for those parts. In my other project, devai, I might create a "mock-agent" that would probably be an "echo" one, to test the devai flows.

However, in this particular genai library, the core of the logic happens in the Adapter layer, and those adapters are highly provider-specific, with nuances per provider (OpenAI, Groq, and Ollama have some differences, without even talking about Anthropic and Gemini). Mocking this would defeat the value of the tests.

For example, if we were to go down the "mock provider" path, we would have these two options:

  1. Generic Mock AI Provider - This would be a generic provider, probably with a special generic adapter. This is still quite a bit of work, but now we would test only the generic adapter and not a real provider, making the test value relatively low. The net value is very low, if any.

  2. Mock per Provider (e.g., OpenAI, Anthropic, ...) - To make these mocks meaningful, this would require a huge amount of work, and then these mocks would not keep up with the real implementations anyway. The net value is negative by an order of magnitude on this one.

So, both of these options do not really add any sensible value; they just add quite a bit of cost and actually give a false sense of quality.

Now, on the unit-test side (which I consider more as "in-crate tests"), there are obviously some tests that depend on the AI provider and some that do not. Perhaps a "cargo test ..." invocation that filters out the tests making AI provider calls could be added (one possible shape is sketched below), but this won't add much value.
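
One common way to get such a filter, sketched here with hypothetical test names (this is not genai's actual test layout), is to mark the live-provider tests with `#[ignore]` so a plain `cargo test` skips them:

```rust
#[cfg(test)]
mod tests {
    #[test]
    fn parses_chat_request_offline() {
        // Pure in-crate logic: no network, runs on every `cargo test`.
        assert_eq!(2 + 2, 4);
    }

    #[test]
    #[ignore = "requires a real AI provider key"]
    fn live_openai_chat_roundtrip() {
        // Would call the real provider. Run explicitly with:
        //   cargo test -- --ignored           (only the ignored tests)
        //   cargo test -- --include-ignored   (everything)
    }
}
```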

Interestingly, for devai, I will probably add a mock model, e.g., mock-echo-model, since the core logic of devai is more about the flow than the intricacies of calling a particular AI provider (it uses this genai library for that).

So while there might be different valid approaches, this is built with scale in mind.

I hope this clarifies things a little.
