LLM: Add simple capability system #69
Conversation
Walkthrough

The recent updates enhance type hinting for better clarity and consistency in various domain objects. Additionally, a new capability management system for language models (LLMs) has been introduced. The system includes classes for handling the capabilities and requirements of LLMs, enabling smarter model selection based on capabilities.
```python
from llm.request_handler import *
from llm.capability import RequirementList
```
The explicit import of `RequirementList` is redundant given the preceding wildcard import from the same module (`from llm.capability import *`). Avoiding such redundancies keeps the imports clean and readable:

```diff
- from llm.capability import RequirementList
```
```python
class RequirementList:
    """A class to represent the requirements you want to match against"""

    input_cost: float | None
    output_cost: float | None
    gpt_version_equivalent: float | None
    speed: float | None
    context_length: int | None
    vendor: str | None
    privacy_compliance: bool | None
    self_hosted: bool | None
    image_recognition: bool | None
    json_mode: bool | None

    def __init__(
        self,
        input_cost: float | None = None,
        output_cost: float | None = None,
        gpt_version_equivalent: float | None = None,
        speed: float | None = None,
        context_length: int | None = None,
        vendor: str | None = None,
        privacy_compliance: bool | None = None,
        self_hosted: bool | None = None,
        image_recognition: bool | None = None,
        json_mode: bool | None = None,
    ) -> None:
        self.input_cost = input_cost
        self.output_cost = output_cost
        self.gpt_version_equivalent = gpt_version_equivalent
        self.speed = speed
        self.context_length = context_length
        self.vendor = vendor
        self.privacy_compliance = privacy_compliance
        self.self_hosted = self_hosted
        self.image_recognition = image_recognition
        self.json_mode = json_mode
```
The `RequirementList` class is well-structured and aligns with the PR's objective of introducing a flexible capability system for selecting large language models. Each attribute represents a specific requirement that can be matched against the available models.

However, consider adding documentation comments for each attribute to clarify its purpose and expected values, which would improve maintainability and understanding for future developers.
```python
def get_llms_sorted_by_capabilities_score(
    self, requirements: RequirementList, invert_cost: bool = False
):
    """Get the llms sorted by their capability to requirement scores"""
    scores = calculate_capability_scores(
        [llm.capabilities for llm in self.entries], requirements, invert_cost
    )
    sorted_llms = sorted(zip(scores, self.entries), key=lambda pair: -pair[0])
    return [llm for _, llm in sorted_llms]
```
The `get_llms_sorted_by_capabilities_score` method is a key addition that aligns with the PR's objectives, efficiently sorting LLMs by their capability scores and thereby improving model selection against the specified requirements.

However, consider adding error handling or a fallback mechanism for the case where no models match the specified requirements, to ensure robustness in all scenarios.
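The `sorted(zip(scores, entries), key=lambda pair: -pair[0])` pattern the method relies on can be illustrated in isolation (scores and model names below are hypothetical):

```python
# Hypothetical scores and entries to illustrate the score-sort pattern.
scores = [0.4, 1.7, 0.9]
entries = ["model-a", "model-b", "model-c"]

# Negating the score sorts highest-first, mirroring the method above.
ranked = [entry for _, entry in sorted(zip(scores, entries), key=lambda pair: -pair[0])]
print(ranked)  # → ['model-b', 'model-c', 'model-a']
```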
```python
class CapabilityRequestHandler(RequestHandler):
    """Request handler that selects the best/worst model based on the requirements"""

    requirements: RequirementList
    selection_mode: CapabilityRequestHandlerSelectionMode
    llm_manager: LlmManager

    def __init__(
        self,
        requirements: RequirementList,
        selection_mode: CapabilityRequestHandlerSelectionMode = CapabilityRequestHandlerSelectionMode.WORST,
    ) -> None:
        self.requirements = requirements
        self.selection_mode = selection_mode
        self.llm_manager = LlmManager()

    def complete(self, prompt: str, arguments: CompletionArguments) -> str:
        llm = self._select_model(CompletionModel)
        return llm.complete(prompt, arguments)

    def chat(
        self, messages: list[IrisMessage], arguments: CompletionArguments
    ) -> IrisMessage:
        llm = self._select_model(ChatModel)
        return llm.chat(messages, arguments)

    def embed(self, text: str) -> list[float]:
        llm = self._select_model(EmbeddingModel)
        return llm.embed(text)

    def _select_model(self, type_filter: type) -> LanguageModel:
        """Select the best/worst model based on the requirements and the selection mode"""
        llms = self.llm_manager.get_llms_sorted_by_capabilities_score(
            self.requirements,
            self.selection_mode == CapabilityRequestHandlerSelectionMode.WORST,
        )
        llms = [llm for llm in llms if isinstance(llm, type_filter)]

        if self.selection_mode == CapabilityRequestHandlerSelectionMode.BEST:
            llm = llms[0]
        else:
            llm = llms[-1]

        # Print the selected model for the logs
        print(f"Selected {llm.description}")
        return llm
```
The `CapabilityRequestHandler` class is a key addition that enhances model selection by leveraging the new capability system. It correctly implements selecting the best or worst model based on the specified requirements and selection mode.

However, consider adding documentation that clarifies the selection process and the effect of the selection mode on the chosen model, to improve understanding and maintainability.
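The two steps inside `_select_model` can be sketched standalone: filter the score-sorted list by model type, then take the first element for BEST mode or the last for WORST mode (the classes below are hypothetical stand-ins, not the PR's model types):

```python
# Standalone sketch of _select_model's filtering and pick logic.
class ChatModel:
    pass


class EmbeddingModel:
    pass


# Assume this list is already sorted best-first by capability score.
llms_sorted_best_first = [ChatModel(), EmbeddingModel(), ChatModel()]

# Step 1: keep only models of the requested type.
chat_llms = [llm for llm in llms_sorted_best_first if isinstance(llm, ChatModel)]

# Step 2: BEST mode takes the front, WORST mode takes the back.
best = chat_llms[0]
worst = chat_llms[-1]
print(len(chat_llms))  # → 2
```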
```python
def capabilities_fulfill_requirements(
    capability: CapabilityList, requirements: RequirementList
) -> bool:
    """Check if the capability fulfills the requirements"""
    return all(
        getattr(capability, field).matches(getattr(requirements, field))
        for field in requirements.__dict__.keys()
        if getattr(requirements, field) is not None
    )


def calculate_capability_scores(
    capabilities: list[CapabilityList],
    requirements: RequirementList,
    invert_cost: bool = False,
) -> list[int]:
    """Calculate the scores of the capabilities against the requirements"""
    all_scores = []

    for requirement in requirements.__dict__.keys():
        requirement_value = getattr(requirements, requirement)
        if (
            requirement_value is None
            and requirement not in always_considered_capabilities_with_default
        ):
            continue

        # Calculate the scores for each capability
        scores = []
        for capability in capabilities:
            if (
                requirement_value is None
                and requirement in always_considered_capabilities_with_default
            ):
                # If the requirement is not set, use the default value if necessary
                score = getattr(capability, requirement).matches(
                    always_considered_capabilities_with_default[requirement]
                )
            else:
                score = getattr(capability, requirement).matches(requirement_value)
            # Invert the cost if required
            # The cost is a special case, as depending on how you want to use the scores
            # the cost needs to be considered differently
            if (
                requirement in ["input_cost", "output_cost"]
                and invert_cost
                and score != 0
            ):
                score = 1 / score
            scores.append(score)

        # Normalize the scores between 0 and 1 and multiply by the weight modifier
        # The normalization here is based on the position of the score in the sorted list
        # to balance out the different ranges of the capabilities
        sorted_scores = sorted(set(scores))
        weight_modifier = capability_weights[requirement]
        normalized_scores = [
            ((sorted_scores.index(score) + 1) / len(sorted_scores)) * weight_modifier
            for score in scores
        ]
        all_scores.append(normalized_scores)

    final_scores = []

    # Sum up the scores for each capability to get the final score
    # for each list of capabilities
    for i in range(len(all_scores[0])):
        score = 0
        for j in range(len(all_scores)):
            score += all_scores[j][i]
        final_scores.append(score)

    return final_scores
```
The functions `capabilities_fulfill_requirements` and `calculate_capability_scores` are well-implemented, providing a robust mechanism for evaluating and scoring models against the specified requirements. This functionality is central to the new capability system.

However, consider documenting the scoring logic, especially the role of `invert_cost` in the calculation, to aid understanding and maintainability.
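The rank-based normalization step is the least obvious part of the scoring: each raw score is replaced by the rank of its value among the distinct scores, scaled into (0, 1], then multiplied by the capability's weight. A standalone sketch of just that step:

```python
# Standalone sketch of the rank-based normalization used in
# calculate_capability_scores (inputs here are hypothetical raw scores).
def normalize(scores, weight_modifier):
    sorted_scores = sorted(set(scores))
    return [
        ((sorted_scores.index(score) + 1) / len(sorted_scores)) * weight_modifier
        for score in scores
    ]


# Two distinct values → ranks 1 and 2 out of 2, scaled by the weight 2.0:
print(normalize([3, 1, 3], 2.0))  # → [2.0, 1.0, 2.0]
```

Because only the rank matters, a capability measured in tokens and one measured in dollars contribute on the same scale before weighting.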
```python
    def matches(self, text: str) -> int:
        return int(self.value == text)

    def __str__(self):
        return f"TextCapability({super().__str__()})"


class OrderedNumberCapability(BaseModel):
    """A capability that is better the higher the value"""

    value: int | float

    def matches(self, number: int | float) -> int | float:
        if self.value < number:
            return 0
        return self.value - number + 1

    def __str__(self):
        return f"OrderedNumberCapability({super().__str__()})"


class InverseOrderedNumberCapability(BaseModel):
    """A capability that is better the lower the value"""

    value: int | float

    def matches(self, number: int | float) -> int | float:
        if self.value > number:
            return 0
        return number - self.value + 1

    def __str__(self):
        return f"InverseOrderedNumberCapability({super().__str__()})"


class BooleanCapability(BaseModel):
    """A simple boolean capability"""

    value: bool

    def matches(self, boolean: bool) -> int:
        return int(self.value == boolean)

    def __str__(self):
        return f"BooleanCapability({str(self.value)})"


class CapabilityList(BaseModel):
    """A list of capabilities for a model"""

    input_cost: InverseOrderedNumberCapability = Field(
        default=InverseOrderedNumberCapability(value=0)
    )
    output_cost: InverseOrderedNumberCapability = Field(
        default=InverseOrderedNumberCapability(value=0)
    )
    gpt_version_equivalent: OrderedNumberCapability = Field(
        default=OrderedNumberCapability(value=2)
    )
    speed: OrderedNumberCapability = Field(default=OrderedNumberCapability(value=0))
    context_length: OrderedNumberCapability = Field(
        default=OrderedNumberCapability(value=0)
    )
    vendor: TextCapability = Field(default=TextCapability(value=""))
    privacy_compliance: BooleanCapability = Field(
        default=BooleanCapability(value=False)
    )
    self_hosted: BooleanCapability = Field(default=BooleanCapability(value=False))
    image_recognition: BooleanCapability = Field(default=BooleanCapability(value=False))
    json_mode: BooleanCapability = Field(default=BooleanCapability(value=False))

    @model_validator(mode="before")
    @classmethod
    def from_dict(cls, data: dict[str, any]):
        """Prepare the data for handling by Pydantic"""
        for key, value in data.items():
            if type(value) is not dict:
                data[key] = {"value": value}
        return data


# The weights for the capabilities used in the scoring
capability_weights = {
    "input_cost": 0.5,
    "output_cost": 0.5,
    "gpt_version_equivalent": 4,
    "speed": 2,
    "context_length": 0.1,
    "vendor": 1,
    "privacy_compliance": 0,
    "self_hosted": 0,
    "image_recognition": 0,
    "json_mode": 0,
}

# The default values for the capabilities that are always considered
always_considered_capabilities_with_default = {
    "input_cost": 100000000000000,
    "output_cost": 100000000000000,
}
```
The `CapabilityList` class and the individual capability classes (`TextCapability`, `OrderedNumberCapability`, etc.) are well-structured and crucial to the new capability system, allowing detailed specification and evaluation of model capabilities.

However, consider adding documentation comments for each capability class to clarify its purpose and expected use, enhancing maintainability and understanding for future developers.
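The three `matches` behaviours can be summarized without the Pydantic wrappers: ordered capabilities reward headroom above the requirement, inverse-ordered capabilities reward headroom below it (e.g. cost), and boolean capabilities are a strict equality check. A plain-function sketch:

```python
# Plain-Python sketch of the matching behaviours above (illustrative only).
def ordered_matches(value, required):
    # Better the higher: 0 if below the requirement, else headroom + 1.
    return 0 if value < required else value - required + 1


def inverse_ordered_matches(value, required):
    # Better the lower (e.g. cost): 0 if above the requirement.
    return 0 if value > required else required - value + 1


def boolean_matches(value, required):
    return int(value == required)


print(ordered_matches(4, 3))          # → 2
print(ordered_matches(2, 3))          # → 0
print(inverse_ordered_matches(1, 3))  # → 3
print(boolean_matches(True, True))    # → 1
```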
Motivation

For our first attempt at selecting LLMs, we want to try a system based on capabilities. This system should allow the pipeline subsystem to specify what it needs from an LLM, and the LLM subsystem will select the best-fitting model that matches these requirements.

Description

For this I added several capabilities in the first version. The most notable ones are:

- `gpt_version_equivalent`: A first simple measure of skill
- `cost`: To set the cost of a model. TODO: Specify the unit used
- `speed`: The generation speed of the model. Higher = faster
- `context_length`: The maximum number of tokens the model can handle

To match against these capabilities there is a corresponding `RequirementList` class. The pipelines can specify their requirements, and the new `CapabilityRequestHandler` will select the best-fitting model.

The `CapabilityRequestHandler` will not select models that do not fulfill the requirements. It can either select the best possible model or the worst model that still passes the requirements check. This is useful, as often you simply want the model that barely fulfills the requirements. Selecting the best should only be done if you specify a cost limit, as it will otherwise always choose e.g. GPT-4 32k (or Turbo 128k).

Only the `RequirementList` and `CapabilityRequestHandler` classes are considered part of the public interface of the LLM subsystem. These can be considered stable for the near future. The rest of the implementation will likely change in the short term and is subject to improvements and refactorings.
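The WORST-mode idea described above can be sketched end to end: among the models that fulfill the requirements, pick the weakest one. The model data, names, and the simplified `fulfills` check below are hypothetical, not from the PR:

```python
# Hedged sketch of WORST-mode selection: filter out models that fail the
# requirements, sort the survivors best-first, then take the last one.
models = [
    {"name": "small", "gpt_version_equivalent": 3.5, "context_length": 16000},
    {"name": "large", "gpt_version_equivalent": 4.0, "context_length": 32000},
    {"name": "tiny", "gpt_version_equivalent": 3.0, "context_length": 4000},
]
requirements = {"gpt_version_equivalent": 3.5, "context_length": 8000}


def fulfills(model, reqs):
    # Ordered capabilities: the model must meet or exceed each requirement.
    return all(model[field] >= needed for field, needed in reqs.items())


candidates = sorted(
    (m for m in models if fulfills(m, requirements)),
    key=lambda m: -m["gpt_version_equivalent"],
)
worst_passing = candidates[-1]
print(worst_passing["name"])  # → small
```

This mirrors the rationale above: "tiny" is excluded for failing the requirements, and of the two passing models the cheaper-in-spirit "small" wins because it barely fulfills them.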