Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROPOSAL] Pydantic Kitfile class #40

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

Burhan-Q
Copy link
Contributor

@Burhan-Q Burhan-Q commented Mar 9, 2025

Summary

Instead of using custom validators, proposed changes incorporates Pydantic BaseModel and validation. This PR is opened as a proposal, as it's understood there could be reasons beyond my awareness as to why not to make such a change. The aim of these changes is to reduce the amount of code needed for validation, while maintain robust validation and rule checking by implementing the Pydantic library.

What changes

  • Includes new objects for components of Kitfile which can be initialized in code directly or by using dict
  • Adds build() method for Kitfile definition via dict or Pydantic Models
  • Uses Pydantic model validation logic to ensure compliance

Advantages

  • Provides class objects for all components of Kitfile with validation included.
  • Inherit validations from Pydantic BaseModel class.
  • Includes custom validation logic (requirements for Kitfile, minimum author count, relative paths).
  • Modularized Kitfile components, possible to use for independent validation.
  • Will attempt to coerce numeric values provide for manifestVersion into strings.
  • Fully defined JSON schema from Kitfile.model_json_schema()
Data Model Schema

{"$defs": {"CodeEntry": {"description": "Single entry with information about the source code.\nArgs:\n    path (str): Location of the source code file or directory relative to the context.\n    description (str): Description of what the code does.\n    license (str): SPDX license identifier for the code.",
                         "properties": {"description": {"description": "Description of what the code does.",
                                                        "title": "Description",
                                                        "type": "string"},
                                        "license": {"description": "SPDX license identifier for the code.",
                                                    "title": "License",
                                                    "type": "string"},
                                        "path": {"description": "Location of the source code file or directory relative to the context.",
                                                 "title": "Path",
                                                 "type": "string"}},
                         "required": ["path", "description", "license"],
                         "title": "CodeEntry",
                         "type": "object"},
           "DatasetEntry": {"description": "Single entry with information about the datasets used.\nArgs:\n    name (str): Name of the dataset.\n    path (str): Location of the dataset file or directory relative to the context.\n    description (str): Overview of the dataset.\n    license (str): SPDX license identifier for the dataset.",
                            "properties": {"description": {"description": "Overview of the dataset.",
                                                           "title": "Description",
                                                           "type": "string"},
                                           "license": {"description": "SPDX license identifier for the dataset.",
                                                       "title": "License",
                                                       "type": "string"},
                                           "name": {"description": " Name of the dataset.",
                                                    "title": "Name",
                                                    "type": "string"},
                                           "path": {"description": "Location of the dataset file or directory relative to the context.",
                                                    "title": "Path",
                                                    "type": "string"}},
                            "required": ["path",
                                         "name",
                                         "description",
                                         "license"],
                            "title": "DatasetEntry",
                            "type": "object"},
           "DocsEntry": {"description": "Single entry with information about included documentation for the model.\nArgs:\n    description (str): Description of the documentation.\n    path (str): Location of the documentation relative to the context.",
                         "properties": {"description": {"description": "Description of the documentation.",
                                                        "title": "Description",
                                                        "type": "string"},
                                        "path": {"description": "Location of the documentation relative to the context.",
                                                 "title": "Path",
                                                 "type": "string"}},
                         "required": ["path", "description"],
                         "title": "DocsEntry",
                         "type": "object"},
           "ModelPart": {"description": "One entry of the related files for the model, e.g. model weights.\nArgs:\n    name (str): Identifier for the part.\n    path (str): Location of the file or a directory relative to the context.\n    type (str): The type of the part (e.g. LoRA weights).",
                         "properties": {"name": {"description": "Identifier for the part.",
                                                 "title": "Name",
                                                 "type": "string"},
                                        "path": {"description": "Location of the file or a directory relative to the context.",
                                                 "title": "Path",
                                                 "type": "string"},
                                        "type": {"description": "The type of the part (e.g. LoRA weights).",
                                                 "title": "Type",
                                                 "type": "string"}},
                         "required": ["path", "name", "type"],
                         "title": "ModelPart",
                         "type": "object"},
           "ModelSection": {"description": "Details of the trained models included in the package.\nArgs:\n    name (str): Name of the model.\n    path (str): Location of the model file or directory relative to the context.\n    framework (str): AI/ML framework.\n    version (str): Version of the model.\n    description (str): Overview of the model.\n    license (str): SPDX license identifier for the model.\n    parts (list[ModelPart]): List of related files for the model (e.g. LoRA weights).\n    parameters (Any): An arbitrary section of YAML that can be used to store any additional data that may be \n        relevant to the current model.",
                            "properties": {"description": {"description": "Overview of the model.",
                                                           "title": "Description",
                                                           "type": "string"},
                                           "framework": {"description": "AI/ML framework.",
                                                         "examples": ["tensorflow",
                                                                      "pytorch",
                                                                      "onnx",
                                                                      "TensorRT"],
                                                         "title": "Framework",
                                                         "type": "string"},
                                           "license": {"description": "SPDX license identifier for the model.",
                                                       "title": "License",
                                                       "type": "string"},
                                           "name": {"description": "Name of the model.",
                                                    "title": "Name",
                                                    "type": "string"},
                                           "parameters": {"anyOf": [{},
                                                                    {"type": "null"}],
                                                          "default": null,
                                                          "description": "An arbitrary section of YAML that can be used to store any additional data that may be relevant to the current model, with a few caveats. Only a json-compatible subset of YAML is supported. Strings will be serialized without flow parameters. Numbers will be converted to decimal representations (0xFF -> 255, 1.2e+3 -> 1200). Maps will be sorted alphabetically by key.",
                                                          "title": "Parameters"},
                                           "parts": {"anyOf": [{"items": {"$ref": "#/$defs/ModelPart"},
                                                                "type": "array"},
                                                               {"type": "null"}],
                                                     "description": "List of related files for the model (e.g. LoRA weights).",
                                                     "title": "Parts"},
                                           "path": {"description": "Location of the model file or directory relative to the context.",
                                                    "title": "Path",
                                                    "type": "string"},
                                           "version": {"description": "Version of the model.",
                                                       "examples": ["0.0a13",
                                                                    "1.8.0"],
                                                       "title": "Version",
                                                       "type": "string"}},
                            "required": ["path",
                                         "name",
                                         "framework",
                                         "version",
                                         "description",
                                         "license"],
                            "title": "ModelSection",
                            "type": "object"},
           "Package": {"description": "This section provides general information about the AI/ML project.\nArgs:\n    name (str): The name of the AI/ML project.\n    version (str): The current version of the project.\n    description (str): A brief overview of the project's purpose and capabilities.\n    authors (list[str]): A list of individuals or entities that have contributed to the project.",
                       "properties": {"authors": {"description": "A list of individuals or entities that have contributed to the project.",
                                                  "items": {"type": "string"},
                                                  "minItems": 1,
                                                  "title": "Authors",
                                                  "type": "array"},
                                      "description": {"description": "A brief overview of the project's purpose and capabilities.",
                                                      "title": "Description",
                                                      "type": "string"},
                                      "name": {"description": "The name of the AI/ML project.",
                                               "title": "Name",
                                               "type": "string"},
                                      "version": {"description": "The current version of the project.",
                                                  "examples": ["1.2.3",
                                                               "0.13a"],
                                                  "title": "Version",
                                                  "type": "string"}},
                       "required": ["name",
                                    "version",
                                    "description",
                                    "authors"],
                       "title": "Package",
                       "type": "object"}},
 "description": "Kitfile class using Pydantic for validation.",
 "properties": {"code": {"anyOf": [{"items": {"$ref": "#/$defs/CodeEntry"},"type": "array"}, {"type": "null"}],
                         "description": "Information about the source code.",
                         "title": "Code"},
                "datasets": {"anyOf": [{"items": {"$ref": "#/$defs/DatasetEntry"}, "type": "array"}, {"type": "null"}],
                             "description": "Information about the datasets used.",
                             "title": "Datasets"},
                "docs": {"anyOf": [{"items": {"$ref": "#/$defs/DocsEntry"}, "type": "array"}, {"type": "null"}],
                         "description": "Information about included documentation for the model.",
                         "title": "Docs"},
                "manifestVersion": {"description": "Specifies the manifest format version.",
                                    "examples": ["1.0.0", "0.13a"],
                                    "title": "Manifestversion",
                                    "type": "string"},
                "model": {"anyOf": [{"$ref": "#/$defs/ModelSection"}, {"type": "null"}],
                          "default": null,
                          "description": "Details of the trained models included in the package."},
                "package": {"$ref": "#/$defs/Package",
                            "description": "This section provides general information about the AI/ML project."}},
 "required": ["manifestVersion", "package"],
 "title": "Kitfile",
 "type": "object"}

Updated Kitfile Usage

From file

from kitops.modelkit import Kitfile

kitfile = Kitfile("my_kitfile")

kitfile
>>> Kitfile(  # NOTE: shown here with wrapping for easier parsing
    manifestVersion='0.1.1',
    package=Package(
        name='my_package',
        version='0.1.0',
        description='My package description',
        authors=['Author 1', 'Author 2']),
    code=[
        CodeEntry(
            path='code/',
            description='Code description',
            license='Apache-2.0'
        )
    ],
    datasets=[
        DatasetEntry(
            path='datasets/',
            name='my_dataset',
            description='Dataset description',
            license='Apache-2.0'
        )
    ],
    docs=[
        DocsEntry(
            path='docs/',
            description='Docs description'
        )
    ],
    model=ModelSection(
        path='model/',
        name='my_model',
        framework='tensorflow',
        version='2.0.0',
        description='Model description', license='Apache-2.0', parts=[], parameters=''
    )
)


From dictionary

from kitops.modelkit import Kitfile

cfg = {
    'code': [
        {
            'description': 'Code description',
            'license': 'Apache-2.0',
            'path': 'code/'
        }
],
    'datasets': [
        {
            'description': 'Dataset description',
            'license': 'Apache-2.0',
            'name': 'my_dataset',
            'path': 'datasets/'
        }
    ],
    'docs': [
        {
            'description': 'Docs description',
            'path': 'docs/'
        }
    ],
    'manifestVersion': '1.0',
    'model': {
        'description': 'Model description',
        'framework': 'tensorflow',
        'license': 'Apache-2.0',
        'name': 'my_model',
        'parameters': '',
        'parts': [],
        'path': 'model/',
        'version': '2.0.0'
    },
    'package': {
        'authors': ['Author 1', 'Author 2'],
        'description': 'My package description',
        'name': 'my_package',
        'version': '0.1.0'
    }
}

kitfile = Kitfile()

kitfile.build(**cfg)  # build from entire dict

kitfile.build(  # build from parts
    manifestVersion=cfg["manifestVersion"],
    package=cfg["package"],
    code=cfg["code"],
    datasets=cfg["datasets"],
    docs=cfg["docs"],
    model=cfg["model"],
)


From Objects

from kitops.modelkit import Kitfile
from kitops.modelkit.pydantic_kit import (
    CodeEntry,
    DatasetEntry,
    DocsEntry,
    ModelSection,
    Package,
)

code = [CodeEntry(description='Code description', license='Apache-2.0', path='code/')]
data = [DatasetEntry(description='Dataset description', license='Apache-2.0', name='my_dataset', path='datasets/')]
docs = [DocsEntry(description='Docs description', path='docs/')]
pack = Package(authors=['Author 1', 'Author 2'], description='My package description', name='my_package', version='0.1.0')
model = ModelSection(
    description='Model description',
    framework='tensorflow',
    license='Apache-2.0',
    name='my_model',
    parameters='',
    parts=[],
    path='model/',
    version='2.0.0'
)

kitfile = Kitfile()
kitfile.build(manifestVersion='1.0', package=pack, code=code, datasets=data, docs=docs, model=model)


Additional Notes

  • Other validation rules can be enforced, such as ensuring manifestVersion follows a specific format (if needed).
  • Can include Literal strings or Enum for license to ensure valid licenses are used.
  • Additional Pydantic Types can be used for specific data validation, including semver from pydantic-extra-types.

@gorkem gorkem requested a review from jfuss March 10, 2025 18:44
@jfuss
Copy link
Contributor

jfuss commented Mar 10, 2025

@Burhan-Q Thanks for the PR. I had a similar thought not to long ago on moving towards Pydantic. I don't have any strong reservations here. I will have a look at this more in depth sometime in the coming days.

My only callout is for the future, it may make sense to draft a quick issue over a PR for changes like these. I would hate for you or others in the community to invest significant time in a feature that not be merged.

@Burhan-Q
Copy link
Contributor Author

@jfuss I agree, an issue would have probably been the right way to go before starting work. I would love to see this work merged, but even if it weren't, I find (personal) value in doing it regardless, which is why I just started with a PR.

@amisevsk amisevsk requested a review from nida-hasan March 11, 2025 15:34
@amisevsk
Copy link
Contributor

I think this would be a useful addition, even for general "I want to parse a Kitfile in my python code" cases (i.e. outside of doing something with KitOps). I'm mostly on the Go side though, so I'll leave it for Nida and Jacob to review for appoval.

One potential concern is maintaining sync between the classes here and the original definitions in KitOps, but this is becoming less of a concern as we've mostly stabilized the Kitfile format (the intention is that manifestVersion is used for this, but I'll admit to playing a little fast-and-loose with it so far).

@jfuss
Copy link
Contributor

jfuss commented Mar 13, 2025

One potential concern is maintaining sync between the classes here and the original definitions in KitOps, but this is becoming less of a concern as we've mostly stabilized the Kitfile format (the intention is that manifestVersion is used for this, but I'll admit to playing a little fast-and-loose with it so far).

@amisevsk
I don't see this as a major concern. We own both parts (KitOps and PyKitOps), we should be able to do changes in sync. If this is still a concern, we could look into generating objects from the source. We do need to make sure new properties/capabilities can happen without breaking previous versions. Pydantic allows this through model_config = ConfigDict(extra='allow'). You can then access these through model.model_extra or model.model_dump() to dump the model back to json. This should allow us to add new properties without having customers force to upgrade PyKitOps (though I don't think making customers upgrade is a bad thing).

Is there anything else that gives you pause from the Kitfile POV?

Copy link
Contributor

@jfuss jfuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Models, paths are the only required fields. Everything else needs to be marked as Optional.

raise FileNotFoundError(f"Path '{self.path}' not found.")
if Path(self.path).is_absolute():
try:
self.path = Path(self.path).relative_to(Path.cwd()).as_posix()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should be doing this relative_to the cwd. I could be running a script from a directory that is not the context for the commands. Could be missing something though.

My off the cuff thought here is, we should leave that validation to the kitops cli we shell out too for pack and such.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a script is called from another directory, Path.cwd() will return the directory where the shell is calling the script, not where the script is located. This validation attempt to update to a relative path if given an absolute path instead.

If all absolute paths are provided, then relative paths to the current working directory will be used, if it's a mixture of absolute and relative paths, with the relative paths being relative to another directory, then that would be an issue.

I presumed that a user would be most likely to provide either all relative or all absolute paths, not a mixture. That said, I would guess that at a higher-level there could be a check to verify all paths provided are relative to the same source path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation attempt to update to a relative path if given an absolute path instead.

We cannot do this as a library. The paths need to be relative to the context of the command not where a python script is running. From the docs on kit pack:

"Unless a different location is specified, this command looks for the kitfile at the root of the provided context directory. Any relative paths defined within the kitfile are interpreted as being relative to this context directory."

We cannot be assume and mutate paths within a kitfile. We can validate they are files/dir, they exist and go as far as validating that can have a common context but we cannot assume what the context is. We shouldn't assume this is just as a script either. These are all assumptions that limit the use of a library.

My recommendation is to just is to just treat paths as strs and let the kit cli validation further, since the context is passed into the commands. I know I pushed you to move toward FilePath from str, so sorry about the churn. But I believe this validation hinders the library more than it helps right now.

docs (Optional[list[DocsEntry | dict]], optional): Included documentation for the model. Defaults to None.
model (Optional[ModelSection | dict], optional): Details of the models included. Defaults to None.
"""
super().__init__(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we calling __init__ from a build method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't call super().__init__() from the subclass init method. It can only be called after the file is loaded. If constructing the Kitfile object manually, the parent class init method needs to be called. The name build was arbitrary and can be changed to anything that fits better.

The aim was to maintain the inputs as-is for the Kitfile class, but also provide the flexibility to construct the class after init using this method instead of property assignments. If using property assignments is critical to retain, then there might be a lot more effort to make that work without losing the data model assurance Pydantic provides.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't call super().init() from the subclass init method.

This doesn't make sense to me. You can and typically is good practice too. Do you have docs on this you can link to? Could be missing something. Maybe some constraint from pydantic?

Why this generally doesn't make sense to me

To do this, you have to instantiate the class through Kitfile(**kwargs). So by the time you are calling build, you are calling an instance of the class but build calls super().__init__() again. Just feels really weird and I don't see the benefit in doing this over

kit = Kitfile(**kwargs)
...
kit.package = some_new_or_mutated_package
...

Can you give me a concrete example of usage? That might help me understand what you are trying to target for a UX point of view.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing code for init:

class Kitfile:
"""
Kitfile class to manage KitOps ModelKits and Kitfiles.
Attributes:
path (str): Path to the Kitfile.
"""
def __init__(self, path: str | None = None) -> None:

does not accept **kwargs and instead relies on using @property.setter methods to populate the attributes. I did not change the design of the init method, only mirrored what existed. Adding **kwargs (or even better, the accepted key word arguments) would certainly remove some of the need for this, but since the init method currently only accepts the path key word, when using a Pydantic model with any required fields, means that calling super().__init__() should be deferred.

The current design doesn't follow any established patterns, or at least none I'm aware of, it was put together this way to ensure that no errors were thrown when trying to initialize the KitFile class and to mirror the existing design as much as possible. To be honest, it was a choice to make it work sufficiently well so that I felt comfortable opening the PR (since it's the best way to get feedback). I'll admit, it's an odd design but from my standpoint it was better to avoid changing as many interfaces as possible so that most of the changes were "invisible" to anyone else.

That said, it can likely be changed. Given the updates around what fields are optional, it's likely that there is room to change the behavior here. When I have some time soon, I'll reevaluate this design and review the required fields, as I have some ideas on how to improve the design overall.

raise ValueError(
f"Kitfile must be a dictionary with allowed keys: {', '.join(self._kitfile_allowed_keys)}"
) from e
if any(ALLOWED_KEYS.difference(data.keys())):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cause issues if we add a new field right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me ponder this. I am not sure we care if we are making the classes now.

from pydantic import BaseModel, Field, model_validator


class BasePathModel(BaseModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the models I think we need to add extras=allowed. So we have some forward compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a maintainer of the repo, so it's not my call to make, but given the steps taken for validation and existing logic to ensure only permitted keys were in use, it seems to me it would be preferable to add anything new explicitly. If that's not done, then documentation and tests might be lacking, but it also might not be clear to users what is or isn't valid.

I have no issue adding it, but from the outside it doesn't seem like it aligns with the design.

@Burhan-Q
Copy link
Contributor Author

@jfuss thanks for the feedback. I've incorporated most of the changes and marked those comments as resolved. I've replied to others to get your feedback. I did have one additional question that will help me understand better what to do next here regarding:

For the Models, paths are the only required fields. Everything else needs to be marked as Optional.

When I looked thru the docs https://kitops.org/docs/kitfile/format/ it wasn't clear which fields were required or not. So I want to make it clear for myself, it's not necessary for CodeEntry, DatasetEntry, DocsEntry, ModelPart, or ModelSection to have any other fields populated that path? Would it mean that all fields for Package, aside from authors (maybe this was required due to the prior validation method or maybe I'm misremembering it as a hard requirement), all other fields are optional as well? The description and license fields seem like they could be optional, but name seemed like it would make sense to keep for most.

The ModelSection class does seem to me like it might be optimal to require more than just path to initialize, but that's just my inkling. I would think name, framework, and version would be required, and without any other constraints they could be set to empty strings.

@jfuss
Copy link
Contributor

jfuss commented Mar 20, 2025

@Burhan-Q

@amisevsk Can keep me honest but the only required fields are any paths and manifestVersion. We should probably be more specific in the documentation.

You can also look at the Go code for the kitfile: https://github.com/jozu-ai/kitops/blob/main/pkg/artifact/kitfile.go. Anything with omitempty I think should be optional. But would wait for @amisevsk to confirm my understanding.

@Burhan-Q
Copy link
Contributor Author

Just asked GPT real quick, looks like the required attributes are KitFile.ManifestVersionand Docs.Path.

GPT response

The required attributes for the objects defined in pkg/artifact/kitfile.go are determined by the absence of the omitempty tag in the corresponding struct fields. Here are the required attributes for each object:

KitFile

  • ManifestVersion - json:"manifestVersion" yaml:"manifestVersion"

Package

  • All attributes are optional (Name, Version, Description, License, Authors)

Docs

  • Path - json:"path" yaml:"path"

Model

  • All attributes are optional (Name, Path, License, Framework, Format, Version, Description, Parts, Parameters)

ModelPart

  • All attributes are optional (Name, Path, License, Type)

Code

  • All attributes are optional (Path, Description, License)

DataSet

  • All attributes are optional (Name, Path, Description, License, Parameters)

LayerInfo

  • All attributes are optional (Digest, DiffId)

In summary, the only required attributes are:

  • KitFile.ManifestVersion
  • Docs.Path

All other attributes are optional.

@Burhan-Q
Copy link
Contributor Author

I'll hold off until I hear from a trusted source (more than GPT). Might not get to it for another day or two anyhow. The feedback is appreciated

@gorkem
Copy link
Contributor

gorkem commented Mar 20, 2025

You may have found a bug :)

The following is from kitfile overview.

The only mandatory parts of the Kitfile are:
manifestVersion
At least one of code, model, docs or datasets sections

@amisevsk
Copy link
Contributor

Sorry for the delay here! I'm realizing the Go file is not a great source for this information

Required fields are:

  • .manifestVersion: required; Should normally be v1.0.0, though we're currently not enforcing this (and potentially can't enforce it going forward now that ModelKits are in the wild -- I dropped the ball on this one)
  • If a .model is present, .model.path is required, but .model itself is optional.
  • On any subtype in a list (modelpart, docs, code, dataset) .path is required

The Kitfile Go definition has mostly everything as omitempty, which is maybe a mistake; I think additional validation captures this requirement but is harder to notice in the file (this is largely me working around Go serialization weirdness).

Additionally, by "required", we mean "present and not the empty string", since Go interprets empty strings as "not present".

Worth also noting, the LayerInfo fields (.digest and .diffId) are only present in the JSON-serialization of the Kitfile when we use it as a ModelKit manifest's config. These fields will be ignored when reading from/to a Kitfile.

Everything else is optional, and technically the file

manifestVersion: v1.0.0

is a valid Kitfile, though it won't do anything for you :)

Another way to think about it: within the Kitfile, all we're doing is applying semantic categories to local files/directories on disk. To do this, we require a path (else how would we refer to a local file/directory), and don't technically require anything else (though .name, when present, is nice to have)

class BasePathModel(BaseModel):
"""Base class for validating paths."""

path: FilePath
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some paths can be directories. Should this encapsulate that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added DirectoryPath

"""
Serialize the Kitfile to YAML format.

Args:
suppress_empty_values (bool, optional): Whether to suppress
no_empty_values (bool, optional): Whether to suppress
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break anyone doing passing suppress_empty_values to to_yaml(). Since this is just renaming, I would suggest to revert and go back to suppress_empty_values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted

"To print Kitfile use to_yaml() and your favorite way to log/print; date=2025-02-21",
DeprecationWarning,
stacklevel=2,
)
print("\n\nKitfile Contents...")
print("===================\n")
output = self.to_yaml()
output: str = self.to_yaml()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If mypy has a hard time with it, let's update it. Otherwise, let's let mypy interpret the type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone!

@Burhan-Q
Copy link
Contributor Author

Updated

  • Defaults to use manifestVersion='1.0.0' if nothing else provided.
  • The path argument is required for all classes that inherit from BasePathModel
  • Will not attempt to make path relative to working directory, instead issues a warn.warning
  • Kitfile.__init__ now accepts kwargs and will call super().__init__ method and the .build() has been removed
  • BasePathModel uses FilePath and DirectoryPath type annotations (also accepts strings)
from kitops.modelkit.kitfile import Kitfile

kf = Kitfile()
kf
>>> Kitfile(
    manifestVersion='1.0.0',
    package=Package(
        name='',
        version='',
        description='',
        authors=[]
    ), 
    code=[], 
    datasets=[], 
    docs=[], 
    model=None
)  # NOTE: wrapped for readability

pkg = Package(name="Burhan", version="0.6.9")
ce = CodeEntry(path="kitops/modelkit/pydantic_kit.py")

>>> Kitfile(
    manifestVersion='1.0.0', 
    package=Package(
        name='Burhan', 
        version='0.6.9', 
        description='',
        authors=[]
    ), 
    code=[
        CodeEntry(
            path='kitops/modelkit/pydantic_kit.py',
            description='', 
            license=''
        )
    ], 
    datasets=[], 
    docs=[], 
    model=None
)  # NOTE: wrapped for readability

@Burhan-Q Burhan-Q requested a review from jfuss March 25, 2025 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants