Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set local build version #440

Merged
merged 3 commits into from
Oct 2, 2023
Merged

Conversation

wenwei-dev
Copy link
Contributor

Does this meet the requirements for issue #434?

@luketpeterson
Copy link
Contributor

luketpeterson commented Sep 16, 2023

The localbuild version is great. For me, I got 0.1.6-localbuild which is in-spec for semver.

But the pip-installed version I get is 0.1.6.dev281+g8ce9d72.d20230916 Which is unfortunately out of semver spec.

I started digging into how this version gets generated, and apparently python versioning uses a different and mutually incompatible spec from the rest of the world, called PEP 440.

Luckily 0.1.6-localbuild is also PEP 440 compliant.

I see that there is an extension for setuptools-scm to always generate semver-compatible versions.
https://pypi.org/project/setuptools-scm-git-semver/ But given we're talking about a Python version, I think I need to be able to ingest any compliant PEP 440 version string.

So bottom line, I think the version format is fine. But I will need to change version number parsers.

Side note: It appears there is a local setuptools-scm scheme that might be callable from __init__.py, rather than needing to hard-code a string. Of course that pre-supposes the module is part of a git repo, so maybe best to leave it hard-coded. I don't have a strong opinion. It's the devil of remembering to keep the version up to date vs. the devil not having a source version without the git metadata.

My weak opinion is that hard-coding is better, and we already have hard-coded versions in the Cargo.toml files that we need to bump at release time.

@luketpeterson
Copy link
Contributor

@vsbogd , what do you think about updating a hard-coded version in __init__.py each time we release? My thinking is that it could be part of the same checklist as bumping the Cargo.toml versions, but since you're likely the one who will usually do it, your opinion carries the most weight, IMO.

@luketpeterson
Copy link
Contributor

UPDATE: While 0.1.6-localbuild should be an in-spec PEP 440 version (See https://peps.python.org/pep-0440/#development-release-separators), some parsers that are common in the wild (like https://cohost.org/konstin/post/514863-reimplementing-pep-4) don't seem to normalize versions as-per the spec. :-/

So, instead, I propose 0.1.6+localbuild to eliminate all doubt.

python/hyperon/__init__.py Outdated Show resolved Hide resolved
@vsbogd
Copy link
Collaborator

vsbogd commented Sep 18, 2023

I don't object about incrementing the version manually after release. We do it for the Rust package anyway. I am bothered that this PR introduces second version which should be incremented manually, at the same time first version is still tracked by setuptools_scm. Thus we don't have single source of truth and one of the versions anyway requires manual increment.

In general if we are using importlib.metadata approach to get version from a package, then version doesn't exist until we build the package. Thus the absence of the version can be a flag which means that python module is not used from source. @Necr0x0Der suggests do not check the version in such case. I agree that it makes sense. If we choose this path I would return None instead of version or something without version at all, for example dev.

Another option is to set hyperon.__version__ manually and don't use importlib.metadata at all. I made a quick test and see that unfortunately setuptools_scm doesn't get it into account, but may be we can tune it to take it into account. Then we have two versions hyperon.__version__ returns the future release version of the code, distribution package will have version like 0.1.6.dev289+gfcba3fb.d20230918 as a version of a package which is generated automatically. For the released package these two versions are identical.

@wenwei-dev
Copy link
Contributor Author

In the current approach, there will be two versions: one is the version of the Python wheel, and the other is the version of the source code. These versions may deviate if not well maintained. And what makes it more confusing is the git tag.
I am wondering the possibility of __init__.py returning a trivial value such as None or an empty string "" for the source code version. What issues this may cause?

@wenwei-dev
Copy link
Contributor Author

I also wanted to mention that running python -m setuptools_scm can print a version that is consistent with the wheel version.

@luketpeterson
Copy link
Contributor

luketpeterson commented Sep 21, 2023

What issues this may cause?

We need unpublished, "in-development" versions of the package to have valid versions. Conceptually, the source code is the source of all functionality, so the most major part of the version number belongs with the source code, somehow. The build and release components are more minor components.

So the question is how to implement that?

I believe setuptools_scm constructs its version from the git metadata using some rather complex logic dependent on some assumptions about version numbers embedded in git tags. Technically the .git metadata is part of the source code. It's just not a very transparent part.

So it seems to me that we have two options:
1.) Decide that we are happy to mandate that the source version comes the .git metadata, and call setuptools_scm to generate the local version when pip doesn't know about our package.
2.) Decide that we want the published version to come from an easy-to-read part of the source code, and we will let setuptools_scm fill in the minor version components only.

I don't have an opinion between these two. They both provide a single source of truth for the version. (the .git metadata in 1. and whatever file we choose in 2., probably setup.py or pyproject.toml)

Reference: https://github.com/pypa/setuptools_scm/tree/main/docs

@wenwei-dev
Copy link
Contributor Author

wenwei-dev commented Sep 21, 2023

Thanks @luketpeterson. I peronally prefer option 1, using the .git metadata as the source of truth for versioning. The reasons are

  1. It guarantees uniqueness. Each commit that changes the source code also changes the version.
  2. It is immutable unless it is forcefully updated, which makes it difficult to tamper with.
  3. It is shared with all the sub-projects (C, Rust, Python).
  4. Mimimal maintance work once it is automated.
    I'm fine with option 2 too.

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 21, 2023

I don't have an opinion between these two. They both provide a single source of truth for the version. (the .git metadata in 1. and whatever file we choose in 2., probably setup.py or pyproject.toml)

@luketpeterson , I would like to spot that setup.py and pyproject.toml in Python are used to describe a distribution package. In this sense they are build and release components rather then a part of a source code. setup.py/pyproject.toml doesn't work as a versions source if user doesn't install a package. And #434 is about the case when user doesn't install the package.

.git is not a part of a source distribution package and thus it is also not quite a part of the source code. But I believe if a source code package is generated using pyproject.toml then version will be calculated and put into a package metadata.

We need unpublished, "in-development" versions of the package to have valid versions.

This PR is to resolve #434 and I think we need to focus on resolving it. @luketpeterson , @wenwei-dev do you see a way to resolve it using one of the options from #440 (comment) ? I don't see it if we don't put a tricky logic which calculates version by content of the .git or pyproject.toml into __init__.py.

@luketpeterson
Copy link
Contributor

luketpeterson commented Sep 21, 2023

setup.py and pyproject.toml in Python are used to describe a distribution package. In this sense they are build and release components rather then a part of a source code.

What I meant was that they are checked into the source repository and maintained along-side the source code. So they are source in the same way Cargo.toml is source. The distinction I'm making is that they are not build artifacts that are generated on-the-fly during the build process.

I don't see it if we don't put a tricky logic which calculates version by content of the .git or pyproject.toml into init.py

I don't think the logic needs to be tricky at all. I think it's just a matter of invoking setuptools_scm to generate a version based on a local_scheme.

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 21, 2023

The distinction I'm making is that they are not build artifacts that are generated on-the-fly during the build process.

Ah ok, I agree with this.

I don't think the logic needs to be tricky at all. I think it's just a matter of invoking setuptools_scm to generate a version based on a local_scheme.

To me calling setuptool_scm with a correct project root is also kind of tricky :-) Taking into account we need to be careful to not convert setuptools_scm into runtime dependency and thus we also need to keep the previous code which gets the version of the package if it exists. But I agree it can work in principle :-)

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 21, 2023

To add one more solution, we could use simple __init__.py:

__version__ = "0.1.6"

and replace setuptools_scm section in pyproject.toml by:

[tool.setuptools.dynamic]
version = {attr = "hyperon.__version__"}

It will not automatically get the information from .git though, thus I don't insist on it :-)

@luketpeterson
Copy link
Contributor

luketpeterson commented Sep 21, 2023

I don't have a strong opinion on any of this, but...

To add one more solution, we could use simple init.py:

that seems like a step backwards to not include automatic build & release info at all, even when it is available.

What about using the 3-tuple version from a hard-coded file, and augmenting it with build & release information. I would bet this is a capability of setuptools_scm.

To @wenwei-dev 's points:

  1. It guarantees uniqueness. Each commit that changes the source code also changes the version.

The versions are still unique even if the major 3 components come from a file.

  1. It is immutable unless it is forcefully updated, which makes it difficult to tamper with.

True, we'll need to be more discipled. But the Cargo versions are in the same boat.

  1. It is shared with all the sub-projects (C, Rust, Python).

If the minor versions come from setuptools_scm, which in turn gets the info from git then this property still holds for the minor versions. And the major versions will bump manually regardless, whether that's by making a tag or updating a file it will have the same effect.

  1. Mimimal maintance work once it is automated.

True. But once again the Cargo versions are in the same boat.

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 21, 2023

What about using the 3-tuple version from a hard-coded file, and augmenting it with build & release information. I would bet this is a capability of setuptools_scm.

Looks like it is not a case. setuptools_scm is poorly documented but from the source code it just gets the first version it finds, and if it finds it in git then it uses git's one.

@wenwei-dev
Copy link
Contributor Author

What about using the 3-tuple version from a hard-coded file, and augmenting it with build & release information. I would bet this is a capability of setuptools_scm.

Looks like it is not a case. setuptools_scm is poorly documented but from the source code it just gets the first version it finds, and if it finds it in git then it uses git's one.

In my opinion, the only problem with that is you will need to manually update the 3-tuple version and ensure it stays aligned with the git tag.

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 22, 2023

@wenwei-dev , what is the solution you like most?

@wenwei-dev
Copy link
Contributor Author

I'm not sure I understand the motivation for checking the Python version. Since the Python code is coupled with Rust/C code in the same repository, when they are tagged, they will be tagged as the same version. Also, because they are in the same repository, they should always be compatible. However, if we want to decouple it and treat the Python code as a standalone project, the best way is to put it in a separate repository.

@luketpeterson
Copy link
Contributor

... Since the Python code is coupled with Rust/C code in the same repository, when they are tagged, they will be tagged as the same version. Also, because they are in the same repository, they should always be compatible.

This version isn't to protect HyperonPy against HyperonC / Rust core version mismatch; it's to protect Hyperon clients and extensions against incompatible versions of HyperonPy & HyperonCore.

@wenwei-dev
Copy link
Contributor Author

Okay, thanks. I'm not sure what the right approach is. Maybe I'm not clear about the use cases.

I believe that using pip is the standard method for installing a Python package. In the README, it also suggests using the pip command to install it in edit mode python3 -m pip install -e ./python[dev]. Both work very well in the current implementation.

@vsbogd
Copy link
Collaborator

vsbogd commented Sep 26, 2023

I believe three main use-cases for us are:

  • using package installed from PyPi repository
  • using package installed locally via python -m pip -e
  • using package in a custom environment, with hyperonpy library built and python part just used by adding the path to the package into a sys.paths; this is the use-case for the advanced users and this case specifically is what Need HyperonPy version when Hyperon isn't installed via pip #434 was opened for.

First two I believe work with providing a version from the package. Last one requires providing some version from the __init__.py.

@wenwei-dev
Copy link
Contributor Author

wenwei-dev commented Sep 27, 2023 via email

Copy link
Collaborator

@vsbogd vsbogd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are discussing it for 2 weeks already. I suggest merging it and fix it in a next PR if needed.

@vsbogd
Copy link
Collaborator

vsbogd commented Oct 2, 2023

@luketpeterson , PR is blocked by your request for changes, could you please take a look and approve if you are agree with current state or continue discussion.

@luketpeterson
Copy link
Contributor

The change was to turn 0.1.6-localbuild to 0.1.6+localbuild, so I am happy.

@vsbogd vsbogd merged commit 01d0ba4 into trueagi-io:main Oct 2, 2023
1 check passed
@wenwei-dev wenwei-dev deleted the python-version branch October 2, 2023 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants