Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Invalid exceptions since #499 #521

Open
Arcitec opened this issue Oct 21, 2024 · 8 comments
Open

[Bug]: Invalid exceptions since #499 #521

Arcitec opened this issue Oct 21, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@Arcitec
Copy link
Contributor

Arcitec commented Oct 21, 2024

This is just an issue to track the situation, so that it's not forgotten. The pull request directly manipulated the internal values of the exceptions, but did not do the required changes to do so in a valid way.

There are two solutions:

  • Switch to Python 3.11 and use the official Exception Group feature. No tweaking needed.
  • Or carefully look at Python's upstream code and documentation and then rewrite the exception's internal state again to implement the missing tweaks that would make them valid again.

#499 (comment)

@Arcitec Arcitec added the bug Something isn't working label Oct 21, 2024
@Arcitec
Copy link
Contributor Author

Arcitec commented Nov 15, 2024

Now that OneTrainer has upgraded PyTorch and is compatible with Python 3.10, 3.11 and 3.12, it would be easy to switch to 3.11 as the minimum requirement, and use the clean exception group feature instead of hacking the (currently broken) internal exception state.

Yes, some beginners currently use the system-provided Python and some distros might only ship 3.10 (which distro?), but it's so trivial to install Miniconda with a single command, and Conda gives them a more reliable OneTrainer environment. So I don't think there's a strong argument to stay on Python 3.10.

Our launcher scripts already detect incorrect Python versions and automatically instruct the user to install Pyenv or Miniconda to solve it. So the transition will be smooth.

Just raise the Python version to 3.11 in lib.include.sh (both the minimum and the "use in conda env" fields), and then we can rewrite the exceptions to use a clean ExceptionGroup instead.

I use Python 3.12 in my Conda environment by the way, since I wanted to help test the latest supported version. Works perfectly.

@Nerogar
Copy link
Owner

Nerogar commented Nov 15, 2024

Now that OneTrainer has upgraded PyTorch and is compatible with Python 3.10, 3.11 and 3.12, it would be easy to switch to 3.11 as the minimum requirement,

It would still require a complete re-install for everyone. Including a python update for everyone on windows. A breaking change like that should only be done if there is a real benefit. Not just a small code cleanup.

@Arcitec
Copy link
Contributor Author

Arcitec commented Nov 15, 2024

@Nerogar Good point.

Well, @hameerabbasi if you have time sometime, could you go through the issues with the current exception refactor so we have non-corrupted internal states?

@hameerabbasi
Copy link
Contributor

@Arcitec If you have a reproducer I could take a look for sure.

@Arcitec
Copy link
Contributor Author

Arcitec commented Nov 15, 2024

@hameerabbasi Awesome! :) The issues are mentioned here:

#499 (comment)

There's a link to their docs which mentions some of the internal state changes that happen when __cause__ is set by the official Python code.

One of the things that is absolutely wrong in OneTrainer right now is the value of __suppress_context__.

Secondly, we would need to look in the CPython source code to look if there's anything else they do when raising exceptions from exceptions. This could be found by searching for __suppress_context__ to see where their code is setting it (it will be near the code that sets __cause__) and seeing what else they do when chaining exceptions.

It seems like we also need to set __context__, perhaps for the new exception, perhaps for the old, I don't know. Their source code should tell.

Lastly, we could put all those "chain exceptions hacks" in one of the OneTrainer library modules so we have a simple chain_exceptions(original_exception, new_exception) function to avoid repeating the code. It would set all properties correctly in both exceptions and return new_exception.

Or... moving to Python 3.11 as minimum to avoid all this and get some other nice features. ;) It's inevitable anyway someday.

@Arcitec
Copy link
Contributor Author

Arcitec commented Nov 15, 2024

@hameerabbasi Something hit me though. Have you tried raise e2 from e1? If we can take 2 existing caught exceptions and re-raise them from each other, Python would do all of this for us automatically.

Maybe even in this "ridiculous" way:

def chain_exceptions(original_exception: Exception, new_exception: Exception) -> Exception:
    try:
        raise new_exception from original_exception
    except Exception as e:
        return e

But it might ruin stack traces, or not work at all, who knows.

@hameerabbasi
Copy link
Contributor

The following works:

from collections.abc import Iterable
from functools import reduce
from typing import Callable, TypeVar

Ret = TypeVar("Ret")

def chain_exceptions(e1: Exception, e2: Exception) -> Exception:
    try:
        raise e2 from e1
    except Exception as e:
        return e

def chain_exception_list(excs: Iterable[Exception]) -> Exception:
    return reduce(chain_exceptions, excs)

def try_actions(actions: Iterable[Callable[[], Ret]]) -> Ret:
    exceptions = []
    for action in actions:
        try:
            return action()
        except Exception as e:
            exceptions.append(e)
    
    raise chain_exception_list(exceptions)

@hameerabbasi
Copy link
Contributor

It can be used as follows:

try_actions((lambda: func1(arg1, arg2, arg3, ...), lambda: func2(arg4, arg5, ...), ...))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants