Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document the new incremental GC #117759

Open
2 of 3 tasks
markshannon opened this issue Apr 11, 2024 · 4 comments
Open
2 of 3 tasks

Document the new incremental GC #117759

markshannon opened this issue Apr 11, 2024 · 4 comments
Assignees
Labels
docs Documentation in the Doc dir

Comments

@markshannon
Copy link
Member

markshannon commented Apr 11, 2024

Documentation

  • Expand the "what's new" section a bit.
  • Update Doc/library/gc.rst, speciifically how the thresholds work now.
  • Update the dev guide to explain how incremental collection works.

Linked PRs

@markshannon markshannon added the docs Documentation in the Doc dir label Apr 11, 2024
@markshannon markshannon self-assigned this Apr 11, 2024
markshannon added a commit that referenced this issue Aug 27, 2024
* Update what's new

* Update gc module docs and fix inconsistency in gc.get_objects
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 27, 2024
* Update what's new

* Update gc module docs and fix inconsistency in gc.get_objects
(cherry picked from commit f49a916)

Co-authored-by: Mark Shannon <[email protected]>
@pablogsal
Copy link
Member

Please, also modify https://github.com/python/devguide/blob/main/internals/garbage-collector.rst to keep it in sync with the current implementation

@markshannon
Copy link
Member Author

python/devguide#1379

@pablogsal
Copy link
Member

Thanks 🙏

Yhg1s pushed a commit that referenced this issue Sep 2, 2024
GH-117759: Document incremental GC (GH-123266)

* Update what's new

* Update gc module docs and fix inconsistency in gc.get_objects
(cherry picked from commit f49a916)

Co-authored-by: Mark Shannon <[email protected]>
@mr-bronson
Copy link

For the reader, it is confusing that in 3.13 Generation 1 is removed, but also *threshold2* is ignored. One would expect threshold1 to be what is ignored if it is generation 1 that is removed. Previously, the role of threshold2 was never documented anyway. But threshold1 was documented, and it related to generation 1, not generation 2.

Also, the document indicates no change to the values returned by get_threshold(), which leaves one wondering about whether the threshold for generation 2 is reflected in the third value (as previously) or whether the second value now holds the threshold for generation 2. Without checking, I have no idea.

(To me it would have made sense to simply change the signatures and return values of all functions that assumed there were three generations, and then document the changes as simply only having two generations now, rather than saying one is removed. If you're already introducing breaking changes in the way something works, then why leave the cruft of extra ignored parameters and undefined or blank return values? But if the goal is some misguided idea of compatibility (even though they fundamentally are not compatible) such that For compatibility, gc.get_objects() pretends there is a generation 1, then the same should apply to the values returned by get_threshold() and accepted by set_threshold(), in which case threshold1 rather than threshold2 should be ignored.)

Also, the document speaks in three places of the collection frequency. Can someone propose a better, less confusing wording? Frequency is count divided by time. But the so-called "frequency" of collection has nothing to do with time. If these are, as the function and argument names indicate, thresholds in a cycle, then it seems like it should talk about "collection cycle parameters" or something like that rather than frequency. Similarly, time words like "slower" are used when it's not about time:

The larger threshold1 is, the slower objects in the old generation
are collected.

So it's also confusing when the document equates the thresholds with "frequency" when they are inverse to "frequency":

Set the garbage collection thresholds (the collection frequency).

Related to this issue is #122862. It points out that the running calculation of the number of allocations minus the number of deallocations since the last collection is never allowed to go below 0, so the current description is inaccurate. The same problem exists in devguide/internals/garbage-collector.rst

Also, it is noted that Setting *threshold0* to zero disables collection. However, this is confusing because this is a totally different kind of disablement than spoken of elsewhere in the document. In other words, gc.isenabled() will still return True, and gc.enable() does not actually enable collection when threshold0 is zero. Using a different word like "inhibit" or "prevent" rather than "disable" might help to avoid confusion. And if negative values achieve the same thing as zero (don't know, just guessing), then it should be stated as "less than one".

(Actually, the better option would be for set_threshold to implement proper bounds checking in the first place, so setting less than one would result in an exception. Then you don't have to document stupid edge cases or distinguish between different kinds of disablement. Currently, you can set thresholds as low as -0x80000000, yet both this and the devguide say the threshold is inverse to the "frequency", but that's not true for more than half the possible values the function accepts.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

No branches or pull requests

3 participants