Merge remote-tracking branch 'upstream/master'
goober3 committed Mar 17, 2024
2 parents 9adb81e + c14e1e5 commit 3a334ee
Showing 3,541 changed files with 415,341 additions and 325,128 deletions.
4 changes: 4 additions & 0 deletions .github/CONTRIBUTING.md
@@ -350,6 +350,10 @@ This prevents nesting levels from getting deeper then they need to be.
- [tgui/README.md](../tgui/README.md)
- [tgui/tutorial-and-examples.md](../tgui/docs/tutorial-and-examples.md)

### Don't create code that hangs references

This is part of the larger issue of hard deletes. Read this file for more info: [Guide to Harddels](HARDDEL_GUIDE.md)

### Other Notes

- Code should be modular where possible; if you are working on a new addition, then strongly consider putting it in its own file unless it makes sense to put it with similar ones (i.e. a new tool would go in the "tools.dm" file)
265 changes: 265 additions & 0 deletions .github/HARDDEL_GUIDE.md
@@ -0,0 +1,265 @@
# Hard Deletes
1. [What is hard deletion](#what-is-hard-deletion)
2. [Causes of hard deletes](#causes-of-hard-deletes)
3. [Detecting hard deletes](#detecting-hard-deletes)
4. [Techniques for fixing hard deletes](#techniques-for-fixing-hard-deletes)
5. [Help my code is erroring how fix](#help-my-code-is-erroring-how-fix)

## What is Hard Deletion
Hard deletion is a very expensive operation that basically clears all references to some "thing" from memory. Objects that undergo this process are referred to as hard deletes, or simply harddels

What follows is a discussion of the theory behind this, why we would ever do it, and what we do to avoid doing it as often as possible

I'm gonna be using words like references and garbage collection, but don't worry, it's not complex, just a bit hard to pierce

### Why do we need to Hard Delete?

Ok so let's say you're some guy called Jerry, and you're writing a programming language

You want your coders to be able to pass around objects without doing a full copy. So you'll store the pack of data somewhere in memory

```dm
/someobject
	var/id = 42
	var/name = "some shit"
```

Then you want them to be able to pass that object into say a proc, without doing a full copy. So you let them pass in the object's location in memory instead
This is called passing something by reference

```dm
someshit(someobject) //This isn't making a copy of someobject, it's passing in a reference to it
```

This of course means they can store that location in memory in another object's vars, or in a list, or whatever

```dm
/datum
	var/reference

/proc/someshit(mem_location)
	var/datum/some_obj = new()
	some_obj.reference = mem_location
```

But what happens when you get rid of the object we're passing around references to? If we just cleared it out from memory, everything that holds a reference to it would suddenly be pointing to nowhere, or worse, something totally different!

So then, you've gotta do something to clean up these references when you want to delete an object

We could hold a list of references to everything that references us, but god, that'd get really expensive wouldn't it

Why not keep count of how many times we're referenced then? If an object's ref count is ever 0, nothing whatsoever cares about it, so we can freely get rid of it

But if something's holding onto a reference to us, we're not gonna have any idea where or what it is

So I guess you should scan all of memory for that reference?

```dm
del(someobject) //We now need to scan memory until we find the thing holding a ref to us, and clear it
```

This is roughly how BYOND handles the problem of hanging references, a process known as Garbage Collection

It's not a broken system, but as you can imagine scanning all of memory gets expensive fast

What can we do to help that?

### How we can avoid hard deletes

If hard deletion is so slow, we're gonna need to clean up all our references ourselves

In our codebase we do this with `/datum/proc/Destroy()`, a proc called by `qdel()`, whose purpose I will explain later

This proc's only job is cleaning up references to the object it's called on. Nothing more, nothing less. Don't let me catch you giving it side effects

There's a long long list of things this does, since we use it a TON. So I can't really give you a short description. It will always move the object to nullspace though

## Causes Of Hard Deletes

Now that you know the theory, let's go over what can actually cause hard deletes. Some of this is obvious, some of it's much less so.

The BYOND reference has a list [Here](https://secure.byond.com/docs/ref/#/DM/garbage), but it's not a complete one

* Stored in a var
* An item in a list, or associated with a list item
* Has a tag
* Is on the map (always true for turfs)
* Inside another atom's contents
* Inside an atom's vis_contents
* A temporary value in a still-running proc
* Is a mob with a key
* Is an image object attached to an atom

Let's briefly go over the more painful ones yeah?

### Sleeping procs

Any proc that calls `sleep()`, `spawn()`, or anything that creates a separate "thread" (not technically a thread, but it's the same in these terms. Not gonna cause any race conditions tho) will hang references to any var inside it. This includes the usr it started from, the src it was called on, and any vars created as a part of processing
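
As a rough illustration of the sleeping case, here's a minimal sketch. The `/obj/item/example_beeper` type and its `beep_later()` proc are made up for this example, and `SECONDS` and `to_chat()` are assumed to be the usual codebase helpers

```dm
// Hypothetical type and proc, for illustration only
/obj/item/example_beeper/proc/beep_later(mob/listener)
	// While this proc sleeps, it keeps references to src, usr, and listener alive
	sleep(10 SECONDS)
	// Either of us may have been qdel()'d during the sleep, so check before touching anything
	if(QDELETED(src) || QDELETED(listener))
		return
	to_chat(listener, "Beep!")
```

The check after the sleep doesn't stop the references from being held during the wait. It just keeps the proc from acting on something that's already been destroyed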

### Static vars

`/static` and `/global` vars count for this too, they'll hang references just as well as anything. Be wary of this, these suckers can be a pain to solve
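
A sketch of how this tends to sneak in, assuming a made up `/obj/machinery/example_console` type with made up procs

```dm
// Hypothetical machine type, for illustration only
/obj/machinery/example_console
	// Shared by every instance of the type, and it never goes away on its own
	var/static/mob/last_user

/obj/machinery/example_console/proc/remember_user(mob/user)
	last_user = user // This ref outlives any individual console

/obj/machinery/example_console/proc/forget_user(mob/user)
	if(last_user == user)
		last_user = null // Something has to call this before user is deleted
```

Deleting the console does nothing to `last_user` here, so the mob stays referenced until some other code clears the var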

### Range() and View() like procs

Some internal BYOND procs will hold references to objects passed into them for a time after the proc is finished doing work, because they cache the returned info to make some code faster. You should never run into this issue, since we wait for what should be long enough to avoid this issue as a part of garbage collection

This is what `qdel()` does by the by, it literally just means queue deletion. A reference to the object gets put into a queue, and if it still exists after 5 minutes or so, we hard delete it
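
To make the difference concrete, here's a small sketch. The `example_cleanup()` helper is hypothetical, only `qdel()` and `del()` come from the text above

```dm
// Hypothetical helper, for illustration only
/proc/example_cleanup(obj/item/thing)
	// Calls thing.Destroy() and puts thing in the garbage collection queue
	// If every reference gets cleared, BYOND frees it without any memory scan
	qdel(thing)

	// By contrast, del(thing) would force the expensive scan right away
```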

### Walk() procs

Calling `walk()` on something will put it in an internal queue, which it'll remain in until `walk(thing, 0)` is called on it, which removes it from the queue

This sort is very cheap to harddel, since BYOND prioritizes checking this queue first when it's clearing refs, but it should be avoided since it causes false positives

You can read more about how BYOND prioritizes these things [Here](https://www.patreon.com/posts/diving-for-35855766)
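
If an object of yours does use `walk()`, one way to handle it is to stop the walk as a part of `Destroy()`. This is only a sketch, and the `/mob/living/example_chaser` type and `start_wandering()` proc are made up

```dm
// Hypothetical mob type, for illustration only
/mob/living/example_chaser/proc/start_wandering(direction)
	walk(src, direction, 2) // Puts src in BYOND's internal walk queue

/mob/living/example_chaser/Destroy()
	walk(src, 0) // Takes src back out of the walk queue
	return ..()
```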

## Detecting Hard Deletes

For very simple hard deletes, simple inspection should be enough to find them. Look at what the object does during `Initialize()`, and see if it's doing anything it doesn't undo later.
If that fails, search the object's typepath, and look and see if anything is holding a reference to it without regard for the object deleting

BYOND currently doesn't have the capability to give us information about where a hard delete is. Fortunately we can search for most all of them ourselves.
The procs to perform this search are hidden behind compile time defines, since they'd be way too risky to expose to admin button pressing

If you're having issues solving a harddel and want to perform this check yourself, go to `_compile_options.dm` and uncomment `TESTING`, `REFERENCE_TRACKING`, and `GC_FAILURE_HARD_LOOKUP`

You can read more about what each of these does in that file, but the long and short of it is this: if something would hard delete, our code will search for the reference (this will look like your game crashing, just hold out) and print information about anything it finds to the runtime log, which you can find inside the round folder inside `/data/logs/year/month/day`

It'll tell you what object is holding the ref if it's in an object, or what pattern of list traversal was required to find the ref if it's hiding in a list of some sort
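
For reference, the three defines named above would look roughly like this once uncommented. The exact layout of `_compile_options.dm` is an assumption here, and in your copy some of these may be nested inside an `#ifdef` block

```dm
// In _compile_options.dm, with the leading comment markers removed
#define TESTING
#define REFERENCE_TRACKING
#define GC_FAILURE_HARD_LOOKUP
```

Remember to comment them back out before you commit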

## Techniques For Fixing Hard Deletes

Once you've found the issue, it becomes a matter of making sure the ref is cleared as a part of Destroy(). I'm gonna walk you through a few patterns and discuss how you might go about fixing them

### Our Tools

First and simplest we have `Destroy()`. Use this to clean up after yourself for simple cases

```dm
/someobject/Initialize()
	. = ..()
	GLOB.somethings += src //We add ourselves to some global list

/someobject/Destroy()
	GLOB.somethings -= src //So when we Destroy(), we remove ourselves from that list
	return ..()
```

Next, and slightly more complex, pairs of objects that reference each other

This is helpful for cases where both objects "own" each other

```dm
/someobject
	var/someotherobject/buddy

/someotherobject
	var/someobject/friend

/someobject/Initialize()
	. = ..()
	if(!buddy)
		buddy = new()
		buddy.friend = src

/someotherobject/Initialize()
	. = ..()
	if(!friend)
		friend = new()
		friend.buddy = src

/someobject/Destroy()
	if(buddy)
		buddy.friend = null //Make sure to clear their ref to you
		buddy = null //We clear our ref to them to make sure nothing goes wrong
	return ..() //Destroy() should always return its parent call

/someotherobject/Destroy()
	if(friend)
		friend.buddy = null //Make sure to clear their ref to you
		friend = null //We clear our ref to them to make sure nothing goes wrong
	return ..()
```

Something similar can be accomplished with `QDELETED()`, a define that checks to see if something has started being `Destroy()`'d yet, and `QDEL_NULL()`, a define that `qdel()`'s a var and then sets it to null
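
Here's a minimal sketch of both defines in use. The gun and magazine types are hypothetical, and this assumes `QDELETED()` also treats a null var as deleted, which is how it's usually defined

```dm
// Hypothetical types, for illustration only
/obj/item/example_magazine

/obj/item/example_gun
	var/obj/item/example_magazine/magazine

/obj/item/example_gun/Destroy()
	QDEL_NULL(magazine) // We own the magazine, so qdel() it and null our var in one step
	return ..()

/obj/item/example_gun/proc/has_usable_magazine()
	// False if we never had a magazine, or if it has started being Destroy()'d
	return !QDELETED(magazine)
```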

Now let's discuss something a bit more complex, weakrefs

You'll need a bit of context, so let's do that now

BYOND has an internal bit of behavior that looks like this

`var/string = "\ref[someobject]"`

This essentially gets that object's position in memory directly. Unlike normal references, this doesn't count for hard deletes. You can retrieve the object in question by using `locate()`

`var/someobject/someobj = locate(string)`

This has some flaws however, since the bit of memory we're pointing to might change, which would cause issues. Fortunately we've developed a datum to handle worrying about this for you, `/datum/weakref`

You can create one using the `WEAKREF()` proc, and use `weakref.resolve()` to retrieve the actual object

This should be used for things that your object doesn't "own", but still cares about

For instance, a paper bin would own the paper inside it, but the paper inside it would just hold a weakref to the bin

There's no need to clean these up, just make sure you account for it being null, since it'll return that if the object doesn't exist or has been queued for deletion

```dm
/someobject
	var/datum/weakref/our_coin

/someobject/proc/set_coin(obj/item/coin/new_coin)
	our_coin = WEAKREF(new_coin)

/someobject/proc/get_value()
	if(!our_coin)
		return 0

	var/obj/item/coin/potential_coin = our_coin.resolve()
	if(!potential_coin)
		our_coin = null //Remember to clear the weakref if we get nothing
		return 0
	return potential_coin.value
```
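
The paper bin case mentioned above could be sketched the same way. All of the types and procs here are made up for illustration, the only things taken from this guide are `WEAKREF()` and `resolve()`

```dm
// Hypothetical types, for illustration only
/obj/item/example_paper_bin
	var/list/papers = list() // The bin owns its paper, so it holds real refs

/obj/item/example_paper
	var/datum/weakref/bin_ref // The paper only cares about the bin, so it holds a weakref

/obj/item/example_paper_bin/proc/add_paper(obj/item/example_paper/sheet)
	papers += sheet
	sheet.bin_ref = WEAKREF(src)

/obj/item/example_paper_bin/Destroy()
	for(var/obj/item/example_paper/sheet in papers)
		qdel(sheet) // We own these, so they go down with us
	papers.Cut()
	return ..()

/obj/item/example_paper/Destroy()
	var/obj/item/example_paper_bin/bin
	if(bin_ref)
		bin = bin_ref.resolve() // Null if the bin is gone or already being deleted
	if(bin)
		bin.papers -= src // The bin's list is a real ref to us, so clear it
	return ..()
```

The paper never has to clean up `bin_ref` itself, but it does still have to remove itself from the bin's list, since that list is an ordinary reference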

Now, for the worst case scenario

Let's say you've got a var that's used too often to be weakref'd without making the code too expensive

You can't hold a paired reference to it because it's not like it would ever care about you outside of just clearing the ref

So then, we want to temporarily remember to clear a reference when it's deleted

This is where I might lose you, but we're gonna use signals

`qdel()`, the proc that sets off this whole deletion business, sends a signal called `COMSIG_PARENT_QDELETING`

We can listen for that signal, and if we hear it clear whatever reference we may have

Here's an example

```dm
/somemob
	var/mob/target

/somemob/proc/set_target(new_target)
	if(target)
		UnregisterSignal(target, COMSIG_PARENT_QDELETING) //We need to make sure any old signals are cleared
	target = new_target
	if(target)
		RegisterSignal(target, COMSIG_PARENT_QDELETING, .proc/clear_target) //Call clear_target if target is ever qdel()'d

/somemob/proc/clear_target(datum/source)
	SIGNAL_HANDLER
	set_target(null)
```

This really should be your last resort, since signals have some limitations. If some subtype of somemob also registered for `COMSIG_PARENT_QDELETING` on the same target you'd get a runtime, since a datum can't register for the same signal on the same target twice

But if you can't do anything else for reasons of conversion ease, or hot code, this will work

## Help My Code Is Erroring How Fix

First, do a quick check.

Are you doing anything to the object in `Initialize()` that you don't undo in `Destroy()`? I don't mean like, setting its name, but are you adding it to any lists, stuff like that

If this fails, you're just gonna have to read over this doc. You can skip the theory if you'd like, but it's all pretty important for having an understanding of this problem
36 changes: 36 additions & 0 deletions .github/MC_tab.md
@@ -0,0 +1,36 @@
The MC tab holds information on how the game is performing. Here's a crash course on what the most important of those numbers mean.

If you already know what these numbers mean and you want to see them update faster than the default refresh rate of once every 2 seconds, you can enable the admin pref to make the MC tab refresh every 4 deciseconds. Please don't do this unless you actually need that information at a faster refresh rate, since updating every subsystem's information is expensive.

# Main Entries:

* CPU: What percentage of a tick the game is using before starting the next tick. If this is above 100 it means we are over budget.

* TickCount: How many ticks should have elapsed since the start of the game if no ticks were ever delayed from starting.

* TickDrift: How many ticks since the game started that have been delayed. Essentially this is how many ticks the game is running behind. If this is increasing then the game is currently not able to keep up with demand.

* Internal Tick Usage: You might have heard of this referred to as "maptick". It's how much of the tick that an internal byond function called SendMaps() has taken recently. The higher this is the less time our code has to run. SendMaps() deals with sending players updates of their view of the game world so it has to run every tick but it's expensive so ideally this is optimized as much as possible. You can see a more detailed breakdown of the cost of SendMaps by looking at the profiler in the debug tab -> "Send Maps Profile".

# Master Controller Entry:

* TickRate: How many BYOND ticks go between each master controller iteration. By default this is 1, meaning the MC runs once every BYOND tick, but certain configurations can increase this slightly.

* Iteration: How many times the MC has run since starting.

* TickLimit: This SHOULD be what percentage of the tick the MC can use when it starts a run, however currently it just represents how much of the tick the MC can use by the time that SSstatpanels fires. Someone should fix that.

# Subsystem Entries:

Subsystems will typically have a base stat entry of the form:

`[ ] Name 12ms|28%(2%)|3`

The brackets hold a letter if the subsystem is in a state other than idle.

The first numbered entry is the cost of the subsystem, which is a running average of how many milliseconds the subsystem takes to complete a full run. This is increased every time the subsystem resumes an uncompleted run or starts a new run and decays when runs take less time. If this balloons to huge values then it means that the amount of work the subsystem needs to complete in a run is far greater than the amount of time it actually has to execute in whenever it is its turn to fire.

The second numbered entry is like cost, but in percentage of an ideal tick this subsystem takes to complete a run. They both represent the same data.

The third entry (2%) is how much time this subsystem spent executing beyond the time it was allocated by the MC. This is bad, it means that this subsystem doesn't yield when it's taking too much time and makes the job of the MC harder. The MC will attempt to account for this but it is better for all subsystems to be able to correctly yield when their turn is done.

The fourth entry represents how many times this subsystem fires before it completes a run.
21 changes: 21 additions & 0 deletions .github/TICK_ORDER.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
The byond tick proceeds as follows:
1. procs sleeping via walk() are resumed (I don't know why these are first)

2. normal sleeping procs are resumed, in the order they went to sleep in the first place. This is where the MC wakes up and processes subsystems. A consequence of this is that the MC almost never resumes before other sleeping procs: it only goes to sleep for 1 tick 99% of the time, and 99% of procs either go to sleep for less time than the MC (which guarantees that they entered the sleep queue earlier when it's time to wake up) and/or were called synchronously from the MC's execution. So almost all of the time the MC is the last sleeping proc to resume in any given tick. This is good because it means the MC can account for the cost of previous resuming procs in the tick, and minimizes overtime.

3. control is passed to byond after all of our code's procs stop execution for this tick

4. a few small things happen in byond internals

5. SendMaps is called for this tick, which processes the game state for all clients connected to the game and handles sending them changes in appearances within their view range. This is expensive and takes up a significant portion of our tick, about 0.45% per connected player as of 3/20/2022, meaning that with 50 players, 22.5% of our tick is being used up by just SendMaps, after all of our code has stopped executing. That's only the average across all rounds; for most highpop rounds it can look like 0.6% of the tick per player, which is 30% for 50 players.

6. After SendMaps ends, client verbs sent to the server are executed, and it's the last major step before the next tick begins. During the course of the tick, a client can send a command to the server saying that they have executed any verb. The actual code defined for that /verb/name() proc isn't executed until this point, and the way the MC is designed makes this especially likely to make verbs "overrun" the bounds of the tick they executed in, stopping the next tick from starting and thus delaying the MC firing in that tick.

The master controller can derive how much of the tick was used in: procs executing before it woke up (because of world.tick_usage), and SendMaps (because of world.map_cpu; since this is a running average you can't derive the time spent on maptick in any particular tick). It cannot derive how much of the tick was used for sleeping procs resuming after the MC ran, or for verbs executing after SendMaps.

It is for these reasons that you should heavily limit processing done in verbs. While procs resuming after the MC are rare, verbs are not, and are much more likely to cause overtime since they're literally at the end of the tick. If you make a verb, try to offload any expensive work to the beginning of the next tick via a verb management subsystem.
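
As a very rough sketch of keeping a verb body cheap, here's one way to push the heavy part out of the verb. Note that this swaps in a plain timer rather than the verb management subsystem recommended above, it assumes the codebase's usual `addtimer()` and `CALLBACK()` helpers, and the verb and proc names are made up

```dm
// Hypothetical verb and proc names, for illustration only
/mob/verb/example_expensive_verb()
	set name = "Example Expensive Verb"
	// Do nothing heavy here, verbs run at the very end of the tick
	addtimer(CALLBACK(src, .proc/do_expensive_work), 1)

/mob/proc/do_expensive_work()
	// The heavy lifting now runs from a subsystem, where the MC can account for it
	return
```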