Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Number objects vs. numbers #50

Open
SuperLlama88888 opened this issue Sep 10, 2024 · 6 comments
Open

Number objects vs. numbers #50

SuperLlama88888 opened this issue Sep 10, 2024 · 6 comments
Assignees
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

@SuperLlama88888
Copy link

Platform: Web JS, imported from https://cdn.jsdelivr.net/npm/[email protected]/+esm

When reading NBT data, integers and bytes are expressed as Number() objects instead of numbers. These can only be created with the new Number() constructor and have the following drawbacks:

  • new Number(2) === 2 is false
  • typeof new Number(2) is object
  • Console debugging is a bit more awkward as they take up more space.

Is this decision intentional? I know this is true for integers and bytes, but I haven't tested other number types.

Thank you for your time and such a useful library!

@Offroaders123 Offroaders123 self-assigned this Sep 10, 2024
@Offroaders123 Offroaders123 added documentation Improvements or additions to documentation question Further information is requested labels Sep 10, 2024
@Offroaders123
Copy link
Owner

General answer

At least for the meantime, this is intentional yes.

Since JavaScript only has a single number primitive type, NBTify needed a way to distinguish what method a given numerical value is intended to be managed in terms of the NBT typings.

Since JavaScript's number type is equivalent to that of a double type in Java, programs would have to assume the type of what a given NBT key-value is, depending on the current state of that value in the file, which seems unsafe/lossy to me.

Deeper dive

Say we look at Java Edition's world save format for example, the level.dat file.

level.dat Key Examples
  • The key DataVersion is expected to hold a value of type int, which allows numbers in a range from -2147483648 to 2147483647.

  • The key DayTime is expected to hold a value of type long, which allows numbers in a range from -9223372036854775808 to 9223372036854775807, which is much larger.

  • The keys Difficulty and DifficultyLocked however, are expected to hold a value of type byte, which allows numbers in a range from -128 to 127, which is significantly smaller.

Number type persistence

If each of these values were to be read into JavaScript-land as plain number types, this range information would be lost, and would either have to be handled in the writing stage by the user manually (which makes the API harder to use since they would have to do this transformation manually for every file that has non-double number types), or things would be re-written again to the file, with completely different number scales.

The base issue is that JavaScript doesn't provide enough primitive number types that will losslessly persist this intended type information down to the NBT object tree, where everything is represented in JavaScript-land. Ideally, JavaScript could have something like byte, short, int, and float types to fix this for us, but right now that isn't something planned, or available to us. Thankfully, we do have the bigint type though, which is plenty enough to hold values for the long type that we need to represent.

I also don't really like the usage of needing to interact with these Number objects as objects themselves, but I would rather have to deal with that on it's own, than lose precision, when reading and overwriting an NBT file back again:

import { read, write, type NBTData } from "nbtify";
import { Buffer } from "node:buffer";

// Your NBT file from the file system, network, etc.
declare const input: Uint8Array;

// NBT object tree
const nbt: NBTData = await read(input);

// Re-compiled NBT file
const output: Uint8Array = await write(nbt);

// Should return with `0`, as the bytes for these files should ideally be identical.
const identical: -1 | 0 | 1 = Buffer.compare(input, output);

Alternate solution: Type inference by value range

One alternate take could be that inferences can be made by the current state of the value, and you could decide the primitives to use, on the fly when writing them back to the file. I really like the premise of this, and I se where it is coming from. However, this has an edgecase that makes things unsafe when a value in one of the larger value-sized types currently holds a small value, which makes the inference inaccurate depending on the current value.

For the level.dat example above, the Difficulty value (byte) could hold a value of 0 if the game is in Peaceful. With the typings inference model, inferring the type of 0 means it would safely resolve to the smallest type, being byte this time around. For this case, it happens to be the correct inference.

However for the DataVersion value (int), it may also initially hold a value of 0. So when inferring the type from that, it would also be inferred as a byte type. This changes how the value will be serialized, and the resulting file will no longer be synonymous with the input file, because the current state of DataVersion is now stored using less bytes than it was to begin with. The value is persisted, but how it is stored has changed.

This comes back to the part about possible user-interaction. If the types of the NBT file can be pre-documented to a schema of sorts, maybe this could be handled by that, and corrected when writing the file. However, this would mean that every NBT file would potentially need type documentation to write it back symmetrically. This seems too far-fetched to me. Anyone can write an NBT file, so not all files can possibly have documentation.

Objects are hard to use

Yeah I agree, I don't like this part of it either. I wish we wouldn't have to unwrap these primitives with things like myValue.valueOf() to get the plain number out of it again. If there is some other way of making this work, while ensuring the types can be preserved, I would love to know!

If anything, I have been wanting to see if there's a way to still use ES6 classes, and also allow for the syntax like Int8(), Int32(), compared to requiring the use of new Int8(), and new Int32().

import { Int8 } from "nbtify";

// Current
const byte0 = new Int8(25);

// Potential option, easier!
// More like `Number()` and `BigInt()` too!
const byte1 = Int8(80);

Sorry this was so long, I haven't really documented this concept outright in the repo anywhere yet, so I thought I would cover all of the bases while it was relevant in the discussion.

Thanks for mentioning this!

@SuperLlama88888
Copy link
Author

Thank you for such a detailed response! I think the current system deals with JavaScript's shortcomings in the best way possible across all uses.

If each of these values were to be read into JavaScript-land as plain number types, this range information would be lost

I didn't quite understand this bit - is precision stored in these Number objects? When I use it in the browser there isn't anything else in the Number object but the number itself - like this:
image

@Offroaders123
Copy link
Owner

The usage of the Number object itself isn't any different, no. It's how NBTify detects what kind of number it is, once it encounters it though, as each number class has it's own prototype inheritance (it is it's own extension of Number).

NBTify Primitive Type Detection

So if the number object value you pass in is an instance of NBTify's Int32 class for example, then we know that it should be serialized with NBT's TAG_Int type.

I did just notice though, now I think I see the confusion around the number types when you're using NBTify. From the CDN build of the package (or maybe it's just Firefox's developer tools), the names of the classes' names appear to be removed, either with the minification from the CDN build, or because of the dev tools.

@SuperLlama88888
Copy link
Author

Thank you for this! In Chrome, it works slightly better:
image
The a, i and o are the class names after being mangled by the minification process. It looks like it's something on jsDelivr's end then.
It makes much more sense with different classes extending Number there, just a shame that this is lost in the debugging process.

Again, thank you so much for your dedication towards helping me with this, you've been really helpful. Cheers!

@Offroaders123
Copy link
Owner

I think I realized a possible way to make this work a bit nicer, at least from the TypeScript side of things. I can possibly expose the typings of these primitives simply as a branded type, rather than an explicit Number, so it can instead just look more similar to that of a number. This could be deceiving though, so maybe it wouldn't be completely straightforward to users.

Once I get a working demo of this setup I can link it here for an example.

Offroaders123 added a commit that referenced this issue Sep 20, 2024
This seems to be a much more useable API in terms of the TypeScript side of things. It's completely symmetrical in terms of the JavaScript implementation, so nothing changes there. It essentially just allows you to type-safely use a `Int8` for example, just as a regular `number` type. This may sound dangerous in some respects, and yes this is just a concept, so I plan to look into it to see if anything is wrong with doing it like this. Essentially, JS already converts `number` primitives back and forth for you, say like when you do `(5).toString()` for example. It's making a primitive wrapper object of that value, for a single line, and getting the result of that method call, from that object that was created under the hood.

So the concept here is to use branded types to distinguish apart the number primitives, and use real `Number`-extended wrapper objects in runtime to distinguish them apart there as well. Being able to extend `Number` also allows you to pass them around and use them safely as plain `number` values too, since JS is able to do this conversion for us under the hood already. Like if you do `new Number(5) + 2`, you get `7`. TypeScript will warn you of doing this, and that has been my issue with using the wrappers for regular scripting purposes. I wanted to be able to pass them around as pure values, and not have to call `valueOf()` every single time I access their values.

Another side to this though, is that this does notable break the concerns that TS provides, even though this is very much nearly doable in most JS cases. Say like you use `typeof myByte === "number"`, it will be `false` because it's actually an `object`. So maybe it makes sense to explicitly unwrap the value when you completely need to, and if not, you can still safely narrow the type and use addition for example, and decide on your own when it is safe to unwrap the type or not. I can provide more examples for this later on. I think this may be the way I take things, I really like that it allows for the use of regular function calls too, rather than with the `new` prefix.

#50

https://www.youtube.com/watch?v=Yz8ySbaeCf8
https://medium.com/@apalshah/javascript-class-difference-between-es5-and-es6-classes-a37b6c90c7f8
https://esdiscuss.org/topic/extending-an-es6-class-using-es5-syntax
Offroaders123 added a commit that referenced this issue Sep 27, 2024
I think I found an actually fairly nice way to handle this!

I can assert the exported types of the classes, and make them more accurate to how they are used, rather than specific to how they are implemented. Going to look into this now, it seems to look like it works very well! Using ES6 classes still is a big plus as well, as it still works with instanceof checks.

TypeScript doesn't like `instanceof` calls on non-object or `never` values though, so if needed I can implement a static method maybe, like arrays? It seems to be okay with using `instanceof` on an `unknown` value, which is good.

```ts
  static isInt8(value: unknown): value is Int8 {
    return value instanceof Int8;
  }
```

I also really like this custom types export setup because it doesn't change the usage of existing programs, and it's only at the type level too, so things functionally won't change at all at the runtime level.

Flagman, Lifesigns!

#50

This commit is essentially a revert back to the start of this branch, then starting new with this custom approach.
@Offroaders123
Copy link
Owner

Ok, I've merged a working version of this into the main branch, going to work with it for a bit to see if it has any issues with existing type usages. It should work nicely though! We'll see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants