Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 12053- CesiumJS should not use push.apply #12176

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Tim-Quattrochi
Copy link

@Tim-Quattrochi Tim-Quattrochi commented Aug 31, 2024

Description

Change push.apply to use for loops to avoid exceeding the maximum call stack.

Issue number and link

#12053

Testing plan

Author checklist

  • I have submitted a Contributor License Agreement
  • I have added my name to CONTRIBUTORS.md
  • I have updated CHANGES.md with a short summary of my change
  • I have added or updated unit tests to ensure consistent code coverage
  • I have updated the inline documentation, and included code examples where relevant
  • I have performed a self-review of my code

Copy link

Thank you for the pull request, @Tim-Quattrochi!

✅ We can confirm we have a CLA on file for you.

@Tim-Quattrochi Tim-Quattrochi marked this pull request as ready for review August 31, 2024 01:17
@Tim-Quattrochi Tim-Quattrochi changed the title Issue 12053 Issue 12053- CesiumJS should not use push.apply Aug 31, 2024
@ggetz
Copy link
Contributor

ggetz commented Sep 3, 2024

Thanks @Tim-Quattrochi!

@javagl Are you available to review?

Copy link
Contributor

@javagl javagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the "pattern" for changing push.apply into a for-loop (mentioned in one inlined comment) :

One of the (faaar too many) quirks of JavaScript is: It is not possible to know what some piece of code is doing just by reading it. Never. Never ever. Period.

Many people might think that this code will print 42:

function exampleFunction(exampleObject) {
  exampleObject.exampleProperty = 42;
  console.log(exampleObject.exampleProperty);
}

Maybe it does. Maybe it doesn't. It could also start playing a funny cat video and format your hard drive.

Specifcally, for the push.apply case:

When calling push.apply and passing in an array, then this array will be used directly.
When changing this to a for-loop, one has to be careful to also use the array directly (i.e. the minimize the "change in behavior" that could result from this refactoring).

Consider the following dummy example:

// A dummy class that has a "getSources" function that returns an array
function ExampleClass() {
  this.sources = ["A", "B", "C"];
};
ExampleClass.prototype.getSources = function() {
  console.log("getSources is called, and returns ", this.sources);
  return this.sources;
}

// Use the getSources with push-apply
function exampleFunctionWithPushApply(exampleObject) {
  const targets = [];
  targets.push.apply(targets, exampleObject.getSources());
  console.log("push.apply targets: ", targets);
}

// Use getSources with a for-loop in the wrong way
function exampleFunctionWithForLoopWrong(exampleObject) {
  const targets = [];
  for (let i=0; i<exampleObject.getSources().length; i++) {
    targets.push(exampleObject.getSources()[i]);
  }
  console.log("for-loop A targets: ", targets);
}

// Use getSources with a for-loop in the right way
function exampleFunctionWithForLoopRight(exampleObject) {
  const targets = [];
  const sources = exampleObject.getSources();
  for (let i=0; i<sources.length; i++) {
    targets.push(sources[i]);
  }
  console.log("for-loop B targets: ", targets);
}

const exampleObject = new ExampleClass();

console.log("Running with push.apply");
exampleFunctionWithPushApply(exampleObject);

console.log("Running with for-loop (wrong)");
exampleFunctionWithForLoopWrong(exampleObject);

console.log("Running with for-loop (right)");
exampleFunctionWithForLoopRight(exampleObject);

It defines an object with some getSources function that returns an array (and prints a message saying when getSources is called).

This array is then used...

  • once in push.apply
  • once in a for-loop (the wrong way)
  • once in a for-loop (the right way)

The output when running this with node Example.js will be

Running with push.apply
getSources is called, and returns  [ 'A', 'B', 'C' ]
push.apply targets:  [ 'A', 'B', 'C' ]
Running with for-loop (wrong)
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
getSources is called, and returns  [ 'A', 'B', 'C' ]
for-loop A targets:  [ 'A', 'B', 'C' ]
Running with for-loop (right)
getSources is called, and returns  [ 'A', 'B', 'C' ]
for-loop B targets:  [ 'A', 'B', 'C' ]

One can see that in the "wrong" case, getSources is called 7 times. Specifically, in
for (let i=0; i<exampleObject.getSources().length; i++)
it is called for each loop iteration, once for the length check, and once for the access.

And since nobody knows what getSources is doing, this might be arbitrarily expensive. Maybe in reality, it has to read these "sources" from the hard disc, or via some network request or whatnot. (Or it plays 7 cat videos, FWIW 🙂 )

Generally speaking: The arrays that had originally been passed to push.apply as the last parameter should be "pulled out" into a local variable (as shown in exampleFunctionWithForLoopRight).

One could even make a case to apply the pattern that is often used in the CesiumJS source code, for exactly this reason, namely, to even pull out the length. So instead of

  const sources = exampleObject.getSources();
  for (let i=0; i<sources.length; i++) {
  ...

it could be

  const sources = exampleObject.getSources();
  const sourcesLength = sources.length;
  for (let i=0; i<sourcesLength; i++) {
  ...

(but one has to be careful to not mix up any "lengths" there, when multiple arrays are involved in that context)


In some cases, it might be possible to pre-allocate space in the target array. Instead of

const sources = ...;
const n = sources.length;
const targets = [];
for (let i=0; i<n; i++) {
  targets.push(doSomethingWith(sources[i]));
}

it could be

const sources = ...;
const n = sources.length;
const targets = Array(n);
for (let i=0; i<n; i++) {
  targets[i] = doSomethingWith(sources[i]);
}

But... unless this is trivially possible, it may be a "premature optimization": It would be necessary to really carefully check that the size is known and fixed beforehand.

ThirdParty/codemirror-5.52.0/lib/codemirror.js Outdated Show resolved Hide resolved
packages/engine/Source/Scene/Cesium3DTileStyle.js Outdated Show resolved Hide resolved
packages/engine/Source/Scene/GltfLoader.js Outdated Show resolved Hide resolved
packages/engine/Source/Scene/PropertyTable.js Outdated Show resolved Hide resolved
@jjhembd
Copy link
Contributor

jjhembd commented Sep 6, 2024

Not to distract from the main point here, but:

the pattern that is often used in the CesiumJS source code, for exactly this reason, namely, to even pull out the length.

I have actually been changing this pattern… because:

  • it is extra mental overhead when reading the code
  • The pattern for (let i = 0; i < array.length; i++) is so boilerplate that the .length accessor is already optimized out (stored in a register) by modern JIT compilers (at least on every recent benchmark I have seen).

Having said that, I agree that exampleObject.getSources() should be precomputed. The compiler will not know what to do with custom getters.

@javagl
Copy link
Contributor

javagl commented Sep 6, 2024

@jjhembd I agree that not storing the array.length can increase readability (and avoid certain bugs). The performance was once said to be the reason for pulling it out.

Reliable benchmarks can be difficult, and the JIT is just one very large black box. E.g. does it only store that length in a register when it is actually an array, or also when length is a custom property? Or when it is a 'getter' for a custom property that was determined to not be affected by the loop...? In any case, it's a pity that the Iterable construct does not seem to be JITted. The results in #11697 (comment) have been pretty disappointing in that regard...


EDIT: I already mentioned in another comment that we might consider updating the coding guide to no longer recommend the const n = array.length pattern

@Tim-Quattrochi
Copy link
Author

Tim-Quattrochi commented Sep 26, 2024

@jjhembd I agree that not storing the array.length can increase readability (and avoid certain bugs). The performance was once said to be the reason for pulling it out.

Reliable benchmarks can be difficult, and the JIT is just one very large black box. E.g. does it only store that length in a register when it is actually an array, or also when length is a custom property? Or when it is a 'getter' for a custom property that was determined to not be affected by the loop...? In any case, it's a pity that the Iterable construct does not seem to be JITted. The results in #11697 (comment) have been pretty disappointing in that regard...

EDIT: I already mentioned in another comment that we might consider updating the coding guide to no longer recommend the const n = array.length pattern

@javagl If we were to keep the current pattern, how would you ensure it stays DRY? My thought was to create a utility function or perhaps a method, but I wanted to get your opinion. Repeating the length variable and source for each conditional seems like an anti-pattern to me.

@javagl
Copy link
Contributor

javagl commented Sep 26, 2024

From quickly scrolling over the changes, it seems like the inlined comments have been addressed. I'll do a (probably final) review pass early tomorrow (and just for completeness, try out the "artificial" model that I created to provoke the error).

About storing the length: (I'm not even sure whether that question was addressed at me): I don't have a strong opinion there. The main point was about storing things like the getSources() (in the example). And I mentioned the pattern of storing the length as const n = something.length; mainly because that was used in most of the codebase, and even in snippets in the coding guide. But using the something.length directly in the loop is also fine for me.

@jjhembd Do you think that new/modified code should consistently use the pattern without storing the length?

The strong(est) case that I could make against storing the length is when there are "many" arrays and loops in one method, and one has to distinguish the names (i.e. when it cannot be n, but has to be array0Length.. arrayNLength). Is this what your comment about DRY referred to? I'm not sure how a utility function/method could help here...

It would be great if these considerations could be obsolete: I think that the for (const element of array) should be the idiomatic code. (You actually used this in one place - not sure whether that was intentional). Unfortunately, JavaScript compilers/JITs don't seem to be very smart, and these loops seem to be much slower, even though they could trivially be translated into the index-based ones by the compiler or JIT...

@Tim-Quattrochi
Copy link
Author

Tim-Quattrochi commented Sep 26, 2024

About storing the length: (I'm not even sure whether that question was addressed at me): I don't have a strong opinion there. The main point was about storing things like the getSources() (in the example). And I mentioned the pattern of storing the length as const n = something.length; mainly because that was used in most of the codebase, and even in snippets in the coding guide. But using the something.length directly in the loop is also fine for me.

@javagl Yes it was, sorry I '@' the wrong person. I was just referring to the fact that I have to repeat myself a lot. It was a general best practice question, not necessarily something I am introducing. For example, I have to create a variable to point at the source, create a target, create a variable to hold the length of the source. This can happen a lot in a method that has more conditions. An example is here: packages/engine/Source/Scene/Cesium3DTileBatchTable.js lines 375, 376, then again 382, 387. Would it be good practice to create a function to:

  • create a reference to the source.
  • create a variable for the length of the source.
  • create a target.
  • loop through and push.
  • return the result.

Again, this is just a general question and I am not suggesting Cesium do this.

It would be great if these considerations could be obsolete: I think that the for (const element of array) should be the idiomatic code. (You actually used this in one place

I don't believe I refactored those after changes requested. I can have them changed and pushed to this PR before you do a review, and yes I will absolutely try out the artificial model that you created.

@javagl
Copy link
Contributor

javagl commented Sep 27, 2024

Would it be good practice to create a function to:
...

I misunderstood that first. I thought that you referred to a function that can "always" replace such a for-loop, in the most generic form. But now I see that you only referred to the push.apply case here.

And indeed, such a function could make sense, and it could be pretty straightforward, like

function pushAll(source, target) {
    for (let i=0; i<source.length; i++){
        target.push(source[i]);
    }
}

This would turn a code block like the one that you linked to from this

    const hierarchyPropertyIdsLength = hierarchyPropertyIds.length;
    for (let i = 0; i < hierarchyPropertyIdsLength; ++i) {
      results.push(hierarchyPropertyIds[i]);
    }

into this

pushAll(hierarchyPropertyIds, results);

One could even pre-allocate the space in the target array, to avoid repeated allocations/resize operations.


Now... I'm a bit torn:

  1. On the one hand, I don't want to hold up your PR, or let it become "obsolete" by suggesting to introduce such a function
  2. On the other hand, I'd be strongly in favor of introducing such a function
  3. (On the third hand, not being part of the code CesiumJS squad, I probably cannot make the final decision here)

But even at the risk of making most of the changes here obsolete, I'd vote for introducing such a function, because it could tremendously increase readability and maintanability (and performance), with little effort and bascially zero risk.

I just threw together some ""benchmark"", to compare a few things here. (I'm not a "JavaScript performance testing expert", so this has to be taken with a huge grain of salt...).

The benchmark creates arrays of different sizes, repeatedly puts the contents of these source arrays into target arrays, and prints the timing information. The approaches that are compared here are

  • The push.apply (causing errors for larger arrays)
  • The ...spread operator (also causing errors)
  • An index-based for loop (the approach that is now used in this PR)
  • A foreach loop (just for comparison)
  • An index-based for loop with preallocated target array size, as it could be done easily and transparently(!) in such a function

(Some results are given below)

The code:

function runSingleTest(sourceSize, testFunction) {
  const sourceArray = Array(sourceSize);
  const targetArray = [];
  const startMs = performance.now();
  try {
    testFunction(sourceArray, targetArray);
  } catch (errror) {
    return undefined;
  }
  const endMs = performance.now();
  const durationMs = endMs - startMs;
  return durationMs;
}

function runTest(runs, sourceSize, name, testFunction) {
  let sumMs = 0.0;
  for (r = 0; r < runs; r++) {
    const ms = runSingleTest(sourceSize, testFunction);
    if (ms === undefined) {
      console.log(`sourceSize ${sourceSize}, name ${name}, ERROR`);
      return;
    }
    sumMs += ms;
  }
  const averageMs = sumMs / runs;
  console.log(`sourceSize ${sourceSize}, name ${name}, averageMs ${averageMs}`);
}

function usePushApply(sourceArray, targetArray) {
  targetArray.push.apply(targetArray, sourceArray);
}

function useSpread(sourceArray, targetArray) {
  targetArray.push(...sourceArray);
}

function useForEachLoop(sourceArray, targetArray) {
  for (const element of sourceArray) {
    targetArray.push(element);
  }
}

function useForLoop(sourceArray, targetArray) {
  for (let i = 0; i < sourceArray.length; i++) {
    targetArray.push(sourceArray[i]);
  }
}

function useForLoopPreallocated(sourceArray, targetArray) {
  const t = targetArray.length;
  targetArray.length += sourceArray.length;
  for (let i = 0; i < sourceArray.length; i++) {
    targetArray[t + i] = sourceArray[i];
  }
}

function test() {
  const runs = 100;
  for (let n = 100; n <= 1000000; n *= 10) {
    runTest(runs, n, "pushApply", usePushApply);
    runTest(runs, n, "spread", useSpread);
    runTest(runs, n, "forEach", useForEachLoop);
    runTest(runs, n, "for", useForLoop);
    runTest(runs, n, "forAlloc", useForLoopPreallocated);
  }
}

try {
  test();
} catch (error) {
  console.log(error);
}

Some results (a summary will be given below):


On NodeJS:

sourceSize 100, name pushApply, averageMs 0.0016129999561235308
sourceSize 100, name spread, averageMs 0.006134000001475215
sourceSize 100, name forEach, averageMs 0.018598999972455205
sourceSize 100, name for, averageMs 0.007068000007420778
sourceSize 100, name forAlloc, averageMs 0.0077609999664127825
sourceSize 1000, name pushApply, averageMs 0.005562999998219311
sourceSize 1000, name spread, averageMs 0.003792000007815659
sourceSize 1000, name forEach, averageMs 0.0369120000069961
sourceSize 1000, name for, averageMs 0.02806300004012883
sourceSize 1000, name forAlloc, averageMs 0.0023279999662190677
sourceSize 10000, name pushApply, averageMs 0.019967000023461877
sourceSize 10000, name spread, averageMs 0.019638000009581446
sourceSize 10000, name forEach, averageMs 0.038648000014945866
sourceSize 10000, name for, averageMs 0.038686000015586613
sourceSize 10000, name forAlloc, averageMs 0.012039000024087728
sourceSize 100000, name pushApply, averageMs 1.8059109999565408
sourceSize 100000, name spread, averageMs 1.744411000018008
sourceSize 100000, name forEach, averageMs 1.2372270000027492
sourceSize 100000, name for, averageMs 1.229792999974452
sourceSize 100000, name forAlloc, averageMs 0.39260800000280144
sourceSize 1000000, name pushApply, ERROR
sourceSize 1000000, name spread, ERROR
sourceSize 1000000, name forEach, averageMs 21.47643099999521
sourceSize 1000000, name for, averageMs 21.594332000017165
sourceSize 1000000, name forAlloc, averageMs 4.8390970000252125

On Chrome in jsfiddle

"sourceSize 100, name pushApply, averageMs 0"
"sourceSize 100, name spread, averageMs 0"
"sourceSize 100, name forEach, averageMs 0.006999999997206032"
"sourceSize 100, name for, averageMs 0.003000000002793968"
"sourceSize 100, name forAlloc, averageMs 0.00400000000372529"
"sourceSize 1000, name pushApply, averageMs 0.002999999998137355"
"sourceSize 1000, name spread, averageMs 0.003999999999068677"
"sourceSize 1000, name forEach, averageMs 0.01600000000093132"
"sourceSize 1000, name for, averageMs 0.00400000000372529"
"sourceSize 1000, name forAlloc, averageMs 0.003999999999068677"
"sourceSize 10000, name pushApply, averageMs 0.01899999999906868"
"sourceSize 10000, name spread, averageMs 0.017999999998137353"
"sourceSize 10000, name forEach, averageMs 0.04199999999720603"
"sourceSize 10000, name for, averageMs 0.03300000000279397"
"sourceSize 10000, name forAlloc, averageMs 0.011999999997206032"
"sourceSize 100000, name pushApply, averageMs 1.2229999999841674"
"sourceSize 100000, name spread, averageMs 1.1540000000130386"
"sourceSize 100000, name forEach, averageMs 0.6920000000204891"
"sourceSize 100000, name for, averageMs 0.741999999973923"
"sourceSize 100000, name forAlloc, averageMs 0.1800000000046566"
"sourceSize 1000000, name pushApply, ERROR"
"sourceSize 1000000, name spread, ERROR"
"sourceSize 1000000, name forEach, averageMs 12.13999999998603"
"sourceSize 1000000, name for, averageMs 12.027999999970199"
"sourceSize 1000000, name forAlloc, averageMs 2.443000000016764"

On Firefox in jsfiddle:

"sourceSize 100, name pushApply, averageMs 0.01"
"sourceSize 100, name spread, averageMs 0.01"
"sourceSize 100, name forEach, averageMs 0"
"sourceSize 100, name for, averageMs 0"
"sourceSize 100, name forAlloc, averageMs 0.01"
"sourceSize 1000, name pushApply, averageMs 0.06"
"sourceSize 1000, name spread, averageMs 0.03"
"sourceSize 1000, name forEach, averageMs 0"
"sourceSize 1000, name for, averageMs 0"
"sourceSize 1000, name forAlloc, averageMs 0"
"sourceSize 10000, name pushApply, averageMs 0.36"
"sourceSize 10000, name spread, averageMs 0.33"
"sourceSize 10000, name forEach, averageMs 0.08"
"sourceSize 10000, name for, averageMs 0.05"
"sourceSize 10000, name forAlloc, averageMs 0.04"
"sourceSize 100000, name pushApply, averageMs 3.85"
"sourceSize 100000, name spread, averageMs 3.62"
"sourceSize 100000, name forEach, averageMs 0.86"
"sourceSize 100000, name for, averageMs 0.38"
"sourceSize 100000, name forAlloc, averageMs 0.34"
"sourceSize 1000000, name pushApply, ERROR"
"sourceSize 1000000, name spread, ERROR"
"sourceSize 1000000, name forEach, averageMs 9.28"
"sourceSize 1000000, name for, averageMs 5.04"
"sourceSize 1000000, name forAlloc, averageMs 5.58"

For small arrays, it does not seem to matter in terms of performance.

For arrays of size >~150000, the push.apply and ...spread cause errors, so they are out of the game anyhow. But even in the cases where they still work, they are noticably slower than the alternatives.

For large arrays, the index-based for and foreach seem to have roughly the same performance in NodeJS and Chrome. In FireFox, the foreach one is noticably slower.

The "index-based-for-with-preallocated-size" seems to consistently be the fastest (the 5.58 vs. 5.04 on FireFox is "noise"). In fact, for larger arrays on NodeJS and Chrome, this one seems to be 3-5 times faster than the standard for/push approach.


tl;dr (too late for that, I know 🙂 )

I'd second your suggestion to introduce a function like this

function pushAll(source, target) {
    const t = target.length;
    target.length = t + source.length;
    for (let i=0; i<source.length; i++){
        target[t + i] = source[i];
    }
}

That can be used in all places where the contents of one array has to be added to another (i.e. at least all places that are currently affected by this PR - and probably some more, where this already was done with a manual, inlined for loop)

To be confirmed by the CesiumJS core squad.

@Tim-Quattrochi
Copy link
Author

Tim-Quattrochi commented Sep 27, 2024

  1. On the one hand, I don't want to hold up your PR

I could have emailed you about it, I was genuinely curious about your opinion on that. Not only for this project, but for my own development structure. I appreciate your thorough responses.

@ggetz
Copy link
Contributor

ggetz commented Sep 27, 2024

As long as there's not an existing method which does the equivalent (it sounds like we want the concat method, but without creating a new array each time), I see no harm in adding a helper function to the general API.

@javagl
Copy link
Contributor

javagl commented Sep 27, 2024

I always feel like a complete noob when I have to do searches that lead to things like https://stackoverflow.com/questions/4156101/copy-array-items-into-another-array . But ... inside that usual stack overflow noise, there are some hints that seem to confirm that push.apply/...spread are simple but problematic, concat is simple and fast (but not what we want here), and a custom utility function that preallocates the output seems to be the most versatile for what we want to do here.

So unless someone disagrees, there will probably be some function for this. This would reside on the same level as defined and defaultValue. And... similar to the latter, we can hope that it can one day be replaced by one efficient, idiomatic, built-in functionality.

We could now talk about details.... consider this as brainstorming:

  • It could be called pushAll, but... depending on the implementation, it's not "pushing", so I'd call it addAll
  • It could transparently handle the case of source===undefined by doing nothing (?)
  • ...

@jjspace
Copy link
Contributor

jjspace commented Sep 27, 2024

I've only been half following this thread so if I missed the answer already then ignore me.

concat is simple and fast (but not what we want here)

Is there a reason we can't replace push.apply with concat and just reassign variables? I wouldn't expect anything this was previously used on to be anything but a pure Array so I wouldn't expect references to be broken or anything.

arr1.push.apply(arr1, arr2);
// could just be
arr1 = arr1.concat(arr2);

This probably wouldn't work in a helper function but I don't think it would even need the helper function if we went this route. With a very naive test this is faster than even the pre-allocated solution above, it just requires a slight rework to how we're using these variables and maybe converting some from const to let

sourceSize 100, name forAlloc, averageMs 0.006000000238418579
sourceSize 100, name concat, averageMs 0.0029999995231628418
sourceSize 1000, name forAlloc, averageMs 0.006000000834465027
sourceSize 1000, name concat, averageMs 0.0010000002384185792
sourceSize 10000, name forAlloc, averageMs 0.02000000059604645
sourceSize 10000, name concat, averageMs 0.005
sourceSize 100000, name forAlloc, averageMs 0.31600000500679015
sourceSize 100000, name concat, averageMs 0.1809999990463257
sourceSize 1000000, name forAlloc, averageMs 4.498000001907348
sourceSize 1000000, name concat, averageMs 3.0629999989271166

@javagl
Copy link
Contributor

javagl commented Sep 27, 2024

@jjspace I'm kind of surprised that concat is faster than the other approaches. It's difficult to find "the truth" here (and quick websearches bring up things like https://dev.to/uilicious/javascript-array-push-is-945x-faster-than-array-concat-1oki 🥴 ). But I tested it on NodeJS, Firefox, and Chrome, and it seems to be consistent. (Not much in many cases, but consistent).

Somehow related to the concat-approach: When I listed the "details" bullet points in the last comment, I actually considered this:

  • Should addAll return the target array?

But I didn't propose this, because it would raise several tricky follow-up questions.


First of all: There certainly are several places where concat would be the easiest, cleanest, and (apparently) most efficient solution. And it could be worthwhile to examine the places that currently use push.apply in that regard. Many of them can probably be changed easily.

But as a general recommendation, the concern that I have with concat is simple: We don't have the slightest clue about "ownership"!

Imagine that there is some code like this (dummy/pseudocode):

function processSomething(frameState) {
    const targetArray = frameState.targetArray;
    
    addAll(createSourcesA(), targetArray);
    addAll(createSourcesB(), targetArray);
    addAll(createSourcesC(), targetArray);
}

(where addAll is the function that is about to be introduced).

So far, so good. It adds a bunch of sources to the targetArray that is stored in the FrameState.

Now, concat has a drastically different behavior. It returns a new array. So this will have to be...

function processSomething(frameState) {
    const targetArray = frameState.targetArray;
    
    let newTargetArray = targetArray;
    newTargetArray = newTargetArray.concat(createSourcesA());
    newTargetArray = newTargetArray.concat(createSourcesB());
    newTargetArray = newTargetArray.concat(createSourcesC());

    // THIS IS IMPORTANT:
    frameState.targetArray = newTargetArray:
}

We can talk about details of the inner part. And great care is necessary to not accidentally omit that last line.

But even then: You simply never know whether at some point in time, some part of the code, for some reason did
this.stuffFromFrameState = frameState.targetArray;

And this array will then never be updated.
(It is actually possible to write code in a way where this is not a concern. But that code is usually not JavaScript ;-) ).


So my two cents:

  • Yes, we can consider to simply use concat, but have to be very careful that this doesn't break anything
  • For some existing patterns, I think that such an addAll function would still be highly beneficial

@Tim-Quattrochi Sorry, this discussion may have detailed a bit 😁 What started as an "uncontroversial, pattern-based boilerplate code change" now touches some of the guts of the language and library. But I think that the potential improvement is worth these considerations.

@ggetz
Copy link
Contributor

ggetz commented Sep 27, 2024

My biggest concern about always using concat is that is always creates a new array instead of mutating the first one in place. This can cause memory pressure and performance issues, so we should be intentional about when we create new objects or not.

@javagl
Copy link
Contributor

javagl commented Oct 1, 2024

It's not entirely clear how to proceed here. This is a PR that affects "many" files, and there has been a linting update in the meanime, which affects ... "all files", basically - meaning that there are a bunch of conflicts. There are instructions for updating such a diverged branch. But I think that it might be easier to...

  • Create that addAll function (similar to defined.js)
  • Start from the current main branch state (which already has all the linting changes)
  • Mechanically change the target.push.apply(target, source); calls to addAll(target, source);
    • (This will require less thought than the previous change: All the subtle questions about "storing the length" or "not repeatedly calling getSources() do not apply here - and that's exactly why I think that this would be a good thing to do!)
  • Create a new PR for that

But that's only a gut feeling right now. @Tim-Quattrochi if you instead want to update the branch with the linter changes and extend this PR, that should also be possible.

@Tim-Quattrochi
Copy link
Author

if you instead want to update the branch with the linter changes and extend this PR

Can you elaborate on what you mean by linter changes?

@javagl
Copy link
Contributor

javagl commented Oct 8, 2024

CesiumJS uses prettier for automated code formatting. Before committing something, running npm run prettier cleans up the code formatting, removes unnecessary spaces, fixes the indentation etc. But recently, the version of 'prettier' that is used in CesiumJS has been updated (via #12206). One side effect of this change is that "all files are modified". This can lead to conflicts (as indicated by the list of "Conflicting files" currently shown below this PR).

Now, there are two options:

  1. You could go through the process described in Update prettier to v3 #12206 to update this branch and resolve the conflicts
  2. As described above: You could address the issue on a new branch (branched off from the current main), and create a new PR for that

Option 1. requires a bit of twiddling with git, and may still require to resolve some conflicts manually. So I think that option 2. could be easier and make more sense: The suggestion to introduce a utility function for the specific functionality here (namely, "add all") is a good one, and allows for a much cleaner way of addressing the original issue.

@Tim-Quattrochi Tim-Quattrochi marked this pull request as draft October 18, 2024 15:42
@ggetz
Copy link
Contributor

ggetz commented Oct 29, 2024

Hi @Tim-Quattrochi and @javagl, what is remaining in this PR? Is it only to merge in main and account for prettier?

@javagl
Copy link
Contributor

javagl commented Oct 29, 2024

@ggetz The last thing here was that we considered adding an addAll function that provides an efficient and safe way of doing what was formerly done with push.apply, with the two options:

  1. doing this in this PR (changing all the existing changes)
  2. creating a fresh PR that dedicatedly introduces this addAll, and updates all appearances of push.apply mechanically to use that function

For 1, dealing with prettier and conflicts can be a pain in the back, so I think that 2. should be easier. The decision is left to @Tim-Quattrochi for now (but if Tim does not have the bandwidth to do either, then I'd put 2. on my TODO list - I think that the suggestion by Tim (to introduce such a function to foster DRY) is a good one, and shouldn't be sooo much effort),

@Tim-Quattrochi
Copy link
Author

Hi @Tim-Quattrochi and @javagl, what is remaining in this PR? Is it only to merge in main and account for prettier?

Hi sorry, I need to lint and possibly a helper. I put it on draft because I was dealing with hurricane Milton.

@ggetz
Copy link
Contributor

ggetz commented Oct 29, 2024

Hi sorry, I need to lint and possibly a helper. I put it on draft because I was dealing with hurricane Milton.

Understandable! Hope you are safe!

@javagl If you do have the bandwidth to help out here, that would be appreciated.

@javagl
Copy link
Contributor

javagl commented Oct 30, 2024

@Tim-Quattrochi We can set up a call whenever it suits you.

(EDIT: Unless we can sort out the questions via comments here, that's also fine...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants