Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/fix: revised canonical forms; tests; preliminary & associated fixes #238

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

samueltlg
Copy link

Still more work & tests incoming - which is done but needs verification & review - for CanonicalForms Multiply, Divide & Number (tests). Power form most notably up for now: this being the trickiest on

If you review this set of changes, will make any requested changes along with the next/remaining batch of work: this should be next Wed./Thu. evening.

The individual commit messages are good sources of info. WRT changes.

Have a batch of incoming code-comments to make (will do tomorrow) which will likely clarify, and sidestep confusion, and highlight key-points/requests.

Some outstanding queries:

  • WRT symbols being involved in canonicalization (notably, 'Power'):
    • Only recently came to mind that the holdUntil symbol attribute is not accounted for during the check on operand values (which could potentially be symbols): should this be the case?
      • Yet furthermore, if this is to be checked, i.e. that the 'holdUntil' value of a symbol-definition is never, then would it not be the case that symbols are substituted with values, before canonicalization, either partial or full, takes place? That being said, it would be the case that accounting for symbol values during application of canonical-forms, is unnecessary, since they would be substituted beforehand anyway (and any existing symbols would therefore have a 'holdUntil' value of 'evaluate', 'N', etc...)
    • With the above in mind, in the case of Power-form, for example, should operations such as 1^x, where x is declared as a constant of value 1, ever take place... ?

samueltlg added 14 commits April 7, 2025 01:11
- Fixes the return value of 'sgn' for BoxedNumbers with a value of
  'NaN': as a corollary fixes & now correctly computes return-value of
  getter `BoxedSymbol.isNaN`, too.

- Fix: for consistency, 'isNaN' & 'isInfinity' now trigger definition
  binding (canonicalization), in similar manner to 'isOdd', 'isEven',
  'isInteger'...  (This is appropriate because these may be considered
  direct 'value inquiry' getters, similarly to these sibling properties)

- Adds missing getter 'isFinite': computing the result using 'sgn' like
  its sign-checking siblings 'isPositive', 'isNegative'...

- Fix: several properties/getters now return a boolean in cases where
  'undefined' should be returned (because, the symbol is unbound and
  therefore not enough information is known). ^Notably, this is
  restricted to the aforementioned - 'sgn' referencing - properties
  'is(NaN/Infinity)' et cetera.

- Fix: revise getter 'isInfinity' to now account for 'complex-infinity'

Also includes various extra doc., including some corrections
Before:
 - ce.parse('\\sqrt{x}').root(3) -> '\\sqrt[5]{x}'
Now
 - '\\sqrt[6]{x}
(Also, re-declares getter 'value' for BoxedNumbers, for purposes of
refining its return-type)
'numberForm' now longer simply requests the 'canonical' version/property
of boxed-numbers: these notably now being obsolete on account of
BoxedNumber instances always now being being required to be constructed
canonically.

Instead, the 'structural' number-cast operations present in
'boxFunction' - such as casting 'Rational' as a number where appropriate
- are co-present in this function: albeit applicable more
narrowly to 'BoxedExpression' (only).

Prior to this change, for a while now, 'numberForm' has therefore been
redundant.

Notably also, 'boxFunction' now applies these operations during full
canonicalization only, and not when just the 'Number' canonical-form is
requested.
- Fixes sign comp. functions 'is(Non)Positive/Negative' to account for
  wider range of 'Sign' (type) values: and to also *consistently return
  'undefined'* when comparing against complex-number signs.
  Consequently, corresponding BoxedExpr methods now correctly compute
  the sign for a wider range of values: particularly complex numbers
  (including ComplexInfinity), and in some cases zero.

  ^This fix may also be consider a *breaking* change, in that
  'isPositive/Negative' et cet. will return 'undefined' for boxed
  complex-numbers (i.e. indiciating inapplicability): whereas before
  these would return 'false'.

- Fixes, via broadening range of returned values, 'sgn' for
  BoxedNumbers: this now accounting for all Infinity sign values.
  Naturally, this also affects computation of 'sgn' for symbols with
  values.
  An '...-infinity' sign value is now correctly returned for +/-/~
  Infinity for both BoxedNumbers, and symbols with these values.
… FN's now -> undefined; rectify BoxedFunction.isPure

- WRT repeat evaluation of impure FN's (such as 'Random'):
 -Before: `expr.evaluate/N() -> Result1, Result1, Result1...` (i.e.
 essentially behaving as idempotent).
 -After:  `  ...             -> Result1, Result2, Result3...` (new
 results unrelated to previous)

-`get value` (i.e. 'N().valueOf()') for boxed impure functions now
returns 'undefined'.
 This contrasts from before in both senses of now returning 'undefined'
 instead of computing a value, and also by returning the same,
 first-call cached value on reach request.

- 'isPure' (BoxedFunction) now returns a non-undefined/boolean result
  for _non-canonical_/unbound functions.
Fundamental fixes:
- 'powerForm' now calls 'canonicalPower', instead of 'a.pow(b)'
- 'canonicalPower'
 - No longer casts passed args./exprs. as fully canonical
 - Generally, no longer performs operations wherein an operand, esp. the
   exponent, is a function expression: e.g. '1^(2+3)' will no longer
   simplify (but '1^5' still will do so).
    -^There are reasonable exceptions, e.g. 'a^1' will always simplify
    -^As will 'a^{-1}'
 - Values, including those temporarily assigned from assumptions, are no
   longer looked to from symbols unless these are *non-constant*.
   (Before, inadvertently, this could be the case)
 - More consistently, checks for _constant_ symbol values for
   operations, such as whether 'b' is '1' in 'a^b'
    -**warning**: this may be *incorrect* behaviour: since the
    'holdUntil' attribute is not looked to (and this may be desirable).
    Furthermore, if this is an oversight, then it may be the case that
    symbols with a 'never' holdUntil value will be substituted with
    their values before canonicalization/canonical-forms: therefore
    accounting for symbol values here, or within any other
    canonical-form, may be unnecessary & inappropriate.
 - One or two operation fixes, here & there (forgotten now)
 - More careful type-checking throughout: i.e. particularly
   discriminating between 'real' & 'non-finite' numbers; in some cases
   'complex', too...

Changes:
- Should be, significantly more optimized
- !Currently, does not include the collapsing of multi-level exponents, e.g.
  '{a^b}^{c} -> a^{b*c}'
- Heavier with inversion
- Several instances of accounting for wider ranges of values (mainly for
  exponents): particularly for 'a^Infinity'.
-

Notes:
- Now, canonicalPower only returns a new boxed-function with 'canonical:
  true' when given args. are canonical.
- !Function 'pow' - the first half more or less - is a replication of
  what now takes place within canonicalPower: but is less thorough, and
  may contain a few inaccuracies. Likely, would benefit from replacing
  its first half with a call to 'canonicalPower'; which it anyway
  replicates.
This reverts commit a8f85fa.

This property/getter is largely unnecessary: because 'value' largely is  'constantValue' (the exception is for symbols: in which for non-constants its value may vary arbitrarily, or with assumptions).

Note:
- This *could* be introduced, perhaps exclusively for symbols; or its overall definition could be adjusted slightly such that it is useful for boxed-functions too.
I.e., for a boxed-function, only return a value if 'isConstant' && 'isPure'. ('value' differs in that returns non-undefined if 'isPure', *only*...)
Copy link
Member

@arnog arnog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great PR. Looks good overall. See comments, but the thing that bothers me the most is that some of the sign properties are for complex values are undefined, while they were false before. I'm not sure if that was an intentional change or a side effect.

The other thing, which is more of a heads up, is that the definition of Sign is a bit messy right now, as it includes overlapping concepts, such as the type, whether something is finite or not, etc... In general, I think it would be clearner and simpler if there were fewer "kinds" of sign. There is a pending PR that has those changes, I believe, and I'll have to check on its status and if it could be merged.

* Effective only for overall BoxedExpression types which are *non-constant* and therefore for
* which its value, and thereby type, can potentially vary.
*
* For symbols, inference may take place only for undeclared, or previously inferred symbols. For
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will ensure to amend that.

* definition if this is an _undeclared_ boxed-symbol), and for functions, narrows the *(return)
* type*. The return result will for this case be `true`.
*
* (Note that subsequent inferences can be made and will override previous ones if valid)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; will update

@samueltlg
Copy link
Author

This is a great PR. Looks good overall. See comments, but the thing that bothers me the most is that some of the sign properties are for complex values are undefined, while they were false before. I'm not sure if that was an intentional change or a side effect.

The other thing, which is more of a heads up, is that the definition of Sign is a bit messy right now, as it includes overlapping concepts, such as the type, whether something is finite or not, etc... In general, I think it would be clearner and simpler if there were fewer "kinds" of sign. There is a pending PR that has those changes, I believe, and I'll have to check on its status and if it could be merged.

Thanks . Have responded to initial comments; noticed that I didn't publish a review yesterday, meaning that they are bundled with a series of unpublished comments from then, too. Will also now add a small quantity of comments left to be added from yesterday.
Regardless of any upcoming amendments to the set of sgn changes, in general keep these changes as an interim fix? (fixes several, if you see #238 8502b34)

Think that, after resolving any current changes, & also pushing up the test-cases for Number, will add the changes for remaining forms such as 'Multiply','Divide', in a separate request; stop this one becoming bloated.

* @param a
* @param b
* @returns
*/
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will enumerate ops. in simple form here, & rewrite slightly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This being the biggest commit: an overview of changes present in message (b46904d)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More consistently, checks for constant symbol values for operations, such as whether 'b' is '1' in 'a^b'

The value of symbols should not be considered during canonicalization, even constant ones. That's because their value (and even the fact they are constant) depends on binding. Ideally, canonicalization would be independent of binding, particularly of binding of symbols. Note that currently that is not the case, but that is a fundamental error that I've only fairly recently realized. Eventually, the canonicalization should happen by applying a series of rules to expressions, much like simplification rules, and these should not depend on binding having occurred.

-warning: this may be incorrect behaviour: since the 'holdUntil' attribute is not looked to (and this may be desirable). Furthermore, if this is an oversight, then it may be the case that symbols with a 'never' holdUntil value will be substituted with their values before canonicalization/canonical-forms: therefore accounting for symbol values here, or within any other canonical-form, may be unnecessary & inappropriate.

As per discussion previously, I don't think you need to look at the holdUntil attribute. Symbols with a holdUntil = "never" will get substituted during their canonicalization and that is correct.

!Function 'pow' - the first half more or less - is a replication of what now takes place within canonicalPower: but is less thorough, and may contain a few inaccuracies. Likely, would benefit from replacing its first half with a call to 'canonicalPower'; which it anyway replicates.

yes, that sounds reasonable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of symbols should not be considered during canonicalization, even constant ones

Cleared-up now.
It was something that didn't come to mind until recently, namely that due to the 'holdUntil' attribute, symbols would be substituted with values prior to/at canonicalization (->& therefore no need to check).

Out of curiosity though, did think that it was OK to check whether a symbol is a constant (isConstant) without it triggering binding? If I recall: think its associated documentation states this to be the case...?

@arnog
Copy link
Member

arnog commented Apr 9, 2025

Regarding holdUntil...

Only recently came to mind that the holdUntil symbol attribute is not accounted for during the check on operand values (which could potentially be symbols): should this be the case?

If holdUntil is never, the value of the symbol should be substituted during canonicalization.

See documentation for holdUntil:

  /**
   * If the symbol has a value, it is held as indicated in the table below.
   * A green checkmark indicate that the symbol is substituted.

<div className="symbols-table">

| Operation     | `"never"` | `"evaluate"` | `"N"` |
| :---          | :-----:   | :----:      | :---:  |
| `canonical()` |    (X)    |              |       |
| `evaluate()`  |    (X)    |     (X)      |       |
| `"N()"`       |    (X)    |     (X)      |  (X)  |

</div>

  * Some examples:
  * - `ImaginaryUnit` has `holdUntil: 'never'`: it is substituted during canonicalization
  * - `x` has `holdUntil: 'evaluate'` (variables)
  * - `Pi` has `holdUntil: 'N'` (special numeric constant)
  * 
  * **Default:** `evaluate`
  */
  holdUntil: 'never' | 'evaluate' | 'N';

Yet furthermore, if this is to be checked, i.e. that the 'holdUntil' value of a symbol-definition is never, then would it not be the case that symbols are substituted with values, before canonicalization, either partial or full, takes place?

It should be accounted for during canonicalization, as part of the canonicalization of the symbol.

That being said, it would be the case that accounting for symbol values during application of canonical-forms, is unnecessary, since they would be substituted beforehand anyway (and any existing symbols would therefore have a 'holdUntil' value of 'evaluate', 'N', etc...)

Correct, no further check should be necessary during the canonicalization of arguments, since symbols should have be substituted by then if indicated.

With the above in mind, in the case of Power-form, for example, should operations such as 1^x, where x is declared as a constant of value 1, ever take place... ?

The fact that x is a constant is not the same as x is a symbol with holdUntil = "never". A constant should not be substituted during canonicalization. For example, Pi is a constant: you don't want to substitute its value during canonicalization.

@samueltlg
Copy link
Author

Regarding holdUntil...

Only recently came to mind that the holdUntil symbol attribute is not accounted for during the check on operand values (which could potentially be symbols): should this be the case?

If holdUntil is never, the value of the symbol should be substituted during canonicalization.

Just thought... does this substitution take place for partial canonicalization/CanonicalForms?

@samueltlg
Copy link
Author

samueltlg commented Apr 10, 2025

Have a closing list of changes to make now; along with the final Number-form tests to be added up.
Final/remaining resolution points as just commented:

  • BoxedExpr.value
  • Constant-symbol, inferredType 'bug'
  • ["Exp", -1] is now ["Divide", 1, "ExponentialE"] is acceptable?

After clearing those up & making final changes + tests, should be good to go.

@arnog
Copy link
Member

arnog commented Apr 10, 2025

BoxedExpr.value
Let's have BoxedExpr.value return undefined if the expression is not a literal or a symbol with a value.

Constant-symbol, inferredType 'bug'
I've checked in a fix.

["Exp", -1] is now ["Divide", 1, "ExponentialE"] is acceptable?
Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants