Implement UTF-8 strings #43

Open
alexdovzhanyn opened this issue Oct 2, 2024 · 0 comments
Labels: CodeGen (Impacts code generation), enhancement (New feature or request)

Comments

@alexdovzhanyn
Collaborator

We currently rely on the WebAssembly stringref proposal as the basis for strings in ThetaLang. However, community support for the proposal is limited: although V8 and browsers provide support for it, it is unlikely to progress in its current state.

Proposed Solution:
To ensure robust string support in ThetaLang, we will need to implement strings ourselves as contiguous byte allocations in memory. Rather than writing our own Unicode handling, which is complex and must track each Unicode release (for example, changes to grapheme cluster rules), we can leverage an existing library such as ICU, which V8 also uses internally.
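As a rough illustration of what "contiguous byte allocations" could look like, here is a minimal sketch of a length-prefixed UTF-8 string stored in a flat byte buffer. Everything in it is an assumption made for illustration, not ThetaLang's actual design: the `LinearMemory` type, the `allocString` and `readString` names, the 4-byte length header, and the bump-style allocation (a real implementation would presumably allocate through the garbage collector).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical layout: [u32 byte length][UTF-8 bytes...], stored contiguously.
// `LinearMemory` stands in for Wasm linear memory in this sketch.
struct LinearMemory {
    std::vector<std::uint8_t> bytes;

    // Appends a length-prefixed copy of `utf8` and returns the offset of its header.
    std::uint32_t allocString(std::string_view utf8) {
        const auto offset = static_cast<std::uint32_t>(bytes.size());
        const auto length = static_cast<std::uint32_t>(utf8.size());
        bytes.resize(bytes.size() + sizeof(length) + utf8.size());
        std::memcpy(bytes.data() + offset, &length, sizeof(length));
        if (!utf8.empty()) {
            std::memcpy(bytes.data() + offset + sizeof(length), utf8.data(), utf8.size());
        }
        return offset;
    }

    // Returns a view of the payload stored at `offset` without copying it.
    // The view is invalidated by any later call to allocString.
    std::string_view readString(std::uint32_t offset) const {
        assert(offset + sizeof(std::uint32_t) <= bytes.size());
        std::uint32_t length = 0;
        std::memcpy(&length, bytes.data() + offset, sizeof(length));
        assert(offset + sizeof(length) + length <= bytes.size());
        const char* data =
            reinterpret_cast<const char*>(bytes.data() + offset + sizeof(length));
        return std::string_view(data, length);
    }
};
```

Storing the byte length up front keeps length queries constant-time and lets strings contain embedded NUL bytes. Note that the stored length counts bytes, not code points or user-perceived characters.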

Required Steps:

  • Accessing V8's ICU:
    Investigate whether we can easily access V8’s internal ICU and use it for string operations. If so, this will save us from shipping our own copy of ICU.

  • Manual ICU Integration (If Necessary):
    If accessing V8's internal ICU is not feasible, we will include ICU in our build manually to support UTF-8 encoding and decoding.

  • String Implementation:
    Once ICU is accessible, we will use its UTF-8 encoding capabilities to implement our own string handling in ThetaLang, allowing for robust manipulation of Unicode data.
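To make the motivation for a Unicode library concrete, the sketch below counts code points in a UTF-8 byte sequence using only the standard library. It is not ICU and not a proposed implementation: `countCodePoints` is a hypothetical helper, it assumes the input is already well-formed UTF-8, and it performs no validation. It only shows that byte length and code point count diverge, and that neither matches user-perceived characters, which is the kind of work ICU (for example, its `BreakIterator` for grapheme boundaries) would take over.

```cpp
#include <cstddef>
#include <string_view>

// Counts Unicode code points in well-formed UTF-8 by counting every byte
// that is not a continuation byte (continuation bytes look like 10xxxxxx).
// No validation is performed; malformed input yields a meaningless count.
std::size_t countCodePoints(std::string_view utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For instance, the sequence `"e\xCC\x81"` (the letter e followed by U+0301, a combining acute accent) is 3 bytes and 2 code points, yet renders as a single user-perceived character. Handling that last level correctly requires Unicode segmentation data, which is why delegating to a maintained library is attractive.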

Dependencies:
This issue depends on the completion of ThetaLang #42, which focuses on building the Theta garbage collector to support memory management for these custom strings.

@alexdovzhanyn added the enhancement and CodeGen labels on Oct 2, 2024
@alexdovzhanyn self-assigned this on Oct 2, 2024