Refactor demos to use .sem_add_columns or .add_columns instead of convert(), remove Schema from demos when possible. #104

Open
wants to merge 7 commits into
base: dev

chjuncn
Collaborator

@chjuncn chjuncn commented Feb 3, 2025

This change builds on #101 and #102; please review those first, then this one.

  1. Refactor all demos to use .sem_add_columns or .add_columns in place of .convert().

  2. Remove Schema from the demos, except for demos that use ValidationDataSource or dataset.retrieve(), which still need a schema for now. We can refactor those cases later.

vitaglianog and others added 5 commits January 30, 2025 18:41
* Create chat.rst

* Update pyproject.toml

Hotfix for chat

* Update conf.py

Hotfix for chat.rst
This implementation largely resolves #84.

One part of the implementation differs from #84:
.add_columns(
  cols=[
    {"name": "sender", "type": "string", "udf": compute_sender},
    ...
  ]
)

If add_columns() took cols, udf, and types as parameters, the function would become confusing again. Instead, users who need different UDFs for different columns should call add_columns() multiple times, once for each set of columns.
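
The "call add_columns() multiple times" pattern can be sketched with a toy stand-in. This is only an illustration: `ToyDataset`, its `add_columns(udf, cols)` signature, `compute_subject`, and the sample record are all hypothetical and are not taken from the project's code. Only the `compute_sender` name and the `{"name": ..., "type": ...}` column dicts come from the snippet above.

```python
from typing import Any, Callable

Record = dict[str, Any]


class ToyDataset:
    """Hypothetical stand-in for a dataset; NOT the project's real class."""

    def __init__(self, records: list[Record]):
        self.records = [dict(r) for r in records]

    def add_columns(self, udf: Callable[[Record], Record], cols: list[dict]) -> "ToyDataset":
        """Apply a single UDF and keep only the columns declared in `cols`."""
        names = [c["name"] for c in cols]
        new_records = []
        for rec in self.records:
            computed = udf(rec)
            new_records.append({**rec, **{n: computed[n] for n in names}})
        return ToyDataset(new_records)


def compute_sender(rec: Record) -> Record:
    # Assumed input format for this sketch: "<sender> | <subject>"
    return {"sender": rec["raw"].split("|")[0].strip()}


def compute_subject(rec: Record) -> Record:
    return {"subject": rec["raw"].split("|")[1].strip()}


emails = ToyDataset([{"raw": "alice@example.com | Quarterly report"}])

# One add_columns() call per UDF, instead of a per-column "udf" key.
result = (
    emails
    .add_columns(udf=compute_sender, cols=[{"name": "sender", "type": "string"}])
    .add_columns(udf=compute_subject, cols=[{"name": "subject", "type": "string"}])
)
```

Under these assumptions each call has exactly one UDF, so it is always clear which function produced which columns.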
…al values,

Use field_values instead of field_types: field_values holds the actual key-value pairs, while field_types only contains the fields and their types.

records[0].schema is the schema of the output; it does not mean the schema's fields have already been populated in the record.
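
A minimal sketch of the distinction described in the two notes above, assuming a record that carries a `schema`, a `field_types` mapping, and a `field_values` mapping. `ToyRecord` and both helper functions are hypothetical; only the three attribute names are borrowed from the commit messages, and the real record class may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRecord:
    """Hypothetical record; NOT the project's real record class."""
    schema: dict                                       # fields the output is expected to have
    field_types: dict = field(default_factory=dict)    # field name -> declared type
    field_values: dict = field(default_factory=dict)   # field name -> actual value


rec = ToyRecord(
    schema={"sender": str, "subject": str},
    field_types={"sender": str, "subject": str},
    field_values={"sender": "alice@example.com"},  # "subject" not populated yet
)


def populated_fields(record: ToyRecord) -> dict:
    # Read actual data from field_values; field_types carries no values.
    return dict(record.field_values)


def missing_fields(record: ToyRecord) -> list:
    # A field listed in the schema is not necessarily populated in the record.
    return [name for name in record.schema if name not in record.field_values]
```

In this toy setup, a field can appear in `schema` and `field_types` while still being absent from `field_values`, which is why code that needs the data should read `field_values`.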
@chjuncn chjuncn requested a review from mdr223 February 3, 2025 08:07
@chjuncn chjuncn changed the title from "Chjun 0202" to "Refactor demos to use .sem_add_columns or .add_columns instead of convert(), remove Schema from demos when possible." Feb 3, 2025
@chjuncn chjuncn linked an issue Feb 3, 2025 that may be closed by this pull request
chjuncn and others added 2 commits February 3, 2025 20:30
@mdr223 mdr223 changed the base branch from main to dev February 3, 2025 17:57
@mdr223 mdr223 linked an issue Feb 3, 2025 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Eliminate Need for User-Facing Schema Update Syntax to Reflect New Design Goals
3 participants