doc update for tag python-v0.25.1
deltars committed Feb 21, 2025
1 parent ef56538 commit d3c65b1
Showing 8 changed files with 184 additions and 63 deletions.
61 changes: 51 additions & 10 deletions api/delta_table/delta_table_alterer/index.html
Expand Up @@ -1286,6 +1286,13 @@
drop_constraint
</a>

</li>

<li class="md-nav__item">
<a href="#deltalake.table.TableAlterer.set_column_metadata" class="md-nav__link">
set_column_metadata
</a>

</li>

<li class="md-nav__item">
Expand Down Expand Up @@ -1988,6 +1995,13 @@
drop_constraint
</a>

</li>

<li class="md-nav__item">
<a href="#deltalake.table.TableAlterer.set_column_metadata" class="md-nav__link">
set_column_metadata
</a>

</li>

<li class="md-nav__item">
Expand Down Expand Up @@ -2130,16 +2144,16 @@ <h4 id="deltalake.table.TableAlterer.add_columns" class="doc doc-heading">

<details class="example" open>
<summary>Example</summary>
<p>from deltalake.schema import Field, PrimitiveType, StructType
dt = DeltaTable("test_table")
new_fields = [
Field("baz", StructType([Field("bar", PrimitiveType("integer"))])),
Field("bar", PrimitiveType("integer"))
]
dt.alter.add_columns(
new_fields
)
```</p>
<div class="language-python highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">deltalake.schema</span><span class="w"> </span><span class="kn">import</span> <span class="n">Field</span><span class="p">,</span> <span class="n">PrimitiveType</span><span class="p">,</span> <span class="n">StructType</span>
</span><span id="__span-0-2"><a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="n">dt</span> <span class="o">=</span> <span class="n">DeltaTable</span><span class="p">(</span><span class="s2">&quot;test_table&quot;</span><span class="p">)</span>
</span><span id="__span-0-3"><a id="__codelineno-0-3" name="__codelineno-0-3" href="#__codelineno-0-3"></a><span class="n">new_fields</span> <span class="o">=</span> <span class="p">[</span>
</span><span id="__span-0-4"><a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a> <span class="n">Field</span><span class="p">(</span><span class="s2">&quot;baz&quot;</span><span class="p">,</span> <span class="n">StructType</span><span class="p">([</span><span class="n">Field</span><span class="p">(</span><span class="s2">&quot;bar&quot;</span><span class="p">,</span> <span class="n">PrimitiveType</span><span class="p">(</span><span class="s2">&quot;integer&quot;</span><span class="p">))])),</span>
</span><span id="__span-0-5"><a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a> <span class="n">Field</span><span class="p">(</span><span class="s2">&quot;bar&quot;</span><span class="p">,</span> <span class="n">PrimitiveType</span><span class="p">(</span><span class="s2">&quot;integer&quot;</span><span class="p">))</span>
</span><span id="__span-0-6"><a id="__codelineno-0-6" name="__codelineno-0-6" href="#__codelineno-0-6"></a><span class="p">]</span>
</span><span id="__span-0-7"><a id="__codelineno-0-7" name="__codelineno-0-7" href="#__codelineno-0-7"></a><span class="n">dt</span><span class="o">.</span><span class="n">alter</span><span class="o">.</span><span class="n">add_columns</span><span class="p">(</span>
</span><span id="__span-0-8"><a id="__codelineno-0-8" name="__codelineno-0-8" href="#__codelineno-0-8"></a> <span class="n">new_fields</span>
</span><span id="__span-0-9"><a id="__codelineno-0-9" name="__codelineno-0-9" href="#__codelineno-0-9"></a><span class="p">)</span>
</span></code></pre></div>
</details>
</div>

Expand Down Expand Up @@ -2482,6 +2496,33 @@ <h4 id="deltalake.table.TableAlterer.drop_constraint" class="doc doc-heading">



<h4 id="deltalake.table.TableAlterer.set_column_metadata" class="doc doc-heading">
<span class="doc doc-object-name doc-function-name">set_column_metadata</span>


</h4>
<div class="doc-signature highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="nf">set_column_metadata</span><span class="p">(</span><span class="n">column</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">metadata</span><span class="p">:</span> <span class="nb">dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">],</span> <span class="n">commit_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">CommitProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">post_commithook_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">PostCommitHookProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">None</span>
</span></code></pre></div>

<div class="doc doc-contents ">

<p>Update a field's metadata in a schema. If the metadata key does not exist, the entry is inserted.</p>
<p>If the column name doesn't exist in the schema, an error is raised.</p>
<p>:param column: name of the column to update metadata for.
:param metadata: the metadata to be added or modified on the column.
:param commit_properties: properties of the transaction commit. If None, default values are used.
:param post_commithook_properties: properties for the post commit hook. If None, default values are used.
:return:</p>

</div>

</div>
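<p>The upsert rule described for <code>set_column_metadata</code> can be sketched with a toy model. This is only an illustration of the documented semantics, not the library's implementation: a plain dict stands in for the table schema, the helper below is hypothetical, and the use of <code>KeyError</code> is an assumption (the docs only say that an error is raised).</p>

```python
# Illustrative model only: a plain dict stands in for the table schema.
# This is NOT how deltalake implements set_column_metadata.


def set_column_metadata(
    schema: dict[str, dict[str, str]],
    column: str,
    metadata: dict[str, str],
) -> None:
    """Upsert metadata entries on one column of a toy schema, in place."""
    if column not in schema:
        # Mirrors the documented rule that an unknown column is an error.
        # The exception type is an assumption made for this sketch.
        raise KeyError(f"column {column!r} not found in schema")
    # Existing keys are overwritten; missing keys are inserted.
    schema[column].update(metadata)


schema = {"id": {}, "price": {"unit": "USD"}}
set_column_metadata(schema, "price", {"unit": "EUR", "comment": "net price"})
print(schema["price"])
```

<p>Against a real table, the signature above suggests a call of the form <code>dt.alter.set_column_metadata("price", {"comment": "net price"})</code>, where <code>dt</code> is a <code>DeltaTable</code>.</p>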


<div class="doc doc-object doc-function">



<h4 id="deltalake.table.TableAlterer.set_table_properties" class="doc doc-heading">
<span class="doc doc-object-name doc-function-name">set_table_properties</span>

Expand Down
31 changes: 30 additions & 1 deletion api/delta_table/index.html
Expand Up @@ -3360,7 +3360,7 @@ <h4 id="deltalake.DeltaTable.merge" class="doc doc-heading">


</h4>
<div class="doc-signature highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="nf">merge</span><span class="p">(</span><span class="n">source</span><span class="p">:</span> <span class="n">Union</span><span class="p">[</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">Table</span><span class="p">,</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">RecordBatch</span><span class="p">,</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">RecordBatchReader</span><span class="p">,</span> <span class="n">ds</span><span class="o">.</span><span class="n">Dataset</span><span class="p">,</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">],</span> <span class="n">predicate</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">source_alias</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">target_alias</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">error_on_type_mismatch</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span> <span class="n">writer_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">WriterProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">large_dtypes</span><span class="p">:</span> <span 
class="n">Optional</span><span class="p">[</span><span class="nb">bool</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">custom_metadata</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">post_commithook_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">PostCommitHookProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">commit_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">CommitProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TableMerger</span>
<div class="doc-signature highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="nf">merge</span><span class="p">(</span><span class="n">source</span><span class="p">:</span> <span class="n">Union</span><span class="p">[</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">Table</span><span class="p">,</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">RecordBatch</span><span class="p">,</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">RecordBatchReader</span><span class="p">,</span> <span class="n">ds</span><span class="o">.</span><span class="n">Dataset</span><span class="p">,</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">],</span> <span class="n">predicate</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">source_alias</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">target_alias</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">merge_schema</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">False</span><span class="p">,</span> <span class="n">error_on_type_mismatch</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span> <span class="n">writer_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">WriterProperties</span><span class="p">]</span> 
<span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">large_dtypes</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">bool</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">streamed_exec</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span> <span class="n">custom_metadata</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">post_commithook_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">PostCommitHookProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">commit_properties</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">CommitProperties</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TableMerger</span>
</span></code></pre></div>

<div class="doc doc-contents ">
Expand Down Expand Up @@ -3438,6 +3438,20 @@ <h4 id="deltalake.DeltaTable.merge" class="doc doc-heading">
<code>None</code>
</td>
</tr>
<tr>
<td><code>merge_schema</code></td>
<td>
<code>bool</code>
</td>
<td>
<div class="doc-md-description">
<p>Enable schema evolution when the source and target table schemas do not match.</p>
</div>
</td>
<td>
<code>False</code>
</td>
</tr>
<tr>
<td><code>error_on_type_mismatch</code></td>
<td>
Expand Down Expand Up @@ -3480,6 +3494,21 @@ <h4 id="deltalake.DeltaTable.merge" class="doc doc-heading">
<code>None</code>
</td>
</tr>
<tr>
<td><code>streamed_exec</code></td>
<td>
<code>bool</code>
</td>
<td>
<div class="doc-md-description">
<p>Executes MERGE using a LazyMemoryExec plan, which reduces memory pressure for large source tables. Enabling streamed_exec
implicitly disables the use of source table statistics to derive an early_pruning_predicate.</p>
</div>
</td>
<td>
<code>True</code>
</td>
</tr>
<tr>
<td><code>arrow_schema_conversion_mode</code></td>
<td>
Expand Down
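<p>The idea behind the new <code>merge_schema</code> flag can be sketched conceptually. The helper below is hypothetical and models schemas as plain name-to-type dicts; it assumes that columns found only in the source are appended to the target and that conflicting types are rejected. The actual evolution rules in the library may differ, for example for nested or widened types.</p>

```python
# Conceptual sketch of schema evolution during a merge.
# Schemas are modelled as {column_name: type_name}; this is an assumption
# made for illustration, not the library's internal representation.


def evolve_schema(target: dict[str, str], source: dict[str, str]) -> dict[str, str]:
    """Return the target schema extended with columns found only in source."""
    merged = dict(target)  # copy so the caller's target schema is untouched
    for name, dtype in source.items():
        if name not in merged:
            # New column: append it to the evolved schema.
            merged[name] = dtype
        elif merged[name] != dtype:
            # Same column, different type: treated as an error in this sketch.
            raise TypeError(
                f"column {name!r}: target is {merged[name]}, source is {dtype}"
            )
    return merged


target = {"id": "long", "name": "string"}
source = {"id": "long", "email": "string"}
print(evolve_schema(target, source))
```

<p>Per the signature above, this behaviour is opt-in: <code>merge_schema</code> defaults to <code>False</code> and is passed as a keyword argument to <code>DeltaTable.merge</code>.</p>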
8 changes: 4 additions & 4 deletions integrations/delta-lake-daft/index.html
Expand Up @@ -2127,14 +2127,14 @@ <h2 id="write-to-delta-lake">Write to Delta Lake</h2>
</span></code></pre></div>
<p>Daft supports multiple write modes. See the <a href="https://www.getdaft.io/projects/docs/en/latest/api_docs/doc_gen/dataframe_methods/daft.DataFrame.write_deltalake.html#daft.DataFrame.write_deltalake">Daft documentation</a> for more information.</p>
<h2 id="what-can-i-do-with-a-daft-dataframe">What can I do with a Daft DataFrame?</h2>
<p>Daft gives you <a href="https://www.getdaft.io/projects/docs/en/latest/user_guide/basic_concepts.html">full-featured DataFrame functionality</a>, similar to what you might be used to from pandas, Dask or PySpark.</p>
<p>Daft gives you full-featured DataFrame functionality, similar to what you might be used to from pandas, Dask or PySpark.</p>
<p>On top of this, Daft also gives you:</p>
<ul>
<li><strong>Multimodal data type support</strong> to work with Images, URLs, Tensors and more</li>
<li><strong>Expressions API</strong> for easy column transformations</li>
<li><strong>UDFs</strong> for multi-column transformation, incl. ML applications</li>
</ul>
<p>Check out the <a href="https://www.getdaft.io/projects/docs/en/latest/user_guide/index.html">Daft User Guide</a> for a complete list of DataFrame operations.</p>
<p>Check out the <a href="https://www.getdaft.io/projects/docs/en/stable/index.html">Daft User Guide</a> for a complete list of DataFrame operations.</p>
<h2 id="data-skipping-optimizations">Data Skipping Optimizations</h2>
<p>Delta Lake and Daft work together to give you highly-optimized query performance.</p>
<p>Delta Lake stores your data in Parquet files. Parquet is a columnar file format that natively supports column pruning. If your query only needs to read data from a specific column or set of columns, you don't need to read in the entire file. This can save you lots of time and compute.</p>
Expand Down Expand Up @@ -2210,11 +2210,11 @@ <h3 id="z-ordering-for-enhanced-file-skipping">Z-Ordering for enhanced file skip
<p>Read <a href="https://delta.io/blog/daft-delta-lake-integration/">High-Performance Querying on Massive Delta Lake Tables with Daft</a> for an in-depth benchmarking of query optimization with Delta Lake and Daft using partitioning and Z-ordering.</p>
<h2 id="daft-gives-you-multimodal-data-type-support">Daft gives you Multimodal Data Type Support</h2>
<p>Daft has a rich multimodal type-system with support for Python objects, Images, URLs, Tensors and more.</p>
<p>The <a href="https://www.getdaft.io/projects/docs/en/latest/api_docs/expressions.html">Expressions API</a> provides useful tools to work with these data types. By combining multimodal data support with the <a href="https://www.getdaft.io/projects/docs/en/latest/api_docs/udf.html">User-Defined Functions API</a> you can <a href="https://www.getdaft.io/projects/docs/en/latest/user_guide/tutorials.html#mnist-digit-classification">run ML workloads</a> right within your DataFrame.</p>
<p>The <a href="https://www.getdaft.io/projects/docs/en/latest/api_docs/expressions.html">Expressions API</a> provides useful tools to work with these data types. By combining multimodal data support with the <a href="https://www.getdaft.io/projects/docs/en/latest/api_docs/udf.html">User-Defined Functions API</a> you can run ML workloads right within your DataFrame.</p>
<p>Take a look at the notebook in the <a href="https://github.com/delta-io/delta-examples"><code>delta-examples</code> GitHub repository</a> for a closer look at how Daft handles URLs, images and ML applications.</p>
<h2 id="contribute-to-daft">Contribute to <code>daft</code></h2>
<p>Excited about Daft and want to contribute? Join them on <a href="https://github.com/Eventual-Inc/Daft">GitHub</a> 🚀</p>
<p>Like many technologies, Daft collects some non-identifiable telemetry to improve the product. This is strictly non-identifiable metadata. You can disable telemetry by setting the following environment variable: <code>DAFT_ANALYTICS_ENABLED=0</code>. Read more in the <a href="https://www.getdaft.io/projects/docs/en/latest/faq/telemetry.html">Daft documentation</a>.</p>
<p>Like many technologies, Daft collects some non-identifiable telemetry to improve the product. This is strictly non-identifiable metadata. You can disable telemetry by setting the following environment variable: <code>DAFT_ANALYTICS_ENABLED=0</code>. Read more in the <a href="https://www.getdaft.io/projects/docs/en/stable/resources/telemetry/">Daft documentation</a>.</p>
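<p>One way to apply the telemetry setting from inside a Python script is to set the variable on the process environment. This is a minimal sketch; it assumes Daft reads the variable when it is imported, so the assignment is placed before any <code>import daft</code>.</p>

```python
import os

# Disable Daft telemetry for this process. Setting the variable before
# importing daft is an assumption about when the library reads it.
os.environ["DAFT_ANALYTICS_ENABLED"] = "0"

# A later `import daft` would go here, after the variable is set.
print(os.environ["DAFT_ANALYTICS_ENABLED"])
```

<p>Exporting the variable in the shell before launching Python has the same effect and avoids depending on import order.</p>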



Expand Down
Binary file modified objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.
