Update dataframe page.
mrocklin committed Dec 15, 2023
1 parent 2d8f6d5 commit c8989d9
Showing 6 changed files with 187 additions and 190 deletions.
4 changes: 2 additions & 2 deletions docs/source/dataframe-api.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
-API
----
+Dask DataFrame API
+==================

.. currentmodule:: dask.dataframe

8 changes: 4 additions & 4 deletions docs/source/dataframe-create.rst
@@ -1,5 +1,5 @@
-Create and Store Dask DataFrames
-================================
+Load and Save Data with Dask DataFrames
+=======================================

.. meta::
    :description: Learn how to create DataFrames and store them. Create a Dask DataFrame from various data storage formats like CSV, HDF, Apache Parquet, and others.
@@ -67,7 +67,7 @@ Read from CSV

You can use :func:`read_csv` to read one or more CSV files into a Dask DataFrame.
It supports loading multiple files at once using globstrings:

.. code-block:: python

   >>> df = dd.read_csv('myfiles.*.csv')
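To get a feel for which files a globstring like the one above would pick up, the pattern can be tried against a list of names with the standard library's ``fnmatch`` module. This is only an illustrative sketch: the file names are hypothetical, and Dask resolves globs through its own filesystem layer, so edge cases may behave differently.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only to illustrate the pattern.
candidates = [
    "myfiles.2023-01.csv",
    "myfiles.2023-02.csv",
    "myfiles.csv",             # no middle segment, so it does not match
    "otherfiles.2023-01.csv",  # different prefix, so it does not match
]

pattern = "myfiles.*.csv"

# Keep only the names that the glob pattern matches (case-sensitive).
matched = [name for name in candidates if fnmatchcase(name, pattern)]
print(matched)
```

Checking a pattern this way before calling ``read_csv`` can help catch a glob that is broader or narrower than intended.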
@@ -76,7 +76,7 @@ You can break up a single large file with the ``blocksize`` parameter:

.. code-block:: python

-   >>> df = dd.read_csv('largefile.csv', blocksize=25e6) # 25MB chunks
+   >>> df = dd.read_csv('largefile.csv', blocksize=25e6) # 25MB chunks
Changing the ``blocksize`` parameter will change the number of partitions (see the explanation on
:ref:`partitions <dataframe-design-partitions>`). A good rule of thumb when working with
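The relationship between ``blocksize`` and the number of partitions can be approximated with a little arithmetic. The sketch below is a back-of-the-envelope estimate, not Dask's actual splitting logic: the helper ``estimate_partitions`` and the 1 GB file size are hypothetical, and the real partition count may differ slightly depending on how Dask places block boundaries.

```python
def estimate_partitions(file_size_bytes, blocksize_bytes):
    """Rough estimate: about one partition per blocksize of input.

    Hypothetical helper for illustration; not part of the Dask API.
    """
    return max(1, round(file_size_bytes / blocksize_bytes))


file_size = 1_000_000_000  # hypothetical 1 GB CSV file

# Larger blocks mean fewer, bigger partitions.
for blocksize in (25e6, 50e6, 100e6):
    count = estimate_partitions(file_size, blocksize)
    print(f"blocksize={blocksize / 1e6:.0f}MB -> roughly {count} partitions")
```

On a real DataFrame, the actual value can be read from ``df.npartitions`` after calling ``read_csv``.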
4 changes: 2 additions & 2 deletions docs/source/dataframe-design.rst
@@ -1,7 +1,7 @@
.. _dataframe.design:

-Internal Design
-===============
+Dask DataFrame Design
+=====================

Dask DataFrames coordinate many Pandas DataFrames/Series arranged along an
index. We define a Dask DataFrame object with the following components:
14 changes: 14 additions & 0 deletions docs/source/dataframe-extra.rst
@@ -0,0 +1,14 @@
+Additional Information
+======================
+
+.. toctree::
+   :maxdepth: 1
+
+   Parquet <dataframe-parquet.rst>
+   Indexing <dataframe-indexing.rst>
+   SQL <dataframe-sql.rst>
+   Join Performance <dataframe-joins.rst>
+   Shuffling Performance <dataframe-groupby.rst>
+   dataframe-categoricals.rst
+   Extend <dataframe-extend.rst>
+   Hive Partitioning <dataframe-hive.rst>
4 changes: 2 additions & 2 deletions docs/source/dataframe-groupby.rst
@@ -1,5 +1,5 @@
-Shuffling for GroupBy and Join
-==============================
+Shuffling Performance
+=====================

.. currentmodule:: dask.dataframe
