AWS SDK for pandas 2.18.0
Noteworthy
- Pyarrow 10 support 🔥 by @kukushking in #1731
- Lambda layers now available in af-south-1 (Cape Town) 🌍 by @malachi-constant
Features & enhancements
- Add unload_approach to athena.read_sql_table by @jaidisido in #1634
- Pass additional partition projection params to wr.s3.to_parquet & cat… by @kukushking in #1627
- Regenerate poetry.lock with no update by @cnfait in #1663
- Upgrading poetry installed in workflow by @cnfait in #1677
- Improve bucketing series generation by casting only the required columns by @kukushking in #1664
- Add get_query_executions generating DataFrames from Athena query executions detail by @KhueNgocDang in #1676
- Dependency: Set Pandas Version != 1.5.0 due to memory leak by @malachi-constant in #1688
- read_csv: read file as binary when encoding_errors is set to ignore by @cnfait in #1723
- Deps: Remove upper bound limit on 'python' version by @malachi-constant in #1720
- (enhancement) Redshift: Adding 'primary_keys' to parameter validation by @malachi-constant in #1728
- Add describe_log_streams and filter_log_events to the CloudWatch module by @KhueNgocDang in #1785
- Update lambda layers with pyarrow 10 by @kukushking in #1758
- Add ctas_write_compression argument to athena.read_sql_query by @LeonLuttenberger in #1795
- Add auto termination policy to EMR by @vikramsg in #1818
- timestream.query: add QueryId and NextToken to df attributes by @cnfait in #1821
- Add support for boto3 kwargs to timestream.create_table by @cnfait in #1819
- Adding args to submit spark step by @vikramsg in #1826
Bug fixes
- Fix athena.read_sql_query for empty table and chunk size not returning an empty frame generator by @LeonLuttenberger in #1685
- Fixing index column validation in s3.read.parquet() validate schema by @malachi-constant in #1735
- Bug: Replace extra_registries with extra_public_registries by @vikramsg in #1757
- Fix: map datatype issue of athena by @pal0064 in #1753
- Fix Redshift commands breaking with hyphenated table names by @LeonLuttenberger in #1762
- Add correct service names for timestream boto3 clients by @malachi-constant in #1716
- Allow read partitions with extra = in the value by @kukushking in #1779
Documentation
- Update install page in docs with screenshot of new managed layer name by @LeonLuttenberger in #1636
- Remove semicolon from python code eol in s3 tutorial by @cnfait in #1673
- Consistent kernel for jupyter notebooks by @cnfait in #1674
- Correct a few typos in our ipynb tutorials by @cnfait in #1694
- Fix broken links in readme by @lucasasmith in #1702
- Typos in comments and docs by @mycaule in #1761
Tests
- Support for test infrastructure in private subnets by @cnfait in #1698
- Upgrade engine versions to match defaults from aws console by @cnfait in #1709
- Set redshift and Neptune clusters removal policy to destroy by @cnfait in #1675
- Upgrade pytest-xdist by @LeonLuttenberger in #1760
- Fix timestream endpoint tests by @LeonLuttenberger in #1781
New Contributors
- @lucasasmith made their first contribution in #1702
- @vikramsg made their first contribution in #1757
- @mycaule made their first contribution in #1761
- @pal0064 made their first contribution in #1753
Thanks
We thank the following contributors/users for their work on this release:
@lucasasmith, @vikramsg, @mycaule, @pal0064, @LeonLuttenberger, @cnfait, @malachi-constant, @kukushking, @jaidisido
Full Changelog: 2.17.0...2.18.0