Release Notes for 1.2.0

1.2.0

This release brings improvements and bug fixes for the Multistage Engine, Upserts, and Compaction, along with many smaller features and general bug fixes.

Multistage Engine Improvements

Features

New Window Functions: LEAD, LAG, FIRST_VALUE, LAST_VALUE #12878 #13340

  • LEAD allows you to access values after the current row in a frame.
  • LAG allows you to access values before the current row in a frame.
  • FIRST_VALUE and LAST_VALUE return the first and last values in the frame, respectively.
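
The semantics of these four functions can be illustrated with a short Python sketch. It is not Pinot code: it mimics standard SQL behavior over a single, already ordered partition, and it assumes the frame spans the whole partition. The function name `window_columns` and the `offset` parameter are hypothetical.

```python
from typing import Dict, List, Optional


def window_columns(values: List[int], offset: int = 1) -> List[Dict[str, Optional[int]]]:
    """Compute LEAD, LAG, FIRST_VALUE and LAST_VALUE for one ordered partition.

    Assumption: the frame covers the entire partition (illustration only).
    """
    n = len(values)
    rows = []
    for i, value in enumerate(values):
        rows.append({
            "value": value,
            # LEAD: the value `offset` rows after the current row, None past the end.
            "lead": values[i + offset] if i + offset < n else None,
            # LAG: the value `offset` rows before the current row, None before the start.
            "lag": values[i - offset] if i - offset >= 0 else None,
            # FIRST_VALUE / LAST_VALUE: first and last values in the frame.
            "first_value": values[0],
            "last_value": values[-1],
        })
    return rows
```

For the partition `[10, 20, 30]`, the first row has no LAG value and the last row has no LEAD value, which corresponds to NULL in SQL.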

Support for Logical Database in V2 Engine #12591 #12695

  • V2 Engine now supports a "database" construct, enabling table namespace isolation within the same Pinot cluster.
  • Improves user experience when multiple users are using the same Pinot Cluster.
  • Access control policies can be set at the database level.
  • Database can be selected in a query using a SET statement, such as SET database=my_db;.

Improved Multi-Value (MV) and Array Function Support

  • Added array sum aggregation functions for point-wise array operations #13324.
  • Added support for valueIn MV transform function #13443.
  • Fixed bug in numeric casts for MV columns in filters #13425.
  • Fixed NPE in ArrayAgg when a column contains no data #13358.
  • Fixed array literal handling #13345.

Support for WITHIN GROUP Clause and ListAgg #13146

  • WITHIN GROUP Clause can be used to process rows in a given order within a group.
  • One of the most common use-cases for this is the ListAgg function, which when combined with WITHIN GROUP can be used to concatenate strings in a given order.
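
As a rough illustration of what ListAgg combined with WITHIN GROUP computes, the Python sketch below groups rows, orders each group by a sort key, and concatenates the strings. It only models the semantics; the function name `listagg_within_group` and the tuple layout are assumptions made for this example, and the exact Pinot syntax should be taken from the Pinot documentation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def listagg_within_group(rows: List[Tuple[str, str, int]], separator: str = ",") -> Dict[str, str]:
    """Concatenate strings per group in the order given by an ordering key.

    Each row is a (group_key, text, order_key) tuple (hypothetical layout).
    """
    groups = defaultdict(list)
    for group_key, text, order_key in rows:
        groups[group_key].append((order_key, text))
    # Sort each group by its ordering key, then join the strings.
    return {
        key: separator.join(text for _, text in sorted(items, key=lambda item: item[0]))
        for key, items in groups.items()
    }
```

Without the ordering step, the concatenation order would depend on the order in which rows happen to arrive, which is why the two features are usually used together.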

Scalar/Transform Function and Set Operation Improvements

  • Added Geospatial Scalar Function support for use in intermediate stage in the v2 query engine #13457.
  • Fixed 'WEEK' transform function #13483.
  • Support EXTRACT as a scalar function #13463.
  • Added support for ALL modifier for INTERSECT and EXCEPT Set Operations #13151 #13166.
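
The effect of the ALL modifier can be sketched with Python multisets. Under standard SQL semantics, INTERSECT ALL keeps each row as many times as it appears in both inputs (the minimum of the two counts), and EXCEPT ALL subtracts the counts. The helper names below are hypothetical and the sketch says nothing about how Pinot executes these operators.

```python
from collections import Counter
from typing import List


def intersect_all(left: List[int], right: List[int]) -> List[int]:
    """INTERSECT ALL: each row appears min(count_left, count_right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left: List[int], right: List[int]) -> List[int]:
    """EXCEPT ALL: each row appears max(count_left - count_right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())
```

Without ALL, both operations deduplicate their result, so `set(left) & set(right)` and `set(left) - set(right)` are the corresponding models.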

Improved Literal Handling Support

  • Fixed bug in handling literal arguments in aggregation functions like Percentile #13282.
  • Allow INT and FLOAT literals #13078.
  • Fixed literal handling for all types #13344 #13345.
  • Fixed null literal handling for null intolerant functions #13255.

Metrics Improvements

  • Added new metrics for tracking queries executed globally and at the table level #12982.
  • New metrics to track join counts and window function counts #13032.
  • Multiple meters and timers to track Multistage Engine Internals #13035.

Notable Improvements and Bug Fixes

  • Improved window operator resiliency, with new checks to make sure the window doesn't grow too large #13180 #13428 #13441.
  • Optimized Group Key generation #12394.
  • Fixed SortedMailboxReceiveOperator to honor convention of pulling at most 1 EOS block #12406.
  • Improvement in how execution stats are handled #12517 #12704 #13136.
  • Use Protobuf instead of Reflection for Plan Serialization #13221.

Upsert Compaction and Minion Improvements

Features and Improvements

Minion Resource Isolation #12459 #12786

  • Minions now support resource isolation based on an instance tag.
  • Instance tag is configured at table level, and can be set for each task on a table.
  • This enables you to implement arbitrary resource isolation strategies. For example, you can dedicate a set of Minion nodes to running any set of tasks across any set of tables.

Greedy Upsert Compaction Scheduling #12461

  • Upsert compaction now schedules segments for compaction based on the number of invalid docs.
  • This helps the compaction task to handle arbitrary temporal distribution of invalid docs.
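
The idea of greedy scheduling can be sketched as follows: rank segments by how many invalid docs they hold and pick the worst offenders first, regardless of how old the segments are. This is a minimal sketch, not the actual task generator. `SegmentStats`, `pick_segments_for_compaction`, and `max_segments` are hypothetical names, and the real task applies its own thresholds and configs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SegmentStats:
    name: str
    total_docs: int
    invalid_docs: int


def pick_segments_for_compaction(segments: List[SegmentStats], max_segments: int) -> List[SegmentStats]:
    """Greedily select the segments with the most invalid docs."""
    # Segments without invalid docs gain nothing from compaction.
    candidates = [s for s in segments if s.invalid_docs > 0]
    # Highest invalid-doc count first, independent of segment age.
    candidates.sort(key=lambda s: s.invalid_docs, reverse=True)
    return candidates[:max_segments]
```

Because the ranking ignores time, a recent segment with many invalid docs is compacted before an old segment with few.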

Notable Improvements

  • Minions can now download segments from servers when deepstore copy is missing. This feature is enabled via a cluster level config allowDownloadFromServer #12960 #13247.
  • Added support for TLS Port in Minions #12943.
  • New metrics added for Minions to track segment/record processing information #12710.

Bug Fixes

  • Minions can now handle invalid instance tags in Task Configs gracefully. Prior to this change, Minions would be stuck in IN_PROGRESS state until task timeout #13092.
  • Fix bug to return validDocIDsMetadata from all servers #12431.
  • Fixed a bug where upsert compaction didn't retain maxLength information and trimmed string fields #13157.

Upsert Improvements

Features and Improvements

Consistent Table View for Upsert Tables #12976

  • Adds different modes of consistency guarantees for Upsert tables.
  • Adds a new UpsertConfig called consistencyMode which can be set to NONE, SYNC, SNAPSHOT.
  • SYNC is optimized for data freshness but can lead to elevated query latencies and is best for low-qps use-cases. In this mode, the ingestion threads will take a write lock when updating validDocID bitmaps.
  • SNAPSHOT mode can handle high-qps/high-ingestion use-cases by getting the list of valid docs from a snapshot of validDocID. The snapshot can be refreshed every few seconds and the tolerance can be set via a query option upsertViewFreshnessMs.

Pluggable Partial Upsert Merger #11983

  • Partial Upsert merges the old record and the new incoming record to generate the final ingested record.
  • Pinot now allows users to customize how this merge of an old row and the new row is computed.
  • This allows a column value in the new row to be an arbitrary function of the old and the new row.
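
Conceptually, a custom merger is a function that takes the old row and the new row and returns the row to ingest. The Python sketch below illustrates that idea only; Pinot's actual extension point is a Java interface that this sketch does not reproduce. The function name `merge_rows` and the columns `clicks` and `total_clicks` are hypothetical.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def merge_rows(old: Row, new: Row) -> Row:
    """Merge an existing row with an incoming row for the same primary key."""
    merged = dict(old)
    # Non-null values in the new row overwrite the old ones.
    merged.update({key: value for key, value in new.items() if value is not None})
    # Hypothetical custom rule: a column computed from both the old and the new row.
    merged["total_clicks"] = (old.get("total_clicks") or 0) + (new.get("clicks") or 0)
    return merged
```

The last rule is the part that fixed, per-column merge strategies cannot express, since the result depends on different columns of both rows.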

Support for Uploading Externally Partitioned Segments for Upsert Backfill #13107

  • Segments uploaded for Upsert Backfill can now explicitly specify the Kafka partition they belong to.
  • This enables backfilling an Upsert table where the externally generated segments are partitioned using an arbitrary hash function on an arbitrary primary key.

Misc Improvements and Bug Fixes

  • Fixed a bug in handling equal comparison column values in Upsert, which could lead to data inconsistency #12395.
  • Upsert snapshot will now snapshot only those segments which have updates. #13285.

Notable Features

JSON Support Improvements

  • JSON Index can now be used for evaluating Regex and Range Predicates. #12568
  • jsonExtractIndex now supports contextual array filters. #12683 #12531.
  • JSON column type now supports filter predicates like =, !=, IN and NOT IN. This is convenient for scenarios where the JSON values are very small. #13283.
  • JSON_MATCH now supports exclusive predicates correctly. For instance, you can use predicates such as JSON_MATCH(person, '"$.addresses[*].country" != ''us''') to find all people who have at least one address that is not in the US. #13139.
  • jsonExtractIndex supports extracting Multi-Value JSON Fields, and also supports providing any default value when the key doesn't exist. #12748.
  • Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
  • Fix ArrayIndexOutOfBoundsException in jsonExtractIndex. #13479.
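
The semantics of the exclusive JSON_MATCH predicate mentioned above can be sketched in Python: a document matches when at least one element of the `addresses` array has a country other than `us`. This models the described behavior only, not the JSON index. The function name `has_non_us_address` is hypothetical, and treating elements without a `country` key as non-matching is an assumption of this sketch.

```python
import json


def has_non_us_address(person_json: str) -> bool:
    """Return True if at least one address has a country other than 'us'."""
    person = json.loads(person_json)
    addresses = person.get("addresses", [])
    # Assumption: an address without a "country" key does not match.
    return any("country" in address and address["country"] != "us" for address in addresses)
```

Note that a person with one US address and one non-US address matches, because the predicate is evaluated per array element rather than over the document as a whole.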

Lucene and Text Search Improvements

  • Improved Segment Build Time for Lucene Text Index by 40-60%. This improvement is realized when a consuming segment commits and changes to an ImmutableSegment. This significantly helps in lowering ingestion lag at commit time due to a large text index #12744 #13094 #13050.
  • Phrase Search can run 3x faster when the Lucene Index Config enablePrefixSuffixMatchingInPhraseQueries is set to true. This is achieved by rewriting phrase search query to a wildcard and prefix matching query #12680.
  • Fixed bug in TextMatchFilterOptimizer that was not applying precedence to the filter expressions properly, which could lead to incorrect results. #13009.
  • Fixed bug in handling NOT text_match which could have returned incorrect results. #12372.
  • Added SchemaConformingTransformerV2 to enhance text search abilities. #12788.
  • Added metrics to track Lucene NRT Refresh Delay #13307.
  • Switched to NRTCachingDirectory for Realtime segments and prevented duplicates in the Realtime Lucene Index to avoid IndexOutOfBounds query time exceptions. #13308.
  • Lucene Version is upgraded to 9.11.1. #13505.

New Funnel Functions #13176 #13231 #13228

  • Added funnelMaxStep function, which can be used to calculate the max funnel steps for a given sliding window.
  • Added funnelCompleteCount to calculate the number of completed funnels, and funnelMatchStep to get the funnel match array.

Support for Interning for OnHeapByteDictionary #12342

  • This can reduce the heap usage of a dictionary encoded byte column, for a certain distribution of duplicate values. See #12223 for details.

Column Major Builder On By Default for New Tables #12770

  • Prior to this feature, on a segment commit, Pinot would convert all the columnar data from the Mutable Segment to row-major, and then re-build column major Immutable Segments.
  • This feature skips the row-major conversion and is expected to be both space and time efficient.
  • It can help lower ingestion lag from segment commits, especially helpful when your segments are large.

Support for SQL Formatting in Query Editor #11725

  • You can now prettify SQL right in the Controller UI!

Hash Function for UUID Primary Keys #12538

  • Added a new lossless hash-function for Upsert Primary Keys optimized for UUIDs.
  • The hash function can reduce Old Gen heap usage by up to 30%.
  • It maps a UUID to a 16-byte array, instead of encoding it as a UTF-8 string, which would take 36 bytes.
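
The size difference is easy to verify with Python's standard `uuid` module. The sketch below does not reproduce Pinot's hash function; it only shows why a UUID fits losslessly into 16 bytes. The helper names are hypothetical.

```python
import uuid


def uuid_to_bytes(value: str) -> bytes:
    """Pack the textual UUID into its 16-byte binary form."""
    return uuid.UUID(value).bytes


def bytes_to_uuid(data: bytes) -> str:
    """Recover the canonical textual form, showing the mapping is lossless."""
    return str(uuid.UUID(bytes=data))
```

A canonical UUID string has 32 hexadecimal digits plus 4 hyphens, hence 36 bytes in UTF-8, while the 128-bit value itself needs only 16 bytes.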

Column Level Index Skip Query Option #12414

  • Convenient for debugging impact of indexes on query performance or results.
  • You can add the skipIndexes option to your query to skip any number of indexes. e.g. SET skipIndexes=inverted,range;

New UDFs and Scalar Functions

  • New GeoHash functions: encodeGeoHash, decodeGeoHash, decodeGeoHashLatitude and decodeGeoHashLongitude.
  • dateBin can be used to align a timestamp to the nearest time bucket.
  • prefixes, suffixes and uniqueNgrams UDFs for generating all respective string subsequences from a string input. #12392.
  • Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
  • splitPart UDF has minor improvements. #12437.

CLP Compression Codec in Forward Indexes #12504

  • CLP (Compressed Log Processor) achieves a very high compression ratio for certain log types.
  • To enable this, you can set the compressionCodec in the fieldConfigList of the column you want to target.

Misc. Improvements

  • Enable segment preloading at partition level #12451.
  • Use Temurin instead of AdoptOpenJdk #12533
  • Adding record reader config/context param to record transformer #12520
  • Removing legacy commons-lang dependency #13480
  • 12508: Feature add segment rows flush config #12681
  • ADSS Race Condition and update to client error codes #13104
  • Add ExceptionMapper to convert Exception to Response Object for Broker REST API's #13292
  • Add FunnelMaxStepAggregationFunction and FunnelCompleteCountAggregationFunction #13231
  • Add GZIP Compression Codec (#11434) #12668
  • Add PodDisruptionBudgets to the Pinot Helm chart #13153
  • Add Postgres compliant name aliasing for String Functions. #12795
  • Add SchemaConformingTransformerV2 to enhance text search abilities #12788
  • Add a benchmark to measure multi-stage block serde cost #13336
  • Add a plan version field to QueryRequest Protobuf Message #13267
  • Add a post-validator visitor that verifies there are no cast to bytes #12475
  • Add a safe version of CLStaticHttpHandler that disallows path traversal. #13124
  • Add ability to track filtered messages offset #12602
  • Add back 'numRowsResultSet' to BrokerResponse, and retain it when result table id hidden #13198
  • Add back profile for shade #12979
  • Add back some exclude deps from hadoop-mapreduce-client-core #12638
  • Add backward compatibility regression test suite for multi-stage query engine #13193
  • Add base class for custom object accumulator #12685
  • Add clickstream example table for funnel analysis #13379
  • Add config option for timezone #12386
  • Add config to skip record ingestion on string column length exceeding configured max schema length #13103
  • Add controller API to get allLiveInstances #12498
  • Add isJson UDF #12603
  • Add list of collaborators to asf.yaml #13346
  • Add locking logic to get consistent table view for upsert tables #12976
  • Add metric to track number of segments missed in upsert-snapshot #12581
  • Add metrics for SEGMENTS_WITH_LESS_REPLICAS monitoring #12336
  • Add mode to allow adding dummy events for non-matching steps #13382
  • Add offset based lag metrics #13298
  • Add protobuf codegen decoder #12980
  • Add retry policy to wait for job id to persist during rebalancing #13372
  • Add round-robin logic during downloadSegmentFromPeer #12353
  • Add schema as input to the decoder. #12981
  • Add splitPartWithLimit and splitPartFromEnd UDFs #12437
  • Add support for creating raw derived columns during segment reload #13037
  • Add support for raw JSON filter predicates #13283
  • Add the possibility of configuring ForwardIndexes with compressionCodec #12218
  • Add upsert-snapshot timer metric #12383
  • Add validation check for forward index disabled if it's a REALTIME table #12838
  • Added PR compatibility test against release 1.1.0 #12921
  • Added kafka partition number to metadata. #13447
  • Added pinot-error-code header in query response #12338
  • Added tests for additional data types in SegmentPreProcessorTest.java #12755
  • Adding a cluster config to enable instance pool and replica group configuration in table config #13131
  • Adding batch api support for WindowFunction #12993
  • Adding bytes string data type integration tests #12387
  • Adding registerExtraComponents to allow registering additional components in various services #13465
  • Adding support of insecure TLS #12416
  • Adding support to insecure TLS when creating SSLFactory #12425
  • Adds AGGREGATE_CASE_TO_FILTER rule #12643
  • Adds per-column, query-time index skip option #12414
  • Allow Aggregations in Case Expressions #12613
  • Allow PinotHelixResourceManager subclasses to be used in the controller starter by providing an overridable PinotHelixResourceManager object creator function #13495
  • Allow RequestContext to consider http-headers case-insensitivity #13169
  • Allow Server throttling just before executing queries on server to allow max CPU and disk utilization #12930
  • Allow all raw index config in star-tree index #13225
  • Allow apply both environment variables and system properties to user and table configs, Environment variables take precedence over system properties #13011
  • Allow configurable queryWorkerThreads in Pinot server side GrpcQueryServer #13404
  • Allow dynamically setting the log level even for loggers that aren't already explicitly configured #13156
  • Allow passing custom record reader to be inited/closed in SegmentProcessorFramework #12529
  • Allow passing database context through database http header #12417
  • Allow stop to interrupt the consumer thread and safely release the resource #13418
  • Allow user configurable regex library for queries #13005
  • Allow using 'serverReturnFinalResult' to optimize server partitioned table #13208
  • Assign default value to newly added derived column upon reload #12648
  • Avoid port conflict in integration tests #13390
  • Better handling of null tableNames #12654
  • CLP as a compressionCodec #12504
  • Change helm app version to 1.0.0 for Apache Pinot latest release version #12436
  • Clean Google Dependencies #13297
  • Clean up BrokerRequestHandler and BrokerResponse #13179
  • Clean up arbitrary sleep in /GrpcBrokerClusterIntegrationTest #12379
  • Cleaning up vector index comments and exceptions #13150
  • Cleanup HTTP components dependencies and upgrade Thrift #12905
  • Cleanup Javax and Jakarta dependencies #12760
  • Cleanup deprecated query options #13040
  • Cleanup the consumer interfaces and legacy code #12697
  • Cleanup unnecessary dependencies under pinot-s3 #12904
  • Cleanup unused aggregate internal hint #13295
  • Consistency in API response for live broker #12201
  • Consolidate bouncycastle libraries #12706
  • Consolidate nimbus-jose-jwt version to 9.37.3 #12609
  • ControllerRequestClient accepts headers. Useful for authN tests #13481
  • Custom configuration property reader for segment metadata files #12440
  • Delete database API #12765
  • Deprecate PinotHelixResourceManager#getAllTables() in favour of getAllTables(String databaseName) #12782
  • Detect expired messages in Kafka. Log and set a gauge. #12608
  • Do not hard code resource class in BaseClusterIntegrationTest #13400
  • Do not pause ingestion when upsert snapshot flow errors out #13257
  • Don't drop original field during flatten #13490
  • Don't enforce -realTimeInstanceCount and -offlineInstanceCount options when creating broker tenants #13236
  • Egalpin/skip indexes minor changes #12514
  • Emit Metrics for Broker Adaptive Server Selector type #12482
  • Emit table size related metrics only in lead controller #12747
  • Enable complexType handling in SegmentProcessFramework #12942
  • Enable more integration tests to run on the v2 multi-stage query engine #13467
  • Enabling avroParquet to read Int96 as bytes #12484
  • Enhance Kinesis consumer #12806
  • Enhance Parquet Test #13082
  • Enhance ProtoSerializationUtils to handle class move #12946
  • Enhance Pulsar consumer #12812
  • Enhance PulsarConsumerTest #12948
  • Enhance commit threshold to accept size threshold without setting rows to 0 #12684
  • Enhance json index to support regexp and range predicate evaluation #12568
  • Enhancement: Sketch value aggregator performance #13020
  • Ensure FieldConfig.getEncodingType() is never null #12430
  • Ensure all the lists used in PinotQuery are ArrayList #13017
  • Ensure brokerId and requestId are always set in BrokerResponse #13200
  • Enable segment preloading at partition level #12451
  • Exclude dimensions from star-tree index stored type check #13355
  • Expose more helper API in TableDataManager #13147
  • Extend compatibility verifier operation timeout from 1m to 2m to reduce flakiness #13338
  • Extract json individual array elements from json index for the transform function jsonExtractIndex #12466
  • Fetch query quota capacity utilization rate metric in a callback function #12767
  • First with time #12235
  • GitHub Actions checkout v4 #12550
  • Gzip compression, ensure uncompressed size can be calculated from compressed buffer #12802
  • Handle errors gracefully during multi-stage stats collection in the broker #13496
  • Handle shaded classes in all methods of kafka factory #13087
  • Hash Function for UUID Primary Keys #12538
  • Ignore case when checking for Direct Memory OOM #12657
  • Improve Retention Manager Segment Lineage Clean Up #13232
  • Improve error message for max rows in join limit breach #13394
  • Improve exception logging when we fail to index / transform message #12594
  • Improve logging in range index handler for index updates #13381
  • Improve upsert compaction threshold validations #13424
  • Improve warn logs for requesting validDocID snapshots #13280
  • Improved metrics for server grpc query #13177
  • Improved null check for varargs #12673
  • Improved segment build time for Lucene text index realtime to offline conversion #12744
  • In ClusterTest, make start port higher to avoid potential conflict with Kafka #13402
  • Introduce PinotLogicalAggregate and remove internal hint #13291
  • Introduce retries while creating stream message decoder for more robustness #13036
  • Isolate bad server configs during broker startup phase #12931
  • Issue #12367 #12922
  • Json extract index filter support #12683
  • Json extract index mv #12532
  • Keep get tables API with and without database #12804
  • Lint failure #12294
  • Logging a warn message instead of throwing exception #12546
  • Made the error message around dimension table size clearer #13163
  • Make Helix state transition handling idempotent #12886
  • Make KafkaConsumerFactory method less restrictive to avoid incompatibility #12815
  • Make task manager APIs database aware #12766
  • Metric for count of tables configured with various tier backends #12940
  • Metric for upsert tables count #12505
  • Metrics for Realtime Rows Fetched and Stream Consumer Create Exceptions #12522
  • Minmaxrange null #12252
  • Modify consumingSegmentsInfo endpoint to indicate how many servers failed #12523
  • Move offset validation logic to consumer classes #13015
  • Move package org.apache.calcite to org.apache.pinot.calcite #12837
  • Move resolveComparisonTies from addOrReplaceSegment to base class #13396
  • Move some mispositioned tests under pinot-core #12884
  • Move wildfly-openssl dependency management to root pom #12597
  • Moving deleteSegment call from POST to DELETE call #12663
  • Optimize unnecessary extra array allocation and conversion for raw derived column during segment reload #13115
  • Pass explicit TypeRef when evaluating MV jsonPath #12524
  • Percentile operations supporting null #12271
  • Prepare for next development iteration #12530
  • Propagate Disable User Agent Config to Http Client #12479
  • Properly handle complex type transformer in segment processor framework #13258
  • Properly return response if SegmentCompletion is aborted #13206
  • Publish helm 0.2.8 #12465
  • Publish helm 0.2.9 #13230
  • Pull janino dependency to root pom #12724
  • Pull pulsar version definition into root POM #13002
  • Query response opt #13420
  • Re-enable the Spotless plugin for Java 21 #12992
  • Readme - How to setup Pinot UI for development #12408
  • Record enricher #12243
  • Refactor PinotTaskManager class #12964
  • Refactored CommonsConfigurationUtils for loading properties configuration. #13201
  • Refactored compatibility-verifier module #13359
  • Refactoring removeSegment flow in upsert #13449
  • Refine PeerServerSegmentFinder #12933
  • Refine SegmentFetcherFactory #12936
  • Replace custom fmpp plugin with fmpp-maven-plugin #12737
  • Reposition query submission spot for adaptive server selection #13327
  • Reset controller port when stopping the controller in ControllerTest #13399
  • Rest Endpoint to Create ZNode #12497
  • Return clear error message when no common broker found for multi-stage query with tables from different tenants #13235
  • Return the names of tables that fail authorization in the exception for Multi-Stage Engine queries #13195
  • Revert " Adding record reader config/context param to record transformer (#12520)" #12526
  • Revert "Using local copy of segment instead of downloading from remote (#12863)" #13114
  • Short circuit SubPlanFragmenter because we don't support multiple sub-plans yet #13306
  • Simplify Google dependencies by importing BOM #12456
  • Specify version for commons-validator #12935
  • Support NOT in StarTree Index #12988
  • Support empty strings as json nodes #12555
  • Supporting human-readable format when configuring broker response size #12510
  • Use ArrayList instead of LinkedList in SortOperator #12783
  • Use a two server setup for multi-stage query engine backward compatibility regression test suite #13371
  • Use more efficient variants of URLEncoder::encode and URLDecoder::decode #13030
  • Use parameterized log messages instead of string concatenation #13145
  • Use separate action for /tasks/scheduler/jobDetails API #13054
  • Use try-with-resources to close file walk stream in LocalPinotFS #13029
  • Using local copy of segment instead of downloading from remote #12863
  • [Adaptive Server Selector] Add metrics for Stats Manager Queue Size #12340
  • [Cleanup] Move classes in pinot-common to the correct package #13478
  • [Feature] Add Support for SQL Formatting in Query Editor #11725
  • [HELM]: Added additional probes options and startup probe. #13165
  • [HELM]: Added checksum config annotation in stateful set for broker, controller and server #13059
  • [HELM]: Added namespace support in K8s deployment. #13380
  • [HELM]: zookeeper chart upgrade to version 13.2.0 #13083
  • [Minor] Add Nullable annotation to HttpHeaders in BrokerRequestHandler #12816
  • [Minor] Small refactor of raw index creator constructor to be more clear #13093
  • [Multi-stage] Clean up RelNode to Operator handling #13325
  • [null-aggr] Add null handling support in mode aggregation #12227
  • [partial-upsert] configure early release of _partitionGroupConsumerSemaphore in RealtimeSegmentDataManager #13256
  • [spark-connector] Add option to fail read when there are invalid segments #13080
  • add Netty arm64 dependencies #12493
  • add Netty unit test #12486
  • add SegmentContext to collect validDocIds bitmaps for many segments together #12694
  • add skipUnavailableServers query option #13387
  • add insecure mode when Pinot uses TLS connections #12525
  • add instrumentation to json index getMatchingFlattenedDocsMap() #13164
  • add jmx to prometheus metric exporting rule for realtimeRowsFiltered #12759
  • add metrics for IdealState update #13266
  • add some metrics for upsert table preloading #12722
  • add some tests on jsonPathString #12954
  • add test cases in RequestUtilsTest #12557
  • add unit test for JsonAsyncHttpPinotClientTransport #12633
  • add unit test for QueryServer #12599
  • add unit test for ServerChannels #12616
  • add unit test for StringFunctions encodeUrl #13391
  • add unit tests for pinot-jdbc-client #13137
  • add url assertion to SegmentCompletionProtocolTest #13373
  • adjust the llc partition consuming metric reporting logic #12627
  • allow passing null http headers object to translateTableName #12764
  • allow to set segment when use SegmentProcessorFramework #13341
  • auto renew jvm default sslcontext when it's loaded from files #12462
  • avoid useless intermediate byte array allocation for VarChunkV4Reader's getStringMV #12978
  • aws sdk 2.25.3 #12562
  • build-helper-maven-plugin 3.5.0 #12548
  • cache ssl contexts and reuse them #12404
  • clean up jetbrain nullable annotation #13427
  • cleanup: maven no transfer progress #12444
  • close JDBC connections #12494
  • do not fail on duplicate relaxed vars (#13214)
  • dropwizard metrics 4.2.25 #12600
  • dynamic chunk sizing for v4 raw forward index #12945
  • enable Netty leak detection #12483
  • enable parallel Maven in pinot linter script #12751
  • ensure inverse And/OrFilterOperator implementations match the query #13199
  • exclude .mvn directory from source assembly #12558
  • extend CompactedPinotSegmentRecordReader so that it can skip deleteRecord #13352
  • get startTime outside the executor task to avoid flaky time checks #13250
  • handle absent segments so that catchup checker doesn't get stuck on them #12883
  • handle overflow for MutableOffHeapByteArrayStore buffer starting size #13215
  • handle segments not tracked by partition mgr and add skipUpsertView query option #13415
  • handle table name translation on missed api resources #12792
  • hash4j version upgrade to 0.17.0 #12968
  • including the underlying exception in the logging output #13248
  • int96 parity with native parquet reader #12496
  • jsonExtractIndex support array of default values #12748
  • log the log rate limiter rate for dropped broker logs #13041
  • make http listener ssl config swappable #12455
  • make reflection calls compatible with 0.9.11 #12958
  • maven: no transfer progress #12528
  • missed to delete the temp dir #12637
  • move shouldReplaceOnComparisonTie to base class to be more reusable #13353
  • reduce Java enum .values() usage in TimerContext #12579
  • reduce logging for SpecialValueTransformer #12970
  • reduce regex pattern compilation in Pinot jdbc #13138
  • refactor TlsUtils class #12515
  • refine when to registerSegment while doing addSegment and replaceSegment for upsert tables for better data consistency #12709
  • reformat AdminConsoleIntegrationTest.java #12552
  • reformat ClusterTest.java #12531
  • release segment mgrs more reliably #13216
  • replaced getServer with getServers #12545
  • report rebalance job status for the early returns like noops #13281
  • require noDictionaryColumns with aggregationConfigs #12464
  • share the same table config object #12463
  • track segments for snapshotting even if they lost all comparisons #13388
  • untrack the segment out of TTL #12449
  • update ControllerJobType from enum to string #12518
  • update RewriterConstants so that expr min max would not collide with columns start with "parent" #13357
  • update access control check error handling to catch throwable and log errors #13209

Bug Fixes

  • Use gte(lte) to replace between() which has a bug #12595
  • Fix the ConcurrentModificationException for And/Or DocIdSet #12611
  • Upgrade RoaringBitmap to 1.0.5 to pick up the fix for RangeBitmap.between() #12604
  • bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539
  • Bug Fix createDictionaryForColumn does not take into account inverted index #13048
  • fix Cluster Manager error #12632
  • fix for quick start Cluster Manager issue #12610
  • Adding config for having suffix for client ID for realtime consumer #13168
  • Addressed comments and fixed tests from pull request 12389. /uptime and /start-time endpoints working all components #12512
  • Bugfix. Added missing paramName #13060
  • Bug fix: Do not ignore scheme property #12332
  • Bug fix: Handle missing shade config overwrites for Kafka #13437
  • BugFix: Fix merge result from more than one server #12778
  • Bugfix. Allow tenant rebalance with downtime as true #13246
  • Bugfix. Avoid passing null table name input to translation util #12726
  • Bugfix. Correct wrong method call from scheduleTask() to scheduleTaskForDatabase() #12791
  • Bugfix. Maintain literal data type during function evaluation #12607
  • Cleanup: Fix grammar in error message, also improve readability. #13451
  • Fix Bug in Handling Equal Comparison Column Values in Upsert #12395
  • Fix ColumnMinMaxValueGenerator #12502
  • Fix JavaEE related dependencies #13058
  • Fix Logging Location for CPU-Based Query Killing #13318
  • Fix PulsarUtils to not share buffer #12671
  • Fix URI construction so that AddSchema command line tool works when override flag is set to true #13320
  • Fix [Type]ArrayList elements() method usage #13354
  • Fix a typo when calculating query freshness #12947
  • Fix an overflow in PinotDataBuffer.readFrom #13152
  • Fix bug in logging in UpsertCompaction task #12419
  • Fix bug to return validDocIDsMetadata from all servers #12431
  • Fix connection issues if using JDBC and Hikari (#12267) #12411
  • Fix controller host / port / protocol CLI option description for admin commands #13237
  • Fix environment variables not applied when creating table #12560
  • Fix error message for insufficient number of untagged brokers during tenant creation #13234
  • Fix few metric rules which were affected by the database prefix handling #13290
  • Fix file handle leaks in Pinot Driver (apache#12263) #12356
  • Fix flakiness of ControllerPeriodicTasksIntegrationTest #13337
  • Fix issue with startree index metadata loading for columns with '__' in name #12554
  • Fix metric rule pattern regex #12856
  • Fix pinot-parquet NoClassFound issue #12615
  • Fix segment size check in OfflineClusterIntegrationTest #13389
  • Fix some resource leak in tests #12794
  • Fix the NPE from IS update metrics #13313
  • Fix the NPE when metadataTTL is enabled without delete column #13262
  • Fix the ServletConfig loading issue with swagger. #13122
  • Fix the issue that map flatten shouldn't remove the map field from the record #13243
  • Fix the race condition for H3InclusionIndexFilterOperator #12487
  • Fix the time segment pruner on TIMESTAMP data type #12789
  • Fix time stats in SegmentIndexCreationDriverImpl #13429
  • Fixed infer logical type name from avro union schema #13224
  • Fixing instance type to resolve #12677 and #12678
  • Helm: bug fix for chart rendering issue. #13264
  • Try to amend kafka common package with pinot shaded package prefix #13056
  • Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes #13314
  • Upgrade com.microsoft.azure:msal4j from 1.3.5 to 1.3.10 for CVE fixing #12580
  • [bugfix] Handling null value for kafka client id suffix #13279
  • bugfix: fixing jdbc client sql feature not supported exception #12480
  • bugfix: re-add support for not text_match #12372
  • bugfix: reduce enum array allocation in QueryLogger #12478
  • bugfix: use consumerDir during lucene realtime segment conversion #13094
  • cleanup: fix apache rat violation #12476
  • fix GuavaRateLimiter acquire method #12500
  • fix fieldsToRead class not in decoder #13186
  • fix flaky test, avoid early finalization #13095
  • fix merging null multi value in partial upsert #13031
  • fix race condition in ScalingThreadPoolExecutor #13360
  • fix shared buffer, tests #12587
  • fix(build): update node version to 16 #12924
  • fixing CVE critical issues by resolving kerby/jline and wildfly libraries #12566
  • fixing pinot-adls high severity CVEs #12571
  • fixing swagger setup using localhost as host name #13254
  • swagger-ui upgrade to 5.15.0 Fixes #12908
  • upgrade jettison version to fix CVE #12567