Apache Impala is a parallel execution engine for SQL queries. It supports low-latency access and interactive exploration of data in HDFS and HBase. Impala allows data to be stored in a raw form, with aggregation performed at query time without requiring upfront aggregation of data.
Apache Impala is only available when using Cloudera CDH as the Hadoop distribution for PNDA.