Fix Hive 3.x kerberos not Support #294

Merged
merged 1 commit into from
Oct 2, 2023
106 changes: 106 additions & 0 deletions HowToKerberize.md
@@ -0,0 +1,106 @@
![Bee waggle-dancing on a hive.](logo.png "Federating Hive Meta Stores.")

# Additional instructions to use Waggle Dance in a Kerberized environment


### Process

In a Kerberos environment, a client makes a request to Waggle Dance, which in turn requests the proxy user's token from the metastore and then uses this token to communicate with the metastore.

This is necessary in scenarios that require authentication, for example the `create_table` API, which requires the proxy user to create HDFS directories.

![Kerberos Process.](kerberos-process.png "Kerberos Process")

In addition, because Kerberos authentication requires a delegation token to proxy as other users, the proxy user of the session is shared globally. This means that all Hive Metastores must share a single delegation-token store so that one delegation token can be authenticated by multiple Metastores.

**One solution is to use Zookeeper to store tokens for all Hive Metastores.**
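
The need for a shared store can be illustrated with a toy model. The sketch below is not Hive's implementation: `TokenStore` and `Metastore` are hypothetical classes, and the in-memory map only stands in for the role Zookeeper would play. It shows that a token issued by one metastore is only recognised by another if both look it up in the same store.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: these classes are hypothetical and are not Hive's implementation.
class SharedTokenStoreSketch {

    /** Stands in for a delegation-token store (in-memory here, Zookeeper in a real deployment). */
    static final class TokenStore {
        private final Map<String, String> ownerByToken = new ConcurrentHashMap<>();

        void put(String token, String owner) {
            ownerByToken.put(token, owner);
        }

        String ownerOf(String token) {
            return ownerByToken.get(token);
        }
    }

    /** Stands in for a metastore that issues tokens and verifies them against its store. */
    static final class Metastore {
        private final TokenStore store;

        Metastore(TokenStore store) {
            this.store = store;
        }

        String issueToken(String owner) {
            String token = UUID.randomUUID().toString();
            store.put(token, owner);
            return token;
        }

        boolean accepts(String token) {
            return store.ownerOf(token) != null;
        }
    }

    public static void main(String[] args) {
        // Separate stores: a token issued by one metastore is unknown to the other.
        Metastore a = new Metastore(new TokenStore());
        Metastore b = new Metastore(new TokenStore());
        System.out.println("separate stores: " + b.accepts(a.issueToken("alice")));

        // Shared store: both metastores can verify the same token.
        TokenStore shared = new TokenStore();
        Metastore c = new Metastore(shared);
        Metastore d = new Metastore(shared);
        System.out.println("shared store: " + d.accepts(c.issueToken("alice")));
    }
}
```

With separate stores the second metastore rejects the token; with a shared store it accepts it.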

### Prerequisites

* Kerberized cluster:
  * active KDC
  * required Kerberos properties set in the configuration files of Hadoop services
* User account with privileges in the Kerberos environment
* Zookeeper to store delegation tokens (recommended)

### Configuration

Waggle Dance does not read Hadoop's `core-site.xml` so a general property providing Kerberos auth should be added to
the Hive configuration file `hive-site.xml`:

```
<property>
<name>hadoop.security.authentication</name>
<value>KERBEROS</value>
</property>
```


Waggle Dance also needs a keytab file to communicate with the Metastore so the following properties should be present:
```
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/hive.keytab</value>
</property>
```

In addition, all metastores need to use the shared Zookeeper token store:
```
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.znode</name>
<value>/hive/token</value>
</property>
```
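
Before starting Waggle Dance, it can be useful to confirm that the properties listed so far are actually present in `hive-site.xml`. The following is a minimal sketch under stated assumptions: `HiveSiteCheck` is a hypothetical helper (not part of Waggle Dance or Hive), it uses only the JDK's XML parser, and it assumes the usual Hadoop-style layout of `<property>` elements with `<name>` and `<value>` children.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Waggle Dance or Hive.
class HiveSiteCheck {

    // Property names taken from the configuration snippets in this section.
    static final String[] EXPECTED = {
        "hadoop.security.authentication",
        "hive.metastore.sasl.enabled",
        "hive.metastore.kerberos.principal",
        "hive.metastore.kerberos.keytab.file",
        "hive.cluster.delegation.token.store.class",
        "hive.cluster.delegation.token.store.zookeeper.connectString",
        "hive.cluster.delegation.token.store.zookeeper.znode"
    };

    /** Parses Hadoop-style property XML into a name -> value map. */
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    /** Returns the expected property names that are absent or empty. */
    static List<String> missing(Map<String, String> props) {
        List<String> result = new ArrayList<>();
        for (String name : EXPECTED) {
            String value = props.get(name);
            if (value == null || value.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<configuration><property>"
            + "<name>hadoop.security.authentication</name><value>KERBEROS</value>"
            + "</property></configuration>";
        System.out.println("Missing: " + missing(parse(xml)));
    }
}
```

In practice the XML string would be read from the file under `HIVE_CONF_DIR`; the inline string here only keeps the sketch self-contained.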

If you are intending to use a Beeline client, the following properties may be valuable:
```
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/hive.keytab</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
```


### Running

Waggle Dance should be started by a privileged user with a fresh keytab.

If Waggle Dance throws a GSS exception, there is a problem with the keytab file.
Try running `kdestroy` followed by `kinit`, and check the ownership and permissions of the keytab file.

If the Metastore throws an exception with code -127, Waggle Dance is probably using the wrong authentication policy.
Check the values in `hive-site.xml` and make sure that `HIVE_HOME` and `HIVE_CONF_DIR` are defined.
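
A quick way to confirm that both environment variables are visible to the JVM is sketched below. `EnvCheck` is a hypothetical pre-flight helper, not something Waggle Dance ships; it only reports which of the given names are unset or blank.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check, not part of Waggle Dance.
class EnvCheck {

    /** Returns the names from `required` that are unset or blank in `env`. */
    static List<String> missing(Map<String, String> env, String... required) {
        List<String> result = new ArrayList<>();
        for (String name : required) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv(), "HIVE_HOME", "HIVE_CONF_DIR");
        if (missing.isEmpty()) {
            System.out.println("HIVE_HOME and HIVE_CONF_DIR are set");
        } else {
            System.out.println("Not set: " + missing);
        }
    }
}
```

Run it as the same user and from the same shell that starts Waggle Dance, since the environment may differ between accounts.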

Don't forget to restart the Hive services!
Binary file added kerberos-process.png
22 changes: 22 additions & 0 deletions pom.xml
@@ -51,10 +51,32 @@
<jakarta.version>6.0.0</jakarta.version>
<lombok.version>1.18.24</lombok.version>
<apache.commons.version>3.12.0</apache.commons.version>
<curator.version>2.13.0</curator.version>
</properties>

<dependencyManagement>
<dependencies>
<!-- zookeeper -->
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-client</artifactId>
<version>${curator.version}</version>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-framework</artifactId>
<version>${curator.version}</version>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>${curator.version}</version>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-x-discovery</artifactId>
<version>${curator.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
11 changes: 11 additions & 0 deletions waggle-dance-core/pom.xml
@@ -144,6 +144,17 @@
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-service</artifactId>
<version>${hive.version}</version>
<exclusions>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-runner</artifactId>
</exclusion>
</exclusions>
</dependency>


@@ -90,10 +90,16 @@ public PrefixBasedDatabaseMappingService(

private void add(AbstractMetaStore metaStore) {
MetaStoreMapping metaStoreMapping = metaStoreMappingFactory.newInstance(metaStore);

DatabaseMapping databaseMapping = createDatabaseMapping(metaStoreMapping);

if (metaStore.getFederationType() == PRIMARY) {
primaryDatabaseMapping = databaseMapping;
if (!metaStoreMapping.isAvailable()) {
throw new WaggleDanceException(
String.format("Primary metastore is unavailable %s", metaStore.getRemoteMetaStoreUris())
);
}
}

mappingsByPrefix.put(metaStoreMapping.getDatabasePrefix(), databaseMapping);
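
A note on the exception message in the hunk above: `String.format` uses `%s`-style specifiers, whereas `{}` is the SLF4J logging placeholder. The standalone sketch below (class and method names are illustrative only, and the URI is a made-up sample value) shows how the two behave when passed to `String.format`.

```java
// Illustrative only: compares "{}" and "%s" when used with String.format.
class FormatPlaceholderDemo {

    static String withBraces(String uris) {
        // "{}" is literal text to String.format; the extra argument is silently ignored.
        return String.format("Primary metastore is unavailable {}", uris);
    }

    static String withPercentS(String uris) {
        // "%s" is a real format specifier and is replaced by the argument.
        return String.format("Primary metastore is unavailable %s", uris);
    }

    public static void main(String[] args) {
        String uris = "thrift://host:9083"; // sample value, not from the PR
        System.out.println(withBraces(uris));
        System.out.println(withPercentS(uris));
    }
}
```

Only the `%s` form includes the metastore URIs in the resulting message.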