Golang client for Impala

Because the Apache Impala driver for Golang does not support Kerberos authentication yet, we needed to find a workaround. Such a difficult task (:

Clients most commonly connect to Impala on one of two ports: 21000 (Beeswax) and 21050 (HiveServer2).



There are several options for Kerberos authentication, but the simplest is to generate a Kerberos ticket manually before connecting to Impala. In production we actually run kinit before opening any connection to Impala.

The first step is to install Kerberos, then get the krb5.conf and keytab files, and then generate the ticket:

sudo apt install krb5-user
kinit -V -kt /tmp/my.user.keytab my.user
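Running kinit before opening a connection can also be done from the Go program itself. The following is a minimal sketch, assuming the MIT `kinit` binary is on the PATH and accepts the usual `-k -t <keytab> <principal>` form. The helper names `kinitArgs` and `refreshTicket` are made up for illustration, and the principal and keytab path are the example values used above.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// kinitArgs builds the arguments for a keytab-based kinit:
// -V verbose, -k use a keytab, -t path to the keytab, then the principal.
func kinitArgs(principal, keytab string) []string {
	return []string{"-V", "-k", "-t", keytab, principal}
}

// refreshTicket runs kinit and includes its combined output in the error
// if the command fails.
func refreshTicket(ctx context.Context, principal, keytab string) error {
	cmd := exec.CommandContext(ctx, "kinit", kinitArgs(principal, keytab)...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("kinit failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Example principal and keytab path; replace with your own.
	if err := refreshTicket(ctx, "my.user", "/tmp/my.user.keytab"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("ticket obtained, the Impala connection can be opened now")
}
```

Tickets expire, so a long-running service would probably call something like `refreshTicket` periodically rather than only once at startup.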



The Golang package impalathing implements the Thrift interface and connects to Impala on port 21000.

When we used it in production we had some issues: it executed insert queries for 10-15 minutes and then failed with “write tcp x.x.x.x:13196->x.x.x.x:21000: write: broken pipe”.

We found out that the big data team could not allocate more resources on that port, so we started looking for another solution that connects to the HiveServer2 port.

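// Host is a placeholder (assumption); 21000 is the Beeswax port.
conn, err := impalathing.Connect(
    "impala.example.com", 21000,
    func() impalathing.Option {
      return func(o *impalathing.Options) {
        // Authenticate over SASL/GSSAPI (Kerberos) with the kinit ticket.
        o.SaslTransportConfig = map[string]string{
          "mechanismName": "GSSAPI",
          "service":       "impala",
        }
        o.ConnectionTimeout = 1
      }
    }(), // call the closure so an impalathing.Option is passed
)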



Gohive implements the Thrift interface and connects to Impala on port 21050.

This package was created for the “Spark Distributed SQL Engine”, and we had to adapt it to work with Impala.

You can run queries synchronously or asynchronously.

It works smoothly in production. In synchronous mode, when the query pool is full, the call blocks until the query is processed, so you will have to take this into consideration.

configuration := NewConnectConfiguration()
configuration.Service = "impala"
configuration.TLSConfig = &tls.Config{} // just enables SSL; we did not need any certificates
configuration.HiveConfiguration = map[string]string{
     "request_pool": "prod.pool",
}
connection, errConn := Connect("impala.example.com", 21050, "KERBEROS", configuration)


We ended up using Gohive, and it has proved reliable in our production environment.