impala

package module
v1.7.0 Latest
Published: May 25, 2026 License: MIT Imports: 23 Imported by: 3

README

Golang Apache Impala Driver

project logo - gopher with impala horns

The actively supported Apache Impala driver for Go's database/sql package

This driver started as a fork of github.com/bippio/go-impala, which hasn't been updated in over four years and appears to be abandoned. Several issues, some quite severe, have been fixed since then. The original codebase also didn't support Go modules.


Install

Add impala-go to your Go module:

go get github.com/sclgo/impala-go

Alternatively, see the CLI section for how to use it from the command line. impala-go does not use CGO.

Connection Parameters and DSN

The data source name (DSN; connection string) uses a URL format: impala://username:password@host:port?param1=value&param2=value

The driver name is impala.

Parameters:
  • auth - string. Authentication mode. Supported values: noauth, ldap.
  • tls - boolean. Enables TLS.
  • ca-cert - string. Path to the file that contains the public key certificate of the CA that signed the Impala certificate.
  • batch-size - integer value (default: 1024). Maximum number of rows fetched per request.
  • buffer-size - integer value in bytes (default: 4096). Buffer size for the Thrift transport.
  • mem-limit - string value (example: 3m). Memory limit for query, as a share of available RAM or a fixed value. See https://impala.apache.org/docs/build/html/topics/impala_mem_limit.html for details.
  • query-timeout - integer value in seconds. Query timeout - see https://impala.apache.org/docs/build/html/topics/impala_query_timeout_s.html for details.
  • socket-timeout - integer or string value (default: 5s). The maximum socket idle time, expressed as a duration with a unit suffix (for example, 5s). If the value is an integer without a time unit, milliseconds are assumed.
  • connect-timeout - integer or string value (default: 10s). The maximum wait for the initial connection to the server, expressed as a duration with a unit suffix (for example, 10s). If the value is an integer without a time unit, milliseconds are assumed.
  • tls-insecure-skip-verify - boolean. Disables TLS certificate verification by enabling the tls.Config.InsecureSkipVerify option. Behaves the same way as AllowSelfSignedCerts in the official JDBC driver.
  • reuse-session - boolean. Disables resetting the session when database/sql requests it. When this setting is enabled, this driver behaves consistently with the other DB drivers in the ecosystem but diverges somewhat from documented database/sql behavior. This setting is disabled by default for backward compatibility and alignment with published Go documentation. It must be enabled when this driver is used in github.com/xo/usql. usql returns the connection to the pool after each statement, relying on the typical driver behavior.

A DSN in the URL format above can be constructed using the URL type in the net/url package.

  query := url.Values{}
  query.Add("auth", "ldap")

  u := &url.URL{
      Scheme:   "impala",
      User:     url.UserPassword(username, password),
      Host:     net.JoinHostPort(hostname, port),
      RawQuery: query.Encode(),
  }
  db, err := sql.Open("impala", u.String())

Also, you can bypass the string-based data source name by using sql.OpenDB:

  opts := impala.DefaultOptions
  opts.Host = hostname
  opts.UseLDAP = true
  opts.Username = username
  opts.Password = password

  connector := impala.NewConnector(&opts)
  db := sql.OpenDB(connector) // sql.OpenDB returns only *sql.DB, no error

Impala supports numerous other session options, which can be configured with the SET statement. mem-limit and query-timeout are the only two such options that the driver supports as part of the DSN; these DSN fields are an exception kept for backwards compatibility. The preferred way to set any session option is to issue SET statements on a SQL connection. Users may find it useful to wrap the driver.Connector returned by impala.NewConnector so that a set of session options is automatically applied to all created connections.

CLI

impala-go is included in xo/usql - the universal SQL CLI, inspired by psql.

Install usql, start it, then on its prompt, run:

\connect impala DSN

where DSN is a data source name in the format above. Review the usql documentation for other options.

The latest version of usql typically comes with the latest version of impala-go, but if you need to use a different one, you can prepare a custom build using usqlgen. For example, the following command builds a usql binary in the working directory using impala-go from the master branch:

go run github.com/sclgo/usqlgen@latest build --get github.com/sclgo/impala-go@master -- -tags impala

usql with impala-go is arguably a better CLI for Impala than the official impala-shell. For one, usql is much easier to install.

Example Go code

package main

// Simple program to list databases and the tables

import (
	"context"
	"database/sql"
	"log"

	"github.com/sclgo/impala-go"
)

func main() {
	opts := impala.DefaultOptions

	opts.Host = "localhost" // impala host
	opts.Port = "21050"

	// enable LDAP authentication:
	//opts.UseLDAP = true
	//opts.Username = "<ldap username>"
	//opts.Password = "<ldap password>"
	//
	// enable TLS
	//opts.UseTLS = true
	//opts.CACertPath = "/path/to/cacert"

	connector := impala.NewConnector(&opts)
	db := sql.OpenDB(connector)
	defer func() {
		_ = db.Close()
	}()

	ctx := context.Background()

	rows, err := db.QueryContext(ctx, "SHOW DATABASES")
	if err != nil {
		log.Fatal(err)
	}

	var name, comment string
	databases := make([]string, 0) // databases will contain all the DBs to enumerate later
	for rows.Next() {
		if err := rows.Scan(&name, &comment); err != nil {
			log.Fatal(err)
		}
		databases = append(databases, name)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
	log.Println("List of Databases", databases)

	tables, err := impala.NewMetadata(db).GetTables(ctx, "%", "%")
	if err != nil {
		log.Fatal(err)
	}
	log.Println("List of Tables", tables)
}

See also the open data end-to-end demo.

Data types

Impala data types are mapped to Go types as expected, with the following exceptions:

  • "Complex" types - MAP, STRUCT, ARRAY - are not supported. Impala itself has limited support for those. As a workaround, select individual fields or flatten such values within select statements.
  • Decimals are converted to strings by the Impala server API. Either parse the decimal value after Rows.Scan, or use a custom sql.Scanner implementation in Row(s).Scan, e.g. Decimal from github.com/cockroachdb/apd. Note that the processing of sql.Scanner within Row(s).Scan is a feature of the database/sql package, and not of the driver. The ScanType of such values is string, while the DatabaseTypeName is DECIMAL. Retrieving precision and scale using the DecimalSize API is supported.
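The custom sql.Scanner approach for decimals can be sketched with the standard library alone. DecimalValue is a hypothetical type, not part of impala-go, and it uses math/big.Rat instead of the apd package mentioned above. Note that big.Rat.SetString also accepts fractions and exponents, which is more permissive than a strict decimal parser.

```go
package main

import (
	"database/sql"
	"fmt"
	"math/big"
)

// DecimalValue is a hypothetical sql.Scanner (not part of impala-go) that
// parses the string form of a DECIMAL column into an exact rational number.
type DecimalValue struct {
	Rat   *big.Rat
	Valid bool // false when the column was NULL
}

// Compile-time check that *DecimalValue satisfies sql.Scanner.
var _ sql.Scanner = (*DecimalValue)(nil)

// Scan accepts the string (or []byte) delivered for a DECIMAL value.
func (d *DecimalValue) Scan(src any) error {
	switch v := src.(type) {
	case nil:
		d.Rat, d.Valid = nil, false
		return nil
	case string:
		return d.parse(v)
	case []byte:
		return d.parse(string(v))
	default:
		return fmt.Errorf("unsupported source type %T for DecimalValue", src)
	}
}

func (d *DecimalValue) parse(s string) error {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return fmt.Errorf("invalid decimal %q", s)
	}
	d.Rat, d.Valid = r, true
	return nil
}

func main() {
	// With a real query this would be rows.Scan(&d); here Scan is called
	// directly so that the example runs without an Impala server.
	var d DecimalValue
	if err := d.Scan("123.45"); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println(d.Rat.FloatString(2))
}
```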

Context support

The driver methods recognize Context and support early cancellation in most cases. Additionally, the Query methods return early before all rows are retrieved. Exec methods return after the operation completes (this may be configurable in the future). Exec methods can still be stopped early by cancelling the context from another goroutine.

You can also use a QueryContext method on a sql.Conn for a DDL/DML statement if you need the method to return before the statement completes. In that case, calling Rows.Next will wait for the statement to complete and then return false.

Compatibility and Support

The library is actively tested with Impala 4.4 and 3.4. All 3.x and 4.x minor versions should work well. 2.x is also supported on a best-effort basis.

While Impala shares the majority of its API with Apache Hive, this driver doesn't support Hive. Instead, it is recommended to use a dedicated Hive driver or client. Please file an issue if you find it more valuable to use this driver with Hive compared to the existing drivers.

The library is not compatible with TinyGo because Thrift for Go doesn't support it. The Thrift code incompatible with TinyGo is not referenced by impala-go, but compilation fails nonetheless. Last checked with tinygo 0.41.1, thrift 0.23, and Go 1.26 on 2026-05-10. (Dev note: Any new release of Thrift or TinyGo may resolve the issue. Run make test-tinygo after updates to check again.) The Thrift team tracks progress on TinyGo support at https://issues.apache.org/jira/browse/THRIFT-5209.

File any issues that you encounter as GitHub issues.

Versioning

The library follows semantic versioning, as specified in SemVer 2.0.0, and will use semantic import versioning (SIV) if a 2.0 version is ever needed. However, the rare exceptions to the semantic versioning rules that are allowed for the Go language and standard library apply to this library as well. For example, a minor release can include a breaking change if the change was required to fix a security issue. Review the rest of the exceptions at https://go.dev/doc/go1compat#expectations.

The gorelease tool is included in CI to automate the detection of most semantic versioning violations.

The minimum Go version may increase in minor, not patch, releases following general practice. The last two Go minor releases will always be supported.

This library started as a fork of github.com/bippio/go-impala, under the MIT license. This library retains the same license.

The project logo combines the Golang Gopher from github.com/golang-samples/gopher-vector with the Apache Impala logo, licensed under the Apache 2 license.

Documentation

Index

Constants

This section is empty.

Variables

var (
	// ErrNotSupported means the driver does not support this operation
	ErrNotSupported = isql.ErrNotSupported

	// ErrOpenFailed means the driver failed to open a connection.
	// Following database/sql docs, this is a separate error from driver.ErrBadConn.
	// If the root cause is context.DeadlineExceeded, AuthError, or *tls.CertificateVerificationError,
	// that cause will be in the same error tree as this sentinel, likely as a sibling.
	ErrOpenFailed = errors.New("impala: failed to open connection")

	// ErrBadDSN means the driver failed to parse the DSN or the DSN contained incorrect values.
	// Another error in the tree will describe the specific issue.
	ErrBadDSN = errors.New("impala: bad DSN")
)
var (
	// DefaultOptions for impala driver
	DefaultOptions = Options{
		BatchSize:      1024,
		BufferSize:     4096,
		Port:           "21050",
		LogOut:         io.Discard,
		SocketTimeout:  5 * time.Second,
		ConnectTimeout: 10 * time.Second,
	}
)

Functions

func NewConnector

func NewConnector(opts *Options) driver.Connector

NewConnector creates a connector with the specified options.

Types

type AuthError added in v1.1.0

type AuthError = sasl.AuthError

AuthError indicates that there was an authentication or authorization failure. The error message includes the username used, if any. errors.Unwrap() returns the underlying error that was interpreted as an authentication failure, if any. This error will not be top-level in the chain/tree; earlier errors reflect the process during which the error happened.

type ColumnName added in v1.1.0

type ColumnName = hive.ColumnName

ColumnName contains all attributes that identify a column.

type ConnRawAccess added in v0.2.0

type ConnRawAccess interface {
	Raw(func(driverConn any) error) error
}

ConnRawAccess exposes the Raw method of sql.Conn

type Driver

type Driver struct{}

Driver to impala

func (*Driver) Open

func (d *Driver) Open(dsn string) (driver.Conn, error)

Open creates a new connection to impala using the given data source name. Implements driver.Driver. The returned error contains ErrOpenFailed in the chain/tree, along with the specific cause. See ErrOpenFailed about which causes are guaranteed to be reported. The API does not guarantee the order of errors in the tree.

func (*Driver) OpenConnector

func (d *Driver) OpenConnector(name string) (driver.Connector, error)

OpenConnector parses name as a DSN (data source name) and returns a connector with fixed options. Implements driver.DriverContext.

type Metadata added in v0.1.1

type Metadata struct {
	// contains filtered or unexported fields
}

Metadata exposes the schema and other metadata in an Impala instance

func NewMetadata added in v0.1.1

func NewMetadata(db *sql.DB) *Metadata

NewMetadata creates a Metadata instance with the given Impala DB as the data source. A new connection will be retrieved for each call. If that's an issue, use NewMetadataFromConn.

func NewMetadataFromConn added in v0.2.0

func NewMetadataFromConn(conn ConnRawAccess) *Metadata

NewMetadataFromConn creates a Metadata instance with the given Impala connection as the data source. *sql.Conn implements ConnRawAccess.

func (Metadata) GetColumns added in v1.1.0

func (m Metadata) GetColumns(ctx context.Context, schemaPattern string, tableNamePattern string, columnNamePattern string) ([]ColumnName, error)

GetColumns retrieves columns that match the provided LIKE patterns

func (Metadata) GetSchemas added in v0.2.0

func (m Metadata) GetSchemas(ctx context.Context, schemaPattern string) ([]string, error)

GetSchemas retrieves schemas that match the provided LIKE pattern

func (Metadata) GetTables added in v0.1.1

func (m Metadata) GetTables(ctx context.Context, schemaPattern string, tableNamePattern string) ([]TableName, error)

GetTables retrieves tables and views that match the provided LIKE patterns

type Options

type Options struct {
	Host     string
	Port     string
	Username string
	Password string

	// ReuseSession disables resetting the session when database/sql SPI requests it.
	// The connection and session will still be validated. database/sql asks to reset the session
	// when it reuses a connection from its pool.
	//
	// All popular drivers don't reset the connection session even though it is required
	// by the database/sql/driver SPI. When this setting is enabled, this driver behaves consistently
	// with the other DB drivers in the ecosystem but diverges somewhat from documented database/sql behavior.
	//
	// This setting must be enabled when this driver is used in github.com/xo/usql.
	// `usql` returns the connection to the pool after each statement, relying on the typical driver behavior.
	ReuseSession bool

	UseLDAP bool

	UseTLS     bool
	CACertPath string

	// TLSInsecureSkipVerify configures the tls.Config InsecureSkipVerify flag for
	// a TLS connection to Impala. Behaves the same way as AllowSelfSignedCerts in the official JDBC driver.
	TLSInsecureSkipVerify bool

	BufferSize int
	BatchSize  int

	// MemoryLimit configures the MEM_LIMIT Impala property for the connection
	// https://impala.apache.org/docs/build/html/topics/impala_mem_limit.html
	MemoryLimit string
	// QueryTimeout in seconds - for QUERY_TIMEOUT_S session configuration value
	// https://impala.apache.org/docs/build/html/topics/impala_query_timeout_s.html
	QueryTimeout int

	LogOut io.Writer

	// SocketTimeout configures the maximum socket idle time. 0 or negative value means no limit.
	// Configuring SocketTimeout together with setting a context deadline/timeout
	// also causes socket reads to be retried within the deadline (thrift behavior)
	SocketTimeout time.Duration

	// ConnectTimeout configures the max wait for initial connection to server. 0 or negative value means no limit.
	ConnectTimeout time.Duration
}

Options for the impala driver connection. It is recommended to copy DefaultOptions before customizing values. The zero value of Options is a valid, but not recommended, configuration. The default and recommended value of all fields is the zero value if not otherwise specified in DefaultOptions.

type TableName added in v0.2.0

type TableName = hive.TableName

TableName contains all attributes that identify a table.
