Tables

Tower Tables make it easy to get started with Apache Iceberg. They provide methods for accessing and processing tabular and semi-structured data (e.g. tables with nested fields, stored via the VARIANT data type).

Overview

Tower offers two main components for working with tables:

  • The Table class: A wrapper around Iceberg tables that provides methods for reading and writing data
  • The tables helper function: A convenient way to create and access tables

Creating Tables

To create a table, you need to:

  1. Define its schema as an Arrow schema (for example, with pyarrow's pa.schema())
  2. Call either create_if_not_exists() or create() with that schema

Here's a basic example:

import pyarrow as pa
import tower

# Arrow schema describing the table's columns
SCHEMA = pa.schema([
    ("col1", pa.string()),
    ("col2", pa.float64()),
    # ...
])

# Create the table unless it already exists
mytable = tower.tables('mytable').create_if_not_exists(SCHEMA)

The returned mytable object is an instance of Tower's Table class, which provides a unified interface for working with different types of tables. Currently, Tower supports Apache Iceberg tables.

Catalogs and Namespaces

Tower Tables are aware of the catalogs defined in Tower. Using the tables helper saves you from writing boilerplate code to set environment variables.

The example above assumes you're creating tables in the 'default' namespace of the 'default' catalog. For more examples of table creation with different catalogs and namespaces, see our Working with Tables guide.
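
As a rough illustration of how those defaults behave, the sketch below models a fully qualified table identifier in plain Python. It is a toy model only: the TableId type, the resolve() function, and its catalog and namespace parameter names are hypothetical and are not taken from Tower's API.

```python
from typing import NamedTuple


class TableId(NamedTuple):
    """Hypothetical fully qualified identifier: catalog, namespace, table name."""
    catalog: str
    namespace: str
    name: str


def resolve(name, catalog="default", namespace="default"):
    # Anything the caller leaves out falls back to 'default',
    # mirroring the defaults described in the text above.
    return TableId(catalog, namespace, name)


implicit = resolve("mytable")
explicit = resolve("mytable", catalog="analytics", namespace="sales")
```

Here implicit points at 'mytable' in the 'default' namespace of the 'default' catalog, while explicit names a different (made-up) catalog and namespace.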

Table Operations

TableReference Methods

The tables helper returns a TableReference object with these methods:

Table Creation

  • create_if_not_exists() - Creates a table with the specified schema if it doesn't already exist
  • create() - Creates a table with the specified schema (fails if the table already exists)
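
The difference between the two creation methods can be sketched with a small in-memory stand-in. This is not Tower's implementation: ToyCatalog and TableExistsError are hypothetical names, and the source does not say which error create() raises or whether create_if_not_exists() checks an existing table's schema. In this toy, an existing table is returned unchanged.

```python
# Toy in-memory stand-in for a catalog; NOT Tower's implementation.
class TableExistsError(Exception):
    """Hypothetical error type; the source does not name the real one."""


class ToyCatalog:
    def __init__(self):
        self._tables = {}  # table name -> schema (here: a list of column names)

    def create(self, name, schema):
        # Strict variant: refuse to proceed if the table is already there.
        if name in self._tables:
            raise TableExistsError(f"table {name!r} already exists")
        self._tables[name] = schema
        return self._tables[name]

    def create_if_not_exists(self, name, schema):
        # Idempotent variant: return the existing table unchanged if present.
        if name not in self._tables:
            self._tables[name] = schema
        return self._tables[name]


catalog = ToyCatalog()
first = catalog.create_if_not_exists("mytable", ["col1", "col2"])
second = catalog.create_if_not_exists("mytable", ["other"])  # no-op: table exists

try:
    catalog.create("mytable", ["col1", "col2"])
    create_failed = False
except TableExistsError:
    create_failed = True
```

The idempotent variant suits code that runs repeatedly, such as a scheduled job, while the strict variant is useful when silently reusing an existing table would hide a mistake.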

Table Access

  • load() - Gets a reference to an existing table and loads its metadata

Table Methods

Once you have a Table object (for example, the one returned by create_if_not_exists()), you can perform these operations:

Schema Operations

Reading Data

  • to_polars() - Returns a Polars LazyFrame for efficient data processing
  • read() - Reads the entire table into memory as a Polars DataFrame
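
To show why a lazy result can be cheaper than reading everything up front, here is a plain-Python toy model of the two styles. It uses neither Polars nor Tower: scan(), read_eager(), and LazyQuery are hypothetical names invented for the illustration.

```python
# Toy model of lazy vs. eager reads; NOT Polars or Tower code.
ROWS = [
    {"col1": "a", "col2": 1.0},
    {"col1": "b", "col2": 2.5},
    {"col1": "c", "col2": 4.0},
]

scanned = []  # records every row actually pulled from "storage"


def scan():
    """Yield rows one at a time, like a scan over table files."""
    for row in ROWS:
        scanned.append(row["col1"])
        yield row


def read_eager():
    # Eager: materialize every row right away.
    return list(scan())


class LazyQuery:
    """Collects filters and only touches the data when collect() is called."""

    def __init__(self, predicates=()):
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Building the plan does no I/O; it returns a new, longer plan.
        return LazyQuery(self._predicates + (predicate,))

    def collect(self):
        return [r for r in scan() if all(p(r) for p in self._predicates)]


plan = LazyQuery().filter(lambda r: r["col2"] > 2.0)
scanned_before_collect = list(scanned)  # still empty: nothing has run yet
result = plan.collect()
```

In Polars itself, the step that triggers execution of a LazyFrame is likewise called collect(). A real query engine can also use the recorded filters to skip data entirely, which this toy does not attempt.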

Writing Data

  • insert() - Inserts new data into the table
  • upsert() - Updates existing rows or inserts new ones
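
The semantic difference between the two write methods can be sketched on plain lists of row dictionaries. This is a toy model, not Tower's implementation: the functions below and the key parameter (the column used to match rows) are hypothetical, and the source does not state how Tower's upsert() identifies matching rows.

```python
# Toy model of insert vs. upsert on a list of row dicts; NOT Tower's implementation.
def insert(rows, new_rows):
    """Append every new row, even if a row with the same key already exists."""
    return rows + list(new_rows)


def upsert(rows, new_rows, key):
    """Replace rows whose `key` value matches a new row; append the rest.

    `key` is a hypothetical parameter naming the column used for matching.
    """
    merged = {row[key]: row for row in rows}  # existing rows indexed by key
    for row in new_rows:
        merged[row[key]] = row  # overwrite a match, or add a new entry
    return list(merged.values())


existing = [{"id": 1, "price": 10.0}, {"id": 2, "price": 20.0}]
incoming = [{"id": 2, "price": 25.0}, {"id": 3, "price": 30.0}]

after_insert = insert(existing, incoming)           # id 2 now appears twice
after_upsert = upsert(existing, incoming, key="id")  # id 2 updated, id 3 added
```

Insert fits append-only data such as event logs, while upsert fits data where each key should appear once and later values replace earlier ones.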

Data Management

  • delete() - Removes data from the table based on specified conditions
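
Condition-based deletion can be pictured with the following plain-Python sketch. It is a toy model only: delete_where() is a hypothetical helper, and the source does not specify how conditions are expressed when calling Tower's delete(), so a simple Python callable stands in for the condition here.

```python
# Toy model of condition-based deletion; NOT Tower's implementation.
def delete_where(rows, predicate):
    """Return the rows that survive, plus the number of rows removed."""
    kept = [row for row in rows if not predicate(row)]
    return kept, len(rows) - len(kept)


rows = [
    {"col1": "a", "col2": 1.0},
    {"col1": "b", "col2": 2.5},
    {"col1": "c", "col2": 4.0},
]

# Remove every row whose col2 value is below 2.0.
remaining, removed = delete_where(rows, lambda row: row["col2"] < 2.0)
```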