Develop apps
In Tower, you will be developing data applications, or “apps” for short. All source code-based data engineering artifacts - ETL/ELT pipelines, batch inference jobs, scripts, tasks of a larger DAG - are “apps” in Tower’s parlance. If you wondered what other data engineering artifacts there are, the answer is: "Tables! Data Sets!".
To manage and run one of these apps in Tower, you should install the Tower CLI and create an app in Tower.
Install the Tower CLI
If you haven’t already, run the following in the shell:
pip install -U tower
Log in to Tower
Run in the shell
tower login
Create an app in Tower
tower apps create --name="hello-world"
You will next prepare the code artifacts and a special manifest file called the Towerfile.
Navigate to the folder with the source code
If you maintain your app code in a git repo, clone it. For example, this is how you can clone Tower's example repo.
git clone https://github.com/tower/tower-examples
Otherwise, just change to the directory with the source code
cd tower-examples/01-hello-world
Prepare a manifest file (Towerfile)
In your favorite text editor (vim, nano, Cursor, etc.), create a TOML-formatted file named Towerfile with at least the following keys: name, script, and source (see the next sections for tips on what to put into the Towerfile).
Tower can deal with pretty much any Python code that you've already written. Anything that you can run by calling python script.py will also run in Tower. Tower does expect certain files to be present in your project folder, though.
Main script: At a minimum, your Tower app needs a main script containing your app logic. You specify the main script in the script section of the Towerfile. Tower currently supports Python scripts and shell scripts. With a shell script, you can specify several steps that are executed in sequence. For example, here is a shell script that builds a dlt+ project example.
dlt project clean
dlt pipeline -l
dlt pipeline events_to_lake run
dlt transformation . run
dlt dataset reports_dataset info
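If you prefer to keep the main script in Python, the same sequence can be sketched with `subprocess`. This is a minimal sketch, not Tower's own mechanism: the helper `run_steps` and the constant `DLT_STEPS` are hypothetical names, and the commands are copied from the shell script above. One deliberate difference: the sketch stops at the first failing step, whereas a plain shell script without `set -e` keeps going.

```python
import subprocess

# The same steps as the shell script above, one argument list per command.
DLT_STEPS = [
    ["dlt", "project", "clean"],
    ["dlt", "pipeline", "-l"],
    ["dlt", "pipeline", "events_to_lake", "run"],
    ["dlt", "transformation", ".", "run"],
    ["dlt", "dataset", "reports_dataset", "info"],
]


def run_steps(steps: list[list[str]]) -> int:
    """Run each command in order and return how many steps completed.

    check=True raises subprocess.CalledProcessError on a non-zero exit
    code, so later steps do not run after a failure.
    """
    completed = 0
    for command in steps:
        subprocess.run(command, check=True)
        completed += 1
    return completed
```

In a main script you would call `run_steps(DLT_STEPS)`; this requires the `dlt` command to be installed in the app's environment.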
requirements.txt file: If you need any Python libraries installed, you will need to provide a requirements.txt file. Create one in your project folder, and specify it in the source section of the Towerfile.
Other source and data files: If you have other source files, perhaps in folders, or if you want to reference data files from your code, add them to the source section of your Towerfile.
source = [
"./.dlt/config.toml",
"./**/*.py",
"requirements.txt",
"./_data/*",
"./task.sh",
"./dlt_project.yml"
]
Change your app to pass secrets and define app parameters
While Tower lets you run pretty much any Python code, you will want to customize the behavior of your app when it runs in Tower. In Tower you do this by defining app parameters.
You will also want to remove hard-coded database connection settings and passwords from your source code, before they get checked into a code repository. At runtime, you will want to pass the database credentials and other config settings to the app. In Tower you can define secrets for that.
Both secrets and parameters are injected into your application as environment variables. Secrets are passed to your application implicitly, without you specifying them on the command line, while parameters must be specified on the Tower command line (via --parameter=X=Y) when running the app.
When deciding which variables to pass as secrets or parameters, consider the following.
Best to pass as secrets:
- Variables that your teammates should not be able to explicitly read (e.g. credentials to your databases and cloud access keys)
- Variables that your app needs across many of its runs and not just for one run
- Variables that all of your apps need to have access to
- Variables that you want to differentiate by the environment, e.g., have different values for "production" and "test"
To pass a secret to all apps in your account on every run, create a new secret:
tower secrets create --name=snowflake_url \
--value=https://abc123.snowflake.com
Best to pass as parameters:
- Variables whose values change from app run to app run
- Variables that are not secret in nature and are safe to share with your teammates
To pass a parameter to an application at run time, you will specify the value of this parameter on the command line, when running the app.
tower run --parameter=iceberg_table="default.trade_stats"
Values of secrets and parameters can be accessed in your app code by calling os.getenv.
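import os

# Secret created earlier with `tower secrets create --name=snowflake_url ...`
snowflake_url = os.getenv("snowflake_url")
# Parameter passed at run time with `tower run --parameter=iceberg_table=...`
iceberg_table_name = os.getenv("iceberg_table")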
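Because `os.getenv` returns `None` when a variable is missing, it can help to fail early with a clear message instead of passing `None` further into the app. The following is a minimal sketch of that idea, built on the two variables used above. The names `MissingSettingError`, `require_env`, and `load_settings` are hypothetical, and the fallback value for `iceberg_table` is an assumption chosen for local runs, not a Tower default.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when an expected environment variable is absent or empty."""


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise MissingSettingError(f"environment variable {name!r} is not set")
    return value


def load_settings() -> dict[str, str]:
    """Collect the app's settings from the environment in one place."""
    return {
        # Secret: the app cannot do anything useful without it, so it is required.
        "snowflake_url": require_env("snowflake_url"),
        # Parameter: falls back to an assumed default when not passed on the command line.
        "iceberg_table": os.getenv("iceberg_table", "default.trade_stats"),
    }
```

Calling `load_settings()` once at the top of the main script keeps all environment access in one place, which makes it easier to see which secrets and parameters the app expects.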
For well-known environment variables, such as AWS_REGION, AWS_ACCESS_KEY_ID, or PYICEBERG_CATALOG__DEFAULT__URI, you don’t have to explicitly get their values and then use them in your code, because the libraries that you will be using (e.g. s3fs or pyiceberg) check the values of these well-known environment variables automatically.