Spacetraders 2: PostgreSQL - Building a Local Game State with ELT

SpaceTraders is an API-first game where all interactions are performed through HTTP requests and JSON responses. After establishing the OpenAPI and API-consumption foundations, the next step is to persist game state locally. This entry documents a practical PostgreSQL setup for SpaceTraders gameplay, focusing on Python integration, common setup pitfalls, and an ELT-style workflow where raw API responses are stored first and transformed later. It serves both as a learning guide and as a technical archive, with reusable code patterns for database creation, schema organization, and raw data ingestion.

 

From API calls to local state

In the first entry of this series, SpaceTraders was introduced as a fully API-driven game. Every action (querying markets, navigating ships, accepting contracts) returns a structured JSON response.

As gameplay progresses, two practical needs emerge:

  • keeping track of state across many API calls

  • analyzing changes over time (prices, cooldowns, resources, progression)

Relying solely on in-memory objects or temporary notebooks quickly becomes fragile. A local database provides:

  • persistence

  • inspectability

  • reproducibility

  • a clean separation between data collection and decision logic

PostgreSQL is a natural choice for this role.


Why PostgreSQL for SpaceTraders?

PostgreSQL fits well between lightweight tools (such as SQLite) and analytical platforms (such as Snowflake or Databricks):

  • strong typing and constraints

  • native JSON support (JSONB)

  • transactional safety

  • mature Python drivers

  • a schema system that encourages clean structure

Most importantly, PostgreSQL supports incremental learning: it can start simple and grow naturally as the project evolves.


One database per reset: a weekly workflow

SpaceTraders resets on a regular schedule. Instead of reusing the same database indefinitely, a clean strategy is:

  • one PostgreSQL database per reset

  • identical schema structure each time

  • disposable databases

This approach provides:

  • clean restarts

  • easy recovery from mistakes

  • consistent comparison between weeks

  • confidence that scripts are reusable and robust

In practice, databases might follow a naming pattern such as:

spacetraders_2025_w01
spacetraders_2025_w02
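
A small helper can generate these names automatically. The sketch below derives the name from the ISO week number; the helper name and exact format are illustrative choices, not part of any API:

from datetime import date

def weekly_db_name(today: date | None = None) -> str:
    # Build a name like "spacetraders_2025_w01" from the ISO calendar week
    today = today or date.today()
    year, week, _ = today.isocalendar()
    return f"spacetraders_{year}_w{week:02d}"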

PostgreSQL + Python: minimal setup

Python packages

A modern, Windows-friendly stack:

pip install sqlalchemy psycopg[binary]
  • psycopg (version 3) is the PostgreSQL driver

  • the [binary] extra ships precompiled binaries, so no local compiler or client libraries are needed

  • SQLAlchemy provides a clean abstraction layer


Creating an engine

from sqlalchemy import create_engine

# Connect to the built-in "postgres" maintenance database (placeholder credentials)
admin_engine = create_engine(
    "postgresql+psycopg://user:password@localhost:5432/postgres"
)

The postgres database is used as a maintenance database for administrative operations such as creating or dropping other databases.
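
As a quick sanity check, the engine can be exercised with a trivial query (a minimal sketch, assuming the placeholder credentials above):

from sqlalchemy import text

with admin_engine.connect() as conn:
    print(conn.execute(text("SELECT version()")).scalar())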


Creating databases safely from Python

PostgreSQL does not allow CREATE DATABASE inside a transaction. Since SQLAlchemy implicitly starts transactions, this must be handled explicitly.

Creating a database if missing

from sqlalchemy import text

def create_database_if_missing(engine, db_name: str) -> bool:
    # Check the system catalog for an existing database with this name
    with engine.connect() as conn:
        exists = conn.execute(
            text("SELECT 1 FROM pg_database WHERE datname = :name"),
            {"name": db_name}
        ).fetchone()

    if exists:
        return False

    # CREATE DATABASE cannot run inside a transaction,
    # so the connection is switched to autocommit mode
    with engine.connect().execution_options(
        isolation_level="AUTOCOMMIT"
    ) as conn:
        conn.execute(text(f'CREATE DATABASE "{db_name}"'))

    return True

This function can be reused verbatim across resets.
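
Putting the pieces together, a start-of-week setup might look like this (reusing the illustrative weekly_db_name helper from earlier; credentials remain placeholders):

from sqlalchemy import create_engine

db_name = weekly_db_name()  # e.g. "spacetraders_2025_w05"
create_database_if_missing(admin_engine, db_name)

# A second engine pointing at the weekly database itself
engine = create_engine(
    f"postgresql+psycopg://user:password@localhost:5432/{db_name}"
)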


Schemas as structure, not bureaucracy

Inside each weekly database, schemas are used to express data responsibility rather than access control.

A minimal, practical layout:

  Schema     Purpose
  raw        Raw API responses as ingested
  core       Cleaned, structured game entities
  metrics    Aggregates, statistics, derived insights

Schemas act like namespaces or folders inside a database.

Creating schemas

with engine.begin() as conn:
    conn.execute(text("CREATE SCHEMA IF NOT EXISTS raw"))
    conn.execute(text("CREATE SCHEMA IF NOT EXISTS core"))
    conn.execute(text("CREATE SCHEMA IF NOT EXISTS metrics"))

Using fully qualified names (e.g. raw.api_events) avoids ambiguity and makes SQL self-documenting.


ELT over ETL: storing raw data first

Instead of transforming API responses in memory before insertion, an ELT approach is used:

  1. Extract data from the API

  2. Load raw responses into the database

  3. Transform into clean tables later

Benefits in the SpaceTraders context:

  • raw responses are preserved

  • transformations can be rewritten without refetching data

  • debugging is easier

  • historical analysis becomes possible

The database becomes the source of truth.


Hashes: detecting identical payloads efficiently

A hash is a fixed-length fingerprint derived from data content.

For API payloads:

  • the JSON object is serialized in a canonical form

  • a hash (e.g. SHA-256) is computed

  • identical payloads produce identical hashes

This allows fast equality checks without comparing large JSON blobs. Even when deduplication is not used initially, storing hashes enables future optimizations at very low cost.
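
As a sketch of one such optimization, the query below counts payloads that were stored more than once. It assumes the raw.api_events table defined in the next section, and the function name is illustrative:

from sqlalchemy import text

def count_duplicate_payloads(engine):
    # Identical payloads share a hash, so groups larger than 1 are repeats
    with engine.connect() as conn:
        return conn.execute(text("""
            SELECT payload_hash, COUNT(*) AS copies
            FROM raw.api_events
            GROUP BY payload_hash
            HAVING COUNT(*) > 1
        """)).fetchall()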


Raw ingestion table design

A single, generic raw ingestion table is sufficient to start.

Example structure

CREATE TABLE IF NOT EXISTS raw.api_events (
    id BIGSERIAL PRIMARY KEY,
    fetched_at TIMESTAMPTZ,
    endpoint TEXT,
    request_key TEXT,
    status_code INTEGER,
    payload JSONB,
    payload_hash TEXT
);

Design principles:

  • append-only

  • minimal constraints

  • no transformations

  • suitable for all endpoints
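
The table can be created from Python with the weekly engine, repeating the DDL above for completeness (a minimal sketch):

from sqlalchemy import text

with engine.begin() as conn:
    conn.execute(text("""
        CREATE TABLE IF NOT EXISTS raw.api_events (
            id BIGSERIAL PRIMARY KEY,
            fetched_at TIMESTAMPTZ,
            endpoint TEXT,
            request_key TEXT,
            status_code INTEGER,
            payload JSONB,
            payload_hash TEXT
        )
    """))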


Canonical JSON and hashing (Python helpers)

import json
import hashlib
from datetime import datetime, timezone

def utc_now():
    return datetime.now(timezone.utc)

def canonical_json(obj):
    return json.dumps(
        obj,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False
    )

def sha256_hex(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

Canonical serialization ensures logically identical JSON objects produce the same hash.
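
A quick check illustrates this: two dicts with different key order (made-up values) produce the same canonical string and therefore the same hash:

a = {"credits": 100, "symbol": "AGENT-1"}
b = {"symbol": "AGENT-1", "credits": 100}

assert canonical_json(a) == canonical_json(b)
assert sha256_hex(canonical_json(a)) == sha256_hex(canonical_json(b))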

API → database ingestion pattern

Reference function skeleton

import requests
from sqlalchemy import text

def fetch_and_store(
    engine,
    base_url,
    endpoint,
    params=None,
    headers=None,
):
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    fetched_at = utc_now()

    r = requests.get(url, headers=headers, params=params, timeout=10)
    r.raise_for_status()

    payload_obj = r.json()
    payload_str = canonical_json(payload_obj)
    payload_hash = sha256_hex(payload_str)

    # Path plus query string of the prepared request, a stable key for this call
    request_key = r.request.path_url

    with engine.begin() as conn:
        conn.execute(
            text("""
                INSERT INTO raw.api_events (
                    fetched_at, endpoint, request_key,
                    status_code, payload, payload_hash
                )
                VALUES (
                    :fetched_at, :endpoint, :request_key,
                    :status_code, CAST(:payload AS JSONB), :payload_hash
                )
            """),
            {
                "fetched_at": fetched_at,
                "endpoint": endpoint,
                "request_key": request_key,
                "status_code": r.status_code,
                "payload": payload_str,
                "payload_hash": payload_hash,
            }
        )

This function is intentionally generic and reusable across endpoints.
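
A typical call against the public SpaceTraders v2 API might look like this (the token is a placeholder obtained at agent registration):

TOKEN = "..."  # placeholder agent token

fetch_and_store(
    engine,
    base_url="https://api.spacetraders.io/v2",
    endpoint="/my/ships",
    headers={"Authorization": f"Bearer {TOKEN}"},
)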


Reading raw data back for transformation

Latest payload for an endpoint

def load_latest_payload(engine, endpoint: str):
    with engine.connect() as conn:
        row = conn.execute(
            text("""
                SELECT payload
                FROM raw.api_events
                WHERE endpoint = :endpoint
                ORDER BY fetched_at DESC
                LIMIT 1
            """),
            {"endpoint": endpoint}
        ).fetchone()

    if row is None:
        raise LookupError(f"No data found for endpoint {endpoint!r}")

    return row[0]

This pattern is typically used to populate core tables.
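
As an illustration, the sketch below upserts the latest agent summary into a hypothetical core.agent table. The table layout and the /my/agent response fields (data.symbol, data.credits) are assumptions for the example:

from sqlalchemy import text

def refresh_core_agent(engine):
    # Hypothetical transform: latest raw /my/agent payload -> one core.agent row
    agent = load_latest_payload(engine, "/my/agent")["data"]
    with engine.begin() as conn:
        conn.execute(text("""
            CREATE TABLE IF NOT EXISTS core.agent (
                symbol TEXT PRIMARY KEY,
                credits BIGINT,
                updated_at TIMESTAMPTZ
            )
        """))
        conn.execute(text("""
            INSERT INTO core.agent (symbol, credits, updated_at)
            VALUES (:symbol, :credits, :updated_at)
            ON CONFLICT (symbol) DO UPDATE
                SET credits = EXCLUDED.credits,
                    updated_at = EXCLUDED.updated_at
        """), {
            "symbol": agent["symbol"],
            "credits": agent["credits"],
            "updated_at": utc_now(),
        })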


Dropping databases between resets

Dropping a database requires:

  • connecting to a different database (postgres)

  • terminating existing connections

  • running DROP DATABASE in autocommit mode

Reference pattern

def drop_database_force(admin_engine, db_name: str):
    with admin_engine.connect().execution_options(
        isolation_level="AUTOCOMMIT"
    ) as conn:
        conn.execute(text("""
            SELECT pg_terminate_backend(pid)
            FROM pg_stat_activity
            WHERE datname = :dbname
              AND pid <> pg_backend_pid()
        """), {"dbname": db_name})

        conn.execute(text(f'DROP DATABASE IF EXISTS "{db_name}"'))

This enables clean weekly resets without manual cleanup. Forcibly dropping a database is unlikely to become a routine step, but when something does go wrong, it is a useful snippet to have in the toolbox.
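
Usage is a single call against the maintenance engine (the database name is illustrative):

drop_database_force(admin_engine, "spacetraders_2025_w01")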


Why this structure scales with gameplay

This setup intentionally favors:

  • clarity over cleverness

  • inspectable data over hidden logic

  • reproducibility over convenience

As gameplay becomes more complex, new tables, schemas, or transformations can be introduced incrementally, without rewriting the foundation.

Looking ahead

With PostgreSQL in place, future entries in this series can build on this foundation by:

  • defining core tables for ships, markets, contracts

  • tracking changes over time

  • automating decision loops

  • comparing progression across resets

Each step remains grounded in the same API-first, data-first workflow.

Next

Spacetraders 1: OpenAPI and API-first gameplay foundations