How to Pre-seed a Database Container for Development Environments

Table of Contents

Pre-seeding a database container simplifies the setup process for development environments, ensuring that developers start with a consistent dataset. This practice is particularly useful when working with microservices or multi-developer teams where database state consistency is critical.

By pre-seeding a database container, you can automatically initialize tables, insert sample data, and configure necessary settings when the container starts. This eliminates manual steps, reduces errors, and streamlines local development workflows.

Pre-seeding offers several benefits, including:

Confident deployments
Consistency across environments
Early issue detection
Improved collaboration between Database Administrators and Software Engineers

This tutorial will guide you through the process of pre-seeding a database container using Docker Compose, with step-by-step instructions for a PostgreSQL example.

Prerequisites

Before starting, ensure you have:

Docker and Docker Compose Installed: Follow the official Docker and Docker Compose installation guides if needed.
Basic Knowledge of Docker Compose: Familiarity with YAML files and Docker basics will help.
A Sample SQL Script: Create or obtain an SQL script containing the database schema and seed data.

Step 1: Create a Directory Structure for the Project

Create a new directory for your project:

mkdir db-preseed-demo
cd db-preseed-demo

Inside this directory, create a subdirectory for your SQL scripts and a docker-compose.yml file:
```
mkdir scripts
touch docker-compose.yml
```
Create a SQL script file inside the scripts directory:
```
touch scripts/seed.sql
```
Create a .env file for environment variables:
```
touch .env
```

This structure aligns with the best practices for Docker projects Pre-seed the database by bind-mounting a SQL script. It separates the SQL scripts into their own directory, which is a common practice for organizing database initialization scripts. The .env file is added to store environment variables, which is useful for managing sensitive information like database passwords Pre-seed the database using JavaScript code.

Your project structure should now look like this:

db-preseed-demo/
├── scripts/
│   └── seed.sql
├── docker-compose.yml
└── .env

Step 2: Write the SQL Seed Script

Open the scripts/seed.sql file in your preferred text editor:
```
nano scripts/seed.sql
```

Add the SQL commands to define the schema and seed data:

CREATE TABLE IF NOT EXISTS users (
  id SERIAL PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  email VARCHAR(100) UNIQUE NOT NULL
);

INSERT INTO users (name, email) VALUES
  ('Alice', 'alice@example.com'),
  ('Bob', 'bob@example.com')
ON CONFLICT (email) DO NOTHING;

Save and close the file.

This script creates a users table and inserts sample data. The improvements include:

Step 3: Define the Docker Compose Configuration

Open the docker-compose.yml file in your preferred text editor:
```
nano docker-compose.yml
```

Add the following configuration:

version: '3.8'

services:
  db:
    image: postgres:latest
    container_name: preseeded-db
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-devuser}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-devpassword}
      POSTGRES_DB: ${POSTGRES_DB:-devdb}
    volumes:
      - ./scripts:/docker-entrypoint-initdb.d
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "${POSTGRES_USER:-devuser}"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:

Save and close the file.
Create a .env file in the project root to store environment variables:
```
nano .env
```

Add the following content to the .env file:

POSTGRES_USER=devuser
POSTGRES_PASSWORD=devpassword
POSTGRES_DB=devdb

This configuration uses the official PostgreSQL image and mounts the scripts directory to the /docker-entrypoint-initdb.d/ directory. PostgreSQL automatically executes files in this directory when initializing the database.

Step 4: Start and Verify the Database Container

Start the container using Docker Compose:
```
docker compose up -d
```
The -d flag runs the container in detached mode, allowing it to run in the background Docker Compose file reference.
Watch the logs in real-time to ensure the database is initialized and the seed script runs:
```
docker compose logs -f db
```
The -f flag follows the log output, giving you live updates Use Docker Compose to run a database.
Wait until you see messages indicating that the database is ready for connections and the seed script has been executed.
Once the database is ready, you can verify that the pre-seeding was successful by connecting to the database and querying the users table:
```
docker compose exec db psql -U devuser -d devdb -c "SELECT * FROM users;"
```
This command executes a SQL query inside the running container to display the contents of the users table.
If you need to stop the container, you can use:
```
docker compose down
```
This command stops and removes the containers defined in your docker-compose.yml file.

Note: If you make changes to your seed.sql file and want to re-initialize the database, you'll need to remove the volume and recreate the container:

docker compose down -v
docker compose up -d

This ensures that the latest version of your seed script is executed.

Step 5: Verify and Interact with the Pre-seeded Data

Access the database container using psql through Docker Compose:
```
docker compose exec db psql -U ${POSTGRES_USER:-devuser} -d ${POSTGRES_DB:-devdb}
```
This command uses environment variables with default values, ensuring consistency with your Docker Compose configuration Pre-seed the database using JavaScript code.
Once connected to the PostgreSQL interactive terminal, check the seeded data:
```
SELECT * FROM users;
```
You should see the sample rows defined in seed.sql.
You can also perform additional queries to verify the structure and constraints of the table:
```
\d users
```
This command will show the table structure, including column names, data types, and constraints.

Try inserting a new user to verify the constraints:

INSERT INTO users (name, email) VALUES ('Charlie', 'charlie@example.com');
SELECT * FROM users;

Attempt to insert a duplicate email to verify the unique constraint:
```
INSERT INTO users (name, email) VALUES ('David', 'alice@example.com');
```
This should result in an error due to the unique constraint on the email column.
Exit the PostgreSQL interactive terminal:
```
\q
```

Step 6: Automate with a Reset Script (Optional)

For repeated testing and to ensure a clean, consistent database state for every development session, create a script to reset the database.

Create a reset.sh file:
```
nano reset.sh
```

Add the following content:

#!/bin/bash
set -e

echo "Stopping and removing containers..."
docker compose down -v

echo "Starting containers..."
docker compose up -d

echo "Waiting for database to be ready..."
docker compose exec db sh -c 'while ! pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"; do sleep 1; done'

echo "Verifying seeded data..."
docker compose exec db psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM users;"

echo "Reset complete. Database is ready for use."

Make the script executable:
```
chmod +x reset.sh
```
Run the script to reset the database:
```
./reset.sh
```

Conclusion

Pre-seeding a database container provides a solution for initializing development environments with consistent datasets. This tutorial covered creating a seed script, configuring Docker Compose, verifying the seeded data, and automating resets. By following these steps, you can streamline your workflow and improve collaboration across your team.