Pre-seeding a database container simplifies the setup process for development environments, ensuring that developers start with a consistent dataset. This practice is particularly useful when working with microservices or multi-developer teams where database state consistency is critical.
By pre-seeding a database container, you can automatically initialize tables, insert sample data, and configure necessary settings when the container starts. This eliminates manual steps, reduces errors, and streamlines local development workflows.
Pre-seeding offers several benefits, including:
- Confident deployments
- Consistency across environments
- Early issue detection
- Improved collaboration between Database Administrators and Software Engineers
This tutorial will guide you through the process of pre-seeding a database container using Docker Compose, with step-by-step instructions for a PostgreSQL example.
Prerequisites
Before starting, ensure you have:
- Docker and Docker Compose Installed: Follow the official Docker and Docker Compose installation guides if needed.
- Basic Knowledge of Docker Compose: Familiarity with YAML files and Docker basics will help.
- A Sample SQL Script: Create or obtain an SQL script containing the database schema and seed data.
Step 1: Create a Directory Structure for the Project
-
Create a new directory for your project:
mkdir db-preseed-demo cd db-preseed-demo
-
Inside this directory, create a subdirectory for your SQL scripts and a
docker-compose.yml
file:mkdir scripts touch docker-compose.yml
-
Create a SQL script file inside the
scripts
directory:touch scripts/seed.sql
-
Create a
.env
file for environment variables:touch .env
This structure aligns with the best practices for Docker projects Pre-seed the database by bind-mounting a SQL script. It separates the SQL scripts into their own directory, which is a common practice for organizing database initialization scripts. The .env
file is added to store environment variables, which is useful for managing sensitive information like database passwords Pre-seed the database using JavaScript code.
Your project structure should now look like this:
db-preseed-demo/
├── scripts/
│ └── seed.sql
├── docker-compose.yml
└── .env
Step 2: Write the SQL Seed Script
-
Open the
scripts/seed.sql
file in your preferred text editor:nano scripts/seed.sql
-
Add the SQL commands to define the schema and seed data:
CREATE TABLE IF NOT EXISTS users ( id SERIAL PRIMARY KEY, name VARCHAR(50) NOT NULL, email VARCHAR(100) UNIQUE NOT NULL ); INSERT INTO users (name, email) VALUES ('Alice', 'alice@example.com'), ('Bob', 'bob@example.com') ON CONFLICT (email) DO NOTHING;
-
Save and close the file.
This script creates a users
table and inserts sample data. The improvements include:
Step 3: Define the Docker Compose Configuration
-
Open the
docker-compose.yml
file in your preferred text editor:nano docker-compose.yml
-
Add the following configuration:
version: '3.8' services: db: image: postgres:latest container_name: preseeded-db environment: POSTGRES_USER: ${POSTGRES_USER:-devuser} POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-devpassword} POSTGRES_DB: ${POSTGRES_DB:-devdb} volumes: - ./scripts:/docker-entrypoint-initdb.d - postgres_data:/var/lib/postgresql/data ports: - "5432:5432" healthcheck: test: ["CMD", "pg_isready", "-U", "${POSTGRES_USER:-devuser}"] interval: 10s timeout: 5s retries: 5 volumes: postgres_data:
-
Save and close the file.
-
Create a
.env
file in the project root to store environment variables:nano .env
-
Add the following content to the
.env
file:POSTGRES_USER=devuser POSTGRES_PASSWORD=devpassword POSTGRES_DB=devdb
This configuration uses the official PostgreSQL image and mounts the scripts
directory to the /docker-entrypoint-initdb.d/
directory. PostgreSQL automatically executes files in this directory when initializing the database.
Step 4: Start and Verify the Database Container
-
Start the container using Docker Compose:
docker compose up -d
The
-d
flag runs the container in detached mode, allowing it to run in the background Docker Compose file reference. -
Watch the logs in real-time to ensure the database is initialized and the seed script runs:
docker compose logs -f db
The
-f
flag follows the log output, giving you live updates Use Docker Compose to run a database. -
Wait until you see messages indicating that the database is ready for connections and the seed script has been executed.
-
Once the database is ready, you can verify that the pre-seeding was successful by connecting to the database and querying the
users
table:docker compose exec db psql -U devuser -d devdb -c "SELECT * FROM users;"
This command executes a SQL query inside the running container to display the contents of the
users
table. -
If you need to stop the container, you can use:
docker compose down
This command stops and removes the containers defined in your
docker-compose.yml
file.
Note: If you make changes to your
seed.sql
file and want to re-initialize the database, you'll need to remove the volume and recreate the container:
docker compose down -v
docker compose up -d
This ensures that the latest version of your seed script is executed.
Step 5: Verify and Interact with the Pre-seeded Data
-
Access the database container using
psql
through Docker Compose:docker compose exec db psql -U ${POSTGRES_USER:-devuser} -d ${POSTGRES_DB:-devdb}
This command uses environment variables with default values, ensuring consistency with your Docker Compose configuration Pre-seed the database using JavaScript code.
-
Once connected to the PostgreSQL interactive terminal, check the seeded data:
SELECT * FROM users;
You should see the sample rows defined in
seed.sql
. -
You can also perform additional queries to verify the structure and constraints of the table:
\d users
This command will show the table structure, including column names, data types, and constraints.
-
Try inserting a new user to verify the constraints:
INSERT INTO users (name, email) VALUES ('Charlie', 'charlie@example.com'); SELECT * FROM users;
-
Attempt to insert a duplicate email to verify the unique constraint:
INSERT INTO users (name, email) VALUES ('David', 'alice@example.com');
This should result in an error due to the unique constraint on the email column.
-
Exit the PostgreSQL interactive terminal:
\q
Step 6: Automate with a Reset Script (Optional)
For repeated testing and to ensure a clean, consistent database state for every development session, create a script to reset the database.
-
Create a
reset.sh
file:nano reset.sh
-
Add the following content:
#!/bin/bash set -e echo "Stopping and removing containers..." docker compose down -v echo "Starting containers..." docker compose up -d echo "Waiting for database to be ready..." docker compose exec db sh -c 'while ! pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"; do sleep 1; done' echo "Verifying seeded data..." docker compose exec db psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM users;" echo "Reset complete. Database is ready for use."
-
Make the script executable:
chmod +x reset.sh
-
Run the script to reset the database:
./reset.sh
Conclusion
Pre-seeding a database container provides a solution for initializing development environments with consistent datasets. This tutorial covered creating a seed script, configuring Docker Compose, verifying the seeded data, and automating resets. By following these steps, you can streamline your workflow and improve collaboration across your team.