Seeding Postgres Test Databases

Dan Lynch

Nov 17, 2025


Previously: In Advanced RLS Testing Scenarios, we tested complex multi-user scenarios and mastered the savepoint pattern. Now let's explore efficient strategies for seeding test databases with realistic data.

Testing with realistic data reveals edge cases that empty tables hide. Manually inserting data in every test is tedious and error-prone. You need flexible, maintainable ways to populate your test database that work with your RLS policies and schema.

In this lesson, you'll learn how to seed test databases using pgsql-test's direct load methods: loadJson(), loadSql(), and loadCsv().

Prerequisites

Complete the previous lesson, Advanced RLS Testing Scenarios, before starting this one.

Why Seeding Matters

Testing without proper data seeding leads to problems:

  • Inline SQL everywhere makes tests verbose and clutters test logic; each test ends up with dozens of INSERT statements obscuring what's actually being tested.
  • Shared fixtures become brittle: one test changes the fixture data, breaking other tests that depend on it.
  • Unrealistic data misses edge cases: hand-crafted test data often doesn't reflect production complexity, missing nulls, edge values, or realistic relationships.

pgsql-test's direct load methods solve these problems with clean, maintainable seeding strategies.

Seeding with loadJson()

The loadJson() method is perfect for inline test data. It's simple, readable, and respects RLS policies.

Create __tests__/json-seeding.test.ts:

import { getConnections, PgTestClient } from 'pgsql-test';

const alice = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa';
const bob = 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb';

let pg: PgTestClient;
let db: PgTestClient;
let teardown: () => Promise<void>;

beforeAll(async () => {
  ({ pg, db, teardown } = await getConnections());

  // Seed test data using loadJson
  await pg.loadJson({
    'pets.pets': [
      {
        id: '11111111-1111-1111-1111-111111111111',
        name: 'Buddy',
        species: 'dog',
        age: 3,
        owner_id: alice
      },
      {
        id: '22222222-2222-2222-2222-222222222222',
        name: 'Whiskers',
        species: 'cat',
        age: 2,
        owner_id: alice
      },
      {
        id: '33333333-3333-3333-3333-333333333333',
        name: 'Tweety',
        species: 'bird',
        age: 1,
        owner_id: bob
      }
    ]
  });
});

afterAll(async () => await teardown());
beforeEach(async () => await db.beforeEach());
afterEach(async () => await db.afterEach());

it('has loaded all pets', async () => {
  db.setContext({ role: 'anonymous' });

  const result = await db.query('SELECT * FROM pets.pets ORDER BY name');
  expect(result.rows).toHaveLength(3);
  expect(result.rows[0].name).toBe('Buddy');
});

it('alice sees her pets', async () => {
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });

  const result = await db.query(`
    SELECT name FROM pets.pets
    WHERE owner_id = $1
    ORDER BY name
  `, [alice]);

  expect(result.rows).toHaveLength(2);
  expect(result.rows.map(r => r.name)).toEqual(['Buddy', 'Whiskers']);
});

Key Features:

  • Schema-qualified tables - Use 'pets.pets' to specify the schema
  • UUID support - Can specify explicit IDs for referential integrity
  • RLS-aware - Respects RLS policies (use pg for setup, db for testing)
  • Clean syntax - JSON objects map directly to table rows

Run the test:

pnpm test:watch
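When many tests need similar rows, a small factory function keeps loadJson() payloads terse: each test overrides only the fields it cares about. This is a sketch of a hypothetical makePet helper (it is not part of pgsql-test); the default values and field names mirror the pets.pets table used above.

```typescript
import { randomUUID } from 'crypto';

interface PetRow {
  id: string;
  name: string;
  species: string;
  age: number;
  owner_id: string;
}

// Build a pet row with sensible defaults; tests override only what matters.
const makePet = (overrides: Partial<PetRow> = {}): PetRow => ({
  id: randomUUID(),
  name: 'Pet',
  species: 'dog',
  age: 1,
  owner_id: 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa',
  ...overrides
});

// The resulting array plugs straight into loadJson():
//   await pg.loadJson({ 'pets.pets': rows });
const rows = [
  makePet({ name: 'Buddy', age: 3 }),
  makePet({ name: 'Whiskers', species: 'cat', age: 2 })
];
```

Because each call generates a fresh UUID, rows never collide on primary keys across tests.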

Seeding with loadSql()

For larger datasets or when you want to version-control fixtures separately, SQL files work well.

Create __tests__/fixtures/pets-fixture.sql:

INSERT INTO pets.pets (id, name, species, age, owner_id) VALUES
  ('44444444-4444-4444-4444-444444444444', 'Max', 'dog', 5, 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'),
  ('55555555-5555-5555-5555-555555555555', 'Luna', 'cat', 3, 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'),
  ('66666666-6666-6666-6666-666666666666', 'Charlie', 'dog', 4, 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb'),
  ('77777777-7777-7777-7777-777777777777', 'Nemo', 'fish', 1, 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb'),
  ('88888888-8888-8888-8888-888888888888', 'Bella', 'cat', 2, 'cccccccc-cccc-cccc-cccc-cccccccccccc');

Create __tests__/sql-seeding.test.ts:

import path from 'path';
import { getConnections, PgTestClient } from 'pgsql-test';

const alice = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa';
const bob = 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb';

let pg: PgTestClient;
let db: PgTestClient;
let teardown: () => Promise<void>;

beforeAll(async () => {
  ({ pg, db, teardown } = await getConnections());

  // Load SQL fixture
  await pg.loadSql([
    path.join(__dirname, 'fixtures/pets-fixture.sql')
  ]);
});

afterAll(async () => await teardown());
beforeEach(async () => await db.beforeEach());
afterEach(async () => await db.afterEach());

it('loads fixtures from SQL file', async () => {
  db.setContext({ role: 'anonymous' });

  const result = await db.query('SELECT * FROM pets.pets ORDER BY name');
  expect(result.rows).toHaveLength(5);
});

it('supports complex queries on seeded data', async () => {
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });

  const result = await db.query(`
    SELECT species, COUNT(*) as count
    FROM pets.pets
    WHERE owner_id = $1
    GROUP BY species
    ORDER BY species
  `, [alice]);

  expect(result.rows).toHaveLength(2);
  // COUNT(*) returns a bigint, which node-postgres delivers as a string
  expect(result.rows[0]).toEqual({ species: 'cat', count: '1' });
  expect(result.rows[1]).toEqual({ species: 'dog', count: '1' });
});
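If you generate SQL fixtures from data rather than hand-writing them, a small helper can emit the same multi-row INSERT shape as pets-fixture.sql. The toInsert and sqlLiteral names below are illustrative, not pgsql-test APIs, and the quoting is deliberately minimal (it only doubles single quotes), so treat this as a sketch for trusted fixture data only.

```typescript
type Value = string | number;

// Quote string values for SQL; numbers pass through unchanged.
const sqlLiteral = (v: Value): string =>
  typeof v === 'number' ? String(v) : `'${v.replace(/'/g, "''")}'`;

// Emit a multi-row INSERT statement like the one in pets-fixture.sql.
const toInsert = (table: string, rows: Record<string, Value>[]): string => {
  const cols = Object.keys(rows[0]);
  const tuples = rows.map(r => `(${cols.map(c => sqlLiteral(r[c])).join(', ')})`);
  return `INSERT INTO ${table} (${cols.join(', ')}) VALUES\n  ${tuples.join(',\n  ')};`;
};
```

The output can be written to a .sql file and loaded with pg.loadSql() exactly like a hand-written fixture.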

Seeding with loadCsv()

For very large datasets or data exported from spreadsheets, CSV is ideal.

Create __tests__/fixtures/pets.csv:

id,name,species,age,owner_id
99999999-9999-9999-9999-999999999999,Rocky,dog,6,aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
aaaaaaaa-aaaa-aaaa-aaaa-000000000001,Shadow,cat,4,aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
bbbbbbbb-bbbb-bbbb-bbbb-000000000001,Sunny,bird,2,bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb
cccccccc-cccc-cccc-cccc-000000000001,Bubbles,fish,1,cccccccc-cccc-cccc-cccc-cccccccccccc

Create __tests__/csv-seeding.test.ts:

import path from 'path';
import { getConnections, PgTestClient } from 'pgsql-test';

const alice = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa';

let pg: PgTestClient;
let db: PgTestClient;
let teardown: () => Promise<void>;

beforeAll(async () => {
  ({ pg, db, teardown } = await getConnections());

  // Load CSV data
  await pg.loadCsv({
    'pets.pets': path.join(__dirname, 'fixtures/pets.csv')
  });
});

afterAll(async () => await teardown());
beforeEach(async () => await db.beforeEach());
afterEach(async () => await db.afterEach());

it('loads CSV data correctly', async () => {
  db.setContext({ role: 'anonymous' });

  const result = await db.query('SELECT * FROM pets.pets ORDER BY name');
  expect(result.rows).toHaveLength(4);
  expect(result.rows[0].name).toBe('Bubbles');
});

it('CSV data respects schema constraints', async () => {
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });

  const result = await db.query(`
    SELECT name FROM pets.pets
    WHERE owner_id = $1
    ORDER BY name
  `, [alice]);

  expect(result.rows).toHaveLength(2);
  expect(result.rows.map(r => r.name)).toEqual(['Rocky', 'Shadow']);
});

Important CSV Note: loadCsv() uses Postgres's COPY command, which bypasses RLS policies. Always use pg (superuser) to load CSV data, not db.
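CSV fixtures can also be generated from the same row objects used elsewhere, keeping JSON, SQL, and CSV fixtures consistent. This hypothetical toCsv helper (not a pgsql-test API) assumes values contain no commas, quotes, or newlines; for real data, use a proper CSV library.

```typescript
type Row = Record<string, string | number>;

// Serialize row objects to CSV text: header line first, then one line per row.
// Column order comes from the first row's keys.
const toCsv = (rows: Row[]): string => {
  const headers = Object.keys(rows[0]);
  const lines = rows.map(r => headers.map(h => String(r[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
};
```

Write the result to __tests__/fixtures/pets.csv (for example with fs.writeFileSync) and load it with pg.loadCsv() as shown above.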

Combining Seeding Strategies

You can mix seeding methods in the same test to leverage each approach's strengths:

import path from 'path';
import { getConnections, PgTestClient } from 'pgsql-test';

const alice = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa';
const bob = 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb';

let pg: PgTestClient;
let db: PgTestClient;
let teardown: () => Promise<void>;

beforeAll(async () => {
  ({ pg, db, teardown } = await getConnections());

  // 1. Load base data from SQL fixture
  await pg.loadSql([
    path.join(__dirname, 'fixtures/pets-fixture.sql')
  ]);

  // 2. Add test-specific data with JSON
  await pg.loadJson({
    'pets.pets': [
      {
        name: 'Test Pet',
        species: 'hamster',
        age: 1,
        owner_id: alice
      }
    ]
  });
});

afterAll(async () => await teardown());
beforeEach(async () => await db.beforeEach());
afterEach(async () => await db.afterEach());

it('combines SQL and JSON seeding', async () => {
  db.setContext({ role: 'anonymous' });

  const result = await db.query(`
    SELECT * FROM pets.pets
    WHERE name = 'Test Pet'
  `);

  expect(result.rows).toHaveLength(1);
  expect(result.rows[0].species).toBe('hamster');
});

Per-Test Seeding

Sometimes different tests need different data. Seed in beforeEach() instead of beforeAll():

import { getConnections, PgTestClient } from 'pgsql-test';

const alice = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa';

let pg: PgTestClient;
let db: PgTestClient;
let teardown: () => Promise<void>;

beforeAll(async () => {
  ({ pg, db, teardown } = await getConnections());
});

afterAll(async () => await teardown());

beforeEach(async () => {
  await db.beforeEach();

  // Seed fresh data for each test
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });
  await db.loadJson({
    'pets.pets': [
      { name: 'Test Dog', species: 'dog', age: 3, owner_id: alice },
      { name: 'Test Cat', species: 'cat', age: 2, owner_id: alice }
    ]
  });
});

afterEach(async () => await db.afterEach());

it('scenario 1 - can modify seeded data', async () => {
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });

  await db.query(`
    UPDATE pets.pets SET age = age + 1 WHERE name = 'Test Dog'
  `);

  const result = await db.query(`
    SELECT age FROM pets.pets WHERE name = 'Test Dog'
  `);

  expect(result.rows[0].age).toBe(4);
});

it('scenario 2 - starts with clean data', async () => {
  // Previous test's UPDATE was rolled back
  db.setContext({
    role: 'authenticated',
    'jwt.claims.user_id': alice
  });

  const result = await db.query(`
    SELECT age FROM pets.pets WHERE name = 'Test Dog'
  `);

  expect(result.rows[0].age).toBe(3); // Back to original
});

Each test gets fresh data thanks to transaction rollback combined with per-test seeding.

Key Takeaways

  • loadJson() - Perfect for inline test data, respects RLS policies
  • loadSql() - Ideal for large fixtures and complex SQL operations
  • loadCsv() - Fast for huge datasets, but bypasses RLS (use pg client)
  • Per-test seeding - Use beforeEach() when tests need different data
  • Transaction isolation - Changes are rolled back between tests automatically

What's Next

You've mastered seeding strategies for Postgres test databases. With these tools, you can create comprehensive test suites with realistic data that verify your database logic, constraints, and RLS policies work correctly in production-like scenarios.