Skip to main content
Scalple
Scalple
PostgreSQL · pg_audit · GDPR Art. 32

PostgreSQL audit logging
per user for GDPR

Answer

PostgreSQL does not log queries per named user by default. To comply with GDPR Article 32, install the pgaudit extension, add it to shared_preload_libraries, and configure pgaudit.log = 'read,write,ddl' with pgaudit.log_relation = on. This gives you a query-level log per PostgreSQL role. The problem: pg_audit logs the database role — not the human. If your application uses a shared role (app_user), every log line is identical regardless of who ran the query. This guide covers the complete setup and both structural gaps pg_audit cannot close.

What PostgreSQL logs without pg_audit

Out of the box, PostgreSQL logs connection events and server errors. Individual queries are not logged unless log_statement is set. Even with log_statement = 'all', the default log line looks like this:

2024-01-15 14:23:47 UTC [18392] app_user@production LOG:  statement: SELECT id, email FROM users WHERE id = 4892

The role shown is app_user — a shared application credential used by all engineers connecting to production. There is no information about which human ran this query. Under GDPR, this log is non-compliant: it records what happened, but not who did it.

Who ran this query?

Unknown. app_user is shared across all engineers.

Which tables contain personal data?

Inferable from schema, not from log without pg_audit.

Was access authorized?

Unverifiable. No per-human policy engine.

Can this log be deleted?

Yes. Any superuser with server access can truncate it.

Installing and configuring pg_audit

pg_audit is a PostgreSQL extension maintained by the pgAudit project. It must be loaded at server start — you cannot enable it via SQL alone after the server is running.

Step 1 — Install the extension package

# Debian/Ubuntu
sudo apt install postgresql-16-pgaudit

# RHEL/Rocky Linux (PGDG repo)
sudo dnf install pgaudit_16

# macOS — no Homebrew formula; compile from source
git clone https://github.com/pgaudit/pgaudit.git && cd pgaudit
make install USE_PGXS=1 PG_CONFIG=$(pg_config --bindir)/pg_config

Step 2 — Add to shared_preload_libraries

Edit postgresql.conf or use ALTER SYSTEM. A server restart is required.

-- In psql (requires superuser)
ALTER SYSTEM SET shared_preload_libraries = 'pgaudit';

-- Then restart PostgreSQL, then create the extension
CREATE EXTENSION IF NOT EXISTS pgaudit;

Step 3 — Configure audit parameters

-- Log reads and writes on all relations
ALTER SYSTEM SET pgaudit.log = 'read,write,ddl';

-- Log the specific relation (table/view) for each statement
ALTER SYSTEM SET pgaudit.log_relation = 'on';

-- Include bind parameters (required for GDPR data subject queries)
ALTER SYSTEM SET pgaudit.log_parameter = 'on';

SELECT pg_reload_conf();

-- Verify settings are active
SHOW pgaudit.log;  -- read,write,ddl

Per-role configuration (recommended)

If you have per-operator database roles, configure auditing at the role level to avoid logging high-volume application read traffic:

-- Create per-operator roles
CREATE ROLE engineer_alice LOGIN PASSWORD '...';

-- Enable auditing only for operator roles
ALTER ROLE engineer_alice SET pgaudit.log = 'read,write';
ALTER ROLE engineer_alice SET pgaudit.log_relation = 'on';

The pg_audit log format

A pg_audit log entry consists of the standard PostgreSQL log line prefix followed by the AUDIT: marker and nine comma-separated fields. Here is a real example:

2024-01-15 14:23:47 UTC [18392] app_user@production LOG:  AUDIT: SESSION,1,1,READ,SELECT,TABLE,public.users,"SELECT id, email, last_login FROM users WHERE id = 4892",<not logged>
Timestamp + PID
2024-01-15 14:23:47 UTC [18392]Standard PostgreSQL log prefix. PID is the backend process.
Role@Database
app_user@productionPostgreSQL role — NOT the human operator. This is the core GDPR problem.
AUDIT_TYPE
SESSIONSESSION (entire session) or OBJECT (specific object grants). See FAQ.
STATEMENT_ID
1Integer ID for this statement within the session.
SUBSTATEMENT_ID
1Sub-statement ID — increments for CTEs, function calls within the statement.
CLASS
READREAD, WRITE, FUNCTION, ROLE, DDL, MISC, or MISC_SET.
COMMAND
SELECTThe SQL command tag: SELECT, INSERT, UPDATE, DELETE, COPY, TRUNCATE, etc.
OBJECT_TYPE
TABLETABLE, INDEX, SEQUENCE, VIEW, MATERIALIZED VIEW, COMPOSITE TYPE, FOREIGN TABLE, FUNCTION, SCHEMA, UNKNOWN.
OBJECT_NAME
public.usersSchema-qualified object name. Requires pgaudit.log_relation = on.
STATEMENT
"SELECT id, email..."The full SQL statement text.
PARAMETER
<not logged>Bind parameters if pgaudit.log_parameter = on. Essential for data subject queries.

The shared role problem

Most production applications connect via a single shared role — app_user, postgres, or a service account. When an engineer connects to production via a database GUI, psql, or an internal tool, they authenticate as that shared role. Every pg_audit log line shows the same role regardless of who ran the query.

A supervisory authority reviewing a GDPR breach will ask: "Can you show us every query Alice ran on the users table on January 15?" If Alice authenticated as app_user, the answer is: no, because you cannot distinguish Alice's queries from the other 12 engineers who also connected as app_user that day.

Workaround: SET LOCAL session variable

The most common workaround is to inject the human identity as a PostgreSQL session variable at the start of each transaction:

-- Your application wraps every DB operation:
BEGIN;
SET LOCAL app.current_user = 'alice@company.com';
SELECT id, email, last_login FROM users WHERE id = $1;
COMMIT;

-- pg_audit logs the query under app_user.
-- Your application log must separately record
-- that alice@company.com initiated this session.

This approach has two compliance problems. First, the app.current_user value is set by application code — it is not verified by the database. Any code with database access can write any value. Second, the correlation between the session variable and the pg_audit log line lives in two separate log streams. A complete audit trail requires joining them, which introduces both technical complexity and gaps.

Approach comparison

All approaches measured against the three GDPR Article 32 requirements that matter for a supervisory authority inquiry.

ApproachPer-user attributionTamper-proofGDPR Art. 32Superuser deletableEffort
PostgreSQL default logs❌ Role only❌ No❌ No✅ YesNone
pg_audit (shared app role)❌ Role only❌ No❌ No✅ YesLow
pg_audit + SET LOCAL⚠️ App-level❌ No⚠️ Partial✅ YesMedium
pg_audit + per-user roles✅ Yes❌ No⚠️ Partial✅ YesHigh
pg_audit + external SIEM✅ Yes⚠️ Depends⚠️ Partial❌ NoVery high
Scalple✅ Yes✅ INSERT-only✅ Yes❌ NoNone

Two structural gaps pg_audit cannot close

Gap 1

Log files are superuser-deletable

pg_audit writes to the PostgreSQL log file — typically /var/log/postgresql/postgresql.log. A user with server access can truncate, rotate, or delete this file. In cloud-managed PostgreSQL, the cloud provider controls log retention. Neither is structurally tamper-evident. Under GDPR, an audit log that a superuser can modify is not a reliable record for a supervisory authority inquiry. A cloud provider's retention guarantee is contractual, not structural.

Gap 2

Database role ≠ human identity

pg_audit records the PostgreSQL role, not the authenticated human. The database has no concept of a human identity separate from a role. If your team uses shared credentials, per-user roles, or any connection pooler that multiplexes connections, the log cannot reliably attribute a query to a specific person. GDPR requires individual attribution. A log line showing app_user satisfies no supervisory authority.

How Scalple closes both gaps

Scalple operates as a database access layer — engineers authenticate to Scalple with their individual identity (SSO, MFA), not to PostgreSQL directly. The database role is managed by Scalple, never exposed to the engineer. Every operation is logged in an INSERT-only audit table before the query executes. The audit schema revokes DELETE and UPDATE at the database level — no superuser can modify entries. The log shows the verified human identity, the exact query, and a tamper-evident timestamp. When your DPA asks for every query Alice ran on January 15, you export it in under 30 seconds.

-- Scalple audit log (INSERT-only, human-attributed)
SELECT
  audit_time,
  human_identity,  -- alice@company.com, not app_user
  command,
  object_schema,
  object_name,
  statement,
  row_count,
  duration_ms
FROM scalple_audit_log
WHERE human_identity = 'alice@company.com'
  AND audit_time::date = '2024-01-15'
ORDER BY audit_time;

PostgreSQL audit logging — frequently asked questions

Does PostgreSQL log queries per user by default?

No. PostgreSQL's default logging captures connection events, server errors, and slow queries (if log_min_duration_statement is configured), but does not log individual queries per user by default. Even when log_statement = 'all' is set, it logs all queries but attributes them to the PostgreSQL role — typically a shared application user like app_user or postgres — not to the human operator who initiated the request. For GDPR Article 32, attribution to a role is insufficient: you need attribution to a named, verified individual.

What does a pg_audit log line look like?

A pg_audit log entry follows the format: AUDIT: <AUDIT_TYPE>,<STATEMENT_ID>,<SUBSTATEMENT_ID>,<CLASS>,<COMMAND>,<OBJECT_TYPE>,<OBJECT_NAME>,<STATEMENT>,<PARAMETER>. A real example: '2024-01-15 14:23:47 UTC [18392] app_user@production LOG: AUDIT: SESSION,1,1,READ,SELECT,TABLE,public.users,"SELECT id, email FROM users WHERE id = 4892",<not logged>'. The prefix before AUDIT: is the standard PostgreSQL log line prefix showing timestamp, PID, and — critically — the database role (app_user), not the human operator. The PARAMETER field shows '<not logged>' unless pgaudit.log_parameter = on is set.

Is pg_audit alone sufficient for GDPR Article 32 compliance?

No, for two structural reasons. First, pg_audit logs the PostgreSQL role, not a human identity. If your application connects with a shared role, every audit line shows app_user — making the log legally non-attributable. Second, pg_audit writes to the PostgreSQL log file or syslog. Both are modifiable by a superuser. A supervisory authority will ask whether the log could have been altered after a breach. With file-based logging, it could. GDPR requires tamper-evident storage — structurally impossible to modify, not just access-controlled.

How do I attribute PostgreSQL queries to a human user, not a shared role?

There are three approaches, each with trade-offs. First: per-operator database roles — create one PostgreSQL role per human (alice, bob), issue individual credentials, and configure pgaudit.log per role. This is the most GDPR-correct but breaks most connection poolers and adds high operational overhead. Second: SET LOCAL with a session variable — at the start of each transaction, execute SET LOCAL app.current_user = 'alice@company.com', then read that variable in your log pipeline. The pg_audit log line still shows the application role, but your application log correlates the session. This is application-level only and can be spoofed. Third: a dedicated access layer like Scalple that authenticates the human before creating the database session and logs human identity independently of the database role.

What is the difference between SESSION and OBJECT audit mode in pg_audit?

SESSION audit mode (pgaudit.log = 'read,write,ddl') logs all statements of the specified classes for the entire session, regardless of which object is accessed. It is configured globally or per role. OBJECT audit mode logs statements that affect specific objects — tables, views, sequences — that have been granted to the pgaudit role. OBJECT mode is more granular (you can audit SELECT on the users table without auditing every other read) but requires explicit GRANT statements to the pgaudit role for each object you want to monitor. For GDPR compliance, you typically want SESSION mode with pgaudit.log = 'read,write' plus pgaudit.log_relation = on to capture the specific table names.

Can a database superuser delete pg_audit logs?

Yes. pg_audit writes to the standard PostgreSQL log file (typically /var/log/postgresql/postgresql.log or equivalent). A user with filesystem access to the server — including any superuser who can execute system commands via COPY TO or untrusted PLs — can truncate, rotate, or delete those files. In cloud-managed PostgreSQL (AWS RDS, Google Cloud SQL, Azure Database), the log file is managed by the cloud provider and not directly deletable by the database superuser, but it is accessible to cloud provider employees and can be disabled via configuration. No file-based logging is structurally tamper-evident. An INSERT-only database table — where DELETE and UPDATE are revoked at the database level — provides structural tamper evidence.

pg_audit is a good start. It is not an audit trail.

Scalple provides the human-attributed, INSERT-only audit trail that GDPR Article 32 requires — without per-operator database roles, SET LOCAL hacks, or SIEM integrations. Deploy in your EU infrastructure in under 10 minutes.

INSERT-only audit trailHuman identity attributionEU-only infrastructureNo pg_audit required