feat: Add database backup and restore utilities #26607
Conversation
Introduces a new utility class that provides database backup and restore functionality for OpenMetadata's CLI. It discovers all tables via information_schema, filters out GENERATED columns, dumps everything to a .tar.gz archive with metadata, and restores from such archives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```java
try {
  jdbi.useHandle(this::disableForeignKeyChecks);
  restoreTablesFromArchive(backupPath, tablesMetadata);
  LOG.info("Restore completed successfully");
} finally {
  jdbi.useHandle(this::enableForeignKeyChecks);
}
```
🚨 Bug: FK checks disabled on separate handle, not effective for inserts
At line 245, `jdbi.useHandle(this::disableForeignKeyChecks)` disables FK checks on a handle that is immediately closed. Then `restoreTablesFromArchive` (line 246) calls `jdbi.useHandle(handle -> insertRows(...))` at line 303, which obtains a different connection from the pool. `SET FOREIGN_KEY_CHECKS = 0` (MySQL) and `SET session_replication_role = 'replica'` (Postgres) are session-level settings — they only apply to the connection they were executed on. Since each `jdbi.useHandle()` call may obtain a different pooled connection, the inserts will run with FK checks still enabled, causing failures when tables are restored in an order that violates foreign key constraints.
The same issue affects the `enableForeignKeyChecks` call in the `finally` block (line 249) — it may re-enable checks on yet another connection that never had them disabled.
Suggested fix:
```java
// Use a single handle for the entire restore operation:
jdbi.useHandle(handle -> {
  try {
    disableForeignKeyChecks(handle);
    restoreTablesFromArchive(handle, backupPath, tablesMetadata);
  } finally {
    enableForeignKeyChecks(handle);
  }
});
// Update restoreTablesFromArchive to accept a Handle
// and pass it to insertRows directly instead of
// calling jdbi.useHandle() at line 303.
```
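The failure mode can be reproduced without a database. The sketch below is a toy model in plain Java (no JDBI, all names illustrative) of a round-robin connection pool where each connection carries one piece of session state: disabling the flag through one checkout leaves it enabled on the connection the next checkout returns, which is what happens to `SET FOREIGN_KEY_CHECKS` across separate `useHandle` calls.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class PooledSessionDemo {
  /** A fake pooled connection with one piece of session-level state. */
  public static final class Conn {
    boolean fkChecksEnabled = true; // session default
  }

  /** A fake round-robin pool of two connections. */
  public static final class Pool {
    private final Queue<Conn> conns = new ArrayDeque<>();
    public Pool() {
      conns.add(new Conn());
      conns.add(new Conn());
    }
    public Conn checkout() {
      Conn c = conns.poll();
      conns.add(c); // return to the back of the pool
      return c;
    }
  }

  /** Mirrors jdbi.useHandle(this::disableForeignKeyChecks): affects one connection only. */
  public static void disableFkChecks(Pool pool) {
    pool.checkout().fkChecksEnabled = false;
  }

  /** Mirrors the later jdbi.useHandle(handle -> insertRows(...)) call. */
  public static boolean insertRunsWithFkChecks(Pool pool) {
    return pool.checkout().fkChecksEnabled;
  }

  public static void main(String[] args) {
    Pool pool = new Pool();
    disableFkChecks(pool); // runs on connection #1
    // The insert checks out connection #2, where FK checks were never disabled:
    System.out.println("insert sees FK checks enabled: " + insertRunsWithFkChecks(pool)); // true
  }
}
```

With a single handle (as in the suggested fix), the disable, the inserts, and the re-enable all run on the same connection, so the session setting actually covers the inserts.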
```java
String quoteIdentifier(String identifier) {
  if (connectionType == ConnectionType.MYSQL) {
    return "`" + identifier + "`";
  }
  return "\"" + identifier + "\"";
}
```
⚠️ Security: SQL injection via crafted backup archive table/column names
During restore, table names are extracted from tar entry paths (line 284) and `metadata.json` field names (line 233) — both sourced from the untrusted backup archive. These names flow into SQL statements via `quoteIdentifier()` (lines 397-402), which wraps identifiers in backticks or double quotes but does NOT escape those delimiter characters within the identifier itself. A crafted archive with a table name like `` foo` DROP TABLE users; -- `` could inject arbitrary SQL. `quoteIdentifier` is used in SELECT (line 314), TRUNCATE (line 336), and INSERT (line 350) statements. Column names from `metadata.json` are similarly unvalidated (line 292, used in INSERT at line 347). While the attack requires someone to provide a malicious `.tar.gz` file to the restore command, this is a realistic threat for a backup/restore utility — especially if backups are downloaded from remote storage.
Suggested fix:
```java
// 1. Validate identifiers with a strict allowlist pattern:
private static final Pattern SAFE_IDENTIFIER =
    Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");

String quoteIdentifier(String identifier) {
  if (!SAFE_IDENTIFIER.matcher(identifier).matches()) {
    throw new IllegalArgumentException(
        "Invalid identifier: " + identifier);
  }
  // existing quoting logic...
}
// 2. Optionally cross-check restored table names against
// discoverTables() to ensure they exist in the DB.
```
```java
jdbi.useHandle(
    handle -> {
      List<String> tables = discoverTables(handle);
      LOG.info("Discovered {} tables", tables.size());

      for (String tableName : tables) {
        backupTable(handle, tableName, taos, tablesMetadata);
      }
    });
```
💡 Edge Case: Backup reads tables without snapshot isolation
The backup() method reads tables sequentially within a single jdbi.useHandle callback, but without setting a transaction isolation level (e.g., REPEATABLE READ or SERIALIZABLE). If the database is being written to during backup, the resulting archive may contain an inconsistent snapshot — e.g., a foreign key reference in table A pointing to a row in table B that wasn't captured because it was deleted between the two table reads.
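The inconsistency is easy to see with a toy simulation using plain Java collections (no database, all names illustrative): table A is read first and captures a foreign-key reference, the referenced row in table B is deleted by a concurrent writer, then table B is read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NonSnapshotBackupDemo {

  /**
   * Simulates a backup that reads "table A" and "table B" sequentially while
   * a concurrent writer deletes the row that A references between the reads.
   * Returns true if the captured archive contains a dangling FK reference.
   */
  public static boolean simulateBackupWithoutSnapshot() {
    Map<Integer, String> tableB = new HashMap<>(Map.of(1, "team-1")); // referenced table
    List<Integer> tableA = List.of(1);                                // rows with FK -> B.id

    List<Integer> backupA = new ArrayList<>(tableA);      // backup reads A first
    tableB.remove(1);                                     // concurrent delete between reads
    Map<Integer, String> backupB = new HashMap<>(tableB); // backup reads B second

    // Any FK in the A snapshot missing from the B snapshot is dangling.
    return backupA.stream().anyMatch(fk -> !backupB.containsKey(fk));
  }

  public static void main(String[] args) {
    System.out.println("dangling FK in archive: " + simulateBackupWithoutSnapshot()); // true
  }
}
```

One way to avoid this, assuming Jdbi 3, would be to run the whole read phase inside `jdbi.useTransaction(TransactionIsolationLevel.REPEATABLE_READ, handle -> ...)` so every table is read from the same snapshot; the exact wiring into `backup()` is left as a sketch.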
Pull request overview
Adds an ops-level database backup/restore capability to OpenMetadata, wiring new backup / restore CLI commands into OpenMetadataOperations and implementing archive-based export/import via JDBI for MySQL/PostgreSQL.
Changes:
- Introduces `DatabaseBackupRestore` utility to write/read a `.tar.gz` backup with per-table JSON plus `metadata.json`.
- Adds `backup` and `restore` Picocli commands to `OpenMetadataOperations`.
- Adds unit tests for `extractDatabaseName` JDBC URL parsing.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java | Implements table/column discovery, backup archive creation, and restore logic. |
| openmetadata-service/src/main/java/org/openmetadata/service/util/OpenMetadataOperations.java | Wires backup / restore commands into the ops CLI. |
| openmetadata-service/src/test/java/org/openmetadata/service/util/DatabaseBackupRestoreTest.java | Adds unit tests for JDBC database-name extraction helper. |
| openmetadata-service/pom.xml | Adds commons-compress dependency for tar/gzip support. |
```java
byte[] content = tais.readAllBytes();
ArrayNode rows = (ArrayNode) MAPPER.readTree(content);
```

```java
int offset = 0;
ArrayNode allRows = MAPPER.createArrayNode();

while (true) {
  String sql =
      String.format(
          "SELECT %s FROM %s LIMIT %d OFFSET %d",
          quotedColumns, quotedTable, BATCH_SIZE, offset);
```
```java
JsonNode val = row.get(col);
if (val == null || val.isNull()) {
  batch.bindNull(col, Types.VARCHAR);
} else if (val.isNumber()) {
  if (val.isLong() || val.isInt() || val.isBigInteger()) {
    batch.bind(col, val.longValue());
  } else {
    batch.bind(col, val.doubleValue());
  }
```
```java
public void restore(String backupPath, boolean force) throws IOException {
  LOG.info("Starting database restore from {}", backupPath);

  ObjectNode metadata = readMetadata(backupPath);
  String backupDbType = metadata.get("databaseType").asText();
  if (!backupDbType.equals(connectionType.name())) {
    throw new IllegalStateException(
        String.format(
            "Backup database type '%s' does not match current connection type '%s'",
            backupDbType, connectionType.name()));
  }

  LOG.info(
      "Backup info - version: {}, timestamp: {}, databaseType: {}",
      metadata.get("version").asText(),
      metadata.get("timestamp").asText(),
      backupDbType);

  ObjectNode tablesMetadata = (ObjectNode) metadata.get("tables");

  jdbi.useHandle(
      handle -> {
        if (force) {
          truncateAllTables(handle, tablesMetadata);
        } else {
          validateTablesEmpty(handle, tablesMetadata);
        }
      });

  try {
    jdbi.useHandle(this::disableForeignKeyChecks);
    restoreTablesFromArchive(backupPath, tablesMetadata);
    LOG.info("Restore completed successfully");
  } finally {
    jdbi.useHandle(this::enableForeignKeyChecks);
  }
}
```
```java
"SELECT column_name FROM information_schema.columns "
    + "WHERE table_schema = 'public' AND table_name = :table "
    + "AND (is_generated = 'NEVER' OR is_generated IS NULL) "
    + "AND (column_default NOT LIKE 'nextval%' OR column_default IS NULL) "
```
```java
    batch.bind(col, val.doubleValue());
  }
} else if (val.isBoolean()) {
  batch.bind(col, val.booleanValue());
```
```java
private void disableForeignKeyChecks(Handle handle) {
  if (connectionType == ConnectionType.MYSQL) {
    handle.execute("SET FOREIGN_KEY_CHECKS = 0");
  } else {
    handle.execute("SET session_replication_role = 'replica'");
  }
}
```
…mary output Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a `test-migration` command to `OpenMetadataOperations` that:
- Restores a backup, then runs pending migrations one by one
- For each migration with a test class, runs validateBefore/validateAfter
- Prints a summary table to stdout for PR validation
- Returns exit code 0 (all pass) or 1 (any fail)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```java
  }
}

static String versionToPackage(String version) {
```
⚠️ Bug: versionToPackage produces collisions for different versions
The versionToPackage method simply concatenates the numeric parts of a version string without any separator or fixed-width formatting. This means different versions can map to the same package name, causing the wrong test class to be loaded (or no test to be found).
Examples of collisions:
- `"1.12.0"` → `"v1120"` and `"1.1.20"` → `"v1120"`
- `"1.1.0"` → `"v110"` and `"1.10"` → `"v110"`
If a future migration version happens to collide with an existing one, the test framework will silently run the wrong test class or fail to find the correct one.
Suggested fix:
```java
static String versionToPackage(String version) {
  String base = version.contains("-") ? version.split("-")[0] : version;
  String[] parts = base.split("\\.");
  StringBuilder sb = new StringBuilder("v");
  for (int i = 0; i < parts.length; i++) {
    if (i > 0) sb.append("_");
    sb.append(Integer.parseInt(parts[i]));
  }
  return sb.toString();
}
// "1.12.0" → "v1_12_0", "1.1.20" → "v1_1_20"
```
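To see the collision and the fix side by side, the sketch below pairs a reconstruction of the described concatenation behavior (`naiveVersionToPackage`, an assumption based on the examples above, not the actual implementation) with the underscore-separated variant:

```java
public class VersionPackageDemo {
  /** Reconstruction of the flawed behavior: drop the dots, keep the digits. */
  public static String naiveVersionToPackage(String version) {
    String base = version.contains("-") ? version.split("-")[0] : version;
    return "v" + base.replace(".", "");
  }

  /** Suggested fix: keep a separator between the numeric parts. */
  public static String fixedVersionToPackage(String version) {
    String base = version.contains("-") ? version.split("-")[0] : version;
    String[] parts = base.split("\\.");
    StringBuilder sb = new StringBuilder("v");
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) sb.append("_");
      sb.append(Integer.parseInt(parts[i]));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(naiveVersionToPackage("1.12.0")); // v1120
    System.out.println(naiveVersionToPackage("1.1.20")); // v1120 (collision!)
    System.out.println(fixedVersionToPackage("1.12.0")); // v1_12_0
    System.out.println(fixedVersionToPackage("1.1.20")); // v1_1_20
  }
}
```

The underscore keeps the mapping from version strings to package names injective, so each migration version resolves to a unique test package.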
```java
      new MigrationTestEntry(version, "migration execution", "RUN", false, e.getMessage()));
}

if (testCase != null) {
```
💡 Edge Case: AFTER tests still run when migration fails
When a migration fails (lines 106-111), the `migrationFailed` flag is set, but the code only checks `testCase != null` before running `validateAfter` (line 113). The `else if (!migrationFailed)` branch at line 115 only applies to the "(no tests)" placeholder — if a `testCase` exists, its `validateAfter` will still execute against a partially migrated database before the `break` at line 119. This could produce confusing test results or exceptions from the after-validation running against an inconsistent schema.
Suggested fix:
```java
if (testCase != null && !migrationFailed) {
  entries.addAll(runValidation(testCase::validateAfter, handle, version, "AFTER"));
} else if (!migrationFailed) {
  entries.add(new MigrationTestEntry(version, "(no tests)", "-", true, ""));
}
```
Code Review 🚫 Blocked · 0 resolved / 5 findings

Database backup and restore utilities implementation blocked due to critical security and data integrity issues: FK checks disabled on a separate handle, not effective for inserts; SQL injection vulnerability via crafted archive table/column names during restore; version collision risk in `versionToPackage`; and test execution continuing after migration failures.

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:244-250
📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:303
🟡 Playwright Results — all passed (14 flaky)

✅ 3390 passed · ❌ 0 failed · 🟡 14 flaky · ⏭️ 183 skipped

🟡 14 flaky test(s) (passed on retry)

How to debug locally:

```shell
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip  # view trace
```



Summary
- `DatabaseBackupRestore` utility class for full database backup and restore via JDBI
- `backup` and `restore` CLI commands in `OpenMetadataOperations`
- `MigrationTestRunner` with a `MigrationTestCase` interface for migration validation
- `test-migration` CLI command that restores a backup, runs pending migrations one by one, and executes before/after test assertions
- Backups are `.tar.gz` archives with `metadata.json` (timestamp, version, db type, per-table info) and one JSON file per table
- `--force` flag to truncate existing data, and handles FK constraints

Usage
Migration Test Framework
Define test classes per migration version following the naming convention:
The `test-migration` command outputs a summary table:
Test plan
- `extractDatabaseName` (4 tests)
- `versionToPackage` (5 tests)
- Restore without `--force` on a non-empty database fails with table listing
- Restore with `--force` truncates and restores successfully