
Reindex logger#26610

Closed
mohityadav766 wants to merge 13 commits into main from reindex-logger

Conversation

mohityadav766 (Member) commented Mar 19, 2026

Describe your changes:


I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • App run logging infrastructure:
    • Added AppRunLogAppender (Logback appender) to capture logs during app execution via MDC or thread name matching
    • Implemented RunLogBuffer with scheduled flushing and stream listener support for real-time log delivery
    • Added dual storage backends: LocalAppRunLogStorage (filesystem) and S3AppRunLogStorage with buffered uploads
  • REST API endpoints:
    • GET /v1/apps/name/{name}/runs/{runTimestamp}/logs — fetch text logs with server filtering
    • GET /v1/apps/name/{name}/runs/{runTimestamp}/logs/download — download as .log file
    • GET /v1/apps/name/{name}/runs/{runTimestamp}/logs/stream — Server-Sent Events stream for live/archived logs
    • GET /v1/apps/name/{name}/runs/{runTimestamp}/logs/servers — list servers with logs for a run
  • Frontend component:
    • Added AppRunTextLogs UI component with run/server selection, live streaming, download, and copy-to-clipboard
  • SearchIndexExecutor stats fixes:
    • Added periodic sink sync every 2 seconds and entity total adjustment when success+failed exceeds initial total

This will update automatically on new commits.

@mohityadav766 mohityadav766 requested a review from a team as a code owner March 19, 2026 19:09
Copilot AI review requested due to automatic review settings March 19, 2026 19:09
@github-actions github-actions bot added the `backend` and `safe to test` labels Mar 19, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR adds per-run log capture and UI viewing/downloading for internal Applications (notably SearchIndex/reindex), and refines SearchIndexExecutor stats reporting.

Changes:

  • Backend: Introduces a Logback appender + buffer to capture app-run logs to disk and exposes new AppResource endpoints to fetch/download/stream logs.
  • UI: Adds a new “Logs” tab for internal apps and a log viewer that supports run selection, server selection, SSE streaming for active runs, and downloads.
  • SearchIndex: Improves progress/stat consistency by periodically syncing sink stats and ensuring totals don’t drift below observed success+failed counts.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
openmetadata-ui/src/main/resources/ui/src/rest/applicationAPI.ts Adds REST helpers to fetch and download app-run text logs.
openmetadata-ui/src/main/resources/ui/src/constants/constants.ts Adds a socket event constant for app-run logs (currently unused).
openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/MarketPlaceAppDetails/MarketPlaceAppDetails.interface.ts Adds LOGS tab enum value.
openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppRunTextLogs/AppRunTextLogs.interface.ts Defines props/response typings for the new log viewer.
openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppRunTextLogs/AppRunTextLogs.component.tsx Implements the Logs UI (run/server selection, SSE stream, download).
openmetadata-ui/src/main/resources/ui/src/components/Settings/Applications/AppDetails/AppDetails.component.tsx Adds “Logs” tab for internal, scheduled apps.
openmetadata-service/src/main/resources/logback.xml Registers new APP_RUN_LOG appender on root logger.
openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java Adds buffering + periodic flushing + stream listener support.
openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java Implements Logback appender and log retention/listing utilities.
openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java Adds endpoints to get/download/list/stream app-run logs.
openmetadata-service/src/main/java/org/openmetadata/service/apps/scheduler/OmAppJobListener.java Starts/stops log capture for app runs via MDC + thread prefix matching.
openmetadata-service/src/main/java/org/openmetadata/service/socket/WebSocketManager.java Adds an app-run logs channel constant (currently unused).
openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexExecutor.java Periodic sink stat syncing + total record consistency adjustments.
openmetadata-service/src/test/java/org/openmetadata/service/apps/logging/RunLogBufferTest.java Adds unit tests for buffering/flushing/line caps.
openmetadata-service/src/test/java/org/openmetadata/service/apps/logging/AppRunLogAppenderTest.java Adds unit tests for appender behaviors (server listing, retention, concurrency).
openmetadata-service/src/test/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexStatsTest.java Adds tests for new stats consistency behaviors.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

Comment on lines +139 to +146
public boolean exists(String appName, long runTimestamp, String serverId) {
String key = s3Key(appName, runTimestamp, serverId);
try {
s3Client.headObject(HeadObjectRequest.builder().bucket(bucketName).key(key).build());
return true;
} catch (NoSuchKeyException e) {
return false;
}

Copilot AI Mar 20, 2026


exists() only catches NoSuchKeyException, but AWS SDK v2 headObject commonly throws S3Exception with status code 404 for missing keys (depending on client configuration/endpoints). As written, missing objects may surface as uncaught exceptions instead of returning false. Consider catching S3Exception and treating 404/NoSuchKey as non-existent, while rethrowing other errors.
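A minimal sketch of the classification the reviewer suggests. This is not the PR's code: `treatAsMissing` is a hypothetical helper, and the `statusCode`/`errorCode` parameters stand in for AWS SDK v2's `S3Exception.statusCode()` and `awsErrorDetails().errorCode()` so the decision logic can be shown without the SDK dependency.

```java
// Hedged sketch: treat a HEAD failure as "object missing" when the SDK
// surfaces either a 404 status or the NoSuchKey error code; anything else
// should be rethrown by the caller.
public class S3NotFoundCheck {
  static boolean treatAsMissing(int statusCode, String errorCode) {
    return statusCode == 404 || "NoSuchKey".equals(errorCode);
  }
}
```

In `exists()`, this check would sit inside a `catch (S3Exception e)` branch: return `false` when `treatAsMissing(...)` is true, rethrow otherwise.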

Comment on lines +192 to +217
@Override
public void flush() throws IOException {
if (buffer.size() == 0) {
return;
}
byte[] newContent = buffer.toByteArray();
buffer.reset();

byte[] existing = new byte[0];
try {
existing =
client
.getObject(GetObjectRequest.builder().bucket(bucket).key(key).build())
.readAllBytes();
} catch (NoSuchKeyException e) {
// first write
}

byte[] combined = new byte[existing.length + newContent.length];
System.arraycopy(existing, 0, combined, 0, existing.length);
System.arraycopy(newContent, 0, combined, existing.length, newContent.length);

client.putObject(
PutObjectRequest.builder().bucket(bucket).key(key).build(),
RequestBody.fromBytes(combined));
}

Copilot AI Mar 20, 2026


S3AppendOutputStream.flush() reads the entire existing object and rewrites it on every flush to simulate append. For longer runs this becomes increasingly expensive (O(n²) bytes transferred over time) and can significantly increase S3 costs/latency. Consider switching to chunked/object-per-flush storage (then compose on read), multipart uploads, or buffering larger batches and reducing flush frequency to limit read/rewrites.

Comment on lines +799 to +810
try {
while (!Thread.currentThread().isInterrupted()) {
Thread.sleep(5000);
output.write(": heartbeat\n\n".getBytes());
output.flush();
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
} finally {
activeBuffer.removeStreamListener(listener);
}
} else {

Copilot AI Mar 20, 2026


The SSE streaming loop for active runs only exits on thread interruption. When a run completes and AppRunLogAppender.stopCapture(...) removes/closes the buffer, this handler will keep sending heartbeats indefinitely and never emits the done event, which can leak request threads and connections. Add a termination condition inside the loop (e.g., break when AppRunLogAppender.getBuffer(name, String.valueOf(runTimestamp)) is null/closed) and send event: done before returning.
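A sketch of the termination condition the reviewer asks for. Assumptions: `pump` is a hypothetical stand-in for the handler's loop body, the `BooleanSupplier` stands in for a liveness check such as `AppRunLogAppender.getBuffer(name, ts) != null`, and output goes to a `StringBuilder` instead of the servlet stream; `maxBeats` exists only to keep the sketch finite.

```java
import java.util.function.BooleanSupplier;

// Hedged sketch: the heartbeat loop exits when the run's buffer goes away
// (runActive turns false) and always emits the SSE "done" event on the way out.
public class SseHeartbeat {
  static String pump(BooleanSupplier runActive, int maxBeats) {
    StringBuilder out = new StringBuilder();
    int beats = 0;
    while (runActive.getAsBoolean() && beats++ < maxBeats
        && !Thread.currentThread().isInterrupted()) {
      out.append(": heartbeat\n\n"); // SSE comment line keeps the connection warm
    }
    out.append("event: done\ndata: \n\n"); // signal completion before returning
    return out.toString();
  }
}
```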

@Parameter(description = "Server ID filter", schema = @Schema(type = "string"))
@QueryParam("serverId")
String serverId) {
repository.getByName(uriInfo, name, repository.getFields("id"));

Copilot AI Mar 20, 2026


These new log endpoints validate the app exists via repository.getByName(...), but they don’t perform any authorization check (and securityContext is otherwise unused). To avoid exposing app run logs to callers without view permission, route through getByNameInternal(...) / getInternal(...) or explicitly call authorizer.authorize(...) with the appropriate VIEW operation before reading/streaming logs.

Suggested change
repository.getByName(uriInfo, name, repository.getFields("id"));
App app = repository.getByName(uriInfo, name, repository.getFields("id"));
authorizer.authorize(
getSubjectContext(securityContext),
new OperationContext(APPLICATION, MetadataOperation.VIEW_BASIC),
app.getEntityReference());

Comment on lines +144 to +155
static final GenericContainer<?> minio =
new GenericContainer<>("minio/minio:latest")
.withCommand("server /data")
.withExposedPorts(9000)
.withEnv("MINIO_ROOT_USER", MINIO_ACCESS_KEY)
.withEnv("MINIO_ROOT_PASSWORD", MINIO_SECRET_KEY)
.waitingFor(
new HttpWaitStrategy()
.forPath("/minio/health/ready")
.forPort(9000)
.withStartupTimeout(Duration.ofMinutes(1)));


Copilot AI Mar 20, 2026


The MinIO Testcontainers image is pinned to latest, which makes the test suite non-deterministic and can break unexpectedly when MinIO releases new versions. Pin to a specific MinIO image tag (and optionally document/centralize the version) for reproducible CI runs.

Comment on lines +269 to +270
responseCode == 400 || responseCode == 500,
"Path traversal should be rejected, got: " + responseCode);

Copilot AI Mar 20, 2026


This test currently treats a 500 response as acceptable for a path traversal attempt. Since the API should reject invalid serverId with a deterministic 4xx (and the resource code throws BadRequestException for invalid server IDs), tighten the assertion to expect 400 so the test will catch regressions that accidentally turn input validation failures into 500s.

Suggested change
responseCode == 400 || responseCode == 500,
"Path traversal should be rejected, got: " + responseCode);
responseCode == 400,
"Invalid serverId should return 400, got: " + responseCode);

Comment on lines +351 to +367
<pre
data-testid="lazy-log"
ref={logContainerRef}
style={{
height: '60vh',
overflow: 'auto',
margin: 0,
padding: '12px',
backgroundColor: '#222',
color: '#fff',
fontSize: '12px',
fontFamily: '"Monaco", "Menlo", "Consolas", monospace',
lineHeight: 1.6,
borderRadius: '4px',
whiteSpace: 'pre',
tabSize: 4,
}}>

Copilot AI Mar 20, 2026


This component renders logs with a raw `<pre>` element and hardcoded inline styles (including hardcoded hex colors). The UI already has a log viewer implementation (@melloware/react-logviewer / LazyLog) and shared styling (e.g., lazy-log-container) used in AppLogsViewer, which also provides search, selectable lines, and consistent theming. Consider reusing the existing log viewer component/styles and moving any styling to the existing LESS/theme tokens instead of inline hex colors.

Comment on lines +104 to +112
void localStorageRejectsPathTraversal() {
LocalAppRunLogStorage storage = new LocalAppRunLogStorage(tempDir.toString());
try {
storage.readLogs("../../etc", 1L, "passwd");
// Should not reach here
assertFalse(true, "Expected IllegalArgumentException");
} catch (IllegalArgumentException e) {
assertTrue(e.getMessage().contains("Invalid path"));
}

Copilot AI Mar 20, 2026


This path traversal test uses a manual try/catch with assertFalse(true, ...) to indicate failure. Use assertThrows(IllegalArgumentException.class, ...) (and then assert on the message) to make the intent clearer and ensure the test fails correctly if the exception is not thrown.

Copilot AI review requested due to automatic review settings March 20, 2026 09:32

gitar-bot bot commented Mar 20, 2026

Code Review: ⚠️ Changes requested (12 resolved / 15 findings)

Reindex logger adds S3 support and fixes multiple log streaming issues, but three important findings remain: formatLine drops exception stack traces, the SSE endpoint loads entire log files into memory, and SSE listeners silently swallow RuntimeExceptions on client disconnect.

⚠️ Bug: formatLine drops exception stack traces from captured logs

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java:90-99

The formatLine method only uses event.getFormattedMessage() and completely ignores event.getThrowableProxy(). In Logback, exception stack traces are stored separately in the throwable proxy, not in the formatted message. This means all LOG.error("...", exception) calls will have their stack traces silently dropped from the captured app run logs, making it very difficult to debug failed runs — which is the primary use case for this feature.

Suggested fix
static String formatLine(ILoggingEvent event) {
    String timestamp = FORMATTER.format(Instant.ofEpochMilli(event.getTimeStamp()));
    StringBuilder sb = new StringBuilder();
    sb.append(String.format("%s [%s] %-5s %s - %s",
        timestamp, event.getThreadName(), event.getLevel(),
        event.getLoggerName(), event.getFormattedMessage()));
    if (event.getThrowableProxy() != null) {
      sb.append("\n");
      sb.append(ch.qos.logback.classic.spi.ThrowableProxyUtil.asString(
          event.getThrowableProxy()));
    }
    return sb.toString();
}
⚠️ Performance: SSE stream endpoint reads entire log file into memory String

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:757-766

In streamAppRunTextLogs, storage.readLogs() at line 759 reads the entire log file into a single in-memory String, then splits it by newline to write line-by-line as SSE events. With maxLinesPerRun defaulting to 100,000 lines, this could easily be tens of MB per request. The readLogsStream() method exists specifically for streaming, and is already used correctly in the download endpoint.

Suggested fix
// Replace readLogs + split with streaming read:
if (resolvedServerId != null
    && storage.exists(name, runTimestamp, resolvedServerId)) {
  try (var reader = new java.io.BufferedReader(
      new java.io.InputStreamReader(
          storage.readLogsStream(name, runTimestamp, resolvedServerId),
          java.nio.charset.StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
      output.write(("data: " + line + "\n\n")
          .getBytes(java.nio.charset.StandardCharsets.UTF_8));
    }
  }
  output.flush();
}
💡 Edge Case: SSE listener RuntimeException on disconnect swallowed silently

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java:156-164 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:786-796

In the SSE streaming endpoint, when a client disconnects, the listener's output.write() throws IOException which is wrapped in RuntimeException (line 795). This propagates to RunLogBuffer.notifyListeners() which catches it and removes the listener (line 161-163). However, the exception message logged at DEBUG level says "Stream listener error, removing" — this is expected behavior on disconnect, not an error. More importantly, if a listener throws during notifyListeners, it breaks the loop and subsequent listeners for the same batch won't be notified.

Suggested fix
// In RunLogBuffer.notifyListeners, catch per-listener
// to avoid breaking the loop:
private void notifyListeners(String batchText) {
  List<Consumer<String>> toRemove = new ArrayList<>();
  for (Consumer<String> listener : streamListeners) {
    try {
      listener.accept(batchText);
    } catch (Exception e) {
      LOG.debug("Removing disconnected stream listener");
      toRemove.add(listener);
    }
  }
  streamListeners.removeAll(toRemove);
}
✅ 12 resolved
Security: Path traversal via unsanitized name/serverId in log endpoints

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java:160 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:601 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:662 📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:745
The name (path param) and serverId (query param) are passed directly to Paths.get(logDirectory, appName, runTimestamp + "-" + serverId + ".log") without any sanitization. An attacker can supply values like ../../etc/passwd to read arbitrary files on the server. This affects all four new endpoints: getAppRunTextLogs, downloadAppRunTextLogs, streamAppRunTextLogs, and getAppRunLogServers.

While repository.getByName() is called first (which validates the app name exists in the DB), the serverId query parameter has no such validation and can contain path traversal sequences. Even for name, if any app name in the DB contains special characters, it could be exploited.

Fix: Resolve the constructed path to its canonical form and verify it still starts with the expected log directory before performing any I/O.
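The canonical-path check described above can be sketched as follows. `resolveLogPath` is a hypothetical helper name, not the PR's actual method; it normalizes the constructed path and rejects anything that escapes the log root before any I/O happens.

```java
import java.nio.file.Path;

// Hedged sketch: normalize the resolved path and verify it is still under
// the log root directory, rejecting traversal sequences like "../../etc".
public class SafeLogPath {
  static Path resolveLogPath(Path logRoot, String appName, String fileName) {
    Path resolved = logRoot.resolve(appName).resolve(fileName).normalize();
    if (!resolved.startsWith(logRoot.normalize())) {
      throw new IllegalArgumentException("Invalid path: escapes log directory");
    }
    return resolved;
  }
}
```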

Bug: MDC cleanup skipped if jobWasExecuted throws before cleanup

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/scheduler/OmAppJobListener.java:201
In OmAppJobListener.jobWasExecuted, the MDC cleanup (MDC.remove(...) at lines 207-210) and AppRunLogAppender.stopCapture() (line 205) are inside the try block but not in a finally block. If any code between lines 140-205 throws an exception (e.g., NPE from null runRecord, JSON parsing failure, WebSocket error), the MDC entries leak on the Quartz scheduler thread for its entire lifetime. This could cause subsequent unrelated jobs on the same thread to have their log events incorrectly routed to the wrong buffer, and the RunLogBuffer (including its flusher ScheduledExecutorService) would never be closed — leaking a thread.

Bug: TOCTOU race in RunLogBuffer.append allows exceeding maxLines

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java:52
The append method does if (totalLineCount.get() >= maxLines) return then totalLineCount.incrementAndGet() — a classic check-then-act race. Since append is called from the Logback appender on arbitrary application threads concurrently, multiple threads can pass the guard simultaneously and push the line count beyond maxLines. The practical impact is low (a soft cap on log lines slightly exceeded), but if strict enforcement is needed, use compareAndSet or getAndIncrement atomically.
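If strict enforcement is wanted, the check-then-act pair can be replaced with a CAS loop. This is a sketch under assumed names (`LineCap`, `tryReserveLine`), not the PR's `RunLogBuffer` code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch: a CAS loop makes the maxLines guard atomic, so concurrent
// appender threads cannot all pass the check and exceed the cap.
public class LineCap {
  private final AtomicLong totalLineCount = new AtomicLong();
  private final long maxLines;

  LineCap(long maxLines) {
    this.maxLines = maxLines;
  }

  // Returns true only if the caller atomically won a slot below the cap.
  boolean tryReserveLine() {
    long cur;
    do {
      cur = totalLineCount.get();
      if (cur >= maxLines) {
        return false; // cap reached; drop the line
      }
    } while (!totalLineCount.compareAndSet(cur, cur + 1));
    return true;
  }
}
```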

Edge Case: Logback config registers appender twice (XML + programmatic)

📄 openmetadata-service/src/main/resources/logback.xml:26 📄 openmetadata-service/src/main/resources/logback.xml:34 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java:102
The APP_RUN_LOG appender is declared in logback.xml (attached to root logger), but ensureRegistered() in AppRunLogAppender also programmatically creates and attaches a second instance to the root logger. This means every log event is processed by two AppRunLogAppender instances, resulting in duplicate lines in the log buffers. The XML config should either omit the appender-ref for APP_RUN_LOG from the root logger (relying on programmatic registration) or ensureRegistered() should check for existing appenders by name before adding a new one.

Bug: getBuffer called with wrong arity in streamAppRunTextLogs

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:770 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java:166
At line 770, AppRunLogAppender.getBuffer(String.valueOf(runTimestamp)) is called with a single argument, but the method signature requires two: getBuffer(String appName, String runTimestamp). This will cause a compilation error, making the entire SSE streaming endpoint non-functional.

The correct call (as used on line 614) should include the name parameter.

...and 7 more resolved from earlier reviews

🤖 Prompt for agents
Code Review: Reindex logger adds S3 support and fixes multiple log streaming issues, but three important findings remain: `formatLine` drops exception stack traces, the SSE endpoint loads entire log files into memory, and SSE listeners silently swallow RuntimeExceptions on client disconnect.

1. ⚠️ Bug: formatLine drops exception stack traces from captured logs
   Files: openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java:90-99

   The `formatLine` method only uses `event.getFormattedMessage()` and completely ignores `event.getThrowableProxy()`. In Logback, exception stack traces are stored separately in the throwable proxy, not in the formatted message. This means all `LOG.error("...", exception)` calls will have their stack traces silently dropped from the captured app run logs, making it very difficult to debug failed runs — which is the primary use case for this feature.

   Suggested fix:
   static String formatLine(ILoggingEvent event) {
       String timestamp = FORMATTER.format(Instant.ofEpochMilli(event.getTimeStamp()));
       StringBuilder sb = new StringBuilder();
       sb.append(String.format("%s [%s] %-5s %s - %s",
           timestamp, event.getThreadName(), event.getLevel(),
           event.getLoggerName(), event.getFormattedMessage()));
       if (event.getThrowableProxy() != null) {
         sb.append("\n");
         sb.append(ch.qos.logback.classic.spi.ThrowableProxyUtil.asString(
             event.getThrowableProxy()));
       }
       return sb.toString();
   }

2. ⚠️ Performance: SSE stream endpoint reads entire log file into memory String
   Files: openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:757-766

   In `streamAppRunTextLogs`, `storage.readLogs()` at line 759 reads the entire log file into a single in-memory String, then splits it by newline to write line-by-line as SSE events. With `maxLinesPerRun` defaulting to 100,000 lines, this could easily be tens of MB per request. The `readLogsStream()` method exists specifically for streaming, and is already used correctly in the download endpoint.

   Suggested fix:
   // Replace readLogs + split with streaming read:
   if (resolvedServerId != null
       && storage.exists(name, runTimestamp, resolvedServerId)) {
     try (var reader = new java.io.BufferedReader(
         new java.io.InputStreamReader(
             storage.readLogsStream(name, runTimestamp, resolvedServerId),
             java.nio.charset.StandardCharsets.UTF_8))) {
       String line;
       while ((line = reader.readLine()) != null) {
         output.write(("data: " + line + "\n\n")
             .getBytes(java.nio.charset.StandardCharsets.UTF_8));
       }
     }
     output.flush();
   }

3. 💡 Edge Case: SSE listener RuntimeException on disconnect swallowed silently
   Files: openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java:156-164, openmetadata-service/src/main/java/org/openmetadata/service/resources/apps/AppResource.java:786-796

   In the SSE streaming endpoint, when a client disconnects, the listener's `output.write()` throws IOException which is wrapped in RuntimeException (line 795). This propagates to `RunLogBuffer.notifyListeners()` which catches it and removes the listener (line 161-163). However, the exception message logged at DEBUG level says "Stream listener error, removing" — this is expected behavior on disconnect, not an error. More importantly, if a listener throws during `notifyListeners`, it breaks the loop and subsequent listeners for the same batch won't be notified.

   Suggested fix:
   // In RunLogBuffer.notifyListeners, catch per-listener
   // to avoid breaking the loop:
   private void notifyListeners(String batchText) {
     List<Consumer<String>> toRemove = new ArrayList<>();
     for (Consumer<String> listener : streamListeners) {
       try {
         listener.accept(batchText);
       } catch (Exception e) {
         LOG.debug("Removing disconnected stream listener");
         toRemove.add(listener);
       }
     }
     streamListeners.removeAll(toRemove);
   }


Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

@Parameter(description = "Run timestamp", schema = @Schema(type = "number"))
@PathParam("runTimestamp")
Long runTimestamp) {
repository.getByName(uriInfo, name, repository.getFields("id"));

Copilot AI Mar 20, 2026


getAppRunLogServers returns server IDs for a run without any authorization check. Even if the content is just metadata, it can leak operational details; please authorize VIEW access to the App before returning this information.

Suggested change
repository.getByName(uriInfo, name, repository.getFields("id"));
App app = repository.getByName(uriInfo, name, repository.getFields("id"));
SubjectContext subjectContext = getSubjectContext(securityContext);
OperationContext operationContext =
new OperationContext(APPLICATION, MetadataOperation.VIEW_ALL);
authorizer.authorize(subjectContext, operationContext, app.getEntityReference());

Comment on lines +1523 to +1527
long now = System.currentTimeMillis();
if (now - lastSinkSyncTime >= SINK_SYNC_INTERVAL_MS) {
lastSinkSyncTime = now;
syncSinkStatsFromBulkSink();
}

Copilot AI Mar 20, 2026


periodicSyncSinkStats is invoked from multiple consumer threads, but the now - lastSinkSyncTime check and the lastSinkSyncTime = now update are not atomic. Under concurrency, multiple threads can pass the check and trigger syncSinkStatsFromBulkSink() far more often than intended. Use an AtomicLong with CAS (or a synchronized/locked section) to enforce the interval reliably.
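The AtomicLong-with-CAS approach the comment suggests can be sketched like this. `SyncThrottle` and `shouldSync` are hypothetical names; the interval value mirrors the PR's 2-second `SINK_SYNC_INTERVAL_MS` but is just a constructor argument here.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch: a single compareAndSet decides which thread performs the
// periodic sync, making the interval check-and-update atomic under concurrency.
public class SyncThrottle {
  private final AtomicLong lastSyncTime = new AtomicLong();
  private final long intervalMs;

  SyncThrottle(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  // Returns true for exactly one caller per elapsed interval.
  boolean shouldSync(long nowMs) {
    long prev = lastSyncTime.get();
    return nowMs - prev >= intervalMs && lastSyncTime.compareAndSet(prev, nowMs);
  }
}
```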


@Container
static final GenericContainer<?> minio =
new GenericContainer<>("minio/minio:latest")

Copilot AI Mar 20, 2026


The MinIO Testcontainers image is pinned to latest, which makes tests non-reproducible and can introduce sudden breakages when the upstream image changes. Pin this to a specific MinIO release tag (and update intentionally when needed).

Suggested change
new GenericContainer<>("minio/minio:latest")
new GenericContainer<>("minio/minio:RELEASE.2024-01-18T21-02-27Z")

@QueryParam("serverId")
String serverId) {
repository.getByName(uriInfo, name, repository.getFields("id"));


Copilot AI Mar 20, 2026


streamAppRunTextLogs streams potentially sensitive log content but does not authorize the request (it only validates the app exists). Please add an explicit authorization check for VIEW access to the App before opening the SSE stream.

Suggested change
// Authorize VIEW access on Apps before streaming potentially sensitive logs
authorizer.authorize(
securityContext, new OperationContext(APPLICATION, MetadataOperation.VIEW));

Comment on lines +784 to +788
java.util.function.Consumer<String> listener =
batchText -> {
try {
for (String logLine : batchText.split("\n")) {
output.write(

Copilot AI Mar 20, 2026


The SSE listener writes to the same output stream that the request thread is also writing to (heartbeats / done events). Because RunLogBuffer notifies listeners from its scheduled flusher thread, this results in concurrent writes to output, which is not thread-safe and can interleave/corrupt the SSE stream or throw sporadic IO errors. Consider funneling all writes through a single thread (e.g., listener enqueues lines into a BlockingQueue that the SSE loop drains) or synchronizing all output writes on a shared lock.
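The BlockingQueue option can be sketched as below. Assumptions: `SingleWriterSse`, `onBatch`, and `drainOnce` are hypothetical names, and a `StringBuilder` stands in for the SSE `OutputStream`; the point is that only one thread ever touches the output.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hedged sketch: listener threads only enqueue; the request thread is the
// sole writer, draining the queue and falling back to heartbeats on timeout.
public class SingleWriterSse {
  final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  final StringBuilder wire = new StringBuilder(); // stands in for the SSE OutputStream

  // Called from the buffer's flusher thread: no direct output writes here.
  void onBatch(String batchText) {
    queue.offer(batchText);
  }

  // Called in a loop on the request thread: the only place output is written.
  void drainOnce(long timeoutMs) {
    String batch;
    try {
      batch = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return;
    }
    if (batch == null) {
      wire.append(": heartbeat\n\n"); // idle: keep the connection alive
    } else {
      for (String line : batch.split("\n")) {
        wire.append("data: ").append(line).append("\n\n");
      }
    }
  }
}
```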

@Parameter(description = "Server ID filter", schema = @Schema(type = "string"))
@QueryParam("serverId")
String serverId) {
repository.getByName(uriInfo, name, repository.getFields("id"));

Copilot AI Mar 20, 2026


downloadAppRunTextLogs also bypasses the usual authorizer.authorize(...) checks before streaming log content. Please add an explicit authorization check for VIEW access on the App before allowing download.

Suggested change
repository.getByName(uriInfo, name, repository.getFields("id"));
App app = repository.getByName(uriInfo, name, repository.getFields("id"));
SubjectContext subjectContext = getSubjectContext(securityContext);
OperationContext operationContext =
new OperationContext(APPLICATION, MetadataOperation.VIEW_BASIC);
authorizer.authorize(subjectContext, operationContext, getResourceContext(app));

@mohityadav766
Copy link
Member Author

closing this

Comment on lines +90 to +99
static String formatLine(ILoggingEvent event) {
String timestamp = FORMATTER.format(Instant.ofEpochMilli(event.getTimeStamp()));
return String.format(
"%s [%s] %-5s %s - %s",
timestamp,
event.getThreadName(),
event.getLevel(),
event.getLoggerName(),
event.getFormattedMessage());
}

⚠️ Bug: formatLine drops exception stack traces from captured logs

The formatLine method only uses event.getFormattedMessage() and completely ignores event.getThrowableProxy(). In Logback, exception stack traces are stored separately in the throwable proxy, not in the formatted message. This means all LOG.error("...", exception) calls will have their stack traces silently dropped from the captured app run logs, making it very difficult to debug failed runs — which is the primary use case for this feature.

Suggested fix:

static String formatLine(ILoggingEvent event) {
    String timestamp = FORMATTER.format(Instant.ofEpochMilli(event.getTimeStamp()));
    StringBuilder sb = new StringBuilder();
    sb.append(String.format("%s [%s] %-5s %s - %s",
        timestamp, event.getThreadName(), event.getLevel(),
        event.getLoggerName(), event.getFormattedMessage()));
    if (event.getThrowableProxy() != null) {
      sb.append('\n');
      // ThrowableProxyUtil renders the full stack trace, including causes
      sb.append(ch.qos.logback.classic.spi.ThrowableProxyUtil
          .asString(event.getThrowableProxy()));
    }
    return sb.toString();
}


Comment on lines +757 to +766
if (resolvedServerId != null
&& storage.exists(name, runTimestamp, resolvedServerId)) {
String content = storage.readLogs(name, runTimestamp, resolvedServerId);
for (String line : content.split("\n")) {
output.write(
("data: " + line + "\n\n")
.getBytes(java.nio.charset.StandardCharsets.UTF_8));
}
output.flush();
}

⚠️ Performance: SSE stream endpoint reads entire log file into memory String

In streamAppRunTextLogs, storage.readLogs() at line 759 reads the entire log file into a single in-memory String, then splits it by newline to write line-by-line as SSE events. With maxLinesPerRun defaulting to 100,000 lines, this could easily be tens of MB per request. The readLogsStream() method exists specifically for streaming, and is already used correctly in the download endpoint.

Suggested fix:

// Replace readLogs + split with streaming read:
if (resolvedServerId != null
    && storage.exists(name, runTimestamp, resolvedServerId)) {
  try (var reader = new java.io.BufferedReader(
      new java.io.InputStreamReader(
          storage.readLogsStream(name, runTimestamp, resolvedServerId),
          java.nio.charset.StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
      output.write(("data: " + line + "\n\n")
          .getBytes(java.nio.charset.StandardCharsets.UTF_8));
    }
  }
  output.flush();
}


Comment on lines +156 to +164
private void notifyListeners(String batchText) {
for (Consumer<String> listener : streamListeners) {
try {
listener.accept(batchText);
} catch (Exception e) {
LOG.debug("Stream listener error, removing: {}", e.getMessage());
streamListeners.remove(listener);
}
}

💡 Edge Case: SSE listener RuntimeException on disconnect swallowed silently

In the SSE streaming endpoint, when a client disconnects, the listener's output.write() throws IOException, which is wrapped in a RuntimeException (line 795). This propagates to RunLogBuffer.notifyListeners(), which catches it and removes the listener (lines 161-163). However, the message logged at DEBUG level says "Stream listener error, removing" — this is expected behavior on disconnect, not an error. More importantly, if a listener throws during notifyListeners, removing it from the collection while iterating can break the loop, so subsequent listeners for the same batch won't be notified.

Suggested fix:

// In RunLogBuffer.notifyListeners, catch per-listener
// to avoid breaking the loop:
private void notifyListeners(String batchText) {
  List<Consumer<String>> toRemove = new ArrayList<>();
  for (Consumer<String> listener : streamListeners) {
    try {
      listener.accept(batchText);
    } catch (Exception e) {
      LOG.debug("Removing disconnected stream listener");
      toRemove.add(listener);
    }
  }
  streamListeners.removeAll(toRemove);
}



Labels

backend, safe to test

2 participants