RUM-16502: Dedup fatal ANR reported by Profiling pipeline#3479
Conversation
This comment has been minimized.
This comment has been minimized.
d28368b to
12b8bd5
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/continuous-profiling #3479 +/- ##
================================================================
- Coverage 72.36% 72.36% -0.00%
================================================================
Files 979 979
Lines 36242 36246 +4
Branches 6034 6036 +2
================================================================
+ Hits 26225 26227 +2
- Misses 8359 8366 +7
+ Partials 1658 1653 -5
🚀 New features to boost your workflow:
|
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12b8bd5faf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12b8bd5faf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
12b8bd5 to
d05b2fe
Compare
| Log.i("DatadogSynthetics", "$key=$value") | ||
| } | ||
|
|
||
| private fun persistLastFatalAnrSent(timestampMs: Long) { |
There was a problem hiding this comment.
this method is already called on the I/O (writeScope) thread, no need to push to the persistence executor explicitly.
There was a problem hiding this comment.
removed the function, call it directly in onSuccess callback
| onSuccess { it.eventSent(rumContext.viewId.orEmpty(), StorageEvent.Error()) } | ||
| onSuccess { | ||
| it.eventSent(rumContext.viewId.orEmpty(), StorageEvent.Error()) | ||
| if (event.throwable is ANRException) { |
There was a problem hiding this comment.
there is one concern: if customer has trackNonFatalAnrs, such ANR will also pass here. I'm not sure if deduplication is useful for ANR coming from the ANRDetectorRunnable, because there we have many false positives (and probably we could also deduplicate between ANRDetectorRunnable and ANR trigger), but there is no information about ANR origin, so I guess we can live with it.
There was a problem hiding this comment.
yes, this is a trade-off, onSuccess is the most reliable place that we can mark the ANR is written, but here it already lost the information about the ANR origin. Checking if it is from Profiling trigger here can make the code quite messy
d05b2fe to
18aa23b
Compare
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
| * Writes timestamp of the last fatal ANR sent. | ||
| * Writes the timestamp of the last ANR RUM reported. Despite the "fatal" in the name, | ||
| * this same timestamp is also written for non-fatal ANRs (see | ||
| * `RumViewScope.persistLastNonFatalAnrSent`) so that the fatal-ANR pipeline, which |
There was a problem hiding this comment.
nit:
persistLastNonFatalAnrSent no longer exits
What does this PR do?
Dedups fatal ANRs so the late-crash reporter does not re-emit an ANR that was already reported in-session — covering both the new Profiling pipeline path and the existing
ANRDetectorRunnablepath.The marker write now lives in
RumViewScope.onAddError: when anAddErrorevent with anANRExceptionthrowable is successfully written,lastFatalAnrSentis persisted (usingevent.eventTime.timestamp, which is device time and matchesApplicationExitInfo.timestamp). On the next launch,DatadogLateCrashReporter.handleAnrCrashskips re-emitting when the marker is within a 10s tolerance ofApplicationExitInfo.timestamp(the two come from different moments — in-process detection vs. process death — so exact equality won't hold).Hooking at the write path (rather than at each call site) means:
RumViewManagerScope.handleBackgroundEventno longer write the marker → no false-negative on background-fatal cases.ANRDetectorRunnablepath are covered by the same hook → the pre-existing duplicate where an in-session foreground ANR turned fatal is also fixed.Motivation
RUM-16502 — without this, an ANR detected and uploaded by the profiling pipeline is duplicated as a fatal ANR error in RUM on the next launch. The same gap also existed in the legacy path (foreground ANR reported in-session, user closes the ANR dialog, fatal ANR re-reported next launch) and is closed by the same fix.
Additional Notes
Possible regression to the existing late-crash dedup: the previous check compared
ApplicationExitInfo.timestampto itself (marker written byhandleAnrCrash, read on next launch — exact equality always held). With the 10s tolerance, two distinct fatal ANRs whoseApplicationExitInfo.timestampvalues fall within 10s of each other would now be treated as duplicates.In practice this requires: ANR → process death → user relaunches the app → second ANR → process death, all within 10s of the first death. The relaunch alone almost always takes longer than 10s, so this collision is only realistic in a tight crash loop, where collapsing the two reports is arguably acceptable. The same-source case (one exit-info entry re-read across multiple launches) still dedups identically since `|0| <= 10_000`.
Review checklist (to be filled by reviewers)