fix profiler regression by gz · Pull Request #6331 · feldera/feldera

gz · 2026-05-27T16:40:02Z

Describe Manual Test Plan

Loaded new and old bundles in profiler view manually.

Karakatiza666

LGTM. The fix is clean, backward-compatible

mythical-fred

LGTM. The two-regex split (legacy <ISO>_ prefix vs new <ISO>/ directory) plus the (?:^|\/) anchor cleanly covers both flat and nested bundle layouts, and collectionDate correctly rewrites the new --separated time back to : for the JS Date parser. Inline doc comments are great.

Non-blocking observations:

This whole file is now pure-function regex + date parsing — exactly the shape that wants unit tests. profiler-layout has no test runner today, so this PR can't add them without scaffolding vitest first. Worth a tracking issue (feat: add vitest to profiler-layout) so the next regex tweak isn't another manual roundtrip. The new bundle format is locked to a literal trailing Z (no +HH:MM offset); if support-bundle ever emits a non-Z UTC marker the regex silently drops the collection on the floor — a 3-line vitest would catch that.
collectionDirRegex accepts (?:\.\d+)?Z(?:-\d+)?. The collision suffix -N is unbounded; not a real concern, just noting that -9999999999 would parse. Capping at \d{1,3} would match the support-bundle producer's expected output.
The legacy regex [^/_]+ will happily match a timestamp like 2026-01-19T12-55-54Z (the new format!) as a legacy key. Order matters: collectionDirRegex is tried first, so this is benign — worth a one-line comment in collectionKey noting the order is load-bearing.

Manual test plan covers the user-facing path; I'm not blocking on the missing unit tests because the infra isn't there.

1414c0f changed the support bundle format, putting things into folders. The console relied on the format to display a profile. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>

RuntimeHandle::join shut down the tokio merger runtime with: ``` if let Some(rt) = self.runtime.inner().tokio_merger_runtime.lock().unwrap().take() { drop(rt); } ``` The MutexGuard returned by lock() lives until the end of the body, so the mutex stayed locked across drop. Dropping the tokio runtime blocks until every blocking-pool task yields. Those tasks are merger workers, and MergeJob::run (in trace::spine_async) calls Runtime::tokio_merger_runtime() to spawn a child task for a newly-used spine level. That accessor locks the same mutex. A merger task entering this call during the drop window blocked on the mutex. The race became reachable with commit a886ecf "[dbsp] Lazily start merger tasks in spine" which switched dbsp from spawning every level up-front on foreground worker threads to spawning on demand from inside merger tasks. Reproduced on nexmark Q19 by holding the mutex for 10s after take() and forcing a tokio_merger_runtime() call from inside the merger loop. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>

mythical-fred

Two real fixes, both look correct. Blocking on testing per our hard rule (behavior change → must land with a test). Detail inline.

Runtime::tokio_merger_runtime used to .expect() on the slot, so a caller that ran after RuntimeHandle::join had taken the runtime out would panic with "tokio merger runtime has been shut down". Since commit a886ecf "[dbsp] Lazily start merger tasks in spine", a merger task can call this accessor at any time during its lifetime, when this happens after the shutdown logic dropped the tokio merger runtime it creates a unnecessary panic during shutdown which needs to be handled gracefully. Before that commit every level was spawned up-front from foreground worker threads, so the only caller of this accessor was the foreground worker, never the merger runtime itself, and the race could not occur. Return Option<Handle> instead. MergeJob::run, returns early when the slot has already been taken; workers exiting set `request_exit`, so the merger has nothing to do anyway. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>

Covers both fixes from PR #6331. Verified by reverting each fix in turn: - Lock-scope reverted: test times out at 30s. - Lock-scope reverted plus 10s amplifier sleep in `join`: same 30s timeout. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>

mythical-fred

Re-APPROVE on d6ca088b. Blocker addressed: test_drop_does_not_deadlock_with_active_merger_lookup (commit d6ca088) covers both fixes deterministically — no sleep, no time-based assertions, only recv_timeout for deadlock detection (the 30s/5s budgets are upper bounds, not waits). The accessor-hammer + side-thread drop(handle) shape reliably loses the race on the buggy code: the worker holds the mutex across the runtime drop, the merger task spinning inside tokio_merger_runtime() blocks on the same mutex, and the blocking-pool teardown deadlocks against it. With both fixes in place, take() releases the slot, the accessor returns None, and the task exits via the clean-exit channel. Good test.

Spine_async tracing::trace! on the torn-down path is fine — trace is the right level for a benign race observation.

Prior APPROVE on profiler-layout dcd95e5da7 still stands; the two dbsp commits stack cleanly on top.

Karakatiza666 self-requested a review May 27, 2026 16:40

gz changed the title ~~Webconsole format~~ fix profiler regression May 27, 2026

Karakatiza666 approved these changes May 27, 2026

View reviewed changes

mihaibudiu approved these changes May 27, 2026

View reviewed changes

gz enabled auto-merge May 27, 2026 16:43

gz added this pull request to the merge queue May 27, 2026

mythical-fred approved these changes May 27, 2026

View reviewed changes

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 27, 2026

gz added this pull request to the merge queue May 27, 2026

gz removed this pull request from the merge queue due to a manual request May 27, 2026

gz added this pull request to the merge queue May 27, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 27, 2026

ryzhyk added this pull request to the merge queue May 27, 2026

gz and others added 2 commits May 27, 2026 17:45

Adjust profiler to load new format.

810cbc6

1414c0f changed the support bundle format, putting things into folders. The console relied on the format to display a profile. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>

[web-console] Allow support bundle files in the nested dirs

9138d54

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 28, 2026

gz force-pushed the webconsole-format branch from dcd95e5 to 387166a Compare May 28, 2026 01:53

mythical-fred suggested changes May 28, 2026

View reviewed changes

Comment thread crates/dbsp/src/circuit/runtime.rs

Comment thread crates/dbsp/src/trace/spine_async.rs

blp approved these changes May 28, 2026

View reviewed changes

Comment thread crates/dbsp/src/circuit/runtime.rs

gz added 2 commits May 27, 2026 19:05

gz force-pushed the webconsole-format branch from 387166a to d6ca088 Compare May 28, 2026 02:37

gz enabled auto-merge May 28, 2026 02:38

gz added this pull request to the merge queue May 28, 2026

mythical-fred approved these changes May 28, 2026

View reviewed changes

Merged via the queue into main with commit da71240 May 28, 2026
1 check passed

gz deleted the webconsole-format branch May 28, 2026 04:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix profiler regression#6331

fix profiler regression#6331
gz merged 5 commits into
mainfrom
webconsole-format

gz commented May 27, 2026 •

edited

Loading

Uh oh!

Karakatiza666 left a comment

Uh oh!

mythical-fred left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mythical-fred left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mythical-fred left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

gz commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe Manual Test Plan

Uh oh!

Karakatiza666 left a comment

Choose a reason for hiding this comment

Uh oh!

mythical-fred left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mythical-fred left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mythical-fred left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

gz commented May 27, 2026 •

edited

Loading