[Experimental] Shared-memory transport for large node IPC payloads by JanProvaznik · Pull Request #14054 · dotnet/msbuild

JanProvaznik · 2026-06-12T15:32:30Z

Summary

Experimental and exploratory; not for merge as-is. This PR adds an opt-in shared-memory fast path for large MSBuild node-IPC payloads, plus the instrumentation used to measure it. It targets the -mt (multithreaded) build, where non-thread-safe tasks (notably Csc/Vbc, which are not marked [MSBuildMultiThreadableTask]) are routed to out-of-proc TaskHost sidecars, so their TaskHostConfiguration and result packets cross the named-pipe transport.

The named pipe still carries every packet header and all small/control packets inline, so framing, version negotiation, disconnect detection, node reuse and shutdown are unchanged. Only packet bodies ≥ 8 KB are moved through a per-direction memory-mapped slot.
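The routing rule above can be sketched as follows. This is a minimal illustration, not the PR's code: it assumes the flag is bit 31 of a little-endian 32-bit length field and that 8 KB means 8192 bytes, and the names `encode_length` and `decode_length` are hypothetical.

```python
import struct

SHM_FLAG = 1 << 31        # assumption: the "spare high bit" is bit 31
LENGTH_MASK = SHM_FLAG - 1
THRESHOLD = 8 * 1024      # 8 KB threshold from the PR description


def encode_length(body_len: int, shm_enabled: bool) -> bytes:
    """Pack the length field, flagging bodies that travel via shared memory."""
    if not 0 <= body_len <= LENGTH_MASK:
        raise ValueError("body length does not fit below the flag bit")
    use_shm = shm_enabled and body_len >= THRESHOLD
    field = body_len | (SHM_FLAG if use_shm else 0)
    return struct.pack("<I", field)


def decode_length(raw: bytes) -> tuple[int, bool]:
    """Return (body length, delivered-via-shared-memory flag)."""
    (field,) = struct.unpack("<I", raw)
    return field & LENGTH_MASK, bool(field & SHM_FLAG)
```

Because the header always travels on the pipe, a reader in this sketch learns both the body length and where to fetch the body from a single 4-byte read.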

Problem (measured)

The out-of-proc transport sends serialized packets with a synchronous PipeStream.Write on a per-node drain thread (NodeProviderOutOfProcBase.DrainPacketQueue). That write is backpressure-bound: it blocks until the child drains the 128 KB kernel pipe buffer.

Instrumenting the actual codepath (MSBUILDIPCSTATS, see IpcTransferStats) on a deterministic OrchardCore.Cms.Web -t:Rebuild -mt -nr:false build, which moves 434 MB of large (≥ 8 KB) packet bodies, shows the main process spending:

| codepath (main process, large packets) | named pipe (baseline), run 1 / run 2 | shared memory, run 1 / run 2 | speedup |
| --- | --- | --- | --- |
| send (~5,000 pkts / 434 MB) | 83.4 / 109.4 s (3.8–5.0 MB/s) | 0.38 / 0.40 s (~1.1 GB/s) | ~210–290× |
| receive (1,664 pkts / 120 MB) | 0.32 / 0.34 s | 0.054 / 0.078 s | ~4–6× |
| total transport time | 83.7 / 109.8 s | 0.43 / 0.48 s | ~190–250× |

(2 runs each; same binary, only the env var toggles the feature.) The ~5 MB/s send throughput confirms the cost is blocking on backpressure, not raw copy bandwidth.
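As a sanity check, the send-row figures can be recomputed from the reported totals. The sketch below uses only numbers from the table; dividing the rounded 434 MB by the reported times gives roughly 4.0 to 5.2 MB/s, close to the quoted 3.8–5.0 MB/s (the small gap is presumably rounding or per-run byte totals, which is an assumption on my part).

```python
MB_SENT = 434                   # large-packet bytes sent, from the table
PIPE_SEND_S = (83.4, 109.4)     # named-pipe send time, two runs
SHM_SEND_S = (0.38, 0.40)       # shared-memory send time, two runs


def throughput_mb_s(megabytes: float, seconds: float) -> float:
    """Average throughput in MB/s for one run."""
    return megabytes / seconds


pipe_rates = [throughput_mb_s(MB_SENT, s) for s in PIPE_SEND_S]
shm_rates = [throughput_mb_s(MB_SENT, s) for s in SHM_SEND_S]

# Compare every baseline run against every shared-memory run to get the range.
speedups = [p / s for p in PIPE_SEND_S for s in SHM_SEND_S]

print(f"pipe: {min(pipe_rates):.1f}-{max(pipe_rates):.1f} MB/s")
print(f"shm:  {min(shm_rates):.0f}-{max(shm_rates):.0f} MB/s")
print(f"send speedup: {min(speedups):.0f}x-{max(speedups):.0f}x")
```

Throughput this low for the pipe path is far below memory-copy speed, which is consistent with the time going to blocking rather than to copying.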

Approach

A spare high bit of the 32-bit packet-length header field flags "body delivered via shared memory" (independent of the existing type-byte extended-header bit).
NodeSharedMemoryChannel: one memory-mapped slot (1 MB) + two named semaphores per direction, strictly single-producer/single-consumer. Writer creates, reader opens (race-free: the writer creates its slot before the flagged header reaches the pipe; the reader only opens after observing the header). Payloads larger than the slot stream in chunks.
Ordering is direction-specific (measured): the parent sends configs back-to-back and must write the header first (body-first was ~25× slower — 12–14 s — from AcquireEmpty stalls). The child sends results spaced by task execution, so it publishes the body before the header, which collapses the parent's receive time from ~1.7 s to ~0.05 s.
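The single-slot, two-semaphore handoff with chunking can be illustrated in-process. This is a sketch under stated assumptions, not NodeSharedMemoryChannel itself: a `bytearray` stands in for the memory-mapped slot, `threading.Semaphore` stands in for the named semaphores, and the `Slot` class and `round_trip` helper are hypothetical names. The reader is given the total length up front, mirroring the fact that the length arrives in the header on the pipe.

```python
import threading


class Slot:
    """One fixed-size slot guarded by an 'empty' and a 'full' semaphore."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.used = 0                          # bytes valid in the current chunk
        self.empty = threading.Semaphore(1)    # slot is free for the writer
        self.full = threading.Semaphore(0)     # slot holds an unread chunk

    def write(self, payload: bytes) -> None:
        """Producer: stream the payload through the slot chunk by chunk."""
        view = memoryview(payload)
        for off in range(0, len(view), len(self.buf)):
            chunk = view[off:off + len(self.buf)]
            self.empty.acquire()               # wait until the reader drained it
            self.buf[:len(chunk)] = chunk
            self.used = len(chunk)
            self.full.release()                # hand the chunk to the reader

    def read(self, total_len: int) -> bytes:
        """Consumer: collect chunks until total_len bytes have arrived."""
        out = bytearray()
        while len(out) < total_len:
            self.full.acquire()                # wait for the next chunk
            out += self.buf[:self.used]
            self.empty.release()               # give the slot back
        return bytes(out)


def round_trip(payload: bytes, slot_size: int) -> bytes:
    """Send payload through one slot with one writer and one reader thread."""
    slot = Slot(slot_size)
    result = []
    reader = threading.Thread(
        target=lambda: result.append(slot.read(len(payload))))
    reader.start()
    slot.write(payload)
    reader.join()
    return result[0]
```

With a single slot, a writer that sends several bodies back-to-back blocks on the `empty` semaphore until the reader catches up, which is one way to picture the AcquireEmpty stalls mentioned for the body-first ordering on the parent side.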

Correctness & safety

Opt-in via MSBUILDSHAREDMEMORYIPC=1; default behavior is byte-for-byte unchanged.
The env var is inherited by launched nodes, so both endpoints agree. The measurement runs also set a distinct MSBUILDNODEHANDSHAKESALT so flagged/unflagged nodes can never pair (defense against node reuse mixing).
Windows-only at runtime (named MMF/semaphores are cross-process there) and compiled only under #if NET, so the .NET Framework / CLR2 task host always uses the pipe.
No serialization change: both paths still do the full Translate/BinaryTranslator round-trip; only the byte transport differs.
Validated: a standalone two-role protocol test (round-trips up to 5 MB, multi-chunk, byte-for-byte); every instrumented OrchardCore build exits 0; tracing confirms both-side engagement (~16 MB offloaded per project across all TaskHost children).

End-to-end effect (for context)

Full -mt Rebuild wall-clock time improved by only ~2–6% in the best case (often within machine noise), because the 80–110 s of pipe-write time is largely overlapped across parallel TaskHost drain threads and sits off the critical path. When IPC is forced to be the bottleneck (MSBUILDFORCEALLTASKSOUTOFPROC=1), the shared-memory arm is consistently ~5% faster, and the timing ranges of the two arms do not overlap. The headline is the transport codepath number, not the end-to-end result.

Limitations / open questions (why this is a draft)

Cross-platform: needs a Unix story (named semaphores/MMF differ); currently Windows-gated.
Engagement should be negotiated in the handshake, not via an env-var + salt trick, before this could ship.
Node reuse across builds is only lightly exercised here (measurements use -nr:false); slot lifetime/renaming across reconnects needs hardening.
Threshold (8 KB) and slot size (1 MB) are untuned.
IpcTransferStats is measurement scaffolding (opt-in, zero overhead when off) — would be dropped or moved behind ETW for a real change.
A zero-copy read (deserialize directly from the MMF view) was evaluated and rejected: the reader-side copy already runs at ~1.5–2.1 GB/s, so it would save ~30–60 ms/build for real correctness risk on the IPC path.

Try it

set MSBUILDSHAREDMEMORYIPC=1
set MSBUILDNODEHANDSHAKESALT=shmopt
dotnet msbuild <proj> -t:Rebuild -mt

Add MSBUILDIPCSTATS=1 (and optionally MSBUILDIPCSTATSFILE=<path>) to dump the per-mechanism transfer timings at process exit.

Offloads packet bodies >=8KB from the named pipe to a per-direction memory-mapped slot (1MB) + two semaphores in out-of-proc TaskHost / worker communication. The pipe still carries every header and small/control packets inline, so framing, versioning, disconnect and reuse are unchanged. Measured on a deterministic OrchardCore.Cms.Web -mt Rebuild (434MB of large packets): main-process transport time drops from ~84-110s (synchronous, backpressure-bound PipeStream.Write at ~5 MB/s) to ~0.43-0.48s (~1 GB/s), a ~190-250x reduction on the codepath. Opt-in via MSBUILDSHAREDMEMORYIPC=1, Windows-only, #if NET. Includes MSBUILDIPCSTATS measurement scaffolding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

JanProvaznik wants to merge 1 commit into main from dev/janprovaznik/shared-memory-ipc
