
[WIP] Add AS-aware relay selection #1514

Draft
Mshehu5 wants to merge 6 commits into payjoin:master from Mshehu5:ASN_asmap

Conversation

Mshehu5 commented May 1, 2026

Contributor

This PR is a proof of concept for AS-aware, stateless relay selection in payjoin-cli and is meant to move the discussion in #919 forward. Reviewers will probably want to read #919 for the full context behind the design and tradeoffs.

The implementation adds optional ASMap-based filtering for trusted directories and OHTTP relays, then demonstrates relay selection that separates POST and POLL traffic without storing a selected relay or relay index. Relay ordering is derived from the receiver key, the request/message type, and short time windows, with polling avoiding POST-reserved AS buckets for nearby windows.
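
As a rough illustration of the stateless ordering described above, here is a minimal sketch. It assumes relays are ranked by a hash over (receiver key, request kind, time window, relay). The names `RequestKind`, `time_window`, and `ordered_relays` are hypothetical, and FNV-1a is only a small, stable stand-in for whatever hash the PR actually uses.

```rust
/// Hypothetical request kinds; the PR's real enum also carries a message kind.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RequestKind {
    Post,
    Poll,
}

/// Window length in seconds (mirrors WINDOW_SECS = 30 from the diff).
pub const WINDOW_SECS: u64 = 30;

/// FNV-1a over a list of byte chunks. Stand-in hash for illustration only;
/// a real implementation would likely want a cryptographic hash.
fn fnv1a(chunks: &[&[u8]]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for chunk in chunks {
        for &b in chunk.iter() {
            h ^= u64::from(b);
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h
}

/// Map a Unix timestamp to its window index.
pub fn time_window(unix_secs: u64) -> u64 {
    unix_secs / WINDOW_SECS
}

/// Order relays by hash of (receiver key, kind, window, relay URL).
/// Nothing is stored: the same inputs always give the same order.
pub fn ordered_relays(
    receiver_key: &[u8],
    kind: RequestKind,
    window: u64,
    relays: &[&str],
) -> Vec<String> {
    let kind_tag: &[u8] = match kind {
        RequestKind::Post => b"post",
        RequestKind::Poll => b"poll",
    };
    let window_bytes = window.to_be_bytes();
    let mut scored: Vec<(u64, &str)> = relays
        .iter()
        .map(|r| {
            let score = fnv1a(&[receiver_key, kind_tag, &window_bytes[..], r.as_bytes()]);
            (score, *r)
        })
        .collect();
    scored.sort();
    scored.into_iter().map(|(_, r)| r.to_string()).collect()
}
```

Two calls inside the same 30-second window with the same key and kind return the same order, which is why no relay index has to be persisted between requests.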

I also added comments around parts of the implementation that may be candidates for moving into the payjoin crate or possibly into a reusable external crate. These comments are suggestions, not final API proposals, and reviewer feedback would be very welcome. One reason for calling these pieces out is the concern raised in #919: asking every downstream wallet to reimplement this behavior could create extra work and consistency problems.

Disclosure: co-authored by Codex


coveralls commented May 1, 2026

Collaborator

Coverage Report for CI Build 26363362204

Coverage increased (+0.2%) to 85.299%

Details

  • Coverage increased (+0.2%) from the base build.
  • Patch coverage: 148 uncovered changes across 4 files (650 of 798 lines covered, 81.45%).
  • 23 coverage regressions across 3 files.

Uncovered Changes

| File | Changed | Covered | % |
| --- | --- | --- | --- |
| payjoin-cli/src/app/v2/relay_selection.rs | 499 | 442 | 88.58% |
| payjoin-cli/src/app/v2/mod.rs | 126 | 85 | 67.46% |
| payjoin-cli/src/app/v2/ohttp.rs | 76 | 45 | 59.21% |
| payjoin-cli/src/app/config.rs | 88 | 69 | 78.41% |

Coverage Regressions

23 previously-covered lines in 3 files lost coverage.

| File | Lines Losing Coverage | Coverage |
| --- | --- | --- |
| payjoin-cli/src/app/v2/mod.rs | 21 | 58.33% |
| payjoin-cli/src/app/config.rs | 1 | 78.97% |
| payjoin-cli/src/app/v2/ohttp.rs | 1 | 63.22% |

Coverage Stats

Coverage Status
Relevant Lines: 14373
Covered Lines: 12260
Line Coverage: 85.3%
Coverage Strength: 382.03 hits per line

💛 - Coveralls

@arminsabouri

Contributor

cc @nothingmuch for an approach ACK/NACK

Mshehu5 added 5 commits May 20, 2026 16:44
Replace the singular v2 directory setting with a trusted
directory list while preserving the CLI override behavior.
Load ASMap settings behind an optional feature and reject ASMap
configuration when the binary was built without support.
Choose relay order from the receiver key, request kind, time
window, and relay bucket.
Route key fetching and protocol requests through selected relay
candidates with failover for retryable network errors.
Mshehu5 force-pushed the ASN_asmap branch 2 times, most recently from 14b46f8 to 364ce27 on May 24, 2026 13:08
Add ASMap/OHTTP e2e observation test that runs the
payjoin-cli v2 flow against public directories and relays. The test
resolves each endpoint, maps IPs through the ASMap file, and writes a
Markdown summary showing selected directories, relay eligibility, and
observed ASNs.

Also log OHTTP relay attempts so CI output shows which relays were used during the live flow.

@nothingmuch nothingmuch left a comment

Contributor

approach ack

not 100% sure about relaymanager getting removed; it may still have a job (keeping track of failures, which is arguably important enough to be in scope for the reference client)

relay selection looks good, but i didn't review carefully since only seeking approach ack

Comment on lines +62 to +63
Post(MessageKind),
Poll(MessageKind),

Contributor

is message kind for diagnostic purposes?

@bc1cindy bc1cindy left a comment

Contributor

this is looking good!

an io-style feature seems viable, instead of a separate crate

since this is well underway, i'll close #1452


pub(crate) const WINDOW_SECS: u64 = 30;
const CLOCK_SKEW_WINDOWS: i64 = 2;
const POST_RESERVED_COUNT: usize = 3;

Contributor

at time t_i, avoid making any real GET requests (i.e. dummy GET requests are OK) using the relays
whose hash is lexicographically smallest not just at time t_i, but at times t_{i-2}..t_{i+2} or at least
t_{i-1}..t_{i+1}
they should always use a different relay for the post and the get, avoiding the worst case leak

k=3 and skew=2 are fixed at the upper end of #919 (comment)

with 3-5 public OHTTP relays, the reservation budget (up to k × (2·skew + 1) = 3 × 5 = 15 buckets) can swallow the whole pool. when poll_candidates is empty (small pool), POLL falls back (lines 286-289) to POST-reserved buckets, silently violating the spec

picking k and skew based on pool size (k=2/skew=1 when small) would reduce the reserved buckets to 2 × 3 = 6 and avoid the fallback for B ≥ 7 buckets, while keeping max protection when B ≥ 16. the remaining edge case (B ≤ 6) would still hit the warn!, which is possibly worth surfacing to the caller, maybe with a typed error

makes sense?
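
A minimal sketch of the pool-size-dependent choice suggested above, assuming the reservation budget is k buckets per window across 2·skew + 1 windows. `ReservationParams`, `PoolTooSmall`, and `params_for_pool` are hypothetical names, not identifiers from the PR.

```rust
/// Hypothetical pair of tuning knobs: k POST-reserved buckets per window,
/// and the number of clock-skew windows considered on each side.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ReservationParams {
    pub k: usize,
    pub skew: usize,
}

impl ReservationParams {
    /// Upper bound on distinct reserved buckets: k per window,
    /// over the windows t-skew ..= t+skew.
    pub fn max_reserved(&self) -> usize {
        self.k * (2 * self.skew + 1)
    }
}

/// Hypothetical typed error for pools too small to keep POLL off
/// POST-reserved buckets even with the reduced parameters.
#[derive(Debug, PartialEq, Eq)]
pub struct PoolTooSmall {
    pub buckets: usize,
    pub needed: usize,
}

/// Pick the strongest parameters that still leave at least one
/// unreserved bucket for POLL.
pub fn params_for_pool(buckets: usize) -> Result<ReservationParams, PoolTooSmall> {
    let full = ReservationParams { k: 3, skew: 2 }; // reserves up to 15
    let reduced = ReservationParams { k: 2, skew: 1 }; // reserves up to 6
    if buckets > full.max_reserved() {
        Ok(full)
    } else if buckets > reduced.max_reserved() {
        Ok(reduced)
    } else {
        Err(PoolTooSmall { buckets, needed: reduced.max_reserved() + 1 })
    }
}
```

Returning an error for B ≤ 6 lets the caller decide whether to fall back or refuse, instead of only logging a warning.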

Contributor Author

Yeah, that makes sense. Fixed k=3 / skew=2 is good for larger pools but too aggressive for small relay sets since it can reserve every bucket.

///
/// The ordering is recomputed from the receiver key and current time, so no
/// relay-selection progress needs to be stored between requests.
pub(crate) fn select_relays_for_request(

Contributor

would be good to add tests for the deterministic selection

selection_hash stability (interop), deterministic ordering per (pubkey, RequestKind, TimeWindow), POLL/POST exclusion, and round-robin fairness
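
For the POLL/POST exclusion property, a test could be built around something like the sketch below. It assumes POST reserves the k lowest-scoring AS buckets in each window and POLL skips every bucket reserved in windows t-skew ..= t+skew. `score`, `post_reserved`, and `poll_candidates` are hypothetical helpers, and the FNV-1a hash is a stand-in, not the PR's selection_hash.

```rust
use std::collections::BTreeSet;

/// Stand-in score: FNV-1a over (key, window, bucket).
fn score(key: &[u8], window: u64, bucket: u32) -> u64 {
    let window_bytes = window.to_be_bytes();
    let bucket_bytes = bucket.to_be_bytes();
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in key.iter().chain(window_bytes.iter()).chain(bucket_bytes.iter()) {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// The k buckets with the smallest score in one window, reserved for POST.
pub fn post_reserved(key: &[u8], window: u64, buckets: &[u32], k: usize) -> Vec<u32> {
    let mut scored: Vec<(u64, u32)> =
        buckets.iter().map(|&b| (score(key, window, b), b)).collect();
    scored.sort();
    scored.into_iter().take(k).map(|(_, b)| b).collect()
}

/// Buckets POLL may use: everything not reserved for POST in any
/// window within `skew` of the current one.
pub fn poll_candidates(
    key: &[u8],
    window: u64,
    buckets: &[u32],
    k: usize,
    skew: u64,
) -> Vec<u32> {
    let mut reserved = BTreeSet::new();
    for w in window.saturating_sub(skew)..=window.saturating_add(skew) {
        reserved.extend(post_reserved(key, w, buckets, k));
    }
    buckets.iter().copied().filter(|b| !reserved.contains(b)).collect()
}
```

A property test can then assert that repeated calls agree and that no POLL candidate appears in any nearby window's POST set.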

Contributor Author

Yeah, I was trying to keep the PR small and include only the essentials so it is easier to review and get an approach ack. Tests should be added with the smaller follow-up PRs.

bail!("No valid relays available for the selected directory {}", directory.url.as_str());
}

relays.shuffle(&mut thread_rng());

Contributor

btw, relays.shuffle seems inert as ordered_candidates reorders everything afterwards

leftover of the old shuffle?

Contributor Author

Yep, good catch. This looks like leftover behavior from the old random relay ordering. Since request-time selection now goes through ordered_candidates, the initial relays.shuffle does not affect POST/POLL ordering. I’ll remove it.

Comment thread payjoin-cli/Cargo.toml

[dependencies]
anyhow = "1.0.99"
asmap = { version = "0.1.0", optional = true }

Contributor

would be good to know what the team thinks about adding this dep to parse the binary, especially if we are moving to the payjoin crate

is it a good idea or would it be better to have our own parser? #1452 (comment)

Contributor Author

I did check out the library, and I also saw that a contributor to the Bitcoin Core asmap library gave it a star, but more eyes on the library would be nice.

Comment on lines +77 to +79
pub struct AsmapConfig {
#[serde(rename = "asmap_file")]
pub asmap: LoadedAsmap,

Contributor

a configured, per-integration asmap file is a client fingerprint in two ways: (1) only some integrations load one, so the AS-aware ones stand out (uses-it-or-not), and (2) selection is deterministic from the asmap, so even those that do load one diverge unless they're on the exact same snapshot.

could the lib ship one bundled snapshot per version instead, so every integration on a given version has the same file with nothing to fetch?

it's ~1.5 MB, and asmap-data updates ~monthly (irregular, gaps up to ~2 months), but relay ASNs are stable, so a per-version snapshot would keep everyone consistent across versions

https://github.com/bitcoin-core/asmap-data

wdyt?

Contributor Author

No answer for this yet. This might require more discussion; I will get back to you.

Comment on lines +80 to +83
#[serde(default)]
pub user_public_ips: Vec<IpAddr>,
#[serde(default)]
pub user_asns: Vec<u32>,

Contributor

Given a list of trusted directories and relays, these lists should be filtered to exclude servers that share an AS with the user. This is potentially tricky as it requires users to be able to determine their public IP(s).

requiring the user's own ASN/IP makes AS-aware mode operator-only, a NAT'd wallet can't supply it without an external lookup, which is a privacy leak of its own

#919 lists four AS-overlap risks:

  • sender∩receiver → inherent: relay selection can't fix it (it's the two endpoints themselves). VPN/Tor territory. out of scope.

of the three that selection can address, two don't need your own AS:

  • relay∩directory → exclude relays in the directory's AS. Needs only the directory + relay ASes.
  • relay∩relay → the windowed ordering derived from the receiver key. Needs only the relays' ASes.

both use IPs you resolve via DNS anyway to connect, like Bitcoin Core's asmap, which diversifies its peers by AS (anti-eclipse) and never learns its own ASN; you bucket the servers, not yourself.

  • only user-AS exclusion (user∩directory) needs your own AS, and that's the one that could be dropped to the network layer (VPN/Tor), not the selector:
    • a directory's AS (datacenter) vs a user's AS (residential/mobile) almost never coincide.
    • the sender uses the directory that came in the URI, so it can't pick a different one to escape its own AS. it can't be addressed in selection.

so, if we drop this and keep just:

  • directory-AS exclusion relay∩directory
  • windowed ordering relay∩relay

that way both use only server ASes and need zero user input, so AS-aware selection can be on by default for everyone instead of operator-only, which removes the uses-it-or-not fingerprint.

makes sense?
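
A minimal sketch of the relay∩directory exclusion using only server-side ASes, as proposed above. The `Server` type, the `asn_of` map (a stand-in for an ASMap lookup), and `relays_outside_directory_as` are hypothetical. Dropping relays with unmapped or missing IPs is an assumption made here for illustration, not behavior taken from the PR.

```rust
use std::collections::{HashMap, HashSet};
use std::net::IpAddr;

/// Hypothetical server record: a URL plus the IPs it resolved to.
pub struct Server {
    pub url: String,
    pub ips: Vec<IpAddr>,
}

/// Keep only relays whose resolved IPs all map to a known ASN that the
/// directory does not use. No user IP or user ASN is needed.
pub fn relays_outside_directory_as<'a>(
    directory: &Server,
    relays: &'a [Server],
    asn_of: &HashMap<IpAddr, u32>, // stand-in for an ASMap lookup
) -> Vec<&'a Server> {
    let directory_asns: HashSet<u32> =
        directory.ips.iter().filter_map(|ip| asn_of.get(ip).copied()).collect();
    relays
        .iter()
        .filter(|relay| {
            // Assumption: relays with no IPs or any unmapped IP are dropped.
            !relay.ips.is_empty()
                && relay.ips.iter().all(|ip| match asn_of.get(ip) {
                    Some(asn) => !directory_asns.contains(asn),
                    None => false,
                })
        })
        .collect()
}
```

Since the inputs are the same for every client that resolves the same servers against the same map, this filter needs no per-user configuration.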

Comment on lines +467 to +469
pub(crate) struct SystemNetwork {
#[cfg(feature = "asmap")]
asmap: Option<LoadedAsmap>,

Contributor

with the asmap bundled and the user-AS input dropped, could this Option and the random fallback go, so AS-aware is the default instead of an opt-in?
