Tags: SocketDev/socket-python-cli
Add --include-dirs flag to scan normally-excluded directories (#235) Adds the ability to opt directories that the CLI excludes from manifest discovery by default (build, dist, node_modules, .venv, etc.) back into the scan, for projects that keep manifest files under those names. - New --include-dirs flag (comma-separated directory names) re-includes any of the default-excluded names in manifest discovery. - --include-module-folders now functions as documented (re-includes the JS/TS module folders as a group); it was previously a no-op. Bumps version to 2.4.10 with changelog and CLI reference updates. Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
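The re-inclusion behaviour described above can be sketched as follows. This is a minimal illustration, not the CLI's implementation: `DEFAULT_EXCLUDED_DIRS`, `parse_include_dirs`, and `find_manifests` are hypothetical names, and the excluded set is only the subset of names mentioned in the commit message.

```python
import os

# Hypothetical subset of the default-excluded names; the real list lives in the CLI.
DEFAULT_EXCLUDED_DIRS = frozenset({"build", "dist", "node_modules", ".venv"})


def parse_include_dirs(value):
    """Split a comma-separated --include-dirs value into a set of directory names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def find_manifests(root, manifest_names, include_dirs=()):
    """Walk root, pruning excluded directories unless they were re-included."""
    excluded = DEFAULT_EXCLUDED_DIRS - set(include_dirs)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            if name in manifest_names:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

With this shape, passing `parse_include_dirs("build")` opts only `build` back in; the other default exclusions still apply.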
feat: stream CLI log transcripts and run status to Socket backend (#201) * feat: stream CLI logs to /python-cli-runs/* lifecycle endpoints Buffer the CLI's own log records and POST them in 5s batches to a new register/upload/finalize lifecycle so the admin dashboard renders what the user saw in their terminal alongside the run's terminal status. New modules: - core/cli_run.py — register_cli_run / finalize_cli_run helpers - core/log_uploader.py — BatchedLogUploader (daemon-thread flusher, chunked under the 256KB cap, swallows network errors, drains on shutdown) and UploadingLogHandler routing log records to it - core/streaming.py — setup_streaming() wires both into the socketcli and socketdev loggers, forces them to DEBUG so uploads capture the full history regardless of local terminal verbosity, and returns a teardown callable for the caller to register with atexit - set_run_status() propagates the terminal status through the teardown; socketcli.py exception handlers call it for KeyboardInterrupt (cancelled), uncaught Exception (failure), and any SystemExit with a non-zero code (failure) so sys.exit() paths inside main_code surface correctly instead of defaulting to success Best-effort end-to-end: registration failures fall back to no-streaming and never block the scan. Opt out with --disable-server-log-streaming. Tested against local depscan with the matching /v0/python-cli-runs/* endpoints; 173 unit tests pass. * chore: drop per-batch size chunking to match upstream uploader The 256 KB ceiling I added speculatively when the server cap was 256 KB no longer matches the reference implementation we're mirroring, which sends each flush as a single POST regardless of size. With the server cap now well above any plausible single-flush volume, chunking is unnecessary and divergent — drop it. Removes _chunk_by_size, _MAX_BATCH_BYTES, and the four chunking tests. _flush now POSTs the entire buffered batch as one request. 
* chore: drop integration field from cli-run register payload The server-side handler now rejects unknown fields and the integration column has been removed from the schema (it was plumbed end-to-end but never displayed, filtered, or grouped on). Stop sending it. Removes the integration parameter from register_cli_run and setup_streaming, drops the corresponding wiring in socketcli.py, and prunes the now-pointless test_register_cli_run_omits_integration_when_falsy case. * feat: link cli-run to its full_scan via report_run_id on finalize The depscan side now joins cli_run → full_scans → repositories via the report_run_id field to surface the scanned repo in the admin dashboard view of each CLI run. Wire the CLI to send the full_scan_id (== the report_run_id depscan expects) when it has one. - finalize_cli_run accepts an optional report_run_id and includes it (nullable) in the POST body. - streaming.py adds a module-level _report_run_id holder and a set_report_run_id() setter; teardown passes it through to finalize. - socketcli.py captures diff.id at a single chokepoint after the diff-producing branches converge, guarded against the NO_DIFF_RAN / NO_SCAN_RAN sentinel values. The field is nullable end-to-end so CLI invocations that fail before producing a diff (or are run in modes that don't create one) still finalize cleanly. * chore: bump version to 2.2.87 for streaming logs feature - socketsecurity/__init__.py: __version__ → 2.2.87 - pyproject.toml: version → 2.2.87 - CHANGELOG.md: new 2.2.87 entry describing the streaming-logs feature Required by .github/workflows/version-check.yml, which fails the PR if the version isn't incremented relative to main. * feat: flip streaming logs to opt-in via --upload-logs The Socket backend changed its register contract so that log streaming is now opt-in rather than default-on. The CLI always calls register (cheap, lets the server force-enable for specific orgs) and gates the downstream upload/finalize lifecycle on the response. 
Wire changes: - POST /v0/python-cli-runs body adds a required `share_logs` field. - Response: { log_streaming_enabled: bool, run_id: <uuid|null> }. When log_streaming_enabled is false, run_id is null and the CLI skips the upload + finalize calls entirely. CLI changes: - New `--upload-logs` flag (default off). When set, the CLI sends share_logs=true on register. - Removed `--disable-server-log-streaming` — default is off, so an opt-out flag no longer makes sense. - register_cli_run takes a required share_logs arg and returns None whenever log_streaming_enabled is false (whatever the reason: client opted out, server denied, server unreachable). Bumps version to 2.2.88 and updates the CHANGELOG entry to reflect the opt-in shape. * chore: regenerate uv.lock for version 2.4.8 The version-check workflow added in main now requires uv.lock to be updated whenever pyproject.toml changes, and the SFW smoke jobs run `uv sync --locked`, which fails on an out-of-sync lockfile. * feat: add --no-upload-logs to explicitly decline log upload Backend now distinguishes "user wants out" from "user said nothing": - `decline_logs: true` (the new flag) overrides every other signal including the server-side org-level override, so users with a legal/consent reason for no upload get a guaranteed off. - `share_logs: true` (the existing --upload-logs) opts in. - Otherwise the server applies its own policy. Argparse enforces that --upload-logs and --no-upload-logs are mutually exclusive (post-parse check via parser.error so dash/underscore aliases on either side still coexist with the same dests). register_cli_run now sends both `share_logs` and `decline_logs` in the payload; setup_streaming forwards both. CHANGELOG 2.4.8 entry updated to call out --no-upload-logs alongside --upload-logs. * chore: bump version to 2.4.9 2.4.8 already shipped with the full-scan retry fix; this release adds the opt-in --upload-logs streaming channel. 
* chore: bump __version__ to 2.4.9 version-check reads socketsecurity/__init__.py; the previous bump only touched pyproject.toml. * refactor: address PR review on streaming logs - collapse upload_logs/decline_logs config fields into a single Optional[bool] (tri-state); projection to share_logs/decline_logs happens at the setup_streaming call site. - streaming.py: replace module-level globals + atexit teardown with a StreamingLogs context manager. set_run_status disappears entirely — __exit__ infers the run status from the exception that closed the with block. set_report_run_id is now an instance method. Logger handler wiring iterates over (cli_logger, sdk_logger) instead of repeating itself. - log_uploader.py: drop _LEVEL_MAP, use logging.getLevelName directly. Wire format changes WARN/ERROR-for-CRITICAL to WARNING/CRITICAL. - log_uploader.py: tidy BatchedLogUploader.stop so the final _flush always runs and the thread shutdown only runs when there is a thread. - socketcli.py: wrap main_code body in 'with setup_streaming(...) as streaming:'; cli()'s exception handlers no longer need to set status before re-raising. - tests updated for the CM API and the new level strings. * test(log_uploader): cover cross-thread emit during active flush Adds a deterministic regression test that parks the uploader thread inside _flush() via a threading.Event, emits a real log record from the main thread while _FLUSH_GUARD.active is set on the uploader thread, and asserts the record lands in the next batch (not dropped). Documents that the thread-local guard only blocks recursive emits on the uploader thread itself. * refactor(config): use store_const + mutually exclusive group for log-upload flags --upload-logs and --no-upload-logs now share dest='upload_logs' via store_const (True/False) in an argparse mutually exclusive group. argparse handles the conflict natively; drop the manual mutex check and the tri-state mapping in from_args. args.upload_logs is now Optional[bool] directly. 
The undocumented --upload_logs / --no_upload_logs snake_case aliases are dropped (introduced earlier in this same unreleased branch; no users to break). * refactor: push upload_logs tri-state down to the API boundary - register_cli_run now takes upload_logs: Optional[bool] directly and projects to share_logs/decline_logs only when building the JSON payload. The invalid (share=True, decline=True) state is now unrepresentable above the wire boundary. - StreamingLogs.__init__ takes upload_logs instead of the two booleans; the call site in socketcli.py just passes config.upload_logs through. - Drop the setup_streaming alias; call sites use StreamingLogs directly. - Drop the dead 'except SystemExit: raise' in cli() — re-raising is the default for an unhandled exception path. - Tests updated for the new signatures. * fix(cli_run): broaden register exception handling to honor 'never break the scan' The earlier register_cli_run only caught APIFailure on the request and ValueError/JSONDecodeError on resp.json(). If the server returned JSON that parsed but wasn't an object (e.g. a list), body.get(...) would raise AttributeError and propagate out of StreamingLogs.__enter__, crashing the CLI before the scan even starts. Collapse the request + parse + field-extraction into a single try with a broad except so any unexpected failure — known or otherwise — falls back to no-streaming, matching the module docstring's promise. Add regression tests for non-dict JSON bodies and arbitrary unexpected exceptions from client.request.
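A rough sketch of two ideas from the commits above: the tri-state `upload_logs` value produced by a mutually exclusive `store_const` pair, and a run status inferred from the exception that closed the `with` block. The function names and the exact payload shape (sending both keys as booleans) are assumptions for illustration; only the flag names, the `share_logs`/`decline_logs` fields, and the status values come from the commit messages.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Both flags write to one dest, so the parsed value is True, False, or None."""
    parser = argparse.ArgumentParser(prog="socketcli-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--upload-logs", dest="upload_logs",
                       action="store_const", const=True, default=None)
    group.add_argument("--no-upload-logs", dest="upload_logs",
                       action="store_const", const=False, default=None)
    return parser


def register_payload(upload_logs: Optional[bool]) -> dict:
    """Project the tri-state onto the two wire fields (assumed shape).

    Because both fields derive from one value, share=True and decline=True
    can never be produced together.
    """
    return {"share_logs": upload_logs is True, "decline_logs": upload_logs is False}


def infer_run_status(exc: Optional[BaseException]) -> str:
    """Map the exception seen by a context manager's __exit__ to a run status."""
    if exc is None:
        return "success"
    if isinstance(exc, KeyboardInterrupt):
        return "cancelled"
    if isinstance(exc, SystemExit):
        # sys.exit() and sys.exit(0) are clean exits; any other code is a failure.
        return "success" if exc.code in (None, 0) else "failure"
    return "failure"
```

Passing both flags makes argparse exit with a usage error, which is why no manual conflict check is needed.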
Retry transient full-scan upload failures (502/503/504/408, dropped connections) (#232) * Retry transient full-scan upload failures (502/503/504/408, dropped connections) A full-scan upload can fail transiently at the gateway/connection level - an HTTP 502/503/504/408, a dropped or reset connection, or a client-side timeout - without the server having created the scan. The CLI previously made exactly one attempt, so an entire run (including a completed reachability analysis) died on a single transient failure even though a retried upload almost always succeeds. create_full_scan now retries the fullscans POST up to 3 total attempts with increasing waits (~10s, then ~30s, plus jitter) on transient failures only: APIBadGateway (502), APIConnectionError, APITimeout, and catch-all APIFailure whose embedded original_status_code is 408/503/504. Dedicated 4xx classes, catch-all 400s, and error payloads are never retried. In these failure modes the server never finished reading the request body, so no scan was created and a retry does not duplicate one; in the rare case where a gateway timeout races a request the server later completes, the extra scan is benign and superseded by the retry (as if the CLI had run twice). The retry loop lives inside the existing try/finally so the brotli-compressed .socket.facts.json.br temp files survive until every attempt has finished; fullscans.post rebuilds its lazy file loaders from the plain paths on every call, so re-invoking it per attempt is safe. Assisted-by: Claude Code:claude-opus-4-8 * docs: drop the 'retry almost always succeeds' claim from retry comments * Move transient-error classification into the SDK; simplify retry loop Address review feedback on the upload retry: - The retry decision now delegates to APIFailure.is_transient_error() (socketdev>=3.3.0, SocketDev/socket-sdk-python#93), which classifies by the HTTP status code the SDK records when raising. 
The CLI no longer encodes the SDK's exception hierarchy or parses status codes out of message text, so SDK restructuring can't silently break the classification. - The backoff schedule is now the single source of truth for the loop: FULL_SCAN_UPLOAD_BACKOFF_SCHEDULE_SECONDS = (10.0, 30.0, None), where each entry is the wait before the next attempt and the final None re-raises instead of retrying. FULL_SCAN_UPLOAD_MAX_ATTEMPTS is computed from its length. Note: uv.lock is intentionally not regenerated yet - socketdev 3.3.0 must be released to PyPI first (blocked on socket-sdk-python#93). * Lock socketdev 3.3.0 socketdev 3.3.0 is now released, unblocking the >=3.3.0 floor bump.
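One way a schedule-driven retry loop like the one described above could look. The two constant names and `is_transient_error()` come from the commit message; `FakeTransientError`, `upload_with_retry`, the injectable `sleep`/`jitter` parameters, and the jitter range are assumptions made so the sketch is self-contained and testable without the SDK.

```python
import random
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Each entry is the wait before the next attempt; the final None means re-raise.
FULL_SCAN_UPLOAD_BACKOFF_SCHEDULE_SECONDS: Tuple[Optional[float], ...] = (10.0, 30.0, None)
FULL_SCAN_UPLOAD_MAX_ATTEMPTS = len(FULL_SCAN_UPLOAD_BACKOFF_SCHEDULE_SECONDS)


class FakeTransientError(Exception):
    """Hypothetical stand-in for an SDK error that reports itself as transient."""

    def is_transient_error(self) -> bool:
        return True


def upload_with_retry(
    upload: Callable[[], T],
    schedule: Tuple[Optional[float], ...] = FULL_SCAN_UPLOAD_BACKOFF_SCHEDULE_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[float, float], float] = random.uniform,
) -> T:
    """Call upload() once per schedule entry, retrying only transient failures."""
    for wait in schedule:
        try:
            return upload()
        except Exception as exc:
            check = getattr(exc, "is_transient_error", None)
            transient = callable(check) and check()
            # Last attempt (wait is None) or a non-transient error: give up now.
            if wait is None or not transient:
                raise
            sleep(wait + jitter(0.0, 1.0))  # jitter range is an assumption
    raise RuntimeError("backoff schedule must end with None")
```

Driving the loop from the schedule keeps the attempt count and the waits in one place, so changing the tuple changes both.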
Pin @coana-tech/cli version; make reachability auto-update opt-in (#230) * Pin @coana-tech/cli version; make reachability auto-update opt-in The Python CLI auto-updated the reachability (Coana) engine to the latest published version on every --reach run via `npm install -g @coana-tech/cli`. Automatically pulling a brand-new engine version without opting in is undesirable for environments that need to review/approve dependency updates before adopting them. Run a fixed, pinned version (DEFAULT_COANA_CLI_VERSION = 15.3.22) via `npx @coana-tech/cli@<pinned>` instead, so the engine version only changes through a standard pip upgrade of this CLI. Opt into newest with `--reach-version latest`; pin an explicit version with `--reach-version <semver>`. The global `npm install -g` step is dropped entirely, so an existing global install is never auto-updated or downgraded. * Disable npx caching and add npm-install + node fallback for coana Mirror the Socket Node CLI's coana launcher: - Run the engine via `npx --yes --force` so the npx cache is bypassed; a corrupt or partial cache entry can no longer wedge a reachability run. - Fall back to `npm install --no-save --prefix <tmp> @coana-tech/cli@<ver>` + `node <bin>` when the npx launcher is missing or dies before coana starts (spawn error / signal / exit >= 128). Small positive exit codes are treated as real coana failures and are not retried. - Toggle with SOCKET_CLI_COANA_FORCE_NPM_INSTALL and SOCKET_CLI_COANA_DISABLE_NPM_FALLBACK. - Strip npm_package_* env vars before spawning coana to avoid E2BIG in large monorepos. Kept on version 2.4.7 (same unreleased version as the pin change). 
* Bump pinned @coana-tech/cli to 15.3.24 * Address PR review: per-version fallback cache, node prereq, accurate npx wording - M2: cache the npm-install fallback's resolved script path per version for the process lifetime (mirrors the Node CLI's installedCoanaScriptPathsByVersion), so a repeated fallback installs once instead of re-installing + leaking a temp dir each call. - M3: surface a clear error when `node` is missing in the fallback (instead of an opaque FileNotFoundError after a costly npm install), and add `node` to the up-front prereq check. - M1: correct the overstated 'npx --force disables the cache' wording in docstrings, docs, and CHANGELOG. The code already matches the Node CLI exactly (npx --yes --force); --force does not force a re-download of an already-cached pinned version, so the docs now describe what the flags actually do rather than claiming a cache bypass. Adds tests for per-version caching, node-missing, and real _resolve_coana_bin / _build_coana_node_cmd parsing. * Address review comments: Final annotation, atexit tmp cleanup, parametrized tests - Annotate DEFAULT_COANA_CLI_VERSION with typing.Final. - Register an atexit handler to remove the npm-install fallback's temp dirs. - Trim the over-long --force explanation in _spawn_coana's docstring and drop the inline comment that duplicated it. - Use try/finally in the cache-clearing test fixture. - Parametrize the spec-resolution, npx-version, and launcher-failure-heuristic tests. * Move launch-strategy rationale from the spec resolver to _spawn_coana The 'why npx, not npm install -g' explanation describes how coana is launched, not how a package spec string is built, so it belongs on _spawn_coana (per review). Leaves _resolve_coana_package_spec with a minimal docstring. 
* docs: show the real --reach-version default (15.3.24) in the Default column * docs: show real reach-flag defaults from the Coana CLI implementation Fill in the Default column for the flags whose defaults come from coana, verified against the @coana-tech/cli source (coana-package-manager/packages/cli): - --reach-analysis-timeout -> 600 (cli-core.ts: defaults to 600s when unset) - --reach-analysis-memory-limit -> 8192 (index.ts --memory-limit default) - --reach-concurrency -> 1 (index.ts --concurrency default) - --reach-min-severity -> info (no coana default = analyze all; info is the effective floor)
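The launcher-failure heuristic and the environment cleanup from the commits above might be sketched like this. Both function names are hypothetical, and treating a negative return code as "killed by signal" follows Python's `subprocess` convention rather than anything stated in the commit message.

```python
from typing import Mapping, Optional


def is_launcher_failure(returncode: Optional[int]) -> bool:
    """Decide whether the npx launcher died before coana itself could run.

    None models a spawn error (the process never started). A negative code
    means the child was killed by a signal, and codes >= 128 are the
    shell-style equivalent. Small positive codes are treated as real coana
    failures and should not trigger the npm-install fallback.
    """
    if returncode is None:
        return True
    return returncode < 0 or returncode >= 128


def strip_npm_package_env(env: Mapping[str, str]) -> dict:
    """Drop npm_package_* variables, which can bloat the environment in large monorepos."""
    return {key: value for key, value in env.items() if not key.startswith("npm_package_")}
```

A caller would pass the result of `strip_npm_package_env(os.environ)` as the child's environment and consult `is_launcher_failure` only after the npx attempt has exited.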
docs: correct remaining reachability reference gaps (#228) Reachability-reference fixes layered on current main (v2.4.5): - Document the uv + Enterprise-plan prerequisites the CLI enforces before running reachability (exit 3), and that per-ecosystem build toolchains are the analysis engine's runtime check, not a CLI pre-check. - Correct --reach-min-severity values to info/low/moderate/high/critical. - Document --reach-enable-analysis-splitting, --reach-detailed-analysis-log-file, --reach-lazy-mode, --reach-use-only-pregenerated-sboms. - Clarify --only-facts-file submits only the facts file when creating the full scan (no pre-existing scan required). - Note --reach creates a tier-1 full-application scan (scan_type=socket_tier1). Docs-only; the version bump + uv.lock are mandated by the sync-version hook.
Harden dependency review checks across PR types (#224) * ci: report e2e-* checks on fork and Dependabot PRs The e2e job is skipped on PRs that can't access repository secrets (forks and Dependabot). Because it's skipped via a job-level `if`, its matrix never expands, so the required e2e-* check contexts are never created and branch protection waits on them indefinitely, blocking merge. Add an e2e-bypass job whose `if` is the exact negation of the e2e job's run condition. It emits the same e2e-* check names with a passing status for fork/Dependabot PRs, satisfying branch protection without running the real tests. The two jobs are mutually exclusive and exhaustive: every PR runs exactly one. Signed-off-by: lelia <2418071+lelia@users.noreply.github.com> * ci: add dependency-review-gate aggregator check The Socket Firewall enterprise smoke job is the most meaningful supply-chain check for maintainer-added dependencies, but it can't be required directly: it's conditional (per-manifest, and free-vs-enterprise per author), so on most PRs it's legitimately skipped -- and a required check whose job is skipped sits at "Expected -- Waiting for status" forever, blocking merge (the same trap that stranded Dependabot PRs on the e2e-* checks). Add a dependency-review-gate job that always runs and collapses every smoke job into one pass/fail signal: it fails iff any job that ran ended in failure or was cancelled; success and skipped both pass. This is the single check intended to be marked required later -- it satisfies Dependabot/fork PRs (which run Firewall-free) and maintainer PRs (Firewall-enterprise) alike, and turns a Socket Firewall BLOCK into a merge-blocking failure instead of a non-required job nobody is forced to run. Scaffolding only: the gate is not yet added to branch protection's required checks (deferred until it's merged to main and observed reporting). 
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com> * chore: bump CLI to 2.4.5 and require socketdev>=3.2.1 Follows the 2.4.4 release (SDK >=3.2.0) by picking up socketdev 3.2.1. Regenerates uv.lock to the published 3.2.1 release; no CLI logic changes. Signed-off-by: lelia <2418071+lelia@users.noreply.github.com> --------- Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
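The gate's pass/fail rule is small enough to state as code. This is only a model of the decision, assuming the job results are the strings GitHub Actions exposes as `needs.<job>.result` (`success`, `failure`, `cancelled`, `skipped`); `gate_passes` and the job names in the test are hypothetical.

```python
from typing import Mapping

# Results that should block the merge; success and skipped both pass.
BLOCKING_RESULTS = frozenset({"failure", "cancelled"})


def gate_passes(job_results: Mapping[str, str]) -> bool:
    """Return True unless any job that ran ended in failure or was cancelled."""
    return not any(result in BLOCKING_RESULTS for result in job_results.values())
```

Because `skipped` is not a blocking result, a PR on which every smoke job is legitimately skipped still reports a passing gate.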
chore(deps): bump socketdev floor to >=3.2.0 (CE-225) (#222) Pick up socketdev 3.2.0, which adds OTHER = "other" to SocketCategory so the backend's "other" alert category no longer triggers the "Unknown SocketCategory" warning fallback (SDK PR #85). No CLI logic changes. Bump CLI to 2.4.1 (on top of the 2.4.0 license-details fix). uv.lock regenerated against socketdev 3.2.0. Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
feat(reach): add unified --exclude-paths, deprecate --reach-exclude-paths (#227) Add a single --exclude-paths flag (Node CLI parity) that filters BOTH SCA manifest discovery and reachability analysis: - New Core matcher: anchored micromatch-style globs compiled to regex (no new deps). Scan-root-relative POSIX paths, '*' does not cross '/', '**' does, each pattern P expanded to [P, P/**]. Threaded into find_files via cli_config; no-op when unset. - Reach side unions --exclude-paths with the now-deprecated --reach-exclude-paths and forwards to coana --exclude-dirs. - Validation mirrors Node's assertValidExcludePaths (rejects negation, absolute paths, '..' traversal, degenerate match-everything; trailing slash stripped so '**/' is rejected). Accepts comma-strings and config-file lists. - --reach-exclude-paths soft-deprecated: still works, [DEPRECATED] in help, warns at runtime. Docs: document --exclude-paths under 'Path and File' (it affects every scan, not just reach), mark --reach-exclude-paths deprecated, and refresh the reachability flag table (--reach-analysis-timeout/-memory-limit primary names, --reach-debug, --reach-disable-external-tool-checks, defaults delegated to coana). Adds a CHANGELOG 2.4.3 entry and tests incl. the Node parity cases, validation, and config-file paths.
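The matcher rules listed above can be approximated in a few lines. This is a simplified sketch, not the CLI's matcher: it handles only `*` and `**` (everything else is literal), the function names are hypothetical, and the "match-everything" check covers only the empty pattern and a bare `**`, which is an assumption about what counts as degenerate.

```python
import re


def _validate(pattern):
    """Reject the pattern kinds named in the commit message; return the normalized form."""
    if pattern.startswith("!"):
        raise ValueError(f"negation is not supported: {pattern!r}")
    if pattern.startswith("/"):
        raise ValueError(f"absolute paths are not allowed: {pattern!r}")
    normalized = pattern.rstrip("/")  # so '**/' collapses to '**' and is rejected below
    if ".." in normalized.split("/"):
        raise ValueError(f"'..' traversal is not allowed: {pattern!r}")
    if normalized in ("", "**"):
        raise ValueError(f"pattern would match everything: {pattern!r}")
    return normalized


def _glob_to_regex(pattern):
    """Translate a glob: '**/' spans directories, '**' spans anything, '*' stops at '/'."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def compile_exclude_matcher(patterns):
    """Build a predicate over scan-root-relative POSIX paths; matches nothing when unset."""
    if isinstance(patterns, str):
        patterns = [p.strip() for p in patterns.split(",") if p.strip()]
    alternatives = []
    for raw in patterns:
        p = _validate(raw)
        # Each pattern P is expanded to [P, P/**] so a directory match covers its contents.
        alternatives.extend(_glob_to_regex(g) for g in (p, p + "/**"))
    regex = re.compile("(?:" + "|".join(alternatives) + ")") if alternatives else None
    # fullmatch keeps the pattern anchored at both ends of the path.
    return lambda path: bool(regex and regex.fullmatch(path))
```

Anchoring means `build` excludes the top-level `build` directory only; excluding it at any depth takes `**/build`.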
feat(reach): align reachability flags and coana env with Node CLI (#226) Bring the Python CLI's reachability surface to parity with the Node CLI: - --reach-disable-external-tool-checks -> coana --disable-external-tool-checks - forward SOCKET_CLI_VERSION + SOCKET_CALLER_USER_AGENT to coana (proxy is left to coana, which reads/inherits HTTPS_PROXY/HTTP_PROXY itself) - omit SOCKET_REPO_NAME/SOCKET_BRANCH_NAME for the default repo/branch sentinels - Node-style --reach-analysis-timeout/--reach-analysis-memory-limit as primary names, --reach-timeout/--reach-memory-limit kept as hidden aliases - --reach-debug -> coana --debug (global --enable-debug -> -d unchanged) - retry tier1 finalize with exponential backoff (3 attempts), never raising Memory-limit and concurrency are intentionally NOT hardcoded: coana already defaults to 8192 MB and concurrency 1, so the CLI omits the flags and lets coana apply them (and still forwards an explicit value when the user sets one). Splitting stays explicitly disabled (--disable-analysis-splitting) because coana defaults it ON. Removes stray always-on WARNING logging in the reachability runner. Adds a CHANGELOG 2.4.2 entry and tests for the flags/aliases, the coana command/env builder, and finalize retry.
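The "retry with exponential backoff (3 attempts), never raising" behaviour for the tier1 finalize call could take roughly this shape. `finalize_with_retry`, the one-second base delay, and the injectable `sleep` are assumptions for illustration; only the attempt count and the never-raise contract come from the commit message.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("finalize-sketch")


def finalize_with_retry(
    finalize: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 1.0,  # assumed; the real delay is not stated in the source
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try finalize() up to `attempts` times; report the outcome instead of raising."""
    for attempt in range(attempts):
        try:
            finalize()
            return True
        except Exception as exc:
            log.warning("finalize attempt %d/%d failed: %s", attempt + 1, attempts, exc)
            if attempt < attempts - 1:
                # Delay doubles after each failure: base, 2*base, 4*base, ...
                sleep(base_delay * 2 ** attempt)
    return False
```

Returning a boolean rather than raising lets the caller log the failure and keep the scan result it already has.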
Bundle pyenv in the Docker image for on-demand Python versions (#225) * Add pyenv to Docker image for on-demand Python versions Install pyenv (pinned to v2.7.1) just below uv, along with the Alpine build dependencies needed to compile CPython from source. This lets the bundled tooling build/install arbitrary Python versions on demand. Only the `pyenv` binary is symlinked onto the PATH; pyenv's shims directory is deliberately left off PATH so its shims don't shadow the system Python that the CLI runs on. bash is required since pyenv and the pyenv-installer are bash scripts. * Release 2.4.1: bundle pyenv in the Docker image Bump version to 2.4.1 and document the pyenv addition. The CLI is unchanged; this release exists to publish a Docker image that includes pyenv for on-demand Python version installation.