Track command frequency when deduplicating repeated history entries #3522

@dhruv-anand-aintech

What would you like?

I would like Atuin to preserve command frequency information even when duplicate history entries are deduplicated or compacted.

One possible design: when a command is reissued and it matches an existing history identity, Atuin could bump a counter on the existing history row instead of only storing another duplicate row. The identity could be the same tuple used by atuin history dedup today, such as (command, cwd, hostname), or whatever scope Atuin considers appropriate.

This could be an optional mode/config setting if changing the default append-only behavior would be too surprising.
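
To make the idea concrete, here is a minimal sketch of the "bump a counter" design using a SQLite upsert. It does not reflect Atuin's actual schema or code: the `history_counts` table, its columns, and the `open_db` / `record` helpers are hypothetical names chosen for illustration, and the identity is assumed to be (command, cwd, hostname).

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a database with a hypothetical per-identity counter table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS history_counts (
            command    TEXT    NOT NULL,
            cwd        TEXT    NOT NULL,
            hostname   TEXT    NOT NULL,
            first_seen INTEGER NOT NULL,
            last_seen  INTEGER NOT NULL,
            run_count  INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (command, cwd, hostname)
        )
        """
    )
    return conn


def record(conn, command, cwd, hostname, ts):
    """Insert a new identity, or bump the counter if it already exists."""
    conn.execute(
        """
        INSERT INTO history_counts
            (command, cwd, hostname, first_seen, last_seen, run_count)
        VALUES (?, ?, ?, ?, ?, 1)
        ON CONFLICT (command, cwd, hostname) DO UPDATE SET
            run_count  = run_count + 1,
            first_seen = MIN(first_seen, excluded.first_seen),
            last_seen  = MAX(last_seen, excluded.last_seen)
        """,
        (command, cwd, hostname, ts, ts),
    )
    conn.commit()
```

Calling `record` three times with the same command, cwd, and hostname leaves one row with `run_count` equal to 3, while a different cwd gets its own row. The upsert syntax needs SQLite 3.24 or newer.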

Why?

I recently deduplicated a local Atuin database because shell Up-arrow/search cold starts were slower than expected. The database had about 73k rows and a 92 MB history.db. atuin history dedup --dupkeep 1 removed roughly 30k duplicate rows and, after SQLite VACUUM, reduced the DB to about 54 MB.

That helped with size and search latency, but it also discarded useful frequency information. For example, after dedup, commands like ls or git status --short retain only one row per command/cwd/hostname combination, so Atuin can no longer answer “how often do I run this exact command in this location?” without keeping all duplicate rows.

A persisted frequency count would let users keep a compact history while still supporting ranking/statistics based on real command usage.

Possible shape

  • Add a count/frequency column to history entries, or a separate frequency table keyed by command identity.
  • On repeated command execution, increment the counter for the matching entry.
  • Keep timestamp behavior well defined, for example store both first-seen and last-seen timestamps, or update only the last-seen timestamp.
  • Have atuin history dedup optionally merge duplicate rows into the retained row by adding their counts, instead of deleting count information.
  • Expose the count in search/history output formatting and stats if useful.
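
The count-preserving dedup from the list above could look roughly like the following sketch. It runs against a simplified, hypothetical `history` table (not Atuin's real schema) and assumes the newest row of each (command, cwd, hostname) group is the one retained; `create_history` and `merge_duplicates` are illustrative names only.

```python
import sqlite3


def create_history(conn):
    """Create a simplified, hypothetical history table with a counter."""
    conn.execute(
        """
        CREATE TABLE history (
            id        INTEGER PRIMARY KEY,
            command   TEXT    NOT NULL,
            cwd       TEXT    NOT NULL,
            hostname  TEXT    NOT NULL,
            timestamp INTEGER NOT NULL,
            run_count INTEGER NOT NULL DEFAULT 1
        )
        """
    )


def merge_duplicates(conn):
    """Collapse duplicates into the newest row, summing run_count.

    Returns the number of rows removed.
    """
    groups = conn.execute(
        """
        SELECT command, cwd, hostname, SUM(run_count)
        FROM history
        GROUP BY command, cwd, hostname
        HAVING COUNT(*) > 1
        """
    ).fetchall()
    removed = 0
    with conn:  # one transaction: commit on success, roll back on error
        for command, cwd, hostname, total in groups:
            key = (command, cwd, hostname)
            # Pick the row to retain: newest timestamp, highest id on ties.
            keep_id = conn.execute(
                "SELECT id FROM history "
                "WHERE command = ? AND cwd = ? AND hostname = ? "
                "ORDER BY timestamp DESC, id DESC LIMIT 1",
                key,
            ).fetchone()[0]
            cur = conn.execute(
                "DELETE FROM history "
                "WHERE command = ? AND cwd = ? AND hostname = ? AND id != ?",
                key + (keep_id,),
            )
            removed += cur.rowcount
            # Carry the summed count over to the retained row.
            conn.execute(
                "UPDATE history SET run_count = ? WHERE id = ?",
                (total, keep_id),
            )
    return removed
```

Because the counts are summed rather than reset, running the merge a second time removes nothing and leaves the totals unchanged.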

Notes

I realize Atuin sync semantics may make this trickier than a simple local SQLite change. If a counter is synced, it may need conflict-safe merging rather than last-write-wins behavior.
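
One well-known option for conflict-safe merging is a grow-only counter (G-Counter), where each replica only increments its own slot and merging takes the per-replica maximum. The sketch below is purely illustrative and makes no claim about how Atuin sync works; the replica names and the `increment`, `merge`, and `value` helpers are hypothetical.

```python
def increment(counter, replica, n=1):
    """Return a copy of the counter with this replica's slot increased by n."""
    updated = dict(counter)
    updated[replica] = updated.get(replica, 0) + n
    return updated


def merge(a, b):
    """Merge two counters by taking the per-replica maximum.

    The merge is commutative, associative, and idempotent, so replicas can
    exchange state in any order, any number of times, without losing counts.
    """
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(counter):
    """Total count across all replicas."""
    return sum(counter.values())
```

If a laptop records two runs and a desktop records five, merging the two states in either order yields a total of seven, whereas last-write-wins on a single integer would keep only one side's number.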
