What would you like?
I would like Atuin to preserve command frequency information even when duplicate command history is deduplicated or compacted.
One possible design: when a command is reissued and it matches an existing history identity, Atuin could bump a counter on the existing history row instead of only storing another duplicate row. The identity could be the same tuple used by atuin history dedup today, such as (command, cwd, hostname), or whatever scope Atuin considers appropriate.
This could be an optional mode/config setting if changing the default append-only behavior would be too surprising.
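To make the idea concrete, here is a minimal sketch of the counter bump as a SQLite upsert, written in Python for brevity. The table `history_counts` and its columns (`run_count`, `first_seen`, `last_seen`) are hypothetical names chosen for illustration and are not Atuin's actual schema. It also assumes SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`.

```python
import sqlite3

# Hypothetical schema for illustration only; not Atuin's real tables.
SCHEMA = """
CREATE TABLE history_counts (
    command    TEXT NOT NULL,
    cwd        TEXT NOT NULL,
    hostname   TEXT NOT NULL,
    first_seen INTEGER NOT NULL,
    last_seen  INTEGER NOT NULL,
    run_count  INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (command, cwd, hostname)
)
"""

def record(conn, command, cwd, hostname, timestamp):
    """Insert a new identity, or bump its counter if it already exists."""
    conn.execute(
        """
        INSERT INTO history_counts
            (command, cwd, hostname, first_seen, last_seen, run_count)
        VALUES (?, ?, ?, ?, ?, 1)
        ON CONFLICT (command, cwd, hostname) DO UPDATE SET
            run_count = run_count + 1,
            last_seen = max(last_seen, excluded.last_seen)
        """,
        (command, cwd, hostname, timestamp, timestamp),
    )

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Same command three times in one directory, once in another.
for ts in (100, 200, 300):
    record(conn, "git status --short", "/repo", "laptop", ts)
record(conn, "git status --short", "/other", "laptop", 400)

rows = conn.execute(
    "SELECT cwd, run_count, first_seen, last_seen "
    "FROM history_counts ORDER BY cwd"
).fetchall()
print(rows)
```

In this sketch the identity tuple is the primary key, so a repeated command updates one row instead of appending a new one, and `first_seen` is left untouched on conflict.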
Why?
I recently deduplicated a local Atuin database because shell Up-arrow/search cold starts were slower than expected. The database had about 73k rows and a 92 MB history.db. atuin history dedup --dupkeep 1 removed roughly 30k duplicate rows and, after SQLite VACUUM, reduced the DB to about 54 MB.
That helped with size and search latency, but it also discarded useful frequency information. For example, after dedup, commands like ls or git status --short retain only one row per command/cwd/hostname combination, so Atuin can no longer answer “how often do I run this exact command in this location?” without keeping all duplicate rows.
A persisted frequency count would let users keep a compact history while still supporting ranking/statistics based on real command usage.
Possible shape
- Add a count/frequency column to history entries, or a separate frequency table keyed by command identity.
- On repeated command execution, increment the counter for the matching entry.
- Define timestamp behavior clearly, for example keep both first-seen and last-seen timestamps, or update only the last-seen timestamp.
- Have atuin history dedup optionally merge duplicate rows into the retained row by adding their counts, instead of deleting count information.
- Expose the count in search/history output formatting and stats if useful.
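The dedup-with-merge step from the list above could look roughly like the following. This is a sketch under assumptions: the `history` table, its `run_count` column, and the `dedup_merge` helper are hypothetical, and the row with the highest `id` stands in for "the retained row". Atuin's real dedup logic and schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration; run_count defaults to 1 per row.
conn.execute(
    """
    CREATE TABLE history (
        id        INTEGER PRIMARY KEY,
        command   TEXT NOT NULL,
        cwd       TEXT NOT NULL,
        hostname  TEXT NOT NULL,
        timestamp INTEGER NOT NULL,
        run_count INTEGER NOT NULL DEFAULT 1
    )
    """
)
conn.executemany(
    "INSERT INTO history (command, cwd, hostname, timestamp) VALUES (?, ?, ?, ?)",
    [
        ("ls", "/home", "laptop", 10),
        ("ls", "/home", "laptop", 20),
        ("ls", "/home", "laptop", 30),
        ("ls", "/tmp", "laptop", 40),
        ("git status --short", "/repo", "laptop", 50),
        ("git status --short", "/repo", "laptop", 60),
    ],
)

def dedup_merge(conn):
    """Keep one row per (command, cwd, hostname) and sum the counts into it."""
    groups = conn.execute(
        """
        SELECT MAX(id), SUM(run_count), command, cwd, hostname
        FROM history
        GROUP BY command, cwd, hostname
        """
    ).fetchall()
    with conn:  # one transaction: either every group is merged or none is
        for keep_id, total, command, cwd, hostname in groups:
            conn.execute(
                "UPDATE history SET run_count = ? WHERE id = ?",
                (total, keep_id),
            )
            conn.execute(
                "DELETE FROM history "
                "WHERE command = ? AND cwd = ? AND hostname = ? AND id != ?",
                (command, cwd, hostname, keep_id),
            )

dedup_merge(conn)
merged_rows = conn.execute(
    "SELECT command, cwd, timestamp, run_count FROM history ORDER BY id"
).fetchall()
print(merged_rows)
```

Because the counts are summed rather than reset, running the merge again on already-compacted data leaves the totals unchanged.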
Notes
I realize Atuin sync semantics may make this trickier than a simple local SQLite change. If a counter is synced, it may need conflict-safe merging rather than last-write-wins behavior.
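One well-known way to get conflict-safe merging is a grow-only counter (G-Counter): each syncing client increments only its own slot, and merging takes the per-client maximum. The sketch below illustrates that technique in Python. The client names and the `merge`/`total` helpers are hypothetical, and nothing here reflects how Atuin sync actually works.

```python
def merge(a, b):
    """Merge two per-client count maps by taking the per-client maximum."""
    return {client: max(a.get(client, 0), b.get(client, 0))
            for client in a.keys() | b.keys()}

def total(counts):
    """The user-visible frequency is the sum over all clients."""
    return sum(counts.values())

# Two replicas of the counter for one command identity, each with a
# stale view of the other client's slot.
laptop_view = {"laptop": 5, "desktop": 2}
desktop_view = {"laptop": 3, "desktop": 4}

merged = merge(laptop_view, desktop_view)
print(merged, total(merged))
```

With last-write-wins, one of the two views would overwrite the other and increments would be lost. With the per-client maximum, merging is commutative and idempotent, so replicas converge regardless of sync order.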