SotA Lens

SotA Lens is an open-source toolkit for exploratory literature mapping through citation networks. It helps researchers move from a raw scholarly corpus to an inspectable graph, ranked signals, and a browser-based visual explorer that can be used before a formal scoping review, systematic mapping study, or state-of-the-art review is fully stabilised.

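The project combines a Python package and command-line interface, a static browser-based explorer, a bundled case-study corpus, and a JOSS-ready manuscript.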

Why SotA Lens

Formal evidence-synthesis frameworks such as PRISMA are essential once a review protocol is defined. In practice, however, many research projects begin earlier: the topic may still be broad, the vocabulary unstable, and the neighbouring communities only partially visible.

SotA Lens supports that upstream exploratory stage. Rather than replacing screening or critical review, it helps researchers understand the shape of a field by surfacing:

  • influential and highly connected papers;
  • recurring authors and subject clusters;
  • community structure within a citation graph;
  • promising subtopics and adjacent areas;
  • candidate boundaries for a later formal review protocol.

Main features

SotA Lens can:

  • ingest article metadata from CSV files;
  • parse common reference-list encodings;
  • normalise DOI-based identifiers;
  • construct directed citation graphs with NetworkX;
  • export .gexf files for Gephi;
  • generate web-ready JSON datasets from CSV, GEXF, or GEXF enriched with CSV metadata;
  • compute graph summaries, author rankings, subject-term rankings, and community summaries;
  • provide a static browser explorer for local CSV/GEXF inspection.
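The reference-parsing and DOI-normalisation features listed above can be sketched roughly as follows. This is an illustrative approximation, not the package's actual code: the function names normalise_doi and parse_references, the list of prefixes, and the set of separators are all assumptions made for this example.

```python
import re

# Assumed wrappers that commonly precede a DOI; the real package may handle others.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Return a lowercase bare DOI, or an empty string if none is recognisable."""
    value = raw.strip().lower()
    for prefix in _DOI_PREFIXES:
        if value.startswith(prefix):
            value = value[len(prefix):].strip()
            break
    # A DOI starts with "10.", then a registrant code, then "/" and a suffix.
    return value if re.match(r"^10\.\d{4,9}/\S+$", value) else ""


def parse_references(cell):
    """Split a reference cell on assumed separators and keep unique valid DOIs in order."""
    found = []
    for token in re.split(r"[;,|\s]+", cell.strip()):
        # Strip list brackets and quotes so encodings like "['10.1000/a']" also work.
        doi = normalise_doi(token.strip("[]'\""))
        if doi and doi not in found:
            found.append(doi)
    return found


print(normalise_doi("https://doi.org/10.1000/ABC"))
print(parse_references("10.1000/a; doi:10.1000/B | not-a-doi"))
```

Splitting on commas and semicolons is a simplification: a small number of real DOIs contain those characters, so a production parser would need a stricter tokeniser.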

All browser-side uploads are processed locally in the user's browser. The website does not upload user-selected files to a server.
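Among the features listed above, the author and subject-term rankings are essentially frequency counts over a metadata field. The following is a minimal sketch under stated assumptions: rank_terms is a hypothetical helper, the semicolon separator is assumed, and the sample records are made up for illustration. The rankings computed by SotA Lens itself may be defined differently.

```python
from collections import Counter


def rank_terms(records, field, separator=";", top=5):
    """Rank the values of a multi-valued field by the number of papers they appear in."""
    counts = Counter()
    for record in records:
        raw = record.get(field) or ""
        # Use a set so a term repeated within one paper is counted only once.
        terms = {part.strip() for part in raw.split(separator) if part.strip()}
        counts.update(terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top]


# Made-up sample records, not taken from the bundled case study.
records = [
    {"authors": "Author A; Author B"},
    {"authors": "Author B"},
    {"authors": "Author C; Author B; Author B"},
]
print(rank_terms(records, "authors"))
```

The same helper can be pointed at the subject column by passing field="subject".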

Repository structure

sota-lens/
├── src/sota_lens/           Python package and CLI
├── tests/                   Pytest-based tests
├── website/                 Static GitHub Pages site
│   └── assets/              CSS, JavaScript, data, and SotA Lens artwork
├── paper/                   JOSS-ready manuscript and bibliography
├── docs/                    Supporting documentation
├── data/case-study/         Bundled DPM/SAR case-study corpus and graph artifacts
└── .github/workflows/       CI and GitHub Pages deployment

Installation

git clone https://github.com/diogogithub/sota-lens.git
cd sota-lens
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[dev]"

Run the tests:

pytest

Quick start

Build a citation graph and a web dataset from the bundled case study:

sota-lens build \
  --articles data/case-study/articles_full.csv \
  --out build/dpm-sar \
  --web-json build/dpm-sar/web-dataset.json

Inspect an existing graph:

sota-lens inspect data/case-study/graph.gexf
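To give a sense of what inspecting a GEXF file involves, here is a sketch that uses only the Python standard library to count nodes and edges and list the most-cited nodes. It is not the implementation behind sota-lens inspect, and the command's real output is not reproduced here. The helper summarise_gexf and the three-node sample graph are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hand-written GEXF document used only for this example.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="10.1000/a" label="Paper A"/>
      <node id="10.1000/b" label="Paper B"/>
      <node id="10.1000/c" label="Paper C"/>
    </nodes>
    <edges>
      <edge id="0" source="10.1000/a" target="10.1000/c"/>
      <edge id="1" source="10.1000/b" target="10.1000/c"/>
      <edge id="2" source="10.1000/a" target="10.1000/b"/>
    </edges>
  </graph>
</gexf>"""


def local_name(tag):
    """Drop the XML namespace so the code works across GEXF versions."""
    return tag.rsplit("}", 1)[-1]


def summarise_gexf(xml_text, top=3):
    """Count nodes and edges and rank nodes by incoming citations (in-degree)."""
    root = ET.fromstring(xml_text)
    node_count = 0
    edge_count = 0
    in_degree = Counter()
    for element in root.iter():
        name = local_name(element.tag)
        if name == "node":
            node_count += 1
        elif name == "edge":
            edge_count += 1
            in_degree[element.get("target")] += 1
    return {
        "nodes": node_count,
        "edges": edge_count,
        "most_cited": in_degree.most_common(top),
    }


summary = summarise_gexf(SAMPLE_GEXF)
print(summary)
```

To try it on the bundled graph, the file contents can be read with open(...).read() and passed to the same function.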

Convert a Gephi/GEXF graph into a browser dataset while preserving richer article metadata from CSV:

sota-lens web-data \
  --graph data/case-study/graph.gexf \
  --articles data/case-study/articles_full.csv \
  --out build/case-study-demo.json \
  --limit 500

The --articles argument is optional. Without it, web-data still reads the GEXF topology, but the browser dataset contains only the metadata stored inside the GEXF file. With it, the output combines the GEXF topology with the fuller CSV fields, such as title, authors, subjects, year, URL, and depth.
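The enrichment described above amounts to overlaying CSV fields on GEXF node attributes, matched by DOI. The sketch below illustrates that idea only. The helper enrich_nodes, the record layout, and the sample data are assumptions, and the JSON schema actually written by web-data may differ.

```python
import json

# CSV columns copied onto a node when present (names follow the input CSV format).
CSV_FIELDS = ("title", "authors", "subject", "year", "depth")


def enrich_nodes(gexf_nodes, csv_rows):
    """Overlay CSV metadata on GEXF node attributes, matching on lowercase DOI."""
    by_doi = {}
    for row in csv_rows:
        doi = (row.get("doi") or "").strip().lower()
        if doi:
            by_doi[doi] = row
    enriched = []
    for node_id, attrs in gexf_nodes.items():
        record = {"id": node_id, **attrs}
        extra = by_doi.get(node_id.strip().lower(), {})
        for field in CSV_FIELDS:
            if extra.get(field):
                record[field] = extra[field]
        # The CSV may name the URL column either "url" or "link".
        url = extra.get("url") or extra.get("link")
        if url:
            record["url"] = url
        enriched.append(record)
    return enriched


# Made-up inputs: node attributes as read from a GEXF file, rows as read from a CSV.
gexf_nodes = {
    "10.1000/a": {"label": "Paper A"},
    "10.1000/b": {"label": "10.1000/b"},
}
csv_rows = [
    {"doi": "10.1000/A", "title": "Paper A", "authors": "Author A",
     "year": "2020", "link": "https://example.org/a"},
]
enriched = enrich_nodes(gexf_nodes, csv_rows)
print(json.dumps(enriched, indent=2))
```

Nodes without a matching CSV row, such as the second one here, keep only their GEXF attributes, which mirrors the behaviour described for runs without --articles.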

Input CSV format

The most useful input is a CSV with the following columns:

Column        Meaning
doi           Stable publication identifier used as the node id
title         Publication title
authors       Author list
year          Publication year
link or url   Publication URL
subject       Subject terms, if available
references    Referenced DOIs or DOI-like identifiers
depth         Optional exploration depth from the seed corpus

A minimal CSV can contain only doi, title, and references.
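As an illustration of how such a minimal CSV maps onto a directed citation graph, the sketch below builds nodes and citing-to-cited edges with the standard library only. SotA Lens itself constructs its graphs with NetworkX. Plain dictionaries and sets are swapped in here to keep the example dependency-free. The helper build_citation_edges, the semicolon separator inside the references column, and the sample rows are assumptions.

```python
import csv
import io

# Made-up minimal CSV with only the doi, title, and references columns.
SAMPLE_CSV = """doi,title,references
10.1000/a,Paper A,10.1000/b; 10.1000/c
10.1000/b,Paper B,10.1000/c
"""


def build_citation_edges(csv_text):
    """Return (nodes, edges): nodes maps DOI to title, edges holds (citing, cited) pairs."""
    nodes = {}
    edges = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = (row.get("doi") or "").strip().lower()
        if not source:
            continue  # a row without a DOI cannot become a node
        nodes[source] = (row.get("title") or "").strip()
        for target in (row.get("references") or "").split(";"):
            target = target.strip().lower()
            if target and target != source:
                # Cited works outside the corpus become nodes with an empty title.
                nodes.setdefault(target, "")
                edges.add((source, target))
    return nodes, edges


nodes, edges = build_citation_edges(SAMPLE_CSV)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each pair in edges corresponds to one directed edge from the citing paper to the cited paper, so the same loop could feed a NetworkX DiGraph through add_edge.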

Website

The static site in website/ is split into separate pages:

  • index.html — public landing page;
  • methodology.html — method framing and relation to review workflows;
  • tool.html — interactive local-file explorer;
  • publication.html — paper, citation, repository information.

The tool page opens empty by default. The bundled case study is loaded only if the user clicks Try paper case study or opens tool.html?demo=paper.

Case study

The included case study maps literature around Dynamic Projection-Mapping and Spatial Augmented Reality. In the underlying exploration, two broad seed queries produced 200 seed results and a citation graph with 2,198 publication nodes and 8,249 citation edges. The browser demo uses a reduced subset for responsiveness, while the repository includes the richer supporting artifacts in data/case-study/.
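The node and edge counts above give a rough feel for how sparse the case-study graph is. The short calculation below derives two standard summary figures from those counts, assuming a simple directed graph without self-loops. These derived values are computed here for illustration and are not figures reported by the project.

```python
# Counts quoted in the case-study description.
nodes = 2198
edges = 8249

# Average number of citation edges per publication node.
mean_edges_per_node = edges / nodes

# Directed density: observed edges over all possible ordered node pairs.
density = edges / (nodes * (nodes - 1))

print(f"{mean_edges_per_node:.2f}")  # 3.75
print(f"{density:.4f}")  # 0.0017
```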

Publication and archival status

SotA Lens is distributed as open-source research software, archived through Zenodo, and accompanied by an arXiv methodology preprint.

  • Software archive: 10.5281/zenodo.19860899.
  • arXiv preprint: arXiv:2605.16333.
  • Preprint PDF: https://arxiv.org/pdf/2605.16333.
  • JOSS: a JOSS-ready manuscript is available under paper/paper.md, but the project is not currently under JOSS review. The initial submission was procedurally rejected because JOSS currently requires a longer public open-development history for projects developed privately before public release. The repository will remain public while future work proceeds through public issues, pull requests, and releases.

Citation

If you use SotA Lens, please cite the Zenodo software archive:

@software{sotalens2026,
  author  = {Peralta Cordeiro, Diogo},
  title   = {SotA Lens: Citation-network mapping for exploratory state-of-the-art reviews},
  year    = {2026},
  version = {0.1.0},
  doi     = {10.5281/zenodo.19860899},
  url     = {https://github.com/diogogithub/sota-lens},
  note    = {Computer software}
}

The extended methodology preprint can be cited as:

@misc{peraltacordeiro2026sotalenspreprint,
  author        = {Peralta Cordeiro, Diogo},
  title         = {SotA Lens: Citation-network mapping for exploratory state-of-the-art reviews},
  year          = {2026},
  eprint        = {2605.16333},
  archivePrefix = {arXiv},
  primaryClass  = {cs.DL},
  url           = {https://arxiv.org/abs/2605.16333}
}

Contributing

Contributions, issues, and suggestions are welcome. Please see CONTRIBUTING.md.

License

SotA Lens is released under the MIT License. See LICENSE.