fix(types): make AutohighlightResult.rank Optional to tolerate missing field #207

Open
tsushanth wants to merge 1 commit into AssemblyAI:master from tsushanth:fix/issue-148-autohighlight-rank-optional
@tsushanth

Summary

Closes #148.

The Auto Highlights API has been observed in production to return individual auto_highlights_result.results entries with no rank field set. The SDK currently declares rank: float as required, so the Pydantic validator raises:

pydantic.v1.error_wrappers.ValidationError: 1 validation error for TranscriptResponse
auto_highlights_result -> results -> 2 -> rank
  field required (type=value_error.missing)

The error is raised while parsing the overall TranscriptResponse, so a field missing from a single entry becomes a hard failure that loses the entire transcript (text, utterances, words, sentiment, every other audio-intelligence result), not just the affected highlight.
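The failure mode can be reproduced in isolation. The sketch below is a minimal illustration, not the SDK's code: `StrictHighlight` and `StrictResponse` are hypothetical stand-ins for `AutohighlightResult` and `TranscriptResponse`, reduced to a few fields, and the payload is made up. It assumes `pydantic` is installed and uses only calls that behave the same in pydantic 1.x and 2.x.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class StrictHighlight(BaseModel):
    # Hypothetical stand-in for AutohighlightResult with `rank` still required.
    count: int
    rank: float
    text: str


class StrictResponse(BaseModel):
    # Hypothetical stand-in for TranscriptResponse, reduced to two fields.
    text: str
    results: List[StrictHighlight]


# Made-up payload: the second highlight has no "rank" key.
payload = {
    "text": "full transcript text",
    "results": [
        {"count": 2, "rank": 0.9, "text": "first highlight"},
        {"count": 1, "text": "second highlight"},
    ],
}

parsed = True
error_locations = []
try:
    StrictResponse(**payload)
except ValidationError as exc:
    # One incomplete entry rejects the whole response, so the valid
    # top-level "text" is never made available to the caller.
    parsed = False
    error_locations = [err["loc"] for err in exc.errors()]
```

Validation is all-or-nothing at the top-level model, which is why a single nested entry without `rank` prevents any part of the response from being returned.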

Fix

 class AutohighlightResult(BaseModel):
     count: int
-    rank: float
+    rank: Optional[float] = None
     text: str
     timestamps: List[Timestamp]

A missing rank parses through cleanly and the absence is observable to callers via rank is None. No silent default-value substitution; the field is genuinely missing, and the type reflects that.

The change is narrowly scoped to the single field surfaced by the reporter. The other fields on AutohighlightResult (count, text, timestamps) remain required: they haven't been seen missing in the wild, so they aren't speculatively loosened here.
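As a rough check of the behaviour described above, the following sketch declares `rank` as `Optional[float] = None` on a reduced model. `Highlight` and `Response` are hypothetical stand-ins rather than the SDK classes, `timestamps` is omitted for brevity, and `pydantic` is assumed to be installed (the calls used work in both 1.x and 2.x).

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Highlight(BaseModel):
    # Hypothetical stand-in for AutohighlightResult after the fix.
    count: int
    rank: Optional[float] = None  # a missing rank parses as None
    text: str


class Response(BaseModel):
    # Hypothetical stand-in for TranscriptResponse.
    text: str
    results: List[Highlight]


response = Response(
    text="full transcript text",
    results=[
        {"count": 2, "rank": 0.9, "text": "first highlight"},
        {"count": 1, "text": "second highlight"},  # no "rank" key
    ],
)
ranks = [h.rank for h in response.results]

# The other fields stay required: omitting `count` is still rejected.
try:
    Highlight(rank=0.5, text="no count")
    count_still_required = False
except ValidationError:
    count_still_required = True
```

With this declaration the response parses, the affected entry exposes `rank is None`, and entries that do carry a rank are unchanged.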

Tests

New regression test in tests/unit/test_auto_highlights.py:

def test_auto_highlights_parses_result_without_rank(httpx_mock: HTTPXMock):
    """
    Regression for #148. ... A missing rank parses as None and the rest
    of the response is preserved.
    """
    mock_response = factories.generate_dict_factory(
        AutohighlightTranscriptResponseFactory
    )()
    del mock_response["auto_highlights_result"]["results"][0]["rank"]

    _, transcript = unit_test_utils.submit_mock_transcription_request(
        httpx_mock, mock_response=mock_response,
        config=aai.TranscriptionConfig(auto_highlights=True),
    )

    assert transcript.error is None
    assert transcript.auto_highlights.results[0].rank is None
    assert transcript.auto_highlights.results[0].text is not None

Verified the test fails on master (current required-rank contract) and passes on this commit.

Test plan

  • pytest tests/unit/test_auto_highlights.py -v — 3/3 pass (2 existing + 1 new)
  • Verified new test fails when the rank: Optional[float] = None change is reverted via git stash
  • Other AutohighlightResult fields (count, text, timestamps) still required — no inadvertent loosening
  • Type annotation accurate: Optional[float] lets rank is None work as expected at the call site
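For callers, the practical consequence is that `rank` can no longer be assumed to be a number. The sketch below shows one possible way to handle that when ordering highlights. It uses a plain dataclass as a hypothetical stand-in for `AutohighlightResult`, and `sort_by_rank` is an illustrative helper, not part of the SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Highlight:
    # Hypothetical stand-in for AutohighlightResult; only two fields kept.
    text: str
    rank: Optional[float] = None


def sort_by_rank(highlights: List[Highlight]) -> List[Highlight]:
    """Return ranked highlights first (highest rank first).

    Unranked highlights go last and keep their input order,
    because sorted() is stable and they all share the same key.
    """
    return sorted(
        highlights,
        key=lambda h: (h.rank is None, -(h.rank or 0.0)),
    )


highlights = [Highlight("a", 0.2), Highlight("b"), Highlight("c", 0.8)]
ordered = [h.text for h in sort_by_rank(highlights)]
```

Checking `rank is None` explicitly, rather than relying on truthiness, also keeps a legitimate rank of `0.0` distinct from a missing one.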

Notes for review

  • I considered a broader loosening (e.g. making count or text Optional too as forward-compat) but kept the diff scoped to the actually-reported field. If the API is known to omit other fields in newer responses, happy to widen — let me know.

…g field

Closes AssemblyAI#148.

The Auto Highlights API has been observed in production to return
individual `auto_highlights_result.results` entries with no `rank`
field set. The SDK currently declares `rank: float` as required, so
the Pydantic validator raises a `ValidationError` while parsing the
overall `TranscriptResponse` — turning a missing-field-on-one-entry
into a hard failure that loses the entire transcript (text,
utterances, words, sentiment, every other audio-intelligence result).

Make `rank` Optional with a default of `None` so a missing rank parses
through cleanly. The absence is observable to callers via `rank is
None` — no silent default-value substitution.

The change is narrowly scoped to the single field surfaced by the
reporter. The other fields on `AutohighlightResult` (count, text,
timestamps) remain required.

Regression test deletes `rank` from the first highlight in the mock
response, asserts the TranscriptResponse parses, the affected
entry's rank is None, and other entries still have their rank set.
Verified that the new test fails on master (the existing required-rank
contract) and passes on this commit.
Development

Successfully merging this pull request may close these issues.

pydantic validation error: rank field missing from auto_highlights_result

1 participant