bug: pdf splitting modifies returned csv elements #201

**Describe the bug**
When `output_format` is set to `csv`, the API response differs depending on whether `split_pdf_page` is `True` or `False`. When it is `False`, the elements contain an extra metadata field, `text_as_html`. This also means the element IDs do not match between the two responses.

**To Reproduce**
`_test_unstructured_client/integration/test_decorators.py::test_integration_split_csv_response` illustrates this, but it currently passes because it only asserts on a shortened string.
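
One way to surface the difference without relying on a shortened string is to parse both CSV payloads and compare them by column and by row. The sketch below is only an illustration: `diff_csv_responses` is a hypothetical helper, not part of the client, and the two sample payloads are made-up stand-ins for the real responses (the column names other than `text_as_html` and all values are assumptions).

```python
import csv
import io


def parse_csv(text: str):
    """Return (column names, rows as dicts) for a CSV payload."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def diff_csv_responses(split_csv: str, unsplit_csv: str) -> dict:
    """Hypothetical helper: summarize how two CSV responses differ."""
    split_cols, split_rows = parse_csv(split_csv)
    unsplit_cols, unsplit_rows = parse_csv(unsplit_csv)

    # Only columns present in both payloads can be compared value by value.
    shared = [c for c in split_cols if c in unsplit_cols]
    mismatched_rows = [
        i
        for i, (a, b) in enumerate(zip(split_rows, unsplit_rows))
        if any(a[c] != b[c] for c in shared)
    ]
    return {
        "only_in_split": sorted(set(split_cols) - set(unsplit_cols)),
        "only_in_unsplit": sorted(set(unsplit_cols) - set(split_cols)),
        "row_count_differs": len(split_rows) != len(unsplit_rows),
        "mismatched_rows": mismatched_rows,
    }


# Made-up sample payloads, NOT real API output.
SPLIT_CSV = "element_id,type,text\nabc,Table,Cell\n"
UNSPLIT_CSV = (
    "element_id,type,text,text_as_html\n"
    "def,Table,Cell,<table></table>\n"
)

result = diff_csv_responses(SPLIT_CSV, UNSPLIT_CSV)
print(result)
```

In a test, asserting that `only_in_split` and `only_in_unsplit` are empty and that `mismatched_rows` is empty would fail on a discrepancy like the one described here, whereas a comparison on a truncated string can miss it.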

**Expected behavior**
The response should be identical whether `split_pdf_page` is `True` or `False`.

