feat(frontend): add HuggingFace media output rendering in result panel by juliethecao · Pull Request #5675 · apache/texera

juliethecao · 2026-06-13T05:22:27Z

⚠️ This PR is stacked on #5568. Until that lands, the diff below may also include earlier PRs' property editor changes depending on which base GitHub is showing. The new code in this PR is the media-type detection helpers, inline media rendering in the result table and row detail modal, and the backend string truncation fix. Once #5568 merges and this PR is retargeted to main, the diff should auto-clean to the PR 8 result-media changes only.

What changes were proposed in this PR?

Render image, audio, and video outputs from HuggingFace tasks inline in the workflow result panel instead of displaying raw data URLs as text.

New file — media-type.util.ts:

isImageUrl — detects data:image/ data URLs and common image extensions (.png, .jpg, .jpeg, .gif, .webp)
isAudioUrl — detects data:audio/ data URLs and common audio extensions (.mp3, .wav, .ogg, .m4a, .flac)
isVideoUrl — detects data:video/ data URLs, common video extensions (.mp4, .webm, .ogg), and the fal.media CDN host used by fal.ai text-to-video outputs
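
The detection rules listed above could be sketched roughly as follows. This is a minimal illustration, not the contents of media-type.util.ts: the helper names `pathExtension`, `hostOf`, and `matchesKind` are hypothetical, and the fal.media host rule (exact host or any subdomain) is an assumption. Note that .ogg appears in both the audio and video lists, so this sketch reports such a URL as both; how the PR resolves that overlap is not stated.

```typescript
// Hypothetical sketch of the three helpers; the real implementation may differ.
const IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "gif", "webp"];
const AUDIO_EXTENSIONS = ["mp3", "wav", "ogg", "m4a", "flac"];
const VIDEO_EXTENSIONS = ["mp4", "webm", "ogg"];

// Lowercase extension of the last path segment, ignoring query string and fragment.
function pathExtension(url: string): string {
  const path = url.split(/[?#]/, 1)[0];
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  const dot = lastSegment.lastIndexOf(".");
  return dot >= 0 ? lastSegment.substring(dot + 1).toLowerCase() : "";
}

// Host of an http(s) URL, or "" when the value is not an http(s) URL.
function hostOf(url: string): string {
  const match = /^https?:\/\/([^/?#:]+)/i.exec(url);
  return match ? match[1].toLowerCase() : "";
}

// Data URLs are judged by their MIME prefix only; everything else by extension.
function matchesKind(value: string, kind: string, extensions: string[]): boolean {
  const trimmed = value.trim();
  if (trimmed.toLowerCase().startsWith("data:")) {
    return trimmed.toLowerCase().startsWith(`data:${kind}/`);
  }
  return extensions.includes(pathExtension(trimmed));
}

function isImageUrl(value: string): boolean {
  return matchesKind(value, "image", IMAGE_EXTENSIONS);
}

function isAudioUrl(value: string): boolean {
  return matchesKind(value, "audio", AUDIO_EXTENSIONS);
}

function isVideoUrl(value: string): boolean {
  const host = hostOf(value.trim());
  // Assumption: any fal.media host counts as video, even without an extension.
  const isFalMedia = host === "fal.media" || host.endsWith(".fal.media");
  return isFalMedia || matchesKind(value, "video", VIDEO_EXTENSIONS);
}
```

Checking the `data:` prefix first keeps a data URL of one type from being misread as another type because of characters in its payload.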

Changes to result-table-frame.component.{ts,html}:

Add isImageCell / isAudioCell / isVideoCell methods that delegate to the detection helpers
Render <img>, <audio controls>, <video controls> tags conditionally in the table cell template; non-media cells fall through to existing text rendering unchanged

Changes to result-panel-modal.component.{ts,html}:

Build rowEntries with per-field media metadata on modal open
Render media fields inline in the row detail view with a copy-to-clipboard fallback for the raw URL
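
As a rough illustration of the rowEntries idea, the sketch below tags each field of a row with a media kind when the modal opens. The `RowEntry` shape, `toDisplayString`, and `detectDataUrlKind` are hypothetical names, and for brevity detection here covers only data URL prefixes rather than the full helper logic.

```typescript
// Hypothetical shapes; the real component's field names are not shown in this PR text.
type MediaKind = "image" | "audio" | "video" | "none";

interface RowEntry {
  key: string;
  value: string;
  mediaKind: MediaKind;
}

// Simplified detection: only data URL prefixes are recognised in this sketch.
function detectDataUrlKind(value: string): MediaKind {
  const match = /^data:(image|audio|video)\//i.exec(value);
  return match ? (match[1].toLowerCase() as MediaKind) : "none";
}

// Strings pass through unchanged; objects are serialised; other values are stringified.
function toDisplayString(raw: unknown): string {
  if (typeof raw === "string") return raw;
  if (raw !== null && typeof raw === "object") return JSON.stringify(raw);
  return String(raw);
}

// Build one entry per field so the template can pick <img>, <audio>, <video>, or text.
function buildRowEntries(row: Record<string, unknown>): RowEntry[] {
  return Object.entries(row).map(([key, raw]) => {
    const value = toDisplayString(raw);
    return { key, value, mediaKind: detectDataUrlKind(value) };
  });
}
```

Computing the metadata once on open, rather than calling the detectors from the template, avoids re-running string checks on every change-detection pass.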

Changes to result-panel-modal.component.scss:

Add modal-toolbar and row-detail styles for media display

New file — media-type.util.spec.ts:

19 unit tests covering all three helpers across data URL prefixes, file extensions (including case-insensitivity and query strings), the fal.media CDN URL, cross-type rejection, and empty/plain-string inputs

Any related issues, documentation, discussions?

How was this PR tested?

Unit tests in media-type.util.spec.ts (19 cases) and result-table-frame.component.spec.ts. Run with ng test.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Sonnet 4.6 in compliance with ASF guidelines

…d media proxy

Introduces a new Jersey REST resource exposing endpoints used by the upcoming HuggingFace operator UI:

- GET /api/huggingface/models: browse / search models per task
- GET /api/huggingface/tasks: list HF pipeline tags with hosted inference
- POST /api/huggingface/upload-audio: upload audio for HF audio tasks
- GET /api/huggingface/audio-preview: stream uploaded audio (path-validated)
- GET /api/huggingface/media-proxy: proxy remote media URLs to bypass CORS

This is the first PR in a stacked series landing the HF operator end-to-end. No operator code yet; this resource is independently useful and lets the frontend integrate with HF before the operator class lands.

Addresses xuang7's review on PR apache#5124: both endpoints previously buffered the full payload into a heap-resident byte[] with no upper bound, leaving the JVM open to OOM on a hostile or buggy upstream response (/media-proxy) or out-of-band write into the audio temp dir (/audio-preview).

- /media-proxy: switch from Unirest.asBytes() to asObject(Function<RawResponse, T>), streaming the upstream body in 8 KiB chunks with a running byte counter. Aborts with 413 if the declared Content-Length exceeds the cap (pre-check) or if the body crosses the cap mid-read (defends against missing/lying Content-Length). New MAX_MEDIA_PROXY_BYTES = 50 MiB, sized for HF inference media (text-to-image ~5 MiB, text-to-video ~30 MiB) with headroom.
- /audio-preview: add Files.size() defense-in-depth check before readAllBytes. /upload-audio already enforces MAX_AUDIO_BYTES on ingest; this catches the case where a bug or out-of-band write puts an oversized file in the temp dir.

Adds a spec covering the audio-preview cap using a sparse-file fixture so the test stays fast (87/87 spec passes). The media-proxy cap path is exercised via the existing input-validation suite plus the new streamMediaWithCap helper; a follow-up can add a fake-RawResponse unit test if reviewers want explicit coverage of the chunked-read cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
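
The capped read described in that commit message can be illustrated with a small language-neutral sketch. The actual change is Scala on top of Unirest; the TypeScript below is not that code, and `readWithCap` and `CappedRead` are hypothetical names. It shows only the two checks named in the message: a pre-check on the declared length and a running byte counter that aborts mid-read.

```typescript
// Hypothetical result type: 200 with the body, or 413 when the cap is exceeded.
type CappedRead =
  | { status: 200; body: Uint8Array }
  | { status: 413; reason: string };

// Reads chunks until exhausted, refusing to buffer more than maxBytes in total.
function readWithCap(
  chunks: Iterable<Uint8Array>,
  maxBytes: number,
  declaredLength?: number
): CappedRead {
  // Pre-check: trust a declared length only enough to reject early.
  if (declaredLength !== undefined && declaredLength > maxBytes) {
    return { status: 413, reason: "declared length exceeds cap" };
  }

  const parts: Uint8Array[] = [];
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.byteLength;
    // Mid-read check: covers a missing or understated declared length.
    if (total > maxBytes) {
      return { status: 413, reason: "body exceeded cap mid-read" };
    }
    parts.push(chunk);
  }

  // Concatenate the accepted chunks into a single buffer.
  const body = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    body.set(part, offset);
    offset += part.byteLength;
  }
  return { status: 200, body };
}
```

With a 50 MiB cap and 8 KiB chunks, the mid-read check bounds memory at roughly the cap plus one chunk, regardless of what the upstream declares.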

Per review on apache#5124 (xuang7, Ma77Ball): mark the resource with @RolesAllowed(Array("REGULAR", "ADMIN")) to document that all five endpoints require an authenticated user. The annotation isn't enforced yet — that's coming with the auth-enforcement PR @Yicong-Huang and @Ma77Ball are working on — but adding it now means no follow-up change is needed when enforcement lands, and it matches the convention used by UserConfigResource / AdminSettingsResource. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eration

Splits the monolithic 1,278-line HuggingFaceInferenceOpDesc from the team's feature branch into a dispatcher + per-task codegen architecture and ships the first task family (text-generation) end-to-end.

- TaskCodegen trait + CodegenContext model the per-task variation
- PythonCodegenBase emits the shared provider-fallback / process_table / _parse_response infrastructure with two holes for the per-task payload and parse snippets
- TextGenCodegen supplies text-generation's chat-completions payload and the body["choices"][0]["message"]["content"] parse branch
- HuggingFaceInferenceOpDesc becomes a thin dispatcher (~180 lines) holding @JsonProperty fields and the registeredCodegens map

User-input string fields are typed as EncodableString and emitted via the pyb"..." macro so values reach Python as self.decode_python_template('<base64>') rather than raw literals; class constants are assigned in open(self) so self is in scope for the decode call. Generated process_table runs a defensive _HF_MODEL_ID_PATTERN check at runtime before any HF URL is composed.

PR 2 of a stacked 9-PR series. PR 1 (apache#5124) ships the supporting REST resource; PRs 3-5 will add image, audio + media-gen, and QA/ranking task families by registering new *Codegen objects in the dispatcher.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…degen specs

Addresses Codecov's 66.85% patch coverage warning by exercising the defensive null-handling branches in HuggingFaceInferenceOpDesc.scala and the TextGenCodegen contract that previously had no spec hits.

- null-tolerance: feed null into every @JsonProperty (token, model, prompt col, system prompt, result col, task, maxNewTokens, temperature) and assert generatePythonCode still emits a parseable ProcessTableOperator with sane defaults (TASK falls back to text-generation, MAX_NEW_TOKENS clamps to 256, TEMPERATURE to 0.7). Covers the `if (x == null) ... else x` branches that previously had no test that took the null side.
- TextGenCodegen.task: trivial canonical-value check.
- TextGenCodegen ctx-independence: pass an "irrelevant"-filled ctx and assert payloadPython / parsePython still reference self.MODEL_ID and body["choices"]…. Catches a future refactor that accidentally splices ctx fields into the static snippets.

13/13 in HuggingFaceInferenceOpDescSpec, 2/2 in PythonCodeRawInvalidTextSpec (117/117 descriptors still py_compile cleanly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Plugs the 9-task image family into the dispatcher pattern established in PR 2:

image-only: image-classification, object-detection, image-segmentation, image-to-text
image + prompt: visual-question-answering, document-question-answering, zero-shot-image-classification, image-text-to-text, image-to-image

- ImageTaskCodegen supplies payload + parse Python for all 9 tasks
- TaskCodegen trait gains a `tasks: Set[String]` default method so a single codegen can register under multiple task strings; the dispatcher map in HuggingFaceInferenceOpDesc is built from registeredCodegens.tasks.flatMap(...)
- CodegenContext extended with imageInput + inputImageColumn (EncodableString)
- HuggingFaceInferenceOpDesc gains 2 new @JsonProperty fields and registers ImageTaskCodegen

PythonCodegenBase grows to host the shared image infrastructure:

- image_only_tasks / image_prompt_tasks / image_tasks tuples and image_headers in process_table
- per-row image bytes resolution from upload (self._read_image_input) or input column (self._read_binary_value + self._compress_image_bytes)
- use_raw_binary_body / raw_binary_headers state threaded through _post_with_fallback (signature extended)
- _post_with_fallback adds the image-text-to-text chat-completions branch and the model-author vision branch
- _call_provider adds branches for zai-org's custom API, Replicate predictions + polling, Fal-ai, Wavespeed submit+poll, and image embedding in OpenAI-compatible / unknown-provider fallbacks
- image-content-type response handling returns data:image URLs
- image helpers added: _read_image_input, _compress_image_bytes, _image_input_as_base64, _read_binary_value, _looks_like_html, _html_to_image_bytes, _extract_json_arg, _url_to_data_url

User-input strings continue to flow through pyb"..." + EncodableString so they reach Python as self.decode_python_template('<base64>') rather than raw literals. PythonCodeRawInvalidTextSpec still passes (117/117 descriptors py_compile cleanly).

Frontend integration adds only the HF lines (no agent / dataset noise from the source branch):

- HuggingFaceImageUploadComponent declared in app.module.ts
- huggingface-image-upload formly type registered in formly-config.ts
- Image upload component .ts/.html/.scss cherry-picked from huggingFace
- HuggingFace.png + sample-image.png assets

PR 3 of a stacked 9-PR series. Stacks on hf/02-operator-textgen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nent Register the huggingface formly field type and declare HuggingFaceComponent in AppModule. Provides a task dropdown, paginated model list with client-side search, and per-task field state preservation when switching tasks.

Register the huggingface-audio-upload formly field type and declare HuggingFaceAudioUploadComponent in AppModule. Handles server-side audio storage via the /huggingface/upload-audio endpoint with local preview. Co-Authored-By: Anish Shivamurthy <anish@uci.edu>

…gFace property editor Show/hide operator fields based on the selected HuggingFace task (e.g., imageInput only for image tasks, contextColumn only for question-answering). Adds task preview cards with media samples per task kind (image/video/audio/text), custom validators for required inputs, and ~13 field visibility rules inside the formly jsonSchemaMapIntercept. Co-Authored-By: Anish Shivamurthy <anish@uci.edu>

Add mockHuggingFaceSchema and mockHuggingFacePredicate to test infrastructure. Add 7 spec tests covering huggingFaceTaskPreview for known tasks (text, image, audio, video), unknown tasks (fallback), empty tasks, and non-HF operators.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… panel Add isImageUrl/isAudioUrl/isVideoUrl detection helpers and wire them into the result table and row detail modal so image, audio, and video outputs from HuggingFace tasks render inline instead of as raw URLs. Gate backend string truncation on output mode so HF data URLs are never cut off.

codecov-commenter · 2026-06-13T05:24:48Z

Codecov Report

❌ Patch coverage is 37.43017% with 560 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.66%. Comparing base (564ccdb) to head (1d05e38).
⚠️ Report is 37 commits behind head on main.

Files with missing lines	Patch %	Lines
...e/component/hugging-face/hugging-face.component.ts	10.23%	192 Missing and 1 partial ⚠️
...component/hugging-face/hugging-face.component.html	0.00%	90 Missing ⚠️
...udio-upload/hugging-face-audio-upload.component.ts	2.98%	64 Missing and 1 partial ⚠️
...it-frame/operator-property-edit-frame.component.ts	49.00%	49 Missing and 2 partials ⚠️
...-frame/operator-property-edit-frame.component.html	3.84%	50 Missing ⚠️
...mage-upload/hugging-face-image-upload.component.ts	50.00%	41 Missing and 1 partial ⚠️
...io-upload/hugging-face-audio-upload.component.html	0.00%	23 Missing ⚠️
...sult-table-frame/result-table-frame.component.html	0.00%	12 Missing ⚠️
...ge-upload/hugging-face-image-upload.component.html	38.88%	11 Missing ⚠️
...onent/result-panel/result-panel-modal.component.ts	62.06%	7 Missing and 4 partials ⚠️
... and 4 more

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5675      +/-   ##
============================================
+ Coverage     52.17%   52.66%   +0.48%     
- Complexity     2482     2586     +104     
============================================
  Files          1068     1090      +22     
  Lines         41311    42628    +1317     
  Branches       4439     4658     +219     
============================================
+ Hits          21556    22448     +892     
- Misses        18490    18860     +370     
- Partials       1265     1320      +55

Flag	Coverage Δ		*Carryforward flag
access-control-service	`71.42% <ø> (+6.81%)`	⬆️
agent-service	`33.76% <ø> (ø)`		Carriedforward from a2f92c5
amber	`54.40% <96.96%> (+1.11%)`	⬆️
computing-unit-managing-service	`1.65% <ø> (ø)`
config-service	`56.71% <ø> (+0.65%)`	⬆️
file-service	`57.06% <ø> (+18.74%)`	⬆️
frontend	`46.41% <23.97%> (-0.02%)`	⬇️
pyamber	`90.72% <ø> (ø)`		Carriedforward from a2f92c5
python	`90.75% <ø> (ø)`		Carriedforward from a2f92c5
workflow-compiling-service	`58.69% <ø> (ø)`

*This pull request uses carry forward flags. Click here to find out more.


github-actions · 2026-06-13T05:27:41Z

✅ No material benchmark regressions detected

🟢 15 better · 🔴 0 worse · ⚪ 0 noise (<±5%) · 0 without baseline

CI benchmark results are noisy; treat <±5% as noise unless repeated.

Dashboard · Run

	config	throughput	MB/s	latency	max Δ latest / 7d
🟢	bs=10 sw=10 sl=64	443	0.27	21,700/27,319/27,319 us	🟢 -28.8% / 🟢 -24.2%
🟢	bs=100 sw=10 sl=64	946	0.577	104,876/139,201/139,201 us	🟢 +29.7% / 🟢 +9.1%
🟢	bs=1000 sw=10 sl=64	1,116	0.681	894,516/940,171/940,171 us	🟢 +17.9% / 🟢 +10.5%

Baseline details

Latest main 7ae9b35 from 2026-06-13T00:44:28.840Z

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	443 tuples/sec	394.18 tuples/sec	400.34 tuples/sec	+12.4%	+10.7%
bs=10 sw=10 sl=64	MB/s	0.27 MB/s	0.241 MB/s	0.244 MB/s	+12.2%	+10.5%
bs=10 sw=10 sl=64	p50	21,700 us	24,805 us	24,191 us	-12.5%	-10.3%
bs=10 sw=10 sl=64	p95	27,319 us	38,345 us	36,018 us	-28.8%	-24.2%
bs=10 sw=10 sl=64	p99	27,319 us	38,345 us	36,018 us	-28.8%	-24.2%
bs=100 sw=10 sl=64	throughput	946 tuples/sec	729.17 tuples/sec	867.19 tuples/sec	+29.7%	+9.1%
bs=100 sw=10 sl=64	MB/s	0.577 MB/s	0.445 MB/s	0.529 MB/s	+29.6%	+9.0%
bs=100 sw=10 sl=64	p50	104,876 us	133,219 us	114,874 us	-21.3%	-8.7%
bs=100 sw=10 sl=64	p95	139,201 us	173,178 us	142,264 us	-19.6%	-2.2%
bs=100 sw=10 sl=64	p99	139,201 us	173,178 us	142,264 us	-19.6%	-2.2%
bs=1000 sw=10 sl=64	throughput	1,116 tuples/sec	946.3 tuples/sec	1,010 tuples/sec	+17.9%	+10.5%
bs=1000 sw=10 sl=64	MB/s	0.681 MB/s	0.578 MB/s	0.616 MB/s	+17.9%	+10.5%
bs=1000 sw=10 sl=64	p50	894,516 us	1,059,718 us	995,071 us	-15.6%	-10.1%
bs=1000 sw=10 sl=64	p95	940,171 us	1,104,221 us	1,046,228 us	-14.9%	-10.1%
bs=1000 sw=10 sl=64	p99	940,171 us	1,104,221 us	1,046,228 us	-14.9%	-10.1%

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,451.66,200,128000,443,0.270,21700.03,27318.78,27318.78
1,100,10,64,20,2114.39,2000,1280000,946,0.577,104875.94,139201.32,139201.32
2,1000,10,64,20,17922.32,20000,12800000,1116,0.681,894515.64,940170.70,940170.70
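
The derived columns in the raw CSV can be reproduced from the raw ones. The sketch below is an illustration only: the function names are hypothetical, and the 1024 × 1024 divisor for MB/s is inferred from the reported numbers rather than stated by the benchmark.

```typescript
// Raw per-config measurements, named after the CSV columns above.
interface BenchRow {
  totalMs: number;
  totalTuples: number;
  totalBytes: number;
}

// tuples_per_sec = total_tuples / elapsed seconds
function tuplesPerSec(row: BenchRow): number {
  return row.totalTuples / (row.totalMs / 1000);
}

// mb_per_sec, assuming 1 MB = 1024 * 1024 bytes (inferred from the reported values)
function mbPerSec(row: BenchRow): number {
  return row.totalBytes / (1024 * 1024) / (row.totalMs / 1000);
}

// Relative change of the PR value against a baseline, in percent.
function percentDelta(prValue: number, baseline: number): number {
  return ((prValue - baseline) / baseline) * 100;
}

// config_idx 0 from the CSV: bs=10 sw=10 sl=64
const row0: BenchRow = { totalMs: 451.66, totalTuples: 200, totalBytes: 128000 };
console.log(Math.round(tuplesPerSec(row0)));      // 443
console.log(mbPerSec(row0).toFixed(3));           // "0.270"
console.log(percentDelta(443, 394.18).toFixed(1)); // "12.4"
```

The last line matches the +12.4% throughput delta against latest main reported in the baseline table for the same config.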

PG1204 and others added 30 commits May 17, 2026 13:02

fix: address review feedback on HuggingFaceModelResource

935ccc1

Merge branch 'apache:main' into hf/01-backend-skeleton

089c3c4

Merge branch 'apache:main' into hf/01-backend-skeleton

2aa865c

Merge branch 'apache:main' into hf/01-backend-skeleton

0c30beb

chore: retrigger CI

6857e34

Merge branch 'apache:main' into hf/01-backend-skeleton

6f0f5fb

Merge branch 'main' into hf/01-backend-skeleton

fec6dfb

Merge branch 'apache:main' into hf/01-backend-skeleton

5e95bcd

fix: scala lint fixes

8350eb9

Merge branch 'apache:main' into hf/02-operator-textgen

2efa337

Merge branch 'apache:main' into hf/02-operator-textgen

c44d7d0

refactor(huggingFace): cap HTTP error detail + lift CHAT_ROUTES / OPE…

28fcab0

…NAI_COMPATIBLE_PROVIDERS to class constants

style: apply scalafmt and prettier to HF inference spec and image upl…

2b46a9c

…oad template

chore: add Apache license header to HF image upload template and styles

0815d14

test(frontend): cover HuggingFaceImageUploadComponent

76f606a

Merge branch 'apache:main' into hf/03-image-tasks

ea3ea63

fix(huggingFace): zero-shot labels, polling progress logs, data-URL c…

ef59a1e

…ap comment

Merge branch 'apache:main' into hf/03-image-tasks

3975e0a

feat(huggingface): add audio and media tasks

8a83dc2

feat(huggingface): add qa and ranking tasks

8507ca5

fix(frontend): add explicit type annotations to rxjs error callbacks

6b1495d

The rxjs/no-implicit-any-catch ESLint rule requires explicit type annotations on error callbacks in .subscribe() calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ELin2025 and others added 6 commits June 8, 2026 14:23

Merge branch 'apache:main' into hf/07-property-editor

fe13f77

style(frontend): format HuggingFace components with prettier

837e554

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(frontend): add takeUntil to rxjs subscribe calls in HuggingFaceCo…

e0b10a2

…mponent Satisfies the rxjs-angular/prefer-takeuntil lint rule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style(frontend): format spec files with prettier

ad58847

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions Bot added the frontend (Changes related to the frontend GUI) and common labels Jun 13, 2026

github-actions Bot assigned juliethecao Jun 13, 2026

fix(frontend): use toBe(true/false) instead of toBeTrue/toBeFalse in …

1d05e38

…media-type spec Vitest does not support Jasmine-style toBeTrue() and toBeFalse() matchers.

juliethecao wants to merge 37 commits into apache:main from ELin2025:hf/08-result-media

6 participants
