0% found this document useful (0 votes)
11 views14 pages

Aether Analyst

The Aether Analyst document provides a detailed technical reference on the architecture and data flow of the Aether Analyst system, which integrates a Next.js frontend with a FastAPI backend. It outlines the system's dual-layer persistence using SQLite for hard state and ChromaDB for semantic memory, as well as the use of Server-Sent Events (SSE) for real-time updates during long-running AI tasks. The document also describes the critical API endpoints and the execution lifecycle of user interactions within the application.

Uploaded by

kaii.skywalkerr
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views14 pages

Aether Analyst

The Aether Analyst document provides a detailed technical reference on the architecture and data flow of the Aether Analyst system, which integrates a Next.js frontend with a FastAPI backend. It outlines the system's dual-layer persistence using SQLite for hard state and ChromaDB for semantic memory, as well as the use of Server-Sent Events (SSE) for real-time updates during long-running AI tasks. The document also describes the critical API endpoints and the execution lifecycle of user interactions within the application.

Uploaded by

kaii.skywalkerr
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

]

Aether Analyst
Exhaustive End-To-End Architecture Deep Dive

A granular, byte-level breakdown of every layer — from the React


component tree through the FastAPI async event loop, the ReAct
cognitive engine, and into the Gemini API sandbox.

Organisation: Aetherflows Studio


Document Type: Internal Technical Reference
Stack: [Link] 15 · React 19 · Python FastAPI · Google Gemini · SQLite · ChromaDB

All components, module names, and data flows described herein


reflect the current production implementation of Aether Analyst.
Contents
1 System Topology & Data Flow Paradigms 3
1.1 Synchronous CRUD via REST . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Asynchronous Streaming via Server-Sent Events (SSE) . . . . . . . . . . . 3
2 The [Link] 15 Frontend (app/dashboard/[Link]) 3
2.1 State Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 The Chat Execution Lifecycle (UI Perspective) . . . . . . . . . . . . . . . 4
3 The FastAPI Backend Orchestration ([Link]) 5
3.1 Critical API Endpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.1 POST /api/upload . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.2 POST /api/chat . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.3 GET /api/chat/stream . . . . . . . . . . . . . . . . . . . . . . 6
3.1.4 Supporting Endpoints . . . . . . . . . . . . . . . . . . . . . . . 6
4 Dual-Layer Persistence ([Link] & [Link]) 7
4.1 Hard State — SQLite via SQLAlchemy ([Link]) . . . . . . . . . . . 7
4.2 Semantic Memory — ChromaDB ([Link]) . . . . . . . . . . . . . . . . 7
5 The ReAct Core Engine (agent/[Link]) 8
5.1 Conceptual Foundation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.2 System Prompt Engineering . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.3 The Execution Loop ([Link]()) . . . . . . . . . . . . . . . . . . 9
6 Persona Routing — The Three Agents 10
6.1 research_agent.py — The Research Analyst . . . . . . . . . . . . . . . . 10
6.2 analyst_agent.py — The Data Scientist . . . . . . . . . . . . . . . . . . 11
6.3 combined_agent.py — The Full-Stack Analyst . . . . . . . . . . . . . . . 11
7 Deep Tool Mechanics (backend/agent/tools/) 11
7.1 The Web Scraping Engine (web_search.py & [Link]) . . . . . . . . . . 11
7.2 The Code Sandbox (code_executor.py) . . . . . . . . . . . . . . . . . . . 12
7.2.1 Execution Mechanism . . . . . . . . . . . . . . . . . . . . . . . 12
7.2.2 Output Capture Mechanism . . . . . . . . . . . . . . . . . . . . 12
7.2.3 Autonomous Error Healing . . . . . . . . . . . . . . . . . . . . . 12
7.3 Auto Exploratory Data Analysis ([Link]) . . . . . . . . . . . . . . . . . . 13
7.4 The PDF Report Engine ([Link]) . . . . . . . . . . . . . . . . . . . . 13
8 System Component Summary 14

2
Aether Analyst — Architecture Reference Aetherflows Studio

1 System Topology & Data Flow Paradigms


Aether Analyst is built on a bifurcated interaction model, splitting all client-server
communication into two distinct paradigms based on the latency tolerance of each
operation.

1.1 Synchronous CRUD via REST


Standard HTTP GET / POST / DELETE requests handle all immutable state actions:
fetching prior chat sessions from the database, downloading generated PDF blobs, and
uploading raw CSV or Excel files. These operations are deterministic and bounded
in time — a typical SQLite read completes in under 5 ms, well within HTTP timeout
thresholds.

1.2 Asynchronous Streaming via Server-Sent Events (SSE)


AI-driven analysis tasks are categorically different. A Gemini reasoning loop that writes
Python code, executes it, reads the traceback, corrects the syntax, and re-executes may
take 60–180 seconds from start to finish. Standard HTTP requests would timeout —
and the user would see nothing.
The solution is Server-Sent Events (SSE): a persistent, one-way HTTP socket held
open by the browser. The FastAPI backend emits discrete JSON packets — called
Events — the exact millisecond each reasoning step completes:

Event Types emitted over SSE:

• thought — Gemini’s internal chain-of-thought reasoning block

• tool_call — The function name and JSON arguments Gemini selected

• observation — The raw output returned by the physical tool function

• message — The final user-facing synthesized answer

• error — A terminal crash payload with a traceback string

This paradigm ensures the UI remains live and informative throughout long-running
operations, rather than displaying a frozen spinner.

2 The [Link] 15 Frontend (app/dashboard/[Link])


The entire client application is rendered server-side by [Link] 15 at initial load and
then hydrated as a highly interactive React 19 Client Component. The server
render provides an instant, content-ful first paint; hydration attaches all event listeners

3
Aether Analyst — Architecture Reference Aetherflows Studio

and real-time state machinery.

2.1 State Architecture


The frontend uses localised React state as a client-side mirror of the backend
database. There is no global state manager (no Redux, no Zustand) — all state lives
inside the top-level [Link] component and flows down via props:

State Variable Purpose


sessions Array of all prior conversations fetched from GET
/api/sessions. Maps directly to the sidebar’s “Re-
cent Chats” list. Each entry carries a session_id,
title, and creation timestamp.
chatMessages The active conversation buffer. Each entry is an ob-
ject with a role field (user / agent / system) and
an array of parts (plain text, thought blocks, tool
execution traces, images).
reports An array fetched from SQLite mapping physical .pdf
file paths on the backend server to interactive cards
in the “Reports” tab of the UI.
activeRunId The UUID of the currently executing background
agent task. Used to construct the SSE stream URL.
isStreaming Boolean flag — when true, the send button is disabled
and the UI renders a live activity indicator.

2.2 The Chat Execution Lifecycle (UI Perspective)


The following is a precise, step-by-step trace of what happens when a user submits a
message:

1. The user types “Analyse sales_q3.csv and generate a correlation heatmap” and
presses Send.

2. The UI optimistically pushes a temporary role: ’user’ message object to


chatMessages so the user sees their own message instantly without waiting for a
server round-trip.

3. A POST /api/chat request is fired carrying the session_id, agent_type, and


user_prompt payload.

4. The server responds immediately (within ~20 ms) with a run_id — a UUID repre-
senting the background task that was just created.

5. The UI constructs the SSE URL: /api/chat/stream?session_id=X&run_id=Y and

4
Aether Analyst — Architecture Reference Aetherflows Studio

opens a native browser EventSource object.

6. As thought, tool_call, and observation events arrive over the socket, the
chatMessages state is mutated in real time using the functional form of setState,
immutably updating the correct message object.

7. Framer Motion accordion components expand dynamically as each thought block


and tool trace event arrives, giving the user a live view of the agent’s internal
reasoning process.

8. When the terminal message event fires, the agent’s final synthesised answer is
injected into the chat, isStreaming is set to false, and the EventSource connection
is closed.

Key Design Principle: The UI never waits for the agent to finish. Every
intermediate state — thoughts, tool calls, observations — is rendered in real time.
This eliminates perceived latency for long-running tasks.

3 The FastAPI Backend Orchestration ([Link])


[Link] is the routing nervous system of the platform. It initialises the SQLAlchemy
database connection as ASGI middleware, mounts all routers, and defines the critical
API surface.

3.1 Critical API Endpoints


3.1.1 POST /api/upload
Receives multipart form data containing .csv, .xlsx, or .pdf files from the browser’s
<input type="file">. Each uploaded file is:

• Streamed directly to the host OS /data/ directory using Python’s aiofiles library
for non-blocking I/O.

• Given a collision-safe filename by prepending a UUID4 prefix to the original


filename.

• Returned to the frontend as an absolute server path string (e.g., /data/a3f2-sales_q3.csv).


The AI sandbox tools require this exact path to ingest the file.

3.1.2 POST /api/chat


The primary entry point for all AI interactions. Its responsibilities are:

1. Intercept the JSON payload (session_id, user_prompt, agent_type).

5
Aether Analyst — Architecture Reference Aetherflows Studio

2. Write the user message to SQLite as a Message entity (role: user).

3. Create a Run database record with an initial status of RUNNING.

4. Use FastAPI’s BackgroundTasks.add_task(_execute_agent_run) to boot the


ReAct engine off the main thread. This is critical: if the agent ran syn-
chronously, the entire Uvicorn event loop would block, rejecting all other incoming
HTTP requests for the duration of the analysis.

5. Immediately return a 200 OK response with the newly created run_id so the frontend
can open the SSE stream without delay.

3.1.3 GET /api/chat/stream


The persistent SSE socket. Its implementation uses Python’s native asynchronous
generator pattern inside a FastAPI StreamingResponse:

1 from fastapi . responses import S treami ngResp onse


2 import asyncio
3

4 async def event_generator ( session_id : str , run_id : str ) :


5 while True :
6 # Poll an in - memory asyncio . Queue for new events
7 event = await event_queues [ run_id ]. get ()
8 # Format to SSE spec : " data : { json }\ n \ n "
9 yield f " data : { event . model_dump_json () }\ n \ n "
10 if event . type in ( " message " , " error " ) :
11 break # Terminal event close the stream
12

13 @app . get ( " / api / chat / stream " )


14 async def stream_chat ( session_id : str , run_id : str ) :
15 return St reamin gResp onse (
16 event_generator ( session_id , run_id ) ,
17 media_type = " text / event - stream " ,
18 headers ={ " Cache - Control " : " no - cache " , "X - Accel - Buffering " :
" no " } ,
19 )

Listing 1: Simplified SSE streaming endpoint

The X-Accel-Buffering: no header is critical for Nginx deployments — without


it, Nginx buffers the entire response body before forwarding, completely defeating the
purpose of streaming.

3.1.4 Supporting Endpoints


• GET /api/sessions — Fetches all Session rows from SQLite for the sidebar.

• GET /api/sessions/{id}/messages — Fetches the full message transcript for


session restoration.

6
Aether Analyst — Architecture Reference Aetherflows Studio

• GET /api/reports/{id}/download — Streams the physical .pdf file from disk as


an application/pdf response.

• DELETE /api/sessions/{id} — Cascades deletion of all child Message, Run, and


Report records.

4 Dual-Layer Persistence ([Link] & [Link])

A production-grade LLM agent requires two fundamentally different forms of memory,


each solving a distinct architectural problem.

4.1 Hard State — SQLite via SQLAlchemy ([Link])


Purpose: Immutable, transactional logs. Every message, every run status, every
generated report is written to [Link] on disk. This survives process restarts and
server reboots.

Model Schema & Purpose


Session id (UUID), title (str), agent_type (enum),
created_at. Top-level workspace container.
Run id (UUID), session_id (FK), status (RUNNING |
SUCCESS | FAILED), started_at, ended_at. Tracks
the lifecycle of every background agent execution.
Message id (UUID), session_id (FK), role (user | agent
| system), content (JSON), timestamp. The exact
chronological transcript. The content field stores the
full parts array as a JSON blob.
Report id (UUID), session_id (FK), filename (str),
file_path (str), created_at. Links generated PDF
documents to their parent sessions.

SQLAlchemy’s declarative ORM is used throughout, with async session factories


to ensure all database I/O is non-blocking and compatible with FastAPI’s ASGI
architecture.

4.2 Semantic Memory — ChromaDB ([Link])


Purpose: Overcoming the LLM’s finite context window via Retrieval-Augmented
Generation (RAG).
The Problem: A conversation spanning thousands of messages across multiple sessions
cannot be fed entirely to Gemini — the token limit would be exceeded, and API costs
would be prohibitive.

7
Aether Analyst — Architecture Reference Aetherflows Studio

The Solution:

1. Older messages are chunked into fixed-size text segments.

2. Each chunk is passed through Google’s text-embedding-004 model, converting


paragraphs into 768-dimensional float vectors that encode semantic meaning.

3. These vectors are stored in ChromaDB’s local vector index on disk.

4. When the user submits a new prompt, the prompt itself is embedded into the same
vector space.

5. ChromaDB executes a cosine similarity search, identifying the top-k (typically


k = 5) most semantically relevant historical chunks.

6. These chunks are silently injected into the Gemini system prompt under a [Relevant
Memory] header, effectively “reminding” the AI of relevant past context it would
otherwise have lost.

Why Cosine Similarity? Unlike Euclidean distance, cosine similarity measures


the angle between vectors, making it invariant to the absolute magnitude (length)
of the text chunk. Two sentences expressing the same idea with different verbosity
will have a cosine similarity close to 1.0 regardless of their character count.

5 The ReAct Core Engine (agent/[Link])


This is the cognitive heartbeat of the platform. The AgentCore class implements
the ReAct (Reasoning and Acting) framework, transforming the Gemini LLM
from a passive text predictor into an autonomous software agent capable of planning,
tool use, and error correction.

5.1 Conceptual Foundation


A raw LLM outputs probability distributions over tokens — it cannot natively do
anything. The ReAct pattern forces the LLM into a structured think-act-observe
loop:

Thoughtt → Actiont → Observationt → Thoughtt+1 → · · · → FinalAnswer

Each Observation (the real-world output of a tool) is appended to the conversation


history and fed back to the model, enabling it to reason about what its action achieved.

8
Aether Analyst — Architecture Reference Aetherflows Studio

5.2 System Prompt Engineering


Before the loop begins, the agent is initialised with a strict behavioural system
prompt that overrides Gemini’s default assistant persona:

“You are Aether Analyst, an expert autonomous data analyst. You must ALWAYS
output a structured thought block before taking any action. You have access to a suite
of physical tools. If the user references a file, call eda immediately to profile it. Never
fabricate data — if you are uncertain, use a tool to verify. Structure your final answer
in clean markdown.”
[Relevant Memory] ← ChromaDB RAG injection point
[Uploaded Files] ← Absolute server paths injected here

5.3 The Execution Loop ([Link]())


1. Generate: Call model.generate_content(history, tools=[...]) via the google-generative
SDK. The tools parameter passes Gemini the JSON schema definitions of all avail-
able functions.

2. Inspect the Response: Gemini’s response can be one of two types:

• [Link] — A plain text string. This is a thought block or the final answer.
Emit as thought or message SSE event respectively.

• Part.function_call — A structured payload containing a name (the function


to call) and args (a JSON dictionary of parameters). This triggers the tool
dispatch path.

3. Dispatch the Tool: [Link] looks up the function_name in its internal routing
dictionary (tool_registry), retrieves the corresponding Python callable, and
invokes it: result = tool_registry[name](**args).

4. Emit Events: The tool_call and subsequent observation SSE events are pushed
to the [Link] so the frontend can render them immediately.

5. Inject Observation: The tool result is appended to history as a FunctionResponse


part — the format Gemini expects for tool return values.

6. Loop: generate_content() is called again with the updated history. The loop
continues until Gemini outputs a terminal text response (no more function calls) or
the maximum iteration cap is hit.

1 async def run ( self , user_prompt : str ) -> None :


2 self . history . append ({ " role " : " user " , " parts " : [ user_prompt ]})
3

4 for _ in range ( MAX_ITERATIONS ) :

9
Aether Analyst — Architecture Reference Aetherflows Studio

5 response = await self . model . g e n e r a t e _ c o n t e n t _ a s y n c (


6 self . history , tools = self . tools
7 )
8 part = response . candidates [0]. content . parts [0]
9

10 if hasattr ( part , " function_call " ) :


11 fn_name = part . function_call . name
12 fn_args = dict ( part . function_call . args )
13 await self . _emit ( " tool_call " , { " name " : fn_name , " args " :
fn_args })
14

15 result = self . tool_registry [ fn_name ](** fn_args )


16 await self . _emit ( " observation " , { " result " : str ( result )
})
17

18 # Append function call + result to history


19 self . history . append ({ " role " : " model " , " parts " : [ part ]})
20 self . history . append ({
21 " role " : " user " ,
22 " parts " : [{ " f unctio n_resp onse " : { " name " : fn_name , "
response " : result }}]
23 })
24 else :
25 # Final text answer
26 await self . _emit ( " message " , { " text " : part . text })
27 return

Listing 2: Simplified ReAct loop core

6 Persona Routing — The Three Agents


Aether Analyst exposes three distinct agent personas, each implemented as a subclass
of AgentCore with a different tool loadout and system prompt variant. This
prevents tool confusion and reduces hallucination rates.

6.1 research_agent.py — The Research Analyst


• Tools Injected: web_search, arxiv_search, web_reader

• Tools Banned: code_executor, eda, generate_report

• Behaviour: Strictly internet-facing. Searches DuckDuckGo and arXiv, reads


full webpage content, and synthesises structured intelligence summaries. Cannot
execute any code. Designed for market research, academic literature reviews, and
competitive analysis.

10
Aether Analyst — Architecture Reference Aetherflows Studio

6.2 analyst_agent.py — The Data Scientist


• Tools Injected: code_executor, eda, generate_report

• Tools Banned: All internet-access tools. Hard-coded prohibition in system prompt.

• Behaviour: Operates exclusively on local files. Profiles CSVs, writes and executes
Python (Pandas, Matplotlib, Scikit-learn), iterates on errors autonomously, and
generates PDF reports. The internet ban prevents the agent from hallucinating by
fetching irrelevant external context.

6.3 combined_agent.py — The Full-Stack Analyst


• Tools Injected: Full union of both tool arrays.

• System Prompt Override: Explicit sequential instruction — “Execute Research


Tools first to build a domain intelligence profile. Only then switch to Data Analysis
Tools to apply findings to the local dataset.” This two-phase prompt prevents Gemini
from getting overwhelmed by the larger tool surface area.

• Use Case: Ideal for tasks like “Research the latest NLP benchmarks and then
compare our model’s test results against them.”

7 Deep Tool Mechanics (backend/agent/tools/)


The tool files contain the physical, deterministic actions that the probabilistic
LLM orchestrates but cannot safely perform on its own.

7.1 The Web Scraping Engine (web_search.py & [Link])


Gemini has no native internet access — its knowledge is static. web_search.py solves
this by simulating a real browser session:

1. A spoofed HTTP GET request is sent to DuckDuckGo’s HTML endpoint


([Link]/?q={query}) with realistic User-Agent and Accept-Language
headers to bypass bot detection.

2. BeautifulSoup4 parses the returned HTML DOM. The top 3 result URLs are
extracted from <a class="result__url"> anchor tags.

3. For each URL, a second GET request fetches the full page content.

4. BeautifulSoup strips all JavaScript, CSS, <script>, and <style> tags, retaining
only semantic text from <p>, <h1>–<h6>, and <li> elements.

5. The stripped text is truncated to a configurable token limit and returned as a single

11
Aether Analyst — Architecture Reference Aetherflows Studio

string blob to Gemini.

[Link] targets [Link]/api/query directly, parsing the standard Atom


XML response to extract Title, Authors, Abstract, Published Date, and PDF
URL with high precision.

7.2 The Code Sandbox (code_executor.py)


This is the most critical and architecturally complex tool on the platform. It enables
the AI to behave as a live Python interpreter.

7.2.1 Execution Mechanism


Gemini drafts a raw Python code string (e.g., import pandas as pd; df = pd.read_csv(’/data/sal
print([Link]())). code_executor.py receives this string and:

1 import contextlib
2 import io
3

4 def execute_code ( code : str ) -> str :


5 # Strip markdown fences if Gemini wraps code in ‘‘‘ python
6 code = re . sub ( r " ‘ ‘ ‘(?: python ) ?| ‘ ‘ ‘ " , " " , code ) . strip ()
7

8 stdout_capture = io . StringIO ()
9 try :
10 with contextlib . redirect_stdout ( stdout_capture ) :
11 exec ( code , { " __builtins__ " : __builtins__ })
12 output = stdout_capture . getvalue ()
13 return output if output else " Code executed successfully .
No output . "
14 except Exception as e :
15 return f " EXECUTION ERROR :\ n { traceback . format_exc () } "

Listing 3: Code execution with output capture and error handling

7.2.2 Output Capture Mechanism


Standard exec() writes all print() output to the OS terminal — which Gemini cannot
physically read. The tool wraps the execution in contextlib.redirect_stdout([Link]()).
This intercepts the stdout stream at the Python interpreter level, capturing all
print output into an in-memory buffer that is then read and returned as the Observation
string.

7.2.3 Autonomous Error Healing


If Gemini writes syntactically broken code (wrong column name, incorrect API usage,
a missing import), the except clause catches the full Python traceback. Critically,
this traceback string is returned verbatim to Gemini as the Observation. On
the next ReAct loop iteration, Gemini reads the error (e.g., “KeyError: ’Revenue’ —

12
Aether Analyst — Architecture Reference Aetherflows Studio

available columns: [’revenue’, ’date’, ’region’]”), self-diagnoses the mistake, and writes
corrected code — no human intervention required.

7.3 Auto Exploratory Data Analysis ([Link])


The [Link] tool serves as a high-level abstraction that prevents Gemini from wasting
API tokens drafting repetitive boilerplate Pandas profiling code for every new file:

1 import pandas as pd
2

3 def run_eda ( file_path : str ) -> str :


4 df = pd . read_csv ( file_path ) if file_path . endswith ( " . csv " ) \
5 else pd . read_excel ( file_path )
6

7 report_parts = [
8 f " Shape : { df . shape } " ,
9 f " Columns : { list ( df . columns ) } " ,
10 f " Dtypes :\ n { df . dtypes . to_string () } " ,
11 f " Describe :\ n { df . describe ( include = ’ all ’) . to_string () } " ,
12 f " Null Counts :\ n { df . isna () . sum () . to_string () } " ,
13 f " Duplicate Rows : { df . duplicated () . sum () } "
14 ]
15 return " \ n \ n " . join ( report_parts )

Listing 4: Automated EDA tool

This single tool call gives Gemini a complete statistical profile of any dataset instantly
— data types, null percentages, value distributions, and shape — enabling it to plan its
analysis strategy before writing a single line of code.

7.4 The PDF Report Engine ([Link])


When analysis is complete, Gemini does not dump raw markdown into the chat. Instead,
it invokes the generate_report(markdown_str) tool, which executes a four-stage
rendering pipeline:

1. Markdown → HTML: The markdown Python library converts the agent’s mark-
down string into a full HTML document with semantic tags (<p>, <table>, <h1>–<h4>,
<code>).

2. CSS Injection: A hardcoded CSS stylesheet encoding the Aether design system
(dark table headers, monospace fonts for code blocks, consistent 2.5 cm page margins,
lavender accent colours) is injected into the HTML <head>.

3. PDF Rasterisation: The xhtml2pdf library (pisa) consumes the complete HTM-
L/CSS document and rasterises it into A4-sized PDF pages, handling pagination,
table overflow, and image embedding automatically.

13
Aether Analyst — Architecture Reference Aetherflows Studio

4. Persistence: The output file is written to /data/reports/report_{uuid}.pdf


on disk, and a corresponding Report row is inserted into SQLite, making the file
immediately available for download via the GET /api/reports/{id}/download
endpoint.

8 System Component Summary

Component Role Key Technology

Frontend Real-time chat UI, SSE con- [Link] 15, React 19,
sumer, session management Framer Motion
API Server Request routing, back- Python FastAPI,
ground task dispatch, SSE Uvicorn, asyncio
emission
Hard Persistence Immutable mes- SQLite, SQLAlchemy
sage/run/report logs (async)
Semantic Memory Long-context RAG over ChromaDB,
conversation history text-embedding-004
ReAct Engine LLM orchestration, tool google-generativeai
dispatch, loop management SDK, Gemini 1.5 Pro
Web Scraper Live internet access via requests,
DuckDuckGo + arXiv BeautifulSoup4,
[Link]
Code Executor Dynamic Python execution exec(), contextlib,
with output capture [Link]
EDA Tool Automated dataset profil- pandas
ing describe/info/isna
PDF Engine Markdown-to-PDF report markdown, xhtml2pdf
generation (pisa)
File Storage Persistent upload and re- OS filesystem,
port storage aiofiles

End of Aether Analyst Architecture Reference


Aetherflows Studio · Internal Documentation

14

You might also like