This view is limited to 50 files because the change set is too large; consult the raw diff for the complete change list.
Files changed (50)
  1. .cursorrules +0 -240
  2. .env.example +97 -80
  3. .github/README.md +11 -2
  4. .github/scripts/deploy_to_hf_space.py +0 -391
  5. .github/workflows/ci.yml +23 -70
  6. .github/workflows/deploy-hf-space.yml +0 -47
  7. .github/workflows/docs.yml +61 -0
  8. .gitignore +5 -6
  9. .pre-commit-config.yaml +11 -21
  10. =0.22.0 +0 -0
  11. =0.22.0, +0 -0
  12. AGENTS.txt +0 -236
  13. LICENSE.md +0 -25
  14. Makefile +51 -0
  15. README.md +86 -26
  16. deployments/README.md +0 -46
  17. deployments/modal_tts.py +0 -97
  18. dev/Makefile +51 -0
  19. docs/api/agents.md +103 -48
  20. docs/api/models.md +110 -57
  21. docs/api/orchestrators.md +86 -44
  22. docs/api/services.md +41 -123
  23. docs/api/tools.md +29 -57
  24. docs/architecture/agents.md +18 -123
  25. docs/architecture/graph-orchestration.md +152 -0
  26. docs/architecture/graph_orchestration.md +42 -185
  27. docs/architecture/middleware.md +37 -45
  28. docs/architecture/orchestrators.md +55 -58
  29. docs/architecture/services.md +28 -36
  30. docs/architecture/tools.md +33 -29
  31. docs/architecture/workflow-diagrams.md +20 -5
  32. docs/architecture/workflows.md +662 -0
  33. docs/configuration/CONFIGURATION.md +743 -0
  34. docs/configuration/index.md +260 -78
  35. CONTRIBUTING.md → docs/contributing.md +66 -132
  36. docs/contributing/code-quality.md +30 -73
  37. docs/contributing/code-style.md +16 -42
  38. docs/contributing/error-handling.md +15 -4
  39. docs/contributing/implementation-patterns.md +20 -7
  40. docs/contributing/index.md +26 -121
  41. docs/contributing/prompt-engineering.md +10 -0
  42. docs/contributing/testing.md +12 -66
  43. docs/getting-started/examples.md +31 -24
  44. docs/getting-started/installation.md +10 -18
  45. docs/getting-started/mcp-integration.md +14 -6
  46. docs/getting-started/quick-start.md +14 -41
  47. docs/index.md +9 -28
  48. docs/{LICENSE.md → license.md} +0 -0
  49. docs/overview/architecture.md +14 -16
  50. docs/overview/features.md +19 -44
.cursorrules DELETED
@@ -1,240 +0,0 @@
- # DeepCritical Project - Cursor Rules
-
- ## Project-Wide Rules
-
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
-
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
-
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
-
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
-
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
-
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
-
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
-
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
-
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
-
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
-
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
-
- ---
-
- ## src/agents/ - Agent Implementation Rules
-
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
-
- **Agent Structure**:
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
- - Agent class with `__init__(model: Any | None = None)`
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
-
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
-
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
-
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
-
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
-
- **Agent-Specific Rules**:
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
- - `thinking.py`: Returns observation string from conversation history.
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
-
- ---
-
- ## src/tools/ - Search Tool Rules
-
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
-
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
-
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
-
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
-
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
-
- **Tool-Specific Rules**:
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
-
- ---
-
- ## src/middleware/ - Middleware Rules
-
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
-
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
-
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
-
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
-
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
-
- ---
-
- ## src/orchestrator/ - Orchestration Rules
-
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
-
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
-
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
-
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
-
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
-
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
-
- ---
-
- ## src/services/ - Service Rules
-
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
-
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
-
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
-
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: `@lru_cache(maxsize=1); def get_service() -> Service: return Service()`. Lazy initialization to avoid requiring dependencies at import time.
-
- ---
-
- ## src/utils/ - Utility Rules
-
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
-
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
-
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
-
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
-
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
-
- ---
-
- ## src/orchestrator_factory.py Rules
-
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
-
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
-
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
-
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
-
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
-
- ---
-
- ## src/orchestrator_hierarchical.py Rules
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
-
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
-
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
-
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
-
- ---
-
- ## src/orchestrator_magentic.py Rules
-
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
-
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
-
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
-
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
-
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
-
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
-
- ---
-
- ## src/agent_factory/ - Factory Rules
-
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
-
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
-
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
-
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
-
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
-
- ---
-
- ## src/prompts/ - Prompt Rules
-
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
-
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
-
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
-
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
-
- ---
-
- ## Testing Rules
-
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
-
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
-
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
-
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
-
- ---
-
- ## File-Specific Agent Rules
-
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
-
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
-
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
-
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
-
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
-
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
-
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
-
-
-
-
-
-
-
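
The deleted rules above prescribe several interlocking conventions: structlog for logging, chained custom exceptions, and executor offloading for CPU-bound work. A minimal sketch of how they compose, with `SearchError` defined locally as a stand-in for the class in `src/utils/exceptions.py` (illustrative only, not code from this repository):

```python
import asyncio

import structlog

logger = structlog.get_logger()


class SearchError(Exception):
    """Local stand-in for src.utils.exceptions.SearchError named in the rules."""


def _parse_response(raw: str) -> dict[str, int]:
    """CPU-bound parsing; runs in an executor so the event loop is never blocked."""
    return {"length": len(raw)}


async def fetch_and_parse(fetch, url: str) -> dict[str, int]:
    """Async I/O, a chained custom exception, and structured logging, per the rules."""
    try:
        raw = await fetch(url)
    except Exception as e:
        logger.error("Search request failed", url=url, error=str(e))
        raise SearchError(f"Failed to fetch {url}") from e
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, _parse_response, raw)


async def main() -> None:
    async def fake_fetch(url: str) -> str:  # placeholder for a real httpx call
        return f"<html>{url}</html>"

    # asyncio.gather() for parallel operations, as the async rules require.
    results = await asyncio.gather(*(fetch_and_parse(fake_fetch, u) for u in ("a", "b")))
    logger.info("Search complete", count=len(results))


if __name__ == "__main__":
    asyncio.run(main())
```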
.env.example CHANGED
@@ -1,83 +1,63 @@
- # HuggingFace
- HF_TOKEN=your_huggingface_token_here
 
- # OpenAI (optional)
- OPENAI_API_KEY=your_openai_key_here
 
- # Anthropic (optional)
- ANTHROPIC_API_KEY=your_anthropic_key_here
 
  # Model names (optional - sensible defaults set in config.py)
- # ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
  # OPENAI_MODEL=gpt-5.1
 
 
- # ============================================
- # Audio Processing Configuration (TTS)
- # ============================================
- # Kokoro TTS Model Configuration
- TTS_MODEL=hexgrad/Kokoro-82M
- TTS_VOICE=af_heart
- TTS_SPEED=1.0
- TTS_GPU=T4
- TTS_TIMEOUT=60
-
- # Available TTS Voices:
- # American English Female: af_heart, af_bella, af_nicole, af_aoede, af_kore, af_sarah, af_nova, af_sky, af_alloy, af_jessica, af_river
- # American English Male: am_michael, am_fenrir, am_puck, am_echo, am_eric, am_liam, am_onyx, am_santa, am_adam
-
- # Available GPU Types (Modal):
- # T4 - Cheapest, good for testing (default)
- # A10 - Good balance of cost/performance
- # A100 - Fastest, most expensive
- # L4 - NVIDIA L4 GPU
- # L40S - NVIDIA L40S GPU
- # Note: GPU type is set at function definition time. Changes require app restart.
-
- # ============================================
- # Audio Processing Configuration (STT)
- # ============================================
- # Speech-to-Text API Configuration
- STT_API_URL=nvidia/canary-1b-v2
- STT_SOURCE_LANG=English
- STT_TARGET_LANG=English
-
- # Available STT Languages:
- # English, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, Ukrainian
-
- # ============================================
- # Audio Feature Flags
- # ============================================
- ENABLE_AUDIO_INPUT=true
- ENABLE_AUDIO_OUTPUT=true
-
- # ============================================
- # Image OCR Configuration
- # ============================================
- OCR_API_URL=prithivMLmods/Multimodal-OCR3
- ENABLE_IMAGE_INPUT=true
-
- # ============== EMBEDDINGS ==============
-
- # OpenAI Embedding Model (used if LLM_PROVIDER is openai and performing RAG/Embeddings)
- OPENAI_EMBEDDING_MODEL=text-embedding-3-small
-
- # Local Embedding Model (used for local/offline embeddings)
- LOCAL_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
-
- # ============== HUGGINGFACE (FREE TIER) ==============
-
- # HuggingFace Token - enables Llama 3.1 (best quality free model)
  # Get yours at: https://huggingface.co/settings/tokens
- #
- # WITHOUT HF_TOKEN: Falls back to ungated models (zephyr-7b-beta)
- # WITH HF_TOKEN: Uses Llama 3.1 8B Instruct (requires accepting license)
  #
  # For HuggingFace Spaces deployment:
  # Set this as a "Secret" in Space Settings -> Variables and secrets
  # Users/judges don't need their own token - the Space secret is used
  #
  HF_TOKEN=hf_your-token-here
 
  # ============== AGENT CONFIGURATION ==============
 
@@ -85,23 +65,60 @@ MAX_ITERATIONS=10
  SEARCH_TIMEOUT=30
  LOG_LEVEL=INFO
 
- # ============================================
- # Modal Configuration (Required for TTS)
- # ============================================
- # Modal credentials are required for TTS (Text-to-Speech) functionality
- # Get your credentials from: https://modal.com/
- MODAL_TOKEN_ID=your_modal_token_id_here
- MODAL_TOKEN_SECRET=your_modal_token_secret_here
 
  # ============== EXTERNAL SERVICES ==============
 
- # PubMed (optional - higher rate limits)
  NCBI_API_KEY=your-ncbi-key-here
 
- # Vector Database (optional - for LlamaIndex RAG)
  CHROMA_DB_PATH=./chroma_db
- # Neo4j Knowledge Graph
- NEO4J_URI=bolt://localhost:7687
- NEO4J_USER=neo4j
- NEO4J_PASSWORD=your_neo4j_password_here
- NEO4J_DATABASE=your_database_name
+ # ============== LLM CONFIGURATION ==============
+
+ # Provider: "openai", "anthropic", or "huggingface"
+ LLM_PROVIDER=openai
+
+ # API Keys (at least one required for full LLM analysis)
+ OPENAI_API_KEY=sk-your-key-here
+ ANTHROPIC_API_KEY=sk-ant-your-key-here
+
  # Model names (optional - sensible defaults set in config.py)
  # OPENAI_MODEL=gpt-5.1
+ # ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
+
+ # ============== HUGGINGFACE CONFIGURATION ==============
+
+ # HuggingFace Token - enables gated models and higher rate limits
  # Get yours at: https://huggingface.co/settings/tokens
+ #
+ # WITHOUT HF_TOKEN: Falls back to ungated models (zephyr-7b-beta, Qwen2-7B)
+ # WITH HF_TOKEN: Uses gated models (Llama 3.1, Gemma-2) via inference providers
  #
  # For HuggingFace Spaces deployment:
  # Set this as a "Secret" in Space Settings -> Variables and secrets
  # Users/judges don't need their own token - the Space secret is used
  #
  HF_TOKEN=hf_your-token-here
+ # Alternative: HUGGINGFACE_API_KEY (same as HF_TOKEN)
+
+ # Default HuggingFace model for inference (gated, requires auth)
+ # Can be overridden in UI dropdown
+ # Latest reasoning models: Qwen3-Next-80B-A3B-Thinking, Qwen3-Next-80B-A3B-Instruct, Llama-3.3-70B-Instruct
+ HUGGINGFACE_MODEL=Qwen/Qwen3-Next-80B-A3B-Thinking
+
+ # Fallback models for HuggingFace Inference API (comma-separated)
+ # Models are tried in order until one succeeds
+ # Format: model1,model2,model3
+ # Latest reasoning models first, then reliable fallbacks
+ # Reasoning models: Qwen3-Next (thinking/instruct), Llama-3.3-70B, Qwen3-235B
+ # Fallbacks: Llama-3.1-8B, Zephyr-7B (ungated), Qwen2-7B (ungated)
+ HF_FALLBACK_MODELS=Qwen/Qwen3-Next-80B-A3B-Thinking,Qwen/Qwen3-Next-80B-A3B-Instruct,meta-llama/Llama-3.3-70B-Instruct,meta-llama/Llama-3.1-8B-Instruct,HuggingFaceH4/zephyr-7b-beta,Qwen/Qwen2-7B-Instruct
+
+ # Override model/provider selection (optional, usually set via UI)
+ # HF_MODEL=Qwen/Qwen3-Next-80B-A3B-Thinking
+ # HF_PROVIDER=hyperbolic
+
+ # ============== EMBEDDING CONFIGURATION ==============
+
+ # Embedding Provider: "openai", "local", or "huggingface"
+ # Default: "local" (no API key required)
+ EMBEDDING_PROVIDER=local
+
+ # OpenAI Embedding Model (used if EMBEDDING_PROVIDER=openai)
+ OPENAI_EMBEDDING_MODEL=text-embedding-3-small
+
+ # Local Embedding Model (sentence-transformers, used if EMBEDDING_PROVIDER=local)
+ # BAAI/bge-small-en-v1.5 is newer, faster, and better than all-MiniLM-L6-v2
+ LOCAL_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
+
+ # HuggingFace Embedding Model (used if EMBEDDING_PROVIDER=huggingface)
+ HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
 
  # ============== AGENT CONFIGURATION ==============
 
  SEARCH_TIMEOUT=30
  LOG_LEVEL=INFO
 
+ # Graph-based execution (experimental)
+ # USE_GRAPH_EXECUTION=false
+
+ # Budget & Rate Limiting
+ # DEFAULT_TOKEN_LIMIT=100000
+ # DEFAULT_TIME_LIMIT_MINUTES=10
+ # DEFAULT_ITERATIONS_LIMIT=10
+
+ # ============== WEB SEARCH CONFIGURATION ==============
+
+ # Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
+ # Default: "duckduckgo" (no API key required)
+ WEB_SEARCH_PROVIDER=duckduckgo
+
+ # Serper API Key (for Google search via Serper)
+ # SERPER_API_KEY=your-serper-key-here
+
+ # SearchXNG Host URL (for self-hosted search)
+ # SEARCHXNG_HOST=http://localhost:8080
+
+ # Brave Search API Key
+ # BRAVE_API_KEY=your-brave-key-here
+
+ # Tavily API Key
+ # TAVILY_API_KEY=your-tavily-key-here
 
  # ============== EXTERNAL SERVICES ==============
 
+ # PubMed (optional - higher rate limits: 10 req/sec vs 3 req/sec)
  NCBI_API_KEY=your-ncbi-key-here
 
+ # Modal (optional - for secure code execution sandbox)
+ # MODAL_TOKEN_ID=your-modal-token-id
+ # MODAL_TOKEN_SECRET=your-modal-token-secret
+
+ # ============== VECTOR DATABASE (ChromaDB) ==============
+
+ # ChromaDB storage path
  CHROMA_DB_PATH=./chroma_db
+
+ # Persist ChromaDB to disk (default: true)
+ # CHROMA_DB_PERSIST=true
+
+ # Remote ChromaDB server (optional)
+ # CHROMA_DB_HOST=localhost
+ # CHROMA_DB_PORT=8000
+
+ # ============== RAG SERVICE CONFIGURATION ==============
+
+ # ChromaDB collection name for RAG
+ # RAG_COLLECTION_NAME=deepcritical_evidence
+
+ # Number of top results to retrieve from RAG
+ # RAG_SIMILARITY_TOP_K=5
+
+ # Automatically ingest evidence into RAG
+ # RAG_AUTO_INGEST=true
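
These variables are presumably consumed through the Pydantic Settings class the project rules point at (`src/utils/config.py`). A minimal sketch of how a few of the new keys could map onto such a class; the field names here are assumptions mirroring the variable names, not the repository's actual settings code:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Hypothetical subset of the settings implied by the new .env.example."""

    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    llm_provider: str = "openai"
    embedding_provider: str = "local"
    web_search_provider: str = "duckduckgo"
    hf_fallback_models: str = ""  # comma-separated list, split at the use site

    @property
    def fallback_model_list(self) -> list[str]:
        return [m.strip() for m in self.hf_fallback_models.split(",") if m.strip()]


settings = Settings()  # values load from .env automatically
print(settings.fallback_model_list)
```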
.github/README.md CHANGED
@@ -3,7 +3,8 @@
  > **You are reading the Github README!**
  >
  > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
- > - 📖 **Demo README**: Check out the [Demo README](..README.md) for more information > - 🏆 **Demo**: Kindly consider using our [Free Demo](https://hf.co/DataQuests/GradioDemo)
 
 
  <div align="center">
@@ -37,7 +38,15 @@ gradio run "src/app.py"
 
  Open your browser to `http://localhost:7860`.
 
- ### 3. Connect via MCP
 
  This application exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
 
  > **You are reading the Github README!**
  >
  > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
+ > - 📖 **Demo README**: Check out the [Demo README](..README.md) for setup, configuration, and contribution guidelines
+ > - 🏆 **Hackathon Submission**: Keep reading below for more information about our MCP Hackathon submission
 
 
  <div align="center">
 
  Open your browser to `http://localhost:7860`.
 
+ ### 3. Authentication (Optional)
+
+ **HuggingFace OAuth Login**:
+ - Click the "Sign in with HuggingFace" button at the top of the app
+ - Your HuggingFace API token will be automatically used for AI inference
+ - No need to manually enter API keys when logged in
+ - OAuth token is used only for the current session and never stored
+
+ ### 4. Connect via MCP
 
  This application exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
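
The OAuth flow added above is typically wired up with Gradio's built-in login components. A sketch of the pattern, assuming Gradio's `LoginButton`/`OAuthToken` API and a hypothetical `run_research` handler (not this repository's actual app code; OAuth injection only activates when hosted on a Space):

```python
import gradio as gr


def run_research(query: str, oauth_token: gr.OAuthToken | None) -> str:
    # Gradio injects the session's OAuth token via the type hint; it must NOT
    # be listed in `inputs`. Fall back to server-side keys when logged out.
    token = oauth_token.token if oauth_token is not None else None
    return f"Would research {query!r} (user token available: {token is not None})"


with gr.Blocks() as demo:
    gr.LoginButton()  # renders "Sign in with HuggingFace"
    query = gr.Textbox(label="Research question")
    output = gr.Markdown()
    query.submit(run_research, inputs=[query], outputs=[output])

demo.launch()
```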
.github/scripts/deploy_to_hf_space.py DELETED
@@ -1,391 +0,0 @@
- """Deploy repository to Hugging Face Space, excluding unnecessary files."""
-
- import os
- import shutil
- import subprocess
- import tempfile
- from pathlib import Path
-
- from huggingface_hub import HfApi
-
-
- def get_excluded_dirs() -> set[str]:
-     """Get set of directory names to exclude from deployment."""
-     return {
-         "docs",
-         "dev",
-         "folder",
-         "site",
-         "tests",  # Optional - can be included if desired
-         "examples",  # Optional - can be included if desired
-         ".git",
-         ".github",
-         "__pycache__",
-         ".pytest_cache",
-         ".mypy_cache",
-         ".ruff_cache",
-         ".venv",
-         "venv",
-         "env",
-         "ENV",
-         "node_modules",
-         ".cursor",
-         "reference_repos",
-         "burner_docs",
-         "chroma_db",
-         "logs",
-         "build",
-         "dist",
-         ".eggs",
-         "htmlcov",
-         "hf_space",  # Exclude the cloned HF Space directory itself
-     }
-
-
- def get_excluded_files() -> set[str]:
-     """Get set of file names to exclude from deployment."""
-     return {
-         ".pre-commit-config.yaml",
-         "mkdocs.yml",
-         "uv.lock",
-         "AGENTS.txt",
-         ".env",
-         ".env.local",
-         "*.local",
-         ".DS_Store",
-         "Thumbs.db",
-         "*.log",
-         ".coverage",
-         "coverage.xml",
-     }
-
-
- def should_exclude(path: Path, excluded_dirs: set[str], excluded_files: set[str]) -> bool:
-     """Check if a path should be excluded from deployment."""
-     # Check if any parent directory is excluded
-     for parent in path.parents:
-         if parent.name in excluded_dirs:
-             return True
-
-     # Check if the path itself is a directory that should be excluded
-     if path.is_dir() and path.name in excluded_dirs:
-         return True
-
-     # Check if the file name matches excluded patterns
-     if path.is_file():
-         # Check exact match
-         if path.name in excluded_files:
-             return True
-         # Check pattern matches (simple wildcard support)
-         for pattern in excluded_files:
-             if "*" in pattern:
-                 # Simple pattern matching (e.g., "*.log")
-                 suffix = pattern.replace("*", "")
-                 if path.name.endswith(suffix):
-                     return True
-
-     return False
-
-
- def deploy_to_hf_space() -> None:
-     """Deploy repository to Hugging Face Space.
-
-     Supports both user and organization Spaces:
-     - User Space: username/space-name
-     - Organization Space: organization-name/space-name
-
-     Works with both classic tokens and fine-grained tokens.
-     """
-     # Get configuration from environment variables
-     hf_token = os.getenv("HF_TOKEN")
-     hf_username = os.getenv("HF_USERNAME")  # Can be username or organization name
-     space_name = os.getenv("HF_SPACE_NAME")
-
-     # Check which variables are missing and provide helpful error message
-     missing = []
-     if not hf_token:
-         missing.append("HF_TOKEN (should be in repository secrets)")
-     if not hf_username:
-         missing.append("HF_USERNAME (should be in repository variables)")
-     if not space_name:
-         missing.append("HF_SPACE_NAME (should be in repository variables)")
-
-     if missing:
-         raise ValueError(
-             f"Missing required environment variables: {', '.join(missing)}\n"
-             f"Please configure:\n"
-             f"  - HF_TOKEN in Settings > Secrets and variables > Actions > Secrets\n"
-             f"  - HF_USERNAME in Settings > Secrets and variables > Actions > Variables\n"
-             f"  - HF_SPACE_NAME in Settings > Secrets and variables > Actions > Variables"
-         )
-
-     # HF_USERNAME can be either a username or organization name
-     # Format: {username|organization}/{space_name}
-     repo_id = f"{hf_username}/{space_name}"
-     local_dir = "hf_space"
-
-     print(f"🚀 Deploying to Hugging Face Space: {repo_id}")
-
-     # Initialize HF API
-     api = HfApi(token=hf_token)
-
-     # Create Space if it doesn't exist
-     try:
-         api.repo_info(repo_id=repo_id, repo_type="space", token=hf_token)
-         print(f"✅ Space exists: {repo_id}")
-     except Exception:
-         print(f"⚠️ Space does not exist, creating: {repo_id}")
-         # Create new repository
-         # Note: For organizations, repo_id should be "org/space-name"
-         # For users, repo_id should be "username/space-name"
-         api.create_repo(
-             repo_id=repo_id,  # Full repo_id including owner
-             repo_type="space",
-             space_sdk="gradio",
-             token=hf_token,
-             exist_ok=True,
-         )
-         print(f"✅ Created new Space: {repo_id}")
-
-     # Configure Git credential helper for authentication
-     # This is needed for Git LFS to work properly with fine-grained tokens
-     print("🔐 Configuring Git credentials...")
-
-     # Use Git credential store to store the token
-     # This allows Git LFS to authenticate properly
-     temp_dir = Path(tempfile.gettempdir())
-     credential_store = temp_dir / ".git-credentials-hf"
-
-     # Write credentials in the format: https://username:token@huggingface.co
-     credential_store.write_text(
-         f"https://{hf_username}:{hf_token}@huggingface.co\n", encoding="utf-8"
-     )
-     try:
-         credential_store.chmod(0o600)  # Secure permissions (Unix only)
-     except OSError:
-         # Windows doesn't support chmod, skip
-         pass
-
-     # Configure Git to use the credential store
-     subprocess.run(
-         ["git", "config", "--global", "credential.helper", f"store --file={credential_store}"],
-         check=True,
-         capture_output=True,
-     )
-
-     # Also set environment variable for Git LFS
-     os.environ["GIT_CREDENTIAL_HELPER"] = f"store --file={credential_store}"
-
-     # Clone repository using git
-     # Use the token in the URL for initial clone, but LFS will use credential store
-     space_url = f"https://{hf_username}:{hf_token}@huggingface.co/spaces/{repo_id}"
-
-     if Path(local_dir).exists():
-         print(f"🧹 Removing existing {local_dir} directory...")
-         shutil.rmtree(local_dir)
-
-     print("📥 Cloning Space repository...")
-     try:
-         result = subprocess.run(
-             ["git", "clone", space_url, local_dir],
-             check=True,
-             capture_output=True,
-             text=True,
-         )
-         print("✅ Cloned Space repository")
-
-         # After clone, configure the remote to use credential helper
-         # This ensures future operations (like push) use the credential store
-         os.chdir(local_dir)
-         subprocess.run(
-             ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
-             check=True,
-             capture_output=True,
-         )
-         os.chdir("..")
-
-     except subprocess.CalledProcessError as e:
-         error_msg = e.stderr if e.stderr else e.stdout if e.stdout else "Unknown error"
-         print(f"❌ Failed to clone Space repository: {error_msg}")
-
-         # Try alternative: clone with LFS skip, then fetch LFS files separately
-         print("🔄 Trying alternative clone method (skip LFS during clone)...")
-         try:
-             env = os.environ.copy()
-             env["GIT_LFS_SKIP_SMUDGE"] = "1"  # Skip LFS during clone
-
-             subprocess.run(
-                 ["git", "clone", space_url, local_dir],
-                 check=True,
-                 capture_output=True,
-                 text=True,
-                 env=env,
-             )
-             print("✅ Cloned Space repository (LFS skipped)")
-
-             # Configure remote
-             os.chdir(local_dir)
-             subprocess.run(
-                 ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
-                 check=True,
-                 capture_output=True,
-             )
-
-             # Try to fetch LFS files with proper authentication
-             print("📥 Fetching LFS files...")
-             subprocess.run(
-                 ["git", "lfs", "pull"],
-                 check=False,  # Don't fail if LFS pull fails - we'll continue without LFS files
-                 capture_output=True,
-                 text=True,
-             )
-             os.chdir("..")
-             print("✅ Repository cloned (LFS files may be incomplete, but deployment can continue)")
-         except subprocess.CalledProcessError as e2:
-             error_msg2 = e2.stderr if e2.stderr else e2.stdout if e2.stdout else "Unknown error"
-             print(f"❌ Alternative clone method also failed: {error_msg2}")
-             raise RuntimeError(f"Git clone failed: {error_msg}") from e
-
-     # Get exclusion sets
-     excluded_dirs = get_excluded_dirs()
-     excluded_files = get_excluded_files()
-
-     # Remove all existing files in HF Space (except .git)
-     print("🧹 Cleaning existing files...")
-     for item in Path(local_dir).iterdir():
-         if item.name == ".git":
-             continue
-         if item.is_dir():
-             shutil.rmtree(item)
-         else:
-             item.unlink()
-
-     # Copy files from repository root
-     print("📦 Copying files...")
-     repo_root = Path(".")
-     files_copied = 0
-     dirs_copied = 0
-
-     for item in repo_root.rglob("*"):
-         # Skip if in .git directory
-         if ".git" in item.parts:
-             continue
-
-         # Skip if in hf_space directory (the cloned Space directory)
-         if "hf_space" in item.parts:
-             continue
-
-         # Skip if should be excluded
-         if should_exclude(item, excluded_dirs, excluded_files):
-             continue
-
-         # Calculate relative path
-         try:
-             rel_path = item.relative_to(repo_root)
-         except ValueError:
-             # Item is outside repo root, skip
-             continue
-
-         # Skip if in excluded directory
-         if any(part in excluded_dirs for part in rel_path.parts):
-             continue
-
-         # Destination path
-         dest_path = Path(local_dir) / rel_path
-
-         # Create parent directories
-         dest_path.parent.mkdir(parents=True, exist_ok=True)
-
-         # Copy file or directory
-         if item.is_file():
-             shutil.copy2(item, dest_path)
-             files_copied += 1
-         elif item.is_dir():
-             # Directory will be created by parent mkdir, but we track it
-             dirs_copied += 1
-
-     print(f"✅ Copied {files_copied} files and {dirs_copied} directories")
-
-     # Commit and push changes using git
-     print("💾 Committing changes...")
-
-     # Change to the Space directory
-     original_cwd = os.getcwd()
-     os.chdir(local_dir)
-
-     try:
-         # Configure git user (required for commit)
-         subprocess.run(
-             ["git", "config", "user.name", "github-actions[bot]"],
-             check=True,
-             capture_output=True,
-         )
-         subprocess.run(
-             ["git", "config", "user.email", "github-actions[bot]@users.noreply.github.com"],
-             check=True,
-             capture_output=True,
-         )
-
-         # Add all files
-         subprocess.run(
-             ["git", "add", "."],
-             check=True,
-             capture_output=True,
-         )
-
-         # Check if there are changes to commit
-         result = subprocess.run(
-             ["git", "status", "--porcelain"],
-             check=False,
-             capture_output=True,
-             text=True,
-         )
-
-         if result.stdout.strip():
-             # There are changes, commit and push
-             subprocess.run(
-                 ["git", "commit", "-m", "Deploy to Hugging Face Space [skip ci]"],
-                 check=True,
-                 capture_output=True,
-             )
-             print("📤 Pushing to Hugging Face Space...")
-             # Ensure remote URL uses credential helper (not token in URL)
-             subprocess.run(
-                 ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
-                 check=True,
-                 capture_output=True,
-             )
-             subprocess.run(
-                 ["git", "push"],
-                 check=True,
-                 capture_output=True,
-             )
-             print("✅ Deployment complete!")
-         else:
-             print("ℹ️ No changes to commit (repository is up to date)")
-     except subprocess.CalledProcessError as e:
-         error_msg = e.stderr if e.stderr else (e.stdout if e.stdout else str(e))
-         if isinstance(error_msg, bytes):
-             error_msg = error_msg.decode("utf-8", errors="replace")
-         if "nothing to commit" in error_msg.lower():
-             print("ℹ️ No changes to commit (repository is up to date)")
-         else:
-             print(f"⚠️ Error during git operations: {error_msg}")
-             raise RuntimeError(f"Git operation failed: {error_msg}") from e
-     finally:
-         # Return to original directory
-         os.chdir(original_cwd)
-
-     # Clean up credential store for security
-     try:
-         if credential_store.exists():
-             credential_store.unlink()
-     except Exception:
-         # Ignore cleanup errors
-         pass
-
-     print(f"🎉 Successfully deployed to: https://huggingface.co/spaces/{repo_id}")
-
-
- if __name__ == "__main__":
-     deploy_to_hf_space()
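
The deleted script drove deployment through a local git clone, a credential store, and LFS workarounds. The same end result is usually achievable with `huggingface_hub`'s higher-level upload API; a minimal sketch under the same environment-variable conventions (`HF_TOKEN`, `HF_USERNAME`, `HF_SPACE_NAME`), offered as context rather than a drop-in replacement:

```python
import os

from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])
repo_id = f"{os.environ['HF_USERNAME']}/{os.environ['HF_SPACE_NAME']}"

# Idempotent, like the deleted script's repo_info/create_repo dance.
api.create_repo(repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True)

# upload_folder commits server-side (including LFS), so no git clone,
# credential store, or GIT_LFS_SKIP_SMUDGE fallback is needed.
api.upload_folder(
    folder_path=".",
    repo_id=repo_id,
    repo_type="space",
    commit_message="Deploy to Hugging Face Space [skip ci]",
    ignore_patterns=["docs/**", "tests/**", ".git/**", "*.log", "chroma_db/**"],
)
```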
.github/workflows/ci.yml CHANGED
@@ -2,9 +2,9 @@ name: CI
 
  on:
  push:
- branches: [main, dev, develop]
  pull_request:
- branches: [main, dev, develop]
 
  jobs:
  test:
@@ -16,6 +16,11 @@ jobs:
  steps:
  - uses: actions/checkout@v4
 
  - name: Set up Python ${{ matrix.python-version }}
  uses: actions/setup-python@v5
  with:
@@ -23,105 +28,53 @@ jobs:
 
  - name: Install dependencies
  run: |
- python -m pip install --upgrade pip
- pip install -e ".[dev]"
 
  - name: Lint with ruff
- run: |
- ruff check . --exclude tests
- ruff format --check . --exclude tests
  continue-on-error: true
 
  - name: Type check with mypy
- run: |
- mypy src
  continue-on-error: true
-
- - name: Install embedding dependencies
  run: |
- pip install -e ".[embeddings]"
 
- - name: Run unit tests (excluding OpenAI and embedding providers)
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: |
- pytest tests/unit/ -v -m "not openai and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term
 
  - name: Run local embeddings tests
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: |
- pytest tests/ -v -m "local_embeddings" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
  continue-on-error: true # Allow failures if dependencies not available
 
  - name: Run HuggingFace integration tests
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: |
- pytest tests/integration/ -v -m "huggingface and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
  continue-on-error: true # Allow failures if HF_TOKEN not set
 
- - name: Run non-OpenAI integration tests (excluding embedding providers)
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: |
- pytest tests/integration/ -v -m "integration and not openai and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
  continue-on-error: true # Allow failures if dependencies not available
 
  - name: Upload coverage reports to Codecov
  uses: codecov/codecov-action@v5
  with:
  token: ${{ secrets.CODECOV_TOKEN }}
  slug: DeepCritical/GradioDemo
- files: ./coverage.xml
- fail_ci_if_error: false
- continue-on-error: true
-
- docs:
- runs-on: ubuntu-latest
- permissions:
- contents: write
- if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/dev' || github.ref == 'refs/heads/develop')
- steps:
- - uses: actions/checkout@v4
- with:
- fetch-depth: 0
-
- - name: Set up Python
- uses: actions/setup-python@v5
- with:
- python-version: '3.11'
-
- - name: Install uv
- uses: astral-sh/setup-uv@v5
- with:
- version: "latest"
-
- - name: Install dependencies
- run: |
- uv sync --extra dev
-
- - name: Configure Git
- run: |
- git config user.name "github-actions[bot]"
- git config user.email "github-actions[bot]@users.noreply.github.com"
- git remote set-url origin https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/${{ github.repository }}.git
-
- - name: Deploy to GitHub Pages
- run: |
- # mkdocs gh-deploy automatically creates .nojekyll, but let's verify
- uv run mkdocs gh-deploy --force --message "Deploy docs [skip ci]" --strict
- # Verify .nojekyll was created in gh-pages branch
- git fetch origin gh-pages:gh-pages || true
- git checkout gh-pages || true
- if [ -f .nojekyll ]; then
- echo "✓ .nojekyll file exists"
- else
- echo "⚠ .nojekyll file missing, creating it..."
- touch .nojekyll
- git add .nojekyll
- git commit -m "Add .nojekyll to disable Jekyll [skip ci]" || true
- git push origin gh-pages || true
- fi
- env:
- GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  on:
  push:
+ branches: [main, dev]
  pull_request:
+ branches: [main, dev]
 
  jobs:
  test:
  steps:
  - uses: actions/checkout@v4
 
+ - name: Install uv
+ uses: astral-sh/setup-uv@v5
+ with:
+ version: "latest"
+
  - name: Set up Python ${{ matrix.python-version }}
  uses: actions/setup-python@v5
  with:
 
  - name: Install dependencies
  run: |
+ uv sync --extra dev
 
  - name: Lint with ruff
  continue-on-error: true
+ run: |
+ uv run ruff check . --exclude tests --exclude reference_repos
+ uv run ruff format --check . --exclude tests --exclude reference_repos
 
  - name: Type check with mypy
  continue-on-error: true
  run: |
+ uv run mypy src --ignore-missing-imports
 
+ - name: Run unit tests (No OpenAI/Anthropic, HuggingFace only)
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
+ LLM_PROVIDER: huggingface
  run: |
+ uv run pytest tests/unit/ -v -m "not openai and not anthropic and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml
 
  - name: Run local embeddings tests
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
+ LLM_PROVIDER: huggingface
  run: |
+ uv run pytest tests/ -v -m "local_embeddings" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-append || true
  continue-on-error: true # Allow failures if dependencies not available
 
  - name: Run HuggingFace integration tests
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
+ LLM_PROVIDER: huggingface
  run: |
+ uv run pytest tests/integration/ -v -m "huggingface and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-append || true
  continue-on-error: true # Allow failures if HF_TOKEN not set
 
+ - name: Run non-OpenAI/Anthropic integration tests (excluding embedding providers)
  env:
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
+ LLM_PROVIDER: huggingface
  run: |
+ uv run pytest tests/integration/ -v -m "integration and not openai and not anthropic and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-append || true
  continue-on-error: true # Allow failures if dependencies not available
 
  - name: Upload coverage reports to Codecov
  uses: codecov/codecov-action@v5
+ continue-on-error: true
  with:
  token: ${{ secrets.CODECOV_TOKEN }}
  slug: DeepCritical/GradioDemo
.github/workflows/deploy-hf-space.yml DELETED
@@ -1,47 +0,0 @@
- name: Deploy to Hugging Face Space
-
- on:
-   push:
-     branches: [main]
-   workflow_dispatch:  # Allow manual triggering
-
- jobs:
-   deploy:
-     runs-on: ubuntu-latest
-     permissions:
-       contents: read
-       # No write permissions needed for GitHub repo (we're pushing to HF Space)
-
-     steps:
-       - name: Checkout Repository
-         uses: actions/checkout@v4
-         with:
-           fetch-depth: 0
-
-       - name: Set up Python
-         uses: actions/setup-python@v5
-         with:
-           python-version: '3.11'
-
-       - name: Install dependencies
-         run: |
-           pip install --upgrade pip
-           pip install huggingface-hub
-
-       - name: Deploy to Hugging Face Space
-         env:
-           # Token from secrets (sensitive data)
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-           # Username/Organization from repository variables (non-sensitive)
-           HF_USERNAME: ${{ vars.HF_USERNAME }}
-           # Space name from repository variables (non-sensitive)
-           HF_SPACE_NAME: ${{ vars.HF_SPACE_NAME }}
-         run: |
-           python .github/scripts/deploy_to_hf_space.py
-
-       - name: Verify deployment
-         if: success()
-         run: |
-           echo "✅ Deployment completed successfully!"
-           echo "Space URL: https://huggingface.co/spaces/${{ vars.HF_USERNAME }}/${{ vars.HF_SPACE_NAME }}"
.github/workflows/docs.yml ADDED
@@ -0,0 +1,61 @@
+ name: Documentation
+
+ on:
+   push:
+     branches:
+       - main
+       - dev
+     paths:
+       - 'docs/**'
+       - 'mkdocs.yml'
+       - '.github/workflows/docs.yml'
+   pull_request:
+     branches:
+       - main
+       - dev
+     paths:
+       - 'docs/**'
+       - 'mkdocs.yml'
+       - '.github/workflows/docs.yml'
+   workflow_dispatch:
+
+ permissions:
+   contents: write
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: '3.11'
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@v5
+         with:
+           version: "latest"
+
+       - name: Install dependencies
+         run: |
+           uv sync --extra dev
+
+       - name: Build documentation
+         run: |
+           uv run mkdocs build --strict
+
+       - name: Deploy to GitHub Pages
+         if: (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/dev') && github.event_name == 'push'
+         uses: peaceiris/actions-gh-pages@v3
+         with:
+           github_token: ${{ secrets.GITHUB_TOKEN }}
+           publish_dir: ./site
+           publish_branch: dev
+           cname: false
+           keep_files: true
.gitignore CHANGED
@@ -1,7 +1,10 @@
  folder/
  site/
  .cursor/
  .ruff_cache/
  # Python
  __pycache__/
  *.py[cod]
@@ -57,9 +60,6 @@ reference_repos/DeepCritical/
  # Keep the README in reference_repos
  !reference_repos/README.md
 
- # Development directory
- dev/
-
  # OS
  .DS_Store
  Thumbs.db
@@ -72,13 +72,12 @@ logs/
  .pytest_cache/
  .mypy_cache/
  .coverage
  htmlcov/
- test_output*.txt
 
  # Database files
  chroma_db/
  *.sqlite3
 
-
  # Trigger rebuild Wed Nov 26 17:51:41 EST 2025
- .env
+ =0.22.0
+ =0.22.0,
  folder/
  site/
  .cursor/
  .ruff_cache/
+ docs/contributing/
  # Python
  __pycache__/
  *.py[cod]
  # Keep the README in reference_repos
  !reference_repos/README.md
 
  # OS
  .DS_Store
  Thumbs.db
  .pytest_cache/
  .mypy_cache/
  .coverage
+ .coverage.*
+ coverage.xml
  htmlcov/
 
  # Database files
  chroma_db/
  *.sqlite3
 
  # Trigger rebuild Wed Nov 26 17:51:41 EST 2025
.pre-commit-config.yaml CHANGED
@@ -1,20 +1,20 @@
  repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
- rev: v0.4.4
  hooks:
  - id: ruff
- args: [--fix, --exclude, tests]
  exclude: ^reference_repos/
  - id: ruff-format
- args: [--exclude, tests]
  exclude: ^reference_repos/
 
  - repo: https://github.com/pre-commit/mirrors-mypy
- rev: v1.10.0
  hooks:
  - id: mypy
  files: ^src/
- exclude: ^folder|^src/app.py
  additional_dependencies:
  - pydantic>=2.7
  - pydantic-settings>=2.2
@@ -31,14 +31,9 @@ repos:
  types: [python]
  args: [
  "run",
- "pytest",
- "tests/unit/",
- "-v",
- "-m",
- "not openai and not embedding_provider",
- "--tb=short",
- "-p",
- "no:logfire",
  ]
  pass_filenames: false
  always_run: true
@@ -50,14 +45,9 @@ repos:
  types: [python]
  args: [
  "run",
- "pytest",
- "tests/",
- "-v",
- "-m",
- "local_embeddings",
- "--tb=short",
- "-p",
- "no:logfire",
  ]
  pass_filenames: false
  always_run: true
  repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
+ rev: v0.14.7 # Compatible with ruff>=0.14.6 (matches CI)
  hooks:
  - id: ruff
+ args: [--fix, --exclude, tests, --exclude, reference_repos]
  exclude: ^reference_repos/
  - id: ruff-format
+ args: [--exclude, tests, --exclude, reference_repos]
  exclude: ^reference_repos/
 
  - repo: https://github.com/pre-commit/mirrors-mypy
+ rev: v1.18.2 # Matches CI version mypy>=1.18.2
  hooks:
  - id: mypy
  files: ^src/
+ exclude: ^folder
  additional_dependencies:
  - pydantic>=2.7
  - pydantic-settings>=2.2
 
  types: [python]
  args: [
  "run",
+ "python",
+ ".pre-commit-hooks/run_pytest_with_sync.py",
+ "unit",
  ]
  pass_filenames: false
  always_run: true
 
  types: [python]
  args: [
  "run",
+ "python",
+ ".pre-commit-hooks/run_pytest_with_sync.py",
+ "embeddings",
  ]
  pass_filenames: false
  always_run: true
=0.22.0 ADDED
File without changes
=0.22.0, ADDED
File without changes
AGENTS.txt DELETED
@@ -1,236 +0,0 @@
1
- # DeepCritical Project - Rules
2
-
3
- ## Project-Wide Rules
4
-
5
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
6
-
7
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
8
-
9
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
10
-
11
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
12
-
13
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
14
-
15
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
16
-
17
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
18
-
19
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
20
-
21
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
22
-
23
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
24
-
25
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
26
-
27
- ---
28
-
29
- ## src/agents/ - Agent Implementation Rules
30
-
31
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
32
-
33
- **Agent Structure**:
34
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
35
- - Agent class with `__init__(model: Any | None = None)`
36
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
37
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
38
-
39
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
40
-
41
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
42
-
43
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
44
-
45
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
46
-
47
- **Agent-Specific Rules**:
48
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
49
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
50
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
51
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
52
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
53
- - `thinking.py`: Returns observation string from conversation history.
54
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
55
-
56
- ---
57
-
58
- ## src/tools/ - Search Tool Rules
59
-
60
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
61
-
62
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
63
-
64
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
65
-
66
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
67
-
68
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
69
-
70
- **Tool-Specific Rules**:
71
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
72
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
73
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
74
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
75
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
76
-
77
- ---
78
-
79
- ## src/middleware/ - Middleware Rules
80
-
81
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
82
-
83
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
84
-
85
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
86
-
87
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
88
-
89
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
90
-
91
- ---
92
-
93
- ## src/orchestrator/ - Orchestration Rules
94
-
95
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
96
-
97
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
98
-
99
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
100
-
101
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
102
-
103
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
104
-
105
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
106
-
107
- ---
108
-
109
- ## src/services/ - Service Rules
110
-
111
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
112
-
113
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
114
-
115
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
116
-
117
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: `@lru_cache(maxsize=1); def get_service() -> Service: return Service()`. Lazy initialization to avoid requiring dependencies at import time.
118
-
119
- ---
120
-
121
- ## src/utils/ - Utility Rules
122
-
123
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
124
-
125
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
126
-
127
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
128
-
129
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
130
-
131
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
132
-
133
- ---
134
-
135
- ## src/orchestrator_factory.py Rules
136
-
137
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
138
-
139
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
140
-
141
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
142
-
143
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
144
-
145
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
146
-
147
- ---
148
-
149
- ## src/orchestrator_hierarchical.py Rules
150
-
151
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
152
-
153
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
154
-
155
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
156
-
157
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
158
-
159
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
160
-
161
- ---
162
-
163
- ## src/orchestrator_magentic.py Rules
164
-
165
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
166
-
167
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
168
-
169
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
170
-
171
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
172
-
173
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
174
-
175
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
176
-
177
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
178
-
179
- ---
180
-
181
- ## src/agent_factory/ - Factory Rules
182
-
183
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
184
-
185
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
186
-
187
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
188
-
189
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
190
-
191
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
192
-
193
- ---
194
-
195
- ## src/prompts/ - Prompt Rules
196
-
197
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
198
-
199
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
200
-
201
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
202
-
203
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
204
-
205
- ---
206
-
207
- ## Testing Rules
208
-
209
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
210
-
211
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
212
-
213
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
214
-
215
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
216
-
217
- ---
218
-
219
- ## File-Specific Agent Rules
220
-
221
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
222
-
223
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
224
-
225
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
226
-
227
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
228
-
229
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
230
-
231
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
232
-
233
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
234
-
235
-
236
-
LICENSE.md DELETED
@@ -1,25 +0,0 @@
1
- # License
2
-
3
- DeepCritical is licensed under the MIT License.
4
-
5
- ## MIT License
6
-
7
- Copyright (c) 2024 DeepCritical Team
8
-
9
- Permission is hereby granted, free of charge, to any person obtaining a copy
10
- of this software and associated documentation files (the "Software"), to deal
11
- in the Software without restriction, including without limitation the rights
12
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
13
- copies of the Software, and to permit persons to whom the Software is
14
- furnished to do so, subject to the following conditions:
15
-
16
- The above copyright notice and this permission notice shall be included in all
17
- copies or substantial portions of the Software.
18
-
19
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
25
- SOFTWARE.
Makefile ADDED
@@ -0,0 +1,51 @@
1
+ .PHONY: install test lint format typecheck check clean all cov cov-html
2
+
3
+ # Default target
4
+ all: check
5
+
6
+ install:
7
+ uv sync --all-extras
8
+ uv run pre-commit install
9
+
10
+ test:
11
+ uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
12
+
13
+ test-hf:
14
+ uv run pytest tests/ -v -m "huggingface" -p no:logfire
15
+
16
+ test-all:
17
+ uv run pytest tests/ -v -p no:logfire
18
+
19
+ # Coverage aliases
20
+ cov: test-cov
21
+ test-cov:
22
+ uv run pytest --cov=src --cov-report=term-missing -m "not openai" -p no:logfire
23
+
24
+ cov-html:
25
+ uv run pytest --cov=src --cov-report=html -p no:logfire
26
+ @echo "Coverage report: open htmlcov/index.html"
27
+
28
+ lint:
29
+ uv run ruff check src tests
30
+
31
+ format:
32
+ uv run ruff format src tests
33
+
34
+ typecheck:
35
+ uv run mypy src
36
+
37
+ check: lint typecheck test-cov
38
+ @echo "All checks passed!"
39
+
40
+ docs-build:
41
+ uv run mkdocs build
42
+
43
+ docs-serve:
44
+ uv run mkdocs serve
45
+
46
+ docs-clean:
47
+ rm -rf site/
48
+
49
+ clean:
50
+ rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage htmlcov
51
+ find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- title: The DETERMINATOR
3
  emoji: 🐉
4
  colorFrom: red
5
  colorTo: yellow
@@ -10,54 +10,114 @@ app_file: src/app.py
10
  hf_oauth: true
11
  hf_oauth_expiration_minutes: 480
12
  hf_oauth_scopes:
13
- # Required for HuggingFace Inference API (includes all third-party providers)
14
- # This scope grants access to:
15
- # - HuggingFace's own Inference API
16
- # - Third-party inference providers (nebius, together, scaleway, hyperbolic, novita, nscale, sambanova, ovh, fireworks, etc.)
17
- # - All models available through the Inference Providers API
18
- - inference-api
19
- # Optional: Uncomment if you need to access user's billing information
20
- # - read-billing
21
  pinned: true
22
  license: mit
23
  tags:
24
  - mcp-in-action-track-enterprise
25
  - mcp-hackathon
26
- - deep-research
27
  - biomedical-ai
28
  - pydantic-ai
29
  - llamaindex
30
  - modal
31
- - building-mcp-track-enterprise
32
- - building-mcp-track-consumer
33
- - mcp-in-action-track-enterprise
34
- - mcp-in-action-track-consumer
35
- - building-mcp-track-modal
36
- - building-mcp-track-blaxel
37
- - building-mcp-track-llama-index
38
- - building-mcp-track-HUGGINGFACE
39
  ---
40
 
41
  > [!IMPORTANT]
42
  > **You are reading the Gradio Demo README!**
43
  >
44
- > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
45
- > - 📖 **Complete README**: Check out the [Github README](.github/README.md) for setup, configuration, and contribution guidelines
46
- > - ⚠️**This README is for our Gradio Demo Only !**
47
 
48
  <div align="center">
49
 
50
- [![GitHub](https://img.shields.io/github/stars/DeepCritical/GradioDemo?style=for-the-badge&logo=github&logoColor=white&label=GitHub&labelColor=181717&color=181717)](https://github.com/DeepCritical/GradioDemo)
51
- [![Documentation](https://img.shields.io/badge/Docs-0080FF?style=for-the-badge&logo=readthedocs&logoColor=white&labelColor=0080FF&color=0080FF)](deepcritical.github.io/GradioDemo/)
52
- [![Demo](https://img.shields.io/badge/Demo-FFD21E?style=for-the-badge&logo=huggingface&logoColor=white&labelColor=FFD21E&color=FFD21E)](https://huggingface.co/spaces/DataQuests/DeepCritical)
53
  [![codecov](https://codecov.io/gh/DeepCritical/GradioDemo/graph/badge.svg?token=B1f05RCGpz)](https://codecov.io/gh/DeepCritical/GradioDemo)
54
  [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP)
55
 
56
 
57
  </div>
58
 
59
- # The DETERMINATOR
60
 
61
  ## About
62
 
63
- The DETERMINATOR is a powerful generalist deep research agent system that stops at nothing until finding precise answers to complex questions. It uses iterative search-and-judge loops to comprehensively investigate any research question from any domain.
1
  ---
2
+ title: Critical Deep Research
3
  emoji: 🐉
4
  colorFrom: red
5
  colorTo: yellow
 
10
  hf_oauth: true
11
  hf_oauth_expiration_minutes: 480
12
  hf_oauth_scopes:
13
+ - inference-api
 
 
 
 
 
 
 
14
  pinned: true
15
  license: mit
16
  tags:
17
  - mcp-in-action-track-enterprise
18
  - mcp-hackathon
19
+ - drug-repurposing
20
  - biomedical-ai
21
  - pydantic-ai
22
  - llamaindex
23
  - modal
 
 
 
 
 
 
 
 
24
  ---
25
 
26
  > [!IMPORTANT]
27
  > **You are reading the Gradio Demo README!**
28
  >
29
+ > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
30
+ > - 📖 **Complete README**: Check out the [full README](.github/README.md) for setup, configuration, and contribution guidelines
31
+ > - 🏆 **Hackathon Submission**: Keep reading below for more information about our MCP Hackathon submission
32
 
33
  <div align="center">
34
 
35
+ [![GitHub](https://img.shields.io/github/stars/DeepCritical/GradioDemo?style=for-the-badge&logo=github&logoColor=white&label=🐙%20GitHub&labelColor=181717&color=181717)](https://github.com/DeepCritical/GradioDemo)
36
+ [![Documentation](https://img.shields.io/badge/📚%20Docs-0080FF?style=for-the-badge&logo=readthedocs&logoColor=white&labelColor=0080FF&color=0080FF)](https://deepcritical.github.io/GradioDemo/)
37
+ [![Demo](https://img.shields.io/badge/🚀%20Demo-FFD21E?style=for-the-badge&logo=huggingface&logoColor=white&labelColor=FFD21E&color=FFD21E)](https://huggingface.co/spaces/DataQuests/DeepCritical)
38
  [![codecov](https://codecov.io/gh/DeepCritical/GradioDemo/graph/badge.svg?token=B1f05RCGpz)](https://codecov.io/gh/DeepCritical/GradioDemo)
39
  [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP)
40
 
41
 
42
  </div>
43
 
44
+ # DeepCritical
45
 
46
  ## About
47
 
48
+ The [Deep Critical Gradio Hackathon Team](#team) met online in the Alzheimer's Critical Literature Review Group in the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research, to [turn the vast amounts of clinical data into cures](https://github.com/DeepCritical/GradioDemo).
49
+
50
+ For this hackathon we're proposing a simple yet powerful Deep Research Agent that iteratively looks for the answer until it finds it, using general-purpose web search and special-purpose retrievers for technical sources.
51
+
52
+ ## Deep Critical in the Media
53
+
54
+ - Social media posts about Deep Critical:
55
+ -
56
+ -
57
+ -
58
+ -
59
+ -
60
+ -
61
+ -
62
+
63
+ ## Important information
64
+
65
+ - **[readme](.github/README.md)**: Configure, deploy, contribute, and learn more here.
66
+ - **[docs](https://deepcritical.github.io/GradioDemo/)**: Want to know how all this works? Read our detailed technical documentation here.
67
+ - **[demo](https://huggingface.co/spaces/DataQuests/DeepCritical)**: Try our demo on Hugging Face.
68
+ - **[team](#team)**: Join us, or follow us!
69
+ - **[video]**: See our demo video
70
+
71
+ ## Future Developments
72
+
73
+ - [ ] Apply Deep Research Systems To Generate Short Form Video (up to 5 minutes)
74
+ - [ ] Visualize Pydantic Graphs as Loading Screens in the UI
75
+ - [ ] Improve Data Science with more Complex Graph Agents
76
+ - [ ] Create Deep Critical Drug Repurposing / Discovery Demo
77
+ - [ ] Create Deep Critical Literature Review
78
+ - [ ] Create Deep Critical Hypothesis Generator
79
+ - [ ] Create PyPI Package
80
+
81
+ ## Completed
82
+
83
+ - [x] **Multi-Source Search**: PubMed, ClinicalTrials.gov, bioRxiv/medRxiv
84
+ - [x] **MCP Integration**: Use our tools from Claude Desktop or any MCP client
85
+ - [x] **HuggingFace OAuth**: Sign in with HuggingFace
86
+ - [x] **Modal Sandbox**: Secure execution of AI-generated statistical code
87
+ - [x] **LlamaIndex RAG**: Semantic search and evidence synthesis
88
+ - [x] **HuggingFace Inference**
89
+ - [x] **HuggingFace MCP Custom Config To Use Community Tools**
90
+ - [x] **Strongly Typed Composable Graphs**
91
+ - [x] **Specialized Research Teams of Agents**
92
+
93
+
94
+
95
+ ### Team
96
+
97
+ - ZJ
98
+ - MarioAderman
99
+ - Josephrp
100
+
101
+
102
+ ## Acknowledgements
103
+
104
+ - McSwaggins
105
+ - Magentic
106
+ - Huggingface
107
+ - Gradio
108
+ - DeepCritical
109
+ - Sponsors
110
+ - Microsoft
111
+ - Pydantic
112
+ - Llama-index
113
+ - Anthhropic/MCP
114
+ - List of Tools Makers
115
+
116
+
117
+ ## Links
118
+
119
+ [![GitHub](https://img.shields.io/github/stars/DeepCritical/GradioDemo?style=for-the-badge&logo=github&logoColor=white&label=🐙%20GitHub&labelColor=181717&color=181717)](https://github.com/DeepCritical/GradioDemo)
120
+ [![Documentation](https://img.shields.io/badge/📚%20Docs-0080FF?style=for-the-badge&logo=readthedocs&logoColor=white&labelColor=0080FF&color=0080FF)](https://deepcritical.github.io/GradioDemo/)
121
+ [![Demo](https://img.shields.io/badge/🚀%20Demo-FFD21E?style=for-the-badge&logo=huggingface&logoColor=white&labelColor=FFD21E&color=FFD21E)](https://huggingface.co/spaces/DataQuests/DeepCritical)
122
+ [![codecov](https://codecov.io/gh/DeepCritical/GradioDemo/graph/badge.svg?token=B1f05RCGpz)](https://codecov.io/gh/DeepCritical/GradioDemo)
123
+ [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP)
deployments/README.md DELETED
@@ -1,46 +0,0 @@
1
- # Deployments
2
-
3
- This directory contains infrastructure deployment scripts for DeepCritical services.
4
-
5
- ## Modal Deployments
6
-
7
- ### TTS Service (`modal_tts.py`)
8
-
9
- Deploys the Kokoro TTS (Text-to-Speech) function to Modal's GPU infrastructure.
10
-
11
- **Deploy:**
12
- ```bash
13
- modal deploy deployments/modal_tts.py
14
- ```
15
-
16
- **Features:**
17
- - Kokoro 82M TTS model
18
- - GPU-accelerated (T4)
19
- - Voice options: af_heart, af_bella, am_michael, etc.
20
- - Configurable speech speed
21
-
22
- **Requirements:**
23
- - Modal account and credentials (`MODAL_TOKEN_ID`, `MODAL_TOKEN_SECRET` in `.env`)
24
- - GPU quota on Modal
25
-
26
- **After Deployment:**
27
- The function will be available at:
28
- - App: `deepcritical-tts`
29
- - Function: `kokoro_tts_function`
30
-
31
- The main application (`src/services/tts_modal.py`) will call this deployed function.
32
-
33
- ---
34
-
35
- ## Adding New Deployments
36
-
37
- When adding new deployment scripts:
38
-
39
- 1. Create a new file: `deployments/<service_name>.py`
40
- 2. Use Modal's app pattern:
41
- ```python
42
- import modal
43
- app = modal.App("deepcritical-<service-name>")
44
- ```
45
- 3. Document in this README
46
- 4. Test deployment: `modal deploy deployments/<service_name>.py`
deployments/modal_tts.py DELETED
@@ -1,97 +0,0 @@
1
- """Deploy Kokoro TTS function to Modal.
2
-
3
- This script deploys the TTS function to Modal so it can be called
4
- from the main DeepCritical application.
5
-
6
- Usage:
7
- modal deploy deploy_modal_tts.py
8
-
9
- After deployment, the function will be available at:
10
- App: deepcritical-tts
11
- Function: kokoro_tts_function
12
- """
13
-
14
- import modal
15
- import numpy as np
16
-
17
- # Create Modal app
18
- app = modal.App("deepcritical-tts")
19
-
20
- # Define Kokoro TTS dependencies
21
- KOKORO_DEPENDENCIES = [
22
- "torch>=2.0.0",
23
- "transformers>=4.30.0",
24
- "numpy<2.0",
25
- ]
26
-
27
- # Create Modal image with Kokoro
28
- tts_image = (
29
- modal.Image.debian_slim(python_version="3.11")
30
- .apt_install("git") # Install git first for pip install from github
31
- .pip_install(*KOKORO_DEPENDENCIES)
32
- .pip_install("git+https://github.com/hexgrad/kokoro.git")
33
- )
34
-
35
-
36
- @app.function(
37
- image=tts_image,
38
- gpu="T4",
39
- timeout=60,
40
- )
41
- def kokoro_tts_function(text: str, voice: str, speed: float) -> tuple[int, np.ndarray]:
42
- """Modal GPU function for Kokoro TTS.
43
-
44
- This function runs on Modal's GPU infrastructure.
45
- Based on: https://huggingface.co/spaces/hexgrad/Kokoro-TTS
46
-
47
- Args:
48
- text: Text to synthesize
49
- voice: Voice ID (e.g., af_heart, af_bella, am_michael)
50
- speed: Speech speed multiplier (0.5-2.0)
51
-
52
- Returns:
53
- Tuple of (sample_rate, audio_array)
54
- """
55
- import numpy as np
56
-
57
- try:
58
- import torch
59
- from kokoro import KModel, KPipeline
60
-
61
- # Initialize model (cached on GPU)
62
- model = KModel().to("cuda").eval()
63
- pipeline = KPipeline(lang_code=voice[0])
64
- pack = pipeline.load_voice(voice)
65
-
66
- # Generate audio - accumulate all chunks
67
- audio_chunks = []
68
- for _, ps, _ in pipeline(text, voice, speed):
69
- ref_s = pack[len(ps) - 1]
70
- audio = model(ps, ref_s, speed)
71
- audio_chunks.append(audio.numpy())
72
-
73
- # Concatenate all audio chunks
74
- if audio_chunks:
75
- full_audio = np.concatenate(audio_chunks)
76
- return (24000, full_audio)
77
-
78
- # If no audio generated, return empty
79
- return (24000, np.zeros(1, dtype=np.float32))
80
-
81
- except ImportError as e:
82
- raise RuntimeError(
83
- f"Kokoro not installed: {e}. "
84
- "Install with: pip install git+https://github.com/hexgrad/kokoro.git"
85
- ) from e
86
- except Exception as e:
87
- raise RuntimeError(f"TTS synthesis failed: {e}") from e
88
-
89
-
90
- # Optional: Add a test entrypoint
91
- @app.local_entrypoint()
92
- def test():
93
- """Test the TTS function."""
94
- print("Testing Modal TTS function...")
95
- sample_rate, audio = kokoro_tts_function.remote("Hello, this is a test.", "af_heart", 1.0)
96
- print(f"Generated audio: {sample_rate}Hz, shape={audio.shape}")
97
- print("✓ TTS function works!")
dev/Makefile ADDED
@@ -0,0 +1,51 @@
1
+ .PHONY: install test lint format typecheck check clean all cov cov-html
2
+
3
+ # Default target
4
+ all: check
5
+
6
+ install:
7
+ uv sync --all-extras
8
+ uv run pre-commit install
9
+
10
+ test:
11
+ uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
12
+
13
+ test-hf:
14
+ uv run pytest tests/ -v -m "huggingface" -p no:logfire
15
+
16
+ test-all:
17
+ uv run pytest tests/ -v -p no:logfire
18
+
19
+ # Coverage aliases
20
+ cov: test-cov
21
+ test-cov:
22
+ uv run pytest --cov=src --cov-report=term-missing -m "not openai" -p no:logfire
23
+
24
+ cov-html:
25
+ uv run pytest --cov=src --cov-report=html -p no:logfire
26
+ @echo "Coverage report: open htmlcov/index.html"
27
+
28
+ lint:
29
+ uv run ruff check src tests
30
+
31
+ format:
32
+ uv run ruff format src tests
33
+
34
+ typecheck:
35
+ uv run mypy src
36
+
37
+ check: lint typecheck test-cov
38
+ @echo "All checks passed!"
39
+
40
+ docs-build:
41
+ uv run mkdocs build
42
+
43
+ docs-serve:
44
+ uv run mkdocs serve
45
+
46
+ docs-clean:
47
+ rm -rf site/
48
+
49
+ clean:
50
+ rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage htmlcov
51
+ find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
docs/api/agents.md CHANGED
@@ -12,19 +12,27 @@ This page documents the API for DeepCritical agents.
12
 
13
  #### `evaluate`
14
 
15
- <!--codeinclude-->
16
- [KnowledgeGapAgent.evaluate](../src/agents/knowledge_gap.py) start_line:66 end_line:74
17
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
18
 
19
  Evaluates research completeness and identifies outstanding knowledge gaps.
20
 
21
  **Parameters**:
22
  - `query`: Research query string
23
- - `background_context`: Background context for the query (default: "")
24
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
25
- - `iteration`: Current iteration number (default: 0)
26
- - `time_elapsed_minutes`: Elapsed time in minutes (default: 0.0)
27
- - `max_time_minutes`: Maximum time limit in minutes (default: 10)
28
 
29
  **Returns**: `KnowledgeGapOutput` with:
30
  - `research_complete`: Boolean indicating if research is complete
@@ -40,17 +48,21 @@ Evaluates research completeness and identifies outstanding knowledge gaps.
40
 
41
  #### `select_tools`
42
 
43
- <!--codeinclude-->
44
- [ToolSelectorAgent.select_tools](../src/agents/tool_selector.py) start_line:78 end_line:84
45
- <!--/codeinclude-->
 
 
 
 
 
46
 
47
- Selects tools for addressing a knowledge gap.
48
 
49
  **Parameters**:
50
- - `gap`: The knowledge gap to address
51
  - `query`: Research query string
52
- - `background_context`: Optional background context (default: "")
53
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
54
 
55
  **Returns**: `AgentSelectionPlan` with list of `AgentTask` objects.
56
 
@@ -64,17 +76,23 @@ Selects tools for addressing a knowledge gap.
64
 
65
  #### `write_report`
66
 
67
- <!--codeinclude-->
68
- [WriterAgent.write_report](../src/agents/writer.py) start_line:67 end_line:73
69
- <!--/codeinclude-->
 
 
 
 
 
 
70
 
71
  Generates a markdown report from research findings.
72
 
73
  **Parameters**:
74
  - `query`: Research query string
75
  - `findings`: Research findings to include in report
76
- - `output_length`: Optional description of desired output length (default: "")
77
- - `output_instructions`: Optional additional instructions for report generation (default: "")
78
 
79
  **Returns**: Markdown string with numbered citations.
80
 
@@ -88,25 +106,36 @@ Generates a markdown report from research findings.
88
 
89
  #### `write_next_section`
90
 
91
- <!--codeinclude-->
92
- [LongWriterAgent.write_next_section](../src/agents/long_writer.py) start_line:94 end_line:100
93
- <!--/codeinclude-->
 
 
 
 
 
 
94
 
95
  Writes the next section of a long-form report.
96
 
97
  **Parameters**:
98
- - `original_query`: The original research query
99
- - `report_draft`: Current report draft as string (all sections written so far)
100
- - `next_section_title`: Title of the section to write
101
- - `next_section_draft`: Draft content for the next section
102
 
103
- **Returns**: `LongWriterOutput` with formatted section and references.
104
 
105
  #### `write_report`
106
 
107
- <!--codeinclude-->
108
- [LongWriterAgent.write_report](../src/agents/long_writer.py) start_line:263 end_line:268
109
- <!--/codeinclude-->
 
 
 
 
 
110
 
111
  Generates final report from draft.
112
 
@@ -127,9 +156,14 @@ Generates final report from draft.
127
 
128
  #### `proofread`
129
 
130
- <!--codeinclude-->
131
- [ProofreaderAgent.proofread](../src/agents/proofreader.py) start_line:72 end_line:76
132
- <!--/codeinclude-->
 
 
 
 
 
133
 
134
  Proofreads and polishes a report draft.
135
 
@@ -150,17 +184,21 @@ Proofreads and polishes a report draft.
150
 
151
  #### `generate_observations`
152
 
153
- <!--codeinclude-->
154
- [ThinkingAgent.generate_observations](../src/agents/thinking.py) start_line:70 end_line:76
155
- <!--/codeinclude-->
 
 
 
 
 
156
 
157
  Generates observations from conversation history.
158
 
159
  **Parameters**:
160
  - `query`: Research query string
161
- - `background_context`: Optional background context (default: "")
162
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
163
- - `iteration`: Current iteration number (default: 1)
164
 
165
  **Returns**: Observation string.
166
 
@@ -172,11 +210,14 @@ Generates observations from conversation history.
172
 
173
  ### Methods
174
 
175
- #### `parse`
176
 
177
- <!--codeinclude-->
178
- [InputParserAgent.parse](../src/agents/input_parser.py) start_line:82 end_line:82
179
- <!--/codeinclude-->
 
 
 
180
 
181
  Parses and improves a user query.
182
 
@@ -194,13 +235,18 @@ Parses and improves a user query.
194
 
195
  All agents have factory functions in `src.agent_factory.agents`:
196
 
197
- <!--codeinclude-->
198
- [Factory Functions](../src/agent_factory/agents.py) start_line:30 end_line:50
199
- <!--/codeinclude-->
 
 
 
 
 
 
200
 
201
  **Parameters**:
202
  - `model`: Optional Pydantic AI model. If None, uses `get_model()` from settings.
203
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
204
 
205
  **Returns**: Agent instance.
206
 
@@ -209,3 +255,12 @@ All agents have factory functions in `src.agent_factory.agents`:
209
  - [Architecture - Agents](../architecture/agents.md) - Architecture overview
210
  - [Models API](models.md) - Data models used by agents
211
 
12
 
13
  #### `evaluate`
14
 
15
+ ```python
16
+ async def evaluate(
17
+ self,
18
+ query: str,
19
+ background_context: str,
20
+ conversation_history: Conversation,
21
+ iteration: int,
22
+ time_elapsed_minutes: float,
23
+ max_time_minutes: float
24
+ ) -> KnowledgeGapOutput
25
+ ```
26
 
27
  Evaluates research completeness and identifies outstanding knowledge gaps.
28
 
29
  **Parameters**:
30
  - `query`: Research query string
31
+ - `background_context`: Background context for the query
32
+ - `conversation_history`: Conversation history with previous iterations
33
+ - `iteration`: Current iteration number
34
+ - `time_elapsed_minutes`: Elapsed time in minutes
35
+ - `max_time_minutes`: Maximum time limit in minutes
36
 
37
  **Returns**: `KnowledgeGapOutput` with:
38
  - `research_complete`: Boolean indicating if research is complete
 
48
 
49
  #### `select_tools`
50
 
51
+ ```python
52
+ async def select_tools(
53
+ self,
54
+ query: str,
55
+ knowledge_gaps: list[str],
56
+ available_tools: list[str]
57
+ ) -> AgentSelectionPlan
58
+ ```
59
 
60
+ Selects tools for addressing knowledge gaps.
61
 
62
  **Parameters**:
 
63
  - `query`: Research query string
64
+ - `knowledge_gaps`: List of knowledge gaps to address
65
+ - `available_tools`: List of available tool names
66
 
67
  **Returns**: `AgentSelectionPlan` with list of `AgentTask` objects.
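A minimal usage sketch of `select_tools`, assuming the `create_tool_selector_agent` factory documented below; the tool names are illustrative:

```python
# Sketch only: tool names are hypothetical; run inside an async function.
from src.agent_factory.agents import create_tool_selector_agent

selector = create_tool_selector_agent()
plan = await selector.select_tools(
    query="What is the evidence for metformin in Alzheimer's disease?",
    knowledge_gaps=["No clinical trial data collected yet"],
    available_tools=["pubmed", "clinicaltrials", "rag"],
)
for task in plan.tasks:  # one AgentTask per selected tool
    print(task.agent_name, task.query)
```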
68
 
 
76
 
77
  #### `write_report`
78
 
79
+ ```python
80
+ async def write_report(
81
+ self,
82
+ query: str,
83
+ findings: str,
84
+ output_length: str = "medium",
85
+ output_instructions: str | None = None
86
+ ) -> str
87
+ ```
88
 
89
  Generates a markdown report from research findings.
90
 
91
  **Parameters**:
92
  - `query`: Research query string
93
  - `findings`: Research findings to include in report
94
+ - `output_length`: Desired output length ("short", "medium", "long")
95
+ - `output_instructions`: Additional instructions for report generation
96
 
97
  **Returns**: Markdown string with numbered citations.
98
 
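For context, a minimal sketch of calling `write_report`, with the findings text and factory usage assumed from this page:

```python
# Sketch only: findings are illustrative; run inside an async function.
from src.agent_factory.agents import create_writer_agent

writer = create_writer_agent()
report_md = await writer.write_report(
    query="Does metformin slow cognitive decline?",
    findings="[1] Trial X reported ... [2] Cohort Y found ...",
    output_length="short",
)
print(report_md)  # markdown with numbered citations
```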
 
106
 
107
  #### `write_next_section`
108
 
109
+ ```python
110
+ async def write_next_section(
111
+ self,
112
+ query: str,
113
+ draft: ReportDraft,
114
+ section_title: str,
115
+ section_content: str
116
+ ) -> LongWriterOutput
117
+ ```
118
 
119
  Writes the next section of a long-form report.
120
 
121
  **Parameters**:
122
+ - `query`: Research query string
123
+ - `draft`: Current report draft
124
+ - `section_title`: Title of the section to write
125
+ - `section_content`: Content/guidance for the section
126
 
127
+ **Returns**: `LongWriterOutput` with updated draft.
128
 
129
  #### `write_report`
130
 
131
+ ```python
132
+ async def write_report(
133
+ self,
134
+ query: str,
135
+ report_title: str,
136
+ report_draft: ReportDraft
137
+ ) -> str
138
+ ```
139
 
140
  Generates final report from draft.
141
 
 
156
 
157
  #### `proofread`
158
 
159
+ ```python
160
+ async def proofread(
161
+ self,
162
+ query: str,
163
+ report_title: str,
164
+ report_draft: ReportDraft
165
+ ) -> str
166
+ ```
167
 
168
  Proofreads and polishes a report draft.
169
 
 
184
 
185
  #### `generate_observations`
186
 
187
+ ```python
188
+ async def generate_observations(
189
+ self,
190
+ query: str,
191
+ background_context: str,
192
+ conversation_history: Conversation
193
+ ) -> str
194
+ ```
195
 
196
  Generates observations from conversation history.
197
 
198
  **Parameters**:
199
  - `query`: Research query string
200
+ - `background_context`: Background context
201
+ - `conversation_history`: Conversation history
 
202
 
203
  **Returns**: Observation string.
204
 
 
210
 
211
  ### Methods
212
 
213
+ #### `parse_query`
214
 
215
+ ```python
216
+ async def parse_query(
217
+ self,
218
+ query: str
219
+ ) -> ParsedQuery
220
+ ```
221
 
222
  Parses and improves a user query.
223
 
 
235
 
236
  All agents have factory functions in `src.agent_factory.agents`:
237
 
238
+ ```python
239
+ def create_knowledge_gap_agent(model: Any | None = None) -> KnowledgeGapAgent
240
+ def create_tool_selector_agent(model: Any | None = None) -> ToolSelectorAgent
241
+ def create_writer_agent(model: Any | None = None) -> WriterAgent
242
+ def create_long_writer_agent(model: Any | None = None) -> LongWriterAgent
243
+ def create_proofreader_agent(model: Any | None = None) -> ProofreaderAgent
244
+ def create_thinking_agent(model: Any | None = None) -> ThinkingAgent
245
+ def create_input_parser_agent(model: Any | None = None) -> InputParserAgent
246
+ ```
247
 
248
  **Parameters**:
249
  - `model`: Optional Pydantic AI model. If None, uses `get_model()` from settings.
 
250
 
251
  **Returns**: Agent instance.
252
 
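A hedged end-to-end sketch tying a factory to an agent call, based only on the signatures documented above (query and limits are illustrative):

```python
# Sketch only: values are illustrative; signatures per this page.
import asyncio

from src.agent_factory.agents import create_knowledge_gap_agent
from src.utils.models import Conversation

async def main() -> None:
    agent = create_knowledge_gap_agent()  # model=None -> get_model() from settings
    result = await agent.evaluate(
        query="What drives tau aggregation?",
        background_context="",
        conversation_history=Conversation(),  # empty history on the first iteration
        iteration=1,
        time_elapsed_minutes=0.0,
        max_time_minutes=10.0,
    )
    print(result.research_complete, result.outstanding_gaps)

asyncio.run(main())
```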
 
255
  - [Architecture - Agents](../architecture/agents.md) - Architecture overview
256
  - [Models API](models.md) - Data models used by agents
257
 
258
+
259
+
260
+
261
+
262
+
263
+
264
+
265
+
266
+
docs/api/models.md CHANGED
@@ -8,14 +8,18 @@ This page documents the Pydantic models used throughout DeepCritical.
8
 
9
  **Purpose**: Represents evidence from search results.
10
 
11
- <!--codeinclude-->
12
- [Evidence Model](../src/utils/models.py) start_line:33 end_line:44
13
- <!--/codeinclude-->
 
 
 
 
14
 
15
  **Fields**:
16
  - `citation`: Citation information (title, URL, date, authors)
17
  - `content`: Evidence text content
18
- - `relevance`: Relevance score (0.0-1.0)
19
  - `metadata`: Additional metadata dictionary
20
 
21
  ## Citation
@@ -24,15 +28,18 @@ This page documents the Pydantic models used throughout DeepCritical.
24
 
25
  **Purpose**: Citation information for evidence.
26
 
27
- <!--codeinclude-->
28
- [Citation Model](../src/utils/models.py) start_line:12 end_line:30
29
- <!--/codeinclude-->
 
 
 
 
30
 
31
  **Fields**:
32
- - `source`: Source name (e.g., "pubmed", "clinicaltrials", "europepmc", "web", "rag")
33
  - `title`: Article/trial title
34
  - `url`: Source URL
35
- - `date`: Publication date (YYYY-MM-DD or "Unknown")
36
  - `authors`: List of authors (optional)
37
 
38
  ## KnowledgeGapOutput
@@ -41,9 +48,11 @@ This page documents the Pydantic models used throughout DeepCritical.
41
 
42
  **Purpose**: Output from knowledge gap evaluation.
43
 
44
- <!--codeinclude-->
45
- [KnowledgeGapOutput Model](../src/utils/models.py) start_line:494 end_line:504
46
- <!--/codeinclude-->
 
 
47
 
48
  **Fields**:
49
  - `research_complete`: Boolean indicating if research is complete
@@ -55,9 +64,10 @@ This page documents the Pydantic models used throughout DeepCritical.
55
 
56
  **Purpose**: Plan for tool/agent selection.
57
 
58
- <!--codeinclude-->
59
- [AgentSelectionPlan Model](../src/utils/models.py) start_line:521 end_line:526
60
- <!--/codeinclude-->
 
61
 
62
  **Fields**:
63
  - `tasks`: List of agent tasks to execute
@@ -68,15 +78,17 @@ This page documents the Pydantic models used throughout DeepCritical.
68
 
69
  **Purpose**: Individual agent task.
70
 
71
- <!--codeinclude-->
72
- [AgentTask Model](../src/utils/models.py) start_line:507 end_line:518
73
- <!--/codeinclude-->
 
 
 
74
 
75
  **Fields**:
76
- - `gap`: The knowledge gap being addressed (optional)
77
- - `agent`: Name of agent to use
78
- - `query`: The specific query for the agent
79
- - `entity_website`: The website of the entity being researched, if known (optional)
80
 
81
  ## ReportDraft
82
 
@@ -84,12 +96,17 @@ This page documents the Pydantic models used throughout DeepCritical.
84
 
85
  **Purpose**: Draft structure for long-form reports.
86
 
87
- <!--codeinclude-->
88
- [ReportDraft Model](../src/utils/models.py) start_line:538 end_line:545
89
- <!--/codeinclude-->
 
 
 
90
 
91
  **Fields**:
 
92
  - `sections`: List of report sections
 
93
 
94
  ## ReportSection
95
 
@@ -97,13 +114,17 @@ This page documents the Pydantic models used throughout DeepCritical.
97
 
98
  **Purpose**: Individual section in a report draft.
99
 
100
- <!--codeinclude-->
101
- [ReportDraftSection Model](../src/utils/models.py) start_line:529 end_line:535
102
- <!--/codeinclude-->
 
 
 
103
 
104
  **Fields**:
105
- - `section_title`: The title of the section
106
- - `section_content`: The content of the section
 
107
 
108
  ## ParsedQuery
109
 
@@ -111,9 +132,14 @@ This page documents the Pydantic models used throughout DeepCritical.
111
 
112
  **Purpose**: Parsed and improved query.
113
 
114
- <!--codeinclude-->
115
- [ParsedQuery Model](../src/utils/models.py) start_line:557 end_line:572
116
- <!--/codeinclude-->
 
 
 
 
 
117
 
118
  **Fields**:
119
  - `original_query`: Original query string
@@ -128,12 +154,13 @@ This page documents the Pydantic models used throughout DeepCritical.
128
 
129
  **Purpose**: Conversation history with iterations.
130
 
131
- <!--codeinclude-->
132
- [Conversation Model](../src/utils/models.py) start_line:331 end_line:337
133
- <!--/codeinclude-->
 
134
 
135
  **Fields**:
136
- - `history`: List of iteration data
137
 
138
  ## IterationData
139
 
@@ -141,15 +168,23 @@ This page documents the Pydantic models used throughout DeepCritical.
141
 
142
  **Purpose**: Data for a single iteration.
143
 
144
- <!--codeinclude-->
145
- [IterationData Model](../src/utils/models.py) start_line:315 end_line:328
146
- <!--/codeinclude-->
 
 
 
 
 
 
147
 
148
  **Fields**:
149
- - `gap`: The gap addressed in the iteration
150
- - `tool_calls`: The tool calls made
151
- - `findings`: The findings collected from tool calls
152
- - `thought`: The thinking done to reflect on the success of the iteration and next steps
 
 
153
 
154
  ## AgentEvent
155
 
@@ -157,9 +192,12 @@ This page documents the Pydantic models used throughout DeepCritical.
157
 
158
  **Purpose**: Event emitted during research execution.
159
 
160
- <!--codeinclude-->
161
- [AgentEvent Model](../src/utils/models.py) start_line:104 end_line:125
162
- <!--/codeinclude-->
 
 
 
163
 
164
  **Fields**:
165
  - `type`: Event type (e.g., "started", "search_complete", "complete")
@@ -172,20 +210,35 @@ This page documents the Pydantic models used throughout DeepCritical.
172
 
173
  **Purpose**: Current budget status.
174
 
175
- <!--codeinclude-->
176
- [BudgetStatus Model](../src/middleware/budget_tracker.py) start_line:15 end_line:25
177
- <!--/codeinclude-->
 
 
 
 
 
 
178
 
179
  **Fields**:
180
- - `tokens_used`: Total tokens used
181
- - `tokens_limit`: Token budget limit
182
- - `time_elapsed_seconds`: Time elapsed in seconds
183
- - `time_limit_seconds`: Time budget limit (default: 600.0 seconds / 10 minutes)
184
- - `iterations`: Number of iterations completed
185
- - `iterations_limit`: Maximum iterations (default: 10)
186
- - `iteration_tokens`: Tokens used per iteration (iteration number -> token count)
187
 
188
  ## See Also
189
 
190
  - [Architecture - Agents](../architecture/agents.md) - How models are used
191
  - [Configuration](../configuration/index.md) - Model configuration
 
8
 
9
  **Purpose**: Represents evidence from search results.
10
 
11
+ ```python
12
+ class Evidence(BaseModel):
13
+ citation: Citation
14
+ content: str
15
+ relevance_score: float = Field(ge=0.0, le=1.0)
16
+ metadata: dict[str, Any] = Field(default_factory=dict)
17
+ ```
18
 
19
  **Fields**:
20
  - `citation`: Citation information (title, URL, date, authors)
21
  - `content`: Evidence text content
22
+ - `relevance_score`: Relevance score (0.0-1.0)
23
  - `metadata`: Additional metadata dictionary
24
 
25
  ## Citation
 
28
 
29
  **Purpose**: Citation information for evidence.
30
 
31
+ ```python
32
+ class Citation(BaseModel):
33
+ title: str
34
+ url: str
35
+ date: str | None = None
36
+ authors: list[str] = Field(default_factory=list)
37
+ ```
38
 
39
  **Fields**:
 
40
  - `title`: Article/trial title
41
  - `url`: Source URL
42
+ - `date`: Publication date (optional)
43
  - `authors`: List of authors (optional)
44
 
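A minimal construction sketch for the two models above (all values illustrative):

```python
# Sketch only: values are illustrative; fields per the definitions above.
from src.utils.models import Citation, Evidence

citation = Citation(
    title="Metformin and cognitive decline",
    url="https://pubmed.ncbi.nlm.nih.gov/00000000/",
    date="2023-01-15",
    authors=["Doe J", "Smith A"],
)
evidence = Evidence(
    citation=citation,
    content="The trial reported a modest reduction in cognitive decline...",
    relevance_score=0.82,  # must lie in [0.0, 1.0]
)
```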
45
  ## KnowledgeGapOutput
 
48
 
49
  **Purpose**: Output from knowledge gap evaluation.
50
 
51
+ ```python
52
+ class KnowledgeGapOutput(BaseModel):
53
+ research_complete: bool
54
+ outstanding_gaps: list[str] = Field(default_factory=list)
55
+ ```
56
 
57
  **Fields**:
58
  - `research_complete`: Boolean indicating if research is complete
 
64
 
65
  **Purpose**: Plan for tool/agent selection.
66
 
67
+ ```python
68
+ class AgentSelectionPlan(BaseModel):
69
+ tasks: list[AgentTask] = Field(default_factory=list)
70
+ ```
71
 
72
  **Fields**:
73
  - `tasks`: List of agent tasks to execute
 
78
 
79
  **Purpose**: Individual agent task.
80
 
81
+ ```python
82
+ class AgentTask(BaseModel):
83
+ agent_name: str
84
+ query: str
85
+ context: dict[str, Any] = Field(default_factory=dict)
86
+ ```
87
 
88
  **Fields**:
89
+ - `agent_name`: Name of agent to use
90
+ - `query`: Task query
91
+ - `context`: Additional context dictionary
 
92
 
93
  ## ReportDraft
94
 
 
96
 
97
  **Purpose**: Draft structure for long-form reports.
98
 
99
+ ```python
100
+ class ReportDraft(BaseModel):
101
+ title: str
102
+ sections: list[ReportSection] = Field(default_factory=list)
103
+ references: list[Citation] = Field(default_factory=list)
104
+ ```
105
 
106
  **Fields**:
107
+ - `title`: Report title
108
  - `sections`: List of report sections
109
+ - `references`: List of citations
110
 
111
  ## ReportSection
112
 
 
114
 
115
  **Purpose**: Individual section in a report draft.
116
 
117
+ ```python
118
+ class ReportSection(BaseModel):
119
+ title: str
120
+ content: str
121
+ order: int
122
+ ```
123
 
124
  **Fields**:
125
+ - `title`: Section title
126
+ - `content`: Section content
127
+ - `order`: Section order number
128
 
129
  ## ParsedQuery
130
 
 
132
 
133
  **Purpose**: Parsed and improved query.
134
 
135
+ ```python
136
+ class ParsedQuery(BaseModel):
137
+ original_query: str
138
+ improved_query: str
139
+ research_mode: Literal["iterative", "deep"]
140
+ key_entities: list[str] = Field(default_factory=list)
141
+ research_questions: list[str] = Field(default_factory=list)
142
+ ```
143
 
144
  **Fields**:
145
  - `original_query`: Original query string
 
154
 
155
  **Purpose**: Conversation history with iterations.
156
 
157
+ ```python
158
+ class Conversation(BaseModel):
159
+ iterations: list[IterationData] = Field(default_factory=list)
160
+ ```
161
 
162
  **Fields**:
163
+ - `iterations`: List of iteration data
164
 
165
  ## IterationData
166
 
 
168
 
169
  **Purpose**: Data for a single iteration.
170
 
171
+ ```python
172
+ class IterationData(BaseModel):
173
+ iteration: int
174
+ observations: str | None = None
175
+ knowledge_gaps: list[str] = Field(default_factory=list)
176
+ tool_calls: list[dict[str, Any]] = Field(default_factory=list)
177
+ findings: str | None = None
178
+ thoughts: str | None = None
179
+ ```
180
 
181
  **Fields**:
182
+ - `iteration`: Iteration number
183
+ - `observations`: Generated observations
184
+ - `knowledge_gaps`: Identified knowledge gaps
185
+ - `tool_calls`: Tool calls made
186
+ - `findings`: Findings from tools
187
+ - `thoughts`: Agent thoughts
188
 
189
  ## AgentEvent
190
 
 
192
 
193
  **Purpose**: Event emitted during research execution.
194
 
195
+ ```python
196
+ class AgentEvent(BaseModel):
197
+ type: str
198
+ iteration: int | None = None
199
+ data: dict[str, Any] = Field(default_factory=dict)
200
+ ```
201
 
202
  **Fields**:
203
  - `type`: Event type (e.g., "started", "search_complete", "complete")
 
210
 
211
  **Purpose**: Current budget status.
212
 
213
+ ```python
214
+ class BudgetStatus(BaseModel):
215
+ tokens_used: int
216
+ tokens_limit: int
217
+ time_elapsed_seconds: float
218
+ time_limit_seconds: float
219
+ iterations: int
220
+ iterations_limit: int
221
+ ```
222
 
223
  **Fields**:
224
+ - `tokens_used`: Tokens used so far
225
+ - `tokens_limit`: Token limit
226
+ - `time_elapsed_seconds`: Elapsed time in seconds
227
+ - `time_limit_seconds`: Time limit in seconds
228
+ - `iterations`: Current iteration count
229
+ - `iterations_limit`: Iteration limit
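
A minimal usage sketch (the import path is an assumption):

```python
# Sketch: constructing a BudgetStatus and deriving remaining headroom.
from src.utils.models import BudgetStatus  # assumed path

status = BudgetStatus(
    tokens_used=12_500,
    tokens_limit=100_000,
    time_elapsed_seconds=42.0,
    time_limit_seconds=600.0,
    iterations=3,
    iterations_limit=10,
)
tokens_left = status.tokens_limit - status.tokens_used  # 87_500
```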
 
230
 
231
  ## See Also
232
 
233
  - [Architecture - Agents](../architecture/agents.md) - How models are used
234
  - [Configuration](../configuration/index.md) - Model configuration
235
docs/api/orchestrators.md CHANGED
@@ -12,24 +12,33 @@ This page documents the API for DeepCritical orchestrators.
12
 
13
  #### `run`
14
 
15
- <!--codeinclude-->
16
- [IterativeResearchFlow.run](../src/orchestrator/research_flow.py) start_line:134 end_line:140
17
- <!--/codeinclude-->
 
 
18
 
19
  Runs iterative research flow.
20
 
21
  **Parameters**:
22
  - `query`: Research query string
23
  - `background_context`: Background context (default: "")
24
- - `output_length`: Optional description of desired output length (default: "")
25
- - `output_instructions`: Optional additional instructions for report generation (default: "")
26
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
27
-
28
- **Returns**: Final report string.
29
-
30
- **Note**: The `message_history` parameter enables multi-turn conversations by providing context from previous interactions.
31
-
32
- **Note**: `max_iterations`, `max_time_minutes`, and `token_budget` are constructor parameters, not `run()` parameters.
 
 
33
 
34
  ## DeepResearchFlow
35
 
@@ -41,21 +50,33 @@ Runs iterative research flow.
41
 
42
  #### `run`
43
 
44
- <!--codeinclude-->
45
- [DeepResearchFlow.run](../src/orchestrator/research_flow.py) start_line:778 end_line:778
46
- <!--/codeinclude-->
 
 
47
 
48
  Runs deep research flow.
49
 
50
  **Parameters**:
51
  - `query`: Research query string
52
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
53
-
54
- **Returns**: Final report string.
55
-
56
- **Note**: The `message_history` parameter enables multi-turn conversations by providing context from previous interactions.
57
-
58
- **Note**: `max_iterations_per_section`, `max_time_minutes`, and `token_budget` are constructor parameters, not `run()` parameters.
 
 
 
 
 
59
 
60
  ## GraphOrchestrator
61
 
@@ -67,22 +88,24 @@ Runs deep research flow.
67
 
68
  #### `run`
69
 
70
- <!--codeinclude-->
71
- [GraphOrchestrator.run](../src/orchestrator/graph_orchestrator.py) start_line:177 end_line:177
72
- <!--/codeinclude-->
 
 
73
 
74
  Runs graph-based research orchestration.
75
 
76
  **Parameters**:
77
  - `query`: Research query string
78
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
 
79
 
80
  **Yields**: `AgentEvent` objects during graph execution.
81
 
82
- **Note**:
83
- - `research_mode` and `use_graph` are constructor parameters, not `run()` parameters.
84
- - The `message_history` parameter enables multi-turn conversations by providing context from previous interactions. Message history is stored in `GraphExecutionContext` and passed to agents during execution.
85
-
86
  ## Orchestrator Factory
87
 
88
  **Module**: `src.orchestrator_factory`
@@ -93,18 +116,22 @@ Runs graph-based research orchestration.
93
 
94
  #### `create_orchestrator`
95
 
96
- <!--codeinclude-->
97
- [create_orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:50
98
- <!--/codeinclude-->
 
 
99
 
100
  Creates an orchestrator instance.
101
 
102
  **Parameters**:
103
- - `search_handler`: Search handler protocol implementation (optional, required for simple mode)
104
- - `judge_handler`: Judge handler protocol implementation (optional, required for simple mode)
105
- - `config`: Configuration object (optional)
106
- - `mode`: Orchestrator mode ("simple", "advanced", "magentic", "iterative", "deep", "auto", or None for auto-detect)
107
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
108
 
109
  **Returns**: Orchestrator instance.
110
 
@@ -126,19 +153,24 @@ Creates an orchestrator instance.
126
 
127
  #### `run`
128
 
129
- <!--codeinclude-->
130
- [MagenticOrchestrator.run](../src/orchestrator_magentic.py) start_line:101 end_line:101
131
- <!--/codeinclude-->
 
 
132
 
133
  Runs Magentic orchestration.
134
 
135
  **Parameters**:
136
  - `query`: Research query string
 
 
137
 
138
  **Yields**: `AgentEvent` objects converted from Magentic events.
139
 
140
- **Note**: `max_rounds` and `max_stalls` are constructor parameters, not `run()` parameters.
141
-
142
  **Requirements**:
143
  - `agent-framework-core` package
144
  - OpenAI API key
@@ -146,4 +178,14 @@ Runs Magentic orchestration.
146
  ## See Also
147
 
148
  - [Architecture - Orchestrators](../architecture/orchestrators.md) - Architecture overview
149
- - [Graph Orchestration](../architecture/graph_orchestration.md) - Graph execution details
 
 
12
 
13
  #### `run`
14
 
15
+ ```python
16
+ async def run(
17
+ self,
18
+ query: str,
19
+ background_context: str = "",
20
+ max_iterations: int | None = None,
21
+ max_time_minutes: float | None = None,
22
+ token_budget: int | None = None
23
+ ) -> AsyncGenerator[AgentEvent, None]
24
+ ```
25
 
26
  Runs iterative research flow.
27
 
28
  **Parameters**:
29
  - `query`: Research query string
30
  - `background_context`: Background context (default: "")
31
+ - `max_iterations`: Maximum iterations (default: from settings)
32
+ - `max_time_minutes`: Maximum time in minutes (default: from settings)
33
+ - `token_budget`: Token budget (default: from settings)
34
+
35
+ **Yields**: `AgentEvent` objects for:
36
+ - `started`: Research started
37
+ - `search_complete`: Search completed
38
+ - `judge_complete`: Evidence evaluation completed
39
+ - `synthesizing`: Generating report
40
+ - `complete`: Research completed
41
+ - `error`: Error occurred
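
Because `run()` is an async generator, callers consume it with `async for`. A minimal sketch (the import path and constructor wiring are assumptions):

```python
import asyncio

from src.orchestrator.research_flow import IterativeResearchFlow  # assumed path

async def main() -> None:
    flow = IterativeResearchFlow()  # handler/model wiring omitted in this sketch
    async for event in flow.run("What is known about drug X?", max_iterations=5):
        print(event.type, event.iteration)

asyncio.run(main())
```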
42
 
43
  ## DeepResearchFlow
44
 
 
50
 
51
  #### `run`
52
 
53
+ ```python
54
+ async def run(
55
+ self,
56
+ query: str,
57
+ background_context: str = "",
58
+ max_iterations_per_section: int | None = None,
59
+ max_time_minutes: float | None = None,
60
+ token_budget: int | None = None
61
+ ) -> AsyncGenerator[AgentEvent, None]
62
+ ```
63
 
64
  Runs deep research flow.
65
 
66
  **Parameters**:
67
  - `query`: Research query string
68
+ - `background_context`: Background context (default: "")
69
+ - `max_iterations_per_section`: Maximum iterations per section (default: from settings)
70
+ - `max_time_minutes`: Maximum time in minutes (default: from settings)
71
+ - `token_budget`: Token budget (default: from settings)
72
+
73
+ **Yields**: `AgentEvent` objects for:
74
+ - `started`: Research started
75
+ - `planning`: Creating research plan
76
+ - `looping`: Running parallel research loops
77
+ - `synthesizing`: Synthesizing results
78
+ - `complete`: Research completed
79
+ - `error`: Error occurred
80
 
81
  ## GraphOrchestrator
82
 
 
88
 
89
  #### `run`
90
 
91
+ ```python
92
+ async def run(
93
+ self,
94
+ query: str,
95
+ research_mode: str = "auto",
96
+ use_graph: bool = True
97
+ ) -> AsyncGenerator[AgentEvent, None]
98
+ ```
99
 
100
  Runs graph-based research orchestration.
101
 
102
  **Parameters**:
103
  - `query`: Research query string
104
+ - `research_mode`: Research mode ("iterative", "deep", or "auto")
105
+ - `use_graph`: Whether to use graph execution (default: True)
106
 
107
  **Yields**: `AgentEvent` objects during graph execution.
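
A hedged sketch of driving the orchestrator in a fixed mode (the orchestrator instance is assumed to be constructed elsewhere):

```python
# Sketch: force deep mode with graph execution and stop on the first error.
async def run_deep(orchestrator, query: str) -> None:
    async for event in orchestrator.run(query, research_mode="deep", use_graph=True):
        if event.type == "error":
            print("failed:", event.data)
            break
        print(event.type)
```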
108
 
 
 
 
 
109
  ## Orchestrator Factory
110
 
111
  **Module**: `src.orchestrator_factory`
 
116
 
117
  #### `create_orchestrator`
118
 
119
+ ```python
120
+ def create_orchestrator(
121
+ search_handler: SearchHandlerProtocol,
122
+ judge_handler: JudgeHandlerProtocol,
123
+ config: dict[str, Any],
124
+ mode: str | None = None
125
+ ) -> Any
126
+ ```
127
 
128
  Creates an orchestrator instance.
129
 
130
  **Parameters**:
131
+ - `search_handler`: Search handler protocol implementation
132
+ - `judge_handler`: Judge handler protocol implementation
133
+ - `config`: Configuration dictionary
134
+ - `mode`: Orchestrator mode ("simple", "advanced", "magentic", or None for auto-detect)
 
135
 
136
  **Returns**: Orchestrator instance.
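
A usage sketch, where `my_search_handler` and `my_judge_handler` stand in for hypothetical protocol implementations:

```python
from src.orchestrator_factory import create_orchestrator

orchestrator = create_orchestrator(
    search_handler=my_search_handler,  # hypothetical SearchHandlerProtocol impl
    judge_handler=my_judge_handler,    # hypothetical JudgeHandlerProtocol impl
    config={"max_iterations": 10},
    mode=None,  # None auto-detects a mode from the environment
)
```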
137
 
 
153
 
154
  #### `run`
155
 
156
+ ```python
157
+ async def run(
158
+ self,
159
+ query: str,
160
+ max_rounds: int = 15,
161
+ max_stalls: int = 3
162
+ ) -> AsyncGenerator[AgentEvent, None]
163
+ ```
164
 
165
  Runs Magentic orchestration.
166
 
167
  **Parameters**:
168
  - `query`: Research query string
169
+ - `max_rounds`: Maximum rounds (default: 15)
170
+ - `max_stalls`: Maximum stalls before reset (default: 3)
171
 
172
  **Yields**: `AgentEvent` objects converted from Magentic events.
173
 
 
 
174
  **Requirements**:
175
  - `agent-framework-core` package
176
  - OpenAI API key
 
178
  ## See Also
179
 
180
  - [Architecture - Orchestrators](../architecture/orchestrators.md) - Architecture overview
181
+ - [Graph Orchestration](../architecture/graph-orchestration.md) - Graph execution details
182
docs/api/services.md CHANGED
@@ -12,9 +12,9 @@ This page documents the API for DeepCritical services.
12
 
13
  #### `embed`
14
 
15
- <!--codeinclude-->
16
- [EmbeddingService.embed](../src/services/embeddings.py) start_line:55 end_line:55
17
- <!--/codeinclude-->
18
 
19
  Generates embedding for a text string.
20
 
@@ -68,60 +68,6 @@ Finds duplicate texts based on similarity threshold.
68
 
69
  **Returns**: List of (index1, index2) tuples for duplicate pairs.
70
 
71
- #### `add_evidence`
72
-
73
- ```python
74
- async def add_evidence(
75
- self,
76
- evidence_id: str,
77
- content: str,
78
- metadata: dict[str, Any]
79
- ) -> None
80
- ```
81
-
82
- Adds evidence to vector store for semantic search.
83
-
84
- **Parameters**:
85
- - `evidence_id`: Unique identifier for the evidence
86
- - `content`: Evidence text content
87
- - `metadata`: Additional metadata dictionary
88
-
89
- #### `search_similar`
90
-
91
- ```python
92
- async def search_similar(
93
- self,
94
- query: str,
95
- n_results: int = 5
96
- ) -> list[dict[str, Any]]
97
- ```
98
-
99
- Finds semantically similar evidence.
100
-
101
- **Parameters**:
102
- - `query`: Search query string
103
- - `n_results`: Number of results to return (default: 5)
104
-
105
- **Returns**: List of dictionaries with `id`, `content`, `metadata`, and `distance` keys.
106
-
107
- #### `deduplicate`
108
-
109
- ```python
110
- async def deduplicate(
111
- self,
112
- new_evidence: list[Evidence],
113
- threshold: float = 0.9
114
- ) -> list[Evidence]
115
- ```
116
-
117
- Removes semantically duplicate evidence.
118
-
119
- **Parameters**:
120
- - `new_evidence`: List of evidence items to deduplicate
121
- - `threshold`: Similarity threshold (default: 0.9, where 0.9 = 90% similar is duplicate)
122
-
123
- **Returns**: List of unique evidence items (not already in vector store).
124
-
125
  ### Factory Function
126
 
127
  #### `get_embedding_service`
@@ -143,97 +89,63 @@ Returns singleton EmbeddingService instance.
143
 
144
  #### `ingest_evidence`
145
 
146
- <!--codeinclude-->
147
- [LlamaIndexRAGService.ingest_evidence](../src/services/llamaindex_rag.py) start_line:290 end_line:290
148
- <!--/codeinclude-->
149
 
150
  Ingests evidence into RAG service.
151
 
152
  **Parameters**:
153
- - `evidence_list`: List of Evidence objects to ingest
154
 
155
- **Note**: Supports multiple embedding providers (OpenAI, local sentence-transformers, Hugging Face).
156
 
157
  #### `retrieve`
158
 
159
  ```python
160
- def retrieve(
161
  self,
162
  query: str,
163
- top_k: int | None = None
164
- ) -> list[dict[str, Any]]
165
  ```
166
 
167
  Retrieves relevant documents for a query.
168
 
169
  **Parameters**:
170
  - `query`: Search query string
171
- - `top_k`: Number of top results to return (defaults to `similarity_top_k` from constructor)
172
 
173
- **Returns**: List of dictionaries with `text`, `score`, and `metadata` keys.
174
 
175
  #### `query`
176
 
177
  ```python
178
- def query(
179
  self,
180
- query_str: str,
181
- top_k: int | None = None
182
  ) -> str
183
  ```
184
 
185
- Queries RAG service and returns synthesized response.
186
-
187
- **Parameters**:
188
- - `query_str`: Query string
189
- - `top_k`: Number of results to use (defaults to `similarity_top_k` from constructor)
190
-
191
- **Returns**: Synthesized response string.
192
-
193
- **Raises**:
194
- - `ConfigurationError`: If no LLM API key is available for query synthesis
195
-
196
- #### `ingest_documents`
197
-
198
- ```python
199
- def ingest_documents(self, documents: list[Any]) -> None
200
- ```
201
-
202
- Ingests raw LlamaIndex Documents.
203
 
204
  **Parameters**:
205
- - `documents`: List of LlamaIndex Document objects
206
-
207
- #### `clear_collection`
208
-
209
- ```python
210
- def clear_collection(self) -> None
211
- ```
212
 
213
- Clears all documents from the collection.
214
 
215
  ### Factory Function
216
 
217
  #### `get_rag_service`
218
 
219
  ```python
220
- def get_rag_service(
221
- collection_name: str = "deepcritical_evidence",
222
- oauth_token: str | None = None,
223
- **kwargs: Any
224
- ) -> LlamaIndexRAGService
225
  ```
226
 
227
- Get or create a RAG service instance.
228
-
229
- **Parameters**:
230
- - `collection_name`: Name of the ChromaDB collection (default: "deepcritical_evidence")
231
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
232
- - `**kwargs`: Additional arguments for LlamaIndexRAGService (e.g., `use_openai_embeddings=False`)
233
-
234
- **Returns**: Configured LlamaIndexRAGService instance.
235
-
236
- **Note**: By default, uses local embeddings (sentence-transformers) which require no API keys.
237
 
238
  ## StatisticalAnalyzer
239
 
@@ -248,27 +160,24 @@ Get or create a RAG service instance.
248
  ```python
249
  async def analyze(
250
  self,
251
- query: str,
252
  evidence: list[Evidence],
253
- hypothesis: dict[str, Any] | None = None
254
  ) -> AnalysisResult
255
  ```
256
 
257
- Analyzes a research question using statistical methods.
258
 
259
  **Parameters**:
260
- - `query`: The research question
261
- - `evidence`: List of Evidence objects to analyze
262
- - `hypothesis`: Optional hypothesis dict with `drug`, `target`, `pathway`, `effect`, `confidence` keys
263
 
264
  **Returns**: `AnalysisResult` with:
265
  - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
266
- - `confidence`: Confidence in verdict (0.0-1.0)
267
- - `statistical_evidence`: Summary of statistical findings
268
- - `code_generated`: Python code that was executed
269
- - `execution_output`: Output from code execution
270
- - `key_takeaways`: Key takeaways from analysis
271
- - `limitations`: List of limitations
272
 
273
  **Note**: Requires Modal credentials for sandbox execution.
274
 
@@ -277,3 +186,12 @@ Analyzes a research question using statistical methods.
277
  - [Architecture - Services](../architecture/services.md) - Architecture overview
278
  - [Configuration](../configuration/index.md) - Service configuration
279
 
 
 
12
 
13
  #### `embed`
14
 
15
+ ```python
16
+ async def embed(self, text: str) -> list[float]
17
+ ```
18
 
19
  Generates embedding for a text string.
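
A small sketch comparing two embeddings by cosine similarity (`service` is assumed to come from `get_embedding_service()`, documented below):

```python
import math

async def similarity(service, a: str, b: str) -> float:
    # Embed both strings, then compute the cosine of the angle between them.
    va, vb = await service.embed(a), await service.embed(b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    return dot / norm
```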
20
 
 
68
 
69
  **Returns**: List of (index1, index2) tuples for duplicate pairs.
70
 
 
 
71
  ### Factory Function
72
 
73
  #### `get_embedding_service`
 
89
 
90
  #### `ingest_evidence`
91
 
92
+ ```python
93
+ async def ingest_evidence(self, evidence: list[Evidence]) -> None
94
+ ```
95
 
96
  Ingests evidence into RAG service.
97
 
98
  **Parameters**:
99
+ - `evidence`: List of Evidence objects to ingest
100
 
101
+ **Note**: Requires OpenAI API key for embeddings.
102
 
103
  #### `retrieve`
104
 
105
  ```python
106
+ async def retrieve(
107
  self,
108
  query: str,
109
+ top_k: int = 5
110
+ ) -> list[Document]
111
  ```
112
 
113
  Retrieves relevant documents for a query.
114
 
115
  **Parameters**:
116
  - `query`: Search query string
117
+ - `top_k`: Number of top results to return (default: 5)
118
 
119
+ **Returns**: List of Document objects with metadata.
120
 
121
  #### `query`
122
 
123
  ```python
124
+ async def query(
125
  self,
126
+ query: str,
127
+ top_k: int = 5
128
  ) -> str
129
  ```
130
 
131
+ Queries RAG service and returns formatted results.
 
 
132
 
133
  **Parameters**:
134
+ - `query`: Search query string
135
+ - `top_k`: Number of top results to return (default: 5)
 
 
136
 
137
+ **Returns**: Formatted query results as string.
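
A combined sketch of the three methods above, run inside an async context (`evidence` is an assumed `list[Evidence]` collected earlier):

```python
service = get_rag_service()  # factory documented below; may return None
if service is not None:
    await service.ingest_evidence(evidence)                    # list[Evidence]
    docs = await service.retrieve("dosing evidence", top_k=3)  # ranked documents
    answer = await service.query("What dosing evidence exists?")
```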
138
 
139
  ### Factory Function
140
 
141
  #### `get_rag_service`
142
 
143
  ```python
144
+ @lru_cache(maxsize=1)
145
+ def get_rag_service() -> LlamaIndexRAGService | None
 
 
 
146
  ```
147
 
148
+ Returns singleton LlamaIndexRAGService instance, or None if OpenAI key not available.
 
149
 
150
  ## StatisticalAnalyzer
151
 
 
160
  ```python
161
  async def analyze(
162
  self,
163
+ hypothesis: str,
164
  evidence: list[Evidence],
165
+ data_description: str | None = None
166
  ) -> AnalysisResult
167
  ```
168
 
169
+ Analyzes a hypothesis using statistical methods.
170
 
171
  **Parameters**:
172
+ - `hypothesis`: Hypothesis to analyze
173
+ - `evidence`: List of Evidence objects
174
+ - `data_description`: Optional data description
175
 
176
  **Returns**: `AnalysisResult` with:
177
  - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
178
+ - `code`: Generated analysis code
179
+ - `output`: Execution output
180
+ - `error`: Error message if execution failed
181
 
182
  **Note**: Requires Modal credentials for sandbox execution.
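
A usage sketch (`analyzer` is an assumed `StatisticalAnalyzer` instance and `evidence` an assumed list, both constructed elsewhere):

```python
# Sketch: checking a hypothesis against collected evidence.
result = await analyzer.analyze(
    hypothesis="Drug X lowers systolic blood pressure",
    evidence=evidence,
)
if result.verdict == "SUPPORTED":
    print(result.output)  # output from the sandboxed analysis run
```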
183
 
 
186
  - [Architecture - Services](../architecture/services.md) - Architecture overview
187
  - [Configuration](../configuration/index.md) - Service configuration
188
 
189
docs/api/tools.md CHANGED
@@ -56,10 +56,8 @@ Searches PubMed for articles.
56
  **Returns**: List of `Evidence` objects with PubMed articles.
57
 
58
  **Raises**:
59
- - `SearchError`: If search fails (timeout, HTTP error, XML parsing error)
60
- - `RateLimitError`: If rate limit is exceeded (429 status code)
61
-
62
- **Note**: Uses NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Handles single vs. multiple articles.
63
 
64
  ## ClinicalTrialsTool
65
 
@@ -98,10 +96,10 @@ Searches ClinicalTrials.gov for trials.
98
 
99
  **Returns**: List of `Evidence` objects with clinical trials.
100
 
101
- **Note**: Only returns interventional studies with status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION. Uses `requests` library (NOT httpx - WAF blocks httpx). Runs in thread pool for async compatibility.
102
 
103
  **Raises**:
104
- - `SearchError`: If search fails (HTTP error, request exception)
105
 
106
  ## EuropePMCTool
107
 
@@ -140,10 +138,10 @@ Searches Europe PMC for articles and preprints.
140
 
141
  **Returns**: List of `Evidence` objects with articles/preprints.
142
 
143
- **Note**: Includes both preprints (marked with `[PREPRINT - Not peer-reviewed]`) and peer-reviewed articles. Handles preprint markers. Builds URLs from DOI or PMID.
144
 
145
  **Raises**:
146
- - `SearchError`: If search fails (HTTP error, connection error)
147
 
148
  ## RAGTool
149
 
@@ -151,20 +149,6 @@ Searches Europe PMC for articles and preprints.
151
 
152
  **Purpose**: Semantic search within collected evidence.
153
 
154
- ### Initialization
155
-
156
- ```python
157
- def __init__(
158
- self,
159
- rag_service: LlamaIndexRAGService | None = None,
160
- oauth_token: str | None = None
161
- ) -> None
162
- ```
163
-
164
- **Parameters**:
165
- - `rag_service`: Optional RAG service instance. If None, will be lazy-initialized.
166
- - `oauth_token`: Optional OAuth token from HuggingFace login (for RAG LLM)
167
-
168
  ### Properties
169
 
170
  #### `name`
@@ -196,10 +180,7 @@ Searches collected evidence using semantic similarity.
196
 
197
  **Returns**: List of `Evidence` objects from collected evidence.
198
 
199
- **Raises**:
200
- - `ConfigurationError`: If RAG service is unavailable
201
-
202
- **Note**: Requires evidence to be ingested into RAG service first. Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results.
203
 
204
  ## SearchHandler
205
 
@@ -207,53 +188,44 @@ Searches collected evidence using semantic similarity.
207
 
208
  **Purpose**: Orchestrates parallel searches across multiple tools.
209
 
210
- ### Initialization
 
 
211
 
212
  ```python
213
- def __init__(
214
  self,
215
- tools: list[SearchTool],
216
- timeout: float = 30.0,
217
- include_rag: bool = False,
218
- auto_ingest_to_rag: bool = True,
219
- oauth_token: str | None = None
220
- ) -> None
221
  ```
222
 
223
- **Parameters**:
224
- - `tools`: List of search tools to use
225
- - `timeout`: Timeout for each search in seconds (default: 30.0)
226
- - `include_rag`: Whether to include RAG tool in searches (default: False)
227
- - `auto_ingest_to_rag`: Whether to automatically ingest results into RAG (default: True)
228
- - `oauth_token`: Optional OAuth token from HuggingFace login (for RAG LLM)
229
-
230
- ### Methods
231
-
232
- #### `execute`
233
-
234
- <!--codeinclude-->
235
- [SearchHandler.execute](../src/tools/search_handler.py) start_line:86 end_line:86
236
- <!--/codeinclude-->
237
-
238
  Searches multiple tools in parallel.
239
 
240
  **Parameters**:
241
  - `query`: Search query string
 
242
  - `max_results_per_tool`: Maximum results per tool (default: 10)
243
 
244
  **Returns**: `SearchResult` with:
245
- - `query`: The search query
246
  - `evidence`: Aggregated list of evidence
247
- - `sources_searched`: List of source names searched
248
- - `total_found`: Total number of results
249
- - `errors`: List of error messages from failed tools
250
 
251
- **Raises**:
252
- - `SearchError`: If search times out
253
-
254
- **Note**: Uses `asyncio.gather()` for parallel execution. Handles tool failures gracefully (returns errors in `SearchResult.errors`). Automatically ingests evidence into RAG if enabled.
255
 
256
  ## See Also
257
 
258
  - [Architecture - Tools](../architecture/tools.md) - Architecture overview
259
  - [Models API](models.md) - Data models used by tools
 
 
56
  **Returns**: List of `Evidence` objects with PubMed articles.
57
 
58
  **Raises**:
59
+ - `SearchError`: If search fails
60
+ - `RateLimitError`: If rate limit is exceeded
 
 
61
 
62
  ## ClinicalTrialsTool
63
 
 
96
 
97
  **Returns**: List of `Evidence` objects with clinical trials.
98
 
99
+ **Note**: Only returns interventional studies with status COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, or ENROLLING_BY_INVITATION.
100
 
101
  **Raises**:
102
+ - `SearchError`: If search fails
103
 
104
  ## EuropePMCTool
105
 
 
138
 
139
  **Returns**: List of `Evidence` objects with articles/preprints.
140
 
141
+ **Note**: Includes both preprints (marked with `[PREPRINT - Not peer-reviewed]`) and peer-reviewed articles.
142
 
143
  **Raises**:
144
+ - `SearchError`: If search fails
145
 
146
  ## RAGTool
147
 
 
149
 
150
  **Purpose**: Semantic search within collected evidence.
151
 
 
 
152
  ### Properties
153
 
154
  #### `name`
 
180
 
181
  **Returns**: List of `Evidence` objects from collected evidence.
182
 
183
+ **Note**: Requires evidence to be ingested into RAG service first.
184
 
185
  ## SearchHandler
186
 
 
188
 
189
  **Purpose**: Orchestrates parallel searches across multiple tools.
190
 
191
+ ### Methods
192
+
193
+ #### `search`
194
 
195
  ```python
196
+ async def search(
197
  self,
198
+ query: str,
199
+ tools: list[SearchTool] | None = None,
200
+ max_results_per_tool: int = 10
201
+ ) -> SearchResult
 
 
202
  ```
203
 
 
 
204
  Searches multiple tools in parallel.
205
 
206
  **Parameters**:
207
  - `query`: Search query string
208
+ - `tools`: List of tools to use (default: all available tools)
209
  - `max_results_per_tool`: Maximum results per tool (default: 10)
210
 
211
  **Returns**: `SearchResult` with:
 
212
  - `evidence`: Aggregated list of evidence
213
+ - `tool_results`: Results per tool
214
+ - `total_count`: Total number of results
 
215
 
216
+ **Note**: Uses `asyncio.gather()` for parallel execution. Handles tool failures gracefully.
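
A usage sketch (`handler` is an assumed `SearchHandler` instance; `Evidence` is assumed to expose a `url` field, as used for deduplication elsewhere in the docs):

```python
# Sketch: one parallel query across every available tool.
result = await handler.search("semaglutide cardiovascular outcomes")
print(result.total_count, "results from", len(result.tool_results), "tools")
for item in result.evidence:
    print(item.url)  # assumed Evidence field
```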
217
 
218
  ## See Also
219
 
220
  - [Architecture - Tools](../architecture/tools.md) - Architecture overview
221
  - [Models API](models.md) - Data models used by tools
222
docs/architecture/agents.md CHANGED
@@ -4,16 +4,12 @@ DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents f
4
 
5
  ## Agent Pattern
6
 
7
- ### Pydantic AI Agents
8
-
9
- Pydantic AI agents use the `Agent` class with the following structure:
10
 
11
  - **System Prompt**: Module-level constant with date injection
12
  - **Agent Class**: `__init__(model: Any | None = None)`
13
  - **Main Method**: Async method (e.g., `async def evaluate()`, `async def write_report()`)
14
- - **Factory Function**: `def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName`
15
-
16
- **Note**: Factory functions accept an optional `oauth_token` parameter for HuggingFace authentication, which takes priority over environment variables.
17
 
18
  ## Model Initialization
19
 
@@ -159,130 +155,19 @@ For text output (writer agents), agents return `str` directly.
159
  - `key_entities`: List of key entities
160
  - `research_questions`: List of research questions
161
 
162
- ## Magentic Agents
163
-
164
- The following agents use the `BaseAgent` pattern from `agent-framework` and are used exclusively with `MagenticOrchestrator`:
165
-
166
- ### Hypothesis Agent
167
-
168
- **File**: `src/agents/hypothesis_agent.py`
169
-
170
- **Purpose**: Generates mechanistic hypotheses based on evidence.
171
-
172
- **Pattern**: `BaseAgent` from `agent-framework`
173
-
174
- **Methods**:
175
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
176
-
177
- **Features**:
178
- - Uses internal Pydantic AI `Agent` with `HypothesisAssessment` output type
179
- - Accesses shared `evidence_store` for evidence
180
- - Uses embedding service for diverse evidence selection (MMR algorithm)
181
- - Stores hypotheses in shared context
182
-
183
- ### Search Agent
184
-
185
- **File**: `src/agents/search_agent.py`
186
-
187
- **Purpose**: Wraps `SearchHandler` as an agent for Magentic orchestrator.
188
-
189
- **Pattern**: `BaseAgent` from `agent-framework`
190
-
191
- **Methods**:
192
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
193
-
194
- **Features**:
195
- - Executes searches via `SearchHandlerProtocol`
196
- - Deduplicates evidence using embedding service
197
- - Searches for semantically related evidence
198
- - Updates shared evidence store
199
-
200
- ### Analysis Agent
201
-
202
- **File**: `src/agents/analysis_agent.py`
203
-
204
- **Purpose**: Performs statistical analysis using Modal sandbox.
205
-
206
- **Pattern**: `BaseAgent` from `agent-framework`
207
-
208
- **Methods**:
209
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
210
-
211
- **Features**:
212
- - Wraps `StatisticalAnalyzer` service
213
- - Analyzes evidence and hypotheses
214
- - Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE)
215
- - Stores analysis results in shared context
216
-
217
- ### Report Agent (Magentic)
218
-
219
- **File**: `src/agents/report_agent.py`
220
-
221
- **Purpose**: Generates structured scientific reports from evidence and hypotheses.
222
-
223
- **Pattern**: `BaseAgent` from `agent-framework`
224
-
225
- **Methods**:
226
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
227
-
228
- **Features**:
229
- - Uses internal Pydantic AI `Agent` with `ResearchReport` output type
230
- - Accesses shared evidence store and hypotheses
231
- - Validates citations before returning
232
- - Formats report as markdown
233
-
234
- ### Judge Agent
235
-
236
- **File**: `src/agents/judge_agent.py`
237
-
238
- **Purpose**: Evaluates evidence quality and determines if sufficient for synthesis.
239
-
240
- **Pattern**: `BaseAgent` from `agent-framework`
241
-
242
- **Methods**:
243
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
244
- - `async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]`
245
-
246
- **Features**:
247
- - Wraps `JudgeHandlerProtocol`
248
- - Accesses shared evidence store
249
- - Returns `JudgeAssessment` with sufficient flag, confidence, and recommendation
250
-
251
- ## Agent Patterns
252
-
253
- DeepCritical uses two distinct agent patterns:
254
-
255
- ### 1. Pydantic AI Agents (Traditional Pattern)
256
-
257
- These agents use the Pydantic AI `Agent` class directly and are used in iterative and deep research flows:
258
-
259
- - **Pattern**: `Agent(model, output_type, system_prompt)`
260
- - **Initialization**: `__init__(model: Any | None = None)`
261
- - **Methods**: Agent-specific async methods (e.g., `async def evaluate()`, `async def write_report()`)
262
- - **Examples**: `KnowledgeGapAgent`, `ToolSelectorAgent`, `WriterAgent`, `LongWriterAgent`, `ProofreaderAgent`, `ThinkingAgent`, `InputParserAgent`
263
-
264
- ### 2. Magentic Agents (Agent-Framework Pattern)
265
-
266
- These agents use the `BaseAgent` class from `agent-framework` and are used in Magentic orchestrator:
267
-
268
- - **Pattern**: `BaseAgent` from `agent-framework` with `async def run()` method
269
- - **Initialization**: `__init__(evidence_store, embedding_service, ...)`
270
- - **Methods**: `async def run(messages, thread, **kwargs) -> AgentRunResponse`
271
- - **Examples**: `HypothesisAgent`, `SearchAgent`, `AnalysisAgent`, `ReportAgent`, `JudgeAgent`
272
-
273
- **Note**: Magentic agents are used exclusively with the `MagenticOrchestrator` and follow the agent-framework protocol for multi-agent coordination.
274
-
275
  ## Factory Functions
276
 
277
  All agents have factory functions in `src/agent_factory/agents.py`:
278
 
279
- <!--codeinclude-->
280
- [Factory Functions](../src/agent_factory/agents.py) start_line:79 end_line:100
281
- <!--/codeinclude-->
 
 
 
282
 
283
  Factory functions:
284
  - Use `get_model()` if no model provided
285
- - Accept `oauth_token` parameter for HuggingFace authentication
286
  - Raise `ConfigurationError` if creation fails
287
  - Log agent creation
288
 
@@ -291,3 +176,13 @@ Factory functions:
291
  - [Orchestrators](orchestrators.md) - How agents are orchestrated
292
  - [API Reference - Agents](../api/agents.md) - API documentation
293
  - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
4
 
5
  ## Agent Pattern
6
 
7
+ All agents use the Pydantic AI `Agent` class with the following structure:
 
 
8
 
9
  - **System Prompt**: Module-level constant with date injection
10
  - **Agent Class**: `__init__(model: Any | None = None)`
11
  - **Main Method**: Async method (e.g., `async def evaluate()`, `async def write_report()`)
12
+ - **Factory Function**: `def create_agent_name(model: Any | None = None) -> AgentName`
 
 
13
 
14
  ## Model Initialization
15
 
 
155
  - `key_entities`: List of key entities
156
  - `research_questions`: List of research questions
157
 
 
 
158
  ## Factory Functions
159
 
160
  All agents have factory functions in `src/agent_factory/agents.py`:
161
 
162
+ ```python
163
+ def create_knowledge_gap_agent(model: Any | None = None) -> KnowledgeGapAgent
164
+ def create_tool_selector_agent(model: Any | None = None) -> ToolSelectorAgent
165
+ def create_writer_agent(model: Any | None = None) -> WriterAgent
166
+ # ... etc
167
+ ```
168
 
169
  Factory functions:
170
  - Use `get_model()` if no model provided
 
171
  - Raise `ConfigurationError` if creation fails
172
  - Log agent creation
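
An illustrative factory body that follows the three rules above; the import paths are assumptions, not the project's actual layout:

```python
from typing import Any

import structlog

from src.agent_factory.models import get_model                # assumed path
from src.agents.knowledge_gap_agent import KnowledgeGapAgent  # assumed path
from src.utils.exceptions import ConfigurationError

logger = structlog.get_logger()

def create_knowledge_gap_agent(model: Any | None = None) -> KnowledgeGapAgent:
    try:
        # Fall back to the configured default model when none is provided.
        agent = KnowledgeGapAgent(model=model or get_model())
    except Exception as e:
        raise ConfigurationError("Failed to create KnowledgeGapAgent") from e
    logger.info("agent_created", agent="KnowledgeGapAgent")
    return agent
```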
173
 
 
176
  - [Orchestrators](orchestrators.md) - How agents are orchestrated
177
  - [API Reference - Agents](../api/agents.md) - API documentation
178
  - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
179
+
180
+
181
+
182
+
183
+
184
+
185
+
186
+
187
+
188
+
docs/architecture/graph-orchestration.md ADDED
@@ -0,0 +1,152 @@
 
 
1
+ # Graph Orchestration Architecture
2
+
3
+ ## Overview
4
+
5
+ Phase 4 implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
6
+
7
+ ## Graph Structure
8
+
9
+ ### Nodes
10
+
11
+ Graph nodes represent different stages in the research workflow:
12
+
13
+ 1. **Agent Nodes**: Execute Pydantic AI agents
14
+ - Input: Prompt/query
15
+ - Output: Structured or unstructured response
16
+ - Examples: `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`
17
+
18
+ 2. **State Nodes**: Update or read workflow state
19
+ - Input: Current state
20
+ - Output: Updated state
21
+ - Examples: Update evidence, update conversation history
22
+
23
+ 3. **Decision Nodes**: Make routing decisions based on conditions
24
+ - Input: Current state/results
25
+ - Output: Next node ID
26
+ - Examples: Continue research vs. complete research
27
+
28
+ 4. **Parallel Nodes**: Execute multiple nodes concurrently
29
+ - Input: List of node IDs
30
+ - Output: Aggregated results
31
+ - Examples: Parallel iterative research loops
32
+
33
+ ### Edges
34
+
35
+ Edges define transitions between nodes:
36
+
37
+ 1. **Sequential Edges**: Always traversed (no condition)
38
+ - From: Source node
39
+ - To: Target node
40
+ - Condition: None (always True)
41
+
42
+ 2. **Conditional Edges**: Traversed based on condition
43
+ - From: Source node
44
+ - To: Target node
45
+ - Condition: Callable that returns bool
46
+ - Example: If research complete → go to writer, else → continue loop
47
+
48
+ 3. **Parallel Edges**: Used for parallel execution branches
49
+ - From: Parallel node
50
+ - To: Multiple target nodes
51
+ - Execution: All targets run concurrently
52
+
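
The edge kinds above can be sketched as follows (class and field names are illustrative assumptions, not the project's actual API):

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Edge:
    source: str
    target: str
    condition: Callable[[dict[str, Any]], bool] | None = None  # None = sequential

    def should_traverse(self, state: dict[str, Any]) -> bool:
        return True if self.condition is None else self.condition(state)

# Conditional routing: research complete -> writer, otherwise keep looping.
to_writer = Edge("knowledge_gap", "writer", lambda s: s["research_complete"])
to_tools = Edge("knowledge_gap", "tool_selector", lambda s: not s["research_complete"])
```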
53
+ ## Graph Patterns
54
+
55
+ ### Iterative Research Graph
56
+
57
+ ```
58
+ [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
59
+ ↓ No ↓ Yes
60
+ [Tool Selector] [Writer]
61
+
62
+ [Execute Tools] → [Loop Back]
63
+ ```
64
+
65
+ ### Deep Research Graph
66
+
67
+ ```
68
+ [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
69
+ ↓ ↓ ↓
70
+ [Loop1] [Loop2] [Loop3]
71
+ ```
72
+
73
+ ## State Management
74
+
75
+ State is managed via `WorkflowState` using `ContextVar` for thread-safe isolation:
76
+
77
+ - **Evidence**: Collected evidence from searches
78
+ - **Conversation**: Iteration history (gaps, tool calls, findings, thoughts)
79
+ - **Embedding Service**: For semantic search
80
+
81
+ State transitions occur at state nodes, which update the global workflow state.
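
A minimal sketch of the `ContextVar` pattern, assuming illustrative names (the real accessors live in the middleware layer):

```python
from contextvars import ContextVar
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    evidence: list = field(default_factory=list)

_state: ContextVar[WorkflowState | None] = ContextVar("workflow_state", default=None)

def get_workflow_state() -> WorkflowState:
    state = _state.get()
    if state is None:  # lazily initialize for the current task/thread
        state = WorkflowState()
        _state.set(state)
    return state
```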
82
+
83
+ ## Execution Flow
84
+
85
+ 1. **Graph Construction**: Build graph from nodes and edges
86
+ 2. **Graph Validation**: Ensure graph is valid (no cycles, all nodes reachable)
87
+ 3. **Graph Execution**: Traverse graph from entry node
88
+ 4. **Node Execution**: Execute each node based on type
89
+ 5. **Edge Evaluation**: Determine next node(s) based on edges
90
+ 6. **Parallel Execution**: Use `asyncio.gather()` for parallel nodes
91
+ 7. **State Updates**: Update state at state nodes
92
+ 8. **Event Streaming**: Yield events during execution for UI
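
A sketch of the traversal loop behind steps 3-8; the graph/node helpers (`entry_node`, `node.run`, `next_node`) are assumptions:

```python
from src.utils.models import AgentEvent  # assumed path

async def execute_graph(graph, query: str):
    node = graph.entry_node
    while node is not None:
        yield AgentEvent(type="node_started", data={"node": node.id})
        result = await node.run(query)                 # node execution
        yield AgentEvent(type="node_complete", data={"node": node.id})
        node = graph.next_node(node, result)           # edge evaluation
```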
93
+
94
+ ## Conditional Routing
95
+
96
+ Decision nodes evaluate conditions and return next node IDs:
97
+
98
+ - **Knowledge Gap Decision**: If `research_complete` → writer, else → tool selector
99
+ - **Budget Decision**: If budget exceeded → exit, else → continue
100
+ - **Iteration Decision**: If max iterations → exit, else → continue
101
+
102
+ ## Parallel Execution
103
+
104
+ Parallel nodes execute multiple nodes concurrently:
105
+
106
+ - Each parallel branch runs independently
107
+ - Results are aggregated after all branches complete
108
+ - State is synchronized after parallel execution
109
+ - Errors in one branch don't stop other branches
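
A sketch of fault-tolerant branch execution: `asyncio.gather(return_exceptions=True)` keeps one failing branch from cancelling its siblings:

```python
import asyncio

async def run_parallel(branches):
    """Run async callables concurrently; exceptions are collected, not raised."""
    results = await asyncio.gather(*(b() for b in branches), return_exceptions=True)
    ok = [r for r in results if not isinstance(r, Exception)]
    errors = [r for r in results if isinstance(r, Exception)]
    return ok, errors
```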
110
+
111
+ ## Budget Enforcement
112
+
113
+ Budget constraints are enforced at decision nodes:
114
+
115
+ - **Token Budget**: Track LLM token usage
116
+ - **Time Budget**: Track elapsed time
117
+ - **Iteration Budget**: Track iteration count
118
+
119
+ If any budget is exceeded, execution routes to exit node.
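
A hypothetical decision-node check, reusing the `BudgetStatus` fields documented in the API reference:

```python
def budget_decision(status) -> str:
    """Route to the exit node when any limit is hit."""
    if (
        status.tokens_used >= status.tokens_limit
        or status.time_elapsed_seconds >= status.time_limit_seconds
        or status.iterations >= status.iterations_limit
    ):
        return "exit"
    return "continue"
```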
120
+
121
+ ## Error Handling
122
+
123
+ Errors are handled at multiple levels:
124
+
125
+ 1. **Node Level**: Catch errors in individual node execution
126
+ 2. **Graph Level**: Handle errors during graph traversal
127
+ 3. **State Level**: Rollback state changes on error
128
+
129
+ Errors are logged and surfaced to the UI as error events.
130
+
131
+ ## Backward Compatibility
132
+
133
+ Graph execution is optional via feature flag:
134
+
135
+ - `USE_GRAPH_EXECUTION=true`: Use graph-based execution
136
+ - `USE_GRAPH_EXECUTION=false`: Use agent chain execution (existing)
137
+
138
+ This allows gradual migration and fallback if needed.
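
One plausible way to read the flag (a sketch; the project may centralize this in its settings module instead):

```python
import os

# Defaults to graph execution unless explicitly disabled.
USE_GRAPH_EXECUTION = os.getenv("USE_GRAPH_EXECUTION", "true").lower() == "true"
```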
139
docs/architecture/graph_orchestration.md CHANGED
@@ -2,163 +2,7 @@
2
 
3
  ## Overview
4
 
5
- DeepCritical implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
6
-
7
- ## Conversation History
8
-
9
- DeepCritical supports multi-turn conversations through Pydantic AI's native message history format. The system maintains two types of history:
10
-
11
- 1. **User Conversation History**: Multi-turn user interactions (from Gradio chat interface) stored as `list[ModelMessage]`
12
- 2. **Research Iteration History**: Internal research process state (existing `Conversation` model)
13
-
14
- ### Message History Flow
15
-
16
- ```
17
- Gradio Chat History → convert_gradio_to_message_history() → GraphOrchestrator.run(message_history)
18
-
19
- GraphExecutionContext (stores message_history)
20
-
21
- Agent Nodes (receive message_history via agent.run())
22
-
23
- WorkflowState (persists user_message_history)
24
- ```
25
-
26
- ### Usage
27
-
28
- Message history is automatically converted from Gradio format and passed through the orchestrator:
29
-
30
- ```python
31
- # In app.py - automatic conversion
32
- message_history = convert_gradio_to_message_history(history) if history else None
33
- async for event in orchestrator.run(query, message_history=message_history):
34
- yield event
35
- ```
36
-
37
- Agents receive message history through their `run()` methods:
38
-
39
- ```python
40
- # In agent execution
41
- if message_history:
42
- result = await agent.run(input_data, message_history=message_history)
43
- ```
44
-
45
- ## Graph Patterns
46
-
47
- ### Iterative Research Graph
48
-
49
- The iterative research graph follows this pattern:
50
-
51
- ```
52
- [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
53
- ↓ No ↓ Yes
54
- [Tool Selector] [Writer]
55
-
56
- [Execute Tools] → [Loop Back]
57
- ```
58
-
59
- **Node IDs**: `thinking` → `knowledge_gap` → `continue_decision` → `tool_selector`/`writer` → `execute_tools` → (loop back to `thinking`)
60
-
61
- **Special Node Handling**:
62
- - `execute_tools`: State node that uses `search_handler` to execute searches and add evidence to workflow state
63
- - `continue_decision`: Decision node that routes based on `research_complete` flag from `KnowledgeGapOutput`
64
-
65
- ### Deep Research Graph
66
-
67
- The deep research graph follows this pattern:
68
-
69
- ```
70
- [Input] → [Planner] → [Store Plan] → [Parallel Loops] → [Collect Drafts] → [Synthesizer]
71
- ↓ ↓ ↓
72
- [Loop1] [Loop2] [Loop3]
73
- ```
74
-
75
- **Node IDs**: `planner` → `store_plan` → `parallel_loops` → `collect_drafts` → `synthesizer`
76
-
77
- **Special Node Handling**:
78
- - `planner`: Agent node that creates `ReportPlan` with report outline
79
- - `store_plan`: State node that stores `ReportPlan` in context for parallel loops
80
- - `parallel_loops`: Parallel node that executes `IterativeResearchFlow` instances for each section
81
- - `collect_drafts`: State node that collects section drafts from parallel loops
82
- - `synthesizer`: Agent node that calls `LongWriterAgent.write_report()` directly with `ReportDraft`
83
-
84
- ### Deep Research
85
-
86
- ```mermaid
87
-
88
- sequenceDiagram
89
- actor User
90
- participant GraphOrchestrator
91
- participant InputParser
92
- participant GraphBuilder
93
- participant GraphExecutor
94
- participant Agent
95
- participant BudgetTracker
96
- participant WorkflowState
97
-
98
- User->>GraphOrchestrator: run(query)
99
- GraphOrchestrator->>InputParser: detect_research_mode(query)
100
- InputParser-->>GraphOrchestrator: mode (iterative/deep)
101
- GraphOrchestrator->>GraphBuilder: build_graph(mode)
102
- GraphBuilder-->>GraphOrchestrator: ResearchGraph
103
- GraphOrchestrator->>WorkflowState: init_workflow_state()
104
- GraphOrchestrator->>BudgetTracker: create_budget()
105
- GraphOrchestrator->>GraphExecutor: _execute_graph(graph)
106
-
107
- loop For each node in graph
108
- GraphExecutor->>Agent: execute_node(agent_node)
109
- Agent->>Agent: process_input
110
- Agent-->>GraphExecutor: result
111
- GraphExecutor->>WorkflowState: update_state(result)
112
- GraphExecutor->>BudgetTracker: add_tokens(used)
113
- GraphExecutor->>BudgetTracker: check_budget()
114
- alt Budget exceeded
115
- GraphExecutor->>GraphOrchestrator: emit(error_event)
116
- else Continue
117
- GraphExecutor->>GraphOrchestrator: emit(progress_event)
118
- end
119
- end
120
-
121
- GraphOrchestrator->>User: AsyncGenerator[AgentEvent]
122
-
123
- ```
124
-
125
- ### Iterative Research
126
-
127
- ```mermaid
128
- sequenceDiagram
129
- participant IterativeFlow
130
- participant ThinkingAgent
131
- participant KnowledgeGapAgent
132
- participant ToolSelector
133
- participant ToolExecutor
134
- participant JudgeHandler
135
- participant WriterAgent
136
-
137
- IterativeFlow->>IterativeFlow: run(query)
138
-
139
- loop Until complete or max_iterations
140
- IterativeFlow->>ThinkingAgent: generate_observations()
141
- ThinkingAgent-->>IterativeFlow: observations
142
-
143
- IterativeFlow->>KnowledgeGapAgent: evaluate_gaps()
144
- KnowledgeGapAgent-->>IterativeFlow: KnowledgeGapOutput
145
-
146
- alt Research complete
147
- IterativeFlow->>WriterAgent: create_final_report()
148
- WriterAgent-->>IterativeFlow: final_report
149
- else Gaps remain
150
- IterativeFlow->>ToolSelector: select_agents(gap)
151
- ToolSelector-->>IterativeFlow: AgentSelectionPlan
152
-
153
- IterativeFlow->>ToolExecutor: execute_tool_tasks()
154
- ToolExecutor-->>IterativeFlow: ToolAgentOutput[]
155
-
156
- IterativeFlow->>JudgeHandler: assess_evidence()
157
- JudgeHandler-->>IterativeFlow: should_continue
158
- end
159
- end
160
- ```
161
-
162
 
163
  ## Graph Structure
164
 
@@ -206,6 +50,25 @@ Edges define transitions between nodes:
206
  - To: Multiple target nodes
207
  - Execution: All targets run concurrently
208
 
 
 
209
 
210
  ## State Management
211
 
@@ -219,35 +82,14 @@ State transitions occur at state nodes, which update the global workflow state.
219
 
220
  ## Execution Flow
221
 
222
- 1. **Graph Construction**: Build graph from nodes and edges using `create_iterative_graph()` or `create_deep_graph()`
223
- 2. **Graph Validation**: Ensure graph is valid (no cycles, all nodes reachable) via `ResearchGraph.validate_structure()`
224
- 3. **Graph Execution**: Traverse graph from entry node using `GraphOrchestrator._execute_graph()`
225
- 4. **Node Execution**: Execute each node based on type:
226
- - **Agent Nodes**: Call `agent.run()` with transformed input
227
- - **State Nodes**: Update workflow state via `state_updater` function
228
- - **Decision Nodes**: Evaluate `decision_function` to get next node ID
229
- - **Parallel Nodes**: Execute all parallel nodes concurrently via `asyncio.gather()`
230
- 5. **Edge Evaluation**: Determine next node(s) based on edges and conditions
231
  6. **Parallel Execution**: Use `asyncio.gather()` for parallel nodes
232
- 7. **State Updates**: Update state at state nodes via `GraphExecutionContext.update_state()`
233
- 8. **Event Streaming**: Yield `AgentEvent` objects during execution for UI
234
-
235
- ### GraphExecutionContext
236
-
237
- The `GraphExecutionContext` class manages execution state during graph traversal:
238
-
239
- - **State**: Current `WorkflowState` instance
240
- - **Budget Tracker**: `BudgetTracker` instance for budget enforcement
241
- - **Node Results**: Dictionary storing results from each node execution
242
- - **Visited Nodes**: Set of node IDs that have been executed
243
- - **Current Node**: ID of the node currently being executed
244
-
245
- Methods:
246
- - `set_node_result(node_id, result)`: Store result from node execution
247
- - `get_node_result(node_id)`: Retrieve stored result
248
- - `has_visited(node_id)`: Check if node was visited
249
- - `mark_visited(node_id)`: Mark node as visited
250
- - `update_state(updater, data)`: Update workflow state
251
 
252
  ## Conditional Routing
253
 
@@ -298,5 +140,20 @@ This allows gradual migration and fallback if needed.
298
  ## See Also
299
 
300
  - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
 
301
  - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
302
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
 
 
2
 
3
  ## Overview
4
 
5
+ Phase 4 implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
 
 
6
 
7
  ## Graph Structure
8
 
 
50
  - To: Multiple target nodes
51
  - Execution: All targets run concurrently
52
 
53
+ ## Graph Patterns
54
+
55
+ ### Iterative Research Graph
56
+
57
+ ```
58
+ [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
59
+ ↓ No ↓ Yes
60
+ [Tool Selector] [Writer]
61
+
62
+ [Execute Tools] → [Loop Back]
63
+ ```
64
+
65
+ ### Deep Research Graph
66
+
67
+ ```
68
+ [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
69
+ ↓ ↓ ↓
70
+ [Loop1] [Loop2] [Loop3]
71
+ ```
72
 
73
  ## State Management
74
 
 
82
 
83
  ## Execution Flow
84
 
85
+ 1. **Graph Construction**: Build graph from nodes and edges
86
+ 2. **Graph Validation**: Ensure graph is valid (no cycles, all nodes reachable)
87
+ 3. **Graph Execution**: Traverse graph from entry node
88
+ 4. **Node Execution**: Execute each node based on type
89
+ 5. **Edge Evaluation**: Determine next node(s) based on edges
 
 
 
 
90
  6. **Parallel Execution**: Use `asyncio.gather()` for parallel nodes
91
+ 7. **State Updates**: Update state at state nodes
92
+ 8. **Event Streaming**: Yield events during execution for UI
 
 
 
 
93
 
94
  ## Conditional Routing
95
 
 
140
  ## See Also
141
 
142
  - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
143
+ - [Workflows](workflows.md) - Workflow diagrams and patterns
144
  - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
145
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
146
docs/architecture/middleware.md CHANGED
@@ -18,20 +18,22 @@ DeepCritical uses middleware for state management, budget tracking, and workflow
18
  - `embedding_service: Any`: Embedding service for semantic search
19
 
20
  **Methods**:
21
- - `add_evidence(new_evidence: list[Evidence]) -> int`: Adds evidence with URL-based deduplication. Returns the number of new items added (excluding duplicates).
22
- - `async search_related(query: str, n_results: int = 5) -> list[Evidence]`: Semantic search for related evidence using embedding service
23
 
24
  **Initialization**:
 
 
25
 
26
- <!--codeinclude-->
27
- [Initialize Workflow State](../src/middleware/state_machine.py) start_line:98 end_line:110
28
- <!--/codeinclude-->
29
 
30
  **Access**:
 
 
31
 
32
- <!--codeinclude-->
33
- [Get Workflow State](../src/middleware/state_machine.py) start_line:115 end_line:129
34
- <!--/codeinclude-->
35
 
36
  ## Workflow Manager
37
 
@@ -40,10 +42,10 @@ DeepCritical uses middleware for state management, budget tracking, and workflow
40
  **Purpose**: Coordinates parallel research loops
41
 
42
  **Methods**:
43
- - `async add_loop(loop_id: str, query: str) -> ResearchLoop`: Add a new research loop to manage
44
- - `async run_loops_parallel(loop_configs: list[dict], loop_func: Callable, judge_handler: Any | None = None, budget_tracker: Any | None = None) -> list[Any]`: Run multiple research loops in parallel. Takes configuration dicts and a loop function.
45
- - `async update_loop_status(loop_id: str, status: LoopStatus, error: str | None = None)`: Update loop status
46
- - `async sync_loop_evidence_to_state(loop_id: str)`: Synchronize evidence from a specific loop to global state
47
 
48
  **Features**:
49
  - Uses `asyncio.gather()` for parallel execution
@@ -56,22 +58,9 @@ DeepCritical uses middleware for state management, budget tracking, and workflow
56
  from src.middleware.workflow_manager import WorkflowManager
57
 
58
  manager = WorkflowManager()
59
- await manager.add_loop("loop1", "Research query 1")
60
- await manager.add_loop("loop2", "Research query 2")
61
-
62
- async def run_research(config: dict) -> str:
63
- loop_id = config["loop_id"]
64
- query = config["query"]
65
- # ... research logic ...
66
- return "report"
67
-
68
- results = await manager.run_loops_parallel(
69
- loop_configs=[
70
- {"loop_id": "loop1", "query": "Research query 1"},
71
- {"loop_id": "loop2", "query": "Research query 2"},
72
- ],
73
- loop_func=run_research,
74
- )
75
  ```
76
 
77
  ## Budget Tracker
@@ -86,13 +75,13 @@ results = await manager.run_loops_parallel(
86
  - **Iterations**: Number of iterations
87
 
88
  **Methods**:
89
- - `create_budget(loop_id: str, tokens_limit: int = 100000, time_limit_seconds: float = 600.0, iterations_limit: int = 10) -> BudgetStatus`: Create a budget for a specific loop
90
- - `add_tokens(loop_id: str, tokens: int)`: Add token usage to a loop's budget
91
- - `start_timer(loop_id: str)`: Start time tracking for a loop
92
- - `update_timer(loop_id: str)`: Update elapsed time for a loop
93
- - `increment_iteration(loop_id: str)`: Increment iteration count for a loop
94
- - `check_budget(loop_id: str) -> tuple[bool, str]`: Check if a loop's budget has been exceeded. Returns (exceeded: bool, reason: str)
95
- - `can_continue(loop_id: str) -> bool`: Check if a loop can continue based on budget
96
 
97
  **Token Estimation**:
98
  - `estimate_tokens(text: str) -> int`: ~4 chars per token
@@ -104,20 +93,13 @@ from src.middleware.budget_tracker import BudgetTracker
104
 
105
  tracker = BudgetTracker()
106
  budget = tracker.create_budget(
107
- loop_id="research_loop",
108
- tokens_limit=100000,
109
  time_limit_seconds=600,
110
  iterations_limit=10
111
  )
112
- tracker.start_timer("research_loop")
113
  # ... research operations ...
114
- tracker.add_tokens("research_loop", 5000)
115
- tracker.update_timer("research_loop")
116
- exceeded, reason = tracker.check_budget("research_loop")
117
- if exceeded:
118
- # Budget exceeded, stop research
119
- pass
120
- if not tracker.can_continue("research_loop"):
121
  # Budget exceeded, stop research
122
  pass
123
  ```
@@ -144,3 +126,13 @@ All middleware components use `ContextVar` for thread-safe isolation:
144
  - [Orchestrators](orchestrators.md) - How middleware is used in orchestration
145
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
146
  - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
 
 
 
18
  - `embedding_service: Any`: Embedding service for semantic search
19
 
20
  **Methods**:
21
+ - `add_evidence(evidence: Evidence)`: Adds evidence with URL-based deduplication
22
+ - `async search_related(query: str, top_k: int = 5) -> list[Evidence]`: Semantic search
23
 
24
  **Initialization**:
25
+ ```python
26
+ from src.middleware.state_machine import init_workflow_state
27
 
28
+ init_workflow_state(embedding_service)
29
+ ```
 
30
 
31
  **Access**:
32
+ ```python
33
+ from src.middleware.state_machine import get_workflow_state
34
 
35
+ state = get_workflow_state() # Auto-initializes if missing
36
+ ```
 
37
 
38
  ## Workflow Manager
39
 
 
42
  **Purpose**: Coordinates parallel research loops
43
 
44
  **Methods**:
45
+ - `add_loop(loop: ResearchLoop)`: Add a research loop to manage
46
+ - `async run_loops_parallel() -> list[ResearchLoop]`: Run all loops in parallel
47
+ - `update_loop_status(loop_id: str, status: str)`: Update loop status
48
+ - `sync_loop_evidence_to_state()`: Synchronize evidence from loops to global state
49
 
50
  **Features**:
51
  - Uses `asyncio.gather()` for parallel execution
 
58
  from src.middleware.workflow_manager import WorkflowManager
59
 
60
  manager = WorkflowManager()
61
+ manager.add_loop(loop1)
62
+ manager.add_loop(loop2)
63
+ completed_loops = await manager.run_loops_parallel()
 
 
64
  ```
65
 
66
  ## Budget Tracker
 
75
  - **Iterations**: Number of iterations
76
 
77
  **Methods**:
78
+ - `create_budget(token_limit, time_limit_seconds, iterations_limit) -> BudgetStatus`
79
+ - `add_tokens(tokens: int)`: Add token usage
80
+ - `start_timer()`: Start time tracking
81
+ - `update_timer()`: Update elapsed time
82
+ - `increment_iteration()`: Increment iteration count
83
+ - `check_budget() -> BudgetStatus`: Check current budget status
84
+ - `can_continue() -> bool`: Check if research can continue
85
 
86
  **Token Estimation**:
87
  - `estimate_tokens(text: str) -> int`: ~4 chars per token
 
93
 
94
  tracker = BudgetTracker()
95
  budget = tracker.create_budget(
96
+ token_limit=100000,
 
97
  time_limit_seconds=600,
98
  iterations_limit=10
99
  )
100
+ tracker.start_timer()
101
  # ... research operations ...
102
+ if not tracker.can_continue():
 
 
103
  # Budget exceeded, stop research
104
  pass
105
  ```
 
126
  - [Orchestrators](orchestrators.md) - How middleware is used in orchestration
127
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
128
  - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
129
+
130
+
131
+
132
+
133
+
134
+
135
+
136
+
137
+
138
+
docs/architecture/orchestrators.md CHANGED
@@ -23,10 +23,19 @@ DeepCritical supports multiple orchestration patterns for research workflows.
23
  - Iterates until research complete or constraints met
24
 
25
  **Usage**:
 
 
26
 
27
- <!--codeinclude-->
28
- [IterativeResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:57 end_line:80
29
- <!--/codeinclude-->
 
 
30
 
31
  ### DeepResearchFlow
32
 
@@ -46,10 +55,19 @@ DeepCritical supports multiple orchestration patterns for research workflows.
46
  - Supports graph execution and agent chains
47
 
48
  **Usage**:
 
 
49
 
50
- <!--codeinclude-->
51
- [DeepResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:709 end_line:728
52
- <!--/codeinclude-->
 
53
 
54
  ## Graph Orchestrator
55
 
@@ -58,10 +76,9 @@ DeepCritical supports multiple orchestration patterns for research workflows.
58
  **Purpose**: Graph-based execution using Pydantic AI agents as nodes
59
 
60
  **Features**:
61
- - Uses graph execution (`use_graph=True`) or agent chains (`use_graph=False`) as fallback
62
  - Routes based on research mode (iterative/deep/auto)
63
  - Streams `AgentEvent` objects for UI
64
- - Uses `GraphExecutionContext` to manage execution state
65
 
66
  **Node Types**:
67
  - **Agent Nodes**: Execute Pydantic AI agents
@@ -74,22 +91,6 @@ DeepCritical supports multiple orchestration patterns for research workflows.
74
  - **Conditional Edges**: Traversed based on condition
75
  - **Parallel Edges**: Used for parallel execution branches
76
 
77
- **Special Node Handling**:
78
-
79
- The `GraphOrchestrator` has special handling for certain nodes:
80
-
81
- - **`execute_tools` node**: State node that uses `search_handler` to execute searches and add evidence to workflow state
82
- - **`parallel_loops` node**: Parallel node that executes `IterativeResearchFlow` instances for each section in deep research mode
83
- - **`synthesizer` node**: Agent node that calls `LongWriterAgent.write_report()` directly with `ReportDraft` instead of using `agent.run()`
84
- - **`writer` node**: Agent node that calls `WriterAgent.write_report()` directly with findings instead of using `agent.run()`
85
-
86
- **GraphExecutionContext**:
87
-
88
- The orchestrator uses `GraphExecutionContext` to manage execution state:
89
- - Tracks current node, visited nodes, and node results
90
- - Manages workflow state and budget tracker
91
- - Provides methods to store and retrieve node execution results
92
-
93
  ## Orchestrator Factory
94
 
95
  **File**: `src/orchestrator_factory.py`
@@ -102,10 +103,16 @@ The orchestrator uses `GraphExecutionContext` to manage execution state:
102
  - **Auto-detect**: Chooses based on API key availability
103
 
104
  **Usage**:
 
 
105
 
106
- <!--codeinclude-->
107
- [Create Orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:66
108
- <!--/codeinclude-->
 
 
 
 
109
 
110
  ## Magentic Orchestrator
111
 
@@ -116,26 +123,14 @@ The orchestrator uses `GraphExecutionContext` to manage execution state:
116
  **Features**:
117
  - Uses `agent-framework-core`
118
  - ChatAgent pattern with internal LLMs per agent
119
- - `MagenticBuilder` with participants:
120
- - `searcher`: SearchAgent (wraps SearchHandler)
121
- - `hypothesizer`: HypothesisAgent (generates hypotheses)
122
- - `judge`: JudgeAgent (evaluates evidence)
123
- - `reporter`: ReportAgent (generates final report)
124
- - Manager orchestrates agents via chat client (OpenAI or HuggingFace)
125
- - Event-driven: converts Magentic events to `AgentEvent` for UI streaming via `_process_event()` method
126
- - Supports max rounds, stall detection, and reset handling
127
-
128
- **Event Processing**:
129
-
130
- The orchestrator processes Magentic events and converts them to `AgentEvent`:
131
- - `MagenticOrchestratorMessageEvent` → `AgentEvent` with type based on message content
132
- - `MagenticAgentMessageEvent` → `AgentEvent` with type based on agent name
133
- - `MagenticAgentDeltaEvent` → `AgentEvent` for streaming updates
134
- - `MagenticFinalResultEvent` → `AgentEvent` with type "complete"
135
 
136
  **Requirements**:
137
  - `agent-framework-core` package
138
- - OpenAI API key or HuggingFace authentication
139
 
140
  ## Hierarchical Orchestrator
141
 
@@ -164,9 +159,13 @@ The orchestrator processes Magentic events and converts them to `AgentEvent`:
164
 
165
  All orchestrators must initialize workflow state:
166
 
167
- <!--codeinclude-->
168
- [Initialize Workflow State](../src/middleware/state_machine.py) start_line:98 end_line:112
169
- <!--/codeinclude-->
 
 
 
 
170
 
171
  ## Event Streaming
172
 
@@ -174,28 +173,26 @@ All orchestrators yield `AgentEvent` objects:
174
 
175
  **Event Types**:
176
  - `started`: Research started
177
- - `searching`: Search in progress
178
  - `search_complete`: Search completed
179
- - `judging`: Evidence evaluation in progress
180
  - `judge_complete`: Evidence evaluation completed
181
- - `looping`: Iteration in progress
182
  - `hypothesizing`: Generating hypotheses
183
- - `analyzing`: Statistical analysis in progress
184
- - `analysis_complete`: Statistical analysis completed
185
  - `synthesizing`: Synthesizing results
186
  - `complete`: Research completed
187
  - `error`: Error occurred
188
- - `streaming`: Streaming update (delta events)
189
 
190
  **Event Structure**:
191
-
192
- <!--codeinclude-->
193
- [AgentEvent Model](../src/utils/models.py) start_line:104 end_line:126
194
- <!--/codeinclude-->
 
 
195
 
196
  ## See Also
197
 
198
- - [Graph Orchestration](graph_orchestration.md) - Graph-based execution details
 
 
199
  - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
200
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
201
 
 
23
  - Iterates until research complete or constraints met
24
 
25
  **Usage**:
26
+ ```python
27
+ from src.orchestrator.research_flow import IterativeResearchFlow
28
 
29
+ flow = IterativeResearchFlow(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     use_graph=False,
+ )
+
+ async for event in flow.run(query):
+     # Handle events
+     pass
38
+ ```
39
 
40
  ### DeepResearchFlow
41
 
 
55
  - Supports graph execution and agent chains
56
 
57
  **Usage**:
58
+ ```python
59
+ from src.orchestrator.research_flow import DeepResearchFlow
60
+
61
+ flow = DeepResearchFlow(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     use_graph=True,
+ )
+
+ async for event in flow.run(query):
+     # Handle events
+     pass
70
+ ```
71
 
72
  ## Graph Orchestrator
73
 
 
76
  **Purpose**: Graph-based execution using Pydantic AI agents as nodes
77
 
78
  **Features**:
79
+ - Uses Pydantic AI Graphs (when available) or agent chains (fallback)
80
  - Routes based on research mode (iterative/deep/auto)
81
  - Streams `AgentEvent` objects for UI
 
82
 
83
  **Node Types**:
84
  - **Agent Nodes**: Execute Pydantic AI agents
 
91
  - **Conditional Edges**: Traversed based on condition
92
  - **Parallel Edges**: Used for parallel execution branches
93
 
 
 
94
  ## Orchestrator Factory
95
 
96
  **File**: `src/orchestrator_factory.py`
 
103
  - **Auto-detect**: Chooses based on API key availability
104
 
105
  **Usage**:
106
+ ```python
107
+ from src.orchestrator_factory import create_orchestrator
108
 
109
+ orchestrator = create_orchestrator(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     config={},
+     mode="advanced",  # or "simple" or None for auto-detect
+ )
115
+ ```
116
 
117
  ## Magentic Orchestrator
118
 
 
123
  **Features**:
124
  - Uses `agent-framework-core`
125
  - ChatAgent pattern with internal LLMs per agent
126
+ - `MagenticBuilder` with participants: searcher, hypothesizer, judge, reporter
127
+ - Manager orchestrates agents via `OpenAIChatClient`
128
+ - Requires OpenAI API key (function calling support)
129
+ - Event-driven: converts Magentic events to `AgentEvent` for UI streaming
 
 
130
 
131
  **Requirements**:
132
  - `agent-framework-core` package
133
+ - OpenAI API key
134
 
135
  ## Hierarchical Orchestrator
136
 
 
159
 
160
  All orchestrators must initialize workflow state:
161
 
162
+ ```python
163
+ from src.middleware.state_machine import init_workflow_state
164
+ from src.services.embeddings import get_embedding_service
165
+
166
+ embedding_service = get_embedding_service()
167
+ init_workflow_state(embedding_service)
168
+ ```
169
 
170
  ## Event Streaming
171
 
 
173
 
174
  **Event Types**:
175
  - `started`: Research started
 
176
  - `search_complete`: Search completed
 
177
  - `judge_complete`: Evidence evaluation completed
 
178
  - `hypothesizing`: Generating hypotheses
 
 
179
  - `synthesizing`: Synthesizing results
180
  - `complete`: Research completed
181
  - `error`: Error occurred
 
182
 
183
  **Event Structure**:
184
+ ```python
185
+ from typing import Any
+
+ class AgentEvent:
+     type: str
+     iteration: int | None
+     data: dict[str, Any]
189
+ ```
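+
+ A consumer typically dispatches on `type` while iterating the stream, e.g. (a sketch; `orchestrator.run(query)` mirrors the flow examples above):
+
+ ```python
+ import structlog
+
+ logger = structlog.get_logger()
+
+ async def consume(orchestrator, query: str) -> None:
+     async for event in orchestrator.run(query):
+         if event.type == "complete":
+             logger.info("research_complete", **event.data)
+         elif event.type == "error":
+             logger.error("research_failed", **event.data)
+         else:
+             logger.info("progress", type=event.type, iteration=event.iteration)
+ ```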
190
 
191
  ## See Also
192
 
193
+ - [Graph Orchestration](graph-orchestration.md) - Graph-based execution details
194
+ - [Graph Orchestration (Detailed)](graph_orchestration.md) - Detailed graph architecture
195
+ - [Workflows](workflows.md) - Workflow diagrams and patterns
196
  - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
197
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
198
 
docs/architecture/services.md CHANGED
@@ -10,18 +10,17 @@ DeepCritical provides several services for embeddings, RAG, and statistical anal
10
 
11
  **Features**:
12
  - **No API Key Required**: Uses local sentence-transformers models
13
- - **Async-Safe**: All operations use `run_in_executor()` to avoid blocking the event loop
14
- - **ChromaDB Storage**: In-memory vector storage for embeddings
15
- - **Deduplication**: 0.9 similarity threshold by default (90% similarity = duplicate, configurable)
16
 
17
  **Model**: Configurable via `settings.local_embedding_model` (default: `all-MiniLM-L6-v2`)
18
 
19
  **Methods**:
20
- - `async def embed(text: str) -> list[float]`: Generate embeddings (async-safe via `run_in_executor()`)
21
- - `async def embed_batch(texts: list[str]) -> list[list[float]]`: Batch embedding (more efficient)
22
- - `async def add_evidence(evidence_id: str, content: str, metadata: dict[str, Any]) -> None`: Add evidence to vector store
23
- - `async def search_similar(query: str, n_results: int = 5) -> list[dict[str, Any]]`: Find semantically similar evidence
24
- - `async def deduplicate(new_evidence: list[Evidence], threshold: float = 0.9) -> list[Evidence]`: Remove semantically duplicate evidence
25
 
26
  **Usage**:
27
  ```python
@@ -33,21 +32,15 @@ embedding = await service.embed("text to embed")
33
 
34
  ## LlamaIndex RAG Service
35
 
36
- **File**: `src/services/llamaindex_rag.py`
37
 
38
  **Purpose**: Retrieval-Augmented Generation using LlamaIndex
39
 
40
  **Features**:
41
- - **Multiple Embedding Providers**: OpenAI embeddings (requires `OPENAI_API_KEY`) or local sentence-transformers (no API key)
42
- - **Multiple LLM Providers**: HuggingFace LLM (preferred) or OpenAI LLM (fallback) for query synthesis
43
- - **ChromaDB Storage**: Vector database for document storage (supports in-memory mode)
44
  - **Metadata Preservation**: Preserves source, title, URL, date, authors
45
- - **Lazy Initialization**: Graceful fallback if dependencies not available
46
-
47
- **Initialization Parameters**:
48
- - `use_openai_embeddings: bool | None`: Force OpenAI embeddings (None = auto-detect)
49
- - `use_in_memory: bool`: Use in-memory ChromaDB client (useful for tests)
50
- - `oauth_token: str | None`: Optional OAuth token from HuggingFace login (takes priority over env vars)
51
 
52
  **Methods**:
53
  - `async def ingest_evidence(evidence: list[Evidence]) -> None`: Ingest evidence into RAG
@@ -56,13 +49,9 @@ embedding = await service.embed("text to embed")
56
 
57
  **Usage**:
58
  ```python
59
- from src.services.llamaindex_rag import get_rag_service
60
 
61
- service = get_rag_service(
62
- use_openai_embeddings=False, # Use local embeddings
63
- use_in_memory=True, # Use in-memory ChromaDB
64
- oauth_token=token # Optional HuggingFace token
65
- )
66
  if service:
67
  documents = await service.retrieve("query", top_k=5)
68
  ```
@@ -103,19 +92,13 @@ result = await analyzer.analyze(
103
 
104
  ## Singleton Pattern
105
 
106
- Services use singleton patterns for lazy initialization:
107
-
108
- **EmbeddingService**: Uses a global variable pattern:
109
-
110
- <!--codeinclude-->
111
- [EmbeddingService Singleton](../src/services/embeddings.py) start_line:164 end_line:172
112
- <!--/codeinclude-->
113
 
114
- **LlamaIndexRAGService**: Direct instantiation (no caching):
115
-
116
- <!--codeinclude-->
117
- [LlamaIndexRAGService Factory](../src/services/llamaindex_rag.py) start_line:440 end_line:466
118
- <!--/codeinclude-->
119
 
120
  This ensures:
121
  - Single instance per process
@@ -144,3 +127,12 @@ if settings.has_openai_key:
144
  - [API Reference - Services](../api/services.md) - API documentation
145
  - [Configuration](../configuration/index.md) - Service configuration
146
 
 
 
 
 
 
 
 
 
 
 
10
 
11
  **Features**:
12
  - **No API Key Required**: Uses local sentence-transformers models
13
+ - **Async-Safe**: All operations use `run_in_executor()` to avoid blocking
14
+ - **ChromaDB Storage**: Vector storage for embeddings
15
+ - **Deduplication**: 0.85 similarity threshold (85% similarity = duplicate)
16
 
17
  **Model**: Configurable via `settings.local_embedding_model` (default: `all-MiniLM-L6-v2`)
18
 
19
  **Methods**:
20
+ - `async def embed(text: str) -> list[float]`: Generate embeddings
21
+ - `async def embed_batch(texts: list[str]) -> list[list[float]]`: Batch embedding
22
+ - `async def similarity(text1: str, text2: str) -> float`: Calculate similarity
23
+ - `async def find_duplicates(texts: list[str], threshold: float = 0.85) -> list[tuple[int, int]]`: Find duplicates
 
24
 
25
  **Usage**:
26
  ```python
 
32
 
33
  ## LlamaIndex RAG Service
34
 
35
+ **File**: `src/services/rag.py`
36
 
37
  **Purpose**: Retrieval-Augmented Generation using LlamaIndex
38
 
39
  **Features**:
40
+ - **OpenAI Embeddings**: Requires `OPENAI_API_KEY`
41
+ - **ChromaDB Storage**: Vector database for document storage
 
42
  - **Metadata Preservation**: Preserves source, title, URL, date, authors
43
+ - **Lazy Initialization**: Graceful fallback if OpenAI key not available
 
 
44
 
45
  **Methods**:
46
  - `async def ingest_evidence(evidence: list[Evidence]) -> None`: Ingest evidence into RAG
 
49
 
50
  **Usage**:
51
  ```python
52
+ from src.services.rag import get_rag_service
53
 
54
+ service = get_rag_service()
 
 
55
  if service:
56
  documents = await service.retrieve("query", top_k=5)
57
  ```
 
92
 
93
  ## Singleton Pattern
94
 
95
+ All services use the singleton pattern with `@lru_cache(maxsize=1)`:
 
 
96
 
97
+ ```python
98
+ from functools import lru_cache
+
+ @lru_cache(maxsize=1)
+ def get_embedding_service() -> EmbeddingService:
+     return EmbeddingService()
101
+ ```
102
 
103
  This ensures:
104
  - Single instance per process
 
127
  - [API Reference - Services](../api/services.md) - API documentation
128
  - [Configuration](../configuration/index.md) - Service configuration
129
 
130
+
131
+
132
+
133
+
134
+
135
+
136
+
137
+
138
+
docs/architecture/tools.md CHANGED
@@ -6,17 +6,30 @@ DeepCritical implements a protocol-based search tool system for retrieving evide
6
 
7
  All tools implement the `SearchTool` protocol from `src/tools/base.py`:
8
 
9
- <!--codeinclude-->
10
- [SearchTool Protocol](../src/tools/base.py) start_line:8 end_line:31
11
- <!--/codeinclude-->
 
 
12
 
13
  ## Rate Limiting
14
 
15
  All tools use the `@retry` decorator from tenacity:
16
 
17
- <!--codeinclude-->
18
- [Retry Decorator Pattern](../src/tools/pubmed.py) start_line:46 end_line:50
19
- <!--/codeinclude-->
 
 
20
 
21
  Tools with API rate limits implement `_rate_limit()` method and use shared rate limiters from `src/tools/rate_limiter.py`.
22
 
@@ -117,23 +130,11 @@ Missing fields are handled gracefully with defaults.
117
 
118
  **Purpose**: Orchestrates parallel searches across multiple tools
119
 
120
- **Initialization Parameters**:
121
- - `tools: list[SearchTool]`: List of search tools to use
122
- - `timeout: float = 30.0`: Timeout for each search in seconds
123
- - `include_rag: bool = False`: Whether to include RAG tool in searches
124
- - `auto_ingest_to_rag: bool = True`: Whether to automatically ingest results into RAG
125
- - `oauth_token: str | None = None`: Optional OAuth token from HuggingFace login (for RAG LLM)
126
-
127
- **Methods**:
128
- - `async def execute(query: str, max_results_per_tool: int = 10) -> SearchResult`: Execute search across all tools in parallel
129
-
130
  **Features**:
131
- - Uses `asyncio.gather()` with `return_exceptions=True` for parallel execution
132
- - Aggregates results into `SearchResult` with evidence and metadata
133
- - Handles tool failures gracefully (continues with other tools)
134
  - Deduplicates results by URL
135
- - Automatically ingests results into RAG if `auto_ingest_to_rag=True`
136
- - Can add RAG tool dynamically via `add_rag_tool()` method
137
 
138
  ## Tool Registration
139
 
@@ -143,21 +144,14 @@ Tools are registered in the search handler:
143
  from src.tools.pubmed import PubMedTool
144
  from src.tools.clinicaltrials import ClinicalTrialsTool
145
  from src.tools.europepmc import EuropePMCTool
146
- from src.tools.search_handler import SearchHandler
147
 
148
  search_handler = SearchHandler(
149
  tools=[
150
  PubMedTool(),
151
  ClinicalTrialsTool(),
152
  EuropePMCTool(),
153
- ],
154
- include_rag=True, # Include RAG tool for semantic search
155
- auto_ingest_to_rag=True, # Automatically ingest results into RAG
156
- oauth_token=token # Optional HuggingFace token for RAG LLM
157
  )
158
-
159
- # Execute search
160
- result = await search_handler.execute("query", max_results_per_tool=10)
161
  ```
162
 
163
  ## See Also
@@ -165,3 +159,13 @@ result = await search_handler.execute("query", max_results_per_tool=10)
165
  - [Services](services.md) - RAG and embedding services
166
  - [API Reference - Tools](../api/tools.md) - API documentation
167
  - [Contributing - Implementation Patterns](../contributing/implementation-patterns.md) - Development guidelines
 
 
 
6
 
7
  All tools implement the `SearchTool` protocol from `src/tools/base.py`:
8
 
9
+ ```python
10
+ from typing import Protocol
+
+ from src.utils.models import Evidence  # Evidence model (module referenced elsewhere in these docs)
+
+ class SearchTool(Protocol):
+     @property
+     def name(self) -> str: ...
+
+     async def search(
+         self,
+         query: str,
+         max_results: int = 10,
+     ) -> list[Evidence]: ...
19
+ ```
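+
+ Because this is a structural protocol, any object with the same shape satisfies it; a minimal stub (illustrative only) looks like:
+
+ ```python
+ class StaticTool:
+     """Illustrative stub that satisfies SearchTool structurally."""
+
+     @property
+     def name(self) -> str:
+         return "static"
+
+     async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
+         return []  # a real tool would query an external API here
+ ```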
20
 
21
  ## Rate Limiting
22
 
23
  All tools use the `@retry` decorator from tenacity:
24
 
25
+ ```python
26
+ from tenacity import retry, stop_after_attempt, wait_exponential
+
+ @retry(
+     stop=stop_after_attempt(3),
+     wait=wait_exponential(...)
+ )
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
+     ...  # Implementation
32
+ ```
33
 
34
  Tools with API rate limits implement `_rate_limit()` method and use shared rate limiters from `src/tools/rate_limiter.py`.
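
  For example, `PubMedTool.__init__` obtains its shared limiter via `get_pubmed_limiter(api_key)` (see `src/tools/pubmed.py`); a `_rate_limit()` helper might then look like this sketch (the limiter's `acquire()` interface is an assumption):

  ```python
  from src.tools.rate_limiter import get_pubmed_limiter

  class RateLimitedTool:
      def __init__(self, api_key: str | None = None) -> None:
          self._limiter = get_pubmed_limiter(api_key)

      async def _rate_limit(self) -> None:
          # Assumed interface: block until a request slot is available.
          await self._limiter.acquire()
  ```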
35
 
 
130
 
131
  **Purpose**: Orchestrates parallel searches across multiple tools
132
 
 
 
133
  **Features**:
134
+ - Uses `asyncio.gather()` with `return_exceptions=True`
135
+ - Aggregates results into `SearchResult`
136
+ - Handles tool failures gracefully
137
  - Deduplicates results by URL
 
 
138
 
139
  ## Tool Registration
140
 
 
144
  from src.tools.pubmed import PubMedTool
145
  from src.tools.clinicaltrials import ClinicalTrialsTool
146
  from src.tools.europepmc import EuropePMCTool
+ from src.tools.search_handler import SearchHandler
 
147
 
148
  search_handler = SearchHandler(
149
  tools=[
150
  PubMedTool(),
151
  ClinicalTrialsTool(),
152
  EuropePMCTool(),
153
+ ]
 
 
 
154
  )
 
 
 
155
  ```
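
  A search can then be executed across all registered tools in parallel (a sketch; `execute` and `max_results_per_tool` follow the handler description above, and the `SearchResult.evidence` attribute is an assumption):

  ```python
  result = await search_handler.execute("CRISPR off-target effects", max_results_per_tool=10)
  for evidence in result.evidence:
      print(evidence.url)
  ```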
156
 
157
  ## See Also
 
159
  - [Services](services.md) - RAG and embedding services
160
  - [API Reference - Tools](../api/tools.md) - API documentation
161
  - [Contributing - Implementation Patterns](../contributing/implementation-patterns.md) - Development guidelines
162
+
163
+
164
+
165
+
166
+
167
+
168
+
169
+
170
+
171
+
docs/architecture/workflow-diagrams.md CHANGED
@@ -627,10 +627,23 @@ gantt
627
  ## Implementation Highlights
628
 
629
  **Simple 4-Agent Setup:**
630
-
631
- <!--codeinclude-->
632
- [Magentic Workflow Builder](../src/orchestrator_magentic.py) start_line:72 end_line:99
633
- <!--/codeinclude-->
 
 
 
634
 
635
  **Manager handles quality assessment in its instructions:**
636
  - Checks hypothesis quality (testable, novel, clear)
@@ -651,5 +664,7 @@ No separate Judge Agent needed - manager does it all!
651
  ## See Also
652
 
653
  - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
654
- - [Graph Orchestration](graph_orchestration.md) - Graph-based execution overview
 
 
655
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
 
627
  ## Implementation Highlights
628
 
629
  **Simple 4-Agent Setup:**
630
+ ```python
631
+ workflow = (
+     MagenticBuilder()
+     .participants(
+         hypothesis=HypothesisAgent(tools=[background_tool]),
+         search=SearchAgent(tools=[web_search, rag_tool]),
+         analysis=AnalysisAgent(tools=[code_execution]),
+         report=ReportAgent(tools=[code_execution, visualization]),
+     )
+     .with_standard_manager(
+         chat_client=AnthropicClient(model="claude-sonnet-4"),
+         max_round_count=15,  # Prevent infinite loops
+         max_stall_count=3,   # Detect stuck workflows
+     )
+     .build()
+ )
646
+ ```
647
 
648
  **Manager handles quality assessment in its instructions:**
649
  - Checks hypothesis quality (testable, novel, clear)
 
664
  ## See Also
665
 
666
  - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
667
+ - [Graph Orchestration](graph-orchestration.md) - Graph-based execution overview
668
+ - [Graph Orchestration (Detailed)](graph_orchestration.md) - Detailed graph architecture
669
+ - [Workflows](workflows.md) - Workflow patterns summary
670
  - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
docs/architecture/workflows.md ADDED
@@ -0,0 +1,662 @@
 
 
1
+ # DeepCritical Workflow - Simplified Magentic Architecture
2
+
3
+ > **Architecture Pattern**: Microsoft Magentic Orchestration
4
+ > **Design Philosophy**: Simple, dynamic, manager-driven coordination
5
+ > **Key Innovation**: Intelligent manager replaces rigid sequential phases
6
+
7
+ ---
8
+
9
+ ## 1. High-Level Magentic Workflow
10
+
11
+ ```mermaid
12
+ flowchart TD
13
+ Start([User Query]) --> Manager[Magentic Manager<br/>Plan • Select • Assess • Adapt]
14
+
15
+ Manager -->|Plans| Task1[Task Decomposition]
16
+ Task1 --> Manager
17
+
18
+ Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]
19
+ Manager -->|Selects & Executes| SearchAgent[Search Agent]
20
+ Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]
21
+ Manager -->|Selects & Executes| ReportAgent[Report Agent]
22
+
23
+ HypAgent -->|Results| Manager
24
+ SearchAgent -->|Results| Manager
25
+ AnalysisAgent -->|Results| Manager
26
+ ReportAgent -->|Results| Manager
27
+
28
+ Manager -->|Assesses Quality| Decision{Good Enough?}
29
+ Decision -->|No - Refine| Manager
30
+ Decision -->|No - Different Agent| Manager
31
+ Decision -->|No - Stalled| Replan[Reset Plan]
32
+ Replan --> Manager
33
+
34
+ Decision -->|Yes| Synthesis[Synthesize Final Result]
35
+ Synthesis --> Output([Research Report])
36
+
37
+ style Start fill:#e1f5e1
38
+ style Manager fill:#ffe6e6
39
+ style HypAgent fill:#fff4e6
40
+ style SearchAgent fill:#fff4e6
41
+ style AnalysisAgent fill:#fff4e6
42
+ style ReportAgent fill:#fff4e6
43
+ style Decision fill:#ffd6d6
44
+ style Synthesis fill:#d4edda
45
+ style Output fill:#e1f5e1
46
+ ```
47
+
48
+ ## 2. Magentic Manager: The 6-Phase Cycle
49
+
50
+ ```mermaid
51
+ flowchart LR
52
+ P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]
53
+ P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]
54
+ P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]
55
+ P4 --> Decision{Quality OK?<br/>Progress made?}
56
+ Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]
57
+ Decision -->|No| P5[5. Iteration<br/>Adjust plan<br/>Try again]
58
+ P5 --> P2
59
+ P6 --> Done([Complete])
60
+
61
+ style P1 fill:#fff4e6
62
+ style P2 fill:#ffe6e6
63
+ style P3 fill:#e6f3ff
64
+ style P4 fill:#ffd6d6
65
+ style P5 fill:#fff3cd
66
+ style P6 fill:#d4edda
67
+ style Done fill:#e1f5e1
68
+ ```
69
+
70
+ ## 3. Simplified Agent Architecture
71
+
72
+ ```mermaid
73
+ graph TB
74
+ subgraph "Orchestration Layer"
75
+ Manager[Magentic Manager<br/>• Plans workflow<br/>• Selects agents<br/>• Assesses quality<br/>• Adapts strategy]
76
+ SharedContext[(Shared Context<br/>• Hypotheses<br/>• Search Results<br/>• Analysis<br/>• Progress)]
77
+ Manager <--> SharedContext
78
+ end
79
+
80
+ subgraph "Specialist Agents"
81
+ HypAgent[Hypothesis Agent<br/>• Domain understanding<br/>• Hypothesis generation<br/>• Testability refinement]
82
+ SearchAgent[Search Agent<br/>• Multi-source search<br/>• RAG retrieval<br/>• Result ranking]
83
+ AnalysisAgent[Analysis Agent<br/>• Evidence extraction<br/>• Statistical analysis<br/>• Code execution]
84
+ ReportAgent[Report Agent<br/>• Report assembly<br/>• Visualization<br/>• Citation formatting]
85
+ end
86
+
87
+ subgraph "MCP Tools"
88
+ WebSearch[Web Search<br/>PubMed • arXiv • bioRxiv]
89
+ CodeExec[Code Execution<br/>Sandboxed Python]
90
+ RAG[RAG Retrieval<br/>Vector DB • Embeddings]
91
+ Viz[Visualization<br/>Charts • Graphs]
92
+ end
93
+
94
+ Manager -->|Selects & Directs| HypAgent
95
+ Manager -->|Selects & Directs| SearchAgent
96
+ Manager -->|Selects & Directs| AnalysisAgent
97
+ Manager -->|Selects & Directs| ReportAgent
98
+
99
+ HypAgent --> SharedContext
100
+ SearchAgent --> SharedContext
101
+ AnalysisAgent --> SharedContext
102
+ ReportAgent --> SharedContext
103
+
104
+ SearchAgent --> WebSearch
105
+ SearchAgent --> RAG
106
+ AnalysisAgent --> CodeExec
107
+ ReportAgent --> CodeExec
108
+ ReportAgent --> Viz
109
+
110
+ style Manager fill:#ffe6e6
111
+ style SharedContext fill:#ffe6f0
112
+ style HypAgent fill:#fff4e6
113
+ style SearchAgent fill:#fff4e6
114
+ style AnalysisAgent fill:#fff4e6
115
+ style ReportAgent fill:#fff4e6
116
+ style WebSearch fill:#e6f3ff
117
+ style CodeExec fill:#e6f3ff
118
+ style RAG fill:#e6f3ff
119
+ style Viz fill:#e6f3ff
120
+ ```
121
+
122
+ ## 4. Dynamic Workflow Example
123
+
124
+ ```mermaid
125
+ sequenceDiagram
126
+ participant User
127
+ participant Manager
128
+ participant HypAgent
129
+ participant SearchAgent
130
+ participant AnalysisAgent
131
+ participant ReportAgent
132
+
133
+ User->>Manager: "Research protein folding in Alzheimer's"
134
+
135
+ Note over Manager: PLAN: Generate hypotheses → Search → Analyze → Report
136
+
137
+ Manager->>HypAgent: Generate 3 hypotheses
138
+ HypAgent-->>Manager: Returns 3 hypotheses
139
+ Note over Manager: ASSESS: Good quality, proceed
140
+
141
+ Manager->>SearchAgent: Search literature for hypothesis 1
142
+ SearchAgent-->>Manager: Returns 15 papers
143
+ Note over Manager: ASSESS: Good results, continue
144
+
145
+ Manager->>SearchAgent: Search for hypothesis 2
146
+ SearchAgent-->>Manager: Only 2 papers found
147
+ Note over Manager: ASSESS: Insufficient, refine search
148
+
149
+ Manager->>SearchAgent: Refined query for hypothesis 2
150
+ SearchAgent-->>Manager: Returns 12 papers
151
+ Note over Manager: ASSESS: Better, proceed
152
+
153
+ Manager->>AnalysisAgent: Analyze evidence for all hypotheses
154
+ AnalysisAgent-->>Manager: Returns analysis with code
155
+ Note over Manager: ASSESS: Complete, generate report
156
+
157
+ Manager->>ReportAgent: Create comprehensive report
158
+ ReportAgent-->>Manager: Returns formatted report
159
+ Note over Manager: SYNTHESIZE: Combine all results
160
+
161
+ Manager->>User: Final Research Report
162
+ ```
163
+
164
+ ## 5. Manager Decision Logic
165
+
166
+ ```mermaid
167
+ flowchart TD
168
+ Start([Manager Receives Task]) --> Plan[Create Initial Plan]
169
+
170
+ Plan --> Select[Select Agent for Next Subtask]
171
+ Select --> Execute[Execute Agent]
172
+ Execute --> Collect[Collect Results]
173
+
174
+ Collect --> Assess[Assess Quality & Progress]
175
+
176
+ Assess --> Q1{Quality Sufficient?}
177
+ Q1 -->|No| Q2{Same Agent Can Fix?}
178
+ Q2 -->|Yes| Feedback[Provide Specific Feedback]
179
+ Feedback --> Execute
180
+ Q2 -->|No| Different[Try Different Agent]
181
+ Different --> Select
182
+
183
+ Q1 -->|Yes| Q3{Task Complete?}
184
+ Q3 -->|No| Q4{Making Progress?}
185
+ Q4 -->|Yes| Select
186
+ Q4 -->|No - Stalled| Replan[Reset Plan & Approach]
187
+ Replan --> Plan
188
+
189
+ Q3 -->|Yes| Synth[Synthesize Final Result]
190
+ Synth --> Done([Return Report])
191
+
192
+ style Start fill:#e1f5e1
193
+ style Plan fill:#fff4e6
194
+ style Select fill:#ffe6e6
195
+ style Execute fill:#e6f3ff
196
+ style Assess fill:#ffd6d6
197
+ style Q1 fill:#ffe6e6
198
+ style Q2 fill:#ffe6e6
199
+ style Q3 fill:#ffe6e6
200
+ style Q4 fill:#ffe6e6
201
+ style Synth fill:#d4edda
202
+ style Done fill:#e1f5e1
203
+ ```
204
+
205
+ ## 6. Hypothesis Agent Workflow
206
+
207
+ ```mermaid
208
+ flowchart LR
209
+ Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]
210
+ Domain --> Context[Retrieve Background<br/>Knowledge]
211
+ Context --> Generate[Generate 3-5<br/>Initial Hypotheses]
212
+ Generate --> Refine[Refine for<br/>Testability]
213
+ Refine --> Rank[Rank by<br/>Quality Score]
214
+ Rank --> Output[Return Top<br/>Hypotheses]
215
+
216
+ Output --> Struct[Hypothesis Structure:<br/>• Statement<br/>• Rationale<br/>• Testability Score<br/>• Data Requirements<br/>• Expected Outcomes]
217
+
218
+ style Input fill:#e1f5e1
219
+ style Output fill:#fff4e6
220
+ style Struct fill:#e6f3ff
221
+ ```
222
+
223
+ ## 7. Search Agent Workflow
224
+
225
+ ```mermaid
226
+ flowchart TD
227
+ Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]
228
+
229
+ Strategy --> Multi[Multi-Source Search]
230
+
231
+ Multi --> PubMed[PubMed Search<br/>via MCP]
232
+ Multi --> ArXiv[arXiv Search<br/>via MCP]
233
+ Multi --> BioRxiv[bioRxiv Search<br/>via MCP]
234
+
235
+ PubMed --> Aggregate[Aggregate Results]
236
+ ArXiv --> Aggregate
237
+ BioRxiv --> Aggregate
238
+
239
+ Aggregate --> Filter[Filter & Rank<br/>by Relevance]
240
+ Filter --> Dedup[Deduplicate<br/>Cross-Reference]
241
+ Dedup --> Embed[Embed Documents<br/>via MCP]
242
+ Embed --> Vector[(Vector DB)]
243
+ Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]
244
+ RAGRetrieval --> Output[Return Contextualized<br/>Search Results]
245
+
246
+ style Input fill:#fff4e6
247
+ style Multi fill:#ffe6e6
248
+ style Vector fill:#ffe6f0
249
+ style Output fill:#e6f3ff
250
+ ```
251
+
252
+ ## 8. Analysis Agent Workflow
253
+
254
+ ```mermaid
255
+ flowchart TD
256
+ Input1[Hypotheses] --> Extract
257
+ Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]
258
+
259
+ Extract --> Methods[Determine Analysis<br/>Methods Needed]
260
+
261
+ Methods --> Branch{Requires<br/>Computation?}
262
+ Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]
263
+ Branch -->|No| Qual[Qualitative<br/>Synthesis]
264
+
265
+ GenCode --> Execute[Execute Code<br/>via MCP Sandbox]
266
+ Execute --> Interpret1[Interpret<br/>Results]
267
+ Qual --> Interpret2[Interpret<br/>Findings]
268
+
269
+ Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]
270
+ Interpret2 --> Synthesize
271
+
272
+ Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]
273
+ Verdict --> Support[• Supported<br/>• Refuted<br/>• Inconclusive]
274
+ Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]
275
+ Gaps --> Output[Return Analysis<br/>Report]
276
+
277
+ style Input1 fill:#fff4e6
278
+ style Input2 fill:#e6f3ff
279
+ style Execute fill:#ffe6e6
280
+ style Output fill:#e6ffe6
281
+ ```
282
+
283
+ ## 9. Report Agent Workflow
284
+
285
+ ```mermaid
286
+ flowchart TD
287
+ Input1[Query] --> Assemble
288
+ Input2[Hypotheses] --> Assemble
289
+ Input3[Search Results] --> Assemble
290
+ Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]
291
+
292
+ Assemble --> Exec[Executive Summary]
293
+ Assemble --> Intro[Introduction]
294
+ Assemble --> Methods[Methods]
295
+ Assemble --> Results[Results per<br/>Hypothesis]
296
+ Assemble --> Discussion[Discussion]
297
+ Assemble --> Future[Future Directions]
298
+ Assemble --> Refs[References]
299
+
300
+ Results --> VizCheck{Needs<br/>Visualization?}
301
+ VizCheck -->|Yes| GenViz[Generate Viz Code]
302
+ GenViz --> ExecViz[Execute via MCP<br/>Create Charts]
303
+ ExecViz --> Combine
304
+ VizCheck -->|No| Combine[Combine All<br/>Sections]
305
+
306
+ Exec --> Combine
307
+ Intro --> Combine
308
+ Methods --> Combine
309
+ Discussion --> Combine
310
+ Future --> Combine
311
+ Refs --> Combine
312
+
313
+ Combine --> Format[Format Output]
314
+ Format --> MD[Markdown]
315
+ Format --> PDF[PDF]
316
+ Format --> JSON[JSON]
317
+
318
+ MD --> Output[Return Final<br/>Report]
319
+ PDF --> Output
320
+ JSON --> Output
321
+
322
+ style Input1 fill:#e1f5e1
323
+ style Input2 fill:#fff4e6
324
+ style Input3 fill:#e6f3ff
325
+ style Input4 fill:#e6ffe6
326
+ style Output fill:#d4edda
327
+ ```
328
+
329
+ ## 10. Data Flow & Event Streaming
330
+
331
+ ```mermaid
332
+ flowchart TD
333
+ User[👤 User] -->|Research Query| UI[Gradio UI]
334
+ UI -->|Submit| Manager[Magentic Manager]
335
+
336
+ Manager -->|Event: Planning| UI
337
+ Manager -->|Select Agent| HypAgent[Hypothesis Agent]
338
+ HypAgent -->|Event: Delta/Message| UI
339
+ HypAgent -->|Hypotheses| Context[(Shared Context)]
340
+
341
+ Context -->|Retrieved by| Manager
342
+ Manager -->|Select Agent| SearchAgent[Search Agent]
343
+ SearchAgent -->|MCP Request| WebSearch[Web Search Tool]
344
+ WebSearch -->|Results| SearchAgent
345
+ SearchAgent -->|Event: Delta/Message| UI
346
+ SearchAgent -->|Documents| Context
347
+ SearchAgent -->|Embeddings| VectorDB[(Vector DB)]
348
+
349
+ Context -->|Retrieved by| Manager
350
+ Manager -->|Select Agent| AnalysisAgent[Analysis Agent]
351
+ AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]
352
+ CodeExec -->|Results| AnalysisAgent
353
+ AnalysisAgent -->|Event: Delta/Message| UI
354
+ AnalysisAgent -->|Analysis| Context
355
+
356
+ Context -->|Retrieved by| Manager
357
+ Manager -->|Select Agent| ReportAgent[Report Agent]
358
+ ReportAgent -->|MCP Request| CodeExec
359
+ ReportAgent -->|Event: Delta/Message| UI
360
+ ReportAgent -->|Report| Context
361
+
362
+ Manager -->|Event: Final Result| UI
363
+ UI -->|Display| User
364
+
365
+ style User fill:#e1f5e1
366
+ style UI fill:#e6f3ff
367
+ style Manager fill:#ffe6e6
368
+ style Context fill:#ffe6f0
369
+ style VectorDB fill:#ffe6f0
370
+ style WebSearch fill:#f0f0f0
371
+ style CodeExec fill:#f0f0f0
372
+ ```
373
+
374
+ ## 11. MCP Tool Architecture
375
+
376
+ ```mermaid
377
+ graph TB
378
+ subgraph "Agent Layer"
379
+ Manager[Magentic Manager]
380
+ HypAgent[Hypothesis Agent]
381
+ SearchAgent[Search Agent]
382
+ AnalysisAgent[Analysis Agent]
383
+ ReportAgent[Report Agent]
384
+ end
385
+
386
+ subgraph "MCP Protocol Layer"
387
+ Registry[MCP Tool Registry<br/>• Discovers tools<br/>• Routes requests<br/>• Manages connections]
388
+ end
389
+
390
+ subgraph "MCP Servers"
391
+ Server1[Web Search Server<br/>localhost:8001<br/>• PubMed<br/>• arXiv<br/>• bioRxiv]
392
+ Server2[Code Execution Server<br/>localhost:8002<br/>• Sandboxed Python<br/>• Package management]
393
+ Server3[RAG Server<br/>localhost:8003<br/>• Vector embeddings<br/>• Similarity search]
394
+ Server4[Visualization Server<br/>localhost:8004<br/>• Chart generation<br/>• Plot rendering]
395
+ end
396
+
397
+ subgraph "External Services"
398
+ PubMed[PubMed API]
399
+ ArXiv[arXiv API]
400
+ BioRxiv[bioRxiv API]
401
+ Modal[Modal Sandbox]
402
+ ChromaDB[(ChromaDB)]
403
+ end
404
+
405
+ SearchAgent -->|Request| Registry
406
+ AnalysisAgent -->|Request| Registry
407
+ ReportAgent -->|Request| Registry
408
+
409
+ Registry --> Server1
410
+ Registry --> Server2
411
+ Registry --> Server3
412
+ Registry --> Server4
413
+
414
+ Server1 --> PubMed
415
+ Server1 --> ArXiv
416
+ Server1 --> BioRxiv
417
+ Server2 --> Modal
418
+ Server3 --> ChromaDB
419
+
420
+ style Manager fill:#ffe6e6
421
+ style Registry fill:#fff4e6
422
+ style Server1 fill:#e6f3ff
423
+ style Server2 fill:#e6f3ff
424
+ style Server3 fill:#e6f3ff
425
+ style Server4 fill:#e6f3ff
426
+ ```
427
+
428
+ ## 12. Progress Tracking & Stall Detection
429
+
430
+ ```mermaid
431
+ stateDiagram-v2
432
+ [*] --> Initialization: User Query
433
+
434
+ Initialization --> Planning: Manager starts
435
+
436
+ Planning --> AgentExecution: Select agent
437
+
438
+ AgentExecution --> Assessment: Collect results
439
+
440
+ Assessment --> QualityCheck: Evaluate output
441
+
442
+ QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)
443
+ QualityCheck --> Planning: Poor quality<br/>(try different agent)
444
+ QualityCheck --> NextAgent: Good quality<br/>(task incomplete)
445
+ QualityCheck --> Synthesis: Good quality<br/>(task complete)
446
+
447
+ NextAgent --> AgentExecution: Select next agent
448
+
449
+ state StallDetection <<choice>>
450
+ Assessment --> StallDetection: Check progress
451
+ StallDetection --> Planning: No progress<br/>(stall count < max)
452
+ StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)
453
+
454
+ ErrorRecovery --> PartialReport: Generate partial results
455
+ PartialReport --> [*]
456
+
457
+ Synthesis --> FinalReport: Combine all outputs
458
+ FinalReport --> [*]
459
+
460
+ note right of QualityCheck
461
+ Manager assesses:
462
+ • Output completeness
463
+ • Quality metrics
464
+ • Progress made
465
+ end note
466
+
467
+ note right of StallDetection
468
+ Stall = no new progress
469
+ after agent execution
470
+ Triggers plan reset
471
+ end note
472
+ ```
473
+
474
+ ## 13. Gradio UI Integration
475
+
476
+ ```mermaid
477
+ graph TD
478
+ App[Gradio App<br/>DeepCritical Research Agent]
479
+
480
+ App --> Input[Input Section]
481
+ App --> Status[Status Section]
482
+ App --> Output[Output Section]
483
+
484
+ Input --> Query[Research Question<br/>Text Area]
485
+ Input --> Controls[Controls]
486
+ Controls --> MaxHyp[Max Hypotheses: 1-10]
487
+ Controls --> MaxRounds[Max Rounds: 5-20]
488
+ Controls --> Submit[Start Research Button]
489
+
490
+ Status --> Log[Real-time Event Log<br/>• Manager planning<br/>• Agent selection<br/>• Execution updates<br/>• Quality assessment]
491
+ Status --> Progress[Progress Tracker<br/>• Current agent<br/>• Round count<br/>• Stall count]
492
+
493
+ Output --> Tabs[Tabbed Results]
494
+ Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]
495
+ Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]
496
+ Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]
497
+ Tabs --> Tab4[Report Tab<br/>Final research report]
498
+ Tab4 --> Download[Download Report<br/>MD / PDF / JSON]
499
+
500
+ Submit -.->|Triggers| Workflow[Magentic Workflow]
501
+ Workflow -.->|MagenticOrchestratorMessageEvent| Log
502
+ Workflow -.->|MagenticAgentDeltaEvent| Log
503
+ Workflow -.->|MagenticAgentMessageEvent| Log
504
+ Workflow -.->|MagenticFinalResultEvent| Tab4
505
+
506
+ style App fill:#e1f5e1
507
+ style Input fill:#fff4e6
508
+ style Status fill:#e6f3ff
509
+ style Output fill:#e6ffe6
510
+ style Workflow fill:#ffe6e6
511
+ ```
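+
+ A hedged sketch of the event wiring implied above (the event class names come from the diagram; `run_stream` and the attribute access are assumptions about `agent-framework-core`, and `workflow` is the builder result from the Implementation Highlights below):
+
+ ```python
+ event_log: list[str] = []
+ report_output = ""
+
+ async for event in workflow.run_stream(query):
+     kind = type(event).__name__
+     if kind in ("MagenticOrchestratorMessageEvent", "MagenticAgentMessageEvent"):
+         event_log.append(str(event))  # Status section: real-time log
+     elif kind == "MagenticAgentDeltaEvent":
+         event_log.append(str(event))  # streaming token deltas
+     elif kind == "MagenticFinalResultEvent":
+         report_output = str(event)    # Output section: Report tab
+ ```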
512
+
513
+ ## 14. Complete System Context
514
+
515
+ ```mermaid
516
+ graph LR
517
+ User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]
518
+
519
+ DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
+ DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
521
+ DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]
522
+ DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]
523
+ DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]
524
+ DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]
525
+
526
+ DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]
527
+
528
+ PubMed -->|Results| DC
529
+ ArXiv -->|Results| DC
530
+ BioRxiv -->|Results| DC
531
+ Claude -->|Responses| DC
532
+ Modal -->|Output| DC
533
+ Chroma -->|Context| DC
534
+
535
+ DC -->|Research report| User
536
+
537
+ style User fill:#e1f5e1
538
+ style DC fill:#ffe6e6
539
+ style PubMed fill:#e6f3ff
540
+ style ArXiv fill:#e6f3ff
541
+ style BioRxiv fill:#e6f3ff
542
+ style Claude fill:#ffd6d6
543
+ style Modal fill:#f0f0f0
544
+ style Chroma fill:#ffe6f0
545
+ style HF fill:#d4edda
546
+ ```
547
+
548
+ ## 15. Workflow Timeline (Simplified)
549
+
550
+ ```mermaid
551
+ gantt
552
+ title DeepCritical Magentic Workflow - Typical Execution
553
+ dateFormat mm:ss
554
+ axisFormat %M:%S
555
+
556
+ section Manager Planning
557
+ Initial planning :p1, 00:00, 10s
558
+
559
+ section Hypothesis Agent
560
+ Generate hypotheses :h1, after p1, 30s
561
+ Manager assessment :h2, after h1, 5s
562
+
563
+ section Search Agent
564
+ Search hypothesis 1 :s1, after h2, 20s
565
+ Search hypothesis 2 :s2, after s1, 20s
566
+ Search hypothesis 3 :s3, after s2, 20s
567
+ RAG processing :s4, after s3, 15s
568
+ Manager assessment :s5, after s4, 5s
569
+
570
+ section Analysis Agent
571
+ Evidence extraction :a1, after s5, 15s
572
+ Code generation :a2, after a1, 20s
573
+ Code execution :a3, after a2, 25s
574
+ Synthesis :a4, after a3, 20s
575
+ Manager assessment :a5, after a4, 5s
576
+
577
+ section Report Agent
578
+ Report assembly :r1, after a5, 30s
579
+ Visualization :r2, after r1, 15s
580
+ Formatting :r3, after r2, 10s
581
+
582
+ section Manager Synthesis
583
+ Final synthesis :f1, after r3, 10s
584
+ ```
585
+
586
+ ---
587
+
588
+ ## Key Differences from Original Design
589
+
590
+ | Aspect | Original (Judge-in-Loop) | New (Magentic) |
591
+ |--------|-------------------------|----------------|
592
+ | **Control Flow** | Fixed sequential phases | Dynamic agent selection |
593
+ | **Quality Control** | Separate Judge Agent | Manager assessment built-in |
594
+ | **Retry Logic** | Phase-level with feedback | Agent-level with adaptation |
595
+ | **Flexibility** | Rigid 4-phase pipeline | Adaptive workflow |
596
+ | **Complexity** | 5 agents (including Judge) | 4 agents (no Judge) |
597
+ | **Progress Tracking** | Manual state management | Built-in round/stall detection |
598
+ | **Agent Coordination** | Sequential handoff | Manager-driven dynamic selection |
599
+ | **Error Recovery** | Retry same phase | Try different agent or replan |
600
+
601
+ ---
602
+
603
+ ## Simplified Design Principles
604
+
605
+ 1. **Manager is Intelligent**: LLM-powered manager handles planning, selection, and quality assessment
606
+ 2. **No Separate Judge**: Manager's assessment phase replaces dedicated Judge Agent
607
+ 3. **Dynamic Workflow**: Agents can be called multiple times in any order based on need
608
+ 4. **Built-in Safety**: max_round_count (15) and max_stall_count (3) prevent infinite loops
609
+ 5. **Event-Driven UI**: Real-time streaming updates to Gradio interface
610
+ 6. **MCP-Powered Tools**: All external capabilities via Model Context Protocol
611
+ 7. **Shared Context**: Centralized state accessible to all agents
612
+ 8. **Progress Awareness**: Manager tracks what's been done and what's needed
613
+
614
+ ---
615
+
616
+ ## Legend
617
+
618
+ - 🔴 **Red/Pink**: Manager, orchestration, decision-making
619
+ - 🟡 **Yellow/Orange**: Specialist agents, processing
620
+ - 🔵 **Blue**: Data, tools, MCP services
621
+ - 🟣 **Purple/Pink**: Storage, databases, state
622
+ - 🟢 **Green**: User interactions, final outputs
623
+ - ⚪ **Gray**: External services, APIs
624
+
625
+ ---
626
+
627
+ ## Implementation Highlights
628
+
629
+ **Simple 4-Agent Setup:**
630
+ ```python
631
+ workflow = (
+     MagenticBuilder()
+     .participants(
+         hypothesis=HypothesisAgent(tools=[background_tool]),
+         search=SearchAgent(tools=[web_search, rag_tool]),
+         analysis=AnalysisAgent(tools=[code_execution]),
+         report=ReportAgent(tools=[code_execution, visualization]),
+     )
+     .with_standard_manager(
+         chat_client=AnthropicClient(model="claude-sonnet-4"),
+         max_round_count=15,  # Prevent infinite loops
+         max_stall_count=3,   # Detect stuck workflows
+     )
+     .build()
+ )
646
+ ```
647
+
648
+ **Manager handles quality assessment in its instructions:**
649
+ - Checks hypothesis quality (testable, novel, clear)
650
+ - Validates search results (relevant, authoritative, recent)
651
+ - Assesses analysis soundness (methodology, evidence, conclusions)
652
+ - Ensures report completeness (all sections, proper citations)
653
+
654
+ No separate Judge Agent needed - manager does it all!
655
+
656
+ ---
657
+
658
+ **Document Version**: 2.0 (Magentic Simplified)
659
+ **Last Updated**: 2025-11-24
660
+ **Architecture**: Microsoft Magentic Orchestration Pattern
661
+ **Agents**: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager
662
+ **License**: MIT
docs/configuration/CONFIGURATION.md ADDED
@@ -0,0 +1,743 @@
 
 
1
+ # Configuration Guide
2
+
3
+ ## Overview
4
+
5
+ DeepCritical uses **Pydantic Settings** for centralized configuration management. All settings are defined in the `Settings` class in `src/utils/config.py` and can be configured via environment variables or a `.env` file.
6
+
7
+ The configuration system provides:
8
+
9
+ - **Type Safety**: Strongly-typed fields with Pydantic validation
10
+ - **Environment File Support**: Automatically loads from `.env` file (if present)
11
+ - **Case-Insensitive**: Environment variables are case-insensitive
12
+ - **Singleton Pattern**: Global `settings` instance for easy access throughout the codebase
13
+ - **Validation**: Automatic validation on load with helpful error messages
14
+
15
+ ## Quick Start
16
+
17
+ 1. Create a `.env` file in the project root
18
+ 2. Set at least one LLM API key (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `HF_TOKEN`)
19
+ 3. Optionally configure other services as needed
20
+ 4. The application will automatically load and validate your configuration
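+
+ A minimal `.env` for an OpenAI-backed setup might look like this (placeholder values; all variables are documented below):
+
+ ```bash
+ LLM_PROVIDER=openai
+ OPENAI_API_KEY=your_openai_api_key_here
+ LOG_LEVEL=INFO
+ ```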
21
+
22
+ ## Configuration System Architecture
23
+
24
+ ### Settings Class
25
+
26
+ The `Settings` class extends `BaseSettings` from `pydantic_settings` and defines all application configuration:
27
+
28
+ ```13:21:src/utils/config.py
29
+ class Settings(BaseSettings):
+     """Strongly-typed application settings."""
+
+     model_config = SettingsConfigDict(
+         env_file=".env",
+         env_file_encoding="utf-8",
+         case_sensitive=False,
+         extra="ignore",
+     )
38
+ ```
39
+
40
+ ### Singleton Instance
41
+
42
+ A global `settings` instance is available for import:
43
+
44
+ ```234:235:src/utils/config.py
45
+ # Singleton for easy import
46
+ settings = get_settings()
47
+ ```
48
+
49
+ ### Usage Pattern
50
+
51
+ Access configuration throughout the codebase:
52
+
53
+ ```python
54
+ from src.utils.config import settings
55
+
56
+ # Check if API keys are available
57
+ if settings.has_openai_key:
+     # Use OpenAI
+     pass
60
+
61
+ # Access configuration values
62
+ max_iterations = settings.max_iterations
63
+ web_search_provider = settings.web_search_provider
64
+ ```
65
+
66
+ ## Required Configuration
67
+
68
+ ### LLM Provider
69
+
70
+ You must configure at least one LLM provider. The system supports:
71
+
72
+ - **OpenAI**: Requires `OPENAI_API_KEY`
73
+ - **Anthropic**: Requires `ANTHROPIC_API_KEY`
74
+ - **HuggingFace**: Optional `HF_TOKEN` or `HUGGINGFACE_API_KEY` (can work without key for public models)
75
+
76
+ #### OpenAI Configuration
77
+
78
+ ```bash
79
+ LLM_PROVIDER=openai
80
+ OPENAI_API_KEY=your_openai_api_key_here
81
+ OPENAI_MODEL=gpt-5.1
82
+ ```
83
+
84
+ The default model is defined in the `Settings` class:
85
+
86
+ ```29:29:src/utils/config.py
87
+ openai_model: str = Field(default="gpt-5.1", description="OpenAI model name")
88
+ ```
89
+
90
+ #### Anthropic Configuration
91
+
92
+ ```bash
93
+ LLM_PROVIDER=anthropic
94
+ ANTHROPIC_API_KEY=your_anthropic_api_key_here
95
+ ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
96
+ ```
97
+
98
+ The default model is defined in the `Settings` class:
99
+
100
+ ```30:32:src/utils/config.py
101
+ anthropic_model: str = Field(
+     default="claude-sonnet-4-5-20250929", description="Anthropic model"
+ )
104
+ ```
105
+
106
+ #### HuggingFace Configuration
107
+
108
+ HuggingFace can work without an API key for public models, but an API key provides higher rate limits:
109
+
110
+ ```bash
111
+ # Option 1: Using HF_TOKEN (preferred)
112
+ HF_TOKEN=your_huggingface_token_here
113
+
114
+ # Option 2: Using HUGGINGFACE_API_KEY (alternative)
115
+ HUGGINGFACE_API_KEY=your_huggingface_api_key_here
116
+
117
+ # Default model
118
+ HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
119
+ ```
120
+
121
+ The HuggingFace token can be set via either environment variable:
122
+
123
+ ```33:35:src/utils/config.py
124
+ hf_token: str | None = Field(
+     default=None, alias="HF_TOKEN", description="HuggingFace API token"
+ )
127
+ ```
128
+
129
+ ```57:59:src/utils/config.py
130
+ huggingface_api_key: str | None = Field(
+     default=None, description="HuggingFace API token (HF_TOKEN or HUGGINGFACE_API_KEY)"
+ )
133
+ ```
134
+
135
+ ## Optional Configuration
136
+
137
+ ### Embedding Configuration
138
+
139
+ DeepCritical supports multiple embedding providers for semantic search and RAG:
140
+
141
+ ```bash
142
+ # Embedding Provider: "openai", "local", or "huggingface"
143
+ EMBEDDING_PROVIDER=local
144
+
145
+ # OpenAI Embedding Model (used by LlamaIndex RAG)
146
+ OPENAI_EMBEDDING_MODEL=text-embedding-3-small
147
+
148
+ # Local Embedding Model (sentence-transformers, used by EmbeddingService)
149
+ LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
150
+
151
+ # HuggingFace Embedding Model
152
+ HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
153
+ ```
154
+
155
+ The embedding provider configuration:
156
+
157
+ ```47:50:src/utils/config.py
158
+ embedding_provider: Literal["openai", "local", "huggingface"] = Field(
+     default="local",
+     description="Embedding provider to use",
+ )
162
+ ```
163
+
164
+ **Note**: OpenAI embeddings require `OPENAI_API_KEY`. The local provider (default) uses sentence-transformers and requires no API key.
165
+
166
+ ### Web Search Configuration
167
+
168
+ DeepCritical supports multiple web search providers:
169
+
170
+ ```bash
171
+ # Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
172
+ # Default: "duckduckgo" (no API key required)
173
+ WEB_SEARCH_PROVIDER=duckduckgo
174
+
175
+ # Serper API Key (for Google search via Serper)
176
+ SERPER_API_KEY=your_serper_api_key_here
177
+
178
+ # SearchXNG Host URL (for self-hosted search)
179
+ SEARCHXNG_HOST=http://localhost:8080
180
+
181
+ # Brave Search API Key
182
+ BRAVE_API_KEY=your_brave_api_key_here
183
+
184
+ # Tavily API Key
185
+ TAVILY_API_KEY=your_tavily_api_key_here
186
+ ```
187
+
188
+ The web search provider configuration:
189
+
190
+ ```71:74:src/utils/config.py
191
+ web_search_provider: Literal["serper", "searchxng", "brave", "tavily", "duckduckgo"] = Field(
+     default="duckduckgo",
+     description="Web search provider to use",
+ )
195
+ ```
196
+
197
+ **Note**: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
198
+
199
+ ### PubMed Configuration
200
+
201
+ PubMed search supports optional NCBI API key for higher rate limits:
202
+
203
+ ```bash
204
+ # NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)
205
+ NCBI_API_KEY=your_ncbi_api_key_here
206
+ ```
207
+
208
+ The PubMed tool uses this configuration:
209
+
210
+ ```22:29:src/tools/pubmed.py
211
+ def __init__(self, api_key: str | None = None) -> None:
+     self.api_key = api_key or settings.ncbi_api_key
+     # Ignore placeholder values from .env.example
+     if self.api_key == "your-ncbi-key-here":
+         self.api_key = None
+
+     # Use shared rate limiter
+     self._limiter = get_pubmed_limiter(self.api_key)
219
+ ```
220
+
221
+ ### Agent Configuration
222
+
223
+ Control agent behavior and research loop execution:
224
+
225
+ ```bash
226
+ # Maximum iterations per research loop (1-50, default: 10)
227
+ MAX_ITERATIONS=10
228
+
229
+ # Search timeout in seconds
230
+ SEARCH_TIMEOUT=30
231
+
232
+ # Use graph-based execution for research flows
233
+ USE_GRAPH_EXECUTION=false
234
+ ```
235
+
236
+ The agent configuration fields:
237
+
238
+ ```80:85:src/utils/config.py
239
+ # Agent Configuration
240
+ max_iterations: int = Field(default=10, ge=1, le=50)
241
+ search_timeout: int = Field(default=30, description="Seconds to wait for search")
242
+ use_graph_execution: bool = Field(
+     default=False, description="Use graph-based execution for research flows"
+ )
245
+ ```
246
+
247
+ ### Budget & Rate Limiting Configuration
248
+
249
+ Control resource limits for research loops:
250
+
251
+ ```bash
252
+ # Default token budget per research loop (1000-1000000, default: 100000)
253
+ DEFAULT_TOKEN_LIMIT=100000
254
+
255
+ # Default time limit per research loop in minutes (1-120, default: 10)
256
+ DEFAULT_TIME_LIMIT_MINUTES=10
257
+
258
+ # Default iterations limit per research loop (1-50, default: 10)
259
+ DEFAULT_ITERATIONS_LIMIT=10
260
+ ```
261
+
262
+ The budget configuration with validation:
263
+
264
+ ```87:105:src/utils/config.py
265
+ # Budget & Rate Limiting Configuration
266
+ default_token_limit: int = Field(
+     default=100000,
+     ge=1000,
+     le=1000000,
+     description="Default token budget per research loop",
+ )
+ default_time_limit_minutes: int = Field(
+     default=10,
+     ge=1,
+     le=120,
+     description="Default time limit per research loop (minutes)",
+ )
+ default_iterations_limit: int = Field(
+     default=10,
+     ge=1,
+     le=50,
+     description="Default iterations limit per research loop",
+ )
284
+ ```
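+
+ These defaults feed the `BudgetTracker` described in the middleware docs; wiring them together might look like this sketch (the tracker's import path is an assumption):
+
+ ```python
+ from src.middleware.budget_tracker import BudgetTracker  # assumed module path
+ from src.utils.config import settings
+
+ tracker = BudgetTracker()
+ budget = tracker.create_budget(
+     token_limit=settings.default_token_limit,
+     time_limit_seconds=settings.default_time_limit_minutes * 60,
+     iterations_limit=settings.default_iterations_limit,
+ )
+ ```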
285
+
286
+ ### RAG Service Configuration
287
+
288
+ Configure the Retrieval-Augmented Generation service:
289
+
290
+ ```bash
291
+ # ChromaDB collection name for RAG
292
+ RAG_COLLECTION_NAME=deepcritical_evidence
293
+
294
+ # Number of top results to retrieve from RAG (1-50, default: 5)
295
+ RAG_SIMILARITY_TOP_K=5
296
+
297
+ # Automatically ingest evidence into RAG
298
+ RAG_AUTO_INGEST=true
299
+ ```
300
+
301
+ The RAG configuration:
302
+
303
+ ```127:141:src/utils/config.py
304
+ # RAG Service Configuration
305
+ rag_collection_name: str = Field(
306
+ default="deepcritical_evidence",
307
+ description="ChromaDB collection name for RAG",
308
+ )
309
+ rag_similarity_top_k: int = Field(
310
+ default=5,
311
+ ge=1,
312
+ le=50,
313
+ description="Number of top results to retrieve from RAG",
314
+ )
315
+ rag_auto_ingest: bool = Field(
316
+ default=True,
317
+ description="Automatically ingest evidence into RAG",
318
+ )
319
+ ```
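+
+ An illustrative retrieval sketch using the standard `chromadb` client (the project's RAG service may wire this differently):
+
+ ```python
+ import chromadb
+
+ from src.utils.config import settings
+
+ client = chromadb.PersistentClient(path=settings.chroma_db_path)
+ collection = client.get_or_create_collection(settings.rag_collection_name)
+ results = collection.query(
+     query_texts=["What does the evidence say?"],
+     n_results=settings.rag_similarity_top_k,  # top-k comes from configuration
+ )
+ ```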
320
+
321
+ ### ChromaDB Configuration
322
+
323
+ Configure the vector database for embeddings and RAG:
324
+
325
+ ```bash
326
+ # ChromaDB storage path
327
+ CHROMA_DB_PATH=./chroma_db
328
+
329
+ # Whether to persist ChromaDB to disk
330
+ CHROMA_DB_PERSIST=true
331
+
332
+ # ChromaDB server host (for remote ChromaDB, optional)
333
+ CHROMA_DB_HOST=localhost
334
+
335
+ # ChromaDB server port (for remote ChromaDB, optional)
336
+ CHROMA_DB_PORT=8000
337
+ ```
338
+
339
+ The ChromaDB configuration:
340
+
341
+ ```113:125:src/utils/config.py
342
+ chroma_db_path: str = Field(default="./chroma_db", description="ChromaDB storage path")
343
+ chroma_db_persist: bool = Field(
344
+ default=True,
345
+ description="Whether to persist ChromaDB to disk",
346
+ )
347
+ chroma_db_host: str | None = Field(
348
+ default=None,
349
+ description="ChromaDB server host (for remote ChromaDB)",
350
+ )
351
+ chroma_db_port: int | None = Field(
352
+ default=None,
353
+ description="ChromaDB server port (for remote ChromaDB)",
354
+ )
355
+ ```
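+
+ A hedged sketch of how these settings might map onto the standard `chromadb` client types (illustrative wiring only; the actual service code may differ):
+
+ ```python
+ import chromadb
+
+ from src.utils.config import settings
+
+ if settings.chroma_db_host and settings.chroma_db_port:
+     # A remote server takes precedence when both host and port are set.
+     client = chromadb.HttpClient(host=settings.chroma_db_host, port=settings.chroma_db_port)
+ elif settings.chroma_db_persist:
+     client = chromadb.PersistentClient(path=settings.chroma_db_path)
+ else:
+     client = chromadb.EphemeralClient()  # in-memory, nothing written to disk
+ ```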
356
+
357
+ ### External Services
358
+
359
+ #### Modal Configuration
360
+
361
+ Modal is used for secure sandbox execution of statistical analysis:
362
+
363
+ ```bash
364
+ # Modal Token ID (for Modal sandbox execution)
365
+ MODAL_TOKEN_ID=your_modal_token_id_here
366
+
367
+ # Modal Token Secret
368
+ MODAL_TOKEN_SECRET=your_modal_token_secret_here
369
+ ```
370
+
371
+ The Modal configuration:
372
+
373
+ ```110:112:src/utils/config.py
374
+ # External Services
375
+ modal_token_id: str | None = Field(default=None, description="Modal token ID")
376
+ modal_token_secret: str | None = Field(default=None, description="Modal token secret")
377
+ ```
378
+
379
+ ### Logging Configuration
380
+
381
+ Configure structured logging:
382
+
383
+ ```bash
384
+ # Log Level: "DEBUG", "INFO", "WARNING", or "ERROR"
385
+ LOG_LEVEL=INFO
386
+ ```
387
+
388
+ The logging configuration:
389
+
390
+ ```107:108:src/utils/config.py
391
+ # Logging
392
+ log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"
393
+ ```
394
+
395
+ Logging is configured via the `configure_logging()` function:
396
+
397
+ ```212:231:src/utils/config.py
398
+ def configure_logging(settings: Settings) -> None:
399
+ """Configure structured logging with the configured log level."""
400
+ # Set stdlib logging level from settings
401
+ logging.basicConfig(
402
+ level=getattr(logging, settings.log_level),
403
+ format="%(message)s",
404
+ )
405
+
406
+ structlog.configure(
407
+ processors=[
408
+ structlog.stdlib.filter_by_level,
409
+ structlog.stdlib.add_logger_name,
410
+ structlog.stdlib.add_log_level,
411
+ structlog.processors.TimeStamper(fmt="iso"),
412
+ structlog.processors.JSONRenderer(),
413
+ ],
414
+ wrapper_class=structlog.stdlib.BoundLogger,
415
+ context_class=dict,
416
+ logger_factory=structlog.stdlib.LoggerFactory(),
417
+ )
418
+ ```
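+
+ A typical call site, sketched under the assumption that `configure_logging()` runs once at startup:
+
+ ```python
+ import structlog
+
+ from src.utils.config import configure_logging, settings
+
+ configure_logging(settings)
+ logger = structlog.get_logger()
+ logger.info("app_start", log_level=settings.log_level)
+ ```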
419
+
420
+ ## Configuration Properties
421
+
422
+ The `Settings` class provides helpful properties for checking configuration state:
423
+
424
+ ### API Key Availability
425
+
426
+ Check which API keys are available:
427
+
428
+ ```171:189:src/utils/config.py
429
+ @property
430
+ def has_openai_key(self) -> bool:
431
+ """Check if OpenAI API key is available."""
432
+ return bool(self.openai_api_key)
433
+
434
+ @property
435
+ def has_anthropic_key(self) -> bool:
436
+ """Check if Anthropic API key is available."""
437
+ return bool(self.anthropic_api_key)
438
+
439
+ @property
440
+ def has_huggingface_key(self) -> bool:
441
+ """Check if HuggingFace API key is available."""
442
+ return bool(self.huggingface_api_key or self.hf_token)
443
+
444
+ @property
445
+ def has_any_llm_key(self) -> bool:
446
+ """Check if any LLM API key is available."""
447
+ return self.has_openai_key or self.has_anthropic_key or self.has_huggingface_key
448
+ ```
449
+
450
+ **Usage:**
451
+
452
+ ```python
453
+ from src.utils.config import settings
454
+
455
+ # Check API key availability
456
+ if settings.has_openai_key:
457
+ # Use OpenAI
458
+ pass
459
+
460
+ if settings.has_anthropic_key:
461
+ # Use Anthropic
462
+ pass
463
+
464
+ if settings.has_huggingface_key:
465
+ # Use HuggingFace
466
+ pass
467
+
468
+ if settings.has_any_llm_key:
469
+ # At least one LLM is available
470
+ pass
471
+ ```
472
+
473
+ ### Service Availability
474
+
475
+ Check if external services are configured:
476
+
477
+ ```143:146:src/utils/config.py
478
+ @property
479
+ def modal_available(self) -> bool:
480
+ """Check if Modal credentials are configured."""
481
+ return bool(self.modal_token_id and self.modal_token_secret)
482
+ ```
483
+
484
+ ```191:204:src/utils/config.py
485
+ @property
486
+ def web_search_available(self) -> bool:
487
+ """Check if web search is available (either no-key provider or API key present)."""
488
+ if self.web_search_provider == "duckduckgo":
489
+ return True # No API key required
490
+ if self.web_search_provider == "serper":
491
+ return bool(self.serper_api_key)
492
+ if self.web_search_provider == "searchxng":
493
+ return bool(self.searchxng_host)
494
+ if self.web_search_provider == "brave":
495
+ return bool(self.brave_api_key)
496
+ if self.web_search_provider == "tavily":
497
+ return bool(self.tavily_api_key)
498
+ return False
499
+ ```
500
+
501
+ **Usage:**
502
+
503
+ ```python
504
+ from src.utils.config import settings
505
+
506
+ # Check service availability
507
+ if settings.modal_available:
508
+ # Use Modal sandbox
509
+ pass
510
+
511
+ if settings.web_search_available:
512
+ # Web search is configured
513
+ pass
514
+ ```
515
+
516
+ ### API Key Retrieval
517
+
518
+ Get the API key for the configured provider:
519
+
520
+ ```148:160:src/utils/config.py
521
+ def get_api_key(self) -> str:
522
+ """Get the API key for the configured provider."""
523
+ if self.llm_provider == "openai":
524
+ if not self.openai_api_key:
525
+ raise ConfigurationError("OPENAI_API_KEY not set")
526
+ return self.openai_api_key
527
+
528
+ if self.llm_provider == "anthropic":
529
+ if not self.anthropic_api_key:
530
+ raise ConfigurationError("ANTHROPIC_API_KEY not set")
531
+ return self.anthropic_api_key
532
+
533
+ raise ConfigurationError(f"Unknown LLM provider: {self.llm_provider}")
534
+ ```
535
+
536
+ For OpenAI-specific operations (e.g., Magentic mode):
537
+
538
+ ```162:169:src/utils/config.py
539
+ def get_openai_api_key(self) -> str:
540
+ """Get OpenAI API key (required for Magentic function calling)."""
541
+ if not self.openai_api_key:
542
+ raise ConfigurationError(
543
+ "OPENAI_API_KEY not set. Magentic mode requires OpenAI for function calling. "
544
+ "Use mode='simple' for other providers."
545
+ )
546
+ return self.openai_api_key
547
+ ```
548
+
549
+ ## Configuration Usage in Codebase
550
+
551
+ The configuration system is used throughout the codebase:
552
+
553
+ ### LLM Factory
554
+
555
+ The LLM factory uses settings to create appropriate models:
556
+
557
+ ```129:144:src/utils/llm_factory.py
558
+ if settings.llm_provider == "huggingface":
559
+ model_name = settings.huggingface_model or "meta-llama/Llama-3.1-8B-Instruct"
560
+ hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
561
+ return HuggingFaceModel(model_name, provider=hf_provider)
562
+
563
+ if settings.llm_provider == "openai":
564
+ if not settings.openai_api_key:
565
+ raise ConfigurationError("OPENAI_API_KEY not set for pydantic-ai")
566
+ provider = OpenAIProvider(api_key=settings.openai_api_key)
567
+ return OpenAIModel(settings.openai_model, provider=provider)
568
+
569
+ if settings.llm_provider == "anthropic":
570
+ if not settings.anthropic_api_key:
571
+ raise ConfigurationError("ANTHROPIC_API_KEY not set for pydantic-ai")
572
+ anthropic_provider = AnthropicProvider(api_key=settings.anthropic_api_key)
573
+ return AnthropicModel(settings.anthropic_model, provider=anthropic_provider)
574
+ ```
575
+
576
+ ### Embedding Service
577
+
578
+ The embedding service uses local embedding model configuration:
579
+
580
+ ```29:31:src/services/embeddings.py
581
+ def __init__(self, model_name: str | None = None):
582
+ self._model_name = model_name or settings.local_embedding_model
583
+ self._model = SentenceTransformer(self._model_name)
584
+ ```
585
+
586
+ ### Orchestrator Factory
587
+
588
+ The orchestrator factory uses settings to determine mode:
589
+
590
+ ```69:80:src/orchestrator_factory.py
591
+ def _determine_mode(explicit_mode: str | None) -> str:
592
+ """Determine which mode to use."""
593
+ if explicit_mode:
594
+ if explicit_mode in ("magentic", "advanced"):
595
+ return "advanced"
596
+ return "simple"
597
+
598
+ # Auto-detect: advanced if paid API key available
599
+ if settings.has_openai_key:
600
+ return "advanced"
601
+
602
+ return "simple"
603
+ ```
604
+
605
+ ## Environment Variables Reference
606
+
607
+ ### Required (at least one LLM)
608
+
609
+ - `OPENAI_API_KEY` - OpenAI API key (required for OpenAI provider)
610
+ - `ANTHROPIC_API_KEY` - Anthropic API key (required for Anthropic provider)
611
+ - `HF_TOKEN` or `HUGGINGFACE_API_KEY` - HuggingFace API token (optional, can work without for public models)
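+
+ For example, a minimal `.env` for local development might look like this (illustrative values):
+
+ ```bash
+ LLM_PROVIDER=huggingface
+ HF_TOKEN=hf_xxx                  # or set OPENAI_API_KEY / ANTHROPIC_API_KEY
+ WEB_SEARCH_PROVIDER=duckduckgo   # no API key required
+ EMBEDDING_PROVIDER=local         # no API key required
+ ```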
612
+
613
+ #### LLM Configuration Variables
614
+
615
+ - `LLM_PROVIDER` - Provider to use: `"openai"`, `"anthropic"`, or `"huggingface"` (default: `"openai"`, per the `Field` default in `src/utils/config.py`)
616
+ - `OPENAI_MODEL` - OpenAI model name (default: `"gpt-5.1"`)
617
+ - `ANTHROPIC_MODEL` - Anthropic model name (default: `"claude-sonnet-4-5-20250929"`)
618
+ - `HUGGINGFACE_MODEL` - HuggingFace model ID (default: `"meta-llama/Llama-3.1-8B-Instruct"`)
619
+
620
+ #### Embedding Configuration Variables
621
+
622
+ - `EMBEDDING_PROVIDER` - Provider: `"openai"`, `"local"`, or `"huggingface"` (default: `"local"`)
623
+ - `OPENAI_EMBEDDING_MODEL` - OpenAI embedding model (default: `"text-embedding-3-small"`)
624
+ - `LOCAL_EMBEDDING_MODEL` - Local sentence-transformers model (default: `"all-MiniLM-L6-v2"`)
625
+ - `HUGGINGFACE_EMBEDDING_MODEL` - HuggingFace embedding model (default: `"sentence-transformers/all-MiniLM-L6-v2"`)
626
+
627
+ #### Web Search Configuration Variables
628
+
629
+ - `WEB_SEARCH_PROVIDER` - Provider: `"serper"`, `"searchxng"`, `"brave"`, `"tavily"`, or `"duckduckgo"` (default: `"duckduckgo"`)
630
+ - `SERPER_API_KEY` - Serper API key (required for Serper provider)
631
+ - `SEARCHXNG_HOST` - SearchXNG host URL (required for SearchXNG provider)
632
+ - `BRAVE_API_KEY` - Brave Search API key (required for Brave provider)
633
+ - `TAVILY_API_KEY` - Tavily API key (required for Tavily provider)
634
+
635
+ #### PubMed Configuration Variables
636
+
637
+ - `NCBI_API_KEY` - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)
638
+
639
+ #### Agent Configuration Variables
640
+
641
+ - `MAX_ITERATIONS` - Maximum iterations per research loop (1-50, default: `10`)
642
+ - `SEARCH_TIMEOUT` - Search timeout in seconds (default: `30`)
643
+ - `USE_GRAPH_EXECUTION` - Use graph-based execution (default: `false`)
644
+
645
+ #### Budget Configuration Variables
646
+
647
+ - `DEFAULT_TOKEN_LIMIT` - Default token budget per research loop (1000-1000000, default: `100000`)
648
+ - `DEFAULT_TIME_LIMIT_MINUTES` - Default time limit in minutes (1-120, default: `10`)
649
+ - `DEFAULT_ITERATIONS_LIMIT` - Default iterations limit (1-50, default: `10`)
650
+
651
+ #### RAG Configuration Variables
652
+
653
+ - `RAG_COLLECTION_NAME` - ChromaDB collection name (default: `"deepcritical_evidence"`)
654
+ - `RAG_SIMILARITY_TOP_K` - Number of top results to retrieve (1-50, default: `5`)
655
+ - `RAG_AUTO_INGEST` - Automatically ingest evidence into RAG (default: `true`)
656
+
657
+ #### ChromaDB Configuration Variables
658
+
659
+ - `CHROMA_DB_PATH` - ChromaDB storage path (default: `"./chroma_db"`)
660
+ - `CHROMA_DB_PERSIST` - Whether to persist ChromaDB to disk (default: `true`)
661
+ - `CHROMA_DB_HOST` - ChromaDB server host (optional, for remote ChromaDB)
662
+ - `CHROMA_DB_PORT` - ChromaDB server port (optional, for remote ChromaDB)
663
+
664
+ #### External Services Variables
665
+
666
+ - `MODAL_TOKEN_ID` - Modal token ID (optional, for Modal sandbox execution)
667
+ - `MODAL_TOKEN_SECRET` - Modal token secret (optional, for Modal sandbox execution)
668
+
669
+ #### Logging Configuration Variables
670
+
671
+ - `LOG_LEVEL` - Log level: `"DEBUG"`, `"INFO"`, `"WARNING"`, or `"ERROR"` (default: `"INFO"`)
672
+
673
+ ## Validation
674
+
675
+ Settings are validated on load using Pydantic validation:
676
+
677
+ - **Type Checking**: All fields are strongly typed
678
+ - **Range Validation**: Numeric fields have min/max constraints (e.g., `ge=1, le=50` for `max_iterations`)
679
+ - **Literal Validation**: Enum fields only accept specific values (e.g., `Literal["openai", "anthropic", "huggingface"]`)
680
+ - **Required Fields**: API keys are checked when accessed via `get_api_key()` or `get_openai_api_key()`
681
+
682
+ ### Validation Examples
683
+
684
+ The `max_iterations` field has range validation:
685
+
686
+ ```81:81:src/utils/config.py
687
+ max_iterations: int = Field(default=10, ge=1, le=50)
688
+ ```
689
+
690
+ The `llm_provider` field has literal validation:
691
+
692
+ ```26:28:src/utils/config.py
693
+ llm_provider: Literal["openai", "anthropic", "huggingface"] = Field(
694
+ default="openai", description="Which LLM provider to use"
695
+ )
696
+ ```
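+
+ A minimal sketch of what a validation failure looks like (constructing `Settings` directly with an out-of-range value raises pydantic's `ValidationError`):
+
+ ```python
+ from pydantic import ValidationError
+
+ from src.utils.config import Settings
+
+ try:
+     Settings(max_iterations=100)  # violates the le=50 constraint
+ except ValidationError as exc:
+     print(f"{exc.error_count()} validation error(s)")
+ ```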
697
+
698
+ ## Error Handling
699
+
700
+ Configuration errors raise `ConfigurationError` from `src/utils/exceptions.py`:
701
+
702
+ ```22:25:src/utils/exceptions.py
703
+ class ConfigurationError(DeepCriticalError):
704
+ """Raised when configuration is invalid."""
705
+
706
+ pass
707
+ ```
708
+
709
+ ### Error Handling Example
710
+
711
+ ```python
712
+ from src.utils.config import settings
713
+ from src.utils.exceptions import ConfigurationError
714
+
715
+ try:
716
+ api_key = settings.get_api_key()
717
+ except ConfigurationError as e:
718
+ print(f"Configuration error: {e}")
719
+ ```
720
+
721
+ ### Common Configuration Errors
722
+
723
+ 1. **Missing API Key**: When `get_api_key()` is called but the required API key is not set
724
+ 2. **Invalid Provider**: When `llm_provider` is set to an unsupported value
725
+ 3. **Out of Range**: When numeric values fall outside their min/max constraints
726
+ 4. **Invalid Literal**: When enum fields receive unsupported values
727
+
728
+ ## Configuration Best Practices
729
+
730
+ 1. **Use `.env` File**: Store sensitive keys in `.env` file (add to `.gitignore`)
731
+ 2. **Check Availability**: Use properties like `has_openai_key` before accessing API keys
732
+ 3. **Handle Errors**: Always catch `ConfigurationError` when calling `get_api_key()`
733
+ 4. **Validate Early**: Configuration is validated on import, so errors surface immediately
734
+ 5. **Use Defaults**: Leverage sensible defaults for optional configuration
735
+
736
+ ## Future Enhancements
737
+
738
+ The following configurations are planned for future phases:
739
+
740
+ 1. **Additional LLM Providers**: DeepSeek, OpenRouter, Gemini, Perplexity, Azure OpenAI, Local models
741
+ 2. **Model Selection**: Reasoning/main/fast model configuration
742
+ 3. **Service Integration**: Additional service integrations and configurations
743
+
docs/configuration/index.md CHANGED
@@ -25,9 +25,17 @@ The configuration system provides:
25
 
26
  The [`Settings`][settings-class] class extends `BaseSettings` from `pydantic_settings` and defines all application configuration:
27
 
28
- <!--codeinclude-->
29
- [Settings Class Definition](../src/utils/config.py) start_line:13 end_line:21
30
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
31
 
32
  [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L13-L21)
33
 
@@ -35,9 +43,10 @@ The [`Settings`][settings-class] class extends `BaseSettings` from `pydantic_set
35
 
36
  A global `settings` instance is available for import:
37
 
38
- <!--codeinclude-->
39
- [Singleton Instance](../src/utils/config.py) start_line:234 end_line:235
40
- <!--/codeinclude-->
 
41
 
42
  [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L234-L235)
43
 
@@ -78,9 +87,9 @@ OPENAI_MODEL=gpt-5.1
78
 
79
  The default model is defined in the `Settings` class:
80
 
81
- <!--codeinclude-->
82
- [OpenAI Model Configuration](../src/utils/config.py) start_line:29 end_line:29
83
- <!--/codeinclude-->
84
 
85
  #### Anthropic Configuration
86
 
@@ -92,9 +101,11 @@ ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
92
 
93
  The default model is defined in the `Settings` class:
94
 
95
- <!--codeinclude-->
96
- [Anthropic Model Configuration](../src/utils/config.py) start_line:30 end_line:32
97
- <!--/codeinclude-->
 
 
98
 
99
  #### HuggingFace Configuration
100
 
@@ -113,13 +124,17 @@ HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
113
 
114
  The HuggingFace token can be set via either environment variable:
115
 
116
- <!--codeinclude-->
117
- [HuggingFace Token Configuration](../src/utils/config.py) start_line:33 end_line:35
118
- <!--/codeinclude-->
 
 
119
 
120
- <!--codeinclude-->
121
- [HuggingFace API Key Configuration](../src/utils/config.py) start_line:57 end_line:59
122
- <!--/codeinclude-->
 
 
123
 
124
  ## Optional Configuration
125
 
@@ -143,9 +158,12 @@ HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
143
 
144
  The embedding provider configuration:
145
 
146
- <!--codeinclude-->
147
- [Embedding Provider Configuration](../src/utils/config.py) start_line:47 end_line:50
148
- <!--/codeinclude-->
 
 
 
149
 
150
  **Note**: OpenAI embeddings require `OPENAI_API_KEY`. The local provider (default) uses sentence-transformers and requires no API key.
151
 
@@ -173,9 +191,12 @@ TAVILY_API_KEY=your_tavily_api_key_here
173
 
174
  The web search provider configuration:
175
 
176
- <!--codeinclude-->
177
- [Web Search Provider Configuration](../src/utils/config.py) start_line:71 end_line:74
178
- <!--/codeinclude-->
 
 
 
179
 
180
  **Note**: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
181
 
@@ -190,9 +211,16 @@ NCBI_API_KEY=your_ncbi_api_key_here
190
 
191
  The PubMed tool uses this configuration:
192
 
193
- <!--codeinclude-->
194
- [PubMed Tool Configuration](../src/tools/pubmed.py) start_line:22 end_line:29
195
- <!--/codeinclude-->
 
 
 
 
 
 
 
196
 
197
  ### Agent Configuration
198
 
@@ -211,9 +239,14 @@ USE_GRAPH_EXECUTION=false
211
 
212
  The agent configuration fields:
213
 
214
- <!--codeinclude-->
215
- [Agent Configuration](../src/utils/config.py) start_line:80 end_line:85
216
- <!--/codeinclude-->
 
 
 
 
 
217
 
218
  ### Budget & Rate Limiting Configuration
219
 
@@ -232,9 +265,27 @@ DEFAULT_ITERATIONS_LIMIT=10
232
 
233
  The budget configuration with validation:
234
 
235
- <!--codeinclude-->
236
- [Budget Configuration](../src/utils/config.py) start_line:87 end_line:105
237
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
238
 
239
  ### RAG Service Configuration
240
 
@@ -253,9 +304,23 @@ RAG_AUTO_INGEST=true
253
 
254
  The RAG configuration:
255
 
256
- <!--codeinclude-->
257
- [RAG Service Configuration](../src/utils/config.py) start_line:127 end_line:141
258
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
259
 
260
  ### ChromaDB Configuration
261
 
@@ -277,9 +342,21 @@ CHROMA_DB_PORT=8000
277
 
278
  The ChromaDB configuration:
279
 
280
- <!--codeinclude-->
281
- [ChromaDB Configuration](../src/utils/config.py) start_line:113 end_line:125
282
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
283
 
284
  ### External Services
285
 
@@ -297,9 +374,11 @@ MODAL_TOKEN_SECRET=your_modal_token_secret_here
297
 
298
  The Modal configuration:
299
 
300
- <!--codeinclude-->
301
- [Modal Configuration](../src/utils/config.py) start_line:110 end_line:112
302
- <!--/codeinclude-->
 
 
303
 
304
  ### Logging Configuration
305
 
@@ -312,15 +391,35 @@ LOG_LEVEL=INFO
312
 
313
  The logging configuration:
314
 
315
- <!--codeinclude-->
316
- [Logging Configuration](../src/utils/config.py) start_line:107 end_line:108
317
- <!--/codeinclude-->
 
318
 
319
  Logging is configured via the `configure_logging()` function:
320
 
321
- <!--codeinclude-->
322
- [Configure Logging Function](../src/utils/config.py) start_line:212 end_line:231
323
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
324
 
325
  ## Configuration Properties
326
 
@@ -330,9 +429,27 @@ The `Settings` class provides helpful properties for checking configuration stat
330
 
331
  Check which API keys are available:
332
 
333
- <!--codeinclude-->
334
- [API Key Availability Properties](../src/utils/config.py) start_line:171 end_line:189
335
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
336
 
337
  **Usage:**
338
 
@@ -361,13 +478,29 @@ if settings.has_any_llm_key:
361
 
362
  Check if external services are configured:
363
 
364
- <!--codeinclude-->
365
- [Modal Availability Property](../src/utils/config.py) start_line:143 end_line:146
366
- <!--/codeinclude-->
 
 
 
367
 
368
- <!--codeinclude-->
369
- [Web Search Availability Property](../src/utils/config.py) start_line:191 end_line:204
370
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
371
 
372
  **Usage:**
373
 
@@ -388,15 +521,34 @@ if settings.web_search_available:
388
 
389
  Get the API key for the configured provider:
390
 
391
- <!--codeinclude-->
392
- [Get API Key Method](../src/utils/config.py) start_line:148 end_line:160
393
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
394
 
395
  For OpenAI-specific operations (e.g., Magentic mode):
396
 
397
- <!--codeinclude-->
398
- [Get OpenAI API Key Method](../src/utils/config.py) start_line:162 end_line:169
399
- <!--/codeinclude-->
 
 
 
 
 
 
 
400
 
401
  ## Configuration Usage in Codebase
402
 
@@ -406,25 +558,53 @@ The configuration system is used throughout the codebase:
406
 
407
  The LLM factory uses settings to create appropriate models:
408
 
409
- <!--codeinclude-->
410
- [LLM Factory Usage](../src/utils/llm_factory.py) start_line:129 end_line:144
411
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
412
 
413
  ### Embedding Service
414
 
415
  The embedding service uses local embedding model configuration:
416
 
417
- <!--codeinclude-->
418
- [Embedding Service Usage](../src/services/embeddings.py) start_line:29 end_line:31
419
- <!--/codeinclude-->
 
 
420
 
421
  ### Orchestrator Factory
422
 
423
  The orchestrator factory uses settings to determine mode:
424
 
425
- <!--codeinclude-->
426
- [Orchestrator Factory Mode Detection](../src/orchestrator_factory.py) start_line:69 end_line:80
427
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
428
 
429
  ## Environment Variables Reference
430
 
@@ -507,15 +687,17 @@ Settings are validated on load using Pydantic validation:
507
 
508
  The `max_iterations` field has range validation:
509
 
510
- <!--codeinclude-->
511
- [Max Iterations Validation](../src/utils/config.py) start_line:81 end_line:81
512
- <!--/codeinclude-->
513
 
514
  The `llm_provider` field has literal validation:
515
 
516
- <!--codeinclude-->
517
- [LLM Provider Literal Validation](../src/utils/config.py) start_line:26 end_line:28
518
- <!--/codeinclude-->
 
 
519
 
520
  ## Error Handling
521
 
 
25
 
26
  The [`Settings`][settings-class] class extends `BaseSettings` from `pydantic_settings` and defines all application configuration:
27
 
28
+ ```13:21:src/utils/config.py
29
+ class Settings(BaseSettings):
30
+ """Strongly-typed application settings."""
31
+
32
+ model_config = SettingsConfigDict(
33
+ env_file=".env",
34
+ env_file_encoding="utf-8",
35
+ case_sensitive=False,
36
+ extra="ignore",
37
+ )
38
+ ```
39
 
40
  [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L13-L21)
41
 
 
43
 
44
  A global `settings` instance is available for import:
45
 
46
+ ```234:235:src/utils/config.py
47
+ # Singleton for easy import
48
+ settings = get_settings()
49
+ ```
50
 
51
  [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L234-L235)
52
 
 
87
 
88
  The default model is defined in the `Settings` class:
89
 
90
+ ```29:29:src/utils/config.py
91
+ openai_model: str = Field(default="gpt-5.1", description="OpenAI model name")
92
+ ```
93
 
94
  #### Anthropic Configuration
95
 
 
101
 
102
  The default model is defined in the `Settings` class:
103
 
104
+ ```30:32:src/utils/config.py
105
+ anthropic_model: str = Field(
106
+ default="claude-sonnet-4-5-20250929", description="Anthropic model"
107
+ )
108
+ ```
109
 
110
  #### HuggingFace Configuration
111
 
 
124
 
125
  The HuggingFace token can be set via either environment variable:
126
 
127
+ ```33:35:src/utils/config.py
128
+ hf_token: str | None = Field(
129
+ default=None, alias="HF_TOKEN", description="HuggingFace API token"
130
+ )
131
+ ```
132
 
133
+ ```57:59:src/utils/config.py
134
+ huggingface_api_key: str | None = Field(
135
+ default=None, description="HuggingFace API token (HF_TOKEN or HUGGINGFACE_API_KEY)"
136
+ )
137
+ ```
138
 
139
  ## Optional Configuration
140
 
 
158
 
159
  The embedding provider configuration:
160
 
161
+ ```47:50:src/utils/config.py
162
+ embedding_provider: Literal["openai", "local", "huggingface"] = Field(
163
+ default="local",
164
+ description="Embedding provider to use",
165
+ )
166
+ ```
167
 
168
  **Note**: OpenAI embeddings require `OPENAI_API_KEY`. The local provider (default) uses sentence-transformers and requires no API key.
169
 
 
191
 
192
  The web search provider configuration:
193
 
194
+ ```71:74:src/utils/config.py
195
+ web_search_provider: Literal["serper", "searchxng", "brave", "tavily", "duckduckgo"] = Field(
196
+ default="duckduckgo",
197
+ description="Web search provider to use",
198
+ )
199
+ ```
200
 
201
  **Note**: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
202
 
 
211
 
212
  The PubMed tool uses this configuration:
213
 
214
+ ```22:29:src/tools/pubmed.py
215
+ def __init__(self, api_key: str | None = None) -> None:
216
+ self.api_key = api_key or settings.ncbi_api_key
217
+ # Ignore placeholder values from .env.example
218
+ if self.api_key == "your-ncbi-key-here":
219
+ self.api_key = None
220
+
221
+ # Use shared rate limiter
222
+ self._limiter = get_pubmed_limiter(self.api_key)
223
+ ```
224
 
225
  ### Agent Configuration
226
 
 
239
 
240
  The agent configuration fields:
241
 
242
+ ```80:85:src/utils/config.py
243
+ # Agent Configuration
244
+ max_iterations: int = Field(default=10, ge=1, le=50)
245
+ search_timeout: int = Field(default=30, description="Seconds to wait for search")
246
+ use_graph_execution: bool = Field(
247
+ default=False, description="Use graph-based execution for research flows"
248
+ )
249
+ ```
250
 
251
  ### Budget & Rate Limiting Configuration
252
 
 
265
 
266
  The budget configuration with validation:
267
 
268
+ ```87:105:src/utils/config.py
269
+ # Budget & Rate Limiting Configuration
270
+ default_token_limit: int = Field(
271
+ default=100000,
272
+ ge=1000,
273
+ le=1000000,
274
+ description="Default token budget per research loop",
275
+ )
276
+ default_time_limit_minutes: int = Field(
277
+ default=10,
278
+ ge=1,
279
+ le=120,
280
+ description="Default time limit per research loop (minutes)",
281
+ )
282
+ default_iterations_limit: int = Field(
283
+ default=10,
284
+ ge=1,
285
+ le=50,
286
+ description="Default iterations limit per research loop",
287
+ )
288
+ ```
289
 
290
  ### RAG Service Configuration
291
 
 
304
 
305
  The RAG configuration:
306
 
307
+ ```127:141:src/utils/config.py
308
+ # RAG Service Configuration
309
+ rag_collection_name: str = Field(
310
+ default="deepcritical_evidence",
311
+ description="ChromaDB collection name for RAG",
312
+ )
313
+ rag_similarity_top_k: int = Field(
314
+ default=5,
315
+ ge=1,
316
+ le=50,
317
+ description="Number of top results to retrieve from RAG",
318
+ )
319
+ rag_auto_ingest: bool = Field(
320
+ default=True,
321
+ description="Automatically ingest evidence into RAG",
322
+ )
323
+ ```
324
 
325
  ### ChromaDB Configuration
326
 
 
342
 
343
  The ChromaDB configuration:
344
 
345
+ ```113:125:src/utils/config.py
346
+ chroma_db_path: str = Field(default="./chroma_db", description="ChromaDB storage path")
347
+ chroma_db_persist: bool = Field(
348
+ default=True,
349
+ description="Whether to persist ChromaDB to disk",
350
+ )
351
+ chroma_db_host: str | None = Field(
352
+ default=None,
353
+ description="ChromaDB server host (for remote ChromaDB)",
354
+ )
355
+ chroma_db_port: int | None = Field(
356
+ default=None,
357
+ description="ChromaDB server port (for remote ChromaDB)",
358
+ )
359
+ ```
360
 
361
  ### External Services
362
 
 
374
 
375
  The Modal configuration:
376
 
377
+ ```110:112:src/utils/config.py
378
+ # External Services
379
+ modal_token_id: str | None = Field(default=None, description="Modal token ID")
380
+ modal_token_secret: str | None = Field(default=None, description="Modal token secret")
381
+ ```
382
 
383
  ### Logging Configuration
384
 
 
391
 
392
  The logging configuration:
393
 
394
+ ```107:108:src/utils/config.py
395
+ # Logging
396
+ log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"
397
+ ```
398
 
399
  Logging is configured via the `configure_logging()` function:
400
 
401
+ ```212:231:src/utils/config.py
402
+ def configure_logging(settings: Settings) -> None:
403
+ """Configure structured logging with the configured log level."""
404
+ # Set stdlib logging level from settings
405
+ logging.basicConfig(
406
+ level=getattr(logging, settings.log_level),
407
+ format="%(message)s",
408
+ )
409
+
410
+ structlog.configure(
411
+ processors=[
412
+ structlog.stdlib.filter_by_level,
413
+ structlog.stdlib.add_logger_name,
414
+ structlog.stdlib.add_log_level,
415
+ structlog.processors.TimeStamper(fmt="iso"),
416
+ structlog.processors.JSONRenderer(),
417
+ ],
418
+ wrapper_class=structlog.stdlib.BoundLogger,
419
+ context_class=dict,
420
+ logger_factory=structlog.stdlib.LoggerFactory(),
421
+ )
422
+ ```
423
 
424
  ## Configuration Properties
425
 
 
429
 
430
  Check which API keys are available:
431
 
432
+ ```171:189:src/utils/config.py
433
+ @property
434
+ def has_openai_key(self) -> bool:
435
+ """Check if OpenAI API key is available."""
436
+ return bool(self.openai_api_key)
437
+
438
+ @property
439
+ def has_anthropic_key(self) -> bool:
440
+ """Check if Anthropic API key is available."""
441
+ return bool(self.anthropic_api_key)
442
+
443
+ @property
444
+ def has_huggingface_key(self) -> bool:
445
+ """Check if HuggingFace API key is available."""
446
+ return bool(self.huggingface_api_key or self.hf_token)
447
+
448
+ @property
449
+ def has_any_llm_key(self) -> bool:
450
+ """Check if any LLM API key is available."""
451
+ return self.has_openai_key or self.has_anthropic_key or self.has_huggingface_key
452
+ ```
453
 
454
  **Usage:**
455
 
 
478
 
479
  Check if external services are configured:
480
 
481
+ ```143:146:src/utils/config.py
482
+ @property
483
+ def modal_available(self) -> bool:
484
+ """Check if Modal credentials are configured."""
485
+ return bool(self.modal_token_id and self.modal_token_secret)
486
+ ```
487
 
488
+ ```191:204:src/utils/config.py
489
+ @property
490
+ def web_search_available(self) -> bool:
491
+ """Check if web search is available (either no-key provider or API key present)."""
492
+ if self.web_search_provider == "duckduckgo":
493
+ return True # No API key required
494
+ if self.web_search_provider == "serper":
495
+ return bool(self.serper_api_key)
496
+ if self.web_search_provider == "searchxng":
497
+ return bool(self.searchxng_host)
498
+ if self.web_search_provider == "brave":
499
+ return bool(self.brave_api_key)
500
+ if self.web_search_provider == "tavily":
501
+ return bool(self.tavily_api_key)
502
+ return False
503
+ ```
504
 
505
  **Usage:**
506
 
 
521
 
522
  Get the API key for the configured provider:
523
 
524
+ ```148:160:src/utils/config.py
525
+ def get_api_key(self) -> str:
526
+ """Get the API key for the configured provider."""
527
+ if self.llm_provider == "openai":
528
+ if not self.openai_api_key:
529
+ raise ConfigurationError("OPENAI_API_KEY not set")
530
+ return self.openai_api_key
531
+
532
+ if self.llm_provider == "anthropic":
533
+ if not self.anthropic_api_key:
534
+ raise ConfigurationError("ANTHROPIC_API_KEY not set")
535
+ return self.anthropic_api_key
536
+
537
+ raise ConfigurationError(f"Unknown LLM provider: {self.llm_provider}")
538
+ ```
539
 
540
  For OpenAI-specific operations (e.g., Magentic mode):
541
 
542
+ ```162:169:src/utils/config.py
543
+ def get_openai_api_key(self) -> str:
544
+ """Get OpenAI API key (required for Magentic function calling)."""
545
+ if not self.openai_api_key:
546
+ raise ConfigurationError(
547
+ "OPENAI_API_KEY not set. Magentic mode requires OpenAI for function calling. "
548
+ "Use mode='simple' for other providers."
549
+ )
550
+ return self.openai_api_key
551
+ ```
552
 
553
  ## Configuration Usage in Codebase
554
 
 
558
 
559
  The LLM factory uses settings to create appropriate models:
560
 
561
+ ```129:144:src/utils/llm_factory.py
562
+ if settings.llm_provider == "huggingface":
563
+ model_name = settings.huggingface_model or "meta-llama/Llama-3.1-8B-Instruct"
564
+ hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
565
+ return HuggingFaceModel(model_name, provider=hf_provider)
566
+
567
+ if settings.llm_provider == "openai":
568
+ if not settings.openai_api_key:
569
+ raise ConfigurationError("OPENAI_API_KEY not set for pydantic-ai")
570
+ provider = OpenAIProvider(api_key=settings.openai_api_key)
571
+ return OpenAIModel(settings.openai_model, provider=provider)
572
+
573
+ if settings.llm_provider == "anthropic":
574
+ if not settings.anthropic_api_key:
575
+ raise ConfigurationError("ANTHROPIC_API_KEY not set for pydantic-ai")
576
+ anthropic_provider = AnthropicProvider(api_key=settings.anthropic_api_key)
577
+ return AnthropicModel(settings.anthropic_model, provider=anthropic_provider)
578
+ ```
579
 
580
  ### Embedding Service
581
 
582
  The embedding service uses local embedding model configuration:
583
 
584
+ ```29:31:src/services/embeddings.py
585
+ def __init__(self, model_name: str | None = None):
586
+ self._model_name = model_name or settings.local_embedding_model
587
+ self._model = SentenceTransformer(self._model_name)
588
+ ```
589
 
590
  ### Orchestrator Factory
591
 
592
  The orchestrator factory uses settings to determine mode:
593
 
594
+ ```69:80:src/orchestrator_factory.py
595
+ def _determine_mode(explicit_mode: str | None) -> str:
596
+ """Determine which mode to use."""
597
+ if explicit_mode:
598
+ if explicit_mode in ("magentic", "advanced"):
599
+ return "advanced"
600
+ return "simple"
601
+
602
+ # Auto-detect: advanced if paid API key available
603
+ if settings.has_openai_key:
604
+ return "advanced"
605
+
606
+ return "simple"
607
+ ```
608
 
609
  ## Environment Variables Reference
610
 
 
687
 
688
  The `max_iterations` field has range validation:
689
 
690
+ ```81:81:src/utils/config.py
691
+ max_iterations: int = Field(default=10, ge=1, le=50)
692
+ ```
693
 
694
  The `llm_provider` field has literal validation:
695
 
696
+ ```26:28:src/utils/config.py
697
+ llm_provider: Literal["openai", "anthropic", "huggingface"] = Field(
698
+ default="openai", description="Which LLM provider to use"
699
+ )
700
+ ```
701
 
702
  ## Error Handling
703
 
CONTRIBUTING.md → docs/contributing.md RENAMED
@@ -1,26 +1,24 @@
1
- # Contributing to The DETERMINATOR
2
 
3
- Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
4
 
5
  ## Table of Contents
6
 
7
  - [Git Workflow](#git-workflow)
8
  - [Getting Started](#getting-started)
9
  - [Development Commands](#development-commands)
 
 
 
 
 
 
 
10
  - [MCP Integration](#mcp-integration)
11
  - [Common Pitfalls](#common-pitfalls)
12
  - [Key Principles](#key-principles)
13
  - [Pull Request Process](#pull-request-process)
14
 
15
- > **Note**: Additional sections (Code Style, Error Handling, Testing, Implementation Patterns, Code Quality, and Prompt Engineering) are available as separate pages in the [documentation](https://deepcritical.github.io/GradioDemo/contributing/).
16
- > **Note on Project Names**: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.
17
-
18
- ## Repository Information
19
-
20
- - **GitHub Repository**: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo) (source of truth, PRs, code review)
21
- - **HuggingFace Space**: [`DataQuests/DeepCritical`](https://huggingface.co/spaces/DataQuests/DeepCritical) (deployment/demo)
22
- - **Package Name**: `determinator` (Python package name in `pyproject.toml`)
23
-
24
  ## Git Workflow
25
 
26
  - `main`: Production-ready (GitHub)
@@ -29,31 +27,9 @@ Thank you for your interest in contributing to The DETERMINATOR! This guide will
29
  - **NEVER** push directly to `main` or `dev` on HuggingFace
30
  - GitHub is source of truth; HuggingFace is for deployment
31
 
32
- ### Dual Repository Setup
33
-
34
- This project uses a dual repository setup:
35
-
36
- - **GitHub (`DeepCritical/GradioDemo`)**: Source of truth for code, PRs, and code review
37
- - **HuggingFace (`DataQuests/DeepCritical`)**: Deployment target for the Gradio demo
38
-
39
- #### Remote Configuration
40
-
41
- When cloning, set up remotes as follows:
42
-
43
- ```bash
44
- # Clone from GitHub
45
- git clone https://github.com/DeepCritical/GradioDemo.git
46
- cd GradioDemo
47
-
48
- # Add HuggingFace remote (optional, for deployment)
49
- git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical
50
- ```
51
-
52
- **Important**: Never push directly to `main` or `dev` on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
53
-
54
  ## Getting Started
55
 
56
- 1. **Fork the repository** on GitHub: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo)
57
  2. **Clone your fork**:
58
 
59
  ```bash
@@ -64,8 +40,7 @@ git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/Dee
64
  3. **Install dependencies**:
65
 
66
  ```bash
67
- uv sync --all-extras
68
- uv run pre-commit install
69
  ```
70
 
71
  4. **Create a feature branch**:
@@ -78,9 +53,7 @@ git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/Dee
78
  6. **Run checks**:
79
 
80
  ```bash
81
- uv run ruff check src tests
82
- uv run mypy src
83
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
84
  ```
85
 
86
  7. **Commit and push**:
@@ -89,72 +62,22 @@ git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/Dee
89
  git commit -m "Description of changes"
90
  git push origin yourname-feature-name
91
  ```
92
-
93
  8. **Create a pull request** on GitHub
94
 
95
- ## Package Manager
96
-
97
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
98
-
99
- ### Installation
100
-
101
- ```bash
102
- # Install uv if you haven't already (recommended: standalone installer)
103
- # Unix/macOS/Linux:
104
- curl -LsSf https://astral.sh/uv/install.sh | sh
105
-
106
- # Windows (PowerShell):
107
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
108
-
109
- # Alternative: pipx install uv
110
- # Or: pip install uv
111
-
112
- # Sync all dependencies including dev extras
113
- uv sync --all-extras
114
-
115
- # Install pre-commit hooks
116
- uv run pre-commit install
117
- ```
118
-
119
  ## Development Commands
120
 
121
  ```bash
122
- # Installation
123
- uv sync --all-extras # Install all dependencies including dev
124
- uv run pre-commit install # Install pre-commit hooks
125
-
126
- # Code Quality Checks (run all before committing)
127
- uv run ruff check src tests # Lint with ruff
128
- uv run ruff format src tests # Format with ruff
129
- uv run mypy src # Type checking
130
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire # Tests with coverage
131
-
132
- # Testing Commands
133
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire # Run unit tests (excludes OpenAI tests)
134
- uv run pytest tests/ -v -m "huggingface" -p no:logfire # Run HuggingFace tests
135
- uv run pytest tests/ -v -p no:logfire # Run all tests
136
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire # Tests with terminal coverage
137
- uv run pytest --cov=src --cov-report=html -p no:logfire # Generate HTML coverage report (opens htmlcov/index.html)
138
-
139
- # Documentation Commands
140
- uv run mkdocs build # Build documentation
141
- uv run mkdocs serve # Serve documentation locally (http://127.0.0.1:8000)
142
  ```
143
 
144
- ### Test Markers
145
-
146
- The project uses pytest markers to categorize tests. See [Testing Guidelines](docs/contributing/testing.md) for details:
147
-
148
- - `unit`: Unit tests (mocked, fast)
149
- - `integration`: Integration tests (real APIs)
150
- - `slow`: Slow tests
151
- - `openai`: Tests requiring OpenAI API key
152
- - `huggingface`: Tests requiring HuggingFace API key
153
- - `embedding_provider`: Tests requiring API-based embedding providers
154
- - `local_embeddings`: Tests using local embeddings
155
-
156
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
157
-
158
  ## Code Style & Conventions
159
 
160
  ### Type Safety
@@ -163,9 +86,11 @@ The project uses pytest markers to categorize tests. See [Testing Guidelines](do
163
  - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
164
  - Use `TYPE_CHECKING` imports for circular dependencies:
165
 
166
- <!--codeinclude-->
167
- [TYPE_CHECKING Import Pattern](../src/utils/citation_validator.py) start_line:8 end_line:11
168
- <!--/codeinclude-->
 
 
169
 
170
  ### Pydantic Models
171
 
@@ -200,10 +125,10 @@ result = await loop.run_in_executor(None, cpu_bound_function, args)
200
 
201
  ### Pre-commit
202
 
203
- - Pre-commit hooks run automatically on commit
204
  - Must pass: lint + typecheck + test-cov
205
- - Install hooks with: `uv run pre-commit install`
206
- - Note: `uv sync --all-extras` installs the pre-commit package, but you must run `uv run pre-commit install` separately to set up the git hooks
207
 
208
  ## Error Handling & Logging
209
 
@@ -211,9 +136,10 @@ result = await loop.run_in_executor(None, cpu_bound_function, args)
211
 
212
  Use custom exception hierarchy (`src/utils/exceptions.py`):
213
 
214
- <!--codeinclude-->
215
- [Exception Hierarchy](../src/utils/exceptions.py) start_line:4 end_line:31
216
- <!--/codeinclude-->
 
217
 
218
  ### Error Handling Rules
219
 
@@ -273,7 +199,7 @@ except httpx.HTTPError as e:
273
  1. Write failing test in `tests/unit/`
274
  2. Implement in `src/`
275
  3. Ensure test passes
276
- 4. Run checks: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
277
 
278
  ### Test Examples
279
 
@@ -294,8 +220,7 @@ async def test_real_pubmed_search():
294
 
295
  ### Test Coverage
296
 
297
- - Run `uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire` for coverage report
298
- - Run `uv run pytest --cov=src --cov-report=html -p no:logfire` for HTML coverage report (opens `htmlcov/index.html`)
299
  - Aim for >80% coverage on critical paths
300
  - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
301
 
@@ -339,9 +264,11 @@ class MySearchTool:
339
  - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
340
  - Check requirements before initialization:
341
 
342
- <!--codeinclude-->
343
- [Check Magentic Requirements](../src/utils/llm_factory.py) start_line:152 end_line:170
344
- <!--/codeinclude-->
 
 
345
 
346
  ### State Management
347
 
@@ -353,9 +280,11 @@ class MySearchTool:
353
 
354
  Use `@lru_cache(maxsize=1)` for singletons:
355
 
356
- <!--codeinclude-->
357
- [Singleton Pattern Example](../src/services/statistical_analyzer.py) start_line:252 end_line:255
358
- <!--/codeinclude-->
 
 
359
 
360
  - Lazy initialization to avoid requiring dependencies at import time
361
 
@@ -369,9 +298,22 @@ Use `@lru_cache(maxsize=1)` for singletons:
369
 
370
  Example:
371
 
372
- <!--codeinclude-->
373
- [Search Method Docstring Example](../src/tools/pubmed.py) start_line:51 end_line:58
374
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
375
 
376
  ### Code Comments
377
 
@@ -468,7 +410,7 @@ Example:
468
 
469
  ## Pull Request Process
470
 
471
- 1. Ensure all checks pass: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
472
  2. Update documentation if needed
473
  3. Add tests for new features
474
  4. Update CHANGELOG if applicable
@@ -476,19 +418,11 @@ Example:
476
  6. Address review feedback
477
  7. Wait for approval before merging
478
 
479
- ## Project Structure
480
-
481
- - `src/`: Main source code
482
- - `tests/`: Test files (`unit/` and `integration/`)
483
- - `docs/`: Documentation source files (MkDocs)
484
- - `examples/`: Example usage scripts
485
- - `pyproject.toml`: Project configuration and dependencies
486
- - `.pre-commit-config.yaml`: Pre-commit hook configuration
487
-
488
  ## Questions?
489
 
490
- - Open an issue on [GitHub](https://github.com/DeepCritical/GradioDemo)
491
- - Check existing [documentation](https://deepcritical.github.io/GradioDemo/)
492
  - Review code examples in the codebase
493
 
494
- Thank you for contributing to The DETERMINATOR!
 
 
1
+ # Contributing to DeepCritical
2
 
3
+ Thank you for your interest in contributing to DeepCritical! This guide will help you get started.
4
 
5
  ## Table of Contents
6
 
7
  - [Git Workflow](#git-workflow)
8
  - [Getting Started](#getting-started)
9
  - [Development Commands](#development-commands)
10
+ - [Code Style & Conventions](#code-style--conventions)
11
+ - [Type Safety](#type-safety)
12
+ - [Error Handling & Logging](#error-handling--logging)
13
+ - [Testing Requirements](#testing-requirements)
14
+ - [Implementation Patterns](#implementation-patterns)
15
+ - [Code Quality & Documentation](#code-quality--documentation)
16
+ - [Prompt Engineering & Citation Validation](#prompt-engineering--citation-validation)
17
  - [MCP Integration](#mcp-integration)
18
  - [Common Pitfalls](#common-pitfalls)
19
  - [Key Principles](#key-principles)
20
  - [Pull Request Process](#pull-request-process)
21
 
 
 
 
 
 
 
 
 
 
22
  ## Git Workflow
23
 
24
  - `main`: Production-ready (GitHub)
 
27
  - **NEVER** push directly to `main` or `dev` on HuggingFace
28
  - GitHub is source of truth; HuggingFace is for deployment
29
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
30
  ## Getting Started
31
 
32
+ 1. **Fork the repository** on GitHub
33
  2. **Clone your fork**:
34
 
35
  ```bash
 
40
  3. **Install dependencies**:
41
 
42
  ```bash
43
+ make install
 
44
  ```
45
 
46
  4. **Create a feature branch**:
 
53
  6. **Run checks**:
54
 
55
  ```bash
56
+ make check
 
 
57
  ```
58
 
59
  7. **Commit and push**:
 
62
  git commit -m "Description of changes"
63
  git push origin yourname-feature-name
64
  ```
 
65
  8. **Create a pull request** on GitHub
66
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
67
  ## Development Commands
68
 
69
  ```bash
70
+ make install # Install dependencies + pre-commit
71
+ make check # Lint + typecheck + test (MUST PASS)
72
+ make test # Run unit tests
73
+ make lint # Run ruff
74
+ make format # Format with ruff
75
+ make typecheck # Run mypy
76
+ make test-cov # Test with coverage
77
+ make docs-build # Build documentation
78
+ make docs-serve # Serve documentation locally
 
 
 
 
 
 
 
 
 
 
 
79
  ```
80
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
81
  ## Code Style & Conventions
82
 
83
  ### Type Safety
 
86
  - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
87
  - Use `TYPE_CHECKING` imports for circular dependencies:
88
 
89
+ ```python
90
+ from typing import TYPE_CHECKING
91
+ if TYPE_CHECKING:
92
+ from src.services.embeddings import EmbeddingService
93
+ ```
94
 
95
  ### Pydantic Models
96
 
 
125
 
126
  ### Pre-commit
127
 
128
+ - Run `make check` before committing
129
  - Must pass: lint + typecheck + test-cov
130
+ - Pre-commit hooks installed via `make install`
131
+ - **CRITICAL**: Run the full pre-commit checks before opening a PR for review (draft PRs are exempt); otherwise Obstacle is the Way will lose his mind
132
 
133
  ## Error Handling & Logging
134
 
 
136
 
137
  Use custom exception hierarchy (`src/utils/exceptions.py`):
138
 
139
+ - `DeepCriticalError` (base)
140
+ - `SearchError`
+ - `RateLimitError`
141
+ - `JudgeError`
142
+ - `ConfigurationError`
143
 
144
  ### Error Handling Rules
145
 
 
199
  1. Write failing test in `tests/unit/`
200
  2. Implement in `src/`
201
  3. Ensure test passes
202
+ 4. Run `make check` (lint + typecheck + test)
203
 
204
  ### Test Examples
205
 
 
220
 
221
  ### Test Coverage
222
 
223
+ - Run `make test-cov` for coverage report
 
224
  - Aim for >80% coverage on critical paths
225
  - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
226
 
 
264
  - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
265
  - Check requirements before initialization:
266
 
267
+ ```python
268
+ def check_magentic_requirements() -> None:
269
+ if not settings.has_openai_key:
270
+ raise ConfigurationError("Magentic requires OpenAI")
271
+ ```
272
 
273
  ### State Management
274
 
 
280
 
281
  Use `@lru_cache(maxsize=1)` for singletons:
282
 
283
+ ```python
284
+ @lru_cache(maxsize=1)
285
+ def get_embedding_service() -> EmbeddingService:
286
+ return EmbeddingService()
287
+ ```
288
 
289
  - Lazy initialization to avoid requiring dependencies at import time
290
 
 
298
 
299
  Example:
300
 
301
+ ```python
302
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
303
+ """Search PubMed and return evidence.
304
+
305
+ Args:
306
+ query: The search query string
307
+ max_results: Maximum number of results to return
308
+
309
+ Returns:
310
+ List of Evidence objects
311
+
312
+ Raises:
313
+ SearchError: If the search fails
314
+ RateLimitError: If we hit rate limits
315
+ """
316
+ ```
317
 
318
  ### Code Comments
319
 
 
410
 
411
  ## Pull Request Process
412
 
413
+ 1. Ensure all checks pass: `make check`
414
  2. Update documentation if needed
415
  3. Add tests for new features
416
  4. Update CHANGELOG if applicable
 
418
  6. Address review feedback
419
  7. Wait for approval before merging
420
 
 
 
 
 
 
 
 
 
 
421
  ## Questions?
422
 
423
+ - Open an issue on GitHub
424
+ - Check existing documentation
425
  - Review code examples in the codebase
426
 
427
+ Thank you for contributing to DeepCritical!
428
+
docs/contributing/code-quality.md CHANGED
@@ -1,6 +1,6 @@
1
  # Code Quality & Documentation
2
 
3
- This document outlines code quality standards and documentation requirements for The DETERMINATOR.
4
 
5
  ## Linting
6
 
@@ -12,9 +12,6 @@ This document outlines code quality standards and documentation requirements for
12
  - `PLR2004`: Magic values (statistical constants)
13
  - `PLW0603`: Global statement (singleton pattern)
14
  - `PLC0415`: Lazy imports for optional dependencies
15
- - `E402`: Module level import not at top (needed for pytest.importorskip)
16
- - `E501`: Line too long (ignore line length violations)
17
- - `RUF100`: Unused noqa (version differences between local/CI)
18
 
19
  ## Type Checking
20
 
@@ -25,75 +22,12 @@ This document outlines code quality standards and documentation requirements for
25
 
26
  ## Pre-commit
27
 
28
- Pre-commit hooks run automatically on commit to ensure code quality. Configuration is in `.pre-commit-config.yaml`.
29
-
30
- ### Installation
31
-
32
- ```bash
33
- # Install dependencies (includes pre-commit package)
34
- uv sync --all-extras
35
-
36
- # Set up git hooks (must be run separately)
37
- uv run pre-commit install
38
- ```
39
-
40
- **Note**: `uv sync --all-extras` installs the pre-commit package, but you must run `uv run pre-commit install` separately to set up the git hooks.
41
-
42
- ### Pre-commit Hooks
43
-
44
- The following hooks run automatically on commit:
45
-
46
- 1. **ruff**: Lints code and fixes issues automatically
47
- - Runs on: `src/` (excludes `tests/`, `reference_repos/`)
48
- - Auto-fixes: Yes
49
-
50
- 2. **ruff-format**: Formats code with ruff
51
- - Runs on: `src/` (excludes `tests/`, `reference_repos/`)
52
- - Auto-fixes: Yes
53
-
54
- 3. **mypy**: Type checking
55
- - Runs on: `src/` (excludes `folder/`)
56
- - Additional dependencies: pydantic, pydantic-settings, tenacity, pydantic-ai
57
-
58
- 4. **pytest-unit**: Runs unit tests (excludes OpenAI and embedding_provider tests)
59
- - Runs: `tests/unit/` with `-m "not openai and not embedding_provider"`
60
- - Always runs: Yes (not just on changed files)
61
-
62
- 5. **pytest-local-embeddings**: Runs local embedding tests
63
- - Runs: `tests/` with `-m "local_embeddings"`
64
- - Always runs: Yes
65
-
66
- ### Manual Pre-commit Run
67
-
68
- To run pre-commit hooks manually (without committing):
69
-
70
- ```bash
71
- uv run pre-commit run --all-files
72
- ```
73
-
74
- ### Troubleshooting
75
-
76
- - **Hooks failing**: Fix the issues shown in the output, then commit again
77
- - **Skipping hooks**: Use `git commit --no-verify` (not recommended)
78
- - **Hook not running**: Ensure hooks are installed with `uv run pre-commit install`
79
- - **Type errors**: Check that all dependencies are installed with `uv sync --all-extras`
80
 
81
  ## Documentation
82
 
83
- ### Building Documentation
84
-
85
- Documentation is built using MkDocs. Source files are in `docs/`, and the configuration is in `mkdocs.yml`.
86
-
87
- ```bash
88
- # Build documentation
89
- uv run mkdocs build
90
-
91
- # Serve documentation locally (http://127.0.0.1:8000)
92
- uv run mkdocs serve
93
- ```
94
-
95
- The documentation site is published at: <https://deepcritical.github.io/GradioDemo/>
96
-
97
  ### Docstrings
98
 
99
  - Google-style docstrings for all public functions
@@ -102,9 +36,22 @@ The documentation site is published at: <https://deepcritical.github.io/GradioDe
102
 
103
  Example:
104
 
105
- <!--codeinclude-->
106
- [Search Method Docstring Example](../src/tools/pubmed.py) start_line:51 end_line:70
107
- <!--/codeinclude-->
 
 
 
 
 
 
 
 
 
 
 
 
 
108
 
109
  ### Code Comments
110
 
@@ -118,3 +65,13 @@ Example:
118
 
119
  - [Code Style](code-style.md) - Code style guidelines
120
  - [Testing](testing.md) - Testing guidelines
 
 
 
 
 
 
 
 
 
 
 
1
  # Code Quality & Documentation
2
 
3
+ This document outlines code quality standards and documentation requirements.
4
 
5
  ## Linting
6
 
 
12
  - `PLR2004`: Magic values (statistical constants)
13
  - `PLW0603`: Global statement (singleton pattern)
14
  - `PLC0415`: Lazy imports for optional dependencies
 
 
 
15
 
16
  ## Type Checking
17
 
 
22
 
23
  ## Pre-commit
24
 
25
+ - Run `make check` before committing
26
+ - Must pass: lint + typecheck + test-cov
27
+ - Pre-commit hooks installed via `make install`
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
28
 
29
  ## Documentation

  ### Docstrings

  - Google-style docstrings for all public functions

  Example:

+ ```python
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
+     """Search PubMed and return evidence.
+
+     Args:
+         query: The search query string
+         max_results: Maximum number of results to return
+
+     Returns:
+         List of Evidence objects
+
+     Raises:
+         SearchError: If the search fails
+         RateLimitError: If we hit rate limits
+     """
+ ```

  ### Code Comments

  - [Code Style](code-style.md) - Code style guidelines
  - [Testing](testing.md) - Testing guidelines
docs/contributing/code-style.md CHANGED

@@ -1,44 +1,6 @@

  # Code Style & Conventions

- This document outlines the code style and conventions for The DETERMINATOR.
-
- ## Package Manager
-
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
-
- ### Installation
-
- ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv
-
- # Sync all dependencies including dev extras
- uv sync --all-extras
- ```
-
- ### Running Commands
-
- All development commands should use the `uv run` prefix:
-
- ```bash
- # Instead of: pytest tests/
- uv run pytest tests/
-
- # Instead of: ruff check src
- uv run ruff check src
-
- # Instead of: mypy src
- uv run mypy src
- ```
-
- This ensures commands run in the correct virtual environment managed by `uv`.

  ## Type Safety

@@ -46,9 +8,11 @@

  - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
  - Use `TYPE_CHECKING` imports for circular dependencies:

- <!--codeinclude-->
- [TYPE_CHECKING Import Pattern](../src/utils/citation_validator.py) start_line:8 end_line:11
- <!--/codeinclude-->

  ## Pydantic Models

@@ -81,3 +45,13 @@ result = await loop.run_in_executor(None, cpu_bound_function, args)

  - [Error Handling](error-handling.md) - Error handling guidelines
  - [Implementation Patterns](implementation-patterns.md) - Common patterns
  # Code Style & Conventions

+ This document outlines the code style and conventions for DeepCritical.

  ## Type Safety

  - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
  - Use `TYPE_CHECKING` imports for circular dependencies:

+ ```python
+ from typing import TYPE_CHECKING
+ if TYPE_CHECKING:
+     from src.services.embeddings import EmbeddingService
+ ```
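
Because the import above is skipped at runtime, the name only exists for the type checker, so annotations that reference it must be strings (or `from __future__ import annotations` must be in effect). A minimal illustration, with a hypothetical function name:

```python
def get_service() -> "EmbeddingService":  # string annotation; resolved only by the type checker
    ...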
```
  ## Pydantic Models

  - [Error Handling](error-handling.md) - Error handling guidelines
  - [Implementation Patterns](implementation-patterns.md) - Common patterns
docs/contributing/error-handling.md CHANGED

@@ -1,14 +1,15 @@

  # Error Handling & Logging

- This document outlines error handling and logging conventions for The DETERMINATOR.

  ## Exception Hierarchy

  Use custom exception hierarchy (`src/utils/exceptions.py`):

- <!--codeinclude-->
- [Exception Hierarchy](../src/utils/exceptions.py) start_line:4 end_line:31
- <!--/codeinclude-->

  ## Error Handling Rules

@@ -52,3 +53,13 @@ except httpx.HTTPError as e:

  - [Code Style](code-style.md) - Code style guidelines
  - [Testing](testing.md) - Testing guidelines
  # Error Handling & Logging

+ This document outlines error handling and logging conventions for DeepCritical.

  ## Exception Hierarchy

  Use custom exception hierarchy (`src/utils/exceptions.py`):

+ - `DeepCriticalError` (base)
+ - `SearchError`
+ - `RateLimitError`
+ - `JudgeError`
+ - `ConfigurationError`
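
A minimal sketch of how such a hierarchy might be defined (the real definitions live in `src/utils/exceptions.py`; treating `RateLimitError` as a subclass of `SearchError` is an assumption based on the grouping above):

```python
class DeepCriticalError(Exception):
    """Base class for all DeepCritical errors."""

class SearchError(DeepCriticalError):
    """A search operation failed."""

class RateLimitError(SearchError):
    """An upstream API rate limit was hit (assumed subclass of SearchError)."""

class JudgeError(DeepCriticalError):
    """Evidence judging failed."""

class ConfigurationError(DeepCriticalError):
    """Configuration is missing or invalid."""
```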
  ## Error Handling Rules

  - [Code Style](code-style.md) - Code style guidelines
  - [Testing](testing.md) - Testing guidelines
docs/contributing/implementation-patterns.md CHANGED

@@ -1,6 +1,6 @@

  # Implementation Patterns

- This document outlines common implementation patterns used in The DETERMINATOR.

  ## Search Tools

@@ -40,9 +40,11 @@ class MySearchTool:

  - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
  - Check requirements before initialization:

- <!--codeinclude-->
- [Check Magentic Requirements](../src/utils/llm_factory.py) start_line:152 end_line:170
- <!--/codeinclude-->

  ## State Management

@@ -54,9 +56,11 @@

  Use `@lru_cache(maxsize=1)` for singletons:

- <!--codeinclude-->
- [Singleton Pattern Example](../src/services/statistical_analyzer.py) start_line:252 end_line:255
- <!--/codeinclude-->

  - Lazy initialization to avoid requiring dependencies at import time

@@ -65,3 +69,12 @@

  - [Code Style](code-style.md) - Code style guidelines
  - [Error Handling](error-handling.md) - Error handling guidelines
  # Implementation Patterns

+ This document outlines common implementation patterns used in DeepCritical.

  ## Search Tools

  - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
  - Check requirements before initialization:

+ ```python
+ def check_magentic_requirements() -> None:
+     if not settings.has_openai_key:
+         raise ConfigurationError("Magentic requires OpenAI")
+ ```

  ## State Management

  Use `@lru_cache(maxsize=1)` for singletons:

+ ```python
+ @lru_cache(maxsize=1)
+ def get_embedding_service() -> EmbeddingService:
+     return EmbeddingService()
+ ```
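
Because `maxsize=1` caches the single zero-argument call, repeated lookups return the same instance:

```python
assert get_embedding_service() is get_embedding_service()  # same cached object
```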
  - Lazy initialization to avoid requiring dependencies at import time

  - [Code Style](code-style.md) - Code style guidelines
  - [Error Handling](error-handling.md) - Error handling guidelines
docs/contributing/index.md CHANGED

@@ -1,8 +1,6 @@

- # Contributing to The DETERMINATOR

- Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
-
- > **Note on Project Names**: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.

  ## Git Workflow

@@ -12,138 +10,44 @@

  - **NEVER** push directly to `main` or `dev` on HuggingFace
  - GitHub is source of truth; HuggingFace is for deployment

- ## Repository Information
-
- - **GitHub Repository**: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo) (source of truth, PRs, code review)
- - **HuggingFace Space**: [`DataQuests/DeepCritical`](https://huggingface.co/spaces/DataQuests/DeepCritical) (deployment/demo)
- - **Package Name**: `determinator` (Python package name in `pyproject.toml`)
-
- ### Dual Repository Setup
-
- This project uses a dual repository setup:
-
- - **GitHub (`DeepCritical/GradioDemo`)**: Source of truth for code, PRs, and code review
- - **HuggingFace (`DataQuests/DeepCritical`)**: Deployment target for the Gradio demo
-
- #### Remote Configuration
-
- When cloning, set up remotes as follows:
-
- ```bash
- # Clone from GitHub
- git clone https://github.com/DeepCritical/GradioDemo.git
- cd GradioDemo
-
- # Add HuggingFace remote (optional, for deployment)
- git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical
- ```
-
- **Important**: Never push directly to `main` or `dev` on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
-
- ## Package Manager
-
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
-
- ### Installation
-
- ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv
-
- # Sync all dependencies including dev extras
- uv sync --all-extras
-
- # Install pre-commit hooks
- uv run pre-commit install
- ```
-
  ## Development Commands

  ```bash
- # Installation
- uv sync --all-extras          # Install all dependencies including dev
- uv run pre-commit install     # Install pre-commit hooks
-
- # Code Quality Checks (run all before committing)
- uv run ruff check src tests   # Lint with ruff
- uv run ruff format src tests  # Format with ruff
- uv run mypy src               # Type checking
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with coverage
-
- # Testing Commands
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire  # Run unit tests (excludes OpenAI tests)
- uv run pytest tests/ -v -m "huggingface" -p no:logfire      # Run HuggingFace tests
- uv run pytest tests/ -v -p no:logfire                       # Run all tests
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with terminal coverage
- uv run pytest --cov=src --cov-report=html -p no:logfire     # Generate HTML coverage report (opens htmlcov/index.html)
-
- # Documentation Commands
- uv run mkdocs build           # Build documentation
- uv run mkdocs serve           # Serve documentation locally (http://127.0.0.1:8000)
  ```

- ### Test Markers
-
- The project uses pytest markers to categorize tests. See [Testing Guidelines](testing.md) for details:
-
- - `unit`: Unit tests (mocked, fast)
- - `integration`: Integration tests (real APIs)
- - `slow`: Slow tests
- - `openai`: Tests requiring OpenAI API key
- - `huggingface`: Tests requiring HuggingFace API key
- - `embedding_provider`: Tests requiring API-based embedding providers
- - `local_embeddings`: Tests using local embeddings
-
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
-
  ## Getting Started

- 1. **Fork the repository** on GitHub: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo)
-
  2. **Clone your fork**:
-
  ```bash
  git clone https://github.com/yourusername/GradioDemo.git
  cd GradioDemo
  ```
-
  3. **Install dependencies**:
-
  ```bash
- uv sync --all-extras
- uv run pre-commit install
  ```
-
  4. **Create a feature branch**:
-
  ```bash
  git checkout -b yourname-feature-name
  ```
-
  5. **Make your changes** following the guidelines below
-
  6. **Run checks**:
-
  ```bash
- uv run ruff check src tests
- uv run mypy src
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
  ```
-
  7. **Commit and push**:
-
  ```bash
  git commit -m "Description of changes"
  git push origin yourname-feature-name
  ```
-
  8. **Create a pull request** on GitHub

  ## Development Guidelines

@@ -228,7 +132,7 @@

  ## Pull Request Process

- 1. Ensure all checks pass: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
  2. Update documentation if needed
  3. Add tests for new features
  4. Update CHANGELOG if applicable

@@ -236,19 +140,20 @@

  6. Address review feedback
  7. Wait for approval before merging

- ## Project Structure
-
- - `src/`: Main source code
- - `tests/`: Test files (`unit/` and `integration/`)
- - `docs/`: Documentation source files (MkDocs)
- - `examples/`: Example usage scripts
- - `pyproject.toml`: Project configuration and dependencies
- - `.pre-commit-config.yaml`: Pre-commit hook configuration
-
  ## Questions?

- - Open an issue on [GitHub](https://github.com/DeepCritical/GradioDemo)
- - Check existing [documentation](https://deepcritical.github.io/GradioDemo/)
  - Review code examples in the codebase

- Thank you for contributing to The DETERMINATOR!
  # Contributing to DeepCritical

+ Thank you for your interest in contributing to DeepCritical! This guide will help you get started.

  ## Git Workflow

  - **NEVER** push directly to `main` or `dev` on HuggingFace
  - GitHub is source of truth; HuggingFace is for deployment

  ## Development Commands

  ```bash
+ make install    # Install dependencies + pre-commit
+ make check      # Lint + typecheck + test (MUST PASS)
+ make test       # Run unit tests
+ make lint       # Run ruff
+ make format     # Format with ruff
+ make typecheck  # Run mypy
+ make test-cov   # Test with coverage
  ```

  ## Getting Started

+ 1. **Fork the repository** on GitHub

  2. **Clone your fork**:

  ```bash
  git clone https://github.com/yourusername/GradioDemo.git
  cd GradioDemo
  ```

  3. **Install dependencies**:

  ```bash
+ make install
  ```

  4. **Create a feature branch**:

  ```bash
  git checkout -b yourname-feature-name
  ```

  5. **Make your changes** following the guidelines below

  6. **Run checks**:

  ```bash
+ make check
  ```

  7. **Commit and push**:

  ```bash
  git commit -m "Description of changes"
  git push origin yourname-feature-name
  ```

  8. **Create a pull request** on GitHub

  ## Development Guidelines

  ## Pull Request Process

+ 1. Ensure all checks pass: `make check`
  2. Update documentation if needed
  3. Add tests for new features
  4. Update CHANGELOG if applicable

  6. Address review feedback
  7. Wait for approval before merging

  ## Questions?

+ - Open an issue on GitHub
+ - Check existing documentation
  - Review code examples in the codebase

+ Thank you for contributing to DeepCritical!
docs/contributing/prompt-engineering.md CHANGED

@@ -53,3 +53,13 @@ This document outlines prompt engineering guidelines and citation validation rules.

  - [Code Quality](code-quality.md) - Code quality guidelines
  - [Error Handling](error-handling.md) - Error handling guidelines
docs/contributing/testing.md CHANGED

@@ -1,45 +1,12 @@

  # Testing Requirements

- This document outlines testing requirements and guidelines for The DETERMINATOR.

  ## Test Structure

  - Unit tests in `tests/unit/` (mocked, fast)
  - Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
- - Use markers: `unit`, `integration`, `slow`, `openai`, `huggingface`, `embedding_provider`, `local_embeddings`
-
- ## Test Markers
-
- The project uses pytest markers to categorize tests. These markers are defined in `pyproject.toml`:
-
- - `@pytest.mark.unit`: Unit tests (mocked, fast) - Run with `-m "unit"`
- - `@pytest.mark.integration`: Integration tests (real APIs) - Run with `-m "integration"`
- - `@pytest.mark.slow`: Slow tests - Run with `-m "slow"`
- - `@pytest.mark.openai`: Tests requiring OpenAI API key - Run with `-m "openai"` or exclude with `-m "not openai"`
- - `@pytest.mark.huggingface`: Tests requiring HuggingFace API key or using HuggingFace models - Run with `-m "huggingface"`
- - `@pytest.mark.embedding_provider`: Tests requiring API-based embedding providers (OpenAI, etc.) - Run with `-m "embedding_provider"`
- - `@pytest.mark.local_embeddings`: Tests using local embeddings (sentence-transformers, ChromaDB) - Run with `-m "local_embeddings"`
-
- ### Running Tests by Marker
-
- ```bash
- # Run only unit tests (excludes OpenAI tests by default)
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
-
- # Run HuggingFace tests
- uv run pytest tests/ -v -m "huggingface" -p no:logfire
-
- # Run all tests
- uv run pytest tests/ -v -p no:logfire
-
- # Run only local embedding tests
- uv run pytest tests/ -v -m "local_embeddings" -p no:logfire
-
- # Exclude slow tests
- uv run pytest tests/ -v -m "not slow" -p no:logfire
- ```
-
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.

  ## Mocking

@@ -53,20 +20,7 @@

  1. Write failing test in `tests/unit/`
  2. Implement in `src/`
  3. Ensure test passes
- 4. Run checks: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
-
- ### Test Command Examples
-
- ```bash
- # Run unit tests (default, excludes OpenAI tests)
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
-
- # Run HuggingFace tests
- uv run pytest tests/ -v -m "huggingface" -p no:logfire
-
- # Run all tests
- uv run pytest tests/ -v -p no:logfire
- ```

  ## Test Examples

@@ -87,29 +41,21 @@ async def test_real_pubmed_search():

  ## Test Coverage

- ### Terminal Coverage Report
-
- ```bash
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
- ```
-
- This shows coverage with missing lines highlighted in the terminal output.
-
- ### HTML Coverage Report
-
- ```bash
- uv run pytest --cov=src --cov-report=html -p no:logfire
- ```
-
- This generates an HTML coverage report in `htmlcov/index.html`. Open this file in your browser to see detailed coverage information.
-
- ### Coverage Goals
-
- - Aim for >80% coverage on critical paths
- - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
- - Coverage configuration is in `pyproject.toml` under `[tool.coverage.*]`
-
- ## See Also
-
- - [Code Style](code-style.md) - Code style guidelines
- - [Implementation Patterns](implementation-patterns.md) - Common patterns
  # Testing Requirements

+ This document outlines testing requirements and guidelines for DeepCritical.

  ## Test Structure

  - Unit tests in `tests/unit/` (mocked, fast)
  - Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
+ - Use markers: `unit`, `integration`, `slow`
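
Markers are selected with pytest's `-m` flag, for example (commands as documented for this project; `-p no:logfire` disables the logfire plugin during testing):

```bash
# Fast unit tests only, excluding tests that need an OpenAI key
uv run pytest tests/unit/ -v -m "not openai" -p no:logfire

# Integration tests against real APIs
uv run pytest tests/ -v -m "integration" -p no:logfire
```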
  ## Mocking

  1. Write failing test in `tests/unit/`
  2. Implement in `src/`
  3. Ensure test passes
+ 4. Run `make check` (lint + typecheck + test)
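
As an illustration of step 1, a minimal failing unit test might look like this (a sketch: the exception import matches `src/utils/exceptions.py`, but the tool class, its module path, and the empty-query behavior are hypothetical, and `@pytest.mark.asyncio` assumes pytest-asyncio is installed):

```python
import pytest

from src.tools.my_tool import MySearchTool  # hypothetical module path
from src.utils.exceptions import SearchError

@pytest.mark.unit
@pytest.mark.asyncio
async def test_search_rejects_empty_query() -> None:
    with pytest.raises(SearchError):
        await MySearchTool().search("")
```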
  ## Test Examples

  ## Test Coverage

+ - Run `make test-cov` for coverage report
+ - Aim for >80% coverage on critical paths
+ - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
+
+ ## See Also
+
+ - [Code Style](code-style.md) - Code style guidelines
+ - [Implementation Patterns](implementation-patterns.md) - Common patterns
docs/getting-started/examples.md CHANGED

@@ -1,6 +1,6 @@

  # Examples

- This page provides examples of using The DETERMINATOR for various research tasks.

  ## Basic Research Query

@@ -11,7 +11,7 @@

  What are the latest treatments for Alzheimer's disease?
  ```

- **What The DETERMINATOR Does**:
  1. Searches PubMed for recent papers
  2. Searches ClinicalTrials.gov for active trials
  3. Evaluates evidence quality

@@ -24,8 +24,7 @@

  What clinical trials are investigating metformin for cancer prevention?
  ```

- **What The DETERMINATOR Does**:
-
  1. Searches ClinicalTrials.gov for relevant trials
  2. Searches PubMed for supporting literature
  3. Provides trial details and status

@@ -36,13 +35,12 @@

  ### Example 3: Comprehensive Review

  **Query**:
-
  ```
  Review the evidence for using metformin as an anti-aging intervention,
  including clinical trials, mechanisms of action, and safety profile.
  ```

- **What The DETERMINATOR Does**:
  1. Uses deep research mode (multi-section)
  2. Searches multiple sources in parallel
  3. Generates sections on:

@@ -58,7 +56,7 @@

  Test the hypothesis that regular exercise reduces Alzheimer's disease risk.
  ```

- **What The DETERMINATOR Does**:
  1. Generates testable hypotheses
  2. Searches for supporting/contradicting evidence
  3. Performs statistical analysis (if Modal configured)

@@ -102,13 +100,13 @@ from src.agent_factory.judges import create_judge_handler

  # Create orchestrator
  search_handler = SearchHandler()
  judge_handler = create_judge_handler()
- ```
-
- <!--codeinclude-->
- [Create Orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:66
- <!--/codeinclude-->
-
- ```python
  # Run research query
  query = "What are the latest treatments for Alzheimer's disease?"
  async for event in orchestrator.run(query):

@@ -136,13 +134,13 @@ Single-loop research with search-judge-synthesize cycles:

  ```python
  from src.orchestrator.research_flow import IterativeResearchFlow
- ```
-
- <!--codeinclude-->
- [IterativeResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:56 end_line:77
- <!--/codeinclude-->
-
- ```python
  async for event in flow.run(query):
      # Handle events
      pass

@@ -154,13 +152,13 @@ Multi-section parallel research:

  ```python
  from src.orchestrator.research_flow import DeepResearchFlow
- ```
-
- <!--codeinclude-->
- [DeepResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:674 end_line:697
- <!--/codeinclude-->
-
- ```python
  async for event in flow.run(query):
      # Handle events
      pass

@@ -193,6 +191,15 @@ USE_GRAPH_EXECUTION=true

  ## Next Steps

  - Read the [Configuration Guide](../configuration/index.md) for all options
- - Explore the [Architecture Documentation](../architecture/graph_orchestration.md)
  - Check out the [API Reference](../api/agents.md) for programmatic usage
  # Examples

+ This page provides examples of using DeepCritical for various research tasks.

  ## Basic Research Query

  What are the latest treatments for Alzheimer's disease?
  ```

+ **What DeepCritical Does**:
  1. Searches PubMed for recent papers
  2. Searches ClinicalTrials.gov for active trials
  3. Evaluates evidence quality

  What clinical trials are investigating metformin for cancer prevention?
  ```

+ **What DeepCritical Does**:
  1. Searches ClinicalTrials.gov for relevant trials
  2. Searches PubMed for supporting literature
  3. Provides trial details and status

  ### Example 3: Comprehensive Review

  **Query**:
  ```
  Review the evidence for using metformin as an anti-aging intervention,
  including clinical trials, mechanisms of action, and safety profile.
  ```

+ **What DeepCritical Does**:
  1. Uses deep research mode (multi-section)
  2. Searches multiple sources in parallel
  3. Generates sections on:

  Test the hypothesis that regular exercise reduces Alzheimer's disease risk.
  ```

+ **What DeepCritical Does**:
  1. Generates testable hypotheses
  2. Searches for supporting/contradicting evidence
  3. Performs statistical analysis (if Modal configured)

  # Create orchestrator
  search_handler = SearchHandler()
  judge_handler = create_judge_handler()
+ orchestrator = create_orchestrator(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     config={},
+     mode="advanced"
+ )

  # Run research query
  query = "What are the latest treatments for Alzheimer's disease?"
  async for event in orchestrator.run(query):

  ```python
  from src.orchestrator.research_flow import IterativeResearchFlow

+ flow = IterativeResearchFlow(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     use_graph=False
+ )

  async for event in flow.run(query):
      # Handle events
      pass

  ```python
  from src.orchestrator.research_flow import DeepResearchFlow

+ flow = DeepResearchFlow(
+     search_handler=search_handler,
+     judge_handler=judge_handler,
+     use_graph=True
+ )

  async for event in flow.run(query):
      # Handle events
      pass

  ## Next Steps

  - Read the [Configuration Guide](../configuration/index.md) for all options
+ - Explore the [Architecture Documentation](../architecture/graph-orchestration.md)
  - Check out the [API Reference](../api/agents.md) for programmatic usage
docs/getting-started/installation.md CHANGED

@@ -12,29 +12,12 @@ This guide will help you install and set up DeepCritical on your system.

  ### 1. Install uv (Recommended)

- `uv` is a fast Python package installer and resolver. Install it using the standalone installer (recommended):

- **Unix/macOS/Linux:**
- ```bash
- curl -LsSf https://astral.sh/uv/install.sh | sh
- ```
-
- **Windows (PowerShell):**
- ```powershell
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- ```
-
- **Alternative methods:**
- ```bash
- # Using pipx (recommended if you have pipx installed)
- pipx install uv
-
- # Or using pip
  pip install uv
  ```

- After installation, restart your terminal or add `~/.cargo/bin` to your PATH.
-
  ### 2. Clone the Repository

  ```bash

@@ -150,3 +133,12 @@

  - Learn about [MCP Integration](mcp-integration.md)
  - Explore [Examples](examples.md)

  ### 1. Install uv (Recommended)

+ `uv` is a fast Python package installer and resolver. Install it with:

  ```bash
  pip install uv
  ```
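
To confirm the installation worked:

```bash
uv --version
```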
  ### 2. Clone the Repository

  ```bash

  - Learn about [MCP Integration](mcp-integration.md)
  - Explore [Examples](examples.md)
docs/getting-started/mcp-integration.md CHANGED

@@ -1,10 +1,10 @@

  # MCP Integration

- The DETERMINATOR exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.

  ## What is MCP?

- The Model Context Protocol (MCP) is a standard for connecting AI assistants to external tools and data sources. The DETERMINATOR implements an MCP server that exposes its search capabilities as MCP tools.

  ## MCP Server URL

@@ -33,14 +33,14 @@

  ~/.config/Claude/claude_desktop_config.json
  ```

- ### 2. Add The DETERMINATOR Server

  Edit `claude_desktop_config.json` and add:

  ```json
  {
    "mcpServers": {
-     "determinator": {
        "url": "http://localhost:7860/gradio_api/mcp/"
      }
    }

@@ -53,7 +53,7 @@ Close and restart Claude Desktop for changes to take effect.

  ### 4. Verify Connection

- In Claude Desktop, you should see The DETERMINATOR tools available:
  - `search_pubmed`
  - `search_clinical_trials`
  - `search_biorxiv`

@@ -198,6 +198,14 @@ You can configure multiple DeepCritical instances:

  - Learn about [Configuration](../configuration/index.md) for advanced settings
  - Explore [Examples](examples.md) for use cases
- - Read the [Architecture Documentation](../architecture/graph_orchestration.md)

  # MCP Integration

+ DeepCritical exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.

  ## What is MCP?

+ The Model Context Protocol (MCP) is a standard for connecting AI assistants to external tools and data sources. DeepCritical implements an MCP server that exposes its search capabilities as MCP tools.

  ## MCP Server URL

  ~/.config/Claude/claude_desktop_config.json
  ```

+ ### 2. Add DeepCritical Server

  Edit `claude_desktop_config.json` and add:

  ```json
  {
    "mcpServers": {
+     "deepcritical": {
        "url": "http://localhost:7860/gradio_api/mcp/"
      }
    }

  ### 4. Verify Connection

+ In Claude Desktop, you should see DeepCritical tools available:
  - `search_pubmed`
  - `search_clinical_trials`
  - `search_biorxiv`

  - Learn about [Configuration](../configuration/index.md) for advanced settings
  - Explore [Examples](examples.md) for use cases
+ - Read the [Architecture Documentation](../architecture/graph-orchestration.md)
docs/getting-started/quick-start.md CHANGED

@@ -1,47 +1,11 @@

- # Single Command Deploy

- Deploy with docker instantly with a single command:
-
- ```bash
- docker run -it -p 7860:7860 --platform=linux/amd64 \
-   -e DB_KEY="YOUR_VALUE_HERE" \
-   -e SERP_API="YOUR_VALUE_HERE" \
-   -e INFERENCE_API="YOUR_VALUE_HERE" \
-   -e MODAL_TOKEN_ID="YOUR_VALUE_HERE" \
-   -e MODAL_TOKEN_SECRET="YOUR_VALUE_HERE" \
-   -e NCBI_API_KEY="YOUR_VALUE_HERE" \
-   -e SERPER_API_KEY="YOUR_VALUE_HERE" \
-   -e CHROMA_DB_PATH="./chroma_db" \
-   -e CHROMA_DB_HOST="localhost" \
-   -e CHROMA_DB_PORT="8000" \
-   -e RAG_COLLECTION_NAME="deepcritical_evidence" \
-   -e RAG_SIMILARITY_TOP_K="5" \
-   -e RAG_AUTO_INGEST="true" \
-   -e USE_GRAPH_EXECUTION="false" \
-   -e DEFAULT_TOKEN_LIMIT="100000" \
-   -e DEFAULT_TIME_LIMIT_MINUTES="10" \
-   -e DEFAULT_ITERATIONS_LIMIT="10" \
-   -e WEB_SEARCH_PROVIDER="duckduckgo" \
-   -e MAX_ITERATIONS="10" \
-   -e SEARCH_TIMEOUT="30" \
-   -e LOG_LEVEL="DEBUG" \
-   -e EMBEDDING_PROVIDER="local" \
-   -e OPENAI_EMBEDDING_MODEL="text-embedding-3-small" \
-   -e LOCAL_EMBEDDING_MODEL="BAAI/bge-small-en-v1.5" \
-   -e HUGGINGFACE_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
-   -e HF_FALLBACK_MODELS="Qwen/Qwen3-Next-80B-A3B-Thinking,Qwen/Qwen3-Next-80B-A3B-Instruct,meta-llama/Llama-3.3-70B-Instruct,meta-llama/Llama-3.1-8B-Instruct,HuggingFaceH4/zephyr-7b-beta,Qwen/Qwen2-7B-Instruct" \
-   -e HUGGINGFACE_MODEL="Qwen/Qwen3-Next-80B-A3B-Thinking" \
-   registry.hf.space/dataquests-deepcritical:latest python src/app.py
- ```
-
- ## Quick start guide
-
- Get up and running with The DETERMINATOR in minutes.

  ## Start the Application

  ```bash
- gradio src/app.py
  ```

  Open your browser to `http://localhost:7860`.

@@ -135,8 +99,17 @@

  ## Next Steps

- - Learn about [MCP Integration](mcp-integration.md) to use The DETERMINATOR from Claude Desktop
  - Explore [Examples](examples.md) for more use cases
  - Read the [Configuration Guide](../configuration/index.md) for advanced settings
- - Check out the [Architecture Documentation](../architecture/graph_orchestration.md) to understand how it works

+ # Quick Start Guide

+ Get up and running with DeepCritical in minutes.

  ## Start the Application

  ```bash
+ uv run gradio src/app.py
  ```

  Open your browser to `http://localhost:7860`.

  ## Next Steps

+ - Learn about [MCP Integration](mcp-integration.md) to use DeepCritical from Claude Desktop
  - Explore [Examples](examples.md) for more use cases
  - Read the [Configuration Guide](../configuration/index.md) for advanced settings
+ - Check out the [Architecture Documentation](../architecture/graph-orchestration.md) to understand how it works
docs/index.md CHANGED

@@ -1,24 +1,12 @@

- # The DETERMINATOR

- **Generalist Deep Research Agent - Stops at Nothing Until Finding Precise Answers**

- The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, only stopping at configured limits (budget, time, iterations).
-
- **Key Features**:
- - **Generalist**: Handles queries from any domain (medical, technical, business, scientific, etc.)
- - **Automatic Source Selection**: Automatically determines if medical knowledge sources (PubMed, ClinicalTrials.gov) are needed
- - **Multi-Source Search**: Web search, PubMed, ClinicalTrials.gov, Europe PMC, RAG
- - **Iterative Refinement**: Continues searching and refining until precise answers are found
- - **Evidence Synthesis**: Comprehensive reports with proper citations
-
- **Important**: The DETERMINATOR is a research tool that synthesizes evidence. It cannot provide medical advice or answer medical questions directly.

  ## Features

- - **Generalist Research**: Handles any research question from any domain
- - **Automatic Medical Detection**: Automatically determines if medical knowledge sources are needed
- - **Multi-Source Search**: Web search, PubMed, ClinicalTrials.gov, Europe PMC (includes bioRxiv/medRxiv), RAG
- - **Iterative Until Precise**: Stops at nothing until finding precise answers (only stops at configured limits)
  - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
  - **HuggingFace OAuth**: Sign in with your HuggingFace account to automatically use your API token
  - **Modal Sandbox**: Secure execution of AI-generated statistical code

@@ -30,15 +18,8 @@

  ## Quick Start

  ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv

  # Sync dependencies
  uv sync

@@ -53,9 +34,9 @@

  ## Architecture

- The DETERMINATOR uses a Vertical Slice Architecture:

- 1. **Search Slice**: Retrieving evidence from multiple sources (web, PubMed, ClinicalTrials.gov, Europe PMC, RAG) based on query analysis
  2. **Judge Slice**: Evaluating evidence quality using LLMs
  3. **Orchestrator Slice**: Managing the research loop and UI

@@ -73,7 +54,7 @@ Learn more about the [Architecture](overview/architecture.md).

  - [Getting Started](getting-started/installation.md) - Installation and setup
  - [Configuration](configuration/index.md) - Configuration guide
  - [API Reference](api/agents.md) - API documentation
- - [Contributing](contributing/index.md) - Development guidelines

  ## Links

+ # DeepCritical

+ **AI-Native Drug Repurposing Research Agent**

+ DeepCritical is a deep research agent system that uses iterative search-and-judge loops to comprehensively answer research questions. The system supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

  ## Features

+ - **Multi-Source Search**: PubMed, ClinicalTrials.gov, Europe PMC (includes bioRxiv/medRxiv)
  - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
  - **HuggingFace OAuth**: Sign in with your HuggingFace account to automatically use your API token
  - **Modal Sandbox**: Secure execution of AI-generated statistical code

  ## Quick Start

  ```bash
+ # Install uv if you haven't already
+ pip install uv

  # Sync dependencies
  uv sync

  ## Architecture

+ DeepCritical uses a Vertical Slice Architecture:

+ 1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and Europe PMC
  2. **Judge Slice**: Evaluating evidence quality using LLMs
  3. **Orchestrator Slice**: Managing the research loop and UI

  - [Getting Started](getting-started/installation.md) - Installation and setup
  - [Configuration](configuration/index.md) - Configuration guide
  - [API Reference](api/agents.md) - API documentation
+ - [Contributing](contributing.md) - Development guidelines

  ## Links
docs/{LICENSE.md → license.md} RENAMED
File without changes
docs/overview/architecture.md CHANGED

@@ -1,6 +1,6 @@

  # Architecture Overview

- The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, only stopping at configured limits (budget, time, iterations). The system automatically determines if medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

  ## Core Architecture

@@ -134,11 +134,10 @@

  - **Research Flows**: Iterative and deep research patterns (`src/orchestrator/research_flow.py`)
  - **Graph Builder**: Graph construction utilities (`src/agent_factory/graph_builder.py`)
  - **Agents**: Pydantic AI agents (`src/agents/`, `src/agent_factory/agents.py`)
- - **Search Tools**: Neo4j knowledge graph, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG (`src/tools/`)
  - **Judge Handler**: LLM-based evidence assessment (`src/agent_factory/judges.py`)
  - **Embeddings**: Semantic search & deduplication (`src/services/embeddings.py`)
  - **Statistical Analyzer**: Modal sandbox execution (`src/services/statistical_analyzer.py`)
- - **Multimodal Processing**: Image OCR and audio STT/TTS services (`src/services/multimodal_processing.py`, `src/services/audio_processing.py`)
  - **Middleware**: State management, budget tracking, workflow coordination (`src/middleware/`)
  - **MCP Tools**: Claude Desktop integration (`src/mcp_tools.py`)
  - **Gradio UI**: Web interface with MCP server and streaming (`src/app.py`)

@@ -170,25 +169,24 @@ The system supports complex research workflows through:

  - **Orchestrator Factory** (`src/orchestrator_factory.py`):
    - Auto-detects mode: "advanced" if OpenAI key available, else "simple"
-   - Supports explicit mode selection: "simple", "magentic" (alias for "advanced"), "advanced", "iterative", "deep", "auto"
    - Lazy imports for optional dependencies

- - **Orchestrator Modes** (selected in UI or via factory):
-   - `simple`: Legacy linear search-judge loop (Free Tier)
-   - `advanced` or `magentic`: Multi-agent coordination using Microsoft Agent Framework (requires OpenAI API key)
-   - `iterative`: Knowledge-gap-driven research with single loop (Free Tier)
-   - `deep`: Parallel section-based research with planning (Free Tier)
-   - `auto`: Intelligent mode detection based on query complexity (Free Tier)
-
- - **Graph Research Modes** (used within graph orchestrator, separate from orchestrator mode):
-   - `iterative`: Single research loop pattern
-   - `deep`: Multi-section parallel research pattern
-   - `auto`: Auto-detect pattern based on query complexity

  - **Execution Modes**:
    - `use_graph=True`: Graph-based execution (parallel, conditional routing)
    - `use_graph=False`: Agent chains (sequential, backward compatible)

- **Note**: The UI provides separate controls for orchestrator mode and graph research mode. When using graph-based orchestrators (iterative/deep/auto), the graph research mode determines the specific pattern used within the graph execution.

  # Architecture Overview

+ DeepCritical is a deep research agent system that uses iterative search-and-judge loops to comprehensively answer research questions. The system supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

  ## Core Architecture

  - **Research Flows**: Iterative and deep research patterns (`src/orchestrator/research_flow.py`)
  - **Graph Builder**: Graph construction utilities (`src/agent_factory/graph_builder.py`)
  - **Agents**: Pydantic AI agents (`src/agents/`, `src/agent_factory/agents.py`)
+ - **Search Tools**: PubMed, ClinicalTrials.gov, Europe PMC, RAG (`src/tools/`)
  - **Judge Handler**: LLM-based evidence assessment (`src/agent_factory/judges.py`)
  - **Embeddings**: Semantic search & deduplication (`src/services/embeddings.py`)
  - **Statistical Analyzer**: Modal sandbox execution (`src/services/statistical_analyzer.py`)
  - **Middleware**: State management, budget tracking, workflow coordination (`src/middleware/`)
  - **MCP Tools**: Claude Desktop integration (`src/mcp_tools.py`)
  - **Gradio UI**: Web interface with MCP server and streaming (`src/app.py`)

  - **Orchestrator Factory** (`src/orchestrator_factory.py`):
    - Auto-detects mode: "advanced" if OpenAI key available, else "simple"
+   - Supports explicit mode selection: "simple", "magentic", "advanced"
    - Lazy imports for optional dependencies

+ - **Research Modes**:
+   - `iterative`: Single research loop
+   - `deep`: Multi-section parallel research
+   - `auto`: Auto-detect based on query complexity

  - **Execution Modes**:
    - `use_graph=True`: Graph-based execution (parallel, conditional routing)
    - `use_graph=False`: Agent chains (sequential, backward compatible)
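
In code, the execution mode is just a constructor flag on the research flows; a minimal sketch reusing the setup shown on the examples page (handler construction elided):

```python
from src.orchestrator.research_flow import IterativeResearchFlow

flow = IterativeResearchFlow(
    search_handler=search_handler,  # built as on the examples page
    judge_handler=judge_handler,
    use_graph=True,  # False selects sequential agent chains instead
)
```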
docs/overview/features.md CHANGED

@@ -1,32 +1,27 @@

  # Features

- The DETERMINATOR provides a comprehensive set of features for AI-assisted research:

  ## Core Features

  ### Multi-Source Search

- - **General Web Search**: Search general knowledge sources for any domain
- - **Neo4j Knowledge Graph**: Search structured knowledge graph for papers and disease relationships
- - **PubMed**: Search peer-reviewed biomedical literature via NCBI E-utilities (automatically used when medical knowledge needed)
- - **ClinicalTrials.gov**: Search interventional clinical trials (automatically used when medical knowledge needed)
  - **Europe PMC**: Search preprints and peer-reviewed articles (includes bioRxiv/medRxiv)
  - **RAG**: Semantic search within collected evidence using LlamaIndex
- - **Automatic Source Selection**: Automatically determines which sources are needed based on query analysis

  ### MCP Integration

  - **Model Context Protocol**: Expose search tools via MCP server
- - **Claude Desktop**: Use The DETERMINATOR tools directly from Claude Desktop
  - **MCP Clients**: Compatible with any MCP-compatible client

  ### Authentication

- - **REQUIRED**: Authentication is mandatory before using the application
- - **HuggingFace OAuth**: Sign in with HuggingFace account for automatic API token usage (recommended)
- - **Manual API Keys**: Support for HuggingFace API keys via environment variables (`HF_TOKEN` or `HUGGINGFACE_API_KEY`)
- - **Free Tier Support**: Automatic fallback to HuggingFace Inference API (public models) when no API key is available
- - **Authentication Check**: The application will display an error message if authentication is not provided

  ### Secure Code Execution

@@ -45,26 +40,9 @@

  - **Graph-Based Execution**: Flexible graph orchestration with conditional routing
  - **Parallel Research Loops**: Run multiple research tasks concurrently
- - **Iterative Research**: Single-loop research with search-judge-synthesize cycles that continues until precise answers are found
  - **Deep Research**: Multi-section parallel research with planning and synthesis
- - **Magentic Orchestration**: Multi-agent coordination using Microsoft Agent Framework (alias: "advanced" mode)
- - **Stops at Nothing**: Only stops at configured limits (budget, time, iterations), otherwise continues until finding precise answers
-
- **Orchestrator Modes**:
- - `simple`: Legacy linear search-judge loop
- - `advanced` (or `magentic`): Multi-agent coordination (requires OpenAI API key)
- - `iterative`: Knowledge-gap-driven research with single loop
- - `deep`: Parallel section-based research with planning
- - `auto`: Intelligent mode detection based on query complexity
-
- **Graph Research Modes** (used within graph orchestrator):
- - `iterative`: Single research loop pattern
- - `deep`: Multi-section parallel research pattern
- - `auto`: Auto-detect pattern based on query complexity
-
- **Execution Modes**:
- - `use_graph=True`: Graph-based execution with parallel and conditional routing
- - `use_graph=False`: Agent chains with sequential execution (backward compatible)

  ### Real-Time Streaming

@@ -86,16 +64,6 @@

  - **Conversation History**: Track iteration history and agent interactions
  - **State Synchronization**: Share evidence across parallel loops

- ### Multimodal Input & Output
-
- - **Image Input (OCR)**: Upload images and extract text using optical character recognition
- - **Audio Input (STT)**: Record or upload audio files and transcribe to text using speech-to-text
- - **Audio Output (TTS)**: Generate audio responses with text-to-speech synthesis
- - **Configurable Settings**: Enable/disable multimodal features via sidebar settings
- - **Voice Selection**: Choose from multiple TTS voices (American English: af_*, am_*)
- - **Speech Speed Control**: Adjust TTS speech speed (0.5x to 2.0x)
- - **Multimodal Processing Service**: Integrated service for processing images and audio files
-
  ## Advanced Features

  ### Agent System

@@ -137,12 +105,10 @@

  ### Gradio Interface

- - **Real-Time Chat**: Interactive chat interface with multimodal support
  - **Streaming Updates**: Live progress updates
  - **Accordion UI**: Organized display of pending/done operations
  - **OAuth Integration**: Seamless HuggingFace authentication
- - **Multimodal Input**: Support for text, images, and audio input in the same interface
- - **Sidebar Settings**: Configuration accordions for research, multimodal, and audio settings

  ### MCP Server

@@ -167,3 +133,12 @@

  - **Architecture Diagrams**: Visual architecture documentation
  - **API Reference**: Complete API documentation

  # Features

+ DeepCritical provides a comprehensive set of features for AI-assisted research:

  ## Core Features

  ### Multi-Source Search

+ - **PubMed**: Search peer-reviewed biomedical literature via NCBI E-utilities
+ - **ClinicalTrials.gov**: Search interventional clinical trials
  - **Europe PMC**: Search preprints and peer-reviewed articles (includes bioRxiv/medRxiv)
  - **RAG**: Semantic search within collected evidence using LlamaIndex

  ### MCP Integration

  - **Model Context Protocol**: Expose search tools via MCP server
+ - **Claude Desktop**: Use DeepCritical tools directly from Claude Desktop
  - **MCP Clients**: Compatible with any MCP-compatible client

  ### Authentication

+ - **HuggingFace OAuth**: Sign in with HuggingFace account for automatic API token usage
+ - **Manual API Keys**: Support for OpenAI, Anthropic, and HuggingFace API keys
+ - **Free Tier Support**: Automatic fallback to HuggingFace Inference API
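
Manual keys are supplied via environment variables. The HuggingFace names below appear elsewhere in these docs; the OpenAI and Anthropic names follow those providers' usual conventions and are assumptions here:

```bash
# .env
HF_TOKEN=hf_...               # or HUGGINGFACE_API_KEY
OPENAI_API_KEY=sk-...         # assumed conventional name
ANTHROPIC_API_KEY=sk-ant-...  # assumed conventional name
```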
  ### Secure Code Execution

  - **Graph-Based Execution**: Flexible graph orchestration with conditional routing
  - **Parallel Research Loops**: Run multiple research tasks concurrently
+ - **Iterative Research**: Single-loop research with search-judge-synthesize cycles
  - **Deep Research**: Multi-section parallel research with planning and synthesis
+ - **Magentic Orchestration**: Multi-agent coordination using Microsoft Agent Framework

  ### Real-Time Streaming

  - **Conversation History**: Track iteration history and agent interactions
  - **State Synchronization**: Share evidence across parallel loops

  ## Advanced Features

  ### Agent System

  ### Gradio Interface

+ - **Real-Time Chat**: Interactive chat interface
  - **Streaming Updates**: Live progress updates
  - **Accordion UI**: Organized display of pending/done operations
  - **OAuth Integration**: Seamless HuggingFace authentication

  ### MCP Server

  - **Architecture Diagrams**: Visual architecture documentation
  - **API Reference**: Complete API documentation