HAMMALE committed
Commit 35bd451 · 0 Parent(s)

Initial ReAct Space: Compare Think-Only, Act-Only, and ReAct reasoning modes

Files changed (4)
  1. ARCHITECTURE.md +268 -0
  2. README.md +86 -0
  3. app.py +503 -0
  4. requirements.txt +5 -0
ARCHITECTURE.md ADDED
# 🏗️ Architecture Overview

## System Architecture

This Hugging Face Space implements a comparative agent system with three reasoning modes. Here's how everything works together:

```
┌─────────────────────────────────────────────────────────┐
│                     Gradio UI Layer                     │
│  - Question Input                                       │
│  - Mode Selection (Think/Act/ReAct/All)                 │
│  - Three Output Panels (side-by-side comparison)        │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                    Agent Controller                     │
│  run_comparison() - Routes to appropriate mode handler  │
└────────────────────────────┬────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │  Think-Only  │    │   Act-Only   │    │    ReAct     │
  │     Mode     │    │     Mode     │    │     Mode     │
  └──────┬───────┘    └──────┬───────┘    └──────┬───────┘
         │                   │                   │
         ▼                   ▼                   ▼
┌─────────────────────────────────────────────────────────┐
│                      LLM Interface                      │
│    call_llm() - Communicates with openai/gpt-oss-20b    │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼  (Act-Only & ReAct modes only)
┌─────────────────────────────────────────────────────────┐
│                      Tool Executor                      │
│  - parse_action()                                       │
│  - call_tool()                                          │
└────────────────────────────┬────────────────────────────┘
                             │
     ┌─────────────┬─────────┴──┬───────────┬─────────┐
     ▼             ▼            ▼           ▼         ▼
┌──────────┐ ┌───────────┐ ┌─────────┐ ┌──────┐ ┌────────┐
│DuckDuckGo│ │ Wikipedia │ │ Weather │ │ Calc │ │ Python │
│  Search  │ │  Search   │ │   API   │ │      │ │  REPL  │
└──────────┘ └───────────┘ └─────────┘ └──────┘ └────────┘
```

## Component Details

### 1. **Tool Layer**

Each tool is wrapped in a `Tool` class with:
- **name**: Identifier for the LLM to reference
- **description**: Instructions for when/how to use the tool
- **func**: The actual implementation

**Tool Implementations:**

- `duckduckgo_search()`: Uses DuckDuckGo's JSON API
- `wikipedia_search()`: Uses the Wikipedia Python library
- `get_weather()`: Queries the wttr.in API for weather data
- `calculate()`: Safe AST-based math expression evaluator
- `python_repl()`: Sandboxed Python execution with whitelisted builtins

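The wrapper comes straight from `app.py`; a minimal usage sketch with a toy `echo` tool (illustrative only, not one of the registered tools):

```python
class Tool:
    """Bundle a name and a usage description with a callable implementation."""
    def __init__(self, name: str, description: str, func):
        self.name = name
        self.description = description
        self.func = func

    def __call__(self, *args, **kwargs):
        # Calling the Tool delegates straight to the wrapped function
        return self.func(*args, **kwargs)

# Toy tool for illustration only -- not part of the real registry
echo = Tool(
    name="echo",
    description="Repeat the input back. Input: any string.",
    func=lambda text: f"echo: {text}",
)
print(echo("hello"))  # -> echo: hello
```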
### 2. **Agent Modes**

#### Think-Only Mode (`think_only_mode`)
```
User Question → System Prompt → LLM → Thoughts → Answer
```
- Single LLM call with a CoT prompt
- No tool access
- Shows reasoning steps
- Best for knowledge-based questions

#### Act-Only Mode (`act_only_mode`)
```
User Question → System Prompt → LLM → Action
                                        ↓
                            Execute Tool → Observation
                                        ↓
                                 LLM → Action/Answer
                                        ↓
                                       ...
```
- Iterative loop: Action → Observation
- No explicit "Thought" step
- Maximum 5 iterations
- Best for information gathering

#### ReAct Mode (`react_mode`)
```
User Question → System Prompt → LLM → Thought → Action
                                                  ↓
                                      Execute Tool → Observation
                                                  ↓
                                        LLM → Thought → Action/Answer
                                                  ↓
                                                 ...
```
- Full Thought-Action-Observation cycle
- Most comprehensive reasoning
- Maximum 5 iterations
- Best for complex multi-step problems

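The mode handlers in `app.py` all share this control loop. A stripped-down sketch of the ReAct cycle, with a scripted stand-in for the model and a single fake tool (both hypothetical, for illustration only):

```python
# Scripted replies stand in for call_llm(); a real run would hit the API.
SCRIPTED = iter([
    "Thought: I should check the weather.\nAction: get_weather\nAction Input: Paris",
    "Thought: I have what I need.\nAnswer: It is cloudy in Paris.",
])

def fake_llm(messages):
    return next(SCRIPTED)

def fake_tool(name, arg):
    return f"Weather in {arg}: Cloudy, 15°C"

def react_loop(question, max_iterations=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_iterations):
        response = fake_llm(messages)
        if "Answer:" in response:
            return response.split("Answer:", 1)[1].strip()
        # Crude Action / Action Input extraction, just for this sketch
        name = response.split("Action:", 1)[1].split("\n", 1)[0].strip()
        arg = response.split("Action Input:", 1)[1].strip()
        observation = fake_tool(name, arg)
        messages += [
            {"role": "assistant", "content": response},
            {"role": "user", "content": f"Observation: {observation}\n\nThought:"},
        ]
    return "Reached maximum iterations."

final = react_loop("What's the weather in Paris?")
print(final)  # -> It is cloudy in Paris.
```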
### 3. **LLM Interface**

**`call_llm()` Function:**
- Uses the Hugging Face Inference API
- Model: openai/gpt-oss-20b
- Supports chat format (messages list)
- Configurable temperature and max_tokens

**Authentication:**
- Requires the `HF_TOKEN` environment variable
- Set in Space secrets (secure)

### 4. **Parsing & Control Flow**

**`parse_action()` Function:**
- Extracts `Action:` and `Action Input:` from the LLM response
- Uses regex to handle various formats
- Returns an (action_name, action_input) tuple

**Iteration Control:**
- Max 5 iterations per mode to prevent infinite loops
- Early termination when "Answer:" is detected
- Error handling for malformed responses

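A self-contained sketch of this extraction, with regexes in the spirit of `parse_action()` (the exact patterns in `app.py` may differ slightly):

```python
import re

def parse_action(text: str):
    """Extract (action_name, action_input) from a model response."""
    # Action name: a single word after "Action:"
    action_match = re.search(r'Action:\s*(\w+)', text, re.IGNORECASE)
    # Action input: everything up to the next keyword or end of text
    input_match = re.search(
        r'Action Input:\s*(.+?)(?=\n\s*(?:Thought:|Action:|Answer:|Observation:)|\Z)',
        text, re.IGNORECASE | re.DOTALL,
    )
    if action_match and input_match:
        return action_match.group(1).strip(), input_match.group(1).strip()
    return None, None

response = (
    "Thought: I need to check the current weather in Paris\n"
    "Action: get_weather\n"
    "Action Input: Paris\n"
)
print(parse_action(response))  # -> ('get_weather', 'Paris')
```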
### 5. **UI Layer (Gradio)**

**Components:**
- **Input Section**: Question textbox + mode dropdown
- **Example Buttons**: Pre-filled question templates
- **Output Panels**: Three side-by-side Markdown displays
- **Streaming**: Generator functions for real-time updates

**User Flow:**
1. User enters a question or clicks an example
2. Selects a mode (or "All" for comparison)
3. Clicks "Run"
4. Sees real-time updates in the output panel(s)
5. Views the final answer and complete reasoning trace

## Data Flow Example

### Example: "What's the weather in Paris?"

**Mode: ReAct**

1. User submits the question
2. `react_mode()` is called with the question
3. The prompt is formatted with the question + tool descriptions
4. First LLM call:
   ```
   Thought: I need to check the current weather in Paris
   Action: get_weather
   Action Input: Paris
   ```
5. `parse_action()` extracts the tool call
6. `call_tool("get_weather", "Paris")` executes
7. Observation: "Weather in Paris: Cloudy, 15°C..."
8. Second LLM call with the observation
9. The LLM responds:
   ```
   Thought: I have the weather information
   Answer: The current weather in Paris is...
   ```
10. The generator yields formatted output to the UI
11. User sees the complete trace in the ReAct panel

## Key Design Patterns

### 1. **Generator Pattern for Streaming**
```python
def mode(question: str) -> Generator[str, None, None]:
    yield "Step 1..."
    # process
    yield "Step 2..."
    # etc.
```
Enables real-time UI updates without blocking.

### 2. **Tool Registry Pattern**
```python
TOOLS = [Tool(name, description, func), ...]
```
Easy to add new tools - just append to the list.

### 3. **Prompt Templates**
```python
PROMPT = """...""".format(question=q, tools=t)
```
Modular prompts for each mode.

### 4. **Safe Execution**
- AST parsing for the calculator (no `eval()`)
- Whitelisted builtins for the Python REPL
- Timeout limits on API calls
- Error handling with fallback messages

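A condensed sketch of the AST-based evaluation idea (a trimmed version of `calculate()` in `app.py`, supporting arithmetic only):

```python
import ast
import operator as op

# Map AST operator nodes to their arithmetic implementations
OPERATORS = {
    ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
    ast.Div: op.truediv, ast.Pow: op.pow, ast.Mod: op.mod,
    ast.USub: op.neg,
}

def safe_eval(expression: str):
    """Evaluate arithmetic without eval(): walk the parsed AST instead."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPERATORS[type(node.op)](walk(node.operand))
        # Anything else (names, calls, attributes) is rejected outright
        raise TypeError(f"Unsupported expression: {ast.dump(node)}")
    return walk(ast.parse(expression.strip(), mode="eval").body)

print(safe_eval("2 ** 10 + 5"))   # -> 1029
print(safe_eval("-(3 * 4) % 5"))  # -> 3
```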
## Extensibility

### Adding a New Tool

```python
def my_tool(input: str) -> str:
    # Implementation
    return result

TOOLS.append(Tool(
    name="my_tool",
    description="When to use this tool...",
    func=my_tool
))
```

### Adding a New Mode

```python
def hybrid_mode(question: str) -> Generator[str, None, None]:
    # Custom logic mixing elements
    yield "Starting hybrid mode..."
    # ...

# Add to run_comparison() and the UI dropdown
```

### Customizing Prompts

Edit the `*_PROMPT` constants to change agent behavior:
- Add constraints
- Change the format
- Provide examples
- Adjust tone

## Performance Considerations

1. **API Latency**: Model calls take 2-5 seconds
2. **Tool Latency**: External APIs add 1-2 seconds per call
3. **Iteration Count**: 5 iterations max = ~30 seconds worst case
4. **Parallel Modes**: "All" mode runs sequentially (not in parallel)

## Security Notes

1. **API Keys**: Never commit `HF_TOKEN` to the repo
2. **Python REPL**: Sandboxed with limited builtins
3. **User Input**: Sanitized before tool execution
4. **Rate Limits**: Consider adding rate limiting for production

## Testing Strategy

1. **Unit Tests**: Test individual tool functions
2. **Integration Tests**: Test mode handlers end-to-end
3. **Prompt Tests**: Verify LLM responses parse correctly
4. **UI Tests**: Test Gradio interface components

## Future Enhancements

- [ ] Add memory/conversation history
- [ ] Implement parallel tool calling
- [ ] Add a caching layer for repeated queries
- [ ] Support custom user tools
- [ ] Add performance metrics/timing
- [ ] Implement token counting/cost tracking
- [ ] Add export functionality for reasoning traces

README.md ADDED
---
title: ReAct - Reasoning Modes Comparison
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 🧠 LLM Reasoning Modes Comparison

This Space demonstrates and compares three different reasoning paradigms for Large Language Models using **openai/gpt-oss-20b**:

## 🎯 Reasoning Modes

### 1. **Think-Only** (Chain-of-Thought)
- Uses internal reasoning and knowledge only
- Shows a step-by-step thought process
- No external tool access
- Best for: Problems solvable with general knowledge

### 2. **Act-Only** (Tool Use)
- Uses external tools to gather information
- Shows actions and observations only
- Minimal explicit reasoning
- Best for: Fact-checking and real-time data retrieval

### 3. **ReAct** (Reasoning + Acting)
- Interleaves Thought → Action → Observation
- Combines reasoning with tool use
- Most comprehensive approach
- Best for: Complex problems requiring both reasoning and external data

## 🛠️ Available Tools

The agent has access to these real external tools:

- **🔍 DuckDuckGo Search**: Web search for current information
- **📚 Wikipedia Search**: Detailed encyclopedic knowledge
- **🌤️ Weather API**: Real-time weather data for any location
- **🧮 Calculator**: Safe mathematical expression evaluation
- **🐍 Python REPL**: Execute Python code for data processing

## 🚀 How to Use

1. Enter your question in the text box
2. Select a reasoning mode (or "All" to compare)
3. Click "Run" to see the agent work in real-time
4. Watch as thoughts, actions, and observations unfold

## 📝 Example Questions

- "What is the capital of France and what's the current weather there?"
- "Who wrote 'To Kill a Mockingbird' and when was it published?"
- "Calculate the compound interest on $1000 at 5% annual rate for 3 years"
- "What is the population of Tokyo and how does it compare to New York City?"

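For reference, the compound-interest example has a closed-form check that the agent's calculator should reproduce, using A = P(1 + r)^t:

```python
# Compound interest: A = P * (1 + r) ** t
principal = 1000   # P, in dollars
rate = 0.05        # r, 5% annual
years = 3          # t

amount = principal * (1 + rate) ** years   # 1157.625
interest = amount - principal              # 157.625
print(f"A = {amount:.2f}, interest earned = {interest:.2f}")
```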
## 🔧 Setup

To run this Space, you need to set your Hugging Face token:

1. Go to Space Settings → Repository Secrets
2. Add a secret named `HF_TOKEN` with your Hugging Face API token
3. The Space will automatically use this token to access the model

## 📚 Technical Details

- **Model**: openai/gpt-oss-20b (via the Hugging Face Inference API)
- **Framework**: Gradio for the UI
- **Agent Format**: Inspired by the smolagents/ReAct paradigm
- **Streaming**: Real-time display of intermediate steps

## 🎓 Learn More

This implementation demonstrates the ReAct (Reason + Act) paradigm described in:
- Yao et al. (2022), "ReAct: Synergizing Reasoning and Acting in Language Models"

The three modes show how different combinations of reasoning and tool use affect problem-solving capabilities.

## 📄 License

MIT License - feel free to use and modify!

app.py ADDED
import os
import re
import json
import gradio as gr
from typing import List, Dict, Any, Generator
import requests
from datetime import datetime
import ast
import operator as op
import wikipedia

# Tool implementations
class Tool:
    def __init__(self, name: str, description: str, func):
        self.name = name
        self.description = description
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

def duckduckgo_search(query: str) -> str:
    """Search DuckDuckGo for information."""
    try:
        url = "https://api.duckduckgo.com/"
        params = {
            'q': query,
            'format': 'json',
            'no_html': 1,
            'skip_disambig': 1
        }
        response = requests.get(url, params=params, timeout=10)
        data = response.json()

        # Get the abstract or the first few related topics
        if data.get('Abstract'):
            return f"Search result: {data['Abstract']}"
        elif data.get('RelatedTopics') and len(data['RelatedTopics']) > 0:
            results = []
            for topic in data['RelatedTopics'][:3]:
                if 'Text' in topic:
                    results.append(topic['Text'])
            return f"Search results: {' | '.join(results)}" if results else "No results found."
        else:
            return "No results found."
    except Exception as e:
        return f"Search error: {str(e)}"

def wikipedia_search(query: str) -> str:
    """Search Wikipedia for information."""
    try:
        wikipedia.set_lang("en")
        # Get a short summary
        summary = wikipedia.summary(query, sentences=3, auto_suggest=True)
        return f"Wikipedia: {summary}"
    except wikipedia.exceptions.DisambiguationError as e:
        return f"Wikipedia: Multiple results found. Please be more specific. Options: {', '.join(e.options[:5])}"
    except wikipedia.exceptions.PageError:
        return f"Wikipedia: No page found for '{query}'."
    except Exception as e:
        return f"Wikipedia error: {str(e)}"

def get_weather(location: str) -> str:
    """Get current weather for a location using wttr.in."""
    try:
        url = f"https://wttr.in/{location}?format=j1"
        response = requests.get(url, timeout=10)
        data = response.json()

        current = data['current_condition'][0]
        temp_c = current['temp_C']
        temp_f = current['temp_F']
        desc = current['weatherDesc'][0]['value']
        humidity = current['humidity']
        wind_speed = current['windspeedKmph']

        return f"Weather in {location}: {desc}, {temp_c}°C ({temp_f}°F), Humidity: {humidity}%, Wind: {wind_speed} km/h"
    except Exception as e:
        return f"Weather error: {str(e)}"

def calculate(expression: str) -> str:
    """Safely evaluate mathematical expressions."""
    # Supported operators
    operators = {
        ast.Add: op.add,
        ast.Sub: op.sub,
        ast.Mult: op.mul,
        ast.Div: op.truediv,
        ast.Pow: op.pow,
        ast.USub: op.neg,
        ast.Mod: op.mod,
    }

    def eval_expr(node):
        if isinstance(node, ast.Constant):  # numeric literal
            return node.value
        elif isinstance(node, ast.BinOp):
            return operators[type(node.op)](eval_expr(node.left), eval_expr(node.right))
        elif isinstance(node, ast.UnaryOp):
            return operators[type(node.op)](eval_expr(node.operand))
        elif isinstance(node, ast.Call):
            # Support basic math functions
            if node.func.id == 'abs':
                return abs(eval_expr(node.args[0]))
            elif node.func.id == 'round':
                return round(eval_expr(node.args[0]))
        # Reject anything else instead of silently returning None
        raise TypeError(node)

    try:
        # Clean the expression
        expression = expression.strip()
        # Parse and evaluate
        node = ast.parse(expression, mode='eval')
        result = eval_expr(node.body)
        return f"Result: {result}"
    except Exception as e:
        return f"Calculation error: {str(e)}. Please use basic arithmetic operators (+, -, *, /, **, %)."

def python_repl(code: str) -> str:
    """Execute safe Python code (limited to basic operations)."""
    from io import StringIO
    import sys

    # Whitelist of safe builtins
    safe_builtins = {
        'abs': abs, 'round': round, 'min': min, 'max': max,
        'sum': sum, 'len': len, 'range': range, 'list': list,
        'dict': dict, 'str': str, 'int': int, 'float': float,
        'print': print, 'enumerate': enumerate, 'zip': zip,
        'sorted': sorted, 'reversed': reversed,
    }

    # Create a restricted namespace
    namespace = {'__builtins__': safe_builtins}

    # Capture stdout; restore it even if exec() raises
    old_stdout = sys.stdout
    sys.stdout = StringIO()
    try:
        exec(code, namespace)
        output = sys.stdout.getvalue()
    except Exception as e:
        return f"Python error: {str(e)}"
    finally:
        sys.stdout = old_stdout

    # Also report any variables that were set
    result_vars = {k: v for k, v in namespace.items() if k != '__builtins__' and not k.startswith('_')}

    result = output if output else str(result_vars) if result_vars else "Code executed successfully (no output)"
    return f"Python output: {result}"

# Define tools
TOOLS = [
    Tool(
        name="duckduckgo_search",
        description="Search the web using DuckDuckGo. Use this when you need current information or facts. Input should be a search query string.",
        func=duckduckgo_search
    ),
    Tool(
        name="wikipedia_search",
        description="Search Wikipedia for detailed information about topics, people, places, etc. Input should be a search query string.",
        func=wikipedia_search
    ),
    Tool(
        name="get_weather",
        description="Get current weather information for a location. Input should be a city name or location string.",
        func=get_weather
    ),
    Tool(
        name="calculate",
        description="Perform mathematical calculations. Input should be a mathematical expression like '5 + 3 * 2' or '2 ** 10'.",
        func=calculate
    ),
    Tool(
        name="python_repl",
        description="Execute Python code for data processing or calculations. Input should be valid Python code. Only basic operations are allowed.",
        func=python_repl
    ),
]

# Create tool descriptions for the prompts
def get_tool_descriptions() -> str:
    descriptions = []
    for tool in TOOLS:
        descriptions.append(f"- {tool.name}: {tool.description}")
    return "\n".join(descriptions)

# Agent prompts
THINK_ONLY_PROMPT = """You are a helpful AI assistant. You solve problems by thinking through them step-by-step.

For each question:
1. Think through the problem carefully in your internal monologue
2. Show your reasoning process using "Thought: ..." format
3. Provide a final answer using "Answer: ..." format

You do NOT have access to any tools. Rely only on your knowledge and reasoning.

Question: {question}

Let's think step by step:"""

ACT_ONLY_PROMPT = """You are a helpful AI assistant with access to tools. You solve problems by using tools.

Available tools:
{tools}

For each question, you must use tools to find information. Do NOT think or reason - just use tools.

Format your response as:
Action: tool_name
Action Input: input_for_tool

After receiving the observation, you can call another tool or provide the final answer:
Answer: your final answer

Question: {question}

Action:"""

REACT_PROMPT = """You are a helpful AI assistant that can think and use tools. You solve problems by alternating between Thought, Action, and Observation.

Available tools:
{tools}

For each question, follow this pattern:
Thought: Think about what you need to do next
Action: tool_name
Action Input: input_for_tool
Observation: [tool result will be provided]
... (repeat Thought/Action/Observation as needed)
Thought: I now know the final answer
Answer: your final answer

Question: {question}

Thought:"""

def parse_action(text: str) -> tuple:
    """Parse the action name and action input from model output."""
    action_pattern = r'Action:\s*(\w+)'
    # \Z handles responses that end without a trailing newline
    input_pattern = r'Action Input:\s*(.+?)(?=\n\s*(?:Thought:|Action:|Answer:|Observation:)|\Z)'

    action_match = re.search(action_pattern, text, re.IGNORECASE)
    input_match = re.search(input_pattern, text, re.IGNORECASE | re.DOTALL)

    if action_match and input_match:
        action_name = action_match.group(1).strip()
        action_input = input_match.group(1).strip()
        return action_name, action_input
    return None, None

def call_tool(tool_name: str, tool_input: str) -> str:
    """Call a tool by name."""
    for tool in TOOLS:
        if tool.name.lower() == tool_name.lower():
            return tool(tool_input)
    return f"Error: Tool '{tool_name}' not found. Available tools: {', '.join([t.name for t in TOOLS])}"

def call_llm(messages: List[Dict], temperature: float = 0.7, max_tokens: int = 500) -> str:
    """Call the LLM API."""
    try:
        api_key = os.environ.get("HF_TOKEN")
        if not api_key:
            return "Error: HF_TOKEN not found. Please set your Hugging Face token."

        url = "https://api-inference.huggingface.co/models/openai/gpt-oss-20b/v1/chat/completions"
        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }

        payload = {
            "model": "openai/gpt-oss-20b",
            "messages": messages,
            "temperature": temperature,
            "max_tokens": max_tokens,
            "stream": False
        }

        response = requests.post(url, headers=headers, json=payload, timeout=30)

        if response.status_code == 200:
            result = response.json()
            return result['choices'][0]['message']['content']
        else:
            return f"API Error {response.status_code}: {response.text}"
    except Exception as e:
        return f"Error calling LLM: {str(e)}"

def think_only_mode(question: str) -> Generator[str, None, None]:
    """Think-Only mode: Chain-of-Thought only, no tools."""
    prompt = THINK_ONLY_PROMPT.format(question=question)
    messages = [{"role": "user", "content": prompt}]

    yield "**Mode: Think-Only (Chain-of-Thought)**\n\n"
    yield "🤔 Generating thoughts...\n\n"

    response = call_llm(messages, temperature=0.7, max_tokens=800)

    # Parse and format the response
    lines = response.split('\n')
    for line in lines:
        if line.strip():
            if line.strip().startswith('Thought:'):
                yield f"💭 **{line.strip()}**\n\n"
            elif line.strip().startswith('Answer:'):
                yield f"✅ **{line.strip()}**\n\n"
            else:
                yield f"{line}\n\n"

    yield "\n---\n**Mode completed**\n"

def act_only_mode(question: str, max_iterations: int = 5) -> Generator[str, None, None]:
    """Act-Only mode: Tool use only, no explicit thinking."""
    tool_descriptions = get_tool_descriptions()
    prompt = ACT_ONLY_PROMPT.format(question=question, tools=tool_descriptions)

    yield "**Mode: Act-Only (Tool Use Only)**\n\n"

    messages = [{"role": "user", "content": prompt}]
    iteration = 0

    while iteration < max_iterations:
        iteration += 1

        response = call_llm(messages, temperature=0.5, max_tokens=300)

        # Check for a final answer
        if 'Answer:' in response:
            answer_match = re.search(r'Answer:\s*(.+)', response, re.IGNORECASE | re.DOTALL)
            if answer_match:
                yield f"✅ **Answer:** {answer_match.group(1).strip()}\n\n"
                break

        # Parse the action
        action_name, action_input = parse_action(response)

        if action_name and action_input:
            yield f"🔧 **Action:** {action_name}\n"
            yield f"📝 **Action Input:** {action_input}\n\n"

            # Execute the tool
            observation = call_tool(action_name, action_input)
            yield f"👁️ **Observation:** {observation}\n\n"

            # Add to the conversation
            messages.append({"role": "assistant", "content": response})
            messages.append({"role": "user", "content": f"Observation: {observation}\n\nContinue with another action or provide the final answer."})
        else:
            yield f"⚠️ Could not parse an action from the response. Response: {response}\n\n"
            break

    if iteration >= max_iterations:
        yield "⚠️ **Reached maximum iterations.**\n\n"

    yield "\n---\n**Mode completed**\n"

def react_mode(question: str, max_iterations: int = 5) -> Generator[str, None, None]:
    """ReAct mode: Interleaving Thought, Action, Observation."""
    tool_descriptions = get_tool_descriptions()
    prompt = REACT_PROMPT.format(question=question, tools=tool_descriptions)

    yield "**Mode: ReAct (Thought + Action + Observation)**\n\n"

    messages = [{"role": "user", "content": prompt}]
    iteration = 0

    while iteration < max_iterations:
        iteration += 1

        response = call_llm(messages, temperature=0.7, max_tokens=400)

        # Parse thoughts
        thought_matches = re.findall(r'Thought:\s*(.+?)(?=\n(?:Action:|Answer:|$))', response, re.IGNORECASE | re.DOTALL)
        for thought in thought_matches:
            yield f"💭 **Thought:** {thought.strip()}\n\n"

        # Check for a final answer
        if 'Answer:' in response:
            answer_match = re.search(r'Answer:\s*(.+)', response, re.IGNORECASE | re.DOTALL)
            if answer_match:
                yield f"✅ **Answer:** {answer_match.group(1).strip()}\n\n"
                break

        # Parse the action
        action_name, action_input = parse_action(response)

        if action_name and action_input:
            yield f"🔧 **Action:** {action_name}\n"
            yield f"📝 **Action Input:** {action_input}\n\n"

            # Execute the tool
            observation = call_tool(action_name, action_input)
            yield f"👁️ **Observation:** {observation}\n\n"

            # Add to the conversation
            messages.append({"role": "assistant", "content": response})
            messages.append({"role": "user", "content": f"Observation: {observation}\n\nThought:"})
        else:
            # No action and no answer: something went wrong
            if 'Answer:' not in response:
                yield f"⚠️ No action found. Response: {response}\n\n"
            break

    if iteration >= max_iterations:
        yield "⚠️ **Reached maximum iterations.**\n\n"

    yield "\n---\n**Mode completed**\n"

# Example questions
EXAMPLES = [
    "What is the capital of France and what's the current weather there?",
    "Who wrote 'To Kill a Mockingbird' and when was it published?",
    "Calculate the compound interest on $1000 at 5% annual rate for 3 years using the formula A = P(1 + r)^t",
    "What is the population of Tokyo and how does it compare to New York City?",
    "If I have a list of numbers [15, 23, 8, 42, 16], what is the average and which number is closest to it?",
    "What are the main causes of climate change according to scientific consensus?",
]

def run_comparison(question: str, mode: str):
    """Run the selected mode(s), streaming accumulated output to each panel.

    Implemented as a generator so Gradio streams text instead of rendering
    the repr of the underlying mode generators.
    """
    think, act, react = "", "", ""
    if mode in ("Think-Only", "All (Compare)"):
        for chunk in think_only_mode(question):
            think += chunk
            yield think, act, react
    if mode in ("Act-Only", "All (Compare)"):
        for chunk in act_only_mode(question):
            act += chunk
            yield think, act, react
    if mode in ("ReAct", "All (Compare)"):
        for chunk in react_mode(question):
            react += chunk
            yield think, act, react
    yield think, act, react

# Gradio Interface
with gr.Blocks(title="LLM Reasoning Modes Comparison", theme=gr.themes.Soft()) as demo:
    gr.Markdown("""
    # 🧠 LLM Reasoning Modes Comparison

    Compare three reasoning approaches using **openai/gpt-oss-20b**:

    - **Think-Only**: Chain-of-Thought reasoning only (no tools)
    - **Act-Only**: Tool use only (no explicit reasoning)
    - **ReAct**: Interleaved Thought → Action → Observation

    ### Available Tools:
    🔍 DuckDuckGo Search | 📚 Wikipedia | 🌤️ Weather API | 🧮 Calculator | 🐍 Python REPL
    """)

    with gr.Row():
        with gr.Column(scale=3):
            question_input = gr.Textbox(
                label="Enter your question",
                placeholder="Ask a question that might require tools or reasoning...",
                lines=3
            )
            mode_dropdown = gr.Dropdown(
                choices=["Think-Only", "Act-Only", "ReAct", "All (Compare)"],
                value="All (Compare)",
                label="Select Mode"
            )
            submit_btn = gr.Button("🚀 Run", variant="primary", size="lg")

        with gr.Column(scale=1):
            gr.Markdown("### 📝 Example Questions")
            for idx, example in enumerate(EXAMPLES):
                gr.Button(f"Ex {idx+1}", size="sm").click(
                    fn=lambda ex=example: ex,
                    outputs=question_input
                )

    gr.Markdown("---")

    with gr.Row():
        with gr.Column():
            think_output = gr.Markdown(label="Think-Only Output")
        with gr.Column():
            act_output = gr.Markdown(label="Act-Only Output")
        with gr.Column():
            react_output = gr.Markdown(label="ReAct Output")

    submit_btn.click(
        fn=run_comparison,
        inputs=[question_input, mode_dropdown],
        outputs=[think_output, act_output, react_output]
    )

    gr.Markdown("""
    ---
    ### 📖 About
    This Space demonstrates three reasoning paradigms:
    - **Think-Only** relies on the model's internal knowledge and reasoning
    - **Act-Only** uses external tools without explicit reasoning steps
    - **ReAct** combines reasoning and acting for more robust problem-solving

    *Note: Set your HF_TOKEN in Space secrets to use the model.*
    """)

if __name__ == "__main__":
    demo.launch()

requirements.txt ADDED
gradio==4.44.0
requests==2.31.0
wikipedia-api==0.6.0
wikipedia==1.4.0