Deploy Option B to CTapi-raw HuggingFace Space
Your HuggingFace Space
- Space: https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw
- Local files: /mnt/c/Users/ibm/Documents/HF/CTapi-raw/
- Target: Deploy Option B (7-10s per query)
Files You Already Have (Ready to Deploy!)
Core Files
- ✅ app.py - has the /search endpoint (Option B!)
- ✅ foundation_engine.py - has all the Option B logic
- ✅ requirements.txt - all dependencies
- ✅ Dockerfile - Docker configuration
Documentation
- ✅ OPTION_B_IMPLEMENTATION_GUIDE.md - complete guide
- ✅ TEST_RESULTS_PHYSICIAN_QUERY.md - test results
- ✅ QUICK_START.md - quick reference
Deployment Steps
Step 1: Set HuggingFace Token in Space Settings
- Go to: https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw/settings
- Add Secret:
  Name: HF_TOKEN
  Value: <your_huggingface_token>
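HuggingFace Spaces expose secrets to the running container as environment variables, so app.py can pick the token up at startup. A minimal sketch, assuming the code reads the variable directly (the exact handling inside foundation_engine.py may differ):

import os

# The secret added in the Space settings arrives as an environment variable
hf_token = os.environ.get("HF_TOKEN")
if not hf_token:
    raise RuntimeError("HF_TOKEN is not set - add it under Settings -> Secrets")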
Step 2: Push Your Local Files to HuggingFace
cd /mnt/c/Users/ibm/Documents/HF/CTapi-raw
# Initialize git if needed
git init
git remote add origin https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw
# Or if already initialized
git remote set-url origin https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw
# Stage all files
git add app.py foundation_engine.py requirements.txt Dockerfile README.md
# Commit
git commit -m "Deploy Option B: Query Parser + RAG + 355M Ranking"
# Push to HuggingFace
git push origin main
Step 3: Wait for Build
HuggingFace will automatically:
- Build the Docker container
- Download data files (3GB from gmkdigitalmedia/foundation1.2-data)
- Start the API server
- Expose it at: https://gmkdigitalmedia-ctapi-raw.hf.space
Build time: ~10-15 minutes
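If you would rather script the wait than watch the build page, you can poll the Space until it answers. A small sketch, assuming /health returns HTTP 200 with a JSON body once the server is up:

import time
import requests

HEALTH_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/health"

for attempt in range(60):  # roughly 15 minutes at 15-second intervals
    try:
        r = requests.get(HEALTH_URL, timeout=10)
        if r.status_code == 200:
            print("Space is up:", r.json())
            break
    except requests.RequestException:
        pass  # still building or booting
    time.sleep(15)
else:
    print("Space did not come up - check the build logs")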
What Your Space Will Have
Endpoints
Primary (Option B):
POST /search
Auxiliary:
GET / # API info
GET /health # Health check
GET /docs # Swagger UI
GET /redoc # ReDoc
Example Usage
# Test the API
curl -X POST https://gmkdigitalmedia-ctapi-raw.hf.space/search \
-H "Content-Type: application/json" \
-d '{
"query": "what should a physician prescribing ianalumab for sjogrens know",
"top_k": 5
}'
Expected Response:
{
  "query": "...",
  "processing_time": 7.5,
  "query_analysis": {
    "extracted_entities": {
      "drugs": ["ianalumab", "VAY736"],
      "diseases": ["Sjögren's syndrome"]
    }
  },
  "results": {
    "total_found": 15,
    "returned": 5
  },
  "trials": [...],
  "benchmarking": {
    "query_parsing_time": 2.3,
    "rag_search_time": 2.9,
    "355m_ranking_time": 2.3
  }
}
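To sanity-check that the three stages stay inside the 7-10 second budget, you can print the benchmarking block directly. A short sketch using the field names from the example response above (the exact keys depend on app.py):

import requests

resp = requests.post(
    "https://gmkdigitalmedia-ctapi-raw.hf.space/search",
    json={"query": "ianalumab sjogren disease", "top_k": 5},
    timeout=60,
).json()

# Per-stage timings plus the overall processing time
for stage, seconds in resp.get("benchmarking", {}).items():
    print(f"{stage}: {seconds:.1f}s")
print(f"total: {resp.get('processing_time', 0):.1f}s")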
For Your Clients
Client Code Example (Python)
import requests

# Your API endpoint
API_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/search"

def search_trials(query, top_k=10):
    """Search clinical trials using the Option B API."""
    response = requests.post(
        API_URL,
        json={"query": query, "top_k": top_k}
    )
    return response.json()

# Use it
query = "what should a physician prescribing ianalumab for sjogrens know"
results = search_trials(query, top_k=5)

# Get structured data
trials = results["trials"]
for trial in trials:
    print(f"NCT ID: {trial['nct_id']}")
    print(f"Title: {trial['title']}")
    print(f"Relevance: {trial['scoring']['relevance_score']:.2%}")
    print(f"URL: {trial['url']}")
    print()

# Client generates their own response with their LLM
client_llm_response = their_llm.generate(
    f"Based on these trials: {trials}\nAnswer: {query}"
)
Client Code Example (JavaScript)
const API_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/search";

async function searchTrials(query, topK = 10) {
    const response = await fetch(API_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ query, top_k: topK })
    });
    return response.json();
}

// Use it
const query = "what should a physician prescribing ianalumab for sjogrens know";
const results = await searchTrials(query, 5);

// Process results
results.trials.forEach(trial => {
    console.log(`NCT ID: ${trial.nct_id}`);
    console.log(`Title: ${trial.title}`);
    console.log(`Relevance: ${trial.scoring.relevance_score}`);
});
Performance on HuggingFace
With GPU (Automatic on HF Spaces)
Query Parsing: 2-3s
RAG Search: 2-3s
355M Ranking: 2-3s (GPU-accelerated with @spaces.GPU; pattern sketched below)
Total: 7-10s
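For reference, this is the @spaces.GPU pattern the ranking step relies on; the function name and signature below are placeholders, the real code lives in foundation_engine.py:

import spaces  # HuggingFace 'spaces' package, available inside the Space

@spaces.GPU  # requests the GPU only while this function runs
def rank_with_355m(query, candidates):
    # hypothetical signature - run the 355M model over the candidates and return scores
    ...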
Resource Usage
RAM: ~10 GB (for 556K trials + embeddings + models)
GPU: T4 or better (automatic)
Storage: ~4 GB (data files cached)
Troubleshooting
If the space doesn't start:
Check the logs:
- Go to space settings → Logs
- Look for errors during data download or model loading
Common issues:
- Missing HF_TOKEN → add it under space secrets
- Out of memory → increase the hardware tier
- Data download fails → check that gmkdigitalmedia/foundation1.2-data exists
Check data files: Your space should download:
- dataset_chunks_TRIAL_AWARE.pkl (2.7 GB)
- dataset_embeddings_TRIAL_AWARE_FIXED.npy (816 MB)
- inverted_index_COMPREHENSIVE.pkl (308 MB)
These download automatically on the first run (a manual fetch is sketched below).
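If the automatic download fails, you can confirm the files are reachable with your token from any machine. A sketch assuming gmkdigitalmedia/foundation1.2-data is a dataset repo (adjust repo_type if it is not); note that this actually downloads the files, so allow ~4 GB of disk:

import os
from huggingface_hub import hf_hub_download

DATA_FILES = [
    "dataset_chunks_TRIAL_AWARE.pkl",
    "dataset_embeddings_TRIAL_AWARE_FIXED.npy",
    "inverted_index_COMPREHENSIVE.pkl",
]

for name in DATA_FILES:
    path = hf_hub_download(
        repo_id="gmkdigitalmedia/foundation1.2-data",
        repo_type="dataset",
        filename=name,
        token=os.environ.get("HF_TOKEN"),
    )
    print("OK:", name, "->", path)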
If queries are slow:
Check that the GPU is enabled:
- Space settings → Hardware → should be T4 or A10
- The @spaces.GPU decorator enables the GPU for 355M ranking
The first query is always slower:
- Models need to load (one-time)
- Subsequent queries are fast (see the warm-up sketch below)
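To take that one-time model load off your first real user, you can fire a throwaway warm-up request right after the build finishes. A sketch; any short query works:

import requests

requests.post(
    "https://gmkdigitalmedia-ctapi-raw.hf.space/search",
    json={"query": "warm up", "top_k": 1},
    timeout=300,  # the first query can take much longer while models load
)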
Verification Checklist
After deployment, verify:
- Space is running (green badge)
- /health endpoint returns healthy
- /search returns JSON in 7-10s
- Top trials have >90% relevance
- Perplexity scores are calculated
- No hallucinations (the 355M model only scores, it does not generate)
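A quick smoke test that covers most of this checklist (field names follow the example response earlier in this guide and may differ slightly in app.py):

import time
import requests

BASE = "https://gmkdigitalmedia-ctapi-raw.hf.space"

# /health should answer once the Space is running
assert requests.get(f"{BASE}/health", timeout=30).status_code == 200

# /search should come back in roughly 7-10 seconds after warm-up
start = time.time()
resp = requests.post(
    f"{BASE}/search",
    json={"query": "ianalumab sjogren disease", "top_k": 5},
    timeout=120,
).json()
print(f"search took {time.time() - start:.1f}s")

trials = resp.get("trials", [])
print("trials returned:", len(trials))
if trials:
    print("top relevance:", trials[0].get("scoring", {}).get("relevance_score"))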
Client Onboarding
Send this to your clients:
Clinical Trial API - Option B
Fast foundational RAG for clinical trial search.
- Endpoint: https://gmkdigitalmedia-ctapi-raw.hf.space/search
- Response time: 7-10 seconds
- Cost: $0.001 per query
- Returns: structured JSON with ranked trials
- Documentation: https://gmkdigitalmedia-ctapi-raw.hf.space/docs
Example:
curl -X POST https://gmkdigitalmedia-ctapi-raw.hf.space/search \
-H "Content-Type: application/json" \
-d '{"query": "ianalumab sjogren disease", "top_k": 10}'
Your LLM can then generate responses from the structured data.
Summary
You have everything ready to deploy!
- ✅ All code is in /mnt/c/Users/ibm/Documents/HF/CTapi-raw/
- ✅ Option B already implemented
- ✅ Tested locally (works as expected)
- ✅ Just needs to be pushed to HuggingFace
Next step:
cd /mnt/c/Users/ibm/Documents/HF/CTapi-raw
git push origin main
That's it!