
Deploy Option B to CTapi-raw HuggingFace Space

Your HuggingFace Space


✅ Files You Already Have (Ready to Deploy!)

Core Files

  • ✅ app.py - Has /search endpoint (Option B!)
  • ✅ foundation_engine.py - Has all Option B logic
  • ✅ requirements.txt - All dependencies
  • ✅ Dockerfile - Docker configuration

Documentation

  • ✅ OPTION_B_IMPLEMENTATION_GUIDE.md - Complete guide
  • ✅ TEST_RESULTS_PHYSICIAN_QUERY.md - Test results
  • ✅ QUICK_START.md - Quick reference

🚀 Deployment Steps

Step 1: Set HuggingFace Token in Space Settings

  1. Go to: https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw/settings
  2. Add Secret:
    Name: HF_TOKEN
    Value: <your_huggingface_token>
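
Spaces expose secrets to the running container as environment variables, so the app can check for the token at startup and fail with a clear message when it is missing. A minimal sketch, assuming a hypothetical helper named `require_hf_token` (the real app.py may read the token differently):

```python
import os


def require_hf_token(env=None):
    """Return the HF_TOKEN secret, or raise a clear error if it is missing.

    `require_hf_token` is a hypothetical helper for illustration only;
    it is not taken from app.py.
    """
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Add it under Space settings -> Secrets."
        )
    return token
```

Calling `require_hf_token()` once before any data download turns a confusing authentication failure into an explicit startup error.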
    

Step 2: Push Your Local Files to HuggingFace

cd /mnt/c/Users/ibm/Documents/HF/CTapi-raw

# Initialize git if needed
git init
git remote add origin https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw

# Or if already initialized
git remote set-url origin https://huggingface.co/spaces/gmkdigitalmedia/CTapi-raw

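# Stage the deployment files
git add app.py foundation_engine.py requirements.txt Dockerfile README.md

# Commit
git commit -m "Deploy Option B: Query Parser + RAG + 355M Ranking"

# Make sure the local branch is named main (git init may have created master)
git branch -M main

# Push to HuggingFace
git push origin main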

Step 3: Wait for Build

HuggingFace will automatically:

  1. Build the Docker container
  2. Download data files (about 3.8 GB in total from gmkdigitalmedia/foundation1.2-data)
  3. Start the API server
  4. Expose it at: https://gmkdigitalmedia-ctapi-raw.hf.space

Build time: ~10-15 minutes
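
Rather than refreshing the Space page, the build can be watched by polling the health endpoint. The sketch below assumes that /health answers with HTTP 200 once the server is up; the helper names `fetch_status` and `wait_until_healthy` are illustrative, and the default of 30 attempts spaced 30 seconds apart is chosen only to cover the 15-minute upper end of the build time.

```python
import time
import urllib.request

HEALTH_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/health"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:  # URLError, HTTPError and socket timeouts are all OSError
        return None


def wait_until_healthy(url=HEALTH_URL, attempts=30, delay=30.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it answers 200. Return True on success, else False."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # do not sleep after the final attempt
    return False
```

`fetch` and `sleep` are parameters so the loop can be exercised without a network connection or real waiting.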


📋 What Your Space Will Have

Endpoints

Primary (Option B):

POST /search

Auxiliary:

GET /              # API info
GET /health        # Health check
GET /docs          # Swagger UI
GET /redoc         # ReDoc

Example Usage

# Test the API
curl -X POST https://gmkdigitalmedia-ctapi-raw.hf.space/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what should a physician prescribing ianalumab for sjogrens know",
    "top_k": 5
  }'

Expected Response:

{
  "query": "...",
  "processing_time": 7.5,
  "query_analysis": {
    "extracted_entities": {
      "drugs": ["ianalumab", "VAY736"],
      "diseases": ["Sjögren's syndrome"]
    }
  },
  "results": {
    "total_found": 15,
    "returned": 5
  },
  "trials": [...],
  "benchmarking": {
    "query_parsing_time": 2.3,
    "rag_search_time": 2.9,
    "355m_ranking_time": 2.3
  }
}
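
The three benchmarking values in the response above add up to the reported processing_time (2.3 + 2.9 + 2.3 = 7.5). A small sketch of how a client might compute that breakdown and spot time not attributed to any stage; `timing_breakdown` is a hypothetical helper name, and the field names are the ones shown in the expected response.

```python
def timing_breakdown(payload):
    """Split a /search response's total time into per-stage time and overhead."""
    stages = dict(payload.get("benchmarking", {}))
    stage_total = sum(stages.values())
    overhead = payload.get("processing_time", 0.0) - stage_total
    return {
        "stages": stages,
        "stage_total": round(stage_total, 3),
        "overhead": round(overhead, 3),
    }


# Values copied from the expected response above.
example = {
    "processing_time": 7.5,
    "benchmarking": {
        "query_parsing_time": 2.3,
        "rag_search_time": 2.9,
        "355m_ranking_time": 2.3,
    },
}
breakdown = timing_breakdown(example)
```

A large `overhead` value would point at time spent outside the three measured stages, for example in network transfer or response serialization.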

🎯 For Your Clients

Client Code Example (Python)

import requests

# Your API endpoint
API_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/search"

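def search_trials(query, top_k=10, timeout=60):
    """Search clinical trials using the Option B API."""
    response = requests.post(
        API_URL,
        json={"query": query, "top_k": top_k},
        timeout=timeout,  # requests has no default timeout; the first query can be slow
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
    return response.json()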

# Use it
query = "what should a physician prescribing ianalumab for sjogrens know"
results = search_trials(query, top_k=5)

# Get structured data
trials = results["trials"]
for trial in trials:
    print(f"NCT ID: {trial['nct_id']}")
    print(f"Title: {trial['title']}")
    print(f"Relevance: {trial['scoring']['relevance_score']:.2%}")
    print(f"URL: {trial['url']}")
    print()

# The client then generates its own answer with its own LLM.
# `their_llm` is a placeholder for the client's model object; it is not part of this API.
client_llm_response = their_llm.generate(
    f"Based on these trials: {trials}\nAnswer: {query}"
)
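
Interpolating the whole `trials` list into a prompt sends every field of every trial to the client's LLM. A sketch of a more compact alternative is below, assuming only the fields already used in the example above (`nct_id`, `title`, `url`, `scoring.relevance_score`); `build_prompt` is a hypothetical helper, not part of the API.

```python
def build_prompt(query, trials, max_trials=5):
    """Format the top trials into a compact prompt for a downstream LLM."""
    lines = []
    for trial in trials[:max_trials]:
        score = trial.get("scoring", {}).get("relevance_score")
        # Assumes relevance_score is a fraction between 0 and 1.
        score_text = f"{score:.0%}" if isinstance(score, (int, float)) else "n/a"
        lines.append(
            f"- {trial.get('nct_id', '?')}: {trial.get('title', '?')} "
            f"(relevance {score_text}, {trial.get('url', 'no url')})"
        )
    context = "\n".join(lines) if lines else "(no trials returned)"
    return f"Based on these trials:\n{context}\n\nAnswer: {query}"
```

The resulting string can be passed to whatever generation call the client's LLM library provides.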

Client Code Example (JavaScript)

const API_URL = "https://gmkdigitalmedia-ctapi-raw.hf.space/search";

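async function searchTrials(query, topK = 10) {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, top_k: topK })
  });
  // fetch only rejects on network errors, so check the HTTP status explicitly
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
}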

// Use it (top-level await needs an ES module; otherwise wrap this in an async function)
const query = "what should a physician prescribing ianalumab for sjogrens know";
const results = await searchTrials(query, 5);

// Process results
results.trials.forEach(trial => {
  console.log(`NCT ID: ${trial.nct_id}`);
  console.log(`Title: ${trial.title}`);
  console.log(`Relevance: ${(trial.scoring.relevance_score * 100).toFixed(2)}%`);
});

📊 Performance on HuggingFace

With GPU (T4 or better; confirm under Space settings → Hardware)

Query Parsing:  2-3s
RAG Search:     2-3s
355M Ranking:   2-3s (GPU-accelerated with @spaces.GPU)
Total:          7-10s

Resource Usage

RAM: ~10 GB (for 556K trials + embeddings + models)
GPU: T4 or better (confirm under Space settings → Hardware)
Storage: ~4 GB (data files cached)

🔧 Troubleshooting

If the Space doesn't start:

  1. Check logs:

    • Go to Space settings → Logs
    • Look for errors during data download or model loading
  2. Common issues:

    • Missing HF_TOKEN → Add it in Space secrets
    • Out of memory → Increase hardware tier
    • Data download fails → Check that gmkdigitalmedia/foundation1.2-data exists
  3. Check data files: Your Space should download:

    • dataset_chunks_TRIAL_AWARE.pkl (2.7 GB)
    • dataset_embeddings_TRIAL_AWARE_FIXED.npy (816 MB)
    • inverted_index_COMPREHENSIVE.pkl (308 MB)

    These download automatically on first run.
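
An interrupted first-run download is easier to diagnose by checking file sizes than by reading a stack trace. The sketch below takes the three file names and approximate sizes from the list above (together about 3.8 GB). The data directory is an assumption, since this guide does not say where the files are cached, and `check_data_files` is a hypothetical helper.

```python
from pathlib import Path

# File names and approximate sizes in bytes, taken from the list above.
EXPECTED_FILES = {
    "dataset_chunks_TRIAL_AWARE.pkl": 2.7e9,
    "dataset_embeddings_TRIAL_AWARE_FIXED.npy": 816e6,
    "inverted_index_COMPREHENSIVE.pkl": 308e6,
}


def check_data_files(data_dir, expected=EXPECTED_FILES, tolerance=0.5):
    """Return a list of problems: missing files, or files far smaller than expected.

    A file counts as suspicious when it is below `tolerance` times its
    approximate size, which usually indicates an interrupted download.
    """
    problems = []
    for name, approx_bytes in expected.items():
        path = Path(data_dir) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif path.stat().st_size < approx_bytes * tolerance:
            problems.append(f"possibly truncated: {name} ({path.stat().st_size} bytes)")
    return problems
```

An empty list means all three files are present and plausibly complete; pass the directory your app actually downloads into.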

If queries are slow:

  1. Check GPU is enabled:

    • Space settings → Hardware → Should be T4 or A10
    • The @spaces.GPU decorator enables GPU for 355M ranking
  2. First query is always slower:

    • Models need to load (one-time)
    • Subsequent queries are fast
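
To separate the one-time model load from steady-state latency, time a few consecutive requests and compare the first with the rest. A minimal sketch, where `measure_latencies` is a hypothetical helper and `call` is any zero-argument function that sends one query:

```python
import time


def measure_latencies(call, runs=3, clock=time.perf_counter):
    """Call `call()` `runs` times and return each call's duration in seconds."""
    durations = []
    for _ in range(runs):
        start = clock()
        call()
        durations.append(clock() - start)
    return durations
```

If only the first duration is high, the delay is warm-up; if all of them are high, check the hardware setting described above.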

✅ Verification Checklist

After deployment, verify:

  • Space is running (green badge)
  • /health endpoint returns healthy
  • /search returns JSON in 7-10s
  • Top trials have >90% relevance
  • Perplexity scores are calculated
  • No hallucinations (the 355M model only scores trials; it generates no text)
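
Some of these checks can be read directly from a single /search response. The sketch below covers the timing and relevance items only, assuming `relevance_score` is a fraction between 0 and 1 as the Python client example implies. `verify_search_response` is a hypothetical helper, and the perplexity check is left out because this guide does not name the field that carries it.

```python
def verify_search_response(payload, max_seconds=10.0, min_top_relevance=0.90):
    """Return {check name: passed} for checks readable from one /search response."""
    trials = payload.get("trials") or []
    if trials:
        top_score = trials[0].get("scoring", {}).get("relevance_score", 0.0)
    else:
        top_score = 0.0
    return {
        "has_trials": bool(trials),
        "within_time_budget": payload.get("processing_time", float("inf")) <= max_seconds,
        "top_relevance_ok": top_score > min_top_relevance,
    }
```

`all(verify_search_response(payload).values())` gives a single pass or fail result for a smoke test.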

📞 Client Onboarding

Send this to your clients:

🎉 Clinical Trial API - Option B

Fast foundational RAG for clinical trial search.

📍 Endpoint: https://gmkdigitalmedia-ctapi-raw.hf.space/search

⏱️  Response time: 7-10 seconds
💰 Cost: $0.001 per query
📊 Returns: Structured JSON with ranked trials

📖 Documentation: https://gmkdigitalmedia-ctapi-raw.hf.space/docs

Example:
curl -X POST https://gmkdigitalmedia-ctapi-raw.hf.space/search \
  -H "Content-Type: application/json" \
  -d '{"query": "ianalumab sjogren disease", "top_k": 10}'

Your LLM can then generate responses from the structured data.

🎯 Summary

You have everything ready to deploy!

  1. ✅ All code is in /mnt/c/Users/ibm/Documents/HF/CTapi-raw/
  2. ✅ Option B already implemented
  3. ✅ Tested locally (works perfectly!)
  4. ✅ Just needs to be pushed to HuggingFace

Next step:

cd /mnt/c/Users/ibm/Documents/HF/CTapi-raw
git push origin main

That's it! 🚀