GeneSilico Tools Hub

A precision oncology services platform that unifies clinical knowledgebases, pharmacogenomic networks, variant-level evidence, and dosing calculators behind a single async API. Every REST endpoint is automatically exposed as an MCP tool, enabling LLM-based agents to discover and invoke 81 operations without custom glue code.

Architecture

┌────────────────────────────────────────────────────────────────┐
│                       Clients                                  │
│   Browser / cURL / SDK        LLM Agent (Claude, GPT, …)       │
│         ▼                              ▼                       │
│    REST  :8000                   MCP  :8001                    │
│    (FastAPI)                (fastapi-mcp, Streamable HTTP)     │
├────────────────────────────────────────────────────────────────┤
│                   Shared Application Layer                     │
│    routes/      ──>      tools/      ──>      connectors/      │
├────────────────────────────────────────────────────────────────┤
│  Valkey/Redis    SQL Server (ODBC 18)    Qdrant Vector DB      │
│  (cache)         (HemOnc, DGI, drugs)    (embeddings, search)  │
│                  CIViC GraphQL API        VICC MetaKB          │
└────────────────────────────────────────────────────────────────┘
Component           Role
REST API (:8000)    Deterministic HTTP/JSON interface for web apps and direct programmatic access
MCP Server (:8001)  Streamable-HTTP transport implementing the Model Context Protocol (v2025-03-26) for AI agent tool discovery and invocation
Valkey              In-memory cache (Redis-compatible) for response memoisation and rate-limit state
Qdrant              Vector similarity search over biomedical literature embeddings (MedCPT, Snowflake Arctic, MiniCoil)
SQL Server          Relational store for HemOnc regimens, DGIdb interaction graphs, drug/patient cohort data
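
The response memoisation role listed for Valkey can be illustrated with a small sketch. This is not the contents of app/cache_utils.py: the TTLCache class is an in-process stand-in for Valkey, and the names cache_key and memoised, the key format, and the 300-second default TTL are all assumptions made for illustration.

```python
import hashlib
import json
import time


class TTLCache:
    """In-process stand-in for Valkey: maps key -> (expiry time, value)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] <= self._clock():
            # Missing or expired: drop any stale entry and report a miss.
            self._store.pop(key, None)
            return None
        return entry[1]

    def set(self, key, value, ttl_seconds):
        self._store[key] = (self._clock() + ttl_seconds, value)


def cache_key(path, params):
    """Stable key: endpoint path plus a hash of the sorted query parameters."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{path}:{digest}"


def memoised(cache, path, params, compute, ttl_seconds=300):
    """Return a cached response if present, else compute and store it.

    A cached value of None is treated as a miss in this sketch.
    """
    key = cache_key(path, params)
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute(params)
    cache.set(key, value, ttl_seconds)
    return value
```

Sorting the parameters before hashing means two requests that differ only in parameter order share one cache entry.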

Domain Modules

Dosing & Calculators (/calculators)

Body Surface Area estimation via six validated formulae (Mosteller, Du Bois, Haycock, Gehan–George, Fujimoto, Boyd) and BSA-normalised chemotherapy dose calculation (mg/m² and flat-dose methods).
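
As a worked illustration of the arithmetic, the sketch below implements two of the six formulae (Mosteller and Du Bois, using their standard published coefficients) and the mg/m² dose method. The function names are hypothetical and are not taken from the service's code; inputs are height in centimetres and weight in kilograms.

```python
import math


def bsa_mosteller(height_cm: float, weight_kg: float) -> float:
    """Mosteller: sqrt(height_cm * weight_kg / 3600), result in m²."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


def bsa_du_bois(height_cm: float, weight_kg: float) -> float:
    """Du Bois: 0.007184 * weight^0.425 * height^0.725, result in m²."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def bsa_dose_mg(dose_mg_per_m2: float, bsa_m2: float) -> float:
    """BSA-normalised dose: prescribed mg/m² multiplied by the patient's BSA."""
    return dose_mg_per_m2 * bsa_m2


# 180 cm, 80 kg: sqrt(14400 / 3600) = 2.0 m² by Mosteller.
bsa = bsa_mosteller(180, 80)
# 75 mg/m² at 2.0 m² gives 150.0 mg.
dose = bsa_dose_mg(75, bsa)
```

For the same patient the Du Bois formula gives a value just under 2.0 m², which shows why the choice of formula is exposed as a parameter rather than fixed.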

Clinical Trials (/clinicaltrials)

Semantic search over ClinicalTrials.gov snapshots indexed in Qdrant. Covers breast, prostate, lung, ovarian, and gastrointestinal malignancies. Supports free-text queries and structured intervention-based filtering.

Open Literature (/openliterature)

Vector-similarity retrieval against PubMed/MEDLINE article embeddings. Returns ranked results with title, abstract, PMID, and relevance score per cancer type.
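
To make the ranking step concrete, here is a brute-force sketch of vector-similarity retrieval using cosine similarity. It is only a stand-in: in the service the search runs inside Qdrant, and cosine is assumed here as the metric. The two-dimensional vectors, placeholder PMIDs, and titles are toy values, not real articles or embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, articles, top_k=2):
    """Score every article against the query and return the top_k by score."""
    scored = [
        {"pmid": a["pmid"], "title": a["title"], "score": cosine(query_vec, a["vector"])}
        for a in articles
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]


# Toy corpus with placeholder identifiers (not real PMIDs).
ARTICLES = [
    {"pmid": "A", "title": "placeholder title A", "vector": [1.0, 0.0]},
    {"pmid": "B", "title": "placeholder title B", "vector": [0.0, 1.0]},
    {"pmid": "C", "title": "placeholder title C", "vector": [1.0, 1.0]},
]

results = rank([1.0, 0.0], ARTICLES, top_k=2)
```

With this toy query, article A matches exactly and C is the next closest, so B is dropped by the top_k cut-off.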

Clinical Practice Guidelines (/guidelines)

Guideline retrieval for breast, lung, prostate, ovarian, GI, and haematological cancers. Content sourced from NCCN, ESMO, and ASCO guideline corpora embedded in Qdrant.

Drug Information (/drugs)

Drug lookup by trade name, INN, or therapeutic indication. Literature search returns publications linked to specific agents. Backed by SQL Server and vector indices.

Drug–Gene Interactions (/drug-gene-interactions)

Graph queries over DGIdb data: drug → gene, gene → drug, k-hop neighbourhood expansion, and shortest-path traversal between two entities in the pharmacogenomic interaction network.
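
The k-hop expansion and shortest-path queries can be sketched with a breadth-first search over an adjacency map. This is a minimal illustration, assuming an undirected, unweighted graph; the node names are placeholders rather than DGIdb records, and the function names are hypothetical.

```python
from collections import deque

# Placeholder drug-gene edges; real edges would come from DGIdb.
EDGES = [
    ("drug_A", "GENE_1"),
    ("drug_A", "GENE_2"),
    ("drug_B", "GENE_1"),
    ("drug_B", "GENE_3"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def k_hop(adj, start, k):
    """Return all nodes reachable from start within k hops (start excluded)."""
    seen = {start}
    frontier = {start}
    for _ in range(k):
        frontier = {n for node in frontier for n in adj.get(node, ()) if n not in seen}
        seen |= frontier
    return seen - {start}


def shortest_path(adj, source, target):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj.get(node, ())):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == target:
                # Walk parent links back to the source, then reverse.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


adj = build_adjacency(EDGES)
neighbours = k_hop(adj, "drug_A", 2)
path = shortest_path(adj, "drug_A", "GENE_3")
```

Here the path from drug_A to GENE_3 runs through the shared gene GENE_1 and the second drug, which is the kind of indirect link a shortest-path query surfaces.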

Patient Similarity (/patient-similarity)

Cohort filtering by demographics, diagnosis, staging, and biomarker status. Kaplan–Meier survival analysis on matched cohorts with configurable censoring and stratification.
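
For readers unfamiliar with the estimator, the following plain-Python sketch computes a Kaplan–Meier curve with right-censoring. The stack lists lifelines for survival analysis, so this is not the service's implementation; it only shows the product-limit calculation S(t) = Π (1 − d_i / n_i) on a toy cohort with made-up follow-up times.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    events[i] is True when the event was observed and False when the
    subject was right-censored at durations[i].
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy cohort: five subjects, two of them censored (at t=3 and t=8).
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

On this cohort the survival estimate steps down to roughly 0.8, 0.6, and 0.3 at times 2, 3, and 5; the censored subject at time 3 still counts as at risk for the event at that time.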

HemOnc / MedicalWiki (/medicalwiki)

Structured access to HemOnc.org chemotherapy regimens: disease contexts, treatment categories, dosing schedules, eligibility biomarkers, supporting evidence, and per-regimen reference URLs. Includes a snapshot endpoint that aggregates all facets for a given disease.

CIViC (/civicdb)

Interface to the Clinical Interpretation of Variants in Cancer knowledgebase. Type-ahead search for therapies, diseases, and molecular profiles. Evidence queries for drug sensitivity, resistance, and clinical response assertions.

Variant Interpretation (/variant-interpretation)

Aggregated variant–drug–disease associations from the VICC Meta-Knowledgebase. Supports gene+variant+disease precision queries, gene–therapy association lookups, and therapy-response filtering (sensitive / resistant).

Technology Stack

Layer       Technology
Runtime     Python 3.14, uvicorn (ASGI)
Framework   FastAPI 0.128, Pydantic v2
MCP         fastapi-mcp 0.4 (Streamable HTTP)
Embeddings  sentence-transformers 5.2, ONNX Runtime 1.24, fastembed 0.7
Vector DB   Qdrant (gRPC)
RDBMS       SQL Server via aioodbc / pyodbc (ODBC Driver 18)
Cache       Valkey (Redis 7-compatible) via redis-py async
ML/Stats    lifelines (survival), scikit-learn, torch 2.10 (CPU)
Packaging   uv, Docker multi-service compose

Getting Started

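Prerequisites

Python 3.14, uv, and Docker (used below for Valkey and the Compose stacks). Connecting to SQL Server additionally requires Microsoft ODBC Driver 18.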

Local Development

# Clone and install
git clone <repository-url> && cd tools-hub-service
uv sync

# Start Valkey (required for cache)
docker run -d --name valkey -p 6379:6379 valkey/valkey:latest

# REST API
uvicorn app.main:api_app --host 0.0.0.0 --port 8000

# MCP server (separate terminal)
uvicorn app.main:mcp_http_app --host 0.0.0.0 --port 8001

Docker Compose

# Full stack (API + MCP + Valkey)
docker compose up --build -d
# API  → http://localhost:8000
# MCP  → http://localhost:8001

# MCP-only stack (independent Valkey sidecar)
docker compose -f docker-compose.mcp.yml up --build -d

MCP Integration

The MCP server (:8001) implements the Model Context Protocol v2025-03-26 over Streamable HTTP. Any MCP-compatible client can connect:

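# 1. Initialise session (-i prints the response headers, including the Mcp-Session-Id to reuse below)
curl -i -X POST http://localhost:8001/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize",
       "params":{"protocolVersion":"2025-03-26",
                 "clientInfo":{"name":"test","version":"1.0"},
                 "capabilities":{}}}'

# 2. Confirm initialisation (the protocol expects this notification before further requests)
curl -X POST http://localhost:8001/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id-from-step-1>" \
  -d '{"jsonrpc":"2.0","method":"notifications/initialized"}'

# 3. List all 81 tools
curl -X POST http://localhost:8001/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id-from-step-1>" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'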

Optional bearer-token authentication is available via the MCP_AUTH_ENABLED environment variable.
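
The same handshake can be driven from Python with only the standard library. The sketch below builds the JSON-RPC payloads and request headers for the session steps; the helper names are hypothetical, and sending the token as an Authorization: Bearer header is an assumption about how the optional authentication is checked. The post helper is defined but not called here, since it needs a running server.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8001/mcp"
PROTOCOL_VERSION = "2025-03-26"


def mcp_headers(session_id=None, bearer_token=None):
    """Headers for a Streamable HTTP POST; session and token are optional."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    if bearer_token:
        # Assumption: the server expects the standard Bearer scheme.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers


def initialize_request(request_id=1):
    """JSON-RPC payload that opens a session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "clientInfo": {"name": "test", "version": "1.0"},
            "capabilities": {},
        },
    }


def initialized_notification():
    """Notification sent once the initialize response has arrived (no id)."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def tools_list_request(request_id=2):
    """JSON-RPC payload that asks the server for its tool catalogue."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def post(payload, headers, url=MCP_URL):
    """POST one message; returns (session id header or None, raw body text)."""
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Mcp-Session-Id"), response.read().decode("utf-8")
```

A typical sequence would call post with initialize_request(), reuse the returned session id in mcp_headers for the notification, and then send tools_list_request(). The body may arrive either as plain JSON or as a server-sent event stream, so a real client needs to handle both.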

Testing

# Full suite with coverage
uv run pytest --cov=app --cov-report=term-missing

# Specific module
uv run pytest tests/test_calculators.py -v

# Load testing (locust)
uv run locust -f perf/locustfile.py --host http://localhost:8000

Configuration

Variable                   Default                    Description
LISTEN_PORT                8000 / 8001                HTTP listen port
REDIS_URL                  redis://localhost:6379/0   Valkey / Redis connection string
DB_SERVER                  -                          SQL Server hostname
DB_USER / DB_PASS          -                          SQL Server credentials
QDRANT_HOST / QDRANT_PORT  -                          Qdrant gRPC endpoint
QDRANT_API_KEY             -                          Qdrant authentication key
MCP_AUTH_ENABLED           false                      Require bearer tokens on the MCP endpoint
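
One way to read these variables is a small settings loader, sketched below. It is not the service's actual configuration code: the Settings class, the load_settings function, and the set of strings accepted as true for MCP_AUTH_ENABLED are assumptions. Variables without a documented default are returned as None when unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    listen_port: int
    redis_url: str
    db_server: Optional[str]
    db_user: Optional[str]
    db_pass: Optional[str]
    qdrant_host: Optional[str]
    qdrant_port: Optional[int]
    qdrant_api_key: Optional[str]
    mcp_auth_enabled: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    qdrant_port = env.get("QDRANT_PORT")
    # Assumption: these spellings count as true; anything else is false.
    auth_flag = env.get("MCP_AUTH_ENABLED", "false").strip().lower()
    return Settings(
        listen_port=int(env.get("LISTEN_PORT", "8000")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        db_server=env.get("DB_SERVER"),
        db_user=env.get("DB_USER"),
        db_pass=env.get("DB_PASS"),
        qdrant_host=env.get("QDRANT_HOST"),
        qdrant_port=int(qdrant_port) if qdrant_port else None,
        qdrant_api_key=env.get("QDRANT_API_KEY"),
        mcp_auth_enabled=auth_flag in {"1", "true", "yes"},
    )
```

Passing a plain dictionary instead of the process environment, for example load_settings({"LISTEN_PORT": "8001", "MCP_AUTH_ENABLED": "true"}), keeps the loader easy to exercise in tests.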

Project Layout

app/
├── main.py                 # FastAPI apps (REST + MCP), lifespan, middleware
├── cache_utils.py          # Redis-backed response caching
├── routes/                 # 11 route modules (81 endpoints)
├── tools/                  # Domain logic per module
├── connectors/             # MSSQL, Qdrant, embedding model wrappers
│   └── embedding/          # MedCPT, Snowflake Arctic, MiniCoil loaders
└── core/                   # Logger, semantic search utilities
tests/                      # pytest suite (unit + integration)
perf/                       # Locust load-test scenarios
docker-compose.yml          # API + MCP + Valkey
docker-compose.mcp.yml      # Standalone MCP + Valkey
Dockerfile                  # API image (Python 3.14-slim)
Dockerfile.mcp              # MCP image (Python 3.14-slim)

Data Sources

Source              Usage
ClinicalTrials.gov  Trial registry snapshots
PubMed / MEDLINE    Biomedical literature
NCCN / ESMO / ASCO  Clinical practice guidelines
HemOnc.org          Chemotherapy regimens & protocols
CIViC               Clinical variant interpretations
DGIdb               Drug–gene interaction network
VICC MetaKB         Aggregated variant evidence
PharmGKB            Pharmacogenomics annotations

Version 1.0 · Python 3.14 · FastAPI 0.128 · 81 MCP tools