Use the helix-py Python SDK to interact with HelixDB, a high-performance graph-vector database. The SDK provides a simple query interface and PyTorch-like API for defining custom graph queries and vector operations, making it ideal for similarity search, knowledge graphs, and machine learning pipelines.

How do I install the Python SDK?

Install helix-py using your preferred Python package manager:
uv add helix-py

How do I connect to HelixDB with Python?

Set up a Client to connect to your HelixDB instance:
Python
import helix

# Connect to a local helix instance
db = helix.Client(local=True, verbose=True)

# Note that the query name is case sensitive
db.query('add_user', {"name": "John", "age": 20})
The default port is 6969, but you can change it with the port parameter. For cloud instances, pass the api_endpoint parameter.
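For example, both connection styles (the endpoint URL below is a placeholder):
Python
import helix

# Local instance listening on a non-default port
db = helix.Client(local=True, port=7878)

# Cloud instance reached via its API endpoint (placeholder URL)
db = helix.Client(api_endpoint="https://your-instance.example.com")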

How do I execute queries with the Python SDK?

Define queries in a PyTorch-like manner, similar to neural network forward passes. Use built-in queries for common operations or define custom queries for complex workflows.

PyTorch-like query definition

Match your HelixQL queries with Python classes:
query.hx
QUERY add_user(name: String, age: I64) =>
  usr <- AddV<User>({name: name, age: age})
  RETURN usr
You can define a matching Python class:
Python
from helix.client import Query
from helix.types import Payload

class add_user(Query):
    def __init__(self, name: str, age: int):
        super().__init__()
        self.name = name
        self.age = age

    def query(self) -> Payload:
        return [{ "name": self.name, "age": self.age }]

    def response(self, response):
        return response
        
db.query(add_user("John", 20))
Make sure that the Query.query method returns a list of objects.
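Because query returns a list, a single Query object can carry several payloads. A sketch reusing the imports above (whether the server runs one request per payload is an assumption worth verifying against the Query docs):
Python
class add_users(Query):
    def __init__(self, users: list):
        super().__init__()
        self.users = users  # each entry is a {"name": ..., "age": ...} dict

    def query(self) -> Payload:
        return self.users  # already a list of payload objects

    def response(self, response):
        return response

db.query(add_users([{"name": "Ada", "age": 36}, {"name": "Alan", "age": 41}]))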

How do I manage HelixDB instances with Python?

Use the Instance class to manage HelixDB lifecycle automatically within your Python scripts:
Python
from helix.instance import Instance
helix_instance = Instance("helixdb-cfg", 6969, verbose=True)

# Deploy & redeploy instance
helix_instance.deploy()

# Start instance
helix_instance.start()

# Stop instance
helix_instance.stop()

# Delete instance
helix_instance.delete()

# Instance status
print(helix_instance.status())
helixdb-cfg is the directory where the instance's configuration files are stored. Once deployed, you can interact with the instance using Client; the instance is stopped automatically when the script exits.
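A minimal end-to-end sketch, assuming the instance and client share the same port:
Python
import helix
from helix.instance import Instance

helix_instance = Instance("helixdb-cfg", 6969, verbose=True)
helix_instance.deploy()

# Connect a client to the instance we just deployed
db = helix.Client(local=True, port=6969)
db.query('add_user', {"name": "John", "age": 20})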

How do I use LLM providers with HelixDB?

Integrate popular LLM providers directly with HelixDB using built-in provider interfaces. Available providers:
  • OpenAIProvider
  • GeminiProvider
  • AnthropicProvider
Don’t forget to set the OPENAI_API_KEY, GEMINI_API_KEY, and ANTHROPIC_API_KEY environment variables depending on the provider you are using.
All providers expose two methods:
  • enable_mcps(name: str, url: str=...) -> bool to enable Helix MCP tools
  • generate(messages, response_model: BaseModel | None=None) -> str | BaseModel
The generate method supports messages in two formats:
  • Free-form text: pass a string
  • Message lists: pass a list of dict or provider-specific Message models
It also supports structured outputs by passing a Pydantic model to get validated results. Example:
from pydantic import BaseModel

# OpenAI
from helix.providers.openai_client import OpenAIProvider
openai_llm = OpenAIProvider(
    name="openai-llm",
    instructions="You are a helpful assistant.",
    model="gpt-5-nano",
    history=True
)
print(openai_llm.generate("Hello!"))

class Person(BaseModel):
    name: str
    age: int
    occupation: str

print(openai_llm.generate([{"role": "user", "content": "Who am I?"}], Person))
To enable MCP tools with a running Helix MCP server (see MCP Feature):
openai_llm.enable_mcps("helix-mcp")         # uses default http://localhost:8000/mcp/
gemini_llm.enable_mcps("helix-mcp")         # uses default http://localhost:8000/mcp/
anthropic_llm.enable_mcps("helix-mcp", url="https://your-remote-mcp/...")
  • OpenAI GPT-5 family models support reasoning, while other models use temperature.
  • Anthropic local streamable MCP is not supported; use a URL-based MCP.
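The OpenAI example carries over to the other providers. A sketch, assuming the module paths mirror openai_client and using placeholder model names:
Python
from helix.providers.gemini_client import GeminiProvider        # assumed module path
from helix.providers.anthropic_client import AnthropicProvider  # assumed module path

gemini_llm = GeminiProvider(
    name="gemini-llm",
    instructions="You are a helpful assistant.",
    model="gemini-2.5-flash",  # placeholder model name
    history=True
)

anthropic_llm = AnthropicProvider(
    name="anthropic-llm",
    instructions="You are a helpful assistant.",
    model="claude-sonnet-4-20250514",  # placeholder model name
    history=True
)

print(gemini_llm.generate("Hello!"))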

How do I generate embeddings with HelixDB?

Use built-in embedder interfaces to generate vector embeddings from popular providers. Available embedders:
  • OpenAIEmbedder
  • GeminiEmbedder
  • VoyageAIEmbedder
Each embedder implements:
  • embed(text: str, **kwargs) returns a vector [F64]
  • embed_batch(texts: List[str], **kwargs) returns a list of vectors [F64]
Examples (see examples/llm_providers/providers.ipynb for more):
from helix.embedding.openai_client import OpenAIEmbedder
openai_embedder = OpenAIEmbedder()  # requires OPENAI_API_KEY
vec = openai_embedder.embed("Hello world")
batch = openai_embedder.embed_batch(["a", "b", "c"])

from helix.embedding.gemini_client import GeminiEmbedder
gemini_embedder = GeminiEmbedder()
vec = gemini_embedder.embed("doc text", task_type="RETRIEVAL_DOCUMENT")

from helix.embedding.voyageai_client import VoyageAIEmbedder
voyage_embedder = VoyageAIEmbedder()
vec = voyage_embedder.embed("query text", input_type="query")
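To store an embedding, hand the vector to one of your HelixQL queries. A hedged sketch in which insert_doc is a hypothetical query that takes a vector and its source text:
Python
import helix
from helix.embedding.openai_client import OpenAIEmbedder

db = helix.Client(local=True)
embedder = OpenAIEmbedder()  # requires OPENAI_API_KEY

text = "Some document text"
vec = embedder.embed(text)
# 'insert_doc' is hypothetical; define it in your .hx queries first
db.query("insert_doc", {"vector": vec, "text": text})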

How do I chunk text for embeddings?

Split text into manageable pieces using Chonkie chunking methods:
from helix import Chunk

text = "Your long document text here..."
chunks = Chunk.token_chunk(text)

semantic_chunks = Chunk.semantic_chunk(text)

code_text = "def hello(): print('world')"
code_chunks = Chunk.code_chunk(code_text, language="python")

texts = ["Document 1...", "Document 2...", "Document 3..."]
batch_chunks = Chunk.sentence_chunk(texts)
You can find all of the chunking examples in the Chunking Feature.
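Chunking pairs naturally with the embedders above. A sketch, assuming each returned chunk exposes its text via Chonkie's .text attribute:
Python
from helix import Chunk
from helix.embedding.openai_client import OpenAIEmbedder

embedder = OpenAIEmbedder()
chunks = Chunk.token_chunk("Your long document text here...")
# .text is Chonkie's chunk attribute; verify against the Chunking Feature docs
vectors = embedder.embed_batch([c.text for c in chunks])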

How do I load data into HelixDB?

The loader supports .parquet, .fvecs, and .csv files. Pass your file path and column names to automatically load data into your queries.
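A sketch of the intended flow (the exact Loader signature is an assumption; check the loader documentation):
Python
import helix

# Load a parquet file, keeping only the named column(s)
data = helix.Loader("data/vectors.parquet", cols=["vector"])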

More Information

For more information, check out our examples!