Embedding API


The Embedding API allows you to generate high-quality vector representations (embeddings) of text inputs. These embeddings can be used for tasks such as semantic search, text classification, and clustering. The API is fully compatible with the OpenAI SDK, making it easy to integrate into your existing workflows.

Base URL

https://api.netmind.ai/inference-api/openai/v1

Authentication

To use the API, you need to obtain a NetMind AI API Key. For detailed instructions, please refer to the authentication documentation.

Supported Models

  • nvidia/NV-Embed-v2

  • dunzhang/stella_en_1.5B_v5

  • BAAI/bge-m3

Usage Examples

Python Client

The Embedding API is compatible with the OpenAI Python SDK. Below is an example of how to use it:

from openai import OpenAI

# Initialize the client with NetMind API base URL and your API key
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.netmind.ai/inference-api/openai/v1"
)

# Generate embeddings
response = client.embeddings.create(
    input="This is a sample text to embed.",
    model="nvidia/NV-Embed-v2",
    encoding_format="float"  # only "float" is currently supported
)

# Access the embedding
embedding = response.data[0].embedding
print(embedding)
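
Once you have embeddings, comparing them with cosine similarity is a common building block for the semantic-search use case mentioned above. The sketch below is a minimal illustration, assuming the endpoint accepts a list of strings as input (as the OpenAI embeddings API does); the two sample texts are only for demonstration.

import math

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.netmind.ai/inference-api/openai/v1"
)

# Embed two texts in one request; input also accepts a list of strings
response = client.embeddings.create(
    input=["How do I reset my password?", "Steps to recover account access"],
    model="nvidia/NV-Embed-v2",
    encoding_format="float"
)

a = response.data[0].embedding
b = response.data[1].embedding

# Cosine similarity: dot(a, b) / (||a|| * ||b||)
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
print(dot / (norm_a * norm_b))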

cURL Example

# Set your API key
export API_KEY="<YOUR Netmind AI API Key>"

curl "https://api.netmind.ai/inference-api/openai/v1/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d $'{
    "model": "nvidia/NV-Embed-v2",
    "input": "This is a sample text to embed.",
    "encoding_format": "float" # only support float now
}'
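
Because the endpoint mirrors the OpenAI embeddings API, the response should follow the same structure. The sketch below is abbreviated (the embedding vector is truncated to three values) and exact field values will differ:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    }
  ],
  "model": "nvidia/NV-Embed-v2",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}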

BAAI/bge-m3 Example

# Set your API key
export API_KEY="<YOUR Netmind AI API Key>"

curl "https://api.netmind.ai/inference-api/openai/v1/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d $'{
    "model": "BAAI/bge-m3",
    "input": "This is a sample text to embed.",
    "encoding_format": "float", # only support float now
    "encoding_type": "dense" # [dense, sparse, colbert]
}'

The BAAI/bge-m3 model is a specialized embedding model designed to generate high-quality vector representations of text. It supports multiple encoding types: dense (the default), sparse, and colbert (multi-vector). For more details, please refer to the Hugging Face model card. A Python version of the same request is sketched below.
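
Since encoding_type is not a standard OpenAI parameter, this sketch forwards it through the Python SDK's extra_body argument; that the endpoint accepts it this way is an assumption based on the cURL example above.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.netmind.ai/inference-api/openai/v1"
)

# encoding_type is NetMind-specific, so it is passed via extra_body
# rather than as a named SDK argument (assumed to match the cURL call)
response = client.embeddings.create(
    input="This is a sample text to embed.",
    model="BAAI/bge-m3",
    encoding_format="float",
    extra_body={"encoding_type": "dense"}  # "dense", "sparse", or "colbert"
)

print(response.data[0].embedding)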
