Pricing
Billing is based on the GPU type, the number of instances, and the duration for which the instances run.
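As a rough illustration, the sketch below estimates a charge from a flavor's price unit. It assumes "usd_price_unit" is the price per instance per hour; the unit of measure is not stated on this page, so treat the figures as hypothetical:
# Hypothetical cost estimate, assuming usd_price_unit is USD per instance-hour.
usd_price_unit = 0.3   # RTX 4090 flavor from the example below
instances = 2          # number of instances running the service
hours = 5              # how long the instances run
estimated_cost_usd = usd_price_unit * instances * hours
print(f"Estimated cost: ${estimated_cost_usd:.2f}")  # -> Estimated cost: $3.00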
Get inference flavor
Replace {{API_TOKEN}} with your actual token.
Example Request:
curl --location 'https://api.netmind.ai/v1/inference-service/flavor' \
--header 'Authorization: Bearer {{API_TOKEN}}'
import requests

url = "https://api.netmind.ai/v1/inference-service/flavor"
headers = {
    'Authorization': 'Bearer {{API_TOKEN}}'
}

# List the available GPU flavors and their pricing.
response = requests.get(url, headers=headers)
print(response.text)
Example Response:
{
"flavor_list": [
{
"flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
"display_name": "NVIDIA_GeForce_RTX_4090",
"cluster_id": "1",
"cluster_flavor_id": "US_01_4090",
"meta_info": {
"cuda": "12.0",
"region": "cn"
},
"billing": {
"cny_price_unit": 2.2,
"usd_price_unit": 0.3,
"nmt_price_unit": 0.147164
},
"created_at": "2024-11-13 09:00:19",
"updated_at": "2024-11-13 09:00:19",
"deleted_at": null,
"is_deleted": false,
"available_num": 1,
"node_max_gpu": 2
},
{
"flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
"display_name": "NVIDIA_GeForce_RTX_4090",
"cluster_id": "1",
"cluster_flavor_id": "US_01_4090",
"meta_info": {
"cuda": "12.0",
"region": "other"
},
"billing": {
"cny_price_unit": 2.2,
"usd_price_unit": 0.3,
"nmt_price_unit": 0.147164
},
"created_at": "2024-11-13 09:00:19",
"updated_at": "2024-11-13 09:00:19",
"deleted_at": null,
"is_deleted": false,
"available_num": 1,
"node_max_gpu": 2
}
]
}
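The "flavor_id" values returned here are what you pass as "resource_metadata.flavor_id" when creating an endpoint in the next step. As a sketch, the snippet below picks the cheapest flavor (by "usd_price_unit") that still has capacity; the selection criteria are illustrative, not something the API requires:
import requests

url = "https://api.netmind.ai/v1/inference-service/flavor"
headers = {'Authorization': 'Bearer {{API_TOKEN}}'}

flavors = requests.get(url, headers=headers).json()["flavor_list"]

# Keep flavors that still have capacity, then take the cheapest by USD price.
# These criteria are illustrative; choose whatever fits your workload.
candidates = [f for f in flavors if f["available_num"] > 0]
cheapest = min(candidates, key=lambda f: f["billing"]["usd_price_unit"])
print(cheapest["flavor_id"], cheapest["display_name"])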
Create endpoint
Replace {{API_TOKEN}} with your actual token.
Example Request:
curl --location 'https://api.netmind.ai/v1/inference-service' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
"name": "test-flask-server-9",
"description": "test",
"payment_type": "usd",
"resource_metadata": {
"flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
"scale_type": "manual",
"target_instance_number": 1
},
"deploy_metadata": {
"image_url": "python:3.10-slim",
"command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
"port": 8080,
"gpu_num": 0
}
}'
import requests

url = "https://api.netmind.ai/v1/inference-service"
payload = {
    "name": "test-flask-server-9",
    "description": "test",
    "payment_type": "usd",
    "resource_metadata": {
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "scale_type": "manual",
        "target_instance_number": 1
    },
    "deploy_metadata": {
        "image_url": "python:3.10-slim",
        "command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
        "port": 8080,
        "gpu_num": 0
    }
}
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer {{API_TOKEN}}'
}

# Create the inference service; the response contains the service_id used below.
response = requests.post(url, headers=headers, json=payload)
print(response.text)
Example Response:
{
"service_id": "...",
"name": "test-flask-server-9",
"description": "test",
"user_id": "...",
"status": "initializing",
"status_info": null,
"resource_metadata": {
"resource_display_name": "NVIDIA_GeForce_RTX_4090",
"scale_type": "manual",
"flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
"target_instance_number": 1,
"scale_policy": null,
"VRAM": "32GB",
"image_size": "12GB"
},
"billing_metadata": {
...
},
"endpoint_metadata": {
...
},
"deploy_metadata": {
...
},
"service_type": "normal",
"created_at": "...",
"updated_at": "...",
"deleted_at": null,
"is_deleted": false,
"payment_type": "usd"
}
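The "service_id" in this response is the value the following sections refer to as {{INFERENCE_ID}}. Continuing the Python example above, you can capture it directly from the response:
# Continuation of the Python request above: keep the service_id for later calls.
created = response.json()
inference_id = created["service_id"]  # use this wherever {{INFERENCE_ID}} appears
print(inference_id)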
Get endpoint
Replace {{API_TOKEN}} with your actual token.
Replace {{INFERENCE_ID}} with the "service_id" you received in the previous step.
Example Request:
curl --location 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"
headers = {
    'Authorization': 'Bearer {{API_TOKEN}}'
}

# Fetch the current state of the inference service.
response = requests.get(url, headers=headers)
print(response.text)
Example Response:
{
"service_id": "...",
"name": "test-flask-server-9",
"description": "test",
"user_id": "...",
"status": "available",
"status_info": null,
"resource_metadata": {
"resource_display_name": "NVIDIA_GeForce_RTX_4090",
"scale_type": "manual",
"flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
"target_instance_number": 1,
"scale_policy": null,
"VRAM": "32GB",
"image_size": "12GB"
},
"billing_metadata": {
...
},
"endpoint_metadata": {
...
},
"deploy_metadata": {
...
},
"service_type": "normal",
"created_at": "...",
"updated_at": "...",
"deleted_at": null,
"is_deleted": false,
"payment_type": "usd"
}
The value of "status" can be:
initializing: The status immediately after creating a new instance or redeploying a stopped instance. It means the instance is being initialized.
available: Typically follows the "Deploying" status. This indicates the instance is running normally, can be scaled, and the endpoint is accessible.
stopped: The instance is stopped, and the number of workers will be reduced to zero.
unavailable: The instance deployment or scaling failed due to an error, which may cause the endpoint to become inaccessible.
When the instance is in an "available" state, it can be accessed via "https://api.deeptrin.com/inference-api/v1/inference_service/{{INFERENCE_ID}}".
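Putting the last two steps together, the sketch below polls the service until "status" is "available" and then sends a request to the access URL above. Whether that URL expects the same Authorization header, and which paths your container answers on, depend on your deployment and are assumptions here:
import time
import requests

API_TOKEN = "{{API_TOKEN}}"
INFERENCE_ID = "{{INFERENCE_ID}}"
headers = {'Authorization': f'Bearer {API_TOKEN}'}

# Poll the management API until the service leaves "initializing".
status_url = f"https://api.netmind.ai/v1/inference-service/{INFERENCE_ID}"
while True:
    status = requests.get(status_url, headers=headers).json()["status"]
    if status == "available":
        break
    if status == "unavailable":
        raise RuntimeError("deployment failed; check status_info for details")
    time.sleep(10)

# Once available, call the deployed service. Which paths respond depends on the
# routes your container serves; sending the same Authorization header is an
# assumption, not something this page states.
endpoint_url = f"https://api.deeptrin.com/inference-api/v1/inference_service/{INFERENCE_ID}"
print(requests.get(endpoint_url, headers=headers).text)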
Update endpoint
Replace {{API_TOKEN}} with your actual token.
Replace {{INFERENCE_ID}} with the "service_id" you received in the previous step.
Example Request:
curl --location --request PUT 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
"resource_metadata": {
"target_instance_number": "2"
}
}'
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"
payload = {
    "resource_metadata": {
        "target_instance_number": 2
    }
}
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer {{API_TOKEN}}'
}

# Scale the service to two instances.
response = requests.put(url, headers=headers, json=payload)
print(response.text)
Example Response:
{
"service_id": "...",
"name": "test-flask-server-9",
"description": "test",
"user_id": "...",
"status": "initializing",
"status_info": null,
"resource_metadata": {
"resource_display_name": "NVIDIA_GeForce_RTX_4090",
"scale_type": "manual",
"flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
"target_instance_number": 1,
"scale_policy": null,
"VRAM": "32GB",
"image_size": "12GB"
},
"billing_metadata": {
...
},
"endpoint_metadata": {
...
},
"deploy_metadata": {
...
},
"service_type": "normal",
"created_at": "...",
"updated_at": "...",
"deleted_at": null,
"is_deleted": false,
"payment_type": "usd"
}
Delete endpoint
Replace {{API_TOKEN}} with your actual token.
Replace {{INFERENCE_ID}} with the "service_id" you received in the previous step.
Example Request:
curl --location --request DELETE 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"
headers = {
    'Authorization': 'Bearer {{API_TOKEN}}'
}

# Permanently delete the inference service.
response = requests.delete(url, headers=headers)
print(response.text)
Example Response: