# Dedicated Endpoints

## Pricing

Billing is based on the GPU type, the number of instances, and how long the instances run.
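
As a rough illustration of how those three factors might combine, the sketch below multiplies a unit price by the instance count and the running time. It assumes the unit price (for example, the `usd_price_unit` field returned by the flavor API) is charged per instance per hour. This page does not state the billing interval, so `estimate_cost` and its parameters are hypothetical and should be checked against your actual charges.

```python
def estimate_cost(unit_price: float, instances: int, hours: float) -> float:
    """Estimate the charge for `instances` instances running for `hours` hours.

    ASSUMPTION: `unit_price` is billed per instance per hour. The page does
    not state the billing interval, so verify this against your account.
    """
    if unit_price < 0 or instances < 0 or hours < 0:
        raise ValueError("unit_price, instances and hours must be non-negative")
    return unit_price * instances * hours


# Two instances at a unit price of 0.3, running for 10 hours.
print(round(estimate_cost(0.3, 2, 10), 2))
```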

## Get inference flavor

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service/flavor' \
--header 'Authorization: Bearer {{API_TOKEN}}'

```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/flavor"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "flavor_list": [
        {
            "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "cn"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        },
        {
            "flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "other"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        }
    ]
}
```
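
To pick a `flavor_id` for the Create endpoint request, one option is to filter this response on the client side. The following sketch keeps flavors that are not deleted and report a positive `available_num`, then returns the one with the lowest price. Reading `available_num` as "units currently free" is an assumption, and `cheapest_available_flavor` is a hypothetical helper, not part of the API.

```python
from typing import Optional


def cheapest_available_flavor(response: dict, price_key: str = "usd_price_unit") -> Optional[dict]:
    """Return the lowest-priced flavor that looks usable, or None.

    ASSUMPTION: a flavor is usable when `is_deleted` is false and
    `available_num` is greater than zero.
    """
    candidates = [
        flavor
        for flavor in response.get("flavor_list", [])
        if not flavor.get("is_deleted", False) and flavor.get("available_num", 0) > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda flavor: flavor["billing"][price_key])


# Trimmed-down, made-up data in the shape of the example response above.
sample = {
    "flavor_list": [
        {"flavor_id": "a", "billing": {"usd_price_unit": 0.5}, "is_deleted": False, "available_num": 1},
        {"flavor_id": "b", "billing": {"usd_price_unit": 0.3}, "is_deleted": False, "available_num": 0},
        {"flavor_id": "c", "billing": {"usd_price_unit": 0.4}, "is_deleted": False, "available_num": 2},
    ]
}
best = cheapest_available_flavor(sample)
print(best["flavor_id"])
```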

## Create endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
    "name": "test-flask-server-9",
    "description": "test",
    "payment_type": "usd",
    "resource_metadata": {
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "scale_type": "manual",
        "target_instance_number": 1
    },
    "deploy_metadata": {
        "image_url": "python:3.10-slim",
        "command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
        "port": 8080,
        "gpu_num": 0
    }
}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
import json

url = "https://api.netmind.ai/v1/inference-service"

payload = json.dumps({
  "name": "test-flask-server-9",
  "description": "test",
  "payment_type": "usd",
  "resource_metadata": {
    "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
    "scale_type": "manual",
    "target_instance_number": 1
  },
  "deploy_metadata": {
    "image_url": "python:3.10-slim",
    "command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
    "port": 8080,
    "gpu_num": 0
  }
})
headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "initializing",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```
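
Judging by the field names, the request body groups its settings into two nested objects: `resource_metadata` for the flavor and instance count, and `deploy_metadata` for the container image, start command, and port. A small builder can keep that structure in one place. This is only a sketch: `build_create_payload` and its defaults are hypothetical, and it covers only the fields shown in the example request.

```python
import json


def build_create_payload(name, flavor_id, image_url, command, port,
                         description="", payment_type="usd",
                         instances=1, gpu_num=0):
    """Assemble a request body in the shape of the example request above."""
    return {
        "name": name,
        "description": description,
        "payment_type": payment_type,
        "resource_metadata": {
            "flavor_id": flavor_id,
            "scale_type": "manual",  # the only value shown on this page
            "target_instance_number": instances,
        },
        "deploy_metadata": {
            "image_url": image_url,
            "command": command,
            "port": port,
            "gpu_num": gpu_num,
        },
    }


payload = build_create_payload(
    name="test-flask-server-9",
    flavor_id="69475e82e81c4dd6be3467e2ca374e0c",
    image_url="python:3.10-slim",
    command="python app.py",
    port=8080,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the POST body, as in the Python request example.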

## Get endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.

Replace `{{INFERENCE_ID}}` with the `service_id` returned when you created the endpoint.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "available",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```

{% hint style="info" %}
The value of "status" can be:

* **initializing**: The status immediately after you create a new instance or redeploy a stopped one. The instance is being initialized.
* **available**: Typically follows the "initializing" status. The instance is running normally, can be scaled, and its endpoint is accessible.
* **stopped**: The instance is stopped, and the number of workers will be reduced to zero.
* **unavailable**: The instance deployment or scaling failed due to an error, which may cause the endpoint to become inaccessible.
  {% endhint %}

{% hint style="info" %}
When the instance is in the "available" state, it can be accessed at `https://api.deeptrin.com/inference-api/v1/inference_service/{{INFERENCE_ID}}`.
{% endhint %}
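
Because a new endpoint starts in "initializing" and is only reachable once it is "available", a client usually has to poll the Get endpoint request before sending traffic. The sketch below shows one way to do that. `wait_until_available` is a hypothetical helper, the timeout and interval values are arbitrary, and treating "stopped" and "unavailable" as reasons to give up is a design choice rather than documented behavior. The status source is passed in as a callable so the loop can be exercised without network access.

```python
import time


def wait_until_available(fetch_status, timeout=600.0, interval=5.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch_status()` until it returns "available".

    `fetch_status` is any zero-argument callable returning the current
    "status" string, for example a wrapper around the Get endpoint request.
    Raises RuntimeError on "unavailable" or "stopped" (an assumption about
    what counts as terminal) and TimeoutError if the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == "available":
            return status
        if status in ("unavailable", "stopped"):
            raise RuntimeError(f"endpoint entered status {status!r}")
        if clock() >= deadline:
            raise TimeoutError("endpoint did not become available in time")
        sleep(interval)


# Simulated status sequence so the example runs without network access.
statuses = iter(["initializing", "initializing", "available"])
result = wait_until_available(lambda: next(statuses), sleep=lambda _: None)
print(result)
```

In real use, `fetch_status` would issue the GET request shown above and return the `"status"` field of the decoded JSON.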

## Update endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.\
Replace `{{INFERENCE_ID}}` with the `service_id` returned when you created the endpoint.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location --request PUT 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
  "resource_metadata": {
    "target_instance_number": 2
  }
}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
import json

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = json.dumps({
  "resource_metadata": {
    "target_instance_number": 2
  }
})
headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("PUT", url, headers=headers, data=payload)

print(response.text)

```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "initializing",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```
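
For scripts that scale endpoints up and down, it can help to build the PUT request in one function and inspect it before sending. The sketch below uses only the standard library. `build_scale_request` and `BASE_URL` are hypothetical names, the ID and token are placeholders, and the instance count is sent as a JSON number to match the create request, which is an assumption about what the API accepts.

```python
import json
import urllib.request

BASE_URL = "https://api.netmind.ai/v1/inference-service"


def build_scale_request(service_id: str, api_token: str, instances: int) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets target_instance_number."""
    body = json.dumps({"resource_metadata": {"target_instance_number": instances}})
    return urllib.request.Request(
        url=f"{BASE_URL}/{service_id}",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="PUT",
    )


# Placeholder ID and token; nothing is sent until urlopen() is called.
request = build_scale_request("my-service-id", "my-token", 2)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it.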

## Delete endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.\
Replace `{{INFERENCE_ID}}` with the `service_id` returned when you created the endpoint.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location --request DELETE 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("DELETE", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
```
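
The example response above is empty, so client code probably should not assume the body parses as JSON. The following defensive sketch assumes (this page does not confirm it) that a successful delete may return either an empty body or a JSON document. `parse_body` is a hypothetical helper, and the non-empty input is made up for illustration.

```python
import json
from typing import Any, Optional


def parse_body(text: str) -> Optional[Any]:
    """Return decoded JSON, or None when the body is empty or whitespace."""
    if not text.strip():
        return None
    return json.loads(text)


print(parse_body(""))
print(parse_body('{"ok": true}'))  # made-up body, for illustration only
```

With the Python request example above, this would be called as `parse_body(response.text)`.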


