# Dedicated Endpoints

## Pricing

Billing is based on the GPU type, the number of instances, and how long the instances run.
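As a rough illustration of how these three factors combine, the sketch below multiplies a unit price by the instance count and the running time. It assumes the unit price (for example, the `usd_price_unit` field returned by the flavor API) is charged per instance per hour; this page does not state the billing interval, so treat that as an assumption. The function name `estimate_cost` is hypothetical.

```python
def estimate_cost(price_unit: float, instances: int, hours: float) -> float:
    """Estimate a charge, ASSUMING price_unit is billed per instance per hour."""
    if instances < 0 or hours < 0:
        raise ValueError("instances and hours must be non-negative")
    return price_unit * instances * hours


# Example: 2 instances at a unit price of 0.3 running for 24 hours each.
print(round(estimate_cost(0.3, 2, 24), 2))
```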

## Get inference flavor

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service/flavor' \
--header 'Authorization: Bearer {{API_TOKEN}}'

```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/flavor"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "flavor_list": [
        {
            "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "cn"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        },
        {
            "flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "other"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        }
    ]
}
```
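To create an endpoint you need a `flavor_id` from this list. The sketch below shows one possible way to choose a flavor: it filters by region and picks the lowest `usd_price_unit`. It assumes that `available_num` counts units that can currently be allocated, which this page does not confirm. The helper name `pick_flavor` and the trimmed `sample` data are illustrative only.

```python
from typing import Dict, List, Optional


def pick_flavor(flavor_list: List[Dict], region: Optional[str] = None) -> Optional[Dict]:
    """Return the cheapest usable flavor (by USD price), or None if nothing matches."""
    candidates = [
        flavor
        for flavor in flavor_list
        if not flavor.get("is_deleted", False)
        # ASSUMPTION: available_num > 0 means the flavor can be allocated now.
        and flavor.get("available_num", 0) > 0
        and (region is None or flavor.get("meta_info", {}).get("region") == region)
    ]
    return min(candidates, key=lambda flavor: flavor["billing"]["usd_price_unit"], default=None)


# Trimmed copy of the example response above.
sample = {
    "flavor_list": [
        {
            "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
            "meta_info": {"cuda": "12.0", "region": "cn"},
            "billing": {"usd_price_unit": 0.3},
            "is_deleted": False,
            "available_num": 1,
        },
        {
            "flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
            "meta_info": {"cuda": "12.0", "region": "other"},
            "billing": {"usd_price_unit": 0.3},
            "is_deleted": False,
            "available_num": 1,
        },
    ]
}

chosen = pick_flavor(sample["flavor_list"], region="other")
print(chosen["flavor_id"] if chosen else "no matching flavor")
```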

## Create endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
    "name": "test-flask-server-9",
    "description": "test",
    "payment_type": "usd",
    "resource_metadata": {
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "scale_type": "manual",
        "target_instance_number": 1
    },
    "deploy_metadata": {
        "image_url": "python:3.10-slim",
        "command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
        "port": 8080,
        "gpu_num": 0
    }
}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
import json

url = "https://api.netmind.ai/v1/inference-service"

payload = json.dumps({
  "name": "test-flask-server-9",
  "description": "test",
  "payment_type": "usd",
  "resource_metadata": {
    "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
    "scale_type": "manual",
    "target_instance_number": 1
  },
  "deploy_metadata": {
    "image_url": "python:3.10-slim",
    "command": "pip install flask && apt update && apt install -y curl && curl -O https://raw.githubusercontent.com/huang-hf/share_data/refs/heads/main/app.py && python app.py",
    "port": 8080,
    "gpu_num": 0
  }
})
headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "initializing",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```
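If you create endpoints from scripts, it may help to build the request body with a small function instead of editing JSON by hand. The sketch below reproduces the structure of the example request; the function name `build_create_payload` and its defaults are hypothetical, and the only validation it performs is a TCP port range check. Whether the API accepts other `scale_type` values is not covered here, so the sketch hard-codes `"manual"` as in the example.

```python
import json


def build_create_payload(name, flavor_id, image_url, command, port,
                         instances=1, gpu_num=0, description="", payment_type="usd"):
    """Build a request body shaped like the example create request."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be in the range 1-65535")
    return {
        "name": name,
        "description": description,
        "payment_type": payment_type,
        "resource_metadata": {
            "flavor_id": flavor_id,
            "scale_type": "manual",  # the only value shown in the example
            "target_instance_number": instances,
        },
        "deploy_metadata": {
            "image_url": image_url,
            "command": command,
            "port": port,
            "gpu_num": gpu_num,
        },
    }


payload = build_create_payload(
    name="test-flask-server-9",
    flavor_id="69475e82e81c4dd6be3467e2ca374e0c",
    image_url="python:3.10-slim",
    command="python app.py",
    port=8080,
    description="test",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the `data` argument, as in the Python tab above.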

## Get endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.

Replace `{{INFERENCE_ID}}` with the `service_id` returned by the create request.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "available",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```

{% hint style="info" %}
The value of "status" can be:

* **initializing**: The status immediately after you create a new instance or redeploy a stopped one. The instance is being initialized.
* **available**: Typically follows the "initializing" status. The instance is running normally, can be scaled, and its endpoint is accessible.
* **stopped**: The instance has been stopped and its number of workers is reduced to zero.
* **unavailable**: Deployment or scaling failed because of an error, which may make the endpoint inaccessible.
  {% endhint %}

{% hint style="info" %}
When the instance is in the "available" state, it can be accessed at `https://api.deeptrin.com/inference-api/v1/inference_service/{{INFERENCE_ID}}`.
{% endhint %}
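Because a new endpoint starts in "initializing", a client usually has to poll until the status becomes "available". The sketch below is one way to do that. The function `wait_until_available` and its parameters are hypothetical; it takes a caller-supplied `get_status` callable so the polling logic stays independent of the HTTP client. It treats "stopped" and "unavailable" as states that will not become "available" on their own, which is an interpretation of the list above rather than documented behavior.

```python
import time
from typing import Callable


def wait_until_available(get_status: Callable[[], str],
                         interval: float = 5.0,
                         timeout: float = 600.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_status() until it returns "available".

    Returns True once available, False if the timeout elapses first.
    Raises RuntimeError on "stopped" or "unavailable" (ASSUMED to need intervention).
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "available":
            return True
        if status in ("stopped", "unavailable"):
            raise RuntimeError(f"endpoint entered status {status!r}")
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Simulated status sequence; in practice get_status could wrap the GET request
# shown above and return response.json()["status"].
statuses = iter(["initializing", "initializing", "available"])
ready = wait_until_available(lambda: next(statuses), interval=0, sleep=lambda s: None)
print(ready)
```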

## Update endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.\
Replace `{{INFERENCE_ID}}` with the `service_id` returned by the create request.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location --request PUT 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{API_TOKEN}}' \
--data '{
  "resource_metadata": {
    "target_instance_number": 2
  }
}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
import json

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = json.dumps({
  "resource_metadata": {
    "target_instance_number": 2
  }
})
headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("PUT", url, headers=headers, data=payload)

print(response.text)

```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
{
    "service_id": "...",
    "name": "test-flask-server-9",
    "description": "test",
    "user_id": "...",
    "status": "initializing",
    "status_info": null,
    "resource_metadata": {
        "resource_display_name": "NVIDIA_GeForce_RTX_4090",
        "scale_type": "manual",
        "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
        "target_instance_number": 1,
        "scale_policy": null,
        "VRAM": "32GB",
        "image_size": "12GB"
    },
    "billing_metadata": {
        ...
    },
    "endpoint_metadata": {
        ...
    },
    "deploy_metadata": {
        ...
    },
    "service_type": "normal",
    "created_at": "...",
    "updated_at": "...",
    "deleted_at": null,
    "is_deleted": false,
    "payment_type": "usd"
}
```

## Delete endpoint

{% hint style="info" %}
Replace `{{API_TOKEN}}` with your actual token.\
Replace `{{INFERENCE_ID}}` with the `service_id` returned by the create request.
{% endhint %}

**Example Request:**

{% tabs %}
{% tab title="Curl" %}

```bash
curl --location --request DELETE 'https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}' \
--header 'Authorization: Bearer {{API_TOKEN}}'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests

url = "https://api.netmind.ai/v1/inference-service/{{INFERENCE_ID}}"

payload = {}
headers = {
  'Authorization': 'Bearer {{API_TOKEN}}'
}

response = requests.request("DELETE", url, headers=headers, data=payload)

print(response.text)


```

{% endtab %}
{% endtabs %}

**Example Response:**

```json
```
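The empty example above suggests that a delete request may return no body. If so, calling `response.json()` in `requests` would raise an error, so a client may want to parse the body defensively. The sketch below shows one way to do that with a hypothetical helper, `parse_body`; it assumes an empty body is a normal result for this call, which is an inference from the example rather than documented behavior.

```python
import json
from typing import Any, Optional


def parse_body(text: str) -> Optional[Any]:
    """Return the decoded JSON body, or None when the body is empty."""
    text = text.strip()
    return json.loads(text) if text else None


# An empty body (as in the example above) yields None instead of an exception.
print(parse_body(""))
print(parse_body('{"is_deleted": true}'))
```

With the Python tab above, this would be called as `parse_body(response.text)` after checking `response.status_code`.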
