Dedicated Endpoints

Pricing

Billing is based on the GPU type, the number of instances, and how long the instances run.
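As a rough illustration of how these three factors combine, the sketch below multiplies a unit price by the instance count and the running time. The function name `estimate_cost` is hypothetical, and treating the unit price as a per-instance, per-hour rate is an assumption; the source does not state the billing interval.

```python
from decimal import Decimal


def estimate_cost(unit_price: Decimal, instances: int, hours: Decimal) -> Decimal:
    """Estimate cost as unit price x instance count x running time.

    Assumption: unit_price is charged per instance per hour. The
    documentation does not state the billing interval.
    """
    if instances < 0 or hours < 0:
        raise ValueError("instances and hours must be non-negative")
    return unit_price * instances * hours


# Two instances at a unit price of 0.3 USD, each running for 10 hours.
print(estimate_cost(Decimal("0.3"), 2, Decimal("10")))
```

`Decimal` is used instead of `float` so that currency arithmetic does not pick up binary rounding error.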

Get inference flavor

Replace {{API_TOKEN}} with your actual token.

Example Request:

curl --location 'https://api.netmind.ai/v1/inference-service/flavor' \
--header 'Authorization: Bearer {{API_TOKEN}}'
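For readers who prefer Python to curl, the same request could be sketched with the standard library as follows. The URL and header come from the curl example above; the helper names `build_flavor_request` and `fetch_flavors` and the 30-second timeout are illustrative choices, not part of the API.

```python
import json
import urllib.request

FLAVOR_URL = "https://api.netmind.ai/v1/inference-service/flavor"


def build_flavor_request(api_token: str) -> urllib.request.Request:
    """Build the GET request with the same bearer-token header as the curl call."""
    return urllib.request.Request(
        FLAVOR_URL,
        headers={"Authorization": f"Bearer {api_token}"},
    )


def fetch_flavors(api_token: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_flavor_request(api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_flavors` with a valid token should return a dictionary containing a `flavor_list` key, as in the example response.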

Example Response:

{
    "flavor_list": [
        {
            "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "cn"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        },
        {
            "flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
            "display_name": "NVIDIA_GeForce_RTX_4090",
            "cluster_id": "1",
            "cluster_flavor_id": "US_01_4090",
            "meta_info": {
                "cuda": "12.0",
                "region": "other"
            },
            "billing": {
                "cny_price_unit": 2.2,
                "usd_price_unit": 0.3,
                "nmt_price_unit": 0.147164
            },
            "created_at": "2024-11-13 09:00:19",
            "updated_at": "2024-11-13 09:00:19",
            "deleted_at": null,
            "is_deleted": false,
            "available_num": 1,
            "node_max_gpu": 2
        }
    ]
}
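A response like the one above can be filtered before choosing a flavor. The sketch below is one possible approach: it assumes that `available_num` is the number of units currently free and that `is_deleted` marks retired flavors, neither of which the source states explicitly. The helper name `available_flavors` is hypothetical.

```python
from typing import Optional


def available_flavors(response: dict, region: Optional[str] = None) -> list:
    """Return usable flavors, cheapest (by USD unit price) first.

    Assumptions: available_num >= 1 means the flavor can be used now,
    and is_deleted marks a flavor that should be skipped.
    """
    selected = []
    for flavor in response.get("flavor_list", []):
        if flavor.get("is_deleted") or flavor.get("available_num", 0) < 1:
            continue
        if region is not None and flavor.get("meta_info", {}).get("region") != region:
            continue
        selected.append(flavor)
    return sorted(selected, key=lambda f: f["billing"]["usd_price_unit"])


# Trimmed copy of the example response, keeping only the fields used here.
sample = {
    "flavor_list": [
        {
            "flavor_id": "69475e82e81c4dd6be3467e2ca374e0c",
            "meta_info": {"cuda": "12.0", "region": "cn"},
            "billing": {"usd_price_unit": 0.3},
            "is_deleted": False,
            "available_num": 1,
        },
        {
            "flavor_id": "7b2f36d30e0743debc6c60d5017e2d16",
            "meta_info": {"cuda": "12.0", "region": "other"},
            "billing": {"usd_price_unit": 0.3},
            "is_deleted": False,
            "available_num": 1,
        },
    ]
}

for flavor in available_flavors(sample, region="cn"):
    print(flavor["flavor_id"])
```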

Create endpoint

Replace {{API_TOKEN}} with your actual token.

Example Request:

Example Response:

Get endpoint

Replace {{API_TOKEN}} with your actual token.

Replace {{INFERENCE_ID}} with the "service_id" returned in the previous step.

Example Request:

Example Response:

The value of "status" can be:

  • initializing: The status immediately after a new instance is created or a stopped instance is redeployed. The instance is still being initialized.

  • available: Typically follows the "Deploying" status. This indicates the instance is running normally, can be scaled, and the endpoint is accessible.

  • stopped: The instance is stopped and its number of workers is reduced to zero.

  • unavailable: The instance deployment or scaling failed due to an error, which may cause the endpoint to become inaccessible.
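Because a new instance starts in "initializing", a client will usually poll until the status changes. The sketch below shows one way to do that. It takes a caller-supplied `get_status` function rather than assuming a particular request, since how the status is fetched is up to the caller; the function name, timeout, and polling interval are all illustrative.

```python
import time
from typing import Callable


def wait_until_settled(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status is no longer "initializing" and return it.

    The result is one of "available", "stopped", or "unavailable".
    Raises TimeoutError if the status is still "initializing" at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status != "initializing":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("instance is still initializing")
        sleep(interval_s)


# Simulated status sequence standing in for real API calls.
statuses = iter(["initializing", "initializing", "available"])
print(wait_until_settled(lambda: next(statuses), sleep=lambda _: None))
```

Callers should still check the returned value, since "unavailable" signals a failed deployment rather than a usable endpoint.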

When the instance is in an "available" state, it can be accessed via "https://api.deeptrin.com/inference-api/v1/inference_service/{{INFERENCE_ID}}".
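The access URL can be built by substituting the service ID into the template above. This is a small sketch; the helper name `endpoint_url` is hypothetical, and percent-encoding the ID is a defensive choice rather than a documented requirement.

```python
from urllib.parse import quote

INFERENCE_BASE_URL = "https://api.deeptrin.com/inference-api/v1/inference_service"


def endpoint_url(inference_id: str) -> str:
    """Return the access URL for a service ID, percent-encoding it defensively."""
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    return f"{INFERENCE_BASE_URL}/{quote(inference_id, safe='')}"
```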

Update endpoint

Replace {{API_TOKEN}} with your actual token. Replace {{INFERENCE_ID}} with the "service_id" returned when the endpoint was created.

Example Request:

Example Response:

Delete endpoint

Replace {{API_TOKEN}} with your actual token. Replace {{INFERENCE_ID}} with the "service_id" returned when the endpoint was created.

Example Request:

Example Response:
