# Batch Processing

Our Batch API is compatible with OpenAI's Batch API, and it saves you 50% of the cost compared to the synchronous interfaces.

## Supported Models

### /v1/chat/completions

* meta-llama/Meta-Llama-3.1-8B-Instruct
* meta-llama/Llama-3.3-70B-Instruct
* google/gemma-2-27b-it
* google/gemma-2-9b-it
* Qwen/Qwen2.5-7B-Instruct

### /v1/embeddings

Coming soon.

## Preparing Your Batch File

Batches start with a .jsonl file in which each line contains the details of one request to the API. For now, the only available endpoint is /v1/chat/completions (Chat Completions API); /v1/embeddings (Embeddings API) is not supported yet. The parameters in each line's body field are the same as the parameters of the underlying endpoint. Each request must include a unique custom\_id value, which you can use to look up its result after the batch completes. Note that each input file can only include requests to a single model.

Here's an example of an input file with 2 requests.

```
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
```
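For larger jobs it is usually easier to generate the input file than to write it by hand. The following is a minimal sketch of one way to do that, assuming a plain list of user prompts; the helper names `build_batch_lines` and `validate_batch_lines` and the output path `batch_input.jsonl` are illustrative and not part of the API. The validator only checks the two constraints stated above: unique custom\_id values and a single model per file.

```python
import json

def build_batch_lines(model, prompts, max_tokens=1000):
    """Return one JSON string per request, using the input format shown above."""
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{i}",  # unique within this file
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,  # one model per input file
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": max_tokens,
            },
        }
        lines.append(json.dumps(request))
    return lines

def validate_batch_lines(lines):
    """Raise ValueError if custom_id values repeat or more than one model is used."""
    seen_ids, models = set(), set()
    for number, line in enumerate(lines, start=1):
        request = json.loads(line)
        custom_id = request["custom_id"]
        if custom_id in seen_ids:
            raise ValueError(f"line {number}: duplicate custom_id {custom_id!r}")
        seen_ids.add(custom_id)
        models.add(request["body"]["model"])
    if len(models) > 1:
        raise ValueError(f"input file mixes models: {sorted(models)}")

lines = build_batch_lines(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    ["Hello world!", "Summarize the benefits of batch processing."],
)
validate_batch_lines(lines)

# Write one request per line (JSON Lines).
with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```

Running the validator before uploading catches both constraints locally, without waiting for the batch to be rejected.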

## Uploading Your Batch File

Upload the batch file with the Netmind platform's File API. See [File API](https://netmind-power.gitbook.io/netmind-power-documentation/api/files) for more information.

## Creating the Batch

Once you've successfully uploaded your input file, you can use the input File object's ID to create a batch. In this case, let's assume the file ID is file-123456. For now, the completion window can only be set to 24h. You can also provide custom metadata via an optional metadata parameter.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X POST 'https://api.netmind.ai/inference-api/openai/v1/batches' \
--header "Authorization: $API_TOKEN" \
--header 'Content-Type: application/json' \
--data-raw '{
    "input_file_id": "file-123456",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
}'
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_input_file_id = "file-123456"
client.batches.create(
    input_file_id=batch_input_file_id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "nightly eval job"
    }
)
```

### Example response

```json
{
    "id": "batch_123456",
    "object": "batch",
    "endpoint": "",
    "errors": null,
    "input_file_id": null,
    "completion_window": "",
    "status": "pending",
    "output_file_id": null,
    "error_file_id": null,
    "created_at": null,
    "in_progress_at": null,
    "expires_at": null,
    "finalizing_at": null,
    "completed_at": null,
    "failed_at": null,
    "expired_at": null,
    "cancelling_at": null,
    "cancelled_at": null,
    "request_counts": {
        "total": 0,
        "completed": 0,
        "failed": 0
    },
    "metadata": null
}
```

## Checking the Status of a Batch

You can check the status of a batch at any time, which will also return a Batch object.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X GET 'https://api.netmind.ai/inference-api/openai/v1/batches/{{BATCH_ID}}' \
--header "Authorization: $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_id = "{{BATCH_ID}}"
batch = client.batches.retrieve(batch_id)
print(batch)
```
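Because a batch can take up to the full completion window, a script typically polls the status until the batch reaches a final state. The sketch below shows one way to do this. It is not an official client feature: `wait_for_batch` is a hypothetical helper, and the set of terminal statuses is an assumption borrowed from the OpenAI Batch API, so check it against the statuses this platform actually returns. The `retrieve` argument is any callable that returns an object with a `status` attribute, such as `client.batches.retrieve`.

```python
import time

# Assumed terminal states (taken from the OpenAI Batch API; verify for this platform).
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(retrieve, batch_id, interval=30.0, timeout=24 * 3600, sleep=time.sleep):
    """Call retrieve(batch_id) repeatedly until the batch reaches a terminal status.

    Raises TimeoutError if about `timeout` seconds of waiting pass first.
    """
    waited = 0.0
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if waited >= timeout:
            raise TimeoutError(f"batch {batch_id} still {batch.status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval

# Usage with the client from the example above (requires network access):
# batch = wait_for_batch(client.batches.retrieve, "{{BATCH_ID}}")
# print(batch.status, batch.output_file_id)
```

Passing the retrieve function in as an argument keeps the loop independent of the client, which makes it easy to test with a stub.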

## Retrieving the Results

Use the Netmind platform's File API to download the file content. See [File API](https://netmind-power.gitbook.io/netmind-power-documentation/api/files) for more information. The result file ID is in batch.output\_file\_id.
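Once the output file has been downloaded, the results can be matched back to the original requests through custom\_id. The sketch below assumes the output file follows the OpenAI batch output layout, that is, one JSON object per line with `custom_id`, `response` (containing `body`), and `error` fields. That layout is an assumption and is not confirmed by this page. `index_results` is a hypothetical helper, and the sample records are made up for illustration.

```python
import json

def index_results(jsonl_text):
    """Map each custom_id to its response body and error (either may be None)."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        response = record.get("response") or {}
        results[record["custom_id"]] = {
            "body": response.get("body"),
            "error": record.get("error"),
        }
    return results

# Made-up sample in the assumed output layout: one success and one failure.
sample_output = "\n".join([
    json.dumps({
        "custom_id": "request-1",
        "response": {
            "status_code": 200,
            "body": {"choices": [{"message": {"role": "assistant", "content": "Hi!"}}]},
        },
        "error": None,
    }),
    json.dumps({
        "custom_id": "request-2",
        "response": None,
        "error": {"message": "example failure"},
    }),
])

results = index_results(sample_output)
print(results["request-1"]["body"]["choices"][0]["message"]["content"])
```

Indexing by custom\_id avoids relying on the order of lines in the output file, which may differ from the order of the input.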

## Canceling the Batch

If necessary, you can cancel an ongoing batch. The batch's status will change to cancelling until in-flight requests are complete (up to 10 minutes), after which the status will change to cancelled.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X POST 'https://api.netmind.ai/inference-api/openai/v1/batches/{{BATCH_ID}}/cancel' \
--header "Authorization: $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_id = "{{BATCH_ID}}"
client.batches.cancel(batch_id)
```

## Getting a List of All Batches

You can list all your batches at any time. If you have many batches, use the limit and after parameters to paginate the results.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X GET 'https://api.netmind.ai/inference-api/openai/v1/batches?limit=10' \
--header "Authorization: $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batches = client.batches.list(limit=10)
print(batches)
```
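To walk through every batch rather than only the first page, the `after` cursor can be advanced manually. The following is a rough sketch, assuming each page exposes its items in a `data` attribute (as the page objects of the `openai` client do) and that `after` takes the ID of the last batch on the previous page. `iter_all_batches` is a hypothetical helper, not part of the API.

```python
def iter_all_batches(list_page, page_size=10):
    """Yield every batch by requesting pages and following the `after` cursor."""
    after = None
    while True:
        kwargs = {"limit": page_size}
        if after is not None:
            kwargs["after"] = after
        page = list_page(**kwargs)
        items = list(page.data)
        yield from items
        if len(items) < page_size:
            break  # a short (or empty) page means there is nothing left
        after = items[-1].id  # cursor: ID of the last batch on this page

# Usage with the client from the example above (requires network access):
# for batch in iter_all_batches(client.batches.list, page_size=10):
#     print(batch.id, batch.status)
```

The `openai` client can also follow pages automatically when you iterate over the object returned by `client.batches.list(...)`; the explicit loop is mainly useful when you want control over each request.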


