# Batch Processing

Our Batch API is OpenAI-compatible and costs 50% less than the equivalent synchronous interfaces.

## Supported Models

### /v1/chat/completions

* meta-llama/Meta-Llama-3.1-8B-Instruct
* meta-llama/Llama-3.3-70B-Instruct
* google/gemma-2-27b-it
* google/gemma-2-9b-it
* Qwen/Qwen2.5-7B-Instruct

### /v1/embeddings

Coming soon.

## Preparing Your Batch File

A batch starts with a `.jsonl` file in which each line contains the details of an individual request to the API. For now, the only available endpoint is `/v1/chat/completions` (Chat Completions API); `/v1/embeddings` (Embeddings API) is not yet supported. For a given input file, the parameters in each line's `body` field are the same as the parameters for the underlying endpoint. Each request must include a unique `custom_id` value, which you can use to reference results after completion. Note that each input file can only include requests to a single model. Here's an example of an input file with two requests:

```json
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
```
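If you are generating many requests, you can build the input file programmatically. A minimal sketch (the `batch_line` helper is ours, not part of the API):

```python
import json

def batch_line(custom_id, model, messages, max_tokens=1000):
    """Serialize one batch request as a single JSONL line."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": messages, "max_tokens": max_tokens},
    })

model = "meta-llama/Meta-Llama-3.1-8B-Instruct"
lines = [
    batch_line("request-1", model, [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello world!"},
    ]),
    batch_line("request-2", model, [
        {"role": "system", "content": "You are an unhelpful assistant."},
        {"role": "user", "content": "Hello world!"},
    ]),
]

# One JSON object per line, newline-terminated.
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

Remember that every `custom_id` must be unique within the file.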

## Uploading Your Batch File

You need to use the Netmind platform's File API to upload files. See the [File API](https://netmind-power.gitbook.io/netmind-power-documentation/api/files) documentation for more information.
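Assuming the File API mirrors OpenAI's files endpoint (`client.files.create` with `purpose="batch"`; the linked File API docs are authoritative), an upload might look like the sketch below. `upload_batch_file` is a hypothetical helper, and `client` is an OpenAI client configured with the Netmind base URL as in the examples later in this page.

```python
def upload_batch_file(client, path):
    """Upload a .jsonl batch input file and return its file ID.

    Assumes Netmind's File API mirrors OpenAI's files endpoint;
    consult the File API documentation for the authoritative interface.
    """
    with open(path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="batch")
    return uploaded.id
```

The returned file ID is what you pass as `input_file_id` when creating the batch.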

## Creating the Batch

Once you've successfully uploaded your input file, you can use the input File object's ID to create a batch. In this example, assume the file ID is `file-123456`. For now, the completion window can only be set to `24h`. You can also provide custom metadata via an optional `metadata` parameter.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X POST 'https://api.netmind.ai/inference-api/openai/v1/batches' \
--header "Authorization: Bearer $API_TOKEN" \
--header 'Content-Type: application/json' \
--data-raw '{
    "input_file_id": "file-123456",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
}'
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_input_file_id = "file-123456"
batch = client.batches.create(
    input_file_id=batch_input_file_id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "nightly eval job"
    },
)
print(batch.id)
```

### Example response

```json
{
    "id": "batch_123456",
    "object": "batch",
    "endpoint": "",
    "errors": null,
    "input_file_id": null,
    "completion_window": "",
    "status": "pending",
    "output_file_id": null,
    "error_file_id": null,
    "created_at": null,
    "in_progress_at": null,
    "expires_at": null,
    "finalizing_at": null,
    "completed_at": null,
    "failed_at": null,
    "expired_at": null,
    "cancelling_at": null,
    "cancelled_at": null,
    "request_counts": {
        "total": 0,
        "completed": 0,
        "failed": 0
    },
    "metadata": null
}
```

## Checking the Status of a Batch

You can check the status of a batch at any time, which will also return a Batch object.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X GET 'https://api.netmind.ai/inference-api/openai/v1/batches/{{BATCH_ID}}' \
--header "Authorization: Bearer $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_id = "{{BATCH_ID}}"
batch = client.batches.retrieve(batch_id)
print(batch)
```
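Because a batch can take hours to finish, a common pattern is to poll until it reaches a terminal status. A sketch, assuming the terminal statuses implied by the Batch object's timestamp fields (`completed`, `failed`, `expired`, `cancelled`); adjust the set if your batches report other values:

```python
import time

# Terminal statuses inferred from the Batch object's timestamp fields
# (completed_at, failed_at, expired_at, cancelled_at).
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(client, batch_id, poll_interval=30.0, timeout=86400.0):
    """Poll a batch until it reaches a terminal status, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch.status}")
        time.sleep(poll_interval)
```

Keep `poll_interval` generous (tens of seconds); the completion window is 24 hours, so tight polling gains nothing.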

## Retrieving the Results

You need to use the Netmind platform's File API to retrieve file content. See the [File API](https://netmind-power.gitbook.io/netmind-power-documentation/api/files) documentation for more information. The result file ID is in `batch.output_file_id`.
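The output file is also JSONL, one result per line, each carrying the `custom_id` from your input. A sketch that indexes the raw file content by `custom_id` (`index_results` is a hypothetical helper; how you fetch the raw content from `batch.output_file_id` is defined by the File API):

```python
import json

def index_results(raw_jsonl):
    """Map custom_id -> parsed result record from batch output content."""
    results = {}
    for line in raw_jsonl.splitlines():
        if line.strip():
            record = json.loads(line)
            results[record["custom_id"]] = record
    return results
```

Results are not guaranteed to come back in input order, which is exactly why `custom_id` exists.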

## Canceling the Batch

If necessary, you can cancel an ongoing batch. The batch's status will change to `cancelling` until in-flight requests are complete (up to 10 minutes), after which the status will change to `cancelled`.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X POST 'https://api.netmind.ai/inference-api/openai/v1/batches/{{BATCH_ID}}/cancel' \
--header "Authorization: Bearer $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batch_id = "{{BATCH_ID}}"
client.batches.cancel(batch_id)
```

## Getting a List of All Batches

At any time, you can list all your batches. If you have many batches, use the `limit` and `after` parameters to paginate the results.

### Curl Example

```sh
export API_TOKEN={{API_TOKEN}}
curl -X GET 'https://api.netmind.ai/inference-api/openai/v1/batches?limit=10' \
--header "Authorization: Bearer $API_TOKEN"
```

### Python Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.netmind.ai/inference-api/openai/v1",
    api_key="{{API_TOKEN}}",
)

batches = client.batches.list(limit=10)
print(batches)
```
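When you have more batches than one page holds, you can walk the `after` cursor yourself. A sketch (`iter_all_batches` is our helper, not part of the API; it assumes list pages expose `data` and `has_more` as in OpenAI-style list responses):

```python
def iter_all_batches(client, page_size=10):
    """Yield every batch by paging with the `limit` and `after` cursor."""
    after = None
    while True:
        kwargs = {"limit": page_size}
        if after is not None:
            kwargs["after"] = after
        page = client.batches.list(**kwargs)
        data = list(page.data)
        yield from data
        if not data or not getattr(page, "has_more", False):
            return
        # The cursor for the next page is the last ID on this page.
        after = data[-1].id
```

The OpenAI Python SDK can also auto-paginate when you iterate the list object directly; the explicit cursor loop above just makes the `limit`/`after` mechanics visible.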
