Our Model Inference API is now available. Before running the examples, make sure you have created your API token. The examples below demonstrate how to call the models with CURL and Python.
For the full list of model inference APIs and detailed parameter descriptions, please visit the inference section.
Bart-cnn
This example also applies to other models that take text input.
CURL Example
export API_TOKEN=your_api_token
curl -X POST \
-H "Authorization: $API_TOKEN" \
-H "Content-Type: application/json" \
-d $'{
    "input": "New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York.A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband."
}' \
https://inference-api.netmind.ai/inference-api/v1/bart-large-cnn
Python Example
import requests
api_token = 'your_api_token'
headers = {
    'Authorization': api_token,
    'Content-Type': 'application/json'
}
data = {
    'input': 'New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York.A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband.'
}
response = requests.post('https://inference-api.netmind.ai/inference-api/v1/bart-large-cnn', headers=headers, json=data)
print(response.json())
rmbg
This example also applies to other models that take file input and return file output.
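For completeness, a CURL version of the same request might look like the sketch below (the endpoint and example image URL are taken from the Python example that follows; the binary image response is written directly to result.png):

```shell
export API_TOKEN=your_api_token
curl -X POST \
-H "Authorization: $API_TOKEN" \
-H "Content-Type: application/json" \
-d $'{
    "image": "https://i.postimg.cc/1XJMrxrT/giraffe.png"
}' \
-o result.png \
https://inference-api.netmind.ai/inference-api/v1/rmbg-1-4
```

Note that on failure the API may return a JSON error body instead of an image, so check result.png if the request does not succeed.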
import requests
api_token = 'your_api_token'
headers = {
    'Authorization': api_token,
    'Content-Type': 'application/json'
}
data = {
    'image': 'https://i.postimg.cc/1XJMrxrT/giraffe.png'
}
response = requests.post('https://inference-api.netmind.ai/inference-api/v1/rmbg-1-4', headers=headers, json=data)
if response.status_code == 200:
    with open('result.png', 'wb') as f:
        f.write(response.content)
else:
    print('Failed:', response.status_code, response.text)
Video-llava
This example also applies to other models that take file input.
video-llava supports two content types: application/json and multipart/form-data. The two content types correspond to different file-transfer methods, so choose whichever fits your situation. In general, if your file is large, use application/json and pass the file by URL; if your file is small, use multipart/form-data. Passing a URL via application/json may also give faster data transmission. Examples for both content types are provided below.
CURL Example
multipart/form-data Example
export API_TOKEN=your_api_token
curl -X POST \
-H "Authorization: $API_TOKEN" \
-F 'video=@"/path/to/audio.mp4"' \
-F 'inp="How many people are in the video?"' \
https://inference-api.netmind.ai/inference-api/v1/video-llava
application/json Example
export API_TOKEN=your_api_token
curl -X POST \
-H "Authorization: $API_TOKEN" \
-H "Content-Type: application/json" \
-d $'{
    "video": "https://inference-api-dev.netmind.ai/example_file/baby_laugh.mp4",
    "inp": "Why is this video funny?"
}' \
https://inference-api.netmind.ai/inference-api/v1/video-llava
Python Example
multipart/form-data Example
import requests
api_token = 'your_api_token'
headers = {
    'Authorization': api_token,
}
files = {
    'video': open('/path/to/audio.mp4', 'rb')
}
data = {
    'inp': 'How many people are in the video?'
}
response = requests.post('https://inference-api.netmind.ai/inference-api/v1/video-llava', headers=headers, files=files, data=data)
print(response.json())
application/json Example
import requests
api_token = 'your_api_token'
headers = {
    'Authorization': api_token,
    'Content-Type': 'application/json'
}
data = {
    'video': 'https://inference-api-dev.netmind.ai/example_file/baby_laugh.mp4',
    'inp': 'Why is this video funny?'
}
response = requests.post('https://inference-api.netmind.ai/inference-api/v1/video-llava', headers=headers, json=data)
print(response.json())
llama3-70B
This example also applies to other chat-based models.
CURL Example
export API_TOKEN=your_api_token
curl -X POST \
-H "Authorization: $API_TOKEN" \
-H "Content-Type: application/json" \
-d $'{
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Write a 100-word article on Benefits of Open-Source in AI research"
        }
    ],
    "max_new_tokens": 1024,
    "temperature": 0.6,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.2
}' \
https://inference-api.netmind.ai/inference-api/v1/llama3-70B
Python Example
import requests
api_token = 'your_api_token'
headers = {
    'Authorization': api_token,
    'Content-Type': 'application/json'
}
data = {
    'messages': [
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'user',
            'content': 'Write a 100-word article on Benefits of Open-Source in AI research'
        }
    ],
    'max_new_tokens': 1024,
    'temperature': 0.6,
    'top_p': 0.9,
    'top_k': 50,
    'repetition_penalty': 1.2
}
response = requests.post('https://inference-api.netmind.ai/inference-api/v1/llama3-70B', headers=headers, json=data, stream=True)
for line in response.iter_lines():
    if line:
        decoded_line = line.decode('utf-8')
        if decoded_line.startswith('data:'):
            print(decoded_line[len('data:'):].strip())
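If you want the complete reply as a single string rather than printing chunks as they arrive, the same data: parsing can be factored into a small helper. This is a sketch: it assumes each streamed line carries a data: payload as in the loop above, and joins payloads with a space (the exact framing of chunks depends on the API's stream format):

```python
def collect_sse_text(lines):
    """Join the payloads of 'data:' lines from a streamed (SSE-style) response."""
    chunks = []
    for line in lines:
        if not line:
            continue  # keep-alive blank lines carry no payload
        decoded = line.decode('utf-8')
        if decoded.startswith('data:'):
            chunks.append(decoded[len('data:'):].strip())
    return ' '.join(chunks)

# Example with canned bytes standing in for response.iter_lines():
sample = [b'data: Hello', b'', b'data: world']
print(collect_sse_text(sample))  # Hello world
```

In the llama3-70B example you would call collect_sse_text(response.iter_lines()) instead of the print loop.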