Concurrency Support

Ollama is designed to handle multiple requests concurrently, allowing you to serve several clients or processes at once. The Ollama server runs as a long-lived process and accepts parallel HTTP requests on both its native API (e.g., /api/chat) and its OpenAI-compatible endpoints (e.g., /v1/chat/completions).
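
As a concrete illustration, the Python sketch below sends several chat requests at once from a thread pool, using only the standard library. It assumes the server is reachable at the default localhost:11434 address and that a model named "llama3" has already been pulled; both names are placeholders for your own setup.

    import json
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://localhost:11434/v1/chat/completions"

    def chat(prompt: str) -> str:
        # Build an OpenAI-style chat completion request; "llama3" is an
        # assumed model name, substitute one you have pulled locally.
        body = json.dumps({
            "model": "llama3",
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        }).encode("utf-8")
        req = urllib.request.Request(
            URL, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        return data["choices"][0]["message"]["content"]

    # Three requests issued simultaneously from separate threads; the server
    # decides how many run in parallel and how many wait in its queue.
    prompts = ["Define concurrency.", "Define parallelism.", "Define batching."]
    with ThreadPoolExecutor(max_workers=3) as pool:
        for answer in pool.map(chat, prompts):
            print(answer)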

Key concurrency features:

  • Each request is evaluated independently; a loaded model can serve several requests in parallel, up to the limit set by OLLAMA_NUM_PARALLEL

  • Requests beyond the parallel limit are queued by the server (up to OLLAMA_MAX_QUEUE) and scheduled in arrival order; see the tuning sketch after this list

  • Safe to send simultaneous requests from different threads, processes, or machines

  • No built-in load balancing across instances; to scale out, run several Ollama servers behind an external load balancer such as nginx or HAProxy
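
For server-side tuning, the sketch below starts Ollama with its concurrency settings set explicitly. The variable names (OLLAMA_NUM_PARALLEL, OLLAMA_MAX_LOADED_MODELS, OLLAMA_MAX_QUEUE) are Ollama's documented configuration knobs; the values shown are illustrative assumptions, not recommendations.

    import os
    import subprocess

    # Copy the current environment and override the concurrency settings.
    # Values are examples only; tune them to your hardware and workload.
    env = dict(
        os.environ,
        OLLAMA_NUM_PARALLEL="4",       # simultaneous requests per loaded model
        OLLAMA_MAX_LOADED_MODELS="2",  # models kept in memory at once
        OLLAMA_MAX_QUEUE="512",        # queued requests before new ones are rejected
    )

    # Run the server as a long-lived foreground process with these settings.
    subprocess.run(["ollama", "serve"], env=env)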
