Limits and Limitations
This reference lists the limits and limitations that apply to ModelZ.
Please contact us on Discord if you have any questions about the current limits.
The maximum number of replicas per deployment is 5.
The maximum startup time is 600 seconds. Inference fails if a deployment takes longer than that to start.
The maximum inference time per request is 60 seconds.
The maximum request body size is 5MB.
Currently, we only support public images. Private images are not supported yet.
If you are using the Hugging Face Hub, only public Hugging Face models are supported.
Logs currently cover only the most recent 30 minutes. We are actively working on extending this to a longer historical range.
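As an illustration only, the request-related limits above can be checked on the client before a request is sent. The sketch below is not part of any ModelZ SDK: the helper names (`encode_body`, `send`), the JSON payload format, and the `url` argument are hypothetical, and it assumes "5MB" means 5,000,000 bytes, the stricter reading.

```python
import json
import urllib.request

# Limits taken from this page.
MAX_INFERENCE_SECONDS = 60
# Assumption: "5MB" is read as 5 * 1000 * 1000 bytes (the stricter interpretation).
MAX_BODY_BYTES = 5 * 1000 * 1000


def encode_body(payload):
    """Serialize payload to JSON bytes; raise ValueError if it exceeds the body limit."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes; the limit is {MAX_BODY_BYTES}"
        )
    return body


def send(url, payload):
    """POST payload to a (hypothetical) deployment URL, giving up after 60 seconds."""
    request = urllib.request.Request(
        url,
        data=encode_body(payload),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # A client timeout equal to the inference limit avoids waiting on a
    # request that the platform would no longer answer successfully.
    with urllib.request.urlopen(request, timeout=MAX_INFERENCE_SECONDS) as response:
        return response.read()
```

Rejecting an oversized payload locally gives a clearer error than a failed upload, and large inputs such as images may need to be resized or compressed to stay under the limit.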