Dockerfiles and Kubernetes configs for the Zep server, Zep NLP server, and a Postgres database with
pgvector installed may be found in the Zep repo.
Many cloud providers, including AWS, now offer managed Postgres services with pgvector installed.
The Zep docker-compose file may be used as a deployment template for other container environments.
Prebuilt containers for both
arm64 architectures may be found on the Zep Package Repo.
For production deployment planning, please be mindful of the following.
You will need adequate memory for Zep's NLP and Postgres containers. The amount of memory required will be dictated by:
- NLP Server: The number of different embedding models you will be using and the size of these models. See Selecting Embedding Models.
- Postgres: As the number of documents in your Collections grows and you create indexes over these collections, you will need to ensure that adequate memory is available. Please see Postgres best practice deployment guides for more information.
Zep's Postgres container ships with a default configuration that is suitable for development and testing. For production deployments, you will need to tune your Postgres configuration to your use case. Please see Postgres best practice deployment guides for more information. In particular, you may need to increase
max_parallel_maintenance_workers to ensure that your indexes are created correctly and without timeouts. Other settings may also need to be adjusted in order to provide Postgres with adequate resources to execute vector search queries.
Zep limits queries to 10 minutes. If you're experiencing timeouts building IVFFLAT indexes, you may need to increase the
max_parallel_maintenance_workers settings in your Postgres configuration.
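As a minimal sketch of the tuning described above, the following Python helper renders a postgresql.conf fragment that raises max_parallel_maintenance_workers. The helper name `render_postgres_conf` and all numeric values are illustrative assumptions, not Zep defaults or recommendations. It relies on one Postgres behavior: maintenance workers are drawn from the pool set by max_worker_processes and are capped by max_parallel_workers, so the sketch rejects values that could not take full effect.

```python
from typing import Dict

# Illustrative values only (assumptions); tune them for your hardware and workload.
EXAMPLE_SETTINGS: Dict[str, int] = {
    "max_worker_processes": 8,
    "max_parallel_workers": 8,
    "max_parallel_maintenance_workers": 4,
}


def render_postgres_conf(settings: Dict[str, int]) -> str:
    """Render settings as postgresql.conf lines (hypothetical helper)."""
    maintenance = settings["max_parallel_maintenance_workers"]
    parallel = settings["max_parallel_workers"]
    processes = settings["max_worker_processes"]
    # Maintenance workers come out of the max_worker_processes pool and are
    # capped by max_parallel_workers, so a larger value cannot take full effect.
    if not (maintenance <= parallel <= processes):
        raise ValueError(
            "expected max_parallel_maintenance_workers <= "
            "max_parallel_workers <= max_worker_processes"
        )
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


print(render_postgres_conf(EXAMPLE_SETTINGS))
```

The rendered lines can be appended to the Postgres configuration used by your deployment. How that file is mounted into the container depends on your environment.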
Zep's Postgres database will require adequate storage for your documents, document vectors, chat message histories, and chat message history vectors. The amount of storage required will be dictated by your use case, the width of your embedding vectors, and other variables.
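To illustrate how vector width drives storage, here is a rough sketch for estimating the raw size of stored embeddings. It assumes pgvector's documented storage cost of 4 bytes per dimension plus 8 bytes per vector. The function name `vector_storage_bytes` and the example counts are hypothetical, and the estimate excludes row overhead, indexes, document text, and message content, so treat it as a lower bound.

```python
def vector_storage_bytes(num_vectors: int, dimensions: int) -> int:
    """Estimate raw pgvector storage: 4 bytes per dimension + 8 bytes per vector."""
    if num_vectors < 0 or dimensions <= 0:
        raise ValueError("num_vectors must be >= 0 and dimensions must be > 0")
    return num_vectors * (4 * dimensions + 8)


# Example (assumed workload): one million 1536-dimension embeddings.
estimate = vector_storage_bytes(1_000_000, 1536)
print(f"{estimate:,} bytes before indexes and row overhead")
```

Narrower embeddings shrink this figure proportionally: a 384-dimension vector needs about a quarter of the space of a 1536-dimension one.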