Abstract: Despite advances in Large Multi-Modal Models, applying them to long and untrimmed video content remains challenging due to limitations in context length and substantial memory overhead.
This module initializes a freshly deployed controller. It requires a g3-based controller image for the API v2 operations to function properly.