Surgical Video Understanding and Multimodal Foundation Models Research Assistant