Multilingual Semantic Video Event and Action Search Engine and API

A FastAPI service for semantic search over videos, with clip-level indexing and multilingual query support. It covers the full path from indexing video clips to returning timestamped matches for a natural language query.

Personal Projects | Year: 2026

Project Overview

Objective

Enable natural language search over video content through a FastAPI service, using clip-level indexing and support for multilingual queries.

Stack

FastAPI, OpenCV, CLIP (ViT-B/32), PyTorch, FAISS

Delivery highlights

  • Built a FastAPI-based video semantic search system that answers natural language queries over video content, such as 'dog running in park' or 'car accident at intersection'.
  • Indexing: segments each video into fixed-duration clips, extracts representative frames with OpenCV, generates semantic embeddings with CLIP (ViT-B/32) in PyTorch, and indexes them in FAISS for efficient cosine similarity search.
  • Querying: translates multilingual queries when necessary to improve embedding alignment, then matches the query embedding against the stored clip embeddings by similarity score.
  • Results: groups matched segments by video and returns structured JSON responses with video filenames, URLs, and precise timestamp intervals (start_time, end_time), so users can jump directly to relevant moments without watching the entire video.
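
The clip segmentation, similarity matching, and per-video grouping described above can be illustrated with a minimal standard-library sketch. It is not the project's actual code. The 5-second clip length, the function names, the toy 2-dimensional vectors standing in for CLIP embeddings, and the response keys other than start_time and end_time are all assumptions made for illustration. The brute-force scoring shows the idea that cosine similarity equals the inner product of L2-normalized vectors, which is the usual way to get cosine search from an inner-product index; a real deployment would delegate this step to FAISS.

```python
import math
from collections import defaultdict

CLIP_SECONDS = 5.0  # assumed clip length; the source does not state one


def clip_intervals(duration, clip_seconds=CLIP_SECONDS):
    """Split [0, duration] into fixed-length (start_time, end_time) intervals."""
    intervals = []
    start = 0.0
    while start < duration:
        end = min(start + clip_seconds, duration)  # last clip may be shorter
        intervals.append((start, end))
        start += clip_seconds
    return intervals


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def search(query_vec, index, top_k=3):
    """Brute-force cosine search over (metadata, unit_vector) pairs.

    Stand-in for the vector index: returns the top_k (score, metadata) pairs.
    """
    q = normalize(query_vec)
    scored = [(sum(a * b for a, b in zip(q, vec)), meta) for meta, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def group_by_video(hits, base_url="/videos/"):
    """Group matched clips by video into a JSON-serializable structure.

    The "filename", "url", "segments", and "score" keys and base_url are
    hypothetical; only start_time and end_time come from the description.
    """
    grouped = defaultdict(list)
    for score, meta in hits:
        grouped[meta["filename"]].append({
            "start_time": meta["start_time"],
            "end_time": meta["end_time"],
            "score": round(score, 4),
        })
    return [
        {
            "filename": name,
            "url": base_url + name,
            "segments": sorted(segs, key=lambda s: s["start_time"]),
        }
        for name, segs in grouped.items()
    ]
```

For example, `clip_intervals(12.0)` yields three intervals, the last one truncated to the video's end, and passing the output of `search` to `group_by_video` produces one entry per video with its matched segments sorted by start time.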

Related Projects


Text-to-Video Semantic Search

Personal Projects | Year: 2026

Built text-to-video semantic scene retrieval with multilingual query processing.

Multimodal Semantic Retrieval (Video and Image Search)

Personal Projects | Year: 2026

Unified text-to-video and text-to-image search into one cross-modal retrieval platform.

Multilingual Video Understanding and Event Summarization System (Thai-English Timeline Intelligence)

Personal Projects | Year: 2026

Built end-to-end multilingual video analysis with clip-level descriptions and bilingual summaries.