Colonoscopy is an important screening procedure for colorectal cancer. During the procedure, the endoscopist visually inspects the colon. Currently, no content-based analysis and retrieval system automatically analyzes videos captured during colonoscopic procedures and provides user-friendly, efficient access to their important content. Such a system would be valuable for endoscopic research and education. The first necessary step in this analysis is parsing the video into semantic units. Because the characteristics of colonoscopy videos differ from those of videos studied in the literature, we introduce a new video parsing framework that includes (i) a new scene definition and a new video parsing paradigm and (ii) a novel scene segmentation algorithm that uses audio analysis and finite state automata to recognize scenes and their boundaries. Our experimental results show average precision and recall of 95% and 81%, respectively, for parsing scenes. The framework is extensible to videos captured from other endoscopic procedures such as upper gastrointestinal endoscopy, enteroscopy, cystoscopy, and laparoscopy.