The advent of high-throughput technologies and the resulting generation of data have increased the demand for data-driven analytics. However, a comprehensive and computationally efficient method for analyzing, understanding, and managing the emergent behavior of complex biological systems using time-series data remains elusive. In this paper, we introduce a new computational framework and modeling formalism designed for unsupervised learning and model construction in high-throughput biological data applications. The framework uses an underlying Bayesian nonparametric model that infers long-range temporal dependencies from heterogeneous data streams to produce grammatical rules, which are then used for real-time in silico modeling, behavior recognition, and prediction. We present initial results for two unsupervised learning tasks, cellular event classification and large-scale spatio-temporal behavior recognition, using unlabeled live-cell imaging data from experiments performed on the Large Scale Digital Cell Analysis System (LSDCAS).