The biomedical community has produced a large volume of structured and unstructured data (e.g., EHRs, ontologies, reports). Cross-domain data integration has been identified as an important research problem for translational research. From an application perspective, identifying related concepts among medical ontologies is an important goal of life science research, and it requires analyzing how relations are specified to connect concepts within a single ontology or across multiple ontologies. With the explosion of cross-domain datasets, it is extremely hard for researchers to discover knowledge using the current ontology infrastructure. The main cause is a lack of connectivity across ontology domains and between ontologies and unstructured data, even though these sources hold specific biomedical knowledge at a more general and comprehensive level. There is therefore a need for a mechanism that performs semantic partitioning and query generation for cross-domain biomedical knowledge discovery. In this paper, we present such a model: it clusters integrated data into groups based on the semantic closeness of predicates and produces meaningful queries to discover knowledge over a set of interlinked data sources. We have implemented a prototype of the BmQGen system and evaluated the proposed query model, based on predicate-oriented clustering, using a colorectal surgical cohort from the Mayo Clinic.