Alternative splicing is considered a key factor underlying the increased cellular and functional complexity of higher eukaryotes. With the advance of high-throughput genomics technologies, it has become critical to mine knowledge about alternative splicing from the biological research literature. However, many papers have been published on DNA splicing and translation, and finding those specifically relevant to alternative splicing is time-consuming. Observing that documents reporting alternative splicing can be obtained from existing knowledge bases that record literature evidence, and that a large number of unlabeled documents are freely available, we investigated learning from positive and unlabeled data (LPU) for retrieving papers relevant to alternative splicing. The positive documents come from Literature Support for Alternative Transcripts (LSAT), and the unlabeled documents are obtained from Gene Reference Into Function (GeneRIF). We generated nine unlabeled datasets that differ in size or in the way documents were sampled, and compared the performance of document classifiers built with different unlabeled datasets and machine learning algorithms. The study shows that LPU is a viable strategy for building a document filtering system, although the performance of the trained classifiers is affected by the choice of unlabeled dataset. The selection of both the machine learning algorithm and the unlabeled documents is therefore critical in constructing an effective LPU-based system.
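
To make the LPU setting concrete, the sketch below shows one simple way such a filter could be built: the unlabeled documents are treated as provisional negatives and a multinomial naive Bayes model is trained against the positives. This is an illustrative baseline under stated assumptions, not necessarily the procedure used in the study. The choice of naive Bayes, the function names (`tokenize`, `train_lpu`, `score`), and the toy sentences are all hypothetical and stand in for LSAT and GeneRIF text.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems would do more)."""
    return text.lower().split()


def train_lpu(positive_docs, unlabeled_docs):
    """Collect per-class token counts, treating unlabeled docs as provisional negatives."""
    counts = {"pos": Counter(), "unl": Counter()}
    for doc in positive_docs:
        counts["pos"].update(tokenize(doc))
    for doc in unlabeled_docs:
        counts["unl"].update(tokenize(doc))
    vocab = set(counts["pos"]) | set(counts["unl"])
    n_docs = len(positive_docs) + len(unlabeled_docs)
    priors = {
        "pos": len(positive_docs) / n_docs,
        "unl": len(unlabeled_docs) / n_docs,
    }
    return counts, vocab, priors


def score(doc, model):
    """Log-odds that a document is positive; higher means more relevant."""
    counts, vocab, priors = model
    totals = {label: sum(c.values()) for label, c in counts.items()}
    log_odds = math.log(priors["pos"]) - math.log(priors["unl"])
    for token in tokenize(doc):
        if token not in vocab:
            continue  # ignore tokens never seen in training
        # Laplace (add-one) smoothing over the shared vocabulary.
        p_pos = (counts["pos"][token] + 1) / (totals["pos"] + len(vocab))
        p_unl = (counts["unl"][token] + 1) / (totals["unl"] + len(vocab))
        log_odds += math.log(p_pos) - math.log(p_unl)
    return log_odds


# Toy data, invented for illustration only.
positive = [
    "alternative splicing produces two isoforms of the transcript",
    "exon skipping yields an alternative transcript isoform",
]
unlabeled = [
    "the protein binds dna in the nucleus",
    "the kinase phosphorylates its substrate in the cytoplasm",
    "exon skipping generates a shorter isoform",  # a hidden positive
    "mutation of the gene causes a metabolic disease",
]

model = train_lpu(positive, unlabeled)
candidates = [
    "the kinase binds its substrate",
    "a novel isoform arises from exon skipping",
]
ranked = sorted(candidates, key=lambda d: score(d, model), reverse=True)
```

Note that the unlabeled pool in this toy example contains a hidden positive, which pulls the scores of splicing-related terms toward the negative class. This is one way the composition of the unlabeled set can influence the resulting classifier.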