TY - GEN
T1 - Real-time mutual-information-based linear registration on the Cell Broadband Engine processor
AU - Ohara, Moriyoshi
AU - Yeo, Hangu
AU - Savino, Frank
AU - Iyengar, Giridharan
AU - Gong, Leiguang
AU - Inoue, Hiroshi
AU - Komatsu, Hideaki
AU - Sheinin, Vadim
AU - Daijavad, Shahrokh
AU - Erickson, Bradley
PY - 2007
Y1 - 2007
AB - Emerging multi-core processors can accelerate medical imaging applications by exploiting the parallelism available in their algorithms. We have implemented a mutual-information-based 3D linear registration algorithm on the Cell Broadband Engine™ (CBE) processor, which has nine processor cores on a chip and a 4-way SIMD unit for each core. By exploiting the highly parallel architecture and its high memory bandwidth, our implementation with two CBE processors can compute mutual information for about 33 million pixel pairs per second. This implementation is significantly faster than a conventional one on a traditional microprocessor, and even faster than a previously reported custom-hardware implementation. As a result, it can register a pair of 256×256×30 3D images in one second by using a multi-resolution method. This paper describes our implementation with a focus on localized sampling and speculative packing techniques, which reduce memory traffic by 82%.
KW - Parallel processing
KW - Biomedical image processing
KW - Image registration
UR - http://www.scopus.com/inward/record.url?scp=36349025901&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36349025901&partnerID=8YFLogxK
U2 - 10.1109/ISBI.2007.356781
DO - 10.1109/ISBI.2007.356781
M3 - Conference contribution
AN - SCOPUS:36349025901
SN - 1424406722
SN - 9781424406722
T3 - 2007 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro - Proceedings
SP - 33
EP - 36
BT - 2007 4th IEEE International Symposium on Biomedical Imaging
T2 - 2007 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro; ISBI'07
Y2 - 12 April 2007 through 15 April 2007
ER -