HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding
Published in arXiv preprint arXiv:2506.09634, 2025
This paper proposes HSENet, a hybrid spatial encoding network designed to improve 3D medical vision-language understanding by integrating spatial information with language representations.
Recommended citation: Shi, Y., Zhang, X., Ji, J., Jiang, H., Zheng, C., Wang, Y., & Qu, L. (2025). "HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding." arXiv preprint arXiv:2506.09634. https://arxiv.org/abs/2506.09634
