ORCID: https://orcid.org/0009-0003-4189-7133; Zhou, Lanyue and Lehnert, Lukas W.
ORCID: https://orcid.org/0000-0002-5229-2282
(2025):
Geo-scenes dissecting urban fabric: Understanding and recognition combining AI, remotely sensed data and multimodal spatial semantics.
In: ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 230: pp. 716-737
Abstract
Urban fabric represents the intersection of spatial structure and social function. Analyzing its geographic components, functional semantics, and interactive relationships enables a deeper understanding of the formation and evolution of urban geo-scenes. Urban geo-scenes (UGS), as the fundamental units of urban systems, play a vital role in balancing and optimizing spatial layout, while enhancing urban resilience and vitality. Although multimodal spatial data are widely used to describe UGS, conventional approaches that rely solely on visual or social features are insufficient when addressing the complexity of modern urban systems. The spatial relationships and distributional patterns among urban elements are equally crucial for capturing the full semantic structure of urban geo-scenes. In parallel, most deep learning models still face limitations in effectively mining and fusing such diverse information. To address these challenges, we propose a multimodal deep learning framework for UGS recognition. Guided by the concepts of urban fabric and spatial co-location patterns, our method dissects the internal structure of geo-scenes and constructs a bottom-up urban fabric graph model to capture spatial semantics among geographic entities. Specifically, we employ a customized SE-DenseNet branch to extract deep physical and visual features from high-resolution satellite imagery, along with social semantic information from auxiliary data (e.g., POIs, building footprint coverage). A semantic fusion module is further introduced to enable collaborative interaction among multi-modal and multi-scale features. The framework was validated across four Chinese cities with varying sizes, economic levels, and cultural contexts. The proposed method achieved an overall accuracy of approximately 90%, outperforming existing state-of-the-art multimodal approaches. 
Moreover, ablation studies conducted in three cities of different scales confirm the critical role of urban fabric in UGS recognition. Our results demonstrate that the joint modeling of visual appearance, functional attributes, and spatial semantics offers a novel and more comprehensive understanding of urban geo-scenes.
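The abstract does not specify how the urban fabric graph is built, so the following is only a minimal illustrative sketch of the general idea of spatial co-location among geographic entities, not the authors' method. It assumes POIs are given as hypothetical `(category, x, y)` tuples in a projected coordinate system, links any two POIs closer than an assumed `radius`, and counts how often each pair of categories co-occurs. The function name `colocation_graph` and the sample data are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import hypot


def colocation_graph(pois, radius):
    """Link POIs closer than `radius` and count co-located category pairs.

    `pois` is a list of (category, x, y) tuples (a hypothetical input format).
    Returns (edges, pair_counts), where edges are index pairs (i, j) with i < j
    and pair_counts maps an alphabetically sorted category pair to its count.
    """
    edges = []
    pair_counts = Counter()
    for (i, (cat_a, xa, ya)), (j, (cat_b, xb, yb)) in combinations(enumerate(pois), 2):
        # Euclidean distance between the two points
        if hypot(xa - xb, ya - yb) <= radius:
            edges.append((i, j))
            pair_counts[tuple(sorted((cat_a, cat_b)))] += 1
    return edges, pair_counts


# Invented sample data: three nearby POIs and one distant one.
pois = [
    ("school", 0.0, 0.0),
    ("residential", 30.0, 40.0),
    ("shop", 35.0, 40.0),
    ("factory", 500.0, 500.0),
]
edges, pair_counts = colocation_graph(pois, radius=60.0)
```

In a sketch like this, the edge list could serve as the adjacency structure of a graph over geographic entities, while the category pair counts give a simple summary of which functions tend to appear together within a scene.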
| Document type: | Journal article |
|---|---|
| Faculty: | Geosciences > Department of Geography |
| Subject areas: | 500 Natural sciences and mathematics > 550 Earth sciences, geology |
| URN: | urn:nbn:de:bvb:19-epub-130306-7 |
| ISSN: | 09242716 |
| Language: | English |
| Document ID: | 130306 |
| Date published on Open Access LMU: | 10 Dec 2025 12:32 |
| Last modified: | 10 Dec 2025 12:32 |
