RE.PUBLIC@POLIMI: research publications of the Politecnico di Milano

Urban Street-Scene Perception and Renewal Strategies Powered by Vision–Language Models

Yao Yuhan; Dall'O' Giuliano; Lu Feidong

2026-01-01

Abstract

Rapid urbanization has made urban renewal increasingly important. Prior research has relied mainly on expert assessments and objective indicators and lacks scalable frameworks that translate street-level conditions into actionable renewal strategies. This study proposes a Vision–Language Model (VLM)-based framework to address these gaps, using the Hongshan Central District of Urumqi, China, as a case study. We collected 4215 street-view images (SVIs) and used VLMs to score six perceptual dimensions (safety, liveliness, beauty, wealthiness, depressiveness, and boringness) and to produce accompanying textual descriptions. The best-performing model, selected by validation against a 500-respondent perception survey, was then used for spatial pattern and text mining analyses that inform targeted urban renewal strategies. Results show that (1) VLM evaluations are highly consistent with human evaluations across the six perceptual dimensions; (2) spatial clustering analysis delineated four distinct renewal priority tiers, indicating that the method can translate perceptual data into actionable spatial strategies; and (3) text mining of the VLM's rationales revealed that areas with lower perceptual scores are predominantly characterized by deficiencies in foundational infrastructure and street-level order, providing explanatory evidence directly linked to the generated renewal priorities. The study provides an interpretable, generative artificial intelligence (GAI)-driven evaluation framework for urban renewal decision-making, supporting precise and intelligent urban regeneration. © 2026 by the authors.
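The abstract does not say which statistic was used to compare VLM scores with the survey responses. As a minimal sketch only, assuming a rank-based agreement measure (Spearman correlation, chosen here for illustration), a per-dimension consistency check could look like the following. The function names and all score values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented toy scores for five images in two of the six dimensions.
vlm_scores = {
    "safety": [7.0, 4.5, 6.0, 3.0, 8.5],
    "boringness": [2.0, 6.5, 4.0, 7.0, 1.5],
}
human_scores = {
    "safety": [6.5, 4.0, 6.8, 2.5, 9.0],
    "boringness": [3.0, 6.0, 3.5, 8.0, 2.0],
}

# One agreement value per perceptual dimension.
agreement = {
    dim: spearman(vlm_scores[dim], human_scores[dim]) for dim in vlm_scores
}

for dim, rho in agreement.items():
    print(f"{dim}: rho = {rho:.2f}")
```

In a comparison of this kind, the candidate model with the highest agreement across dimensions would be the natural choice for the downstream spatial and text analyses. The selection criterion actually used in the paper may differ.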

Year of publication: 2026
Journal title: LAND
Appears in types: 01.1 Journal Article

Files in this item:

File	Size	Format
land-15-00244 (1) (1).pdf (open access: Publisher’s version)	6.53 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1308460
