CityX: Controllable Procedural Content Generation for Unbounded 3D Cities

1Institute of Automation, Chinese Academy of Sciences, 2China University of Geosciences (Beijing), 3Center for Research on Intelligent Perception and Computing, CASIA, 4University of Science and Technology


Generating a realistic, large-scale 3D virtual city remains a complex challenge because it involves numerous 3D assets, various city styles, and strict layout constraints. Existing approaches have made promising attempts to create large-scale scenes through procedural content generation (PCG) driven by Blender agents. However, they face crucial issues, such as difficulty in scaling up generation capability and in achieving fine-grained control at the semantic layout level. To address these problems, we propose a novel multi-modal controllable procedural content generation method, named CityX, which enhances realistic, unbounded 3D city generation guided by multiple layout conditions, including OpenStreetMap (OSM) data, semantic maps, and satellite images. Specifically, the proposed method contains a general protocol for integrating various PCG plugins and a multi-agent framework for transforming instructions into executable Blender actions. Through this framework, CityX shows the potential to build an innovative ecosystem for 3D scene generation by bridging the gap between the quality of generated assets and industrial requirements. Extensive experiments demonstrate the effectiveness of our method in creating high-quality, diverse, and unbounded cities guided by multi-modal conditions.

City Generation Demo

The proposed CityX can create large-scale 3D unbounded cities automatically according to user instructions.

Guided by multimodal inputs, including OSM data, semantic maps, and satellite images, the proposed CityX automatically creates realistic, large-scale 3D urban scenes. The generated models are characterized by delicate geometric structures, realistic material textures, and natural lighting, allowing for seamless deployment in the industrial pipeline.

The Multi-agent Workflow

The proposed SceneX can create large-scale 3D natural scenes or unbounded cities automatically according to user instructions.

The multi-agent framework includes a pre-processing stage and four task stages. During the pre-processing stage, the PCG plugins are encapsulated into action functions according to the PCG Management Protocol proposed in this work. In the task stages, the planner first specifies the subtask plans based on the user's description and the provided inputs. For each subtask, the planner's assistant validates the proposed subtask. If the conditions are met, the planner passes the relevant parameters to the executor, which performs the subtask in Blender. The results are then evaluated by the feedback agent. If the requirements are met, the process moves on to the next subtask; if not, it reverts and the subtask is attempted again. This loop continues until all subtasks are completed. The coordinated effort among agents facilitates the generation of large-scale urban scenes.
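The control flow described above can be illustrated with a minimal Python sketch. It is not the CityX implementation: the `Action` schema, the function names, the fixed two-step plan, the `max_attempts` limit, and the toy `build_roads` and `place_buildings` actions are all assumptions made for illustration. The language-model agents are replaced by plain functions, and nothing here calls Blender.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Action:
    """One PCG plugin wrapped as an action function (hypothetical schema)."""
    name: str
    required_params: List[str]
    run: Callable[..., Dict[str, Any]]


@dataclass
class Subtask:
    action: str
    params: Dict[str, Any]


def plan(instruction: str) -> List[Subtask]:
    """Planner stand-in: returns a fixed plan instead of querying a language model."""
    return [
        Subtask("build_roads", {"layout": "osm"}),
        Subtask("place_buildings", {"style": "modern"}),
    ]


def validate(task: Subtask, registry: Dict[str, Action]) -> bool:
    """Planner's assistant stand-in: the action is known and its parameters are complete."""
    action = registry.get(task.action)
    return action is not None and all(p in task.params for p in action.required_params)


def execute(task: Subtask, registry: Dict[str, Action]) -> Dict[str, Any]:
    """Executor stand-in: the real system would perform this step inside Blender."""
    return registry[task.action].run(**task.params)


def accepted(result: Dict[str, Any]) -> bool:
    """Feedback agent stand-in: accept only results flagged as ok."""
    return bool(result.get("ok"))


def run_pipeline(instruction: str, registry: Dict[str, Action],
                 max_attempts: int = 3) -> List[Dict[str, Any]]:
    """Run every subtask in order, retrying a subtask until the feedback check passes."""
    results = []
    for task in plan(instruction):
        if not validate(task, registry):
            raise ValueError(f"invalid subtask: {task.action}")
        for _ in range(max_attempts):
            result = execute(task, registry)
            if accepted(result):
                results.append(result)
                break
        else:
            raise RuntimeError(f"subtask failed after {max_attempts} attempts: {task.action}")
    return results


def build_demo_registry() -> Dict[str, Action]:
    """Pre-processing stand-in: register two toy action functions."""
    calls = {"place_buildings": 0}

    def build_roads(layout: str) -> Dict[str, Any]:
        return {"action": "build_roads", "ok": True, "layout": layout}

    def place_buildings(style: str) -> Dict[str, Any]:
        calls["place_buildings"] += 1
        attempt = calls["place_buildings"]
        # Fail on the first call so that the retry path is exercised.
        return {"action": "place_buildings", "ok": attempt > 1, "attempt": attempt}

    return {
        "build_roads": Action("build_roads", ["layout"], build_roads),
        "place_buildings": Action("place_buildings", ["style"], place_buildings),
    }


if __name__ == "__main__":
    for step in run_pipeline("a small city block", build_demo_registry()):
        print(step)
```

In this sketch the registry plays the role of the pre-processing stage, and the retry loop corresponds to the process reverting when the feedback agent rejects a result. Capping the number of attempts is a design choice of the sketch, so that a subtask that never passes cannot loop forever.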


  author = {Shougao Zhang and Mengqi Zhou and Yuxi Wang and Chuanchen Luo and Rongyu Wang and Yiwei Li and Xucheng Yin and Zhaoxiang Zhang and Junran Peng},
  title  = {CityX: Controllable Procedural Content Generation for Unbounded 3D Cities},
  journal = {},
  year = {2024},