This is the official repo for the paper "From Object to Context: Scene Knowledge Enhanced Visual Grounding for Geospatial Understanding". The dataset and code are coming soon.
Remote Sensing Visual Grounding (RSVG) is a critical task aimed at precisely localizing objects in remote sensing images from language expressions. Existing methods align visual and textual features through cross-modal fusion but often fail to capture object dependencies, hindering complex visual reasoning about relationships and contexts. To address this, we introduce the Luojia-VG dataset, a benchmark that enhances visual reasoning with scene knowledge, including object-level annotations and contextual descriptions of relationships, functions, and activities. Unlike previous datasets, which focus on basic object descriptions, Luojia-VG bridges the semantic gap between referring expressions and detailed visual content. Furthermore, we propose Knowledge-Enhanced Visual Grounding (KEVG), a novel model that combines scene knowledge with visual features and textual queries. KEVG contains two key components: the Deep Knowledge Fusion (DKF) module and the Query-Region Alignment (QRA) module. The DKF module progressively embeds scene knowledge into multi-scale visual features via cross-attention, enhancing the model's fine-grained understanding of scene contexts. The QRA module aligns image regions with the query by concentrating on the most contextually relevant areas for precise localization. Experiments demonstrate that KEVG achieves state-of-the-art performance, with Pr@0.5 scores of 82.31% on DIOR-RSVG and 83.29% on Luojia-VG.
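Since the code is not yet released, here is a minimal, hypothetical sketch of the cross-attention fusion idea behind the DKF module: visual feature tokens attend to scene-knowledge embeddings, and the attended knowledge is added back residually. All names, shapes, and the single-head NumPy formulation are illustrative assumptions, not the actual KEVG implementation.

```python
import numpy as np

def knowledge_cross_attention(visual, knowledge):
    """Single-head cross-attention sketch (hypothetical, not the paper's code).

    visual:    (N_v, d) visual feature tokens at one scale
    knowledge: (N_k, d) scene-knowledge embeddings
    Returns fused visual tokens of shape (N_v, d).
    """
    d = visual.shape[-1]
    # Scaled dot-product scores: each visual token queries every knowledge token
    scores = visual @ knowledge.T / np.sqrt(d)          # (N_v, N_k)
    scores -= scores.max(axis=-1, keepdims=True)        # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)            # softmax over knowledge tokens
    # Residual fusion: inject attended knowledge into the visual stream
    return visual + attn @ knowledge

# Toy multi-scale usage: apply the same fusion at each feature scale,
# mirroring the "progressively embeds scene knowledge" description.
rng = np.random.default_rng(0)
knowledge = rng.standard_normal((5, 16))                # 5 knowledge tokens
fused_scales = [
    knowledge_cross_attention(rng.standard_normal((n, 16)), knowledge)
    for n in (64, 16, 4)                                # three spatial scales
]
```

In a real trainable module the queries, keys, and values would pass through learned projections and multiple heads; the sketch above keeps only the attention-plus-residual structure that the DKF description implies.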


