guoluqingfeng10086/MMQA

MMQA: Mars Mineral Question Answering

MMQA framework


Overview

MMQA has been accepted by TGRS 2026. It is a Mars mineral question answering system that combines the Martian Mineral Knowledge Graph, multi-source geological datasets, a Mars mineral text corpus, and large language models for interpretable geological reasoning.

Data and Knowledge Graph

MMQA is built around two complementary data resources:

  • M200: a curated bibliography of 200 mineral-formation-related papers. The source list is provided in geodata/MM200.csv.
  • M2000: a larger Mars mineral text corpus containing 2,000+ papers and reports collected for retrieval, evidence grounding, and corpus-scale knowledge extraction. The source list is provided in geodata/MM2000.csv.

From these resources, we construct the Martian Mineral Knowledge Graph (MMKG). The graph stores mineral entities, geological environments, formation processes, relations, descriptions, and provenance evidence. The extraction and fusion workflow is summarized below.

The ontology design is provided in FIG/Fig2_01.png, and the knowledge extraction prompt template is provided in kg_extract_prompts.txt.

MMKG construction workflow
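
The kinds of records MMKG is described as storing (entities, relations, descriptions, and provenance evidence) can be pictured with a small in-memory structure. The following is a minimal sketch only: the `Triple` and `TinyGraph` classes, the relation names, and the placeholder nodes `process_A` and `environment_B` are all hypothetical and are not taken from the MMQA code or from the MMKG files.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One hypothetical graph edge with its description and provenance."""
    head: str
    relation: str
    tail: str
    description: str = ""
    evidence: tuple = ()  # source identifiers or quoted snippets (assumed layout)


class TinyGraph:
    """Toy adjacency-list graph; not the storage format used by MMQA."""

    def __init__(self):
        self._out = defaultdict(list)

    def add(self, triple):
        self._out[triple.head].append(triple)

    def paths(self, start, max_hops=2):
        """Return every edge path of length 1..max_hops that starts at `start`."""
        results = []

        def walk(node, path):
            if path:
                results.append(tuple(path))
            if len(path) == max_hops:
                return
            visited = {start} | {t.tail for t in path}
            for triple in self._out.get(node, []):
                if triple.tail in visited:  # skip cycles
                    continue
                walk(triple.tail, path + [triple])

        walk(start, [])
        return results


# Placeholder content, for illustration only.
graph = TinyGraph()
graph.add(Triple("sulfate", "formed_by", "process_A", evidence=("placeholder source",)))
graph.add(Triple("process_A", "occurs_in", "environment_B"))

for path in graph.paths("sulfate"):
    print(" -> ".join(f"{t.head} [{t.relation}] {t.tail}" for t in path))
```

Keeping the evidence on each edge means a reasoning path can be reported together with the sources that support every hop.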

The MMQA corpus also integrates multi-source geological data, including raster maps, vector maps, tabular geomorphological records, and text evidence. These data provide the local geological context used during formation analysis.

Data Sources

| Data type | Data | Spatial resolution / content | Format |
|---|---|---|---|
| Physical property | OMEGA NIR Albedo | 14400 x 7200 pixels (1.48 km/px) | Raster map |
| Physical property | TES Thermal Inertia | 7200 x 3600 pixels (3 km/px) | Raster map |
| Physical property | MOLA Terrain Elevation | 200 m/px | Raster map |
| Chemical property | TES Mineral Maps | 1440 x 720 pixels (16 km/px) | Raster map |
| Chemical property | Elemental Abundance | 72 x 36 pixels (300 km/px) | Raster map |
| Geological age | Geologic Map | Global distribution of geological eras | Vector map |
| Geomorphological feature | Paleolake Basins | Distribution of 425 paleolake basins | Tabular |
| Geomorphological feature | Fluvial Systems | Distribution of 3,772 valley systems | Vector map |
| Geomorphological feature | Craters > 1 km | Distribution of 384,343 craters | Tabular |
| Geomorphological feature | HiRISE Topography | 96,365 coordinate-topography pairs | Tabular |
| Text corpus | Multi-source texts | 214 research articles and 15 NASA reports | Text |
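
To look up local context for a query coordinate, a global raster has to be indexed by longitude and latitude. The sketch below shows one way to do that. It assumes each raster is a global equirectangular grid whose first column starts at 0 degrees E and whose first row starts at 90 degrees N; the actual grid layout of the MMQA data files is not stated here and may differ. The function names are hypothetical.

```python
import math

MARS_EQUATORIAL_RADIUS_KM = 3396.2


def equatorial_km_per_pixel(width):
    """Approximate pixel size at the equator for a global raster `width` pixels wide."""
    return 2.0 * math.pi * MARS_EQUATORIAL_RADIUS_KM / width


def lonlat_to_pixel(lon_e, lat_n, width, height):
    """Map east longitude / north latitude in degrees to a (row, col) index.

    Assumes column 0 starts at 0 degrees E and row 0 starts at 90 degrees N.
    """
    lon = lon_e % 360.0                      # fold negative or >360 longitudes
    col = int(lon * width / 360.0)
    row = int((90.0 - lat_n) * height / 180.0)
    # Clamp so the south pole and the 360-degree seam stay inside the grid.
    col = min(max(col, 0), width - 1)
    row = min(max(row, 0), height - 1)
    return row, col


# Index of the example query location in a 14400 x 7200 grid.
row, col = lonlat_to_pixel(109.9, 25.1, 14400, 7200)
```

Under these assumptions, `equatorial_km_per_pixel(14400)` gives roughly 1.48 km, which agrees with the resolution listed for the OMEGA NIR Albedo map.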

Main public sources include:

Code Structure

The code is organized around a compact reasoning pipeline:

Quick Start

To get started with MMQA, prepare the local project folder, configure the OpenAI-compatible API endpoint, and place the required geological data, MMKG files, text corpus, and embedding indexes in the expected local paths.

cd MMQA

Configure API access in proxy_config.py:

API_KEY = "your_api_key"
BASE_URL = "your_base_url"
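
If you prefer not to store credentials in a tracked file, the two values can be read from environment variables instead. This is a sketch of one option, assuming proxy_config.py only needs to expose `API_KEY` and `BASE_URL`; the variable names `MMQA_API_KEY` and `MMQA_BASE_URL` and the helper `load_api_settings` are hypothetical and are not defined by the repository.

```python
import os


def load_api_settings(env=None):
    """Return (api_key, base_url) from environment variables.

    The variable names are assumptions made for this example.
    """
    env = os.environ if env is None else env
    api_key = env.get("MMQA_API_KEY", "")
    base_url = env.get("MMQA_BASE_URL", "")
    if not api_key or not base_url:
        raise RuntimeError("Set MMQA_API_KEY and MMQA_BASE_URL before running MMQA.")
    return api_key, base_url
```

With this helper in place, proxy_config.py could assign `API_KEY, BASE_URL = load_api_settings()` so that the rest of the code keeps importing the same two names.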

Run the full MMQA pipeline with graph-path reasoning and text retrieval:

python MMAgentV2.py

For a minimal demonstration without the knowledge graph and text corpus, run the geological-context-only version:

python MMQAsimple.py

Example query:

At 109.9 degrees E, 25.1 degrees N on Mars, sulfate was detected. What could be the formation mechanism?

The full system returns an answer grounded in geological context, retrieved text evidence, and MMKG reasoning paths. The simplified version is useful for testing coordinate-based geological reasoning when the complete MMKG and corpus resources are not available.
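
Coordinate-based reasoning starts by pulling the location out of the question. The sketch below extracts longitude and latitude from a query phrased like the example above. It is an illustration only, not the parser used by MMAgentV2.py or MMQAsimple.py; the function name and the sign convention (east and north positive) are assumptions.

```python
import re

# Matches e.g. "109.9 degrees E", "25.1 deg N", "30° S".
_COORD = re.compile(r"(-?\d+(?:\.\d+)?)\s*(?:degrees|deg|°)\s*([EWNS])\b", re.IGNORECASE)


def parse_coordinates(query):
    """Return (lon_east, lat_north) in degrees, or None if either is missing."""
    lon = lat = None
    for value, hemisphere in _COORD.findall(query):
        number = float(value)
        hemisphere = hemisphere.upper()
        if hemisphere in "EW":
            lon = number if hemisphere == "E" else -number
        else:
            lat = number if hemisphere == "N" else -number
    if lon is None or lat is None:
        return None
    return lon, lat


query = ("At 109.9 degrees E, 25.1 degrees N on Mars, sulfate was detected. "
         "What could be the formation mechanism?")
coords = parse_coordinates(query)
```

Returning `None` when a coordinate is missing lets the caller ask for clarification instead of silently reasoning about a wrong location.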
