Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jan 8, 2026

Which issue does this PR close?

Rationale for this change

I had a few minor comments / suggestions while reviewing #19616 from @jizezhang but they weren't needed to do the initial merge, so I would like to propose them in a follow up PR

What changes are included in this PR?

  1. Improve documentation
  2. Improve handling of table_ref in ListingTableURL
  3. use Null rather than "NULL" in list_files_cache table function

I can break this into separate PRs if that would help

Are these changes tested?

Yes by CI

Are there any user-facing changes?

The list_files_cache function now might return null

@github-actions github-actions bot added core Core DataFusion crate execution Related to the execution crate datasource Changes to the datasource crate labels Jan 8, 2026

let schema = Arc::new(Schema::new(vec![
Field::new("table", DataType::Utf8, false),
Field::new("table", DataType::Utf8, true),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses NULL (rather than "NULL") in list_files_cache

Pattern::new(glob).map_err(|e| DataFusionError::External(Box::new(e)))?;
Self::try_new(self.url, Some(glob))
pub fn with_glob(mut self, glob: &str) -> Result<Self> {
self.glob =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to call the constructor again as self.path hasn't changed and the constructor doesn't do anything with glob

pub fn try_new(url: Url, glob: Option<Pattern>) -> Result<Self> {
let prefix = Path::from_url_path(url.path())?;
Ok(Self {
url,
prefix,
glob,
table_ref: None,
})
}

This then simplifies the code in datafusion/core/src/datasource/listing_table_factory.rs

table_path = table_path
.with_glob(glob.as_ref())?
.with_table_ref(cmd.name.clone());
table_path = table_path.with_glob(glob.as_ref())?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified due to the change in with_glob

@alamb alamb marked this pull request as ready for review January 8, 2026 14:42
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alamb it is lgtm!

@jizezhang
Copy link
Contributor

thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate datasource Changes to the datasource crate execution Related to the execution crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants