Skip to content

feat: Allow datafusion-ffi to opt out of proto parquet#22951

Open
Xuanwo wants to merge 1 commit into
apache:mainfrom
Xuanwo:xuanwo/datafusion-ffi-proto-parquet
Open

feat: Allow datafusion-ffi to opt out of proto parquet#22951
Xuanwo wants to merge 1 commit into
apache:mainfrom
Xuanwo:xuanwo/datafusion-ffi-proto-parquet

Conversation

@Xuanwo

@Xuanwo Xuanwo commented Jun 15, 2026

Copy link
Copy Markdown
Member

Which issue does this PR close?

N/A

Rationale for this change

datafusion-ffi inherits datafusion-proto from the workspace. Because datafusion-proto enables its parquet feature by default, downstream crates that depend on datafusion-ffi also pull in datafusion-datasource-parquet, parquet, and parquet compression dependencies even when they do not need parquet-aware proto support.

Cargo features are additive, so downstream users cannot opt out by disabling default features on their own direct datafusion-proto dependency if the datafusion-ffi -> datafusion-proto edge has already enabled defaults.

What changes are included in this PR?

This PR makes the workspace datafusion-proto dependency default to default-features = false, then lets crates opt into the previous behavior explicitly.

For datafusion-ffi, the previous default behavior is preserved with a new default parquet feature. Downstream users that want to avoid parquet dependencies can now use datafusion-ffi with default-features = false.

The FFI session table option bridge now gates ConfigFileType::PARQUET usage behind the datafusion-ffi/parquet feature, while still preserving parquet table option values in no-default builds.

Are these changes tested?

Yes. Existing datafusion-ffi tests pass with default features and with --no-default-features. The feature graph was also checked to confirm that default datafusion-ffi still enables parquet, while datafusion-ffi --no-default-features does not include datafusion-datasource-parquet.

Are there any user-facing changes?

Default behavior is unchanged. Users can now opt out of parquet-aware proto support from datafusion-ffi by disabling datafusion-ffi default features.

@Xuanwo Xuanwo marked this pull request as ready for review June 15, 2026 07:36
@github-actions github-actions Bot added proto Related to proto crate ffi Changes to the ffi crate labels Jun 15, 2026

@waynexia waynexia left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

@Xuanwo Xuanwo changed the title Allow datafusion-ffi to opt out of proto parquet feat: Allow datafusion-ffi to opt out of proto parquet Jun 15, 2026

@martin-g martin-g left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no new tests / CI jobs verifying that cargo check -p datafusion-ffi --no-default-features still builds.

Comment on lines +479 to +485
let formats = vec![ConfigFileType::CSV, ConfigFileType::JSON];
#[cfg(feature = "parquet")]
let formats = {
let mut formats = formats;
formats.push(ConfigFileType::PARQUET);
formats
};

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let formats = vec![ConfigFileType::CSV, ConfigFileType::JSON];
#[cfg(feature = "parquet")]
let formats = {
let mut formats = formats;
formats.push(ConfigFileType::PARQUET);
formats
};
let formats = [
ConfigFileType::CSV,
ConfigFileType::JSON,
#[cfg(feature = "parquet")]
ConfigFileType::PARQUET,
];

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any downsides of not doing the changes below when the parquet feature is not enabled ?
Since there is no usage of Parquet types I think it would be better to still support ConfigFileType::PARQUET as before.

Comment thread benchmarks/Cargo.toml

[dev-dependencies]
datafusion-proto = { workspace = true }
datafusion-proto = { workspace = true, default-features = true }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant since default-features = true is the default behavior anyway.
Maybe list the used features explicitly:

Suggested change
datafusion-proto = { workspace = true, default-features = true }
datafusion-proto = { workspace = true, features = ["parquet"] }

datafusion-expr = { workspace = true }
datafusion-physical-expr-adapter = { workspace = true }
datafusion-proto = { workspace = true }
datafusion-proto = { workspace = true, default-features = true }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for -benchmarks

Comment thread Cargo.toml
datafusion-physical-optimizer = { path = "datafusion/physical-optimizer", version = "54.0.0" }
datafusion-physical-plan = { path = "datafusion/physical-plan", version = "54.0.0" }
datafusion-proto = { path = "datafusion/proto", version = "54.0.0" }
datafusion-proto = { path = "datafusion/proto", version = "54.0.0", default-features = false }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the default behavior for all members of the workspace, no just for -ffi. Is this intentional ?

The -ffi crate already does it at https://github.com/apache/datafusion/pull/22951/changes#diff-c3dbf9bcba2d884b87eb3587e66d470700dd7e8e75f1630765e1da4b42a81c5aR66

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ffi Changes to the ffi crate proto Related to proto crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants