[FEA] Converting structs to strings #18454

@NdroidY

Description

Is your feature request related to a problem? Please describe.

I would like to convert a struct column in a cudf.Series directly to a string representation, using something like cudf.Series.struct.astype(str). As far as I can tell, cuDF currently has no straightforward way to cast a struct-type column to a string, which makes struct data hard to work with when I need a human-readable format or have to pass it to other systems that expect strings.

Describe the solution you’d like
I’d like a method such as cudf.Series.struct.astype(str) that converts a struct column in a cudf.Series to a string representation. For example, a struct row with fields {"a": 1, "b": 2} could be rendered as the string "{a: 1, b: 2}" or in another consistent format. This would make it easier to manipulate, display, or export struct data without complex workarounds.
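To pin down the format suggested above, here is a minimal CPU-only sketch in plain Python. It is a reference for the desired output, not cuDF code: the function name struct_to_str, the "null" spelling for missing values, and the bracketed rendering of lists are all assumptions made for illustration.

```python
from typing import Any


def struct_to_str(value: Any) -> str:
    """Render one struct row (a dict) in the '{a: 1, b: 2}' style."""
    if isinstance(value, dict):
        # Recurse so nested structs use the same format at every level.
        inner = ", ".join(f"{k}: {struct_to_str(v)}" for k, v in value.items())
        return "{" + inner + "}"
    if isinstance(value, list):
        return "[" + ", ".join(struct_to_str(v) for v in value) + "]"
    if value is None:
        return "null"  # assumption: how a missing field is spelled
    return str(value)


rows = [{"a": 1, "b": 2}, {"a": 3, "b": {"c": 4.5, "d": None}}]
rendered = [struct_to_str(r) for r in rows]
print(rendered[0])  # {a: 1, b: 2}
print(rendered[1])  # {a: 3, b: {c: 4.5, d: null}}
```

A GPU implementation would presumably need to make the same choices explicit: field order, the separator, and how nulls and nested values are written.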

Describe alternatives you’ve considered
I’ve considered extracting the individual fields of the struct with cudf.Series.struct.explode(), or accessing fields directly, and then concatenating them manually into a string. However, this is cumbersome, especially for structs with many fields or nested structures, because every field and every nesting level needs its own handling, and it doesn’t scale well for large datasets. Another option is converting the data to pandas and using pandas string operations, but that moves the data off the GPU and defeats the purpose of using cuDF for GPU acceleration.
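The shape of the manual workaround can be illustrated without a GPU. In the sketch below, plain Python lists stand in for the per-field columns that exploding a struct column would produce; concat_fields and the sample data are hypothetical, and none of this is cuDF API.

```python
from typing import Dict, List


def concat_fields(columns: Dict[str, List[object]]) -> List[str]:
    """Join per-field columns element-wise into '{name: value, ...}' strings."""
    # Step 1: turn every field column into a column of "name: value" strings.
    labelled = [[f"{name}: {v}" for v in values] for name, values in columns.items()]
    # Step 2: stitch the labelled columns together row by row.
    return ["{" + ", ".join(row) + "}" for row in zip(*labelled)]


# Stand-in for an exploded struct column: one list per field.
exploded = {"x": [1, 2], "y": [0.5, 1.5]}
strings = concat_fields(exploded)
print(strings)  # ['{x: 1, y: 0.5}', '{x: 2, y: 1.5}']
```

Even in this flat case each field needs its own cast-and-label step, and a nested struct would require repeating the whole procedure per level, which is the overhead a built-in cast would remove.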

Additional context
This feature would be particularly useful for debugging, logging, or interoperability with systems that expect string data. For example, in a dataset with a struct column whose fields are x (int) and y (float), being able to call .struct.astype(str) would simplify workflows significantly. pandas already provides astype(str) as a general-purpose cast, so extending this to cuDF’s struct handling would align with user expectations.

Metadata

Assignees

No one assigned

Labels

0 - Waiting on Author: Waiting for author to respond to review
Spark: Functionality that helps Spark RAPIDS
feature request: New feature or request