GH-48241: [Python] Scalar inferencing doesn't infer UUID #48727
tadeja wants to merge 11 commits into apache:main
Happy New Year! ❤️ I would suggest adding the documentation to the Extending PyArrow page, under the Canonical extension types section, as a separate subsection next to the Fixed size tensor one.
rok left a comment
Looks good to me. Two minor nits.
rok left a comment
Looks good. Just minor suggestions for comments.
@pitrou could you take a look at this PR? The Cython change especially could use your expertise.
Force-pushed from 8b1e4a2 to 2974e12.
Co-authored-by: Rok Mihevc <rok@mihevc.org>
@pitrou, any wise thoughts on changes here?
    GetUuidStaticSymbols();
    uuid_static_initialized = true;
    }
    #endif
We're duplicating code here between different module imports. It would be really nice to write something like this:
struct UuidModuleData {
  PyObject* UUID_class = nullptr;
};

UuidModuleData* InitUuidStaticData() {
  static ModuleOnceRunner runner("uuid");
  return runner.Run([&](OwnedRef module) -> UuidModuleData {
    UuidModuleData data;
    OwnedRef ref;
    if (ImportFromModule(module.obj(), "UUID", &ref).ok()) {
      data.UUID_class = ref.obj();
    }
    return data;
  });
}

struct ModuleOnceRunner {
  std::string module_name;
#ifdef Py_GIL_DISABLED
  std::once_flag initialized;
#else
  bool initialized = false;
#endif

  // Imports module_name and passes the module to func, at most once.
  template <typename Func>
  auto Run(Func&& func) -> decltype(func(OwnedRef())) {
    using RetType = decltype(func(OwnedRef()));
    RetType ret{};
    auto wrapper_func = [&]() {
      OwnedRef module;
      if (ImportModule(module_name, &module).ok()) {
        ret = func(std::move(module));
      }
    };
#ifdef Py_GIL_DISABLED
    std::call_once(initialized, wrapper_func);
#else
    if (!initialized) {
      initialized = true;
      wrapper_func();
    }
#endif
    return ret;
  }
};
I think @rok can help.
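The once-only initialization suggested above can be tried outside Arrow with a small stand-alone sketch. Everything in it is hypothetical: `FakeModule` stands in for the imported module (an `OwnedRef` in the real code), `ModuleOnceCache`, `GetUuidData` and `InitCalls` are illustration-only names, and a mutex-guarded flag takes the place of the `std::call_once` / `bool` pair. Unlike the snippet above, it stores the initializer's result so that later calls return the same data; that is a design choice of this sketch, not something the PR prescribes.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for an imported Python module. A real version would
// hold an OwnedRef obtained from ImportModule().
struct FakeModule {
  std::string name;
};

// Runs an initializer at most once and caches what it returned, so every
// later call sees the same data rather than a default-constructed value.
template <typename Data>
class ModuleOnceCache {
 public:
  explicit ModuleOnceCache(std::string module_name)
      : module_name_(std::move(module_name)) {}

  template <typename Func>
  const Data& Get(Func&& init) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!initialized_) {
      initialized_ = true;
      // Stand-in for the import step; assumed to always succeed here.
      FakeModule module{module_name_};
      data_ = init(module);
    }
    return data_;
  }

 private:
  std::string module_name_;
  std::mutex mutex_;
  bool initialized_ = false;
  Data data_{};
};

struct UuidModuleData {
  std::string uuid_class_name;  // placeholder for the PyObject* UUID_class
};

// Counts how often the initializer body actually ran.
inline int& InitCalls() {
  static int calls = 0;
  return calls;
}

const UuidModuleData& GetUuidData() {
  static ModuleOnceCache<UuidModuleData> cache("uuid");
  return cache.Get([](const FakeModule& module) {
    assert(!module.name.empty());  // the stand-in import always sets a name
    ++InitCalls();
    UuidModuleData data;
    data.uuid_class_name = module.name + ".UUID";
    return data;
  });
}
```

Calling `GetUuidData()` repeatedly returns a reference to the same cached object, and the initializer body runs only on the first call.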
    ARROW_ASSIGN_OR_RAISE(auto converter, (MakeConverter<PyConverter, PyConverterTrait>(
        options.type, options, pool)));
Does MakeConverter support extension types here? I see that we only unwrap the extension type in the inference path above.
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Rationale for this change
This closes #48241, #44224 and #43855.
Currently, PyArrow does not infer or convert uuid.UUID objects automatically, so users must specify the type explicitly.
What changes are included in this PR?
Adds support for Python's uuid.UUID objects in PyArrow's type inference and conversion.
Are these changes tested?
Yes, added test_uuid_scalar_from_python() and test_uuid_array_from_python() in test_extension.py.
Are there any user-facing changes?
Users can now pass Python uuid.UUID objects directly to PyArrow functions like pa.scalar() and pa.array() without specifying the type. The results are inferred as UUID scalars and arrays:
<pyarrow.UuidScalar: UUID('958174b9-3a5c-4cdd-8fc5-d51a2fc55784')>
<pyarrow.lib.UuidArray object at 0x1217725f0>
[
73611FD81F764A209C8B9CDBADDA1F53
]