Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 101 implementation] Add GDALGetThreadSafeDataset() and GDAL_OF_THREAD_SAFE #10746

Merged
merged 9 commits into from
Sep 19, 2024

Conversation

rouault
Copy link
Member

@rouault rouault commented Sep 7, 2024

Implements PR #10676

@rouault rouault added this to the 3.10.0 milestone Sep 7, 2024
@rouault rouault force-pushed the raster_multi_thread branch 2 times, most recently from 24f82da to d263783 Compare September 7, 2024 17:03
@rouault rouault changed the title [RFC 101] Add GDALCreateThreadSafeDataset() and GDAL_OF_THREAD_SAFE [RFC 101 implementation] Add GDALCreateThreadSafeDataset() and GDAL_OF_THREAD_SAFE Sep 7, 2024
@rouault rouault force-pushed the raster_multi_thread branch 3 times, most recently from 35ad269 to ae5944a Compare September 7, 2024 18:00
@rouault rouault changed the title [RFC 101 implementation] Add GDALCreateThreadSafeDataset() and GDAL_OF_THREAD_SAFE [RFC 101 implementation] Add GDALGetThreadSafeDataset() and GDAL_OF_THREAD_SAFE Sep 7, 2024
@rouault rouault force-pushed the raster_multi_thread branch 5 times, most recently from 4daeb38 to c8459a1 Compare September 9, 2024 09:41
@coveralls
Copy link
Collaborator

coveralls commented Sep 9, 2024

Coverage Status

coverage: 69.426% (+2.5%) from 66.937%
when pulling f2a6849 on rouault:raster_multi_thread
into 9ada0c8 on OSGeo:master.

@rouault rouault force-pushed the raster_multi_thread branch 2 times, most recently from bf2411b to 849e9ce Compare September 9, 2024 11:50
@rouault
Copy link
Member Author

rouault commented Sep 11, 2024

CC @abellgithub if you want to review it. I've made an effort to heavily comment the new gcore/gdalthreadsafedataset.cpp file

Copy link
Contributor

@abellgithub abellgithub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding the point of the LRU cache -- you have to mess with a map anyway. The only benefit I can see is that it will evict datasets after 64 (or was it 32?), but from what I've seen having that many datasets open, especially in one thread, probably indicates an issue and developers should probably be doing the destruction of datasets before that ever occurs.

I started a vector-based implementation instead of the LRU cache/map that doesn't require a lock except when something is deleted. I mostly just wanted to see how it would work. It's not done and I may just drop it. If I finish and it seems any better, I'll let you know.

*
* The implementation of this method must be thread-safe.
*/
bool MEMDataset::CanBeCloned(int nScopeFlags, bool bCanShareState) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird to return a boolean from one that you passed in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an overloaded method. Documentation is in the base class

*
* The implementation of this method must be thread-safe.
*/
std::unique_ptr<GDALDataset> MEMDataset::Clone(int nScopeFlags,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the arguments?

Why isn't this invoked by copy constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an overloaded method. Documentation is in the base class

frmts/mem/memdataset.cpp Outdated Show resolved Hide resolved
@@ -1582,6 +1591,11 @@ class CPL_DLL GDALRasterBand : public GDALMajorObject
m_poBandRef = bOwned ? nullptr : poBand;
}

const GDALRasterBand *get() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. I see below the operator that makes it work, but it's not intuitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not new code, just making it const-safe. Basically this class is a inspired from unique_ptr, but it does not own always have ownership of the stored band

@@ -1592,6 +1606,11 @@ class CPL_DLL GDALRasterBand : public GDALMajorObject
return m_poBandOwned != nullptr;
}

operator const GDALRasterBand *() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is also odd. I guess it's an inner class so protected, but maybe just a function name instead of the operator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, I guess it was this way before. Still odd.

@@ -3577,6 +3613,39 @@ GDALDatasetH CPL_STDCALL GDALOpenEx(const char *pszFilename,
{
VALIDATE_POINTER1(pszFilename, "GDALOpen", nullptr);

// Do some sanity checks on incompatible flags with thread-safe mode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment: There is a lot of bit twiddling here that's easy to get wrong and perhaps another scheme that provides access to these options would improve readability.

@@ -172,8 +178,34 @@ struct OGRSpatialReference::Private
const char *nullifyTargetKeyIfPossible(const char *pszTargetKey);

void refreshAxisMapping();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of complexity. Is there some reason not to just take a lock, rather then all the optional stuff? I'm assuming that this was introduced because this isn't thread-safe and you may not access from multiple threads. Is the right solution perhaps to lock when you access this structure rather than to push locking into this structure? Also, using a recursive mutex makes me scratch my head and wonder why it's necessary and if there is a reasonable way to allow a non-recursive mutex. If nothing else, a comment seems in order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this comment:

    // This structures enables locking during calls to OGRSpatialReference
    // public methods. Locking is only needed for instances of
    // OGRSpatialReference that have been asked to be thread-safe at
    // construction.
    // The lock is not just for a single call to OGRSpatialReference::Private,
    // but for the series of calls done by a OGRSpatialReference method.
    // We need a recursive mutex, because some OGRSpatialReference methods
    // may call other ones.


explicit OptionalLockGuard(Private *p) : m_private(*p)
{
if (m_private.m_bIsThreadSafe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this state change during use? It doesn't seem correct that it would, so it seems like the lock guard should simply be initialized with that state and hold it.

Perhaps better would be to create either a regular lock or a null mutex depending on the thread safety. Then you could just lock_guard the mutex without all of this.

Can you create either a thread-safe object or a non-thread-safe one and avoid opening up a state change with SetThreadSafe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to


/** Assignment method, with thread-safety.
 *
 * Same as an assignment operator, but asking also that the *this instance
 * becomes thread-safe.
 *
 * @param oSource SRS to assign to *this
 * @return *this
 * @since 3.10
 */

OGRSpatialReference &
OGRSpatialReference::AssignAndSetThreadSafe(const OGRSpatialReference &oSource)
{
    *this = oSource;
    d->SetThreadSafe();
    return *this;
}

* This file is at the core of the "RFC 101 - Raster dataset read-only thread-safety".
* Please consult it for high level understanding.
*
* 3 classes are involved:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yet you list 4 ;)

One twice...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I do list 3 distinct classes in the bullet points: GDALThreadSafeDataset, GDALThreadSafeRasterBand and GDALThreadLocalDatasetCache

}
}

for (const auto &oEntry : aoDSToFree)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment to explain why you're sticking things in a list to delete them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is a few line before, at the start of the method:

    // Collect TLS datasets in a vector, and free them after releasing
    // g_nInDestructorCounter to limit contention

@rouault
Copy link
Member Author

rouault commented Sep 19, 2024

@abellgithub Thanks for the review. I believe I've addressed one way or another your comments

but from what I've seen having that many datasets open, especially in one thread, probably indicates an issue and developers should probably be doing the destruction of datasets before that ever occurs.

yes, this is to handle such unlikely cases

@rouault rouault force-pushed the raster_multi_thread branch 2 times, most recently from faa65a3 to f2a6849 Compare September 19, 2024 18:58
@rouault rouault merged commit 927be3f into OSGeo:master Sep 19, 2024
32 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants