
nsqlookupd: fix write lock starvation #1208

Open
wants to merge 1 commit into master

Conversation

@andyxning (Member) commented Nov 1, 2019

This PR fixes registration DB write lock starvation.

We have encountered the following situation:

  • we have many topics, each with many ephemeral channels.
  • our nsqd cluster is large, with more than 600 instances.
  • channels regularly register and unregister themselves, since consumer clients can disconnect when a timeout is reached or in various other situations.

As a result, the registration DB is large.

So the lookup API takes a little more time to respond. While a lookup API request holds the read lock on the registration DB, all write lock acquisitions, for example register/unregister channel operations, are blocked. And since the read lock can be acquired simultaneously by concurrent lookup requests, the operations that need the write lock can be blocked for a long time, and memory and goroutines accumulate.

This PR adds a cache for the FindProducers response. The TTL is 1 minute, and expired entries are purged at a 5 minute interval.

Cons:

  • This adds some memory pressure, since new memory is allocated for the cache.
  • A newly added nsqd instance may take a little longer to be discovered by consumers.
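
A minimal sketch of the caching approach described above, assuming github.com/patrickmn/go-cache; the function name cachedFindProducers and the cache key format are illustrative, not the exact PR code:

import (
	"time"

	gocache "github.com/patrickmn/go-cache"
)

// 1 minute TTL for cached responses, expired entries purged every 5 minutes
var findProducersCache = gocache.New(1*time.Minute, 5*time.Minute)

func cachedFindProducers(r *RegistrationDB, category string, key string, subkey string) Producers {
	cacheKey := category + ":" + key + ":" + subkey
	if val, ok := findProducersCache.Get(cacheKey); ok {
		return val.(Producers) // serve the cached response, at most ~1 minute stale
	}
	producers := r.FindProducers(category, key, subkey) // the expensive scan under the read lock
	findProducersCache.Set(cacheKey, producers, gocache.DefaultExpiration)
	return producers
}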

return val.(Producers)
}

r.cachedMutex.RUnlock()
Member

https://github.com/patrickmn/go-cache#go-cache says "the cache can be safely used by multiple goroutines" so I don't think you need this lock at all

Member Author

This is mainly used to avoid multiple set operations happening at the same time.

Member

I see, but you still don't need the read lock for that, only the write lock (which could be a plain mutex) for the following section, where you check again if the result is in the cache, before running the expensive loop.

That expensive loop, which is the main thing you are trying to avoid, is:

	for k, producers := range r.registrationMap {
		if !k.IsMatch(category, key, subkey) {
			continue
		}
		for _, producer := range producers {

I think that there is probably a better way to structure the registration map, or maintain pre-computed results alongside it or embedded in it, which are updated or invalidated whenever the registration map is updated, to get better performance with huge numbers of producers / topics / channels / whatever.
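
A hedged sketch of that suggestion (the cache and fillMutex fields and the findProducersSlow helper are hypothetical names, not the PR's actual code): the cache read needs no lock, and a plain mutex guards only the re-check plus the expensive fill.

func (r *RegistrationDB) findProducersCached(category, key, subkey string) Producers {
	cacheKey := category + ":" + key + ":" + subkey
	if val, ok := r.cache.Get(cacheKey); ok {
		return val.(Producers) // go-cache is already goroutine-safe, no read lock needed
	}

	r.fillMutex.Lock() // plain sync.Mutex: only one goroutine fills a missing entry
	defer r.fillMutex.Unlock()
	if val, ok := r.cache.Get(cacheKey); ok {
		return val.(Producers) // another goroutine filled it while we waited
	}

	producers := r.findProducersSlow(category, key, subkey) // the expensive loop above
	r.cache.Set(cacheKey, producers, gocache.DefaultExpiration)
	return producers
}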

@andyxning (Member Author) Nov 3, 2019

According to the sync.Map doc:

(1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.

Item one says that sync.Map suits a situation where each entry is only written once, which is definitely not the case for this nsqlookupd registration map usage.

Member Author

but you still don't need the read lock for that, only the write lock (which could be a plain mutex) for the following section, where you check again if the result is in the cache, before running the expensive loop.

@ploxiln Thanks, I understand. Will do.

Member

Despite being less than optimal for this use-case, a sync.Map can have entries inserted and updated concurrently while a query is iterating over the entire structure. So it might resolve the initial complaint about the write lock being starved.

I consider the addition of this cache to be a hacky work-around. It adds up to a 1 minute delay to updates, for all users. It may be OK for many consumers, but probably not all, and not for nsqd or nsqadmin.

nsqlookupd is already just a big in-memory data structure. It could probably be re-structured to process your extreme case faster. That would admittedly require more time and effort to understand the current structure and queries. So a cheap quick fix, without those problems, might be a sync.Map.
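
A tiny standalone illustration of that property (not PR code): sync.Map allows Store to be called while Range is iterating, which is exactly what keeps writers from being blocked behind a long lookup; the same pattern on a plain map would be a data race.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var m sync.Map
	for i := 0; i < 3; i++ {
		m.Store(i, i)
	}

	visited := 0
	m.Range(func(k, v interface{}) bool {
		// Storing during iteration is allowed for sync.Map; a plain map
		// being written while ranged over would trigger the race detector.
		m.Store(fmt.Sprintf("new-%d", visited), visited)
		visited++
		return true
	})
	fmt.Println("visited at least", visited, "entries")
}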

@jehiah jehiah added the perf label Nov 2, 2019
@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch 8 times, most recently from 7aa646f to 0ab3fe3 Compare November 4, 2019 10:20
@andyxning (Member Author)

@ploxiln @jehiah PTAL.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 0ab3fe3 to 1a3e5a8 Compare November 4, 2019 13:16
removed := false
if _, exists := producers[id]; exists {
	removed = true
}

// Note: this leaves keys in the DB even if they have empty lists
delete(producers, id)

r.registrationMap.Store(k, producers)
Member

I don't think you need to store the producers map again after modifying it, because map is a reference type.

I think this is true for AddProducer() too (registrationMap.LoadOrStore() ensured that the producers map returned for you to operate on is the one already in the registrationMap).
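
A quick standalone illustration of that point (not PR code): the map returned by LoadOrStore is the same map object held inside the sync.Map, so mutating it is visible without a second Store.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var registrationMap sync.Map
	val, _ := registrationMap.LoadOrStore("topic:foo", map[string]int{})
	producers := val.(map[string]int)
	producers["producer-1"] = 1 // mutates the map that is already inside registrationMap

	stored, _ := registrationMap.Load("topic:foo")
	fmt.Println(stored.(map[string]int)["producer-1"]) // prints 1 without any extra Store
}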

@ploxiln (Member) commented Nov 4, 2019

commit title typo: "startvation" -> "starvation"

@ploxiln (Member) commented Nov 4, 2019

I thought of another problem, unfortunately. The values, which are plain maps, are not safe for concurrent updates and reads.

By the way, in a few places you do val.(map[string]*Producer). We have a ProducerMap type defined already, so I think you could do val.(ProducerMap).

So anyway, while concurrently updating and iterating over the RegistrationDB is OK (now that it is a sync.Map), updating the ProducerMap in a value is technically not OK while a lookup is iterating over it concurrently. One possible fix is to add a RWMutex to every ProducerMap, changing it to:

type ProducerMap struct {
    sync.RWMutex
    m map[string]*Producer
}

Another idea is to make every ProducerMap a sync.Map instead.

Both of these ideas sound like they will add a significant amount of overhead, more than just one top-level sync.Map. Maybe making all these ProducerMap values sync.Map will still come out over-all helpful for your usecase, I don't know. But it is sounding more iffy.

If sync.Map everywhere is too much overhead, the next step is to think about the data processing that nsqlookupd is doing that is expensive. I think it's the "iterate over everything" which happens for some kind of lookups. And I think this can be avoided with another data structure along-side. But we have to understand when and why this happens. It's for the *:

func (r *RegistrationDB) needFilter(key string, subkey string) bool {
	return key == "*" || subkey == "*"
}
channels := s.ctx.nsqlookupd.DB.FindRegistrations("channel", topicName, "*").SubKeys()
topics := s.ctx.nsqlookupd.DB.FindRegistrations("topic", "*", "").Keys()

So if we could come up with an efficient way to store information needed to serve these queries, and keep it consistent as registrations are added and removed, we could avoid ever needing to do the expensive looping ... (The other thing I might worry about is LookupRegistrations(id).)

Accelerating each one of these queries is sort of like adding another index. It is more complicated to keep all these indexes consistent as registrations are added/removed but it may be worth it so that all these queries are fast at your scale, and still accurate.
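
One hedged sketch of the "maintain pre-computed results alongside" idea (the type and method names here are hypothetical): a channel index keyed by topic, updated in AddRegistration/RemoveRegistration under the same lock as the registration map, so the "*" subkey query never has to walk everything.

type channelIndex map[string]map[string]struct{} // topic name -> set of channel names

func (ci channelIndex) add(topic, channel string) {
	if ci[topic] == nil {
		ci[topic] = make(map[string]struct{})
	}
	ci[topic][channel] = struct{}{}
}

func (ci channelIndex) remove(topic, channel string) {
	delete(ci[topic], channel)
	if len(ci[topic]) == 0 {
		delete(ci, topic)
	}
}

// channels answers FindRegistrations("channel", topic, "*").SubKeys() without a full scan
func (ci channelIndex) channels(topic string) []string {
	out := make([]string, 0, len(ci[topic]))
	for c := range ci[topic] {
		out = append(out, c)
	}
	return out
}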

@andyxning (Member Author)

I thought of another problem, unfortunately. The values, which are plain maps, are not safe for concurrent updates and reads.

That's quite true. I encountered this while debugging this PR's code. That's why I added a lock before ProducerMap2Slice:

		r.RLock()
		defer r.RUnlock()
		return ProducerMap2Slice(val.(map[string]*Producer))

By the way, in a few places you do val.(map[string]*Producer). We have a ProducerMap type defined already, so I think you could do val.(ProducerMap).

I use map[string]*Producer because the original code uses it. I will change this to ProducerMap since either is fine.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 1a3e5a8 to b3847f1 Compare November 5, 2019 02:38
r.registrationMap.Range(func(k, v interface{}) bool {
	if k.(Registration).IsMatch(category, key, subkey) {
		producers := v.(ProducerMap)
		for _, producer := range producers {
Member

I think you need to r.RLock() around this

Member Author

Done.

Member

If you take the r.RLock() around the whole loop, that puts this back close to where it started. If you can take the RLock() only after matching a Registration and before iterating over the ProducerMap, and then releasing it after finishing with that ProducerMap, it could avoid starving the write lock.
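
A sketch of that narrower locking shape against the sync.Map-based registrationMap in this PR (illustrative, not the committed code): the read lock is held only while one matched ProducerMap is being read.

results := make(map[string]struct{})
var retProducers Producers
r.registrationMap.Range(func(k, v interface{}) bool {
	if !k.(Registration).IsMatch(category, key, subkey) {
		return true
	}
	r.RLock() // short critical section: just this one ProducerMap
	for _, producer := range v.(ProducerMap) {
		if _, found := results[producer.peerInfo.id]; !found {
			results[producer.peerInfo.id] = struct{}{}
			retProducers = append(retProducers, producer)
		}
	}
	r.RUnlock()
	return true
})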

@andyxning (Member Author) Nov 6, 2019

Yes. Actually, after finally finishing this PR, I thought about this. The PR now seems similar to the original code. But some write locks have been removed.

If you can take the RLock() only after matching a Registration and before iterating over the ProducerMap, and then releasing it after finishing with that ProducerMap, it could avoid starving the write lock.

Not quite sure about this, since the read lock can be acquired simultaneously, and acquiring and releasing the read lock frequently may use more CPU.

Member

yeah ... that's why I said this strategy is "sounding more iffy"

as I see it, it's either go back to the original, or take the read lock just around accessing each ProducerMap ("thrashing" the lock I guess), or turn all these ProducerMap into sync.Map themselves

results := Registrations{}
for k, producers := range r.registrationMap {
r.registrationMap.Range(func(k, v interface{}) bool {
producers := v.(ProducerMap)
if _, exists := producers[id]; exists {
Member

I think you also need the r.RLock() here

Member Author

Done.

@andyxning (Member Author)

@ploxiln Actually, I have tested this PR in our clusters for about 1 day. Everything is fine. The worry about concurrent read/write access to the ProducerMap values also troubled me for quite a long time. But after a second look at sync.Map's Range and the sync.Map source code, it seems that it is safe to use like this without a lock.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from b3847f1 to 071a571 Compare November 5, 2019 03:00
@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch 2 times, most recently from 55de623 to 383fba1 Compare November 5, 2019 03:45
@ploxiln (Member) commented Nov 5, 2019

you pushed the wrong branch over this one :)

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 383fba1 to 55de623 Compare November 5, 2019 04:08
@andyxning (Member Author)

@ploxiln Done.

@mreiferson (Member) commented Jun 14, 2020

Where are we at on this one? Seems like the conversation has stalled at having to take a lock around a large critical section when iterating over registrationMap to handle * cases?

First, it doesn't look like we ever actually call FindProducers with a *, so we can simplify that code and remove the pathological loop:

r.RLock()
results := make(map[string]struct{})
var retProducers Producers
r.registrationMap.Range(func(k, v interface{}) bool {
	if k.(Registration).IsMatch(category, key, subkey) {
		producers := v.(ProducerMap)
		for _, producer := range producers {
			_, found := results[producer.peerInfo.id]
			if found == false {
				results[producer.peerInfo.id] = struct{}{}
				retProducers = append(retProducers, producer)
			}
		}
	}
	return true
})
r.RUnlock()

Then, for FindRegistrations, which is called with *, seems like to @ploxiln's point, the best path forward is actually to maintain two additional maps that handle Key or SubKey lookups for *. The locking can be simplified after that, since the critical section will be much smaller.
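
A hedged sketch of those two additional maps (all names hypothetical, removal mirroring add): per-category key sets and per-(category, key) subkey sets, maintained in AddRegistration/RemoveRegistration, so both "*" forms of FindRegistrations become direct lookups.

type starIndexes struct {
	keysByCategory map[string]map[string]struct{} // "topic" -> {topic names}; serves key == "*"
	subkeysByKey   map[string]map[string]struct{} // "channel:"+topic -> {channel names}; serves subkey == "*"
}

func (s *starIndexes) add(r Registration) {
	if s.keysByCategory[r.Category] == nil {
		s.keysByCategory[r.Category] = make(map[string]struct{})
	}
	s.keysByCategory[r.Category][r.Key] = struct{}{}

	k := r.Category + ":" + r.Key
	if s.subkeysByKey[k] == nil {
		s.subkeysByKey[k] = make(map[string]struct{})
	}
	s.subkeysByKey[k][r.SubKey] = struct{}{}
}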

@mreiferson (Member)

see also #584, where some of that code referenced above was added.

@mreiferson (Member)

Then, for FindRegistrations, which is called with *, seems like to @ploxiln's point, the best path forward is actually to maintain two additional maps that handle Key or SubKey lookups for *. The locking can be simplified after that, since the critical section will be much smaller.

Or if we want to get fancy, something like https://github.com/derekparker/trie is probably more appropriate. We'd have to change how we generate keys and choose a separator (e.g. :, something that isn't a valid topic/channel character) to produce string keys: topic:foo:*
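
Illustrative only: one possible key scheme for such a prefix-searchable structure, using ":" as the separator (assuming ":" is not a valid topic/channel character), so each "*" query becomes a prefix search.

func registrationKey(category, key, subkey string) string {
	return category + ":" + key + ":" + subkey
}

// Example mappings of "*" queries to prefixes:
//   FindRegistrations("channel", "foo", "*")  ->  prefix search "channel:foo:"
//   FindRegistrations("topic", "*", "")       ->  prefix search "topic:"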
