</pre>This scheme has a number of problems (if you add one server you need to move too many keys around, and so on), but it's the general idea even if you use a better hashing scheme like consistent hashing (there is a minimal sketch of it at the end of this post).<br/><br/>Ok, but are key accesses well distributed among the key space? All the user data will be partitioned among different servers, and no inter-key operations are used (like SINTER; otherwise you would need to make sure the keys you want to intersect end up on the same server. <b>This is why Redis, unlike memcached, does not force a specific hashing scheme: it's application specific</b>). Still, some keys are accessed much more frequently than others.<h3><a name="Special keys">Special keys</a></h3>For example, every time we post a new message we <b>need</b> to increment the <code name="code" class="python">global:nextPostId</code> key, so a single server will receive a lot of increments. How do we fix this problem? The simplest way is to have a dedicated server just for increments, but this is probably overkill unless you have really a lot of traffic. There is another trick: the ID does not really need to be an incremental number, <b>it just needs to be unique</b>. So you can generate a random string long enough to make a collision unlikely (practically impossible, if it's md5-sized) and you are done (also sketched at the end of the post). We have successfully eliminated our main obstacle to making this really horizontally scalable!<br/><br/>There is another problematic key: <code name="code" class="python">global:timeline</code>. There is no clean fix for this one. If you need to take something in order, you can either split it among different servers and <b>then merge</b> the pieces when you need the data back (the last sketch at the end of the post shows this), or keep it ordered in a single key. Again, if you really have that many posts per second, you can use a single server just for this key. Remember that with commodity hardware Redis is able to handle 100,000 writes per second, which is enough even for Twitter, I guess.<br/><br/>Please feel free to use the comments below for questions and feedback.
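As promised above, a few illustrative sketches follow. First, a minimal Python implementation of consistent hashing. The vnode count, the md5 hash and the server addresses are arbitrary choices for the example (the post does not prescribe any of them); the point is just to show why adding a server remaps only a small fraction of the keys:<br/><br/><pre name="code" class="python">import bisect
import hashlib

class ConsistentHashRing:
    """Each server owns many points on a ring; a key goes to the server
    owning the first point at or after the key's own point. Adding a
    server remaps only roughly 1/N of the keys instead of almost all."""

    def __init__(self, servers, vnodes=100):
        self._ring = []  # sorted list of (point, server) pairs
        for server in servers:
            self.add_server(server, vnodes)

    @staticmethod
    def _point(value):
        # Any well-distributed hash works here; md5 is just convenient.
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_server(self, server, vnodes=100):
        for i in range(vnodes):
            bisect.insort(self._ring, (self._point("%s#%d" % (server, i)), server))

    def server_for(self, key):
        # First virtual node at or after the key's point on the ring.
        idx = bisect.bisect(self._ring, (self._point(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

# Hypothetical shard addresses, purely for the example:
ring = ConsistentHashRing(["10.0.0.1:6379", "10.0.0.2:6379", "10.0.0.3:6379"])
print(ring.server_for("uid:1000:followers"))</pre>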
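Second, the random-ID trick that replaces the <code name="code" class="python">global:nextPostId</code> counter. This is all it takes; os.urandom is just one convenient source of strong randomness, and the helper name is made up for the example:<br/><br/><pre name="code" class="python">import os

def new_post_id():
    # 16 random bytes (md5-sized), hex encoded: no central counter,
    # no dedicated Redis server, and collisions are practically impossible.
    return os.urandom(16).hex()

post_key = "post:%s" % new_post_id()  # this key can live on any shard</pre>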
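Finally, a sketch of the split-then-merge approach for <code name="code" class="python">global:timeline</code>. It assumes the redis-py client, hypothetical shard addresses, and that every entry was LPUSHed as a "unixtime:postid" string so each per-shard list is already newest-first; none of these details come from the post itself:<br/><br/><pre name="code" class="python">import heapq
import redis  # assuming the redis-py client

# Hypothetical shard addresses, purely for the example:
shards = [redis.Redis(host=h) for h in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]

def merged_timeline(count=50):
    # Fetch the newest entries of every shard's own global:timeline.
    per_shard = [
        [e.decode() for e in shard.lrange("global:timeline", 0, count - 1)]
        for shard in shards
    ]
    # Each slice is already ordered newest-first (entries were LPUSHed),
    # so a k-way merge on the timestamp prefix avoids a full re-sort.
    newest_first = lambda entry: -int(entry.split(":", 1)[0])
    return list(heapq.merge(*per_shard, key=newest_first))[:count]</pre>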