Remove synchronized methods in favour of ConcurrentHashMap using#76
Remove synchronized methods in favour of ConcurrentHashMap using#76dmitry-grytsovets wants to merge 6 commits intomsgpack:masterfrom
Conversation
|
Thank you for the contribution, @dgreenru! IIUC, TemplateRegistry#buildAndRegister cannot be synchronized-free because it includes check-then-act behavior like this: About Introspector, we should not change it because there is no compatibility between WeakHashMap and ConcurrentHashMap - reference strength is different. Please let me know if I get wrong. Again, thanks for the contribution! |
|
Please feel free to reopen this issue if you have any opinion or comments. |
|
And this code should looks like cause you shouldn't do key find twice. And about WeakHashMap. I can't imagine application where message converter the main memory consumer. |
I know that synchronized blocks are very heavy operation in Java memory model. Therefore, I agree with you generally - lock granuality should be smaller. However I'd like to crarify some points in this case. I'm still not certain whether TemplateRegistry can be really bottleneck. The registration runs only at initialization time. Or is register() method called again and again from multiple threads in your use case? One alternative is to have thread-local MessegePack object. Is it not enough for you?
+1. We can refactor the code. About WeakHashMap, I'll check for more detail in a few days. |
|
I think that high load app can use thousands of threads so ThreadLocal is not a good way to store MessagePack object. Also synchronized block will perform cpu cache synchronization if any changes was made by another cpu ( happens before and so on). Message conversion is a simple operation so it should be lightweight as possible. A think that ConcurrentHashMap templates;
...
Template template = templates.get(type);
if(template == null) {
// create the lightweight object which will do init and put self in map
//wrapper will change self to initilized object when init will be completed
Template wrapper = new TemplateWrapper(templates, type);
//putIfAbsent will put object only if it not exists
template = (template = templates.putIfAbsent(type, wrapper)) == null?wrapper:template;
}And wrapper should works like this TemplateWrapper extends Template {
Type type;
ConcurrentHashMap templates;
volitile Template template;
public void doSomething() {
if(template != null) {
template.doSomething();
} else {
init();
template.doSomething();
}
}
private synchronized void init() {
if(template != null) {
//create template in local variable cause other thread can see not fully initialized instance
Template instance = new TemplateImpl(type);
templates.put(type, instance);
template = instance;
}
}
} |
|
The bottleneck is public synchronized Template lookup(Type targetType)cause it calls N times over each read\write operation |
|
I've changed my pull request to minor modification which only removes synchronized from lookup. |
|
I'm interested in the performance improvement of this change. I've not finished reviewing your commits yet, though. @dgreenru Can you show me the result of the benchmark if you have? |
|
I haven't any benchmark. Synchronized method will always produce locks in multithread env. You can look at https://github.com/msgpack/msgpack-java/pull/76/files cause I reverted my first commit and create small fix which should remove synchronization only from frequently called method. |
There was a problem hiding this comment.
Without synchronized keyword, a thread can call register() or unregister() while another thread is executing this method. Did you consider those cases already?
As you know, this method calls register() and cache.get() at some points. So we need to take account of such as the case that thread#A calls unregister() while thread#B is executing in the middle of lookupInterfaceTypes().
There was a problem hiding this comment.
I also changed HashMap to ConcurrentHashMap which is fully thread safe. So you can remove all synchronized methods over the maps if you need :) You can look into the ConcurrentHashMap source code where you will find locked write and optimistic read with locked fallback :)
|
Well, I don't worry about the atomicity of ConcurrentHashMap itself, but of lookup()... BTW, I think it might be not profitable to merge this change if it doesn't improve the performance significantly. So can you benchmark the performance of this change in your use case? Of course, I hope your fix improves the performance because I use this library heavily. Thanks. |
|
I've made performance test package org.msgpack;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
public class Test {
public static void main(String[] args) throws InterruptedException {
int cpus = Runtime.getRuntime().availableProcessors();
ExecutorService exec = Executors.newFixedThreadPool(cpus);
final MessagePack mpack = new MessagePack();
final List message = Arrays.asList(1,2,3,4,5);
try {
mpack.read(mpack.write(message));
} catch (IOException e) {
throw new IllegalStateException(e);
}
final Semaphore in = new Semaphore(0);
final Semaphore out = new Semaphore(0);
Runnable runnable = new Runnable() {
@Override
public void run() {
try {
in.acquire();
for (int i = 0; i < 1000000; i++) {
mpack.read(mpack.write(message));
}
} catch (Exception e) {
throw new IllegalStateException(e);
} finally {
out.release();
}
}
};
for (int i = 0; i < cpus; i++) {
exec.execute(runnable);
}
in.release(cpus);
long start = System.currentTimeMillis();
out.acquire(cpus);
System.out.println("Cpus: "+Runtime.getRuntime().availableProcessors()+" Total time "+(System.currentTimeMillis() - start));
exec.shutdown();
}
}syncronized version: Cpus: 8 Total time 15704 |
There was a problem hiding this comment.
Move put to the end of template build cause without synchronization other threads can see not initialized version
|
@dgreenru Thanks! I'll look at it later. |
|
@dgreenru I tried to test serialization and deserialization of custom class in your branch and changed your benchmark code like this: But I ran into the following exception: Also, I executed this test code with msgpack/msgpack-java:origin/master and it worked well. Can you check it? |
|
I've executed your test without any problem. private final CountDownLatch ready = new CountDownLatch(1);field in TemplateReference? In my repository https://github.com/dgreenru/msgpack-java/blob/master/src/main/java/org/msgpack/template/TemplateReference.java |
|
Let me explain the performance test result. Set TemplateRegistry.lookup call execution time to 1 and number of cpu to 8. If other serialization and reflection operations execution time more than 8 you will have the same execution time for synchronized and not sychronized version cause syncronization only perfom uniform lookup distribution. Simple example: cpu = 2 synchronized version as you can see thread 2 simple got 1 step penalty not synchronized version thread 1 = [ a b b a b b ] I use array and integers in my performance test to reduce other operations execution time and have more obvious result. |
I used the latest revision: One question. Why did you add 5db19c1 in your changes though you just only removed synchronized keyword from TemplateRegistry#lookup()? |
|
As you can see in TemplateRegistry#buildAndRegister cache.put(targetClass, ref);made before actual template is ready. That was done to resolve circular dependencies. So other threads should wait for actual template when trying to use reference. I've added you as collabrator for my forked version can you push your test code. |
No description provided.