Wednesday, June 30, 2004

Java and MT


Java's memory model is very aggressive, and you have to be very careful when accessing memory from multiple threads. Of course you have to synchronize access to shared memory locations, but you have to synchronize even when it looks like you don't need to. There are several cases where you must use synchronized:
  • To provide a mutual exclusion barrier to prevent one thread from modifying a data structure while the other is reading it.
  • To provide a memory barrier to prevent memory operation reordering from doing something you didn't want to have happen.
  • To make the memory you're accessing volatile so that the runtime optimizer doesn't throw away your request to read a memory location.
Here's a good web page that discusses this.
A good rule to use is that when in doubt, synchronize.
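As an illustration of the first case (mutual exclusion), here is a minimal sketch. The class SafeCounter and its method names are made up for this example. Several threads bump a shared counter, and every read and write goes through the same lock.

```java
class SafeCounter {
    private final Object lock = new Object(); // every access to count holds this lock
    private int count = 0;

    void increment() {
        synchronized (lock) {
            count++; // read-modify-write, so it must not interleave with other threads
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Runs `threads` workers that each increment `perThread` times; returns the final count.
    static int runDemo(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000)); // prints 40000
    }
}
```

Without the synchronized blocks, increments from different threads could be lost and the total could come out lower.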
Reordering by the hardware mostly bites on a multiple-CPU machine, but the problems that I've been running into recently happen on my single-CPU machine, with something like the code below.
(Note that everything after this is speculation based on behavior I've seen.)
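final Object m_x = new Object();  // lock object
int m_y = 0;                      // shared between threads, not volatile
// Runs in the first thread: writes m_y while holding the lock.
void Thread1() {
    synchronized (m_x) {
        m_y = 1;
    }
}
// Runs in the second thread: reads m_y without any synchronization.
void Thread2() {
    while (true)
        System.out.println(m_y);
}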
Even after the code in Thread1 has run in its own thread, the code in Thread2 keeps printing 0. I believe this is because the runtime optimizer doesn't bother to re-read the value of m_y after the first access. This is similar to what a compile-time optimizer can do, which you'd fix with volatile. But a compile-time optimizer couldn't do anything in this situation.
In Java, though, the runtime optimizer seems to make it so that the first access gets the value from memory, and later accesses don't bother reading it from memory again.
This strange behavior goes away if you put a synchronized(m_x) block around the access to m_y in Thread2. I believe this tells the runtime optimizer that something is likely to have been changed by another thread.
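A minimal sketch of that fix, assuming a hypothetical class VisibilityDemo whose fields lock and y stand in for m_x and m_y. Both the write and the read go through the same lock, so the polling reader eventually sees the writer's update. For a simple flag like this, declaring the field volatile would be another option.

```java
class VisibilityDemo {
    private final Object lock = new Object(); // plays the role of m_x
    private int y = 0;                        // plays the role of m_y

    void write(int value) {
        synchronized (lock) {
            y = value;
        }
    }

    int read() {
        synchronized (lock) { // same lock as the writer, so the update becomes visible
            return y;
        }
    }

    // Starts a polling reader and a writer; returns the first non-zero value the reader saw.
    static int runDemo() {
        VisibilityDemo demo = new VisibilityDemo();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            int v;
            while ((v = demo.read()) == 0) {
                Thread.yield(); // keep polling until the write shows up
            }
            seen[0] = v;
        });
        Thread writer = new Thread(() -> demo.write(1));
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 1
    }
}
```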

Tuesday, June 8, 2004

Java uses /dev/random; may block forever creating SSL connections

The software that we're developing creates SSL connections when it starts up, and it does so at S13 in the init order (it has to be after the network comes up, but before other services start). The result is that on an NFS-booted Linux machine, it sits there forever and never completes the connection.

Clue #1: if you move the mouse or type on the machine's keyboard, eventually the connection will complete.

Of course the reason for it hanging is that Java is using /dev/random to generate the keys for the SSL connection. And /dev/random gets all of its entropy from the physical environment, and refuses to return random values until it gets some input from the outside world.

We don't see this on a machine that boots from disk; I assume that /dev/random gets entropy from the interaction with the drive, via interrupts and so forth. For some reason the network activity doesn't yield the same entropy data, or at least not enough.

I found this article that discusses the usefulness of /dev/random given its current design.

To work around this, we decided to use /dev/urandom. We could do this with a link in the file system, but a much better solution is to set the following system property on the Java command line:

-Djava.security.egd=file:/dev/urandom

Now all you have to worry about is attacks against your SSL connection from those who know that you are using the pseudo-random number generator...
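
One way to confirm that the property actually reached the JVM is to read it back at startup and pull a few bytes from SecureRandom. This is only a sketch. The class EntropyCheck and its method names are made up for illustration, and reading the property shows what was passed on the command line, not which device the provider ends up opening.

```java
import java.security.SecureRandom;

class EntropyCheck {
    // Returns the value of java.security.egd, or a placeholder if it was not set.
    static String configuredSource() {
        return System.getProperty("java.security.egd", "(not set)");
    }

    // Draws n random bytes from a freshly created SecureRandom.
    static byte[] randomBytes(int n) {
        byte[] buf = new byte[n];
        new SecureRandom().nextBytes(buf);
        return buf;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredSource());
        long start = System.currentTimeMillis();
        byte[] bytes = randomBytes(16);
        long elapsed = System.currentTimeMillis() - start;
        // A long delay here would hint that the random source is blocking.
        System.out.println("got " + bytes.length + " bytes in " + elapsed + " ms");
    }
}
```

Run it with and without -Djava.security.egd=file:/dev/urandom on the affected machine to compare how long the first read takes.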