Monday, October 1, 2012

Performance Tuning for WebLogic Server: Native Muxers vs. Java Muxers

There are two critical areas for WebLogic Server (WLS) performance tuning:
  • Thread management
  • Network I/O tuning
In this article, we look at one aspect of network I/O tuning: native muxers vs. Java muxers.

Listen Thread


  Listen Thread ---> Listen Thread Queue --> Socket Muxer 

When a server process starts up, it binds itself to a port and assigns a listen thread to that port to listen for incoming requests.  Once a client makes a connection, the server passes control of that connection to the socket muxer.

In a thread dump, you can find an entry like this:
  "DynamicListenThread[Default[9]]" daemon prio=10 tid=0x00002aaac921b800 
   nid=0x3bf1 runnable [0x000000004c026000]

In the server log file, you can find a matching entry like this:
  <Oct 2, 2012 11:02:28 AM PDT> <Notice> <Server> <BEA-002613>
  <Channel "Default[9]" is now listening on 0:0:0:0:0:0:0:1:9000 
  for protocols iiop, t3, ldap, snmp, http.>
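
To find such entries in a long server log, the message can be parsed with a regular expression. The following is a minimal Python sketch; the function name parse_listen_notice and the pattern are illustrative assumptions derived only from the sample entry above, not from an official log format definition.

```python
import re

# Pattern derived from the sample BEA-002613 entry above (an assumption,
# not an official format definition).
LISTEN_RE = re.compile(
    r'Channel "(?P<channel>[^"]+)" is now listening on '
    r'(?P<address>\S+):(?P<port>\d+) for protocols (?P<protocols>[^.>]+)'
)

def parse_listen_notice(entry):
    """Return channel, address, port, and protocols from a listen notice, or None."""
    # Log entries may wrap across lines, so collapse all whitespace first.
    flat = " ".join(entry.split())
    match = LISTEN_RE.search(flat)
    if match is None:
        return None
    return {
        "channel": match.group("channel"),
        # The port is whatever follows the last colon before " for protocols".
        "address": match.group("address"),
        "port": int(match.group("port")),
        "protocols": [p.strip() for p in match.group("protocols").split(",")],
    }

sample = '''<Oct 2, 2012 11:02:28 AM PDT> <Notice> <Server> <BEA-002613>
<Channel "Default[9]" is now listening on 0:0:0:0:0:0:0:1:9000
for protocols iiop, t3, ldap, snmp, http.>'''
info = parse_listen_notice(sample)
print(info)
```

The channel name extracted here ("Default[9]") is the same name that appears inside DynamicListenThread[...] in the thread dump, which is how the two entries can be matched up.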

Socket Muxer


  socket muxer --> execute queue 

Muxers read messages from the network, bundle them into a package of work, and queue them to the Work Manager.  An idle execute thread picks up a request from the execute queue and may in turn hand off the job of responding to that request to special threads.  Finally, socket muxers also make sure the response gets back to the same socket from which the request came.

Socket muxers are software modules, and there are two types:
  • Java Muxers
    • Java muxers use pure Java to read data from sockets
    • The number of Java muxer threads is tunable through the Percent Socket Readers setting in the Administration Console
  • Native Muxers 
    • Native muxers use platform-specific native binaries to read data from sockets
      • Most platforms provide some mechanism to poll a socket for data
    • Native muxers provide better performance, especially when scaling to large user bases, because they implement a non-blocking thread model
    • Note that native I/O is not supported for WebLogic clients, including WLST
The Enable Native IO checkbox in the server’s configuration settings tells the server which type of muxer to use.  In the figure above, we have selected Native IO and, therefore, JavaSocketMuxer Socket Readers is grayed out.

In general, the server will determine the correct type of muxer to use and will use the native muxers by default without having to make any specific tuning changes.
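
The idea of polling many sockets for data from a single thread can be illustrated outside WebLogic. The sketch below uses Python's standard selectors module as an analogy only; it is not WebLogic's implementation, and the client names and payloads are made up for the illustration.

```python
import selectors
import socket

def read_ready_sockets(sel, timeout=1.0):
    """Poll all registered sockets once; return {name: data} for those with data."""
    received = {}
    for key, _events in sel.select(timeout):
        # Only sockets reported as readable are touched, so no thread
        # sits blocked on an idle connection.
        received[key.data] = key.fileobj.recv(4096)
    return received

sel = selectors.DefaultSelector()
pairs = []
for name in ("client-a", "client-b", "client-c"):
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=name)
    pairs.append((server_side, client_side))

# Only two of the three hypothetical clients send a request.
pairs[0][1].sendall(b"GET /a")
pairs[2][1].sendall(b"GET /c")

ready = read_ready_sockets(sel)
print(ready)

# Clean up all sockets and the selector.
for server_side, client_side in pairs:
    sel.unregister(server_side)
    server_side.close()
    client_side.close()
sel.close()
```

One polling thread serves three connections here, and the idle one costs nothing beyond its registration. This is the general property that makes a poll-based design scale better with many connected users.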

Which Muxer Was Actually Used?


The quickest way is to create a thread dump (for example, using jstack) and search for "Muxer".  In our experimental environment, the POSIX muxer happened to be the one selected:

"ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaae190b800 nid=0x10cf runnable [0x0000000040e13000]
   java.lang.Thread.State: RUNNABLE
        at weblogic.socket.PosixSocketMuxer.poll(Native Method)
        at weblogic.socket.PosixSocketMuxer.processSockets(PosixSocketMuxer.java
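
The search can also be scripted. The following is a minimal Python sketch that lists the muxer classes appearing in stack frames of a saved thread dump; the function name muxer_classes is hypothetical, and the pattern assumes frames look like the ones shown above.

```python
import re

# Matches stack frames such as
# "at weblogic.socket.PosixSocketMuxer.poll(Native Method)".
MUXER_FRAME_RE = re.compile(r"at weblogic\.socket\.(\w*Muxer)\.")

def muxer_classes(thread_dump):
    """Return the sorted, de-duplicated muxer class names found in stack frames."""
    return sorted(set(MUXER_FRAME_RE.findall(thread_dump)))

# Sample taken from the thread dump excerpt above (the last frame is truncated there).
dump = '''"ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaae190b800 nid=0x10cf runnable [0x0000000040e13000]
   java.lang.Thread.State: RUNNABLE
        at weblogic.socket.PosixSocketMuxer.poll(Native Method)
        at weblogic.socket.PosixSocketMuxer.processSockets(PosixSocketMuxer.java
'''
classes = muxer_classes(dump)
print(classes)
```

In practice, the dump text would come from a file written by jstack rather than from an inline string.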

-Dweblogic.SocketReaders


You can explicitly set the number of socket readers using the following command line option:
  • -Dweblogic.SocketReaders=3
If you set it to 3, you can find three socket reader entries like the following in the thread dump:
"ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8776000 nid=0x3475 waiting for monitor entry [0x0000000041dbd000]
   java.lang.Thread.State: BLOCKED (on object monitor)

"ExecuteThread: '1' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8774800 nid=0x3474 waiting for monitor entry [0x0000000041cbc000]
   java.lang.Thread.State: BLOCKED (on object monitor)

"ExecuteThread: '0' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8770000 nid=0x3473 runnable [0x0000000041877000]
   java.lang.Thread.State: RUNNABLE
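
To confirm that the setting took effect, the socket reader threads in a dump can be counted and grouped by state. This is a minimal Python sketch; the function name muxer_thread_states is hypothetical, and it assumes each thread entry starts with a quoted name at the beginning of a line, as in the excerpt above.

```python
from collections import Counter
import re

MUXER_QUEUE = "for queue: 'weblogic.socket.Muxer'"
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")

def muxer_thread_states(thread_dump):
    """Count socket reader (muxer) threads in a thread dump, grouped by state."""
    states = Counter()
    in_muxer_thread = False
    for line in thread_dump.splitlines():
        if line.startswith('"'):
            # A quoted thread name at column 0 starts a new thread entry.
            in_muxer_thread = MUXER_QUEUE in line
        elif in_muxer_thread:
            found = STATE_RE.search(line)
            if found:
                states[found.group(1)] += 1
                in_muxer_thread = False  # count each thread once
    return states

# Sample taken from the three entries shown above.
dump = '''"ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8776000 nid=0x3475 waiting for monitor entry [0x0000000041dbd000]
   java.lang.Thread.State: BLOCKED (on object monitor)

"ExecuteThread: '1' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8774800 nid=0x3474 waiting for monitor entry [0x0000000041cbc000]
   java.lang.Thread.State: BLOCKED (on object monitor)

"ExecuteThread: '0' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00002aaac8770000 nid=0x3473 runnable [0x0000000041877000]
   java.lang.Thread.State: RUNNABLE
'''
states = muxer_thread_states(dump)
print(sum(states.values()), dict(states))
```

Threads without a Thread.State line are not counted by this sketch.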

The main reason to do this is that in some releases the number of readers defaults to the number of CPUs available on the system. On some types of hardware this results in as many as 128 reader threads, which is far more than needed.

Typically you will see good performance with 1 to 3 socket reader threads. In some cases, people have used 6, but those are special cases.  Be warned that with too few readers, work will not be read from the sockets quickly enough for the server to process.

Using our ATG CRM benchmark, you can see how throughput and response time change when the number of SocketReaders is changed from 1 to 3:


  Metric                               SocketReaders=1    SocketReaders=3
  Maximum Running Vusers               400                400
  Total Throughput (bytes)             2,487,087,264      2,496,307,995
  Average Throughput (bytes/second)    1,036,286          1,040,128
  Average Hits per Second              29.786             29.86
  Average Response Time (seconds)      0.248              0.236
  90% Response Time (seconds)          0.209              0.210
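
To make the differences easier to compare, the relative change of each metric can be computed directly from the reported numbers. The short Python sketch below does only that arithmetic; the values are copied from the benchmark results above, and the helper name percent_change is illustrative.

```python
# (metric, SocketReaders=1, SocketReaders=3), copied from the results above.
RESULTS = [
    ("Total Throughput (bytes)", 2_487_087_264, 2_496_307_995),
    ("Average Throughput (bytes/second)", 1_036_286, 1_040_128),
    ("Average Hits per Second", 29.786, 29.86),
    ("Average Response Time (seconds)", 0.248, 0.236),
    ("90% Response Time (seconds)", 0.209, 0.210),
]

def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0

changes = {name: percent_change(one, three) for name, one, three in RESULTS}
for name, value in changes.items():
    print(f"{name}: {value:+.2f}%")
```

With these numbers, throughput moves by well under 1 percent, while the average response time drops by a little under 5 percent.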


BEA-000438


In some circumstances, you may see the following error message:
  <BEA-000438> <Unable to load performance pack. Using Java I/O instead.
  Please ensure that libmuxer library is in...
This can happen, for instance, when you use a 64-bit JVM and libmuxer.so is not on the LD_LIBRARY_PATH.  To resolve it, add the following directory to the LD_LIBRARY_PATH:
  • <Oracle Home>/wlserver_10.3/server/native/linux/x86_64
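
Before restarting the server, you can check whether the library is visible on the current path. The following is a minimal Python sketch, run in the same shell environment that starts the server; the function name find_library is hypothetical, and the check only looks for the file, not whether it matches the JVM's 32-bit or 64-bit architecture.

```python
import os

def find_library(lib_name="libmuxer.so", search_path=None):
    """Return the directories on search_path that contain lib_name."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits

found = find_library()
if found:
    print("libmuxer.so found in:", ", ".join(found))
else:
    print("libmuxer.so is not in any LD_LIBRARY_PATH directory")
```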

Acknowledgement


Some of the material here is based on feedback from Sandeep Mahajan. However, the author assumes full responsibility for the content.
