Performance problems running under UNIX

Barry

on November 6, 2012

I’ve downloaded the latest version of the OpenCDISC tool (v1.3) and I'm in the process of setting it up for our user group. The users work on a UNIX (Solaris 8, 64-bit) server. The UNIX server has 4 dual-core 1.2 GHz CPUs and 32 M memory. I have been able to get the tool to work, but it is much slower than I expected. For testing purposes, I'm using a single domain (CM), which has over 217K rows of data. When I run this job on my Windows XP machine, it takes less than 5 minutes to finish. On the UNIX server, the same job takes almost 9 minutes to run. I would have expected the UNIX server to be faster, not slower.

Some of the datasets we have can have over a million records (e.g. LB). I actually tried the LB dataset on UNIX and had to kill the run after several hours. I’ve tried several configurations when invoking the tool. None seem to make much difference in how long it takes the CM job to complete. Can you tell me if there's something else I could try to speed up this execution with regard to how I invoke the tool with Java? The UNIX Java version I'm using is 1.6.0_20. Below are some of the attempts I've made to make this job run faster. Oddly, I thought the "-d64" option would have made the job much faster, but it was the slowest of all my attempts.

# java -d64 -Xms256m -Xmx16384m -jar lib/validator-gui-1.3.jar (Validation of CM took 10 minutes 51 seconds)

# java -Xms256m -Xmx1024m -jar lib/validator-gui-1.3.jar (Validation of CM took 8 minutes 50 seconds)

# java -Xms256m -Xmx3840m -jar lib/validator-gui-1.3.jar (Validation of CM took 8 minutes 41 seconds)

The Windows run used the standard configuration that came with the tool; it took 4 minutes 38 seconds to run the CM job.

Additionally, I’ve updated the UNIX properties file so that the property reads "Engine.ThreadCount = AUTO".

Note, I understand I can split the data up (by site for example) and run them separately but I consider that my last resort to solve this problem. Many thanks for the help on this!

Forums: Troubleshooting and Problems

Tim

on November 6, 2012

RE: Performance problems running under UNIX

Hi,

Do you know what the architecture of the CPUs in your Solaris machine is? We've noticed some degraded performance on Sun/Oracle hardware in cases where the CPU architecture is optimized for high concurrent throughput, but not high single-thread performance.
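If it helps to gather this, a small standalone Java class along the following lines can report what the JVM itself sees on each machine. This is only a sketch and not part of the Validator: the class name `JvmInfo` is made up for illustration, and `sun.arch.data.model` is a HotSpot-specific property that may print `null` on other JVMs. Running it with the same `-d64`/`-Xmx` options used for the Validator also shows whether those options are taking effect.

```java
public class JvmInfo {

    // Builds a short report of the CPU architecture, JVM data model,
    // processor count, and maximum heap as seen by the running JVM.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        StringBuilder sb = new StringBuilder();
        sb.append("os.name = ").append(System.getProperty("os.name")).append('\n');
        sb.append("os.arch = ").append(System.getProperty("os.arch")).append('\n');
        // HotSpot-specific; typically "32" or "64", may be null elsewhere.
        sb.append("sun.arch.data.model = ")
          .append(System.getProperty("sun.arch.data.model")).append('\n');
        sb.append("java.version = ").append(System.getProperty("java.version")).append('\n');
        sb.append("java.vm.name = ").append(System.getProperty("java.vm.name")).append('\n');
        // Number of hardware threads the JVM can schedule on.
        sb.append("availableProcessors = ").append(rt.availableProcessors()).append('\n');
        // Upper bound on the heap, which reflects -Xmx when it is set.
        sb.append("maxMemory (MB) = ").append(rt.maxMemory() / (1024L * 1024L)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

For example, compile it with `javac JvmInfo.java` and run `java -d64 -Xmx3840m JvmInfo` on the server, then compare the output with a plain `java JvmInfo` run on the XP machine.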

Since the Validator currently validates each dataset on a dedicated thread (up to Engine.ThreadCount parallel datasets), a low single-thread processing speed on the server would explain why the XP machine with an x86 processor of reasonable clock speed might show better performance in this case.
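To illustrate the threading model described above, here is a rough sketch of "one task per dataset on a fixed-size pool". It is not the Validator's actual code: the class name, the `validate` stand-in (an arbitrary CPU-bound loop), and the row counts are all assumptions chosen for demonstration. The point it shows is that with a single large dataset, only one thread has work to do, so adding threads (or cores) does not shorten the run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerDatasetThreads {

    // Hypothetical stand-in for validating one dataset:
    // CPU-bound work proportional to the number of rows.
    static long validate(int rows) {
        long x = 1;
        for (int i = 0; i < rows; i++) {
            x = x * 6364136223846793005L + 1442695040888963407L;
            x ^= (x >>> 29);
        }
        return x;
    }

    // Submits one task per dataset to a pool of threadCount threads,
    // waits for all of them, and returns how many datasets completed.
    static int runAll(int[] rowCounts, int threadCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Long>> results = new ArrayList<Future<Long>>();
            for (final int rows : rowCounts) {
                results.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        return validate(rows);
                    }
                }));
            }
            int done = 0;
            for (Future<Long> f : results) {
                f.get(); // blocks until that dataset's task has finished
                done++;
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // One large "dataset": extra threads have nothing to work on.
        int[] oneBigDataset = { 200000000 };
        int[] threadCounts = { 1, 8 };
        for (int threads : threadCounts) {
            long start = System.nanoTime();
            runAll(oneBigDataset, threads);
            long elapsedMs = (System.nanoTime() - start) / 1000000L;
            System.out.println(threads + " thread(s): " + elapsedMs + " ms");
        }
    }
}
```

Running the same class on both machines also gives a crude comparison of single-thread speed, though the loop is not representative of real validation work and the numbers should be read as a rough indicator only.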

Addressing this is a bit tricky, since you're pretty much at the mercy of the hardware/software combination on this one. That said, the upcoming 1.4 release does have a few performance-related changes that reduce the processing time for large datasets in particular, so you may be able to benefit from those.

We'll be releasing 1.4 later this month, and we'll see if there are any options that could improve engine performance for this particular case for a future release.

Regards,
Tim