Friday, August 15, 2008

RHQ server HA work

We have started to bring High-Availability to the RHQ Server. This is targeted for the upcoming 1.1 release later this year.

The installer has been updated to support HA, and in the administration screens you will find a new High Availability section that lists the servers in a server cluster:

HA_servers_list.png


You can check this out and directly follow it in SVN head.

Wednesday, August 13, 2008

New version of iTunes Remote Control (iTRC)

The latest update of iTunes introduced an incompatibility with iTunes Remote Control, so iTRC no longer worked.
Fortunately, James has provided an updated version of iTRC that fixes the issues.
You can grab v1.4.1 from iTunes Remote Control.

Thanks James!

Monday, August 11, 2008

Visit to Legoland

Two weeks ago we went to Legoland Günzburg, which is around 1.5h away from Stuttgart (by car; Günzburg is also reachable by train).
Parking and entrance were no problem. We had ordered tickets over the internet and printed them ourselves. At the gates, they just got scanned, so there was no waiting in line.

IMG_3704.jpg


Behind the entrance, Orlando for the first time did not run away from a dog, but even approached it and patted it:

IMG_3706.jpg


The first station was the little express train, which does a tour through the park and gave us an overview. Unfortunately this was our first 10-minute wait of the day, with more to come (overall the park was not full and wait times were not too bad).

Next we went through Miniaturland (a landscape with different scenes, all in 1:20 scale). And this really was one of the greatest parts of Legoland. Not only did we enjoy it, but Marlene was totally excited. Whenever she saw a moving car in it, she exclaimed "Da!" ('There') and pointed to it. I think she could have stayed there for hours.

IMG_3724.jpg

(Rialto bridge from Venice, Italy)

IMG_3726.jpg


IMG_3727.jpg

(Brandenburger Tor, Berlin, Germany)

After this we went for lunch at the Dino-Grill. In the next picture you'll see some typical food here :-)
IMG_3734.jpg


Otherwise Legoland featured some wild animals:

IMG_3738.jpg


IMG_3751.jpg
IMG_3804.jpg


Knights on their horses:
IMG_5873.jpg


Driving schools:
IMG_3800.jpg


And a lot of other tourists:
IMG_3823.jpg



The kids really liked the visit to Legoland and Orlando is very eager to go there again soon. Actually, as he was turned away at two places for being too small, he is now even drinking milk to accelerate his growth.

Thursday, August 07, 2008

Don't be afraid to use a profiler

While developing on RHQ I encountered several situations where things were slow. One starts to look at the code, adds a few print statements here and there and tries to guess what is happening.
Running the whole show through a debugger usually doesn't help either, as digging deep into method calls will let you run into transaction timeouts, from which point on the results are just useless.
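To make the "print statements" approach concrete, here is a minimal sketch of what such hand-made timing usually looks like. The class Timed and the method slowLookup are hypothetical stand-ins, not RHQ code. It works for one suspicious spot, but it only measures what you already guessed to be slow.

```java
import java.util.function.Supplier;

class Timed {
    /** Runs the given work, prints how long it took, and returns its result. */
    static <T> T timed(String label, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMillis + " ms");
        }
    }

    // Hypothetical stand-in for a slow piece of application code.
    static int slowLookup() {
        int sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        int result = timed("slowLookup", Timed::slowLookup);
        System.out.println("result = " + result);
    }
}
```

Every new guess means another wrapper and another redeploy, and time spent deep inside libraries stays invisible.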

Luckily there is a different sort of debugging help: profilers like JProfiler from ej-technologies, the one built into NetBeans, TPTP in Eclipse and others.

Profilers like JProfiler give you a few different ways to look at your application. For this discussion about performance, I will concentrate on the CPU usage part, as this is the most relevant for our case.
But actually (and this is what my presentation at Java Forum Stuttgart was about) they also make very nice debugging helpers.

Differentiation Profiler vs. Debugger



Before I dive deeper into the subject, I want to give a short differentiation between profiler and debugger.

Debugger




  • Stepping through individual statements of the code
  • Stops the execution, so not for production use
  • Allows you to see the content of individual variables
  • Hard to see the root cause of slowness, as this might be deep down in used libraries.


If the application code you want to step through is large, you will most likely run into transaction timeouts, which renders all results invalid.

Profiler




  • Application runs normally (but slower)
  • No way to see individual variable content
  • Easy way to get call graphs through the application


How to tackle the issues



Suppose you want to participate in RHQ and ask yourself "what happens if I click here?"
RHQ_gui.png


With a debugger this is hard to tell. You could search through the code for this string and then try to find out whether this is a Struts page or a JSF page, or you could look at where this link is pointing and try to determine the resulting page from that.

With a profiler the steps could be as follows:


  • Instrument the RHQ server in a profiler-specific way (have a look at the profiler vendor's manual)
  • Start the application and navigate to the above page
  • Start the profiler and start profiling CPU usage
  • Click on the link
  • When the result page is rendered, stop CPU profiling
  • Start looking at the timing tree as outlined below


CallTree1.png


Generally open the node with the most CPU usage (Thread.run, the top line here).

This will give a subtree view like this:

CallTree2.png


Here we see that the CPU time is spent in 3 invocations of some org.rhq.enterprise.* method and also in 4 calls to a HighLowChart. Here we are interested in the org.rhq code, so we open this subtree:

CallTree3.png


And we see that those 3 invocations are actually 3 individual Struts actions. From here on it is easy to dive into them, either by just opening the tree nodes again or by looking them up in struts-config.xml.

Repeated invocations



Now that you know where time is spent, you want to improve the methods. Always going from the top can be tiring. Most profilers allow you to limit the display to just one method and its children.

It's also possible (at least with JProfiler) to add a trigger that fires and starts profiling when a certain method is called, and stops when the path of execution leaves this method again.



RHQ - tip of the day: Agent waiting at startup

Sometimes when you start the RHQ agent (on the command line), it will not proceed to the sending> prompt, but sit there and wait for something. This post covers some of the possible causes.

The server has rejected the agent registration request...



Well, this message from the agent actually goes on:


Cause: [org.rhq.core.clientapi.server.core.AgentRegistrationException:The agent asking for registration is trying to register the same address/port [172.31.7.7:16163] that is already registered under a different name [snert]; if this new agent is actually the same as the original, then re-register with the same name]


This means that the connecting agent is known as 'snert' to the server, but it was passing a different name to it on this start.

To solve this, start the agent with option --clean and give the correct name.

The agent will now wait until it has registered with the server...



By default (well, you had to answer the questions about it), the ports for server-agent communication are as follows (yes, two unidirectional connections):

  • Agent to server: server is listening on port 7080

  • Server to agent: the agent is listening on port 16163



... and hangs there



This is an agent state where the server cannot be reached (perhaps because it is down or because a firewall blocks the traffic).
So make sure port 7080 on the server machine is reachable from the agent's machine. You can simply check this with a web browser like Safari or lynx, or with wget.
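If no browser is at hand on the agent's machine, a plain TCP connect does the same job. The following is a minimal sketch using only JDK classes. The class name PortCheck, the 3 second timeout and the default host are my own choices for illustration, and a successful connect only shows that something is listening on the port, not that it is the RHQ server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    /** Returns true if a TCP connection to host:port can be opened within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, ...
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions for illustration: local host, server port 7080.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7080;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Run it on the agent's machine with the server host and 7080 as arguments. The same check, run on the server machine against the agent's host and its listening port (16163 by default), tests the other direction.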

... and shows an additional error


Here, after a little time the agent will show a message like this:

The server has rejected the agent registration request. Cause: [org.rhq.core.clientapi.server.core.AgentRegistrationException:Server cannot ping the agent's endpoint. The agent's endpoint is probably invalid or there is a firewall preventing the server from connecting to the agent. Endpoint: socket://172.31.7.3:12345/....

This means that the agent was able to talk to the server (so this communication channel is ok), but the other direction is failing. In the example above, the server was trying to reach an agent on IP 172.31.7.3 and TCP port 12345, which was probably blocked in the firewall.

The agent does not have plugins - it will now wait for them to be downloaded...



This usually means that the server has a different security token than the one the agent was sending.
This could come from the fact that the Java preferences entry got mangled, e.g. by testing with different agent versions or VMs or ...

You will see this message only on initial agent startup when it does not have any plugins yet.
If plugins got downloaded in a previous run, you will probably run into the situation shown below.

If you see this on the agent, you should also see messages like this on the server side:

11:40:48,454 WARN [CommandProcessor] {CommandProcessor.failed-authentication}Command failed to be authenticated! This command will be ignored and not processed: Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=1217855913569-109582636-403140853869881172, rhq.send-throttle=true}]; params=[{targetInterfaceName=org.rhq.core.clientapi.server.core.CoreServerService, invocation=NameBasedInvocation[getLatestPlugins]}]

To solve this, start the agent interactively with the --clean option.

Agent startup is ok, but ping command fails



Here, the agent starts successfully, but you will e.g. not see any new metric data coming in from this agent. When you issue the ping command on the agent command line you will see something like:

sending> ping

Pinging...

Failed to execute prompt command [ping]. Cause: org.rhq.enterprise.communications.command.server.AuthenticationException:Command failed to be authenticated! This command will be ignored and not processed: Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=1214208960346-102975580-7334156733284942657, rhq.send-throttle=true}]; params=[{targetInterfaceName=org.rhq.enterprise.communications.Ping, invocation=NameBasedInvocation[ping]}]


This is basically the same as above. Your server log should also be full of those CommandProcessor.failed-authentication messages. Solution as in the previous section.



Tuesday, July 29, 2008

Netbeans: where is the servlet-api?

I am currently writing a sample servlet for x2svg. And I am also trying to use NetBeans for it (as you probably remember, I am normally using Eclipse for my daily work). Actually I even downloaded 6.5M1 as Adam Bien sounded enthusiastic about it (and the editor really feels somewhat snappier than the 6.1 one).

To be able to include javax.servlet.* I need the servlet-api.
So I went to Tools->Libraries and expected it to be there. I have downloaded the EE edition of NetBeans, that should have it, but I just can't find it. There are tons of stuff like JSF libs etc. that all depend on it in some way, but I can't just pull in the basic servlet api.

For now I ended up with including the external servlet-api jar from tomcat, but this can't be the real solution.

What am I doing wrong? Where is the servlet-api in the NetBeans distribution?

Sunday, July 27, 2008

First 100km

As I indicated in this post, I started running again some weeks ago. Now, after around 7 weeks, I crossed the 100km barrier (so this makes 15km per week on average, in 2 sessions on average per week).

Let's see if I will stay motivated to do the next 100km in 7 weeks or less as well.

One of my big motivational factors is clearly Trailrunner, a very cool desktop app for the Mac to file each run in a training diary and to graphically select routes and compute distances etc. It's really fun to select new routes to run.
Trailrunner is even able to create images from the route
Daheim73-04.jpg


that can be uploaded to an iPod or a mobile phone as a miniature map to go (the image shows the name of the way-point on top, and the expected target time (*) and the distance from the start at the bottom). Map data from various map sources can also be shown on those micro maps.


*): One can set an expected speed for each route which is used to compute the expected time for each way-point; this helps to adjust the running speed while on the go.

It's no full blown GPS navigation (which seems to be supported with more sophisticated devices like the Garmin Forerunner), but is very helpful when running a new track for the first time.

Saturday, July 26, 2008

Why conference calls fail (2)

RHQ plugin descriptor development - and validation in your IDE

When writing a plugin for RHQ, one has to write the plugin descriptor. While this is not complicated, it still leaves some room for errors.
The good thing (but you will hate it when you run into it) is that when deploying the plugin, its descriptor will be validated against the defining XML Schema - and errors will be reported.

Of course, you could just have a look at the structural diagram of the second part of my plugin development postings, but this doesn't really make the task of writing XML easier.

The elements of the plugin descriptor are encoded in two XML Schema files in the source code:

  • urn:xmlns:rhq-plugin : rhq/modules/core/client-api/src/main/resources/rhq-plugin.xsd
  • urn:xmlns:rhq-configuration : rhq/modules/core/client-api/src/main/resources/rhq-configuration.xsd
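With those two files at hand, a descriptor can also be checked from a small program before deploying. The following is a minimal sketch using the JDK's javax.xml.validation API. The class name DescriptorCheck and the command line layout are my own invention and not part of RHQ, and it is an assumption that passing both schema files, with rhq-configuration.xsd first, is enough to resolve the references between them.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DescriptorCheck {
    /** Validates the XML against the given schemas and returns all reported problems. */
    static List<String> validate(Source xml, Source... schemas) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(schemas);
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema could not be loaded: " + e.getMessage(), e);
        }
        List<String> problems = new ArrayList<>();
        Validator validator = schema.newValidator();
        // Collect errors instead of stopping at the first one.
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { problems.add("warning: " + describe(e)); }
            @Override public void error(SAXParseException e) { problems.add("error: " + describe(e)); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            validator.validate(xml);
        } catch (SAXException e) {
            // Fatal problems such as XML that is not well-formed end up here.
            problems.add("fatal: " + e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }

    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: DescriptorCheck <descriptor.xml> <schema.xsd>...");
            return;
        }
        Source[] schemas = new Source[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            schemas[i - 1] = new StreamSource(new File(args[i]));
        }
        List<String> problems = validate(new StreamSource(new File(args[0])), schemas);
        problems.forEach(System.out::println);
        System.out.println(problems.isEmpty() ? "descriptor is valid" : problems.size() + " problem(s) found");
    }
}
```

The error handler collects problems rather than throwing, so one run lists every schema violation in the descriptor.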



This means that you can use the schema files for validation in your IDE, too. The following sections show how this is done; as the base install path of my RHQ checkout is /jon/jonHEAD, this path is prepended in the examples.

Eclipse 3.3



Go to Preferences->Web and XML->XML Catalog.



IntelliJ 7



Go to Setup->Resources and enter the urn and the path. So it might look like this:



NetBeans 6.1



Go to Tools->DTDs and XML Schemas. Then select the User Catalog and click on Add Local DTD or Schema. This might look as follows:


Sunday, July 20, 2008

RHQ snmptrapd updated

Since I have created the SNMPtrapd plugin for RHQ, I have updated it with a lot of little features and corrections:


  • The listener port can be configured via the GUI

  • A severity OID can be configured via the GUI: if a varbinding with this OID is received, it is used to compute the event severity

  • It correctly binds and unbinds to its listening socket

  • OID to text mapping: As I am still looking for an open source MIB that fits my needs and where development is active, I've added the possibility to put mappings of OID to text in a properties file

  • The sender address is computed not only for V1 traps

  • More information about V1 traps is extracted and put in the resulting event.
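To illustrate the idea behind the OID to text mapping from the list above, here is a minimal sketch using java.util.Properties. The class OidNames and the file layout (numeric OID as key, text as value) are assumptions for illustration and not necessarily what the plugin does; the two sample OIDs are the standard linkDown and linkUp traps.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class OidNames {
    private final Properties mappings = new Properties();

    /** Parses mappings given in properties syntax, e.g. "1.3.6.1.6.3.1.1.5.3=linkDown". */
    static OidNames parse(String propertiesText) {
        OidNames names = new OidNames();
        try {
            names.mappings.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Returns the mapped text for the OID, or the numeric OID itself if there is no mapping. */
    String nameFor(String oid) {
        return mappings.getProperty(oid, oid);
    }

    public static void main(String[] args) {
        // Inline text stands in for the contents of a (hypothetical) mapping file.
        OidNames names = parse(
                "# trap OID = display text\n"
              + "1.3.6.1.6.3.1.1.5.3=linkDown\n"
              + "1.3.6.1.6.3.1.1.5.4=linkUp\n");
        System.out.println(names.nameFor("1.3.6.1.6.3.1.1.5.3"));
        System.out.println(names.nameFor("1.3.6.1.4.1.9999.1"));
    }
}
```

Falling back to the numeric OID means unmapped traps still show up in a readable form, so the file only needs entries for the OIDs you care about.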



The plugin is in SVN and its description is on the plugin wiki (which also has the link to SVN etc.).

Thursday, July 17, 2008

RHQ plugin dev - api docs online

When you want to develop a plugin for RHQ, you can of course follow my plugin dev series. But in most cases you probably want to know more.
Of course, as RHQ is open source, you can just look at the source. Most of the time it is more convenient though to just browse api docs.

JavaDoc for plugin development is now available - check it out. The docs are for RHQ version 1.0.1.

If you have questions, join us in #rhq on irc.freenode.net or post in the RHQ forums.

Tuesday, July 15, 2008

The reorg

This is one that I have had in mind for quite some time now. Actually I even had it on paper and scanned it in a few years ago, but lost both versions.

I've drawn the original version directly after coming out of a meeting, as this was the only thing I was able to think of after that meeting.



Sunday, July 13, 2008

x2svg 1.2.1 released


I have just released version 1.2.1 of x2svg, a tool to render input graphs as svg and to convert the created SVG (actually any SVG) into other formats like PNG or PDF.

This release consists of two changes relative to the previous one:
  • Lines have shadows now

  • Fixed a bug, where straight lines had 'steps' in them


As usual, you can download the release from sourceforge.

Friday, July 11, 2008

Thursday, July 10, 2008

RHQ and x2svg on ohloh

RHQ and x2svg have lately been added to the list of projects on ohloh.net - if you are a user of said software, please consider stopping by ohloh and clicking on the "I use this" button.

Direct links to the projects:


Unfortunately, Ohloh can't know that we already were working on RHQ before open sourcing it in February 2008, so it thinks its history is still relatively short.