Friday, December 05, 2014

RHQ 4.13 released

I am very pleased to announce the release of RHQ 4.13 as an
early gift from St. Nicholas :)


Screenshot of the RHQ UI showing some charts

This release contains a lot of bug fixes and smaller improvements, as well as some new features:

  • Alerts have a new status 'recovered', that can be filtered upon
  • The UI allows to hide elements that are not needed on a per user basis
  • The as7/WildFly plugin now supports runtime queues and topic subscribers
  • Further improvements in the Storage Nodes

As always RHQ is available for download in form of a zip archive. If you want to try out RHQ without too much setup, you can also use a pre-created Docker image
from https://registry.hub.docker.com/u/rhqproject/rhq-nodb/ (the link contains setup information).

Please consult the release notes for further details and a download link.

Maven artifacts will soon be available on Maven Central.

Special thanks goes to

  • Alan Santos
  • Andreas Veithen
  • Elias Ross
  • Jérémie Lagarde

for their code contributions for this release.

Friday, October 10, 2014

WildFly subsystem for RHQ Metrics

For RHQ-Metrics I have started writing a subsystem for WildFly 8 that is able to collect metrics inside WildFly and then send them at regular intervals (currently every minute) to a RHQ-Metrics server.


The next graph is a visualization with Grafana of the outcome when this sender was running for 1.5 days in a row:

Graphs of JVM memory usage
Graphs of JVM memory usageWildFly memory usage



( It is interesting to see how the JVM is fine tuning its memory requirement over time and using less and less memory for this constant workload ).


The following is a visualization of the setup:


Setup


The sender is running as a subsystem inside WildFly and reading metrics from the WildFly management api. The gathered metrics are then pushed via REST to RHQ-Metrics. Of course it is possible to send them to a RHQ-Metrics server that is running on a separate host.


The configuration of the subsystem looks like this:

<subsystem xmlns="urn:org.rhq.metrics:wildflySender:1.0">
<rhqm-server
name="localhost"
enabled="true"
port="8080"
token="0x-deaf-beef"/>
<metric name="non-heap"
path="/core-service=platform-mbean/type=memory"
attribute="non-heap-memory-usage"/>
<metric name="thread-count"
path="/core-service=platform-mbean/type=threading"
attribute="thread-count"/>
</subsystem>

As you see, the path to the DMR resource and the name of the attribute to be monitored as metrics can be given in the configuration.


The implementation is still basic at the moment - you can find the source code in the RHQ-Metrics repository on GitHub. Contributions are very welcome.

Heiko Braun and Harald Pehl are currently working on optimizing the scheduling with individual intervals and possible batching of requests for managed servers in a domain.


Many thanks go to Emmanuel Hugonnet, Kabir Khan and especially Tom Cerar for their help to get me going with writing a subsystem, which was pretty tricky for me. The parsers, the object model and the XML had a big tendency to disagree with each other :-)

Sunday, August 24, 2014

Alert definition templates in plugin(descriptor)s ?

Hey,

his is a more general question and not tied to a specific version of RHQ (but may become part of a future version of RHQ and/or RHQ-alerts).

Do you feel it would make sense to provide alert templates inside the plugin (descriptor) to allow the plugin writer to pre-define alert definitions for certain resource types / metrics? Plugin-writers know best what alert definitions and conditions make sense and should get the power to pre-define them.

This idea would probably work best with some relative metrics like disk is 90% full as opposed to absolute values that probably depend a lot more on concrete customer scenarios (e.g. heap usage over 64MB may be good for small installations, but not for large ones).
In the future with RHQ-alerts, it should also be possible to compare two metrics with each other, which will allow to say "if metric(disk usage) > 90% of metric(disk capacity) then ...".

I've scribbled the idea down in the Wintermute page on the RHQ wiki.

If you think this is useful, please respond here or on the wiki page. Best is if you could add a specific example.

Thursday, August 14, 2014

RHQ-Alerts aka Alerts 2.0 aka Wintermute

The RHQ-team has been thinking about next generation Alerting for RHQ for a while.

I have written my thoughts down in a blog post in the RHQ space of the JBoss
community site in order to start the (design) process:

Thoughts on RHQ-Alerts aka Alerts 2.0 aka Wintermute.

Please feel free to comment here or there :)

Tuesday, July 29, 2014

Java Forum Stuttgart 2014

tl;dr: Great as every year.

On July 17th, I attended the Java Forum Stuttgart conference, a one day conference with 49 talks in 7 tracks. This years edition marked a new record with 1600 attendees from all over Germany (and probably also from other countries).

Conference itself was awesome as every year with good talks and good food, 30+ exhibition stands and some free beer at the end :) The conference is organized by the Java User Group Stuttgart e.V. and as such is a non-profit event, which for sure partially explains the success.

I was lucky that my talk "Slim Fast" about a "diet for the application memory footprint" was accepted and I was positively surprised that the room was full despite the fact that the abstract did not contain any buzzwords like IoT or Cloud :-) (German) slides are attached to the above link, JUG Stuttgart now also has posted some pictures.

JUG Stuttgart also posted more pictures of sessions and also some general impression on their G+ page<./p>

Other talks that I've attended were:

"Java FX everywhere" by Gerrit Grunwald, who showed an app that he wrote for the desktop with Java FX and then ported over to devices like Raspberry PI, Android, iPad etc.

"Android ist anders - Android dependency management". This was about dependency management for Android - and for all those that blame Maven hell, check this out and you will like plain Maven afterwards :)

"IBM BlueMix". In this talk a guy from IBM explained a bit what BlueMix is and then demoed creating an app from a simple template and deploying it to a live BlueMix server in the cloud.

"TypeScript - JavaScript für Java Entwickler". Kai Toedter showed the TypeScript language as an alternative to JavaScript. While I was a bit skeptical about "yet another language", the talk totally made me change my mind and I can really see it as good alternative for future UI-work.





Sunday, June 15, 2014

RHQ-Metrics and Grafana (updated)

As you may know I am currently working on RHQ-Metrics, a time series database with some charting extensions that will be embeddable into Angular apps.

One of the goals of the project is also to make the database also available for other consumers that may have other input sources deployed as well as their own graphing.

Over the weekend I looked a bit at Grafana which looks quite nice, so I started to combine RHQ-metrics with Grafana.

I did not (immediately) find a description of the data format between Grafana and Graphite, but saw that InfluxDB is supported, so I fired up WireShark and was able to write an endpoint for RHQ-Metrics that supports listing of metrics and retrieving of data.

The overall setup looks like this:

RHQ metrics grafana setup
Setup used

A Ganglia gmond emits data over IP multicast, which is received with the protocol translator client, ptrans. This translates the data into requests that are pushed into the RHQ-metrics system (shown as the dotted box) and there stored in the Cassandra backend.

Grafana then talks to the RHQ-Metrics server and fetches the data to be displayed.
For this to work, I had to comment out the graphiteUrl in conf.js and use the influx backend like this:

 datasources: {
influx: {
default: true,
type: 'influxdb',
username: 'test',
password: 'test',
url: 'http://localhost:8080/rhq-metrics/influx',
}
},

As you can see, the URL points to the RHQ-Metrics server.

The code for the new handler is still "very rough" and thus not yet in the normal source repository. To use it, you can clone my version of the RHQ-Metrics repository and then run start.sh in the root directory to get the RHQ-Metrics server (with the built-in memory backend) running.

Update

I have now added code (still in my repository) to support some of the aggregating functions of Influx like min, max, mean, count and sum. The following shows a chart that uses those functions:

Wednesday, May 07, 2014

RHQ 4.11 released


I am proud to announce the immediate availability of RHQ 4.11.

As always this release contains new features and bug fixes.

Notable changes are:

  • Plugins can now define plugin specific DynaGroup expressions, which can even be auto-activated
  • Further agent footprint reduction for jmx-clients (especially if the agent runs on jdk 1.6)
  • Much improved agent install via ssh from the RHQ UI.
  • Support for Oracle 12 as backend database
  • New login screen that follows patternfly.org

As always RHQ is available for download in form of a zip archive. If you want to try out RHQ without too much setup, you can also use a pre-created Docker image
from https://index.docker.io/u/gkhachik/rhq-fedora.20/ (the link contains setup information).

Please consult the release notes for further details and a download link.

If you only want to have a quick look, you can also consult a
running RHQ 4.11 instance that we have set up. User/Pass are guest / rhqguest

Special thanks goes to

  • Elias Ross
  • Michael Burman

for their code contributions for this release.

My first Netty-based server

Netty logo
 
RHQ logo

For our rhq-metrics project (and actually for RHQ proper) I wanted to be able to receive syslog messages, parse them and in case of content in the form of


type=metric thread.count=11 thread.active=1 heap.permgen.size=20040000

forward the individual key-value pairs (e.g. thread.count=11) to a rest endpoint for further processing.

After I started with the usual plain Java ServerSocket.. code I realized that this may become cumbersome and that this would be the perfect opportunity to check out Netty. Luckily I already had an Early Access copy of Netty in Action lying around and was ready to start.

Writing the first server that listens on TCP port 5140 to receive messages was easy and modifying /etc/syslog.conf to forward messages to that port as well.

Unfortunately I did not get any message.

I turned out that on OS/X syslog message forwarding is only UDP. So I sat down, converted the code to use UDP and was sort of happy. I was also able to create the backend connection to the REST server, but here I was missing the piece on how to "forward" the incoming data to the rest-client connection.

Luckily my colleague Norman Maurer, one of the Netty gurus helped me a lot and so I got that going.

After this worked, I continued and added a TCP server too, so that it is now possible to receive messages over TCP and UDP.

The (current) final setup looks like this:

Overviw of my Netty app

I need two different decoders here, as for TCP the (payload) data received is directly contained in a ByteBuf while for UDP it is inside a DatagramPacker and needs to be pulled out into a ByteBuf for further decoding.

One larger issue I had during development (aside from wrapping my head around "The Netty Way" of doing things) was that I wrote a Decoder like


public class UdpSyslogEventDecoder extends
MessageToMessageDecoder<DatagramPacket>{

protected void decode(ChannelHandlerContext ctx, DatagramPacket msg, List<Object> out) throws Exception {

and it never got called by Netty. I could see that it was instantiated when the pipeline was set up, but for incoming messages it was just bypassed.

After I added a method


public boolean acceptInboundMessage(Object msg) throws Exception {
return true;
}

my decoder was called, but Netty was throwing ClassCastExceptions, but they were pretty helpful, as it turned out that the IDE automatically imported java.net.DatagramPacket, which is not what Netty wants here. As Netty checks for the signature of the decode() and this did not match, Netty just ignored the decoder.

You can find the source of this server on GitHub.