JMX information has been available since Alfresco Enterprise 3.2, which gives external monitoring tools such as Nagios/Icinga many ways to check Alfresco variables. One example is the well-known Nagios/Icinga plugin for monitoring Alfresco.
The most interesting information from this plugin is only available for Enterprise Edition (EE), although the general monitoring commands (not JMX-based) can be used with Community Edition (CE) too. For example, in an Alfresco CE installation we use some Nagios plugins like:
The other day we tested the Alfresco OOTB Support Tools addon for Community Edition. This addon provides some of the same useful information about the JVM, threads and logged-in users, and it can be consumed from the OOTB webscripts with the curl command to generate alerts and graphs. We can use the JSON output of the webscripts (thanks to Axel Faust for showing me this) in shell scripts like these:
check_ootb_active_sessions.sh
#!/bin/bash
#
# Author: Cesar Capillas
#
# https://github.com/CesarCapillas
#

# Show usage and exit with UNKNOWN (3) when called without arguments
if [[ "$1" == "" ]]; then
  echo "USAGE:"
  echo "  check_ootb_active_sessions.sh <SERVER> <PORT> <USERNAME> <PASSWORD> <VAR> <WARNING> <CRITICAL>"
  echo
  echo "  where VAR=[NumActive|MaxActive|NumIdle|UserCountNonExpired|TicketCountNonExpired]"
  echo
  exit 3
fi

SERVER=$1
PORT=$2
USERNAME=$3
PASSWORD=$4
VAR=$5
WARNING=${6:-100}
CRITICAL=${7:-200}

if [ "$PORT" = "443" ]; then
  PROTOCOL="https"
else
  PROTOCOL="http"
fi

# Endpoint for Alfresco CE with OOTB Support Tools
ENDPOINT="$PROTOCOL://${SERVER}:${PORT}/alfresco/service/ootbee/admin/active-sessions?format=json"
# Endpoint for Alfresco EE with Support Tools addon
#ENDPOINT="$PROTOCOL://${SERVER}:${PORT}/alfresco/service/enterprise/admin/admin-activesessions?format=json"

# Fetch the JSON and check that the requested variable is present
CURL=$(curl --silent -u "${USERNAME}:${PASSWORD}" -X GET "${ENDPOINT}")
CHCK=$(echo "$CURL" | grep "$VAR")

if [[ "$CHCK" == "" ]]; then
  CHECK="Failed"
else
  CHECK="OK"
  ACTIVE_SESSION_VAR=$(echo "$CURL" | jshon -e "$VAR" | sed 's/"//g')
fi

if [[ "$CHECK" == "OK" ]]; then
  if (( ACTIVE_SESSION_VAR > CRITICAL )); then
    echo "CRITICAL: $VAR = $ACTIVE_SESSION_VAR (>$CRITICAL)"
    exit 2
  fi
  if (( ACTIVE_SESSION_VAR > WARNING )); then
    echo "WARNING: $VAR = $ACTIVE_SESSION_VAR (>$WARNING)"
    exit 1
  fi
  echo "INFO: Sessions ($VAR) = $ACTIVE_SESSION_VAR"
  exit 0
elif [[ "$CHECK" == "Failed" ]]; then
  echo "CRITICAL: ${SERVER}"
  exit 2
else
  echo "Check failed."
  exit 3
fi
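To make the JSON step a bit more concrete, here is a minimal sketch of how one variable can be pulled out of the webscript reply. The function name json_get and the sample payload are made up for illustration (I am assuming a flat JSON object with the variable names as top-level keys, which is what the jshon call in the script implies). When jshon is not installed, the sketch falls back to a simple sed expression, which is only good enough for flat objects.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original plugin):
# print the value of one top-level key of a flat JSON object.
json_get() {
  local json=$1 key=$2
  if command -v jshon >/dev/null 2>&1; then
    # Same jshon call the plugin uses; tr strips quotes around string values
    echo "$json" | jshon -e "$key" | tr -d '"'
  else
    # Fallback for flat objects only; this is not a general JSON parser
    echo "$json" | sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}]*\)"\{0,1\}.*/\1/p'
  fi
}

# Made-up sample reply; the real webscript output may contain other fields
SAMPLE='{"NumActive":3,"MaxActive":275,"NumIdle":2,"UserCountNonExpired":1,"TicketCountNonExpired":4}'

json_get "$SAMPLE" NumActive               # prints 3
json_get "$SAMPLE" TicketCountNonExpired   # prints 4
```

A key that does not exist prints nothing on standard output, which is the case the plugin reports as a failed check.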
check_ootb_performance_stats.sh
#!/bin/bash
#
# Author: Cesar Capillas
#
# https://github.com/CesarCapillas
#
# License: see accompanying LICENSE file
#

# Show usage and exit with UNKNOWN (3) when called without arguments
if [[ "$1" == "" ]]; then
  echo "USAGE:"
  echo "  check_ootb_performance_stats.sh <SERVER> <PORT> <USERNAME> <PASSWORD> <VAR> <WARNING> <CRITICAL>"
  echo
  echo "  where VAR=[MaxMemory|TotalMemory|UsedMemory|FreeMemory|ProcessLoad|SystemLoad|ThreadCount|PeakThreadCount]"
  echo
  exit 3
fi

SERVER=$1
PORT=$2
USERNAME=$3
PASSWORD=$4
VAR=$5
WARNING=${6:-10000}
CRITICAL=${7:-10000}

if [ "$PORT" = "443" ]; then
  PROTOCOL="https"
else
  PROTOCOL="http"
fi

# Endpoint for Alfresco CE with OOTB Support Tools
ENDPOINT="$PROTOCOL://${SERVER}:${PORT}/alfresco/service/ootbee/admin/admin-performance?format=json"
# Endpoint for Alfresco EE with Support Tools addon
#ENDPOINT="$PROTOCOL://${SERVER}:${PORT}/alfresco/service/enterprise/admin/admin-performance?format=json"

# Most useful are UsedMemory (JVM) and ThreadCount
# Memory is in MB, e.g. 4096M
# Load is a percentage

# Fetch the JSON and check that the requested variable is present
CURL=$(curl --silent -u "${USERNAME}:${PASSWORD}" -X GET "${ENDPOINT}")
CHCK=$(echo "$CURL" | grep "$VAR")

if [[ "$CHCK" == "" ]]; then
  CHECK="Failed"
else
  CHECK="OK"
  PERFORMANCE_VAR=$(echo "$CURL" | jshon -e "$VAR")
fi

if [[ "$CHECK" == "OK" ]]; then
  if (( PERFORMANCE_VAR > CRITICAL )); then
    echo "CRITICAL: $VAR = $PERFORMANCE_VAR (>$CRITICAL)"
    exit 2
  fi
  if (( PERFORMANCE_VAR > WARNING )); then
    echo "WARNING: $VAR = $PERFORMANCE_VAR (>$WARNING)"
    exit 1
  fi
  echo "INFO: $VAR = $PERFORMANCE_VAR"
  exit 0
elif [[ "$CHECK" == "Failed" ]]; then
  echo "CRITICAL: ${SERVER}"
  exit 2
else
  echo "Check failed."
  exit 3
fi
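Both scripts end with the same threshold logic, mapped to the usual Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). As a sketch, that part could be factored into one function. The name check_threshold is hypothetical and not part of the original plugins. I use awk for the comparison here instead of bash (( )), because bash arithmetic only handles integers and a load value might come back with decimals.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original plugins).
# Prints a Nagios status line and returns the matching exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
check_threshold() {
  local name=$1 value=$2 warning=$3 critical=$4

  # Reject anything that is not a plain (optionally decimal) number
  if ! [[ "$value" =~ ^-?[0-9]+([.][0-9]+)?$ ]]; then
    echo "UNKNOWN: $name = '$value' is not numeric"
    return 3
  fi

  # awk compares as floating point, so 75.5 > 75 works as expected
  if awk -v v="$value" -v t="$critical" 'BEGIN { exit !(v > t) }'; then
    echo "CRITICAL: $name = $value (>$critical)"
    return 2
  fi
  if awk -v v="$value" -v t="$warning" 'BEGIN { exit !(v > t) }'; then
    echo "WARNING: $name = $value (>$warning)"
    return 1
  fi

  echo "OK: $name = $value"
  return 0
}
```

In a plugin it would be called as the last step, for example check_threshold "$VAR" "$PERFORMANCE_VAR" "$WARNING" "$CRITICAL"; exit $?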
The two scripts above depend on the following commands:
curl
jshon
The corresponding command definitions for Nagios look like this:
ootb-commands.cfg
define command {
    command_name check_performance_stats
    command_line /usr/lib/nagios/plugins/check_ootb_performance_stats.sh '$ARG1$' '$ARG2$' '$ARG3$' '$ARG4$' '$ARG5$' '$ARG6$' '$ARG7$'
}

define command {
    command_name check_active_sessions
    command_line /usr/lib/nagios/plugins/check_ootb_active_sessions.sh '$ARG1$' '$ARG2$' '$ARG3$' '$ARG4$' '$ARG5$' '$ARG6$' '$ARG7$'
}
And finally we define the services for an Alfresco host (alf5) in the following file:
services_ootb.cfg
define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Number of active database connections
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_active_sessions!alfie.zylk.net!443!monitor!secret!NumActive!15!20
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Number of logged users
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_active_sessions!alfie.zylk.net!443!monitor!secret!UserCountNonExpired!15!20
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Number of tickets
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_active_sessions!alfie.zylk.net!443!monitor!secret!TicketCountNonExpired!15!20
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] JVM Used Memory
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_performance_stats!alfie.zylk.net!443!monitor!secret!UsedMemory!3500!4000
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Number of Threads
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_performance_stats!alfie.zylk.net!443!monitor!secret!ThreadCount!225!250
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Process Load
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_performance_stats!alfie.zylk.net!443!monitor!secret!ProcessLoad!75!85
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] System Load
    max_check_attempts 3
    normal_check_interval 10
    retry_check_interval 3
    check_command check_performance_stats!alfie.zylk.net!443!monitor!secret!SystemLoad!85!95
}
Similar webscripts exist in Antonio Soler's original Support Tools work, so it is quite simple to change the corresponding webscript endpoints in the shell scripts. A safer alternative to direct monitoring may be to run these scripts locally on the Alfresco server and expose the metrics via NRPE or custom SNMP commands.
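As a rough sketch of the NRPE variant, the plugin can be registered as an NRPE command on the Alfresco server and called from Nagios through check_nrpe. The nrpe.cfg location, the command name check_ootb_threads, the local port 8080 and the check_nrpe command definition are assumptions for illustration; adapt them to your installation.

```cfg
# On the Alfresco server, in nrpe.cfg (file location, command name and port 8080 are assumptions)
command[check_ootb_threads]=/usr/lib/nagios/plugins/check_ootb_performance_stats.sh localhost 8080 monitor secret ThreadCount 225 250

# On the Nagios server (assumes the check_nrpe plugin is installed in this path)
define command {
    command_name check_nrpe
    command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

define service {
    use generic-service
    host_name alf5
    service_description [OOTB] Number of Threads (via NRPE)
    check_command check_nrpe!check_ootb_threads
}
```

With this setup the monitoring credentials stay on the Alfresco server and are not passed as check arguments in the Nagios service definitions.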
And the result for this: