Auditing added content in Alfresco repository II

Alfresco Activity Console in Elastic Search

In the last post about "Auditing added content in Alfresco repository", we talked about how to create a simple audit console based on an Alfresco behaviour. Let's go into some detail about this example. The idea is quite simple: for each Alfresco content-creation event, the behaviour writes an event log entry to catalina.out, such as the following one:

2019-04-15 08:50:02,346  DEBUG [repo.behavior.AuditContentCreatedBehavior] [http-apr-8080-exec-5] version created => uuid=845e3c4e-197a-472a-9b85-cf3a31ad7ae0 name=liferayDXP7.0-front-end-developer-book-letter-7.0.5.1.pdf mimetype=application/pdf size=63709958 path=/Espacio de empresa/Sitios/portal/documentLibrary/Recursos/Certificación FrontEnd Developer DXP 7.0, 7.1/liferayDXP7.0-front-end-developer-book-letter-7.0.5.1.pdf site=portal creator=aiu001 type=cm:content created=2019-04-11T12:52:41.661+02:00 version=1.0


In the log, we can see some information about an Alfresco node, for example, the uuid, filename, mimetype, path, site, creator, version or content type. These fields will later define the index schema in Elastic Search.
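A possible index mapping for these fields, runnable from the Kibana development console, could look like the following sketch (the index name alfresco-audit and the single doc mapping type are assumptions for illustration; the field names match the grok captures in the Logstash filter shown below):

```
PUT alfresco-audit
{
  "mappings": {
    "doc": {
      "properties": {
        "alf-uuid":     { "type": "keyword" },
        "alf-file":     { "type": "keyword" },
        "alf-mimetype": { "type": "keyword" },
        "alf-size":     { "type": "long" },
        "alf-path":     { "type": "text" },
        "alf-site":     { "type": "keyword" },
        "alf-creator":  { "type": "keyword" },
        "alf-type":     { "type": "keyword" },
        "alf-created":  { "type": "date" },
        "version":      { "type": "keyword" }
      }
    }
  }
}
```

Using keyword for fields such as alf-site or alf-creator lets us aggregate on them in Kibana visualizations (terms aggregations), while alf-size as long and alf-created as date enable sum metrics and time-based charts.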

A Filebeat agent ships the corresponding log to a Logstash server, which parses the incoming lines. Logstash extracts the fields according to a grok expression in the filter section of the Logstash configuration, and the result is then indexed in Elastic Search. To test this type of grok expression, we can use the Kibana development console, for example. The grok expression is something like this:

filter {

    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:logdate}\s*%{LOGLEVEL:logLevel}\s*%{NOTSPACE:class}\s*%{NOTSPACE:thread}\s*%{NOTSPACE:myclass}\s*%{GREEDYDATA:alf-event}\s*=>\s*uuid=%{NOTSPACE:alf-uuid}\s*name=%{NOTSPACE:alf-file}\s*mimetype=%{NOTSPACE:alf-mimetype}\s*size=%{INT:alf-size}\s*path=%{GREEDYDATA:alf-path}\s*site=%{NOTSPACE:alf-site}\s*creator=%{NOTSPACE:alf-creator}\s*type=%{NOTSPACE:alf-type}\s*created=%{TIMESTAMP_ISO8601:alf-created}\s*version=%{NOTSPACE:version}" ]
      add_tag => [ "alf-audit-log"]
    }

}
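Before wiring everything together, it helps to verify that the pattern really captures each field from a sample line. A minimal Python sketch (a hand-written regex roughly equivalent to the grok expression above, with simplified group names; this is an illustration, not the production pipeline):

```python
import re

# The sample audit line from catalina.out shown earlier in the post
LINE = (
    "2019-04-15 08:50:02,346  DEBUG "
    "[repo.behavior.AuditContentCreatedBehavior] [http-apr-8080-exec-5] "
    "version created => uuid=845e3c4e-197a-472a-9b85-cf3a31ad7ae0 "
    "name=liferayDXP7.0-front-end-developer-book-letter-7.0.5.1.pdf "
    "mimetype=application/pdf size=63709958 "
    "path=/Espacio de empresa/Sitios/portal/documentLibrary/Recursos/"
    "Certificación FrontEnd Developer DXP 7.0, 7.1/"
    "liferayDXP7.0-front-end-developer-book-letter-7.0.5.1.pdf "
    "site=portal creator=aiu001 type=cm:content "
    "created=2019-04-11T12:52:41.661+02:00 version=1.0"
)

# Rough Python equivalent of the grok pattern: timestamp, level, class,
# thread, event description, then the key=value audit fields.
PATTERN = re.compile(
    r"(?P<logdate>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<logLevel>\w+)\s+"
    r"\[(?P<clazz>[^\]]+)\]\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<event>.+?)\s*=>\s*"
    r"uuid=(?P<uuid>\S+)\s+name=(?P<name>\S+)\s+mimetype=(?P<mimetype>\S+)\s+"
    r"size=(?P<size>\d+)\s+path=(?P<path>.+?)\s+site=(?P<site>\S+)\s+"
    r"creator=(?P<creator>\S+)\s+type=(?P<type>\S+)\s+"
    r"created=(?P<created>\S+)\s+version=(?P<version>\S+)"
)

match = PATTERN.match(LINE)
fields = match.groupdict()
print(fields["site"], fields["mimetype"], fields["size"])
# portal application/pdf 63709958
```

Note that path is captured lazily up to the following " site=" token, so paths containing spaces (as in the example above) are handled, mirroring the GREEDYDATA capture in the grok pattern.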


So, this is what we built:

  • Filebeat shipper (in Alfresco node) --> Logstash --> Elastic Search <-- Kibana (for visualization)

A similar model can be built with SOLR:

  • Filebeat shipper (in Alfresco node) --> Logstash --> SOLR <-- Apache Zeppelin (for visualization)
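On the shipping side, the Filebeat configuration is straightforward. A minimal sketch (the catalina.out path and the Logstash host are assumptions for illustration; depending on the Filebeat 6.x minor version, the inputs section may be named filebeat.prospectors instead):

```
filebeat.inputs:
  - type: log
    paths:
      - /opt/alfresco/tomcat/logs/catalina.out
    # Join continuation lines (e.g. stack traces) to the preceding
    # line that starts with a timestamp
    multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    multiline.negate: true
    multiline.match: after

output.logstash:
  hosts: ["logstash.example.com:5044"]
```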

In a first approach to the problem, we preferred SOLR over Elastic Search, because it is closer to the Alfresco architecture. However, Kibana's time-based visualization capabilities are better suited for log search analysis. Besides, the latest 6.x versions of the ELK stack bring many improvements to Kibana: console tools for managing Elastic indices and index patterns, filter controls in dashboards, and query autocompletion. Dashboards are easy to build from saved searches and visualizations, providing interesting information about Alfresco usage, from a general perspective down to particular insights: for example, which users are the most active per site, which sites are the most used, or which content types are the most common in your organization.

Below, we show the main dashboard of the site auditing console, including site controls for filtering, a tag cloud and the following visualizations:

  • Number of documents by site (bar)
  • Volume storage by site (pie)
  • Number of documents by site and user (stacked bar)
  • Number of documents by site and mimetype (stacked bar)
  • Number of documents per user and site (heatmap)
  • Volume per user and site (heatmap)
  • Alfresco Audit Logs (table of added content)
