Merge pull request #8 from olblak/kubernetes-004

Update logging workflow
R. Tyler Croy 2017-01-17 14:48:05 -08:00 committed by GitHub
commit b2906a4df9
1 changed files with 21 additions and 67 deletions


@@ -129,40 +129,43 @@ link:http://www.microsoft.com/en-us/cloud-platform/operations-management-suite[O
The Fluentd container will redirect logs based on rules defined by the log's type.
As a first iteration, we identify two log types, archive and stream.
They are explained below.
==== Type
===== Stream
'Stream' means that logs are sent directly to Log Analytics,
where they will be available for a short time period (7 days).
After that they will be permanently deleted.
Reasons why we consider logs as 'stream' are:
* Costs: we don't want to pay to store useless logs
* Debugging: we may have to analyze an application's behaviour
In order to retrieve log information, we'll need access to the Log Analytics dashboard.
*This is the default behaviour followed by all log types.*
===== Archive
'Archive' means that we want to access logs over a long time period.
Reasons why we may consider logs as 'archive' are:
* Need access over a long time period
* Want to back up important information
'Archive' logs will be stored in Azure containers for an undetermined period.
In order to retrieve them, we'll have to request compressed archives from an admin.
N.B:
* We prefer to use Azure containers over Azure Shared Disk as we can access logs without having to mount Azure shared disks.
* At the moment the fluentd plugin for Azure storage doesn't work as expected (this should be fixed), so we use an Azure shared disk for now (a sketch of such an archive output follows).
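As an illustration only, a minimal fluentd output for the 'archive' case could write compressed log chunks to a path where the Azure shared disk is mounted; the tag, path and parameters below are assumptions, not the actual configuration:
....
# Hypothetical archive output: buffer logs tagged 'archive.*' and write them
# as gzip-compressed, date-sliced files onto the mounted Azure shared disk.
<match archive.**>
  @type file
  path /mnt/azure-share/logs/archive
  compress gzip
  time_slice_format %Y%m%d
</match>
....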
==== Logging Workflow
Logs work as follows:
* Docker containers write logs to JSON files located in /var/log/containers/ on each Kubernetes agent (a fluentd source sketch that tails these files is shown after the diagram below)
@@ -194,61 +197,13 @@ and apply some 'rules'
+--------------------------------------------------------------+
....
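As a rough sketch of this first step (paths and tag are assumptions based on a common Kubernetes setup, not necessarily the exact configuration used here), the fluentd source could look like:
....
# Hypothetical source: tail the JSON log files written by Docker on each
# Kubernetes agent and tag the records for later routing.
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  format json
  read_from_head true
</source>
....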
In order to know which workflow needs to be applied, we can follow one of the 3 following strategies.
*We must agree on one of them.*
.1) We search for patterns inside log files.
We can use the fluentd plugin http://docs.fluentd.org/articles/filter_grep[filter_grep] to search for log patterns.
Example:
Default apache access log
....
127.0.0.1 - - [05/Feb/2012:17:11:55 +0000] "GET / HTTP/1.1" 200 140 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.5 Safari/535.19"
....
Apache access log with type information
....
TO_ARCHIVE 127.0.0.1 - - [05/Feb/2012:17:11:55 +0000] "GET / HTTP/1.1" 200 140 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.5 Safari/535.19"
....
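A rough sketch of what this could look like with filter_grep, assuming Docker's json-file driver stores the message in a 'log' field (the field name and tag are assumptions; older fluentd versions use the regexpN parameter style instead of the <regexp> block):
....
# Hypothetical grep filter: keep only records whose 'log' field carries the
# TO_ARCHIVE marker, so they can later be routed to the archive output.
<filter kubernetes.**>
  @type grep
  <regexp>
    key log
    pattern TO_ARCHIVE
  </regexp>
</filter>
....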
Pros:
* More flexible, we can define the log type per log level
Cons:
* Error prone, fluentd doesn't know the format or content of the parsed logs.
* More configuration: we need to modify the default log configuration.
* More work for contributors
* We need logging knowledge for each application
* We must rebuild the docker image if we want to change the log type
.2) We define the log type based on the container's name.
Container 'fluentd-2ae2r' becomes 'archive_fluentd-2ae2r'.
Or we can use numbers: 'fluentd-2ae2r' becomes '0_fluentd-2ae2r',
where '0' means 'archive' by convention.
Pros:
* We don't have to modify the default logging configuration.
* Contributors only have to define the log type in the container's name.
* Easy to handle from the fluentd configuration.
* We don't have to rebuild the docker image when we change the log type.
Cons:
* We can't update a container's log type at runtime
* We can't have several types of logs within an application
* We add meaning to containers' names
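As an illustration only, routing on the name prefix could look roughly like this, assuming a tail source as sketched earlier so that the file name (and therefore the pod name) ends up in the tag; the tag layout is an assumption and the outputs are placeholders for the real Azure ones:
....
# Hypothetical name-based routing: pods named 'archive_*' are matched first,
# everything else falls through to the 'stream' output. The exact match
# pattern depends on how tags are built from the log file names.
<match kubernetes.var.log.containers.archive_*.log>
  @type file                    # placeholder for the Azure storage output
  path /mnt/azure-share/logs/archive
</match>
<match kubernetes.**>
  @type stdout                  # placeholder for the Log Analytics output
</match>
....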
.3) We define the log type based on a container label.
We define a label 'log_type' with the default value set to 'stream'.
If we want to archive logs, we can update the value to 'archive'.
If 'log_type' == 'archive', we apply the 'archive' workflow;
otherwise we apply the 'stream' workflow.
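A minimal sketch of how this could be wired in fluentd, assuming the kubernetes_metadata and rewrite-tag-filter plugins are available; the nested-key syntax and tags are assumptions and depend on the plugin versions used:
....
# Hypothetical label-based routing: enrich records with pod metadata, then
# retag them on the 'log_type' label so each workflow can be matched.
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $.kubernetes.labels.log_type   # label set on the pod
    pattern ^archive$
    tag archive.${tag}
  </rule>
  <rule>
    key $.kubernetes.labels.log_type
    pattern .*                         # default: treat as 'stream'
    tag stream.${tag}
  </rule>
</match>
....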
Pros:
@@ -258,11 +213,10 @@ pros:
* Easy to handle from the fluentd configuration.
Cons:
* We can't have different log types within an application

I think the best way to go would be to use labels.
A docker image that implements this workflow can be found in Olivier Vernin's
link:https://github.com/olblak/fluentd-k8s-azure[fluentd-k8s-azure]
repository.