// SPDX-FileCopyrightText: 2018 M. Shulhan <ms@kilabit.info>
// SPDX-License-Identifier: GPL-3.0-or-later
= haminer
:toc:
:sectanchors:
:sectlinks:

Library and program to parse and forward HAProxy logs.

The HTTP log records each HTTP request that is received by an HAProxy
frontend and forwarded to a backend.
In the default format, a log line looks like this (split into multiple lines
for readability):

----
<158>Sep  4 17:08:47 haproxy[109530]: 185.83.144.103:46376
  [04/Sep/2022:17:08:47.264] www~ be_kilabit/kilabit-0.0/0/1/2/3 200 89 - -
  ---- 5/5/0/0/0 0/0 "GET / HTTP/1.1"
----

See
https://www.haproxy.com/documentation/hapee/1-8r1/onepage/#8.2.3[HTTP log format documentation]
for more information.
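
To make the layout of such a line more concrete, here is a minimal sketch in Go that extracts only the client address, the frontend name, and the request line.
It is not haminer's own parser: the `httpLog` type and the `parseLine` function are hypothetical names, and stripping the `~` suffix from the frontend name is a choice made for this example.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// httpLog holds a few of the values found in an HAProxy HTTP log line.
// The type and its field names are hypothetical, not haminer's API.
type httpLog struct {
	ClientIP   string
	ClientPort string
	Frontend   string
	HTTPMethod string
	HTTPURL    string
	HTTPProto  string
}

// parseLine extracts the client address, the frontend name, and the
// request line from a single HAProxy HTTP log line.
func parseLine(line string) (httpLog, error) {
	var l httpLog

	// Skip the syslog prefix, which ends at the first "]: ".
	_, rest, ok := strings.Cut(line, "]: ")
	if !ok {
		return l, fmt.Errorf("missing syslog prefix")
	}

	// The request line is the quoted string at the end of the log.
	head, req, ok := strings.Cut(rest, `"`)
	if !ok {
		return l, fmt.Errorf("missing request line")
	}
	req = strings.TrimSuffix(strings.TrimSpace(req), `"`)

	fields := strings.Fields(head)
	if len(fields) < 3 {
		return l, fmt.Errorf("too few fields")
	}

	// fields[0] is "ip:port", fields[1] is the accept date,
	// fields[2] is the frontend name.
	var err error
	l.ClientIP, l.ClientPort, err = net.SplitHostPort(fields[0])
	if err != nil {
		return l, err
	}
	// Assumption for this sketch: drop the "~" that HAProxy appends to
	// the frontend name for SSL/TLS connections.
	l.Frontend = strings.TrimSuffix(fields[2], "~")

	parts := strings.Fields(req)
	if len(parts) != 3 {
		return l, fmt.Errorf("malformed request line %q", req)
	}
	l.HTTPMethod, l.HTTPURL, l.HTTPProto = parts[0], parts[1], parts[2]
	return l, nil
}

func main() {
	line := `<158>Sep  4 17:08:47 haproxy[109530]: 185.83.144.103:46376 ` +
		`[04/Sep/2022:17:08:47.264] www~ be_kilabit/kilabit-0.0/0/1/2/3 200 89 - - ` +
		`---- 5/5/0/0/0 0/0 "GET / HTTP/1.1"`

	l, err := parseLine(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", l)
}
```

The sketch deliberately skips the timers, counters, and status fields, which are positional and need the full format description linked above.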

Currently, haminer can forward the parsed logs to two supported databases:
influxdb and questdb.
Haminer supports Influxdb v1 and v2.

----
 +---------+  UDP  +---------+      +-----------+
 | HAProxy |------>| haminer |----->| influxdb  |
 +---------+       +---------+      | / questdb |
                                    +-----------+
----

In Influxdb, the logs are stored in a measurement called `haproxy`.
In Questdb, the logs are stored in a table called `haproxy`.

The following fields are stored as tags (in Influxdb) or symbols (in Questdb):
host, server, backend, frontend, http_method, http_url, http_query,
http_proto, http_status, term_state, client_ip, client_port.

And the following fields are stored as fields (in Influxdb) or values (in
Questdb): time_req, time_wait, time_connect, time_rsp, time_all,
conn_active, conn_frontend, conn_backend, conn_server, conn_retries,
queue_server, queue_backend, bytes_read.
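
The split between tags and fields maps onto the InfluxDB line protocol, where one record is written as `measurement,tags fields timestamp`.
The following Go sketch shows how such a record could be assembled; it is an illustration only, not the code haminer uses.
The `toLineProtocol` function is a hypothetical name, and writing every field as an integer (with the `i` suffix) is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keyEscaper escapes the characters that the line protocol treats as
// separators inside tag keys, tag values, and field keys.
var keyEscaper = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)

// sortedKeys returns the keys of m in ascending order, so that the output
// is deterministic.
func sortedKeys[V any](m map[string]V) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// toLineProtocol formats one record as
// "measurement,tag=value,... field=valuei,... timestamp".
// The line protocol requires at least one field; this sketch does not
// check for that.
func toLineProtocol(measurement string, tags map[string]string,
	fields map[string]int64, tsNano int64) string {
	var sb strings.Builder
	sb.WriteString(measurement)

	for _, k := range sortedKeys(tags) {
		fmt.Fprintf(&sb, ",%s=%s", keyEscaper.Replace(k), keyEscaper.Replace(tags[k]))
	}

	sep := " "
	for _, k := range sortedKeys(fields) {
		// The "i" suffix marks the value as an integer.
		fmt.Fprintf(&sb, "%s%s=%di", sep, keyEscaper.Replace(k), fields[k])
		sep = ","
	}

	fmt.Fprintf(&sb, " %d", tsNano)
	return sb.String()
}

func main() {
	tags := map[string]string{
		"backend":     "be_kilabit",
		"http_method": "GET",
		"http_status": "200",
	}
	fields := map[string]int64{
		"bytes_read": 89,
		"time_all":   3,
	}
	// An example timestamp, in nanoseconds since the Unix epoch.
	fmt.Println(toLineProtocol("haproxy", tags, fields, 1662311327264000000))
}
```

Tags are indexed and stored as strings, which is why low-cardinality values such as `http_method` fit there, while the timers and counters are numeric fields.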

Once enough logs have accumulated, we can query the data.
For example, with Questdb we can count the visits to each URL using the
following query,

----
select backend, http_url, count(*) as visit from 'haproxy'
group by backend, http_url
order by visit desc;
----
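
Besides its web console, QuestDB also accepts SQL over its REST API through the `/exec` endpoint.
The Go sketch below only builds the request URL for a query like the one above.
The base address `http://127.0.0.1:9000` is an assumption about a local QuestDB instance, and `execURL` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// execURL returns the URL that runs query through the "/exec" endpoint of
// the QuestDB REST API listening at base.
func execURL(base, query string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/exec"
	// Encode the SQL text as the "query" parameter.
	u.RawQuery = url.Values{"query": {query}}.Encode()
	return u.String(), nil
}

func main() {
	query := "select backend, http_url, count(*) as visit from 'haproxy' " +
		"group by backend, http_url order by visit desc"

	// Assumption: QuestDB runs locally with its REST API on port 9000.
	u, err := execURL("http://127.0.0.1:9000", query)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

The printed URL can then be fetched with any HTTP client; the response is a JSON document that contains the result rows.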

==  Installation

===  Building from source

*Requirements*

* https://golang.org[Go^] for building from source code
* https://git-scm.com/[git^] for downloading source code

Get the source code using git,

----
$ git clone https://git.sr.ht/~shulhan/haminer
$ cd haminer
$ make
----

The build produces a binary named `haminer` in the current directory.


===  Pre-built package

The Arch Linux package is available at build.kilabit.info.
Add the following repository to your pacman.conf,

----
[build.kilabit.info]
Server = https://build.kilabit.info/aur
----

To install it,

	$ sudo pacman -Sy --noconfirm haminer-git


== Configuration

By default, haminer loads its configuration from `/etc/haminer.conf`, unless
another file is specified when running the program.

See
https://git.sr.ht/~shulhan/haminer/tree/main/item/cmd/haminer/haminer.conf[haminer.conf^]
for an example configuration with an explanation of each option.


===  Forwarders

Currently, haminer can forward the parsed logs to two supported databases:
influxdb and questdb.
Haminer supports Influxdb v1 and v2.

====  Influxdb v1

For v1, you need to create the user and database first,

----
$ influx
> CREATE USER "haminer" WITH PASSWORD 'haminer'
> CREATE DATABASE haminer
> GRANT ALL ON haminer TO haminer
----

Example of forwarder configuration,

----
[forwarder "influxd"]
version = v1
url = http://127.0.0.1:8086
bucket = haminer
user = haminer
password  = haminer
----

====  Influxdb v2

For v2, you need to create the bucket first,

----
$ sudo influx bucket create \
	--name haminer \
	--retention 30d
----

Example of forwarder configuration,

----
[forwarder "influxd"]
version = v2
url = http://127.0.0.1:8086
org = $org
bucket = haminer
token = $token
----

====  Questdb

For Questdb, the configuration is quite simple,

----
[forwarder "questdb"]
url = udp://127.0.0.1:9009
----

We do not need to create the table; questdb will handle that automatically.
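
To illustrate what forwarding over UDP involves, the following Go sketch writes one line-protocol record as a single datagram.
It is not haminer's forwarder: `sendLine` and `loopback` are hypothetical names, and the example sends to a throwaway local listener instead of a real questdb instance so that it can run on its own.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sendLine writes one newline-terminated record to addr as a single UDP
// datagram. addr is "host:port", without the "udp://" scheme used in the
// configuration file.
func sendLine(addr, line string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = conn.Write([]byte(line + "\n"))
	return err
}

// loopback sends line to a temporary listener on 127.0.0.1 and returns the
// datagram that arrived. It stands in for the database in this sketch.
func loopback(line string) (string, error) {
	pc, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer pc.Close()

	if err := sendLine(pc.LocalAddr().String(), line); err != nil {
		return "", err
	}

	// Do not wait forever if the datagram gets lost.
	if err := pc.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	buf := make([]byte, 2048)
	n, _, err := pc.ReadFrom(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	got, err := loopback("haproxy,backend=be_kilabit bytes_read=89i 1662311327264000000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("received %q\n", got)
}
```

Because UDP gives no delivery acknowledgement, a sender written this way cannot tell whether the database accepted the record.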


==  Deployment

. Copy the configuration from `$SOURCE/cmd/haminer/haminer.conf` to
`/etc/haminer.conf`

. Update haminer configuration in `/etc/haminer.conf`
+
--
For example,
----
[haminer]
listen = 127.0.0.1:5140

...
----

Add one or more forwarders to the configuration, as in the examples above.
--

. Update the HAProxy configuration to send its logs to a UDP port other than
the one used by rsyslog.
+
--
For example,
----
global
	...
	log 127.0.0.1:5140 local3
	...
----
Then reload or restart HAProxy.
--

. Run the haminer program,
+
--
----
$ haminer
----
or use a
https://git.sr.ht/~shulhan/haminer/tree/main/item/cmd/haminer/haminer.service[systemd
service^].

----
$ sudo systemctl enable haminer
$ sudo systemctl start  haminer
----
--
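
When checking that HAProxy really sends to the configured port, it helps to know that the `local3` facility from the HAProxy step shows up in the `<158>` prefix of the sample log line: a syslog priority is `facility * 8 + severity`.
The Go sketch below decodes that prefix; `parsePRI` is a hypothetical helper, not part of haminer.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePRI decodes the "<N>" priority prefix of a syslog message into its
// facility and severity codes, where N = facility*8 + severity.
func parsePRI(msg string) (facility, severity int, err error) {
	if !strings.HasPrefix(msg, "<") {
		return 0, 0, fmt.Errorf("missing priority prefix")
	}
	end := strings.IndexByte(msg, '>')
	if end < 0 {
		return 0, 0, fmt.Errorf("unterminated priority prefix")
	}
	pri, err := strconv.Atoi(msg[1:end])
	if err != nil {
		return 0, 0, err
	}
	// Valid priorities cover facilities 0-23 and severities 0-7.
	if pri < 0 || pri > 191 {
		return 0, 0, fmt.Errorf("priority %d out of range", pri)
	}
	return pri / 8, pri % 8, nil
}

func main() {
	facility, severity, err := parsePRI("<158>Sep  4 17:08:47 haproxy[109530]: ...")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Facility 19 is local3 (local0 is 16); severity 6 is informational.
	fmt.Printf("facility=%d severity=%d\n", facility, severity)
	// Prints: facility=19 severity=6
}
```

A different facility in the received messages would suggest that the datagrams come from another `log` line or another program.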


==  License

----
haminer - Library and program to parse and forward HAProxy logs.
Copyright (C) 2018-2022 M. Shulhan <ms@kilabit.info>

This program is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

You should have received a copy of the GNU General Public License along with
this program.
If not, see <http://www.gnu.org/licenses/>.
----