Herald: A Lightweight Task Dispatch System
15 Nov 2019

There are many routine jobs involved in maintaining servers, such as certificate updates, data backup, application deployment and software updates. We could use scripts to simplify these processes, but managing the scripts is itself another challenge, especially when there are many servers to take care of.
Many tools were born for these goals, like Puppet and Jenkins, but I do not want such a complex system. I would like only to accomplish some simple jobs, and a database is totally unnecessary for my situation.
Here are the objectives for the new task dispatch system:
- It could accept various triggers.
- It is able to execute scripts from a Git repository, which simplifies job deployment across multiple servers.
- There should be one central service managing the whole workflow of tasks, while the other servers only receive and execute them.
- The security of task execution on remote servers must be carefully considered, and job results should also be returned safely.
- The full workflow can be configured in a YAML file.
So the task dispatch system “Herald” was created, implemented in Go. Go is well suited to running multiple tasks via goroutines, and its net/http standard library is suitable for remote operation.
Herald consists of the core library, the Herald Daemon, and some plugins. All the projects are located at https://github.com/heraldgo.
Installation of Herald Daemon
We can run the service by installing the Herald Daemon.
The binary files are located on the GitHub releases page. Download the file for the right platform and extract it.
Then provide a YAML configuration file and run the Herald Daemon:
$ heraldd -config config.yml
Press Ctrl+C to exit.
Structure of configuration
The workflow is defined in a single YAML file, which mainly includes the following components:
- trigger: defines when to start the workflow, such as at a fixed time or upon receiving an HTTP request.
- selector: defines whether to proceed with job execution, depending on the trigger and select parameters.
- executor: defines how to execute the job, accepting the trigger and job parameters.
- router: connects all the components above and defines a complete task.
The configuration structures for trigger, selector and executor are quite similar. Take selector as an example:
selector:
  selector_name:
    type: selector_type
    param1: value1
    param2: value2
The names within the same component must be unique. Each component should have a type, which can be omitted if it is the same as the component name. All remaining parameters are passed to the component, and they vary among different component types.
All available component types and their parameter definitions can be found in the README. In case they do not meet your requirements, new components can be added as plugins.
It is not necessary to write configuration for all components. They can be specified by their type name in the router directly, which will use their default parameters.
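For example, the following sketch (the router name show_tick is made up) references the built-in tick trigger, all selector and print executor purely by their type names, without any trigger, selector or executor section, so each runs with its default parameters:

```yaml
router:
  show_tick:
    trigger: tick       # tick trigger with default parameters
    selector: all       # always proceed
    task:
      show_params: print
```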
Here is the structure of the router configuration. select_param will be passed to the selector and job_param to the executor.
router:
  router_name:
    trigger: trigger_name
    selector: selector_name
    task:
      task_name: executor_name
    select_param:
      param1: value1
    job_param:
      param2: value2
      param3: value3
Print periodically
This is a simple example which shows how to define a workflow in the configuration file.
Save the following content as a config.yml file, then run heraldd to start the Herald Daemon.
trigger:
  every2s:
    type: tick
    interval: 2
router:
  print_param_every2s:
    trigger: every2s
    selector: all
    task:
      print_param: print
In this example, the trigger every2s is activated every 2 seconds. The router print_param_every2s receives the trigger, and then the selector all decides whether or not to execute the task print_param. If the selector allows it to continue, the job parameters are passed to the executor print, which performs the execution. What we will see is some parameters printed on the screen every 2 seconds.
Run commands on the local server
We could never be satisfied with just printing some parameters on the screen. More can be done with the executor type local, which can execute external commands, and even scripts located in a Git repository.
trigger:
  wednesday_morning:
    type: cron
    cron: '30 6 * * 3'
executor:
  local_command:
    type: local
    work_dir: /var/lib/heraldd/work
router:
  uptime_wednesday_morning:
    trigger: wednesday_morning
    selector: all
    task:
      run_local: local_command
    job_param:
      cmd: uptime
  print_result:
    trigger: exe_done
    selector: match_map
    task:
      print_result: print
    select_param:
      match_key: router
      match_value: uptime_wednesday_morning
    job_param:
      print_key: trigger_param/result
This example defines the executor local_command with type local. It needs work_dir for holding the Git repository (<work_dir>/gitrepo). The work_dir is also used as the current directory for commands (<work_dir>/run), where the commands can keep logs and intermediate files.
The trigger wednesday_morning uses crontab syntax to define when to activate the task. The router uptime_wednesday_morning receives this trigger and executes the job with the executor local_command, where cmd is passed as a parameter.
The second router print_result is used to print the execution result of the previous step. The trigger exe_done is used here, which is provided internally by Herald. exe_done is activated after any job execution finishes, with the result of that execution as the trigger parameter. In this example, it prints the result of uptime from the previous step.
With exe_done it is possible to build a task chain of more than one step, which opens up many interesting features, such as job result monitoring. Great attention must be paid when choosing the selector for exe_done, or there might be a dead loop, because the job activated by exe_done will itself trigger a new exe_done.
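As an illustration, a safe two-step chain could look like the following sketch (the router and task names are made up, and it reuses the every2s tick trigger from the earlier example). The match_map selector restricts the second router to results coming from the first one, so it never reacts to its own exe_done events:

```yaml
router:
  step_one:
    trigger: every2s
    selector: all
    task:
      first_job: print
  step_two:
    trigger: exe_done
    selector: match_map
    task:
      second_job: print
    select_param:
      match_key: router
      match_value: step_one   # only react to step_one, avoiding a dead loop
```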
Execute scripts from Git repository
It can be helpful for jobs to run scripts from a Git repository, so that it is not necessary to deploy the scripts on every server.
Here is an example of running scripts from a Git repository with the executor type local:
trigger:
  wednesday_morning:
    type: cron
    cron: '30 6 * * 3'
executor:
  local_command:
    type: local
    work_dir: /var/lib/heraldd/work
router:
  run_git_script:
    trigger: wednesday_morning
    selector: all
    task:
      run_git: local_command
    job_param:
      git_repo: https://github.com/heraldgo/demo-script.git
      cmd: run/backup.sh
  print_result:
    trigger: exe_done
    selector: match_map
    task:
      print_result: print
    select_param:
      match_key: executor
      match_value: local_command
    job_param:
      print_key: trigger_param/result
The local executor will pull the git_repo into the directory <work_dir>/gitrepo and then run the script specified by cmd. Any executable file in the Git repository can be set as cmd, so there is no restriction on the script language.
All executor parameters are passed to the command in JSON format as the environment variable HERALD_EXECUTE_PARAM. The standard output of the command is returned to the Herald Daemon. If the output can be parsed as JSON, it is merged into the final result map; otherwise it is set directly as the output item.
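A minimal script following this convention might look like the Python sketch below. Only HERALD_EXECUTE_PARAM and the JSON-output behaviour come from the description above; the "target" parameter name is a made-up example of a job_param entry:

```python
#!/usr/bin/env python3
import json
import os


def run_job():
    # Job parameters arrive as a JSON string in HERALD_EXECUTE_PARAM.
    params = json.loads(os.environ.get("HERALD_EXECUTE_PARAM", "{}"))
    # "target" is a hypothetical parameter set under job_param in the router.
    target = params.get("target", "unknown")
    # Return a dict; printing it as JSON lets the Herald Daemon
    # merge it into the final result map.
    return {"status": "ok", "target": target}


if __name__ == "__main__":
    print(json.dumps(run_job()))
```

Any language works the same way: read the environment variable, do the work, and print a single JSON object on standard output.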
Since arbitrary scripts can be executed, access to the Git repository must be treated with great care. Only trusted users should have write permission. Never store sensitive information like usernames and passwords in the repository. They can be set as job parameters in the configuration file instead, with proper access permissions on that file:
$ chmod go-rwx config.yml
Execute jobs on remote servers
The executor type http_remote is provided to run remote jobs.
Installation of Herald Runner
http_remote
must cooperate with
Herald Runner,
which is essentially an HTTP service.
The server that runs remote jobs must first install Herald Runner. The binary executable can be downloaded from the GitHub releases page.
Herald Runner needs a configuration file like this:
log_level: INFO
log_output: /var/log/herald-runner/herald-runner.log
host: 0.0.0.0
port: 8124
#unix_socket: /var/run/herald-runner/herald-runner.sock
secret: the_secret_should_be_strong_enough
work_dir: /var/lib/herald-runner/work
The secret is used for the SHA256 HMAC signature, which guarantees that requests are not forged or tampered with.
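The exact signing protocol lives in the Herald source; purely as an illustration of the idea, a SHA256 HMAC over a request body can be computed and checked like this (the function names and hex encoding are assumptions, not Herald's actual wire format):

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    # SHA256 HMAC of the request body, hex-encoded.
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    # Constant-time comparison to resist timing attacks.
    return hmac.compare_digest(sign(secret, body), signature)
```

Because both sides derive the signature from the shared secret and the full body, a request with a modified body or a wrong secret fails verification.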
Start Herald Runner with:
$ herald-runner -config config.yml
If HTTPS is needed, please consider using nginx as a reverse proxy and setting up the certificates there.
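A minimal nginx sketch for this could look as follows (the hostname and certificate paths are made up; the port matches the Herald Runner configuration above):

```nginx
server {
    listen 443 ssl;
    server_name herald-runner.example.com;

    ssl_certificate     /etc/ssl/certs/herald-runner.pem;
    ssl_certificate_key /etc/ssl/private/herald-runner.key;

    location / {
        # Forward requests to the local Herald Runner service.
        proxy_pass http://127.0.0.1:8124;
    }
}
```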
Configuration for remote execution
trigger:
  wednesday_morning:
    type: cron
    cron: '30 6 * * 3'
executor:
  remote_command:
    type: http_remote
    host: https://herald-runner.example.com/
    secret: the_secret_must_be_exactly_the_same_as_herald_exe_server
    data_dir: /var/lib/heraldd/data
router:
  run_git_script:
    trigger: wednesday_morning
    selector: all
    task:
      run_git: remote_command
    job_param:
      git_repo: https://github.com/heraldgo/demo-script.git
      cmd: run/backup.sh
  print_result:
    trigger: exe_done
    selector: except_map
    task:
      print_result: print
    select_param:
      except_key: router
      except_value: print_result
    job_param:
      print_key: trigger_param/result
host specifies the remote URL of Herald Runner. secret must be exactly the same as on the Herald Runner service. The result files from the remote execution will be saved under data_dir.
The job parameters in the router are quite similar to those of the local executor, and remote jobs can also run scripts from a Git repository.
Retrieve the result files from the remote execution
There could be result data files from the remote execution, such as backup data. If you would like to retrieve these files, just emit the correct output format in the script.
The output must be in JSON format and include a file item, where one or more files can be specified:
{
  "file": {
    "file1": "/full/path/of/file1.dat",
    "file2": "/full/path/of/file2.dat"
  },
  "other_key1": "value1",
  "other_key2": "value2",
  ...
}
All files under file will be automatically transferred to the Herald Daemon as multipart data and saved under the data_dir of the http_remote executor. A SHA-256 checksum is also applied to validate data integrity.
This is the final result after local path conversion by http_remote:
{
  "file": {
    "file1": "/data_dir/job_id/file1/file1.dat",
    "file2": "/data_dir/job_id/file2/file2.dat"
  },
  "other_key1": "value1",
  "other_key2": "value2",
  ...
}
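On the script side, producing such output is straightforward. A hypothetical backup script could report its result file like this (the path and the "status" key are made-up examples; only the "file" item comes from the convention above):

```python
import json


def report_result(backup_path: str) -> str:
    # Declare the produced file under "file" so Herald Runner
    # transfers it back to the Herald Daemon. Extra keys are
    # merged into the result map as-is.
    result = {
        "file": {"backup": backup_path},
        "status": "ok",
    }
    return json.dumps(result)


if __name__ == "__main__":
    print(report_result("/var/backups/db.tar.gz"))
```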
Execute tasks manually
Sometimes tasks need to be executed manually, for instance when debugging scripts. This can be accomplished with the trigger http.
trigger:
  manual:
    type: http
    host: 127.0.0.1
    port: 8123
router:
  manual_command:
    trigger: manual
    selector: match_map
    task:
      run_command: local_command
    select_param:
      match_key: command
      match_value: uptime
    job_param:
      cmd: uptime
  manual_backup:
    trigger: manual
    selector: match_map
    task:
      backup_db: local_command
    select_param:
      match_key: backup
    job_param:
      git_repo: https://github.com/heraldgo/demo-script.git
      cmd: run/backup.sh
  print_result:
    trigger: exe_done
    selector: except_map
    task:
      print_result: print
    select_param:
      except_key: router
      except_value: print_result
    job_param:
      print_key: trigger_param/result
The jobs can then be triggered with an HTTP POST request. The Content-Type must be application/json. The JSON request body will be set as the trigger parameter, which is later passed to the selector and executor.
The HTTP request can be sent with the curl command:
$ curl -i -H "Content-Type: application/json" -X POST -d '{"command":"uptime"}' http://localhost:8123
$ curl -i -H "Content-Type: application/json" -X POST -d '{"backup":"service1"}' http://localhost:8123
Complex logic can be achieved by handling the trigger and job parameters in the script.
Please pay great attention: this trigger does not perform any access control, so do NOT open the port to the world, or anyone could manipulate your server.
Flexibility and extendibility
Herald does not restrict the number of triggers, executors, selectors and routers, so complicated combinations can be built for your objective.
Since it is able to run arbitrary scripts, it can meet your demands most of the time. Even if you are not satisfied with the components provided by the Herald Daemon, it is easy to extend them with plugins. You could also develop your own program based on the Herald core library.
I am now using Herald to handle the management of my servers. Thanks to the flexible design, there could be more interesting applications to explore.