When creating your .aurora configuration, try to keep all versions of
a particular job within the same .aurora file. For example, if you
have separate jobs for cluster1, cluster1 staging, cluster1
testing, andcluster2, keep them as close together as possible.
Constructs shared across multiple jobs owned by your team (e.g.
team-level defaults or structural templates) can be split into separate
.aurorafiles and included via the include directive.
If you see repetition or find yourself copy and pasting any parts of your configuration, it’s likely an opportunity for templating. Take the example below:
redundant.aurora contains:
download = Process(
name = 'download',
cmdline = 'wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2',
max_failures = 5,
min_duration = 1)
unpack = Process(
name = 'unpack',
cmdline = 'rm -rf Python-2.7.3 && tar xzf Python-2.7.3.tar.bz2',
max_failures = 5,
min_duration = 1)
build = Process(
name = 'build',
cmdline = 'pushd Python-2.7.3 && ./configure && make && popd',
max_failures = 1)
email = Process(
name = 'email',
cmdline = 'echo Success | mail feynman@tmc.com',
max_failures = 5,
min_duration = 1)
build_python = Task(
name = 'build_python',
processes = [download, unpack, build, email],
constraints = [Constraint(order = ['download', 'unpack', 'build', 'email'])])
As you’ll notice, there’s a lot of repetition in the Process
definitions. For example, almost every process sets a max_failures
limit to 5 and a min_duration to 1. This is an opportunity for factoring
into a common process template.
Furthermore, the Python version is repeated everywhere. This can be bound via structural templating as described in the Advanced Binding section.
less_redundant.aurora contains:
class Python(Struct):
version = Required(String)
base = Default(String, 'Python-{{version}}')
package = Default(String, '{{base}}.tar.bz2')
ReliableProcess = Process(
max_failures = 5,
min_duration = 1)
download = ReliableProcess(
name = 'download',
cmdline = 'wget http://www.python.org/ftp/python/{{python.version}}/{{python.package}}')
unpack = ReliableProcess(
name = 'unpack',
cmdline = 'rm -rf {{python.base}} && tar xzf {{python.package}}')
build = ReliableProcess(
name = 'build',
cmdline = 'pushd {{python.base}} && ./configure && make && popd',
max_failures = 1)
email = ReliableProcess(
name = 'email',
cmdline = 'echo Success | mail {{role}}@foocorp.com')
build_python = SequentialTask(
name = 'build_python',
processes = [download, unpack, build, email]).bind(python = Python(version = "2.7.3"))
Many tiny Processes makes for harder to manage configurations.
copy = Process(
name = 'copy',
cmdline = 'rcp user@my_machine:my_application .'
)
unpack = Process(
name = 'unpack',
cmdline = 'unzip app.zip'
)
remove = Process(
name = 'remove',
cmdline = 'rm -f app.zip'
)
run = Process(
name = 'app',
cmdline = 'java -jar app.jar'
)
run_task = Task(
processes = [copy, unpack, remove, run],
constraints = order(copy, unpack, remove, run)
)
Each cmdline runs in a bash subshell, so you have the full power of
bash. Chaining commands with && or || is almost always the right
thing to do.
Also for Tasks that are simply a list of processes that run one after
another, consider using the SequentialTask helper which applies a
linear ordering constraint for you.
stage = Process(
name = 'stage',
cmdline = 'rcp user@my_machine:my_application . && unzip app.zip && rm -f app.zip')
run = Process(name = 'app', cmdline = 'java -jar app.jar')
run_task = SequentialTask(processes = [stage, run])
90% of the time you define a function in a .aurora file, you’re
probably Doing It Wrong™.
def get_my_task(name, user, cpu, ram, disk):
return Task(
name = name,
user = user,
processes = [STAGE_PROCESS, RUN_PROCESS],
constraints = order(STAGE_PROCESS, RUN_PROCESS),
resources = Resources(cpu = cpu, ram = ram, disk = disk)
)
task_one = get_my_task('task_one', 'feynman', 1.0, 32*MB, 1*GB)
task_two = get_my_task('task_two', 'feynman', 2.0, 64*MB, 1*GB)
This one is more idiomatic. Forced keyword arguments prevents accidents, e.g. constructing a task with “32*MB” when you mean 32MB of ram and not disk. Less proliferation of task-construction techniques means easier-to-read, quicker-to-understand, and a more composable configuration.
TASK_TEMPLATE = SequentialTask(
user = 'wickman',
processes = [STAGE_PROCESS, RUN_PROCESS],
)
task_one = TASK_TEMPLATE(
name = 'task_one',
resources = Resources(cpu = 1.0, ram = 32*MB, disk = 1*GB) )
task_two = TASK_TEMPLATE(
name = 'task_two',
resources = Resources(cpu = 2.0, ram = 64*MB, disk = 1*GB)
)