Time series logging
How do I use this?
Consider this example of a ‘simulation’ logging various quantities:
from time import sleep
from random import uniform
from warnings import warn

from pytools.log import (LogManager, add_general_quantities,
                         add_simulation_quantities, add_run_info,
                         IntervalTimer, LogQuantity, set_dt)


class Fifteen(LogQuantity):
    def __call__(self):
        return 15


def main():
    logmgr = LogManager("mylog.dat", "w")  # , comm=...

    # set a run property
    logmgr.set_constant("myconst", uniform(0, 1))

    add_run_info(logmgr)
    add_general_quantities(logmgr)
    add_simulation_quantities(logmgr)

    vis_timer = IntervalTimer("t_vis", "Time spent visualizing")
    logmgr.add_quantity(vis_timer)
    logmgr.add_quantity(Fifteen("fifteen"))

    logmgr.add_watches(["step.max", "t_sim.max", "t_step.max"])

    for istep in range(200):
        logmgr.tick_before()

        dt = uniform(0.01, 0.1)
        set_dt(logmgr, dt)
        sleep(dt)

        # Illustrate custom timers
        if istep % 10 == 0:
            with vis_timer.start_sub_timer():
                sleep(0.05)

        # Illustrate warnings capture
        if uniform(0, 1) < 0.05:
            warn("Oof. Something went awry.")

        logmgr.tick_after()

    logmgr.close()


if __name__ == "__main__":
    main()
(You can find this example as examples/log.py in the pytools source distribution.)
Running this script will give you a file mylog.dat. Note how log quantities given as “watches” are also periodically logged to the console.
To analyze this data, we need some tools that are distributed as part of the ‘datapyle’ package on GitHub; install it first:
$ pip install git+https://github.com/inducer/datapyle.git
To prepare this data set for analysis, run runalyzer-gather as a script:
$ runalyzer-gather summary.dat mylog.dat
Scanning... [########################################] ETA ?
Importing... [########################################] ETA ?
This step may look unnecessary here, but it serves two important functions:
- If you would like to compare data from multiple runs, you can simply add data files from multiple runs and analyze them side-by-side.
- For distributed-memory runs, you can likewise give data sets from multiple ranks to be gathered in a single file for centralized analysis.
- As well as, naturally, any combination of the above.
After this, you may use runalyzer (also in datapyle) to analyze the summary file:
# -m enables '$quantity' in SQL
$ runalyzer -m summary.dat
Runalyzer running on Python 3.8.5 (default, Aug 2 2020, 15:09:07)
[GCC 10.2.0]
Copyright (c) Andreas Kloeckner 2008
Run .help to see help for 'magic' commands
>>>
This is a normal Python console, but it has a few extra tricks up its
sleeve. .help
gives an overview. Here, we’ll focus on demonstrating
two interesting features. First, let’s get an overview of all the
logged quantities:
>>> .quantities
id | name | unit | description | rank_aggregator
---+---------+------+-----------------------------+-----------------
7 | dt | s | Simulation Timestep | None
9 | fifteen | None | None | None
5 | step | 1 | Timesteps | None
2 | t_2step | s | Step-to-step duration | None
3 | t_cpu | s | Wall time | None
4 | t_log | s | Time spent updating the log | None
6 | t_sim | s | Simulation Time | None
1 | t_step | s | Time step duration | None
8 | t_vis | s | Time spent visualizing | None
Next, we could plot, e.g. time step duration against simulation time:
>>> .plot select $t_sim, $t_2step
Next, a fact that the select may have given away: the argument to .plot is a SQL query that is passed on to sqlite3, with support for some additional syntax. You can see what this query expanded to internally:
>>> mangle_sql("select $t_sim, $t_2step")
'select t_sim.value, t_2step.value from runs inner join t_sim on (t_sim.run_id = runs.id) inner join t_2step on (t_2step.run_id = runs.id and t_sim.step = t_2step.step and t_sim.rank=t_2step.rank) '
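To see concretely why those join conditions matter, here is a stdlib-only sketch with the same join structure as the expanded query. The schema and data are made up for illustration; the actual summary.dat layout may differ:

```python
import sqlite3

# Toy tables mimicking the shape of the expanded query above
# (hypothetical schema, not the real summary.dat layout).
db = sqlite3.connect(":memory:")
db.executescript("""
    create table runs (id integer primary key);
    create table t_sim (run_id, step, rank, value);
    create table t_2step (run_id, step, rank, value);
    insert into runs values (1);
    insert into t_sim values (1, 0, 0, 0.0), (1, 1, 0, 0.05);
    insert into t_2step values (1, 0, 0, 0.04), (1, 1, 0, 0.06);
""")

# Join each quantity table back to runs, and align samples from the
# two tables by matching step and rank. (The "order by" is only added
# here for deterministic output.)
rows = db.execute(
    "select t_sim.value, t_2step.value from runs"
    " inner join t_sim on (t_sim.run_id = runs.id)"
    " inner join t_2step on (t_2step.run_id = runs.id"
    "   and t_sim.step = t_2step.step and t_sim.rank = t_2step.rank)"
    " order by t_sim.step"
).fetchall()
print(rows)  # -> [(0.0, 0.04), (0.05, 0.06)]
```

The matching on step and rank is what guarantees that each row of the result pairs samples taken at the same tick.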
As you can see, by default these queries return data from all runs. You can filter which runs you are interested in through run properties:
>>> .runprops
cmdline
date
dirname
filename
id
is_parallel
machine
myconst
rank_count
schema_version
unique_run_id
unixtime
simply by adding a SQL where
clause referring to these properties.
As you can see from the demo, you can of course add your own, application-focused run properties.
This means that you have the full power of SQL available to analyze your time series data, for example:
- Plot the solution 2-norms of all runs that were run on a Tuesday between time step 2000 and 3000
- Make a scatterplot correlating memory transfer amounts and simulation time
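As a stdlib-only sketch of such filtering (with a made-up schema in which run properties appear as columns of a runs table), a where clause on a run property restricts which runs contribute data:

```python
import sqlite3

# Hypothetical schema: run properties (such as "myconst" from the
# example) stored as columns of the "runs" table.
db = sqlite3.connect(":memory:")
db.executescript("""
    create table runs (id integer primary key, myconst real);
    create table t_sim (run_id, step, rank, value);
    insert into runs values (1, 0.25), (2, 0.75);
    insert into t_sim values (1, 0, 0, 100.0), (2, 0, 0, 200.0);
""")

# A where clause on the run property filters out entire runs:
rows = db.execute(
    "select t_sim.value from runs"
    " inner join t_sim on (t_sim.run_id = runs.id)"
    " where runs.myconst > 0.5"
).fetchall()
print(rows)  # only run 2 satisfies the filter -> [(200.0,)]
```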
Log Quantity Abstract Interfaces

class pytools.log.LogQuantity(name, unit=None, description=None)
    A source of loggable scalars.

    __init__(name, unit=None, description=None)
        Initialize self. See help(type(self)) for accurate signature.

    tick()
        Perform updates required at every LogManager tick.

    __call__()
        Return the current value of the diagnostic represented by this
        LogQuantity, or None if no value is available. This is only
        called if the invocation interval calls for it.
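As the Fifteen class in the example above shows, a custom quantity only needs to override __call__(). The sketch below is self-contained: it uses a minimal stand-in for the base class (mirroring only the documented interface) so it runs without pytools installed, and TickCounter is a hypothetical quantity invented for illustration:

```python
class LogQuantity:
    """Stand-in for pytools.log.LogQuantity (documented interface only)."""
    def __init__(self, name, unit=None, description=None):
        self.name = name
        self.unit = unit
        self.description = description

    def tick(self):
        """Called at every LogManager tick; override for bookkeeping."""

    def __call__(self):
        """Return the current value, or None if unavailable."""
        raise NotImplementedError


class TickCounter(LogQuantity):
    """Hypothetical quantity: counts how often it has been sampled."""
    def __init__(self, name="n_ticks"):
        super().__init__(name, "1", "Number of ticks seen")
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count


q = TickCounter()
values = [q() for _ in range(3)]
print(values)  # -> [1, 2, 3]
```

With the real library, an instance of such a class would simply be handed to LogManager.add_quantity().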
class pytools.log.PostLogQuantity(name, unit=None, description=None)
    A source of loggable scalars.

    __init__(name, unit=None, description=None)
        Initialize self. See help(type(self)) for accurate signature.

    tick()
        Perform updates required at every LogManager tick.
Log Manager

class pytools.log.LogManager(filename=None, mode='r', mpi_comm=None, capture_warnings=True, commit_interval=90)
    A distributed-memory-capable diagnostic time-series logging
    facility. It is meant to log data from a computation, with certain
    log quantities available before a cycle, and certain other ones
    afterwards. A timeline of invocations looks as follows:

        tick_before()
        compute...
        tick()
        tick_after()

        tick_before()
        compute...
        tick_after()

        ...

    In a time-dependent simulation, each group of tick_before() ...
    tick_after() calls captures data for a single time state, namely
    that in which the data may have been before the “compute” step.
    However, some data (such as the length of the timestep taken in a
    time-adaptive method) may only be available after the completion
    of the “compute...” stage, which is why tick_after() exists.

    A LogManager logs any number of named time series of floats to a
    file. Non-time-series data, in the form of constants, is also
    supported and saved.

    If MPI parallelism is used, the “head rank” below always refers to
    rank 0.

    Command line tools called runalyzer (in the datapyle package) are
    available for looking at the data in a saved log.
    __init__(filename=None, mode='r', mpi_comm=None, capture_warnings=True, commit_interval=90)
        Initialize this log manager instance.

        Parameters:
            filename – If given, the filename to which this log is
                bound. If this database exists, the current state is
                loaded from it.
            mode – One of “w”, “r” for write, read. “w” assumes that
                the database is initially empty. May also be “wu” to
                indicate that a unique filename should be chosen
                automatically.
            mpi_comm – An mpi4py.MPI.Comm. If given, logs are
                periodically synchronized to the head node, which then
                writes them out to disk.
            capture_warnings – Tap the Python warnings facility and
                save warnings to the log file.
            commit_interval – Actually perform a commit only every N
                times a commit is requested.
    Data retrieval

    get_expr_dataset(expression, description=None, unit=None)
        Prepare a time-series dataset for a given expression.

        Parameters:
            expression – A pymbolic expression that may involve the
                time-series variables and the constants in this
                LogManager. If there is data from multiple ranks for a
                quantity occurring in this expression, an aggregator
                may have to be specified.

        Returns:
            (description, unit, table), where table is a list of
            tuples (tick_nbr, value).

        Aggregators are specified as follows:
            - qty.min, qty.max, qty.avg, qty.sum, qty.norm2
            - qty[rank_nbr]
            - qty.loc
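What the reduction-style aggregators compute can be sketched with plain Python (the per-rank values below are made up): each one reduces the samples that different ranks recorded for the same tick to a single number. (qty.loc, which roughly refers to the local rank’s own value, and qty[rank_nbr], which selects one rank, do not reduce at all.)

```python
from math import sqrt

# Hypothetical per-rank samples of t_step for one tick, ranks 0..3:
per_rank = [0.11, 0.14, 0.12, 0.13]

aggregated = {
    "t_step.min": min(per_rank),
    "t_step.max": max(per_rank),
    "t_step.avg": sum(per_rank) / len(per_rank),
    "t_step.sum": sum(per_rank),
    "t_step.norm2": sqrt(sum(v * v for v in per_rank)),
    "t_step[2]": per_rank[2],  # value from rank 2 only
}
print(aggregated["t_step.max"])  # -> 0.14
```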
    get_joint_dataset(expressions)
        Return a joint data set for a list of expressions.

        Parameters:
            expressions – A list of either strings representing
                expressions directly, or triples (descr, unit, expr).
                In the former case, the description and the unit are
                found automatically, if possible. In the latter case,
                they are used as specified.

        Returns:
            A triple (descriptions, units, table), where table is a
            list of [(tstep, (val_expr1, val_expr2, ...)), ...].
    Configuration

    add_quantity(quantity, interval=1)
        Add an object derived from LogQuantity to this manager.
    Time Loop

    tick_before()
        Record data points from each added LogQuantity that is not an
        instance of PostLogQuantity. Also, invoke
        PostLogQuantity.prepare_for_tick() on PostLogQuantity
        instances.
    tick_after()
        Record data points from each added LogQuantity that is an
        instance of PostLogQuantity.

        May also checkpoint data to disk.
Built-in Log General-Purpose Quantities

class pytools.log.IntervalTimer(name, description=None)
    Records elapsed times supplied by the user either through
    sub-timers, or by explicitly calling add_time().
class pytools.log.LogUpdateDuration(mgr, name='t_log')
    Records how long the last log update in LogManager took.
class pytools.log.EventCounter(name='interval', description=None)
    Counts events signaled by add().
class pytools.log.TimestepCounter(name='step')
    Counts the number of times LogManager ticks.
class pytools.log.StepToStepDuration(name='t_2step')
    Records the CPU time between the starts of consecutive time steps,
    i.e. from one LogManager.tick_before() call to the next.
class pytools.log.TimestepDuration(name='t_step')
    Records the CPU time between invocations of
    LogManager.tick_before() and LogManager.tick_after().
class pytools.log.ETA(total_steps, name='t_eta')
    Records an estimate of how long the computation will still take.
pytools.log.add_general_quantities(mgr)
    Add generally applicable LogQuantity objects to mgr.