Description
I profiled ForwardChainer to find out how its performance can be improved and found that it spends a significant amount of time in the Rule::unify_source method. The share depends on rule complexity: the more complex the rules, the larger it is. The following benchmark demonstrates this: opencog/benchmark#27
The problem is that unify_source does almost nothing. Its first part converts the rule's BindLink into another one with randomly named variables (RewriteLink::alpha_convert), and its second part constructs a new BindLink with some of the variables grounded.
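To make the two steps concrete, here is a toy model in Python. It is only a sketch and not the AtomSpace API: rules are modelled as a list of variable names plus a nested tuple, and the helpers `substitute`, `alpha_convert` and `ground` are hypothetical names invented for this illustration. It shows that each step walks and copies the whole rule body, even though the result differs from the input only in variable names and a few grounded values.

```python
import uuid

def substitute(term, mapping):
    """Rebuild a term, replacing every leaf found in `mapping`."""
    if isinstance(term, tuple):
        # Every nested node is copied, even when nothing inside it changes.
        return tuple(substitute(t, mapping) for t in term)
    return mapping.get(term, term)

def alpha_convert(variables, body):
    """Step 1 (toy): rename every variable to a fresh random name."""
    mapping = {v: "$rand-" + uuid.uuid4().hex for v in variables}
    new_vars = [mapping[v] for v in variables]
    return new_vars, substitute(body, mapping), mapping

def ground(variables, body, grounding):
    """Step 2 (toy): build a new rule with some variables replaced by values."""
    remaining = [v for v in variables if v not in grounding]
    return remaining, substitute(body, grounding)

# A made-up rule with two variables.
rule_vars = ["$X", "$Y"]
rule_body = ("Inheritance", "$X", "$Y")

renamed_vars, renamed_body, mapping = alpha_convert(rule_vars, rule_body)
grounded_vars, grounded_body = ground(
    renamed_vars, renamed_body, {mapping["$X"]: "cat"}
)
print(grounded_vars, grounded_body)
```

In this model the body is rebuilt twice per call, so the cost grows with the size of the rule, which is consistent with the observation that more complex rules spend more time here.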
A proper fix should probably be raised in opencog/atomspace, but I am raising the issue here because the main concern is URE performance.
Flame graph of the benchmarked code (best opened in a web browser):
perf.svg.gz
The following steps require perf and the FlameGraph tool (https://github.com/brendangregg/FlameGraph). An Intel CPU with Intel Processor Trace support is required as well.
Steps to reproduce:
Run the benchmark in the background:
./micro/benchmark --benchmark_filter=ForwardChainer_Basic --benchmark_min_time=600 &
Collect a profile using Intel Processor Trace. Only 1 second of profiling is needed; the profile would be too large otherwise:
perf record -e intel_pt//u -p `pidof benchmark` sleep 1
Build a flame graph of the do_step call (this can take a long time):
perf script --itrace=i100nsg | ./FlameGraph/stackcollapse-perf.pl | fgrep 'benchmark;BM_ForwardChainer_Basic;opencog::ForwardChainer::do_chain;opencog::ForwardChainer::do_step_rec;opencog::ForwardChainer::do_step;' | ./FlameGraph/flamegraph.pl > perf.svg
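To put a number on the share of time spent in unify_source, the folded stacks produced by stackcollapse-perf.pl (one line per stack, frames separated by `;`, followed by a space and a sample count) can be summed directly. The snippet below is a minimal sketch: the helper name `frame_share` is hypothetical, the sample lines are made up for illustration and are not real profile data, and matching is done by substring because the exact symbol name in the profile may carry a namespace prefix.

```python
def frame_share(folded_lines, frame):
    """Return the fraction of samples whose stack contains `frame`."""
    total = 0
    hits = 0
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field on the line.
        stack, _, count = line.rpartition(" ")
        if not count.isdigit():
            continue
        samples = int(count)
        total += samples
        # Substring match, so "unify_source" also matches a qualified name.
        if frame in stack:
            hits += samples
    return hits / total if total else 0.0

# Made-up folded stacks, for illustration only.
sample = [
    "benchmark;do_step;unify_source;alpha_convert 60",
    "benchmark;do_step;unify_source 15",
    "benchmark;do_step;apply_rule 25",
]
print(f"{frame_share(sample, 'unify_source'):.0%}")
```

For real data, the output of stackcollapse-perf.pl (after the fgrep filter) could be saved to a file and passed in as `frame_share(open(path), "unify_source")`, where `path` is whatever file name was used.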