From: Justin Pettit Date: Fri, 11 Sep 2009 14:15:24 +0000 (-0700) Subject: logging: Lower logging level for coverage messages due to slow poll loops X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=6bc995e46d64b90ef0b3186abec92f63eab2c2ad;p=openvswitch logging: Lower logging level for coverage messages due to slow poll loops By default, many OVS processes keep track of their time through a poll loop. If it takes an unusually long time (measured as some distance from the mean), the processes will log stats it has been keeping about coverage. It was doing this at level WARN. On Xen systems, syslog messages written at level INFO and higher are written to /var/log/messages synchronously. This would mean that there would be dire messages that it took a few dozen milliseconds to go through the loop, meanwhile, it would take up to 6(!) seconds writing those. Meanwhile, the process would do no other processing, which could be quite serious in the case of a process such as ovs-vswitchd. This problem was somewhat masked because the time used by this logging was not used in the calculations for determining how long it was taking to get through the loop. This commit lowers the default log level for those coverage messages to INFO. On Xen systems, it raises the default level at which messages are written to syslog to WARN. Diagnosed and fixed with the help of Ian Campbell. --- diff --git a/lib/coverage.c b/lib/coverage.c index ba5d68d8..cdc796ec 100644 --- a/lib/coverage.c +++ b/lib/coverage.c @@ -122,7 +122,10 @@ coverage_log_counter(enum vlog_level level, const struct coverage_counter *c) } /* Logs the coverage counters at the given vlog 'level'. If - * 'suppress_dups' is true, then duplicate events are not displayed. */ + * 'suppress_dups' is true, then duplicate events are not displayed. + * Care should be taken in the value used for 'level'. Depending on the + * configuration, syslog can write changes synchronously, which can + * cause the coverage messages to take several seconds to write. */ void coverage_log(enum vlog_level level, bool suppress_dups) { diff --git a/lib/timeval.c b/lib/timeval.c index 4c34a173..3cca338f 100644 --- a/lib/timeval.c +++ b/lib/timeval.c @@ -294,7 +294,12 @@ log_poll_interval(long long int last_wakeup, const struct rusage *last_rusage) rusage.ru_nvcsw - last_rusage->ru_nvcsw, rusage.ru_nivcsw - last_rusage->ru_nivcsw); } - coverage_log(VLL_WARN, true); + + /* Care should be taken in the value chosen for logging. Depending + * on the configuration, syslog can write changes synchronously, + * which can cause the coverage messages to take longer to log + * than the processing delay that triggered it. */ + coverage_log(VLL_INFO, true); } /* Update exponentially weighted moving average. With these parameters, a diff --git a/xenserver/etc_init.d_vswitch b/xenserver/etc_init.d_vswitch index 92be421f..6ca0a1de 100755 --- a/xenserver/etc_init.d_vswitch +++ b/xenserver/etc_init.d_vswitch @@ -36,7 +36,7 @@ VSWITCHD_RUN_DIR="${VSWITCHD_RUN_DIR:-/var/xen/vswitch}" VSWITCHD_PRIORITY="${VSWITCHD_PRIORITY:--10}" VSWITCHD_LOGFILE="${VSWITCHD_LOGFILE:-/var/log/ovs-vswitchd.log}" VSWITCHD_FILE_LOGLEVEL="${VSWITCHD_FILE_LOGLEVEL:-}" -VSWITCHD_SYSLOG_LOGLEVEL="${VSWITCHD_SYSLOG_LOGLEVEL:-WARN}" +VSWITCHD_SYSLOG_LOGLEVEL="${VSWITCHD_SYSLOG_LOGLEVEL:-ERR}" VSWITCHD_MEMLEAK_LOGFILE="${VSWITCHD_MEMLEAK_LOGFILE:-}" VSWITCHD_STRACE_LOG="${VSWITCHD_STRACE_LOG:-}" VSWITCHD_STRACE_OPT="${VSWITCHD_STRACE_OPT:-}" @@ -49,7 +49,7 @@ BRCOMPATD_RUN_DIR="${BRCOMPATD_RUN_DIR:-/var/xen/vswitch}" BRCOMPATD_PRIORITY="${BRCOMPATD_PRIORITY:--10}" BRCOMPATD_LOGFILE="${BRCOMPATD_LOGFILE:-/var/log/ovs-brcompatd.log}" BRCOMPATD_FILE_LOGLEVEL="${BRCOMPATD_FILE_LOGLEVEL:-}" -BRCOMPATD_SYSLOG_LOGLEVEL="${BRCOMPATD_SYSLOG_LOGLEVEL:-INFO}" +BRCOMPATD_SYSLOG_LOGLEVEL="${BRCOMPATD_SYSLOG_LOGLEVEL:-ERR}" BRCOMPATD_MEMLEAK_LOGFILE="${BRCOMPATD_MEMLEAK_LOGFILE:-}" BRCOMPATD_STRACE_LOG="${BRCOMPATD_STRACE_LOG:-}" BRCOMPATD_STRACE_OPT="${BRCOMPATD_STRACE_OPT:-}"