License Public Domain

Automatic replacement of logging statements with pass (and vice versa)

#!/usr/bin/env python

"""\
Logging Statement Modifier - replace logging calls with pass (or vice versa)
Author: David Underhill <dgu@cs.stanford.edu>
Version: 1.00 (06-Feb-2010)

This script parses a Python file and comments out logging statements, replacing
them with a pass statement (or vice versa). The purpose of commenting out these
statements is to improve performance. Even if logging is disabled, arguments to
logging method calls must still be evaluated, which can be expensive.

This tool handles most common cases:
  * Log statements may span multiple lines.
  * Custom logging levels may be added (LEVELS, LEVEL_VALUES).
  * Integral logging levels & named logging levels (DEBUG, etc.) are recognized.
  * Logging statements log(), debug(), ..., critical() are all recognized.
  * Statements with unrecognized logging levels will be left as-is.
  * 'logging' is the assumed logging module name (LOGGING_MODULE_NAME).

However, its ability to parse files is limited:
  * It only operates on logging statements in the form logging.log(<level>, ...)
    and logging.<level>(...).
  * The <level> must either be an integral constant or contain one of the names
    from the LEVELS constant below.
  * If a logging statement is made, it is assumed that no other statement is
    made on the same line as the logging statement (except for statements made
    in between the open and close parenthesis of the logging call). For example,
    a semi-colon and then a second statement on the same line as a logging call
    will not be handled properly.
  * Logging methods must be called through SOME module, e.g., logging.log(), not
    just log().
  * For simplicity, undoing the commenting process relies on a comment left by
    the program on the pass statements it adds when commenting out logging
    statements. (So don't change the comment it outputs by the pass statement).

To run this command on all of the Python files in a particular folder and its
sub-folders at once, try this (replace '/path/to' as appropriate):
    find . -name '*.py' | xargs -I{} /path/to/logging_statement_modifier.py {}
"""

import logging
from optparse import OptionParser
import re
import sys

# logging level names and values
LEVELS = ['DEBUG', 'INFO', 'WARN', 'WARNING', 'ERROR', 'CRITICAL']
LEVEL_VALUES = [logging.DEBUG, logging.INFO, logging.WARN, logging.WARNING, logging.ERROR, logging.CRITICAL]
LEVELS_DICT = dict(zip(LEVELS, LEVEL_VALUES))

# names of methods in the logging module which perform logging
LOGGING_METHODS_OF_INTEREST = ['log', 'debug', 'info', 'warn', 'warning', 'error', 'critical']

# name of the logging module
LOGGING_MODULE_NAME = 'logging'

# this matches logging.<method>([<first_arg>,]
# (the dot is escaped so that it only matches a literal '.')
STR_RE_LOGGING_CALL = r'%s\.(\w+)[(](([^,\r\n]+),)?' % LOGGING_MODULE_NAME

# contents of a pass line (not including prefixed whitespace)
PASS_LINE_CONTENTS = 'pass # replaces next logging statement\n'

# Match a logging call (must only be prefixed with whitespace). Capture groups
# include the whitespace, the logging method called, and the first argument if
# possible
RE_LOGGING_START = re.compile(r'^(\s+)' + STR_RE_LOGGING_CALL)
RE_LOGGING_START_IN_COMMENT = re.compile(r'^(\s+)#' + STR_RE_LOGGING_CALL)

def main(argv=sys.argv[1:]):
    """Parses the command line arguments."""
    usage = 'usage: %prog [options] FILE\n\n' + __doc__
    parser = OptionParser(usage)

    # options
    parser.add_option("-f", "--force",
                      action='store_true', default=False,
                      help="make changes even if they cannot be undone before saving the new file")
    parser.add_option("-m", "--min_level",
                      default='NONE',
                      help="minimum level of logging statements to modify [default: no minimum]")
    parser.add_option("-M", "--max_level",
                      default='NONE',
                      help="maximum level of logging statements to modify [default: no maximum]")
    parser.add_option("-o", "--output-file",
                      default=None,
                      help="where to output the result [default: overwrite the input file]")
    parser.add_option("-r", "--restore",
                      action='store_true', default=False,
                      help="restore logging statements previously commented out and replaced with pass statements")
    parser.add_option("-v", "--verbose",
                      action='store_true', default=False,
                      help="print informational messages about changes made")

    (options, args) = parser.parse_args(argv)
    if len(args) != 1:
        parser.error("expected 1 argument but got %d arguments: %s" % (len(args), ' '.join(args)))
    input_fn = args[0]
    if not options.output_file:
        options.output_file = input_fn

    # validate min/max level (get_level_value() returns None for unknown levels)
    LEVEL_CHOICES = LEVELS + ['NONE']
    min_level_value = 0 if options.min_level == 'NONE' else get_level_value(options.min_level)
    if min_level_value is None:
        parser.error("min level must be an integer or one of these values: %s" % ', '.join(LEVEL_CHOICES))
    # sys.maxsize is used instead of sys.maxint, which does not exist in Python 3
    max_level_value = sys.maxsize if options.max_level == 'NONE' else get_level_value(options.max_level)
    if max_level_value is None:
        parser.error("max level must be an integer or one of these values: %s" % ', '.join(LEVEL_CHOICES))

    if options.verbose:
        logging.getLogger().setLevel(logging.INFO)

    try:
        return modify_logging(input_fn, options.output_file,
                              min_level_value, max_level_value,
                              options.restore, options.force)
    except IOError as e:
        logging.error(str(e))
        return -1

# matches two main groups: 1) leading whitespace and 2) all following text
RE_LINE_SPLITTER_COMMENT = re.compile(r'^(\s*)((.|\n)*)$')
def comment_lines(lines):
    """Comment out the given list of lines and return them. The hash mark will
    be inserted before the first non-whitespace character on each line."""
    ret = []
    for line in lines:
        ws_prefix, rest, ignore = RE_LINE_SPLITTER_COMMENT.match(line).groups()
        ret.append(ws_prefix + '#' + rest)
    return ''.join(ret)

# matches two main groups: 1) leading whitespace and 2) all text after the hash mark
RE_LINE_SPLITTER_UNCOMMENT = re.compile(r'^(\s*)#((.|\n)*)$')
def uncomment_lines(lines):
    """Uncomment the given list of lines and return them. The first hash mark
    following any amount of whitespace will be removed on each line."""
    ret = []
    for line in lines:
        ws_prefix, rest, ignore = RE_LINE_SPLITTER_UNCOMMENT.match(line).groups()
        ret.append(ws_prefix + rest)
    return ''.join(ret)

def first_arg_to_level_name(arg):
    """Decide what level the argument specifies and return it. The argument
    must contain (case-insensitive) one of the values in LEVELS or be an integer
    constant. Otherwise None will be returned."""
    try:
        return int(arg)
    except ValueError:
        arg = arg.upper()
        for level in LEVELS:
            if level in arg:
                return level
        return None

def get_level_value(level):
    """Returns the logging value associated with a particular level name. The
    argument must be present in LEVELS_DICT or be an integer constant.
    Otherwise None will be returned."""
    try:
        # integral constants also work: they are the level value
        return int(level)
    except ValueError:
        try:
            return LEVELS_DICT[level.upper()]
        except KeyError:
            logging.warning("level '%s' cannot be translated to a level value (not present in LEVELS_DICT)" % level)
            return None

def get_logging_level(logging_stmt, commented_out=False):
    """Determines the level of logging in a given logging statement. The string
    representing this level is returned. False is returned if the method is
    not a logging statement and thus has no level. None is returned if a level
    should have been found but wasn't."""
    regexp = RE_LOGGING_START_IN_COMMENT if commented_out else RE_LOGGING_START
    ret = regexp.match(logging_stmt)
    if ret is None:
        # the text does not start with a logging call at all
        return False
    _, method_name, _, first_arg = ret.groups()
    if method_name not in LOGGING_METHODS_OF_INTEREST:
        logging.debug('skipping uninteresting logging call: %s' % method_name)
        return False

    if method_name != 'log':
        return method_name

    # if the method name did not specify the level, we must have a first_arg to extract the level from
    if not first_arg:
        logging.warning("logging.log statement found but we couldn't extract the first argument")
        return None

    # extract the level of logging from the first argument to the log() call
    level = first_arg_to_level_name(first_arg)
    if level is None:
        logging.warning("arg does not contain any known level '%s'\n" % first_arg)
        return None
    return level

def level_is_between(level, min_level_value, max_level_value):
    """Returns True if level is between the specified min or max, inclusive."""
    level_value = get_level_value(level)
    if level_value is None:
        # unknown level value
        return False
    return level_value >= min_level_value and level_value <= max_level_value

def split_call(lines, open_paren_line=0):
    """Returns a 2-tuple where the first element is the list of lines from the
    first open paren in lines to the matching closed paren. The second element
    is all remaining lines in a list."""
    num_open = 0
    num_closed = 0
    for i, line in enumerate(lines):
        c = line.count('(')
        num_open += c
        if not c and i == open_paren_line:
            raise Exception('Expected an open parenthesis in line %d but there is not one there: %s' % (i, str(lines)))
        num_closed += line.count(')')

        if num_open == num_closed:
            return (lines[:i+1], lines[i+1:])

    raise Exception('parentheses are mismatched (%d open, %d closed found)' % (num_open, num_closed))

def modify_logging(input_fn, output_fn, min_level_value, max_level_value, restore, force):
    """Modifies logging statements in the specified file."""
    # read in all the lines
    logging.info('reading in %s' % input_fn)
    fh = open(input_fn, 'r')
    lines = fh.readlines()
    fh.close()
    original_contents = ''.join(lines)

    if restore:
        forwards = restore_logging
        backwards = disable_logging
    else:
        forwards = disable_logging
        backwards = restore_logging

    # apply the requested action
    new_contents = forwards(lines, min_level_value, max_level_value)

    # quietly check to see if we can undo what we just did (if not, the text
    # contains something we cannot translate [bug or limitation with this code])
    logging.disable(logging.CRITICAL)
    new_contents_undone = backwards(new_contents.splitlines(True), min_level_value, max_level_value)
    # NOTSET removes the override entirely (DEBUG would keep debug messages suppressed)
    logging.disable(logging.NOTSET)
    if original_contents != new_contents_undone:
        base_str = 'We are unable to revert this action as expected'
        if force:
            logging.warning(base_str + " but -f was specified so we'll do it anyway.")
        else:
            logging.error(base_str + ', so we will not do it in the first place. Pass -f to override this and make the change anyway.')
            return -1

    logging.info('writing the new contents to %s' % output_fn)
    fh = open(output_fn, 'w')
    fh.write(new_contents)
    fh.close()
    logging.info('done!')
    return 0

def check_level(logging_stmt, logging_stmt_is_commented_out, min_level_value, max_level_value):
    """Extracts the level of the logging statement and returns True if the
    level falls between min_level_value and max_level_value. If the level
    cannot be extracted, then a warning is logged."""
    level = get_logging_level(logging_stmt, logging_stmt_is_commented_out)
    if level is None:
        logging.warning('skipping logging statement because the level could not be extracted: %s' % logging_stmt.strip())
        return False
    elif level is False:
        return False
    elif level_is_between(level, min_level_value, max_level_value):
        return True
    else:
        logging.debug('keep this one as is (not in the specified level range): %s' % logging_stmt.strip())
        return False

def disable_logging(lines, min_level_value, max_level_value):
    """Disables logging statements in these lines whose logging level falls
    between the specified minimum and maximum levels."""
    output = ''
    while lines:
        line = lines[0]
        ret = RE_LOGGING_START.match(line)
        if not ret:
            # no logging statement here, so just leave the line as-is and keep going
            output += line
            lines = lines[1:]
        else:
            # a logging call has started: find all the lines it includes and those it does not
            logging_lines, remaining_lines = split_call(lines)
            lines = remaining_lines
            logging_stmt = ''.join(logging_lines)

            # replace the logging statement if its level falls b/w min and max
            if not check_level(logging_stmt, False, min_level_value, max_level_value):
                output += logging_stmt
            else:
                # comment out this logging statement and replace it with pass
                prefix_ws = ret.group(1)
                pass_stmt = prefix_ws + PASS_LINE_CONTENTS
                commented_out_logging_lines = comment_lines(logging_lines)
                new_lines = pass_stmt + commented_out_logging_lines
                logging.info('replacing:\n%s\nwith this:\n%s' % (logging_stmt.rstrip(), new_lines.rstrip()))
                output += new_lines
    return output

def restore_logging(lines, min_level_value, max_level_value):
    """Re-enables logging statements in these lines whose logging level falls
    between the specified minimum and maximum levels and which were disabled
    by disable_logging() before."""
    output = ''
    while lines:
        line = lines[0]
        if line.lstrip() != PASS_LINE_CONTENTS:
            # not our pass statement here, so just leave the line as-is and keep going
            output += line
            lines = lines[1:]
        else:
            # a logging call will start on the next line: find all the lines it includes and those it does not
            logging_lines, remaining_lines = split_call(lines[1:])
            lines = remaining_lines
            logging_stmt = ''.join(logging_lines)
            original_lines = line + logging_stmt

            # restore the logging statement if its level falls b/w min and max
            if not check_level(logging_stmt, True, min_level_value, max_level_value):
                # leave it disabled: keep the pass line together with the commented-out statement
                output += original_lines
            else:
                # uncomment the lines of this logging statement and remove the pass line
                uncommented_logging_lines = uncomment_lines(logging_lines)
                logging.info('replacing:\n%s\nwith this:\n%s' % (original_lines.rstrip(), uncommented_logging_lines.rstrip()))
                output += uncommented_logging_lines
    return output

if __name__ == "__main__":
    logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.WARN)
    sys.exit(main())

This script parses a Python file and comments out logging statements, replacing them with a pass statement (or vice versa). The purpose of commenting out these statements is to improve performance. Even if logging is disabled, arguments to logging method calls must still be evaluated, which can be expensive.
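
The cost described above can be seen in a small sketch. The helper `expensive_summary` and the `calls` counter are hypothetical names used only for illustration; they are not part of the script.

```python
import logging

calls = {"count": 0}

def expensive_summary():
    """Hypothetical stand-in for a costly computation used only for logging."""
    calls["count"] += 1
    return ", ".join(str(i) for i in range(1000))

# Ask for WARNING and above only, so the debug message below is not wanted.
logging.basicConfig(level=logging.WARNING)

# Python evaluates the argument before logging.debug() is even called,
# so the expensive work happens whether or not the message is emitted.
logging.debug("summary: %s", expensive_summary())

# This is the form the script leaves behind: the call is commented out,
# so expensive_summary() is never evaluated here.
pass # replaces next logging statement
#logging.debug("summary: %s", expensive_summary())

print(calls["count"])  # prints 1: only the enabled call ran the function
```

The counter increases once for the call that was left in place and not at all for the commented-out one, which is the saving the script aims for.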

This tool handles most common cases:
  • Log statements may span multiple lines.
  • Custom logging levels may be added (LEVELS, LEVEL_VALUES).
  • Integer logging levels and named logging levels (DEBUG, etc.) are recognized.
  • Logging statements log(), debug(), ..., critical() are all recognized.
  • Statements with unrecognized logging levels will be left as-is.
  • 'logging' is the assumed logging module name (LOGGING_MODULE_NAME).
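
Multi-line statements are delimited by counting parentheses, as `split_call()` does in the listing. The following is a simplified sketch of that idea, not the script's exact function; `split_first_call` and the sample lines are made up for illustration.

```python
def split_first_call(lines):
    """Return (call_lines, remaining_lines) by balancing parentheses.

    Simplified illustration: parentheses inside string literals are counted
    too, so a message such as 'oops (' would confuse it.
    """
    num_open = 0
    num_closed = 0
    for i, line in enumerate(lines):
        num_open += line.count('(')
        num_closed += line.count(')')
        # the call ends on the first line where every '(' has been closed
        if num_open and num_open == num_closed:
            return lines[:i + 1], lines[i + 1:]
    raise ValueError('parentheses are mismatched')

source = [
    "    logging.debug('x=%d y=%d',\n",
    "                  compute_x(),\n",
    "                  compute_y())\n",
    "    return True\n",
]
call_lines, rest = split_first_call(source)
print(len(call_lines))  # prints 3: the call spans the first three lines
print(rest)             # the 'return True' line is left untouched
```

Because the count is purely textual, an unbalanced parenthesis inside a log message is one case this approach cannot handle.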
However, its ability to parse files is limited:
  • It only operates on logging statements in the form logging.log(<level>, ...) and logging.<level>(...).
  • The <level> must either be an integer constant or contain one of the names from the LEVELS constant.
  • It is assumed that no other statement shares a line with a logging statement (except for expressions between the opening and closing parentheses of the logging call). For example, a semicolon followed by a second statement on the same line as a logging call will not be handled properly.
  • Logging methods must be called through SOME module, e.g., logging.log(), not just log().
  • For simplicity, undoing the commenting process relies on a comment left by the program on the pass statements it adds when commenting out logging statements. (So don't change the comment it outputs by the pass statement).
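
To make the marker-based round trip concrete, here is a reduced sketch that handles single-line calls only and ignores level filtering. The names `MARKER`, `CALL`, `disable` and `restore` are illustrative assumptions; the script's own functions are `disable_logging()` and `restore_logging()`.

```python
import re

# The same text the script writes next to each pass statement it adds.
MARKER = 'pass # replaces next logging statement\n'
# Indented call made through the 'logging' module (single-line calls only).
CALL = re.compile(r'^(\s+)logging\.(log|debug|info|warn|warning|error|critical)[(]')

def disable(lines):
    """Put a marked pass line before each logging call and comment the call out."""
    out = []
    for line in lines:
        m = CALL.match(line)
        if m:
            indent = m.group(1)
            out.append(indent + MARKER)
            out.append(indent + '#' + line[len(indent):])
        else:
            out.append(line)
    return out

def restore(lines):
    """Undo disable(): drop each marked pass line and uncomment the next line."""
    out = []
    i = 0
    while i < len(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ''
        if lines[i].lstrip() == MARKER and nxt.lstrip().startswith('#'):
            indent = nxt[:len(nxt) - len(nxt.lstrip())]
            out.append(indent + nxt.lstrip()[1:])  # remove the leading '#'
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return out

original = [
    "def f(x):\n",
    "    logging.debug('x is %r', x)\n",
    "    return x\n",
]
disabled = disable(original)
print(''.join(disabled))
print(restore(disabled) == original)  # prints True
```

If the comment on the pass line is edited by hand, `restore` no longer recognizes it and leaves the statement disabled, which is why the marker should not be changed.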
To run this command on all of the Python files in a particular folder and its sub-folders at once, try this (replace '/path/to' as appropriate):
find . -name '*.py' | xargs -I{} /path/to/logging_statement_modifier.py {}
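
Where `find` and `xargs` are not available, the same traversal could be sketched in Python. The helpers `find_python_files` and `run_on_tree` are hypothetical, and the sketch assumes the script can be run with the current interpreter (`sys.executable`).

```python
import os
import subprocess
import sys

def find_python_files(root):
    """Yield the path of every *.py file under root, including sub-folders."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith('.py'):
                yield os.path.join(dirpath, name)

def run_on_tree(script_path, root, extra_args=()):
    """Run the modifier script once per file; return the paths that failed."""
    failures = []
    for path in find_python_files(root):
        cmd = [sys.executable, script_path] + list(extra_args) + [path]
        # the script exits with 0 on success and a non-zero status otherwise
        if subprocess.call(cmd) != 0:
            failures.append(path)
    return failures

# Example (not executed here): restore logging statements under the current folder.
# run_on_tree('/path/to/logging_statement_modifier.py', '.', ['-r'])
```

Collecting the failing paths makes it easier to see which files the script declined to change, for example because the change could not be reverted and `-f` was not given.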

You can learn more about the motivation for the script at this link: http://dound.com/2010/02/python-logging-performance/