Skip to content
This repository was archived by the owner on Jan 14, 2020. It is now read-only.

Commit

Permalink
main/astobj2: Make REF_DEBUG a menuselect item; improve REF_DEBUG output
Browse files Browse the repository at this point in the history
This patch does the following:
(1) It makes REF_DEBUG a meneselect item. Enabling REF_DEBUG now enables
    REF_DEBUG globally throughout Asterisk.
(2) The ref debug log file is now created in the AST_LOG_DIR directory.
    Every run will now blow away the previous run (as large ref files
    sometimes caused issues). We now also no longer open/close the file
    on each write, instead relying on fflush to make sure data gets written
    to the file (in case the ao2 call being performed is about to cause a
    crash)
(3) It goes with a comma delineated format for the ref debug file. This
    makes parsing much easier. This also now includes the thread ID of the
    thread that caused ref change.
(4) A new python script instead for refcounting has been added in the
    contrib/scripts folder.
(5) The old refcounter implementation in utils/ has been removed.

Review: https://reviewboard.asterisk.org/r/3377/
........

Merged revisions 412114 from http://svn.asterisk.org/svn/asterisk/branches/1.8
........

Merged revisions 412115 from http://svn.asterisk.org/svn/asterisk/branches/11
........

Merged revisions 412153 from http://svn.asterisk.org/svn/asterisk/branches/12


git-svn-id: http://svn.asterisk.org/svn/asterisk/trunk@412154 f38db490-d61c-443f-a65b-d21fe96a405b
  • Loading branch information
matt-jordan committed Apr 11, 2014
1 parent 25be99d commit 95222bf
Show file tree
Hide file tree
Showing 10 changed files with 288 additions and 421 deletions.
4 changes: 4 additions & 0 deletions UPGRADE.txt
Original file line number Diff line number Diff line change
Expand Up @@ -174,5 +174,9 @@ Realtime Configuration:
- A new set of Alembic scripts has been added for CDR tables. This will create
a 'cdr' table with the default schema that Asterisk expects.

Utilities:
- The refcounter program has been removed in favor of the refcounter.py script
in contrib/scripts.

===========================================================
===========================================================
3 changes: 3 additions & 0 deletions build_tools/cflags.xml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@
<member name="DEBUG_THREADS" displayname="Enable Thread Debugging">
<support_level>core</support_level>
</member>
<member name="REF_DEBUG" displayname="Enable reference count debugging">
<support_level>extended</support_level>
</member>
<member name="STATIC_BUILD" displayname="Build static binaries">
<support_level>extended</support_level>
</member>
Expand Down
15 changes: 0 additions & 15 deletions channels/chan_sip.c
Original file line number Diff line number Diff line change
Expand Up @@ -230,21 +230,6 @@ ASTERISK_FILE_VERSION(__FILE__, "$Revision$")

#include "asterisk/network.h"
#include "asterisk/paths.h" /* need ast_config_AST_SYSTEM_NAME */
/*
Uncomment the define below, if you are having refcount related memory leaks.
With this uncommented, this module will generate a file, /tmp/refs, which contains
a history of the ao2_ref() calls. To be useful, all calls to ao2_* functions should
be modified to ao2_t_* calls, and include a tag describing what is happening with
enough detail, to make pairing up a reference count increment with its corresponding decrement.
The refcounter program in utils/ can be invaluable in highlighting objects that are not
balanced, along with the complete history for that object.
In normal operation, the macros defined will throw away the tags, so they do not
affect the speed of the program at all. They can be considered to be documentation.

Note: This must also be enabled in channels/sip/security_events.c
*/
/* #define REF_DEBUG 1 */

#include "asterisk/lock.h"
#include "asterisk/config.h"
#include "asterisk/module.h"
Expand Down
2 changes: 0 additions & 2 deletions channels/sip/security_events.c
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,6 @@

ASTERISK_FILE_VERSION(__FILE__, "$Revision$")

/* #define REF_DEBUG 1 */

#include "include/sip.h"
#include "include/security_events.h"

Expand Down
171 changes: 171 additions & 0 deletions contrib/scripts/refcounter.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
#!/usr/bin/env python
"""Process a ref debug log
This file will process a log file created by the REF_DEBUG
build option in Asterisk.
See http://www.asterisk.org for more information about
the Asterisk project. Please do not directly contact
any of the maintainers of this project for assistance;
the project provides a web site, mailing lists and IRC
channels for your use.
This program is free software, distributed under the terms of
the GNU General Public License Version 2. See the LICENSE file
at the top of the source tree.
Copyright (C) 2014, Digium, Inc.
Matt Jordan <[email protected]>
"""

import sys
import os

from optparse import OptionParser

def parse_line(line):
"""Parse out a line into its constituent parts.
Keyword Arguments:
line The line from a ref debug log to parse out
Returns:
A dictionary containing the options, or None
"""
tokens = line.strip().split(',', 7)
if len(tokens) < 8:
print "ERROR: ref debug line '%s' contains fewer tokens than " \
"expected: %d" % (line.strip(), len(tokens))
return None

processed_line = {'addr': tokens[0],
'delta': tokens[1],
'thread_id': tokens[2],
'file': tokens[3],
'line': tokens[4],
'function': tokens[5],
'state': tokens[6],
'tag': tokens[7],
}
return processed_line


def process_file(filename):
"""The routine that kicks off processing a ref file
Keyword Arguments:
filename The full path to the file to process
Returns:
A tuple containing:
- A list of objects whose lifetimes were completed
- A list of objects whose lifetimes were not completed
- A list of objects whose lifetimes are skewed
"""

finished_objects = []
leaked_objects = []
skewed_objects = []
current_objects = {}

with open(filename, 'r') as ref_file:
for line in ref_file:
parsed_line = parse_line(line)
if not parsed_line:
continue

obj = parsed_line['addr']

if obj not in current_objects:
current_objects[obj] = []
if 'constructor' not in parsed_line['state']:
skewed_objects.append(current_objects[obj])
current_objects[obj].append(parsed_line)

if 'destructor' in parsed_line['state']:
lifetime = current_objects.get(obj)
finished_objects.append(lifetime)
del current_objects[obj]

leaked_objects = current_objects.values()
return (finished_objects, leaked_objects, skewed_objects)


def print_objects(objects, prefix=""):
"""Prints out the objects that were processed
Keyword Arguments:
objects A list of objects to print
prefix A prefix to print that specifies something about
this object
"""

print "======== %s Objects ========" % prefix
print "\n"
for obj in objects:
print "==== %s Object %s history ====" % (prefix, obj[0]['addr'])
for entry in obj:
print "[%s] %s:%s %s: %s %s - [%s]" % (entry['thread_id'],
entry['file'], entry['line'],
entry['function'],
entry['delta'], entry['tag'],
entry['state'])
print "\n"


def main(argv=None):
"""Main entry point for the script"""

ret_code = 0

if argv is None:
argv = sys.argv

parser = OptionParser()

parser.add_option("-f", "--file", action="store", type="string",
dest="filepath", default="/var/log/asterisk/refs",
help="The full path to the refs file to process")
parser.add_option("-l", "--suppress-leaks", action="store_false",
dest="leaks", default=True,
help="If specified, don't output leaked objects")
parser.add_option("-n", "--suppress-normal", action="store_false",
dest="normal", default=True,
help="If specified, don't output objects with a " \
"complete lifetime")
parser.add_option("-s", "--suppress-skewed", action="store_false",
dest="skewed", default=True,
help="If specified, don't output objects with a " \
"skewed lifetime")

(options, args) = parser.parse_args(argv)

if not os.path.isfile(options.filepath):
print >>sys.stderr, "File not found: %s" % options.filepath
return -1

try:
(finished_objects,
leaked_objects,
skewed_objects) = process_file(options.filepath)

if options.leaks and len(leaked_objects):
print_objects(leaked_objects, "Leaked")
ret_code |= 1

if options.skewed and len(skewed_objects):
print_objects(skewed_objects, "Skewed")
ret_code |= 2

if options.normal:
print_objects(finished_objects, "Finalized")

except (KeyboardInterrupt, SystemExit, IOError):
print >>sys.stderr, "File processing cancelled"
return -1

return ret_code


if __name__ == "__main__":
sys.exit(main(sys.argv))
129 changes: 70 additions & 59 deletions include/asterisk/astobj2.h
Original file line number Diff line number Diff line change
Expand Up @@ -141,11 +141,10 @@ list. However there is no ordering among elements.
\note DEBUGGING REF COUNTS BIBLE:
An interface to help debug refcounting is provided
in this package. It is dependent on the REF_DEBUG macro being
defined in a source file, before the #include of astobj2.h,
and in using variants of the normal ao2_xxx functions
that are named ao2_t_xxx instead, with an extra argument, a string,
that will be printed out into /tmp/refs when the refcount for an
object is changed.
defined via menuselect and in using variants of the normal ao2_xxxx
function that are named ao2_t_xxxx instead, with an extra argument,
a string that will be printed out into the refs log file when the
refcount for an object is changed.
these ao2_t_xxx variants are provided:
Expand All @@ -164,45 +163,56 @@ each case is supposed to be a string pointer, a "tag", that should contain
enough of an explanation, that you can pair operations that increment the
ref count, with operations that are meant to decrement the refcount.
Each of these calls will generate at least one line of output in /tmp/refs.
These lines look like this:
Each of these calls will generate at least one line of output in in the refs
log files. These lines look like this:
...
0x8756f00 =1 chan_sip.c:22240:load_module (allocate users)
0x86e3408 =1 chan_sip.c:22241:load_module (allocate peers)
0x86dd380 =1 chan_sip.c:22242:load_module (allocate peers_by_ip)
0x822d020 =1 chan_sip.c:22243:load_module (allocate dialogs)
0x8930fd8 =1 chan_sip.c:20025:build_peer (allocate a peer struct)
0x8930fd8 +1 chan_sip.c:21467:reload_config (link peer into peer table) [@1]
0x8930fd8 -1 chan_sip.c:2370:unref_peer (unref_peer: from reload_config) [@2]
0x89318b0 =1 chan_sip.c:20025:build_peer (allocate a peer struct)
0x89318b0 +1 chan_sip.c:21467:reload_config (link peer into peer table) [@1]
0x89318b0 -1 chan_sip.c:2370:unref_peer (unref_peer: from reload_config) [@2]
0x8930218 =1 chan_sip.c:20025:build_peer (allocate a peer struct)
0x8930218 +1 chan_sip.c:21539:reload_config (link peer into peers table) [@1]
0x868c040 -1 chan_sip.c:2424:dialog_unlink_all (unset the relatedpeer->call field in tandem with relatedpeer field itself) [@2]
0x868c040 -1 chan_sip.c:2443:dialog_unlink_all (Let's unbump the count in the unlink so the poor pvt can disappear if it is time) [@1]
0x868c040 **call destructor** chan_sip.c:2443:dialog_unlink_all (Let's unbump the count in the unlink so the poor pvt can disappear if it is time)
0x8cc07e8 -1 chan_sip.c:2370:unref_peer (unsetting a dialog relatedpeer field in sip_destroy) [@3]
0x8cc07e8 +1 chan_sip.c:3876:find_peer (ao2_find in peers table) [@2]
0x8cc07e8 -1 chan_sip.c:2370:unref_peer (unref_peer, from sip_devicestate, release ref from find_peer) [@3]
0x8756f00,+1,1234,chan_sip.c,22240,load_module,**constructor**,allocate users
0x86e3408,+1,1234,chan_sip.c,22241,load_module,**constructor**,allocate peers
0x86dd380,+1,1234,chan_sip.c,22242,load_module,**constructor**,allocate peers_by_ip
0x822d020,+1,1234,chan_sip.c,22243,load_module,**constructor**,allocate dialogs
0x8930fd8,+1,1234,chan_sip.c,20025,build_peer,**constructor**,allocate a peer struct
0x8930fd8,+1,1234,chan_sip.c,21467,reload_config,1,link peer into peer table
0x8930fd8,-1,1234,chan_sip.c,2370,unref_peer,2,unref_peer: from reload_config
0x89318b0,1,5678,chan_sip.c,20025,build_peer,**constructor**,allocate a peer struct
0x89318b0,+1,5678,chan_sip.c,21467,reload_config,1,link peer into peer table
0x89318b0,-1,1234,chan_sip.c,2370,unref_peer,2,unref_peer: from reload_config
0x8930218,+1,1234,chan_sip.c,20025,build_peer,**constructor**,allocate a peer struct
0x8930218,+1,1234,chan_sip.c,21539,reload_config,1,link peer into peers table
0x868c040,-1,1234,chan_sip.c,2424,dialog_unlink_all,2,unset the relatedpeer->call field in tandem with relatedpeer field itself
0x868c040,-1,1234,chan_sip.c,2443,dialog_unlink_all,1,Let's unbump the count in the unlink so the poor pvt can disappear if it is time
0x868c040,-1,1234,chan_sip.c,2443,dialog_unlink_all,**destructor**,Let's unbump the count in the unlink so the poor pvt can disappear if it is time
0x8cc07e8,-1,1234,chan_sip.c,2370,unref_peer,3,unsetting a dialog relatedpeer field in sip_destroy
0x8cc07e8,+1,1234,chan_sip.c,3876,find_peer,2,ao2_find in peers table
0x8cc07e8,-1,1234,chan_sip.c,2370,unref_peer,3,unref_peer, from sip_devicestate, release ref from find_peer
...
The first column is the object address.
The second column reflects how the operation affected the ref count
for that object. Creation sets the ref count to 1 (=1).
increment or decrement and amount are specified (-1/+1).
The remainder of the line specifies where in the file the call was made,
and the function name, and the tag supplied in the function call.
The **call destructor** is specified when the destroy routine is
run for an object. It does not affect the ref count, but is important
in debugging, because it is possible to have the astobj2 system run it
multiple times on the same object, commonly fatal to asterisk.
This uses a comma delineated format. The columns in the format are as
follows:
- The first column is the object address.
- The second column reflects how the operation affected the ref count
for that object. A change in the ref count is reflected either as
an increment (+) or decrement (-), as well as the amount it changed
by.
- The third column is the ID of the thread that modified the reference
count.
- The fourth column is the source file that the change in reference was
issued from.
- The fifth column is the line number of the source file that the ref
change was issued from.
- The sixth column is the name of the function that the ref change was
issued from.
- The seventh column indicates either (a) construction of the object via
the special tag **constructor**; (b) destruction of the object via
the special tag **destructor**; (c) the previous reference count
prior to this reference change.
- The eighth column is a special tag added by the developer to provide
context for the ref change. Note that any subsequent columns are
considered to be part of this tag.
Sometimes you have some helper functions to do object ref/unref
operations. Using these normally hides the place where these
functions were called. To get the location where these functions
were called to appear in /tmp/refs, you can do this sort of thing:
were called to appear in /refs, you can do this sort of thing:
#ifdef REF_DEBUG
#define dialog_ref(arg1,arg2) dialog_ref_debug((arg1),(arg2), __FILE__, __LINE__, __PRETTY_FUNCTION__)
Expand Down Expand Up @@ -250,13 +260,14 @@ file, function, and line number info provided. You might find this
well worth the effort to help track these function calls in the code.
To find out why objects are not destroyed (a common bug), you can
edit the source file to use the ao2_t_* variants, add the #define REF_DEBUG 1
before the #include "asterisk/astobj2.h" line, and add a descriptive
tag to each call. Recompile, and run Asterisk, exit asterisk with
"stop gracefully", which should result in every object being destroyed.
Then, you can "sort -k 1 /tmp/refs > x1" to get a sorted list of
all the objects, or you can use "util/refcounter" to scan the file
for you and output any problems it finds.
edit the source file to use the ao2_t_* variants, enable REF_DEBUG
in menuselect, and add a descriptive tag to each call. Recompile,
and run Asterisk, exit asterisk with "core stop gracefully", which should
result in every object being destroyed.
Then, you can "sort -k 1 {AST_LOG_DIR}/refs > x1" to get a sorted list of
all the objects, or you can use "contrib/script/refcounter.py" to scan
the file for you and output any problems it finds.
The above may seem astronomically more work than it is worth to debug
reference counts, which may be true in "simple" situations, but for
Expand All @@ -272,29 +283,29 @@ For instance, here is an example of this for a dialog object in
chan_sip, that was not getting destroyed, after I moved the lines around
to pair operations:
0x83787a0 =1 chan_sip.c:5733:sip_alloc (allocate a dialog(pvt) struct)
0x83787a0 -1 chan_sip.c:19173:sip_poke_peer (unref dialog at end of sip_poke_peer, obtained from sip_alloc, just before it goes out of scope) [@4]
0x83787a0,+1,1234,chan_sip.c,5733,sip_alloc,**constructor**,(allocate a dialog(pvt) struct)
0x83787a0,-1,1234,chan_sip.c,19173,sip_poke_peer,4,(unref dialog at end of sip_poke_peer, obtained from sip_alloc, just before it goes out of scope)
0x83787a0 +1 chan_sip.c:5854:sip_alloc (link pvt into dialogs table) [@1]
0x83787a0 -1 chan_sip.c:19150:sip_poke_peer (About to change the callid -- remove the old name) [@3]
0x83787a0 +1 chan_sip.c:19152:sip_poke_peer (Linking in under new name) [@2]
0x83787a0 -1 chan_sip.c:2399:dialog_unlink_all (unlinking dialog via ao2_unlink) [@5]
0x83787a0,+1,1234,chan_sip.c,5854,sip_alloc,1,(link pvt into dialogs table)
0x83787a0,-1,1234,chan_sip.c,19150,sip_poke_peer,3,(About to change the callid -- remove the old name)
0x83787a0,+1,1234,chan_sip.c,19152,sip_poke_peer,2,(Linking in under new name)
0x83787a0,-1,1234,chan_sip.c,2399,dialog_unlink_all,5,(unlinking dialog via ao2_unlink)
0x83787a0 +1 chan_sip.c:19130:sip_poke_peer (copy sip alloc from p to peer->call) [@2]
0x83787a0,+1,1234,chan_sip.c,19130,sip_poke_peer,2,(copy sip alloc from p to peer->call)
0x83787a0 +1 chan_sip.c:2996:__sip_reliable_xmit (__sip_reliable_xmit: setting pkt->owner) [@3]
0x83787a0 -1 chan_sip.c:2425:dialog_unlink_all (remove all current packets in this dialog, and the pointer to the dialog too as part of __sip_destroy) [@4]
0x83787a0,+1,1234,chan_sip.c,2996,__sip_reliable_xmit,3,(__sip_reliable_xmit: setting pkt->owner)
0x83787a0,-1,1234,chan_sip.c,2425,dialog_unlink_all,4,(remove all current packets in this dialog, and the pointer to the dialog too as part of __sip_destroy)
0x83787a0 +1 chan_sip.c:22356:unload_module (iterate thru dialogs) [@4]
0x83787a0 -1 chan_sip.c:22359:unload_module (toss dialog ptr from iterator_next) [@5]
0x83787a0,+1,1234,chan_sip.c,22356,unload_module,4,(iterate thru dialogs)
0x83787a0,-1,1234,chan_sip.c,22359,unload_module,5,(toss dialog ptr from iterator_next)
0x83787a0 +1 chan_sip.c:22373:unload_module (iterate thru dialogs) [@3]
0x83787a0 -1 chan_sip.c:22375:unload_module (throw away iterator result) [@2]
0x83787a0,+1,1234,chan_sip.c,22373,unload_module,3,(iterate thru dialogs)
0x83787a0,-1,1234,chan_sip.c,22375,unload_module,2,(throw away iterator result)
0x83787a0 +1 chan_sip.c:2397:dialog_unlink_all (Let's bump the count in the unlink so it doesn't accidentally become dead before we are done) [@4]
0x83787a0 -1 chan_sip.c:2436:dialog_unlink_all (Let's unbump the count in the unlink so the poor pvt can disappear if it is time) [@3]
0x83787a0,+1,1234,chan_sip.c,2397,dialog_unlink_all,4,(Let's bump the count in the unlink so it doesn't accidentally become dead before we are done)
0x83787a0,-1,1234,chan_sip.c,2436,dialog_unlink_all,3,(Let's unbump the count in the unlink so the poor pvt can disappear if it is time)
As you can see, only one unbalanced operation is in the list, a ref count increment when
the peer->call was set, but no corresponding decrement was made...
Expand Down
Loading

0 comments on commit 95222bf

Please sign in to comment.