• Bug#267066: libsvn0: repository becomes wedged, subversion deadlocks

    From Marius Gedminas@1:229/2 to All on Fri Aug 20 16:40:23 2004
    From: [email protected]

    Package: libsvn0
    Version: 1.0.6-1.1
    Severity: important

    Recently (since last weekend) a recurring problem started plaguing the SchoolTool subversion repository at http://source.schooltool.org/. The symptoms are quite different from the ones described in bugs #266314 and #252974: all processes that try to access the repository (including svn, svnadmin, svnserve, python for viewcvs.cgi, and apache2 with mod_svn)
    just hang in an infinite loop. Running strace shows that they all
    execute the same loop, repeatedly calling

    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)

    ltrace on those pids reports nothing.
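
For readers unfamiliar with the idiom: a select() call with no descriptor sets and only a timeout is a common way to sleep, so the loop above looks like a process waking once a second to re-check some condition. A minimal Python sketch of the same call on a Unix system follows. The function name is hypothetical, and the short timeout is only there to keep the demonstration quick (the trace above uses 1 second).

```python
import select
import time


def sleep_via_select(seconds):
    """Block for `seconds` using select() with empty descriptor sets (Unix only)."""
    start = time.monotonic()
    # There are no descriptors to watch, so the call can only return by
    # timing out, which strace reports as "= 0 (Timeout)".
    ready = select.select([], [], [], seconds)
    elapsed = time.monotonic() - start
    return ready, elapsed


if __name__ == "__main__":
    ready, elapsed = sleep_via_select(0.2)
    print(ready, f"{elapsed:.2f}s")
```

Seeing nothing but this call in strace therefore says the process is polling, not what it is polling for; the stack trace is what identifies the caller.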

    fuser shows that all of those processes are accessing db.lock from the repository. lsof shows that they all hold read locks on that file.
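
As a rough cross-check of what lsof reports, Linux also lists file locks in /proc/locks. The sketch below parses that file and matches entries against the inode of a given path. It assumes the field layout of recent kernels (ordinal, lock kind, ADVISORY or MANDATORY, READ or WRITE, pid, then major:minor:inode); the 2.4 kernel in this report may print extra fields, and the function names are hypothetical.

```python
import os


def parse_proc_locks(text):
    """Parse /proc/locks text into dicts with kind, mode, pid and inode."""
    locks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) > 1 and fields[1] == "->":
            # Blocked waiters are listed with an extra "->" marker.
            del fields[1]
        if len(fields) < 6:
            continue
        device_inode = fields[5].split(":")
        if len(device_inode) != 3:
            continue
        locks.append({
            "kind": fields[1],   # e.g. POSIX or FLOCK
            "mode": fields[3],   # READ or WRITE
            "pid": int(fields[4]),
            "inode": int(device_inode[2]),
        })
    return locks


def holders(path, text=None):
    """Return (pid, mode) pairs for lock entries on `path`.

    Only the inode number is compared, so a file on another filesystem
    with the same inode number would also match.
    """
    if text is None:
        with open("/proc/locks") as handle:
            text = handle.read()
    inode = os.stat(path).st_ino
    return [(lock["pid"], lock["mode"])
            for lock in parse_proc_locks(text)
            if lock["inode"] == inode]
```

Calling holders("/var/lib/svn/schooltool/locks/db.lock") on the affected machine would be expected to list the hung processes with mode READ, if the layout assumption holds.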

    Looking at the stack trace with gdb reveals that all of them have the
    same topmost 9 frames:

    #0 0x4036bdd2 in select () from /lib/libc.so.6
    #1 0x401bc448 in db_xa_switch_4002 () from /usr/lib/libdb-4.2.so
    #2 0x4019d5c9 in __memp_sync_int_4002 () from /usr/lib/libdb-4.2.so
    #3 0x4019cd24 in __memp_sync_4002 () from /usr/lib/libdb-4.2.so
    #4 0x401a46ee in __txn_checkpoint_4002 () from /usr/lib/libdb-4.2.so
    #5 0x401a43cd in __txn_checkpoint_pp_4002 () from /usr/lib/libdb-4.2.so
    #6 0x40048e60 in svn_fs_list_transactions () from /usr/lib/libsvn_fs-1.so.0
    #7 0x40048f1c in svn_fs_list_transactions () from /usr/lib/libsvn_fs-1.so.0
    #8 0x4004904f in svn_fs__retry_txn () from /usr/lib/libsvn_fs-1.so.0
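
Comparing backtraces from several processes by eye is error-prone, so here is a small sketch of how the comparison could be automated. It assumes each backtrace has been saved as text in the gdb format shown above; the function names and the regular expression are illustrative, not part of any tool mentioned in this report.

```python
import re

# Matches gdb backtrace lines such as
# "#1 0x401bc448 in db_xa_switch_4002 () from /usr/lib/libdb-4.2.so"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def frame_names(backtrace):
    """Return the function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


def common_top_frames(backtraces):
    """Return the longest run of innermost frames shared by every backtrace."""
    all_names = [frame_names(bt) for bt in backtraces]
    if not all_names:
        return []
    common = []
    # zip() stops at the shortest backtrace, so no index checks are needed.
    for frames in zip(*all_names):
        if len(set(frames)) != 1:
            break
        common.append(frames[0])
    return common
```

Only function names are compared, because return addresses may differ between binaries even when the call path is the same.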

    Killing all processes that are accessing the repository does not help --
    any newly started processes also hang. Stracing svnadmin verify shows
    that the last syscalls performed before the select loop are:

    ...
    stat64("/svn/schooltool/db/uuids", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0
    open("/svn/schooltool/db/uuids", O_RDWR|O_LARGEFILE) = 11
    fcntl64(11, F_SETFD, FD_CLOEXEC) = 0
    fstat64(11, {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0
    time(NULL) = 1093011345
    time([1093011345]) = 1093011345
    stat64("/svn/schooltool/db/log.0000000275", {st_mode=S_IFREG|0664, st_size=590372, ...}) = 0
    open("/svn/schooltool/db/log.0000000275", O_RDWR|O_CREAT|O_LARGEFILE, 0666) = 12
    fcntl64(12, F_SETFD, FD_CLOEXEC) = 0
    read(12, "\202\372\17\0\34\0\0\0$(\21\324\210\t\4\0\10\0\0\0\0\0"..., 28) = 28
    _llseek(12, 590372, [590372], SEEK_SET) = 0
    write(12, "\322\1\t\0R\0\0\0\275\23\255\336\2\0\0\0\265\201\7\200"..., 1002) = 1002
    fsync(12) = 0
    pwrite(7, "\370\0\0\0!\24\10\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t"..., 4096, 0) = 4096
    pwrite(6, "\20\1\0\0c\344\n\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0"..., 4096, 0) = 4096
    pwrite(5, "\371\0\0\0\266c\3\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t"..., 4096, 0) = 4096
    pwrite(8, "\23\1\0\0\23B\4\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0"..., 4096, 0) = 4096
    pwrite(10, "\23\1\0\0\220)\5\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0"..., 4096, 0) = 4096
    pwrite(9, "\22\1\0\0\362w\t\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0"..., 4096, 0) = 4096
    pwrite(11, "\0\0\0\0\1\0\0\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0\0"..., 4096, 0) = 4096
    pwrite(4, "\23\1\0\0\362Q\3\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0"..., 4096, 0) = 4096
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)

    The file descriptors at that time are (from ls -l /proc/$pid/fd):

    0 -> /dev/pts/4
    1 -> /dev/pts/4
    2 -> /dev/pts/4
    3 -> /var/lib/svn/schooltool/locks/db.lock
    4 -> /var/lib/svn/schooltool/db/nodes
    5 -> /var/lib/svn/schooltool/db/revisions
    6 -> /var/lib/svn/schooltool/db/transactions
    7 -> /var/lib/svn/schooltool/db/copies
    8 -> /var/lib/svn/schooltool/db/changes
    9 -> /var/lib/svn/schooltool/db/representations
    10 -> /var/lib/svn/schooltool/db/strings
    11 -> /var/lib/svn/schooltool/db/uuids
    12 -> /var/lib/svn/schooltool/db/log.0000000275
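
The same listing can be collected programmatically, which helps when several hung processes need to be compared. A minimal sketch, assuming a Linux /proc filesystem and permission to read the target process's fd directory; the function name is hypothetical.

```python
import os


def open_files(pid="self"):
    """Map each open descriptor of `pid` to its target, like ls -l /proc/PID/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return dict(sorted(targets.items()))


if __name__ == "__main__":
    for fd, target in open_files().items():
        print(f"{fd} -> {target}")
```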

    Killing all processes that are accessing the repository and then running svnadmin recover helps.
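
The workaround can be sketched as a small script. It only wraps the two commands involved, fuser and svnadmin recover, refuses to run recovery while fuser still reports a user of the lock file, and defaults to a dry run. The function names are hypothetical and stopping the processes is deliberately left to the operator.

```python
import shlex
import subprocess


def recovery_commands(repo_path):
    """Return the check and recover commands for a repository path."""
    lock_file = f"{repo_path}/locks/db.lock"
    return [
        ["fuser", "-v", lock_file],          # who still has the lock file open?
        ["svnadmin", "recover", repo_path],  # run Berkeley DB recovery
    ]


def recover(repo_path, dry_run=True):
    """Run recovery only when nothing is using the repository lock file."""
    check_cmd, recover_cmd = recovery_commands(repo_path)
    if dry_run:
        for command in (check_cmd, recover_cmd):
            print("+", shlex.join(command))
        return False
    # fuser exits with status 0 when at least one process accesses the file.
    if subprocess.run(check_cmd).returncode == 0:
        print("processes are still using the repository; stop them first")
        return False
    subprocess.run(recover_cmd, check=True)
    return True


if __name__ == "__main__":
    recover("/var/lib/svn/schooltool")
```

Run it as the user that owns the repository files, so that recovery does not leave files the server processes cannot write to.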

    It is possible, but not certain, that these wedges are triggered by
    msnbot indexing the schooltool repository, as indicated by apache's
    access.log. There are no problems registered in error.log at the time of
    the wedge, although there are a number of error messages about problems
    closing the Berkeley DB execution environment from earlier days when I
    had resorted to killall and killall -9.


    -- System Information:
    Debian Release: 3.0
    APT prefers testing
    APT policy: (300, 'testing'), (200, 'unstable')
    Architecture: i386 (i686)
    Kernel: Linux 2.4.26-1-686
    Locale: LANG=C, LC_CTYPE=lt_LT.UTF-8

    Versions of packages libsvn0 depends on:
    ii  libapr0        2.0.50-9         The Apache Portable Runtime
    ii  libc6          2.3.2.ds1-13     GNU C Library: Shared libraries an
    ii  libdb4.2       4.2.52-16        Berkeley v4.2 Database Libraries [
    ii  libexpat1      1.95.6-8         XML parsing C library - runtime li
    ii  libldap2       2.1.30-2         OpenLDAP libraries
    ii  libneon24      0.24.7.dfsg-0.1  An HTTP and WebDAV client library
    ii  libperl5.8     5.8.4-2          Shared Perl library.
    ii  libssl0.9.7    0.9.7d-4         SSL shared libraries
    ii  libswig1.3.21  1.3.21-5         Runtime support libraries for swig
    ii  libxml2        2.6.11-3         GNOME XML library
    ii  python2.3      2.3.4-5          An interactive high-level object-o
    ii  zlib1g         1:1.2.1.1-5      compression library - runtime

    -- no debconf information

    Marius Gedminas
    --
    We have an advanced scalable groupware communication environment (email)
    -- Alan Cox

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Matt Kraai@1:229/2 to Marius Gedminas on Sat Aug 21 03:00:15 2004
    From: [email protected]

    On Fri, Aug 20, 2004 at 05:23:28PM +0300, Marius Gedminas wrote:
    > Recently (since last weekend) a recurring problem started plaguing the SchoolTool subversion repository at http://source.schooltool.org/. The symptoms are quite different from the ones described in bugs #266314 and #252974: all processes that try to access the repository (including svn, svnadmin, svnserve, python for viewcvs.cgi, and apache2 with mod_svn)
    > just hang in an infinite loop.

    Have you looked at

    http://subversion.tigris.org/project_faq.html#stuck-bdb-repos

    ?

    --
    Matt Kraai [email protected] http://ftbfs.org/