• Bug#266519: gawk: Odd regexp matching problem if LANG=ja_JP

    From Miles Bader@1:229/2 to All on Wed Aug 18 08:10:06 2004
    From: [email protected]

    Package: gawk
    Version: 1:3.1.4-1
    Severity: normal



    Executing the following line in a shell:

    echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=ja_JP gawk '/[Cc]hangeLog/ { print }'

    prints only the first line instead of the expected two:

    --- orig/lisp/ChangeLog


    If LANG is set to C instead, it works as expected (other locales
    such as "de" seem to work too):

    echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=C gawk '/[Cc]hangeLog/ { print }'

    yields:

    --- orig/lisp/ChangeLog
    +++ mod/lisp/ChangeLog


    I'm not sure if the actual encoding has any impact -- ja_JP, ja_JP.utf8,
    and ja_JP.eucjp all exhibit the same problem.
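
    To compare several locales in one go, something like the following
    sketch may help. It assumes the listed locales are actually installed
    (if one is missing, the C library may quietly fall back to C
    behaviour) and that LC_ALL is unset, since LC_ALL would override
    LANG. The count_matches helper and the AWK override variable are
    names made up for this illustration, not part of gawk.

```shell
#!/bin/sh
# Hypothetical override: use $AWK if set, else gawk, else whatever awk exists.
AWK=${AWK:-$(command -v gawk || command -v awk)}

# Print how many of the two test lines match /[Cc]hangeLog/
# when awk runs with LANG set to the locale given as $1.
count_matches() {
    printf '%s\n' '--- orig/lisp/ChangeLog' '+++ mod/lisp/ChangeLog' |
        LANG="$1" "$AWK" '/[Cc]hangeLog/ { n++ } END { print n + 0 }'
}

# Locale names taken from the report; "de_DE" is assumed for "de".
for loc in C de_DE ja_JP ja_JP.utf8 ja_JP.eucjp; do
    echo "$loc: $(count_matches "$loc") of 2 lines matched"
done
```

    Any locale that reports fewer than 2 matches shows the problem
    described above.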


    Thanks,

    -Miles


    -- System Information:
    Debian Release: 3.1
    APT prefers unstable
    APT policy: (500, 'unstable'), (101, 'experimental')
    Architecture: i386 (i686)
    Kernel: Linux 2.6.8.1
    Locale: LANG=ja_JP.UTF-8, LC_CTYPE=ja_JP.UTF-8

    Versions of packages gawk depends on:
    ii libc6 2.3.2.ds1-16 GNU C Library: Shared libraries an

    -- no debconf information


