• Bug#1098919: Help to improve comparison in english/international/l10n/s

    From Adriano Rafael Gomes@21:1/5 to All on Thu Feb 27 15:40:02 2025
    XPost: linux.debian.bugs.dist

    Hi,

    I don't speak Perl, but maybe something like changing this line:
    https://salsa.debian.org/webmaster-team/webwml/-/blob/master/english/international/l10n/scripts/transmonitor-check?ref_type=heads#L703

    from:
    open(PO, "< $PO_DIR/$filename");

    to:
    open(PO, "-|", "msgattrib", "--no-wrap", "$PO_DIR/$filename");

    could do the trick?
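
    The same idea, sketched in Python rather than the Perl of transmonitor-check: when no -o option is given, msgattrib --no-wrap prints the unwrapped file on standard output, so a reader sees every header field on one line and the file on disk is never modified. This is only a minimal sketch; the helper names unwrap_command and iter_po_lines are made up for illustration, and it assumes msgattrib (gettext) is installed and the po file is UTF-8.

```python
import subprocess
import sys
from typing import Iterator, List, Sequence


def unwrap_command(po_path: str) -> List[str]:
    """Build the msgattrib call; with no -o option it writes to stdout."""
    return ["msgattrib", "--no-wrap", po_path]


def iter_po_lines(command: Sequence[str]) -> Iterator[str]:
    """Yield the lines printed by `command`, started without a shell."""
    with subprocess.Popen(
        list(command),
        stdout=subprocess.PIPE,
        encoding="utf-8",  # assumption: the po file is UTF-8
        errors="replace",
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the "with" block waits for the process to finish.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, list(command))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the Language-Team header line of the po file given as argument.
    for po_line in iter_po_lines(unwrap_command(sys.argv[1])):
        if po_line.startswith('"Language-Team:'):
            print(po_line)
```

    Passing the command as a list avoids the shell, so file names with spaces or quotes need no escaping.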

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paulo Henrique Santana@21:1/5 to All on Wed Feb 26 22:30:01 2025
    XPost: linux.debian.bugs.dist

    Hi,

    More details about the bug.

    If you look at this page, you will see that the first table lists some packages: https://www.debian.org/international/l10n/po/pt_BR

    Looking at some of their po files, we can see the name and the address of the team on the same line:
    https://sources.debian.org/src/shadow/1%3A4.17.3-1/po/pt_BR.po/
    https://sources.debian.org/src/xscreensaver/6.08%2Bdfsg1-1/po/pt_BR.po/
    https://sources.debian.org/src/pppconfig/2.3.30/po/pt_BR.po/

    "Language-Team: Debian-BR Project <[email protected]>\n"

    There are other variations:
    "Language-Team: Debian-BR <[email protected]>\n"
    "Language-Team: l10n portuguese <[email protected]>\n"

    As I said before, to build the first table, the script compares the address of the team with the Language-Team field.

    When we look at other packages in the second table, we can see that the line is longer than 80 columns, so the address is split across two lines:

    https://sources.debian.org/src/cwidget/0.5.18-6/po/pt_BR.po/
    "Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists.debian."
    "org>\n"

    https://sources.debian.org/src/debian-security-support/1%3A13%2B2025.01.30/po/pt_BR.po/
    "Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists."
    "debian.org>\n"

    In the second table, you can see that the "Equipe" column is empty for these cases. So the package was translated by the debian-l10n-portuguese@lists.debian.org team, but the script can't deal with the broken lines.

    For the po-debconf files, we have a similar issue. Around 80 packages are not listed on this page:
    https://www.debian.org/international/l10n/po-debconf/pt_BR

    Some of them:
    https://sources.debian.org/src/anna/1.96/debian/po/pt_BR.po/
    https://sources.debian.org/src/apt-setup/1%3A0.192/debian/po/pt_BR.po/
    https://sources.debian.org/src/base-installer/1.222/debian/po/pt_BR.po/

    "Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists.debian."
    "org>\n"

    Ideally, we would like the script to also read the second line when the address is broken.
    Or maybe it could consider only the first part of the address, up to the @: debian-l10n-portuguese

    https://salsa.debian.org/webmaster-team/webwml/-/blob/master/english/international/l10n/scripts/gen-files.pl?ref_type=heads#L331
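
    Both wishes can be sketched without any external tool: adjacent quoted strings in a PO entry are simply concatenated, so joining them restores the broken address, and the part before the @ is then easy to extract. The sketch below is Python, not the Perl of gen-files.pl, and header_fields, team_address and local_part are hypothetical names chosen for illustration. It handles only the \n escape, which should be enough for header fields.

```python
import re
from typing import Dict, Iterable, Optional

_QUOTED = re.compile(r'^"(.*)"$')


def header_fields(po_lines: Iterable[str]) -> Dict[str, str]:
    """Return the fields of the header entry (the first msgstr of the file)."""
    parts = []
    collecting = False
    for line in po_lines:
        text = line.strip()
        if not collecting:
            if not text.startswith("msgstr "):
                continue
            collecting = True
            text = text[len("msgstr "):]
        match = _QUOTED.match(text)
        if match is None:
            break  # blank line or next keyword: the header entry is over
        parts.append(match.group(1))
    # Joining the string literals undoes the wrapping at ~79 characters.
    header = "".join(parts)
    fields = {}
    for entry in header.split("\\n"):  # split on the literal \n escape
        name, sep, value = entry.partition(": ")
        if sep:
            fields[name] = value
    return fields


def team_address(fields: Dict[str, str]) -> Optional[str]:
    """Extract the address between < and > in the Language-Team field."""
    match = re.search(r"<([^<>]+)>", fields.get("Language-Team", ""))
    return match.group(1) if match else None


def local_part(address: str) -> str:
    """Return the part of the address before the @."""
    return address.partition("@")[0]


# Header wrapped as in the debian-security-support example above.
WRAPPED = [
    'msgid ""',
    'msgstr ""',
    '"Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists."',
    r'"debian.org>\n"',
    r'"Language: pt_BR\n"',
    "",
    'msgid "Use non-free firmware?"',
]

if __name__ == "__main__":
    address = team_address(header_fields(WRAPPED))
    print(address, local_part(address or ""))
```

    Comparing the full joined address is stricter than comparing only the part before the @, which would also match an address with the same local part on another domain.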

    Best regards,


    -----
    Paulo Henrique de Lima Santana (phls)
    Belo Horizonte - Brasil
    Debian Developer
    Associado do Instituto para Conservação de Tecnologias Livres
    Site: http://phls.com.br
    GPG ID: 0443C450

  • From Holger Wansing@21:1/5 to All on Thu Feb 27 00:10:01 2025
    XPost: linux.debian.bugs.dist

    A possible approach would be to convert all po files before processing, so that they no longer have the usual line breaks at ~79 characters.
    Instead, each msgid/msgstr would be on a single line.
    That would solve your issue here, I think.

    I have attached two files, to show the results.
    [[ Attention: your mail client may break the display of the files! Prefer to view them in a plain text editor, on a device that can show very long lines! ]]


    That can be done with msgattrib:

    To convert a po file with lines at max 79 chars into 'long lines' which
    have the msgid's/msgstr's completely on one line:
    msgattrib --no-wrap -o po-file_without-line-break.po po-file-with-line-break-at-79-chars.po

    To convert a file with 'long lines' into a file with line breaks at
    79 chars:
    msgattrib -o po-file-with-line-break-at-79-chars.po po-file_without-line-break.po



    However, since the po files in all the Debian packages are in
    "line-break at 79 chars" mode, you would have to convert them all to the "without line-breaks" mode before this script does its work.
    And after the script has run, you would have to convert the po files back to
    the "line-break at 79 chars" mode, since this is what translators expect.

    Thus, that means a significant change to this script, which is already very complicated as it is now...
    And it takes a long time to run, since it processes all po files in all
    Debian packages!!!
    Think about the numbers here: bookworm had more than 59000 packages in total!


    Holger


    --
    Holger Wansing <[email protected]>
    PGP-Fingerprint: 496A C6E8 1442 4B34 8508 3529 59F1 87CA 156E B076

    # Translation of Debian Installer templates to Brazilian Portuguese.
    # This file is distributed under the same license as debian-installer.
    #
    # Felipe Augusto van de Wiel (faw) <[email protected]>, 2008-2012.
    # Adriano Rafael Gomes <[email protected]>, 2010-2024.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: debian-installer\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2024-05-11 20:02+0000\n"
    "PO-Revision-Date: 2024-06-22 14:23-0300\n"
    "Last-Translator: Adriano Rafael Gomes <[email protected]>\n"
    "Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists.debian."
    "org>\n"
    "Language: pt_BR\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"

    #. Type: boolean
    #. Description
    #. :sl5:
    #: ../apt-mirror-setup.templates:2001
    msgid "Use non-free firmware?"
    msgstr "Usar firmware não-livre?"

    #. Type: boolean
    #. Description
    #. :sl5:
    #: ../apt-mirror-setup.templates:2001
    msgid ""
    "Firmware is a kind of software providing low-level control of certain "
    "hardware components (such as Wi-Fi cards or audio chipsets), which may not "
    "function fully or at all without it."
    msgstr ""
    "Firmware é um tipo de software que fornece controle de baixo nível para "
    "certos componentes de hardware (tais como placas Wi-Fi ou \"chipsets\" de "
    "áudio), os quais podem não funcionar por completo ou de forma alguma sem ele."

    # Translation of Debian Installer templates to Brazilian Portuguese.
    # This file is distributed under the same license as debian-installer.
    #
    # Felipe Augusto van de Wiel (faw) <[email protected]>, 2008-2012.
    # Adriano Rafael Gomes <[email protected]>, 2010-2024.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: debian-installer\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2024-05-11 20:02+0000\n"
    "PO-Revision-Date: 2024-06-22 14:23-0300\n"
    "Last-Translator: Adriano Rafael Gomes <[email protected]>\n"
    "Language-Team: Brazilian Portuguese <debian-l10n-portuguese@lists.debian.org>\n"
    "Language: pt_BR\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"

    #. Type: boolean
    #. Description
    #. :sl5:
    #: ../apt-mirror-setup.templates:2001
    msgid "Use non-free firmware?"
    msgstr "Usar firmware não-livre?"

    #. Type: boolean
    #. Description
    #. :sl5:
    #: ../apt-mirror-setup.templates:2001
    msgid "Firmware is a kind of software providing low-level control of certain hardware components (such as Wi-Fi cards or audio chipsets), which may not function fully or at all without it."
    msgstr "Firmware é um tipo de software que fornece controle de baixo nível para certos componentes de hardware (tais como placas Wi-Fi ou \"chipsets\" de áudio), os quais podem não funcionar por completo ou de forma alguma sem ele."
