• Email headers

    From Sirius@21:1/5 to All on Fri Jun 28 11:44:40 2024
    Hi there,

    I have recently gotten myself into running a Usenet server and, with that,
    I have been mirroring the Linux kernel mailing lists to keep a rolling
    two-year archive. As part of that, I have had to figure out which headers
    in list mail are preventing pullnews from pulling the messages from
    nntp.lore.kernel.org (as they make the lists available via NNTP somehow).

    And with that, there is a reflection on the mail headers I am coming
    across that seem to duplicate standard headers. The headers I filter out
    can be found at gopher://photonic.trudheim.com/1/usenet/headers.txt though
    I will call out a few.

    X-Alt-Message-ID:
    X-CodeTwo-MessageID:
    X-Gmail-Original-Message-ID:
    X-Google-Original-Date:
    X-Google-Original-From:
    X-Google-Original-Message-ID:

    I could understand these headers if they used them for something
    internally and then stripped them before the mail left their servers, but
    that is not the case. They have mostly just contained the same data as
    the standard headers. And some of the (other) headers contain large
    amounts of encoded data as well.
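
As an illustration of the filtering described above, here is a minimal sketch that removes the six headers named in the post from a message, using Python's standard `email` module. This is not how pullnews does its filtering; `strip_headers`, `UNWANTED` and the sample message are hypothetical names and data made up for the example.

```python
import email

# Header names taken from the post above. The author's full list lives in
# headers.txt and is not reproduced here.
UNWANTED = (
    "X-Alt-Message-ID",
    "X-CodeTwo-MessageID",
    "X-Gmail-Original-Message-ID",
    "X-Google-Original-Date",
    "X-Google-Original-From",
    "X-Google-Original-Message-ID",
)


def strip_headers(raw, unwanted=UNWANTED):
    """Return the message text with every occurrence of the unwanted headers removed."""
    msg = email.message_from_string(raw)
    for name in unwanted:
        # Deletion is case-insensitive and a no-op when the header is absent.
        del msg[name]
    return msg.as_string()


# Made-up sample message, for illustration only.
sample = (
    "From: alice@example.org\n"
    "Subject: [PATCH] example\n"
    "Message-ID: <1@example.org>\n"
    "X-Gmail-Original-Message-ID: <1@example.org>\n"
    "\n"
    "Body text.\n"
)
cleaned = strip_headers(sample)
print(cleaned)
```

A deny-list like this only removes what it knows about, so any new vendor header passes through until the list is updated.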

    FWIW, it would be nice to see the email and NNTP RFCs harmonise on header
    syntax (colon-space) and make some recommendations about mail systems
    dropping headers, such as the ones spam filters add, when they hand off
    mail to other mail servers. Seeing mails that have been spam-scanned
    multiple times by different solutions en route is kind of pointless when
    spam-scanning is something the destination will likely do itself (as
    everyone has a differing view on what spam is).

    --
    Kind regards,

    /S

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marco Moock@21:1/5 to All on Fri Jun 28 12:50:26 2024
    On 28.06.2024 at 11:44, Sirius wrote:

    > I have recently gotten myself into running a Usenet server and, with
    > that, I have been mirroring the Linux kernel mailing lists to keep
    > a rolling two-year archive. As part of that, I have had to figure
    > out which headers in list mail are preventing pullnews from pulling
    > the messages from nntp.lore.kernel.org (as they make the lists
    > available via NNTP somehow).

    This is a pullnews question. Normally, unknown headers should be
    ignored.

    > And with that, there is a reflection on the mail headers I am coming
    > across that seem to duplicate standard headers. The headers I filter
    > out can be found at
    > gopher://photonic.trudheim.com/1/usenet/headers.txt though I will
    > call out a few.
    >
    > X-Alt-Message-ID:
    > X-CodeTwo-MessageID:
    > X-Gmail-Original-Message-ID:
    > X-Google-Original-Date:
    > X-Google-Original-From:
    > X-Google-Original-Message-ID:

    X- headers are site-local and must be ignored if the application does
    not know them.

    > I could understand these headers if they used them for something
    > internally and then stripped them before the mail left their servers,
    > but that is not the case.

    Those headers must not break anything. X- headers are intended for
    local use, even when they are distributed to others.

    > FWIW, it would be nice to see the email and NNTP RFCs harmonise on
    > header syntax (colon-space) and make some recommendations about mail
    > systems dropping headers, such as the ones spam filters add, when
    > they hand off mail to other mail servers.

    Why should they?

    > Seeing mails that have been spam-scanned multiple times
    > by different solutions en route is kind of pointless when
    > spam-scanning is something the destination will likely do itself
    > (as everyone has a differing view on what spam is).

    Simply ignore them. Applications do the same.

    --
    kind regards
    Marco

    Send spam to [email protected]

  • From Sirius@21:1/5 to Marco Moock on Sat Jun 29 09:19:56 2024
    On Fri, 2024/06/28 at 12:50:26 GMT, Marco Moock wrote:
    > On 28.06.2024 at 11:44, Sirius wrote:
    >
    >> I have recently gotten myself into running a Usenet server and, with
    >> that, I have been mirroring the Linux kernel mailing lists to keep
    >> a rolling two-year archive. As part of that, I have had to figure
    >> out which headers in list mail are preventing pullnews from pulling
    >> the messages from nntp.lore.kernel.org (as they make the lists
    >> available via NNTP somehow).
    >
    > This is a pullnews question. Normally, unknown headers should be
    > ignored.

    Normally, yes. Though as I am pulling it into Inn2, any headers that
    technically conform to the email RFCs but do not have a space after the
    colon become a problem. That, and when there are duplicate From, Subject
    and a couple of other headers. Usenet systems, being stricter on
    headers, balk at those messages.
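
The two problems mentioned above (no space after the colon, and duplicated headers such as From and Subject) can be detected before an article is offered to the news server. The following is a minimal sketch under stated assumptions: `header_problems` and `SINGLETONS` are hypothetical names, and the set of headers treated as "only once" is an illustrative guess beyond the From and Subject named in the post. It is not what Inn2 or pullnews actually runs.

```python
from collections import Counter

# Headers assumed to be allowed only once. The post names From and Subject;
# Date and Message-ID are added here as an illustrative assumption.
SINGLETONS = {"from", "subject", "date", "message-id"}


def header_problems(raw):
    """Return a list of human-readable problems found in the header block."""
    text = raw.replace("\r\n", "\n")
    head = text.split("\n\n", 1)[0]  # headers end at the first blank line
    problems = []
    seen = Counter()
    for line in head.split("\n"):
        if not line or line[0] in " \t":
            continue  # folded continuation of the previous header
        name, sep, value = line.partition(":")
        if not sep:
            problems.append(f"not a header line: {line!r}")
            continue
        seen[name.lower()] += 1
        if not value.startswith(" "):
            problems.append(f"no space after colon: {name}")
    for name in sorted(SINGLETONS):
        if seen[name] > 1:
            problems.append(f"duplicate header: {name} x{seen[name]}")
    return problems


# Made-up sample with one header of each problematic kind.
sample = (
    "From: alice@example.org\n"
    "From: bob@example.org\n"
    "Subject: test\n"
    "X-Alt-Message-ID:<1@example.org>\n"
    "\n"
    "Body\n"
)
problems = header_problems(sample)
print(problems)
```

The check works on raw lines rather than a parsed message on purpose: a lenient mail parser would accept the colon-without-space form and hide the problem.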

    >> And with that, there is a reflection on the mail headers I am coming
    >> across that seem to duplicate standard headers. The headers I filter
    >> out can be found at
    >> gopher://photonic.trudheim.com/1/usenet/headers.txt though I will
    >> call out a few.
    >>
    >> X-Alt-Message-ID:
    >> X-CodeTwo-MessageID:
    >> X-Gmail-Original-Message-ID:
    >> X-Google-Original-Date:
    >> X-Google-Original-From:
    >> X-Google-Original-Message-ID:
    >
    > X- headers are site-local and must be ignored if the application does
    > not know them.

    Agreed. And if they had a space after the colon, I would ignore them.

    >> I could understand these headers if they used them for something
    >> internally and then stripped them before the mail left their servers,
    >> but that is not the case.
    >
    > Those headers must not break anything. X- headers are intended for
    > local use, even when they are distributed to others.

    They don't interrupt *email*, though they take up more space than the
    message itself most of the time. As I pull them into Inn2, I now drop
    a majority of the headers, as the messages are not intended to get
    outside my own Usenet server.
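
The claim that headers often outweigh the message body is easy to check on a given article. This is a minimal sketch, assuming the article is available as raw bytes; `header_body_sizes` is a hypothetical helper and the sample message (including its padded spam-report header) is invented for the example.

```python
def header_body_sizes(raw):
    """Return (header_bytes, body_bytes), splitting at the first blank line."""
    normalised = raw.replace(b"\r\n", b"\n")
    head, _sep, body = normalised.partition(b"\n\n")
    return len(head), len(body)


# Made-up article: a short reply carrying one bulky scanner header.
sample = (
    b"From: alice@example.org\n"
    b"Subject: ok\n"
    b"X-Spam-Report: " + b"x" * 200 + b"\n"
    b"\n"
    b"Thanks, applied.\n"
)
header_bytes, body_bytes = header_body_sizes(sample)
print(header_bytes, body_bytes)
```

Summing these two numbers over a whole spool directory would show how much of a two-year archive is header overhead.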

    >> FWIW, it would be nice to see the email and NNTP RFCs harmonise on
    >> header syntax (colon-space) and make some recommendations about mail
    >> systems dropping headers, such as the ones spam filters add, when
    >> they hand off mail to other mail servers.
    >
    > Why should they?

    Practical reasons. Headers that are internal to a mail system (even if
    that system is spread over multiple physical systems) could remain
    internal to that system. Aside from it being easier for the admin not
    to have to strip their internal headers when the mail leaves their
    mail system, I cannot think of a good reason why they should leave them
    in the message. I just think it is a waste, that is all.

    >> Seeing mails that have been spam-scanned multiple times
    >> by different solutions en route is kind of pointless when
    >> spam-scanning is something the destination will likely do itself
    >> (as everyone has a differing view on what spam is).
    >
    > Simply ignore them. Applications do the same.

    Sure, I do, by filtering them out. :)

    --
    Kind regards,

    /S
