• Bug#265870: RFP: dspam -- highly scalable, large scale, multi-statistic

    From Jari Aalto@1:229/2 to All on Sun Aug 15 15:30:09 2004
    From: [email protected]

    Package: wnpp
    Severity: wishlist

    * Package name    : dspam
      Version         : 3.1.0
      Upstream Author : Jonathan A. Zdziarski <[email protected]>
    * URL             : http://www.nuclearelephant.com/projects/dspam/
    * License         : GPL
      Description     : highly scalable, large scale, multi-statistic spam analyzer and filter

    System-wide filtering that needs no administrative maintenance. The
    DSPAM agent masquerades as the email server's delivery agent (or proxy
    agent if necessary), providing filtering at the server level.

    A simple-to-use learning mechanism. DSPAM allows users to simply
    forward their spam to their "spam email address" for learning,
    eliminating any learning curve necessary to make it usable by your
    customers. The information used in every calculation is temporarily
    stored on the server, enabling DSPAM to relearn the original message
    by looking for a small signature in the forwarded spam. As a result,
    users don't have to be trained to 'bounce' messages around, and
    administrators don't have to worry about incompatible mail clients.
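
    The signature-based relearning described above can be sketched roughly
    as follows. This is an illustration only, not DSPAM's code: the
    SignatureStore class, its method names, the whitespace tokenizer and
    the "!DEMO:...!" marker format are all invented for this example, and
    a real filter would persist and expire the stored data.

```python
import re
import secrets
from collections import Counter

# Hypothetical marker format, invented for this sketch (not DSPAM's).
SIGNATURE_RE = re.compile(r"!DEMO:([0-9a-f]{16})!")


class SignatureStore:
    """Toy filter state: token counts plus per-message token lists."""

    def __init__(self):
        self.spam_counts = Counter()
        self.innocent_counts = Counter()
        self.pending = {}  # signature -> tokens used when the message was scored

    def deliver(self, body):
        """Record the tokens of a delivered message and tag it with a signature."""
        tokens = body.lower().split()  # naive tokenizer, an assumption
        self.innocent_counts.update(tokens)  # message was delivered as innocent
        signature = secrets.token_hex(8)  # 16 hex characters
        self.pending[signature] = tokens
        return f"{body}\n!DEMO:{signature}!"

    def learn_forwarded_spam(self, forwarded):
        """Relearn a message as spam using only the signature found in it."""
        match = SIGNATURE_RE.search(forwarded)
        if match is None:
            return False
        tokens = self.pending.pop(match.group(1), None)
        if tokens is None:
            return False  # unknown or already processed signature
        # Undo the earlier "innocent" learning and record the tokens as spam.
        self.innocent_counts.subtract(tokens)
        self.spam_counts.update(tokens)
        return True
```

    Because only the signature has to survive forwarding, the mail client
    is free to rewrap, quote or re-encode the rest of the message.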

    Support for a variety of storage implementations. DSPAM's storage
    driver API allows the administrator to choose how they wish to store
    data. Currently supported drivers include SQLite, Berkeley DB3,
    Berkeley DB4, MySQL, PostgreSQL and Oracle.
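
    To illustrate the idea of a pluggable storage driver, here is a
    minimal sketch with two interchangeable backends. The StorageDriver
    interface, its method names and the table layout are assumptions made
    for this example; they do not reflect DSPAM's actual driver API, which
    is written in C.

```python
import sqlite3
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical interface: per-token (spam, innocent) hit counts."""

    @abstractmethod
    def get_counts(self, token):
        """Return a (spam, innocent) tuple, (0, 0) for unknown tokens."""

    @abstractmethod
    def add_counts(self, token, spam=0, innocent=0):
        """Add the given increments to the stored counts for a token."""


class MemoryDriver(StorageDriver):
    """Keeps everything in a dict; useful for tests."""

    def __init__(self):
        self.counts = {}

    def get_counts(self, token):
        return self.counts.get(token, (0, 0))

    def add_counts(self, token, spam=0, innocent=0):
        s, i = self.get_counts(token)
        self.counts[token] = (s + spam, i + innocent)


class SQLiteDriver(StorageDriver):
    """Stores counts in an SQLite table (schema invented for this sketch)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tokens ("
            "token TEXT PRIMARY KEY, "
            "spam INTEGER NOT NULL, "
            "innocent INTEGER NOT NULL)"
        )

    def get_counts(self, token):
        row = self.db.execute(
            "SELECT spam, innocent FROM tokens WHERE token = ?", (token,)
        ).fetchone()
        return row if row else (0, 0)

    def add_counts(self, token, spam=0, innocent=0):
        s, i = self.get_counts(token)
        self.db.execute(
            "INSERT OR REPLACE INTO tokens (token, spam, innocent) "
            "VALUES (?, ?, ?)",
            (token, s + spam, i + innocent),
        )
        self.db.commit()
```

    Code that only talks to the StorageDriver interface can switch between
    backends without changes, which is the property the description above
    attributes to the driver API.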

    Multi-algorithm support. DSPAM presently supports the following
    combination algorithms: Graham-Bayesian, Burton-Bayesian, Robinson's
    Geometric Mean, and Fisher-Robinson's Chi-Square. The administrator
    may choose one or more of these algorithms to use when scoring
    messages, and may even combine two or more for extended filtering
    reach.
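
    For readers unfamiliar with these names, the sketch below shows the
    commonly published forms of three of the combination rules, each
    turning per-token spam probabilities into one message score between 0
    and 1. It is not taken from DSPAM's source: the function names, the
    0.01/0.99 clamping and the 0.5 default for an empty list are
    assumptions, and the Burton-Bayesian variant is omitted.

```python
import math


def _clamp(p, lo=0.01, hi=0.99):
    """Keep probabilities away from 0 and 1 so logs and products stay finite."""
    return min(max(p, lo), hi)


def graham_combine(probs):
    """Graham's naive-Bayes combination: prod(p) / (prod(p) + prod(1 - p)).

    Intended for a small number of tokens; long lists could underflow.
    """
    ps = [_clamp(p) for p in probs]
    if not ps:
        return 0.5
    spam = math.prod(ps)
    innocent = math.prod(1.0 - p for p in ps)
    return spam / (spam + innocent)


def robinson_geometric_mean(probs):
    """Robinson's geometric-mean combination, rescaled to the range 0..1."""
    ps = [_clamp(p) for p in probs]
    if not ps:
        return 0.5
    n = len(ps)
    big_p = 1.0 - math.exp(sum(math.log(1.0 - p) for p in ps) / n)
    big_q = 1.0 - math.exp(sum(math.log(p) for p in ps) / n)
    s = (big_p - big_q) / (big_p + big_q)
    return (1.0 + s) / 2.0


def chi2q(x2, dof):
    """Upper-tail probability of a chi-square variable with even dof."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_robinson_chi_square(probs):
    """Fisher's method applied to the token probabilities, as in Robinson's scheme."""
    ps = [_clamp(p) for p in probs]
    if not ps:
        return 0.5
    n = len(ps)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in ps), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in ps), 2 * n)
    return (s - h + 1.0) / 2.0
```

    With all three, scores near 1 indicate spam and scores near 0 indicate
    legitimate mail; neutral evidence (every token at 0.5) yields 0.5.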

    A strong focus on large-scale implementation support. The largest
    known DSPAM installation involves 125,000 users, with the next
    largest at around 100,000 and then 70,000. DSPAM has been designed to
    run with a very short execution time (0.01 s to 0.03 s real time for
    classification and 0.03 s to 0.10 s real time for training, on
    average hardware), and has been equipped with a storage driver API
    allowing several different storage mechanisms to be used. Depending
    on disk space constraints, accuracy can be traded for disk space or
    vice versa.

    -- System Information:
    Debian Release: 3.1
    Architecture: i386 (i686)
    Kernel: Linux 2.4.26.20040601
    Locale: LANG=C, LC_CTYPE=C (ignored: LC_ALL set to en_US)

