From: Sven Hoexter Date: Mon, 17 Jan 2011 15:35:01 +0000 (+0100) Subject: Imported Upstream version 1.1.3 X-Git-Tag: upstream/1.1.3 X-Git-Url: https://git.sven.stormbind.net/?a=commitdiff_plain;h=210b712891e96a851b37e3bb7d6d36d246e7536d;p=sven%2Fpflogsumm.git Imported Upstream version 1.1.3 --- diff --git a/ChangeLog b/ChangeLog new file mode 100644 index 0000000..df3558d --- /dev/null +++ b/ChangeLog @@ -0,0 +1,771 @@ +ChangeLog for pflogsumm.pl + + + [Note: Let me know if you would like to be notified as new versions + are released. The latest released version can always be found at + http://jimsun.LinxNet.com/postfix_contrib.html.] + + +rel-1.1.3 20100320 + + Added long-awaited switches to optionally reduce detail reporting: + --bounce_detail=N, --deferral_detail=N, --reject_detail=N, + --smtp_detail=N, smtpd_warning_detail=N, and --detail=N. Setting + any of them to 0 suppresses that detail entirely. --detail=N sets + the default for all of them, as well as for -u=N and -h=N. + + With the above enhancements, the following switches are depreciated, + and will eventually be removed: --no_bounce_detail, + --no_deferral_detail, --no_reject_detail and --no_smtpd_warnings. + They are replaced by setting the desired --*_detail=0. They still + work, but using them generates a warning. + + Added support for parsing logs with RFC 3339 timestamps. Thanks + and a tip o' the hat to sftf-at-yandrex-dot-ru for the heads-up + and the code contribution. (N.B.: My code does not require a + command-line switch. The format is detected automatically.) + + Fixed some --ignore-case inconsistincies. Thanks and a tip o' + the hat to Richard Blanchet (richard-dot-blanchet-at-free-dot-fr) + for the heads-up and the diff. + + Fixed parsing bug that resulted in attempts to treat + kind-of-IPv4-looking strings as IPv4 addresses. (I really need to + improve reject/defer/etc. "reason" parsing to fix this properly.) + Thanks to Joseph Vit (jvit-at-certicon-dot-cz) for the bug + report. 
+ +rel-1.1.2 20080629 + + Fixed bug with calculating yesterday's date in vicinity of DST + changes. (Thanks and a tip o' the hat to Wieland Chmielewski + for bringing the problem to my attention.) + + Added missing "underlining" to some (sub-)section titles for + consistency. + + +rel-1.1.1 20070406 + + Fixed to parse Postfix-2.3 (and beyond) logfiles. Thanks to + whomever contributed to + + http://bugs.gentoo.org/show_bug.cgi?id=144236 + + Removed support for vmailer. + + Removed "SMTPD_STATS_SUPPORT" "fences" in code in favour of code + to automatically detect the availability of Date::Calc. If + --smtpd_stats is specified and Date::Calc is not installed, now + bails-out with friendly message. (Adapted from suggestion and + examples provided by David Landgren . + Thanks!) + + Removed rem_smtpd_stats_supp.pl utility from distribution. (No + longer needed.) + + Memory footprint improvement: Pflogsumm no longer stores data for + reports that are supressed via --no_ switches. + + Removed extraneous arguments in two calls to print_nested_hash + that would result in the "quiet" flag being ignored. Thanks to + Pavel Urban (pupu-at-pupu-dot-cz) for bringing that to my + attention. + + Added notes to FAQ about translations and i18n, about mismatching + "received"/"delivered" counts, about bug in calculating "yesterday," + and about John Fawcett's "prepflog." + + +rel-1.1.0 20031212 + + Promoted 1.0.18 (Beta) to "production/stable" version release. + + +rel-1.0.18 20031204 + + Fixed reject parsing for "DATA" smtpd rejects. + + +rel-1.0.17 20031129 + + Fixed reject parsing to properly recognize bare "User unknown". + (Thanks to J.D. Bronson" for the + bug-report and sample logfile lines.) + + +rel-1.0.16 20031128 + + Re-worked "to" and "from" field parsing in reject report handling to + make it more robust in pathological cases. (Thanks to Paul Brooks + and Lars Hecking + for the bug-reports and sample logfile + lines.) 
+ + Fixed warnings resulting from non-standard, extraneous syslog input. + (Thanks to Mathias Behrle for the report.) + + Fixed reject parsing to account for really atrocious garbage in + HELO strings, sender addresses and recipient addresses. (Thanks to + Lars Hecking for the bug-report and sample + logfile lines.) + + Fixed reject parsing to properly recognize "CONNECT" smtpd rejects. + (Thanks to Mike Vanecek for the + bug-report.) + + Fixed reject parsing to properly recognize "User unknown in relay + recipient table." (Thanks to Lars Hecking + for the bug-report and sample logfile lines.) + + Some code optimization resulting in 3-5% performance improvement. + + +rel-1.0.15 20030914 + + Pflogsumm *should* now properly parse and handle log entries with + IPv6 addresses in them. (Adapted from idea and code submitted by + Stefan `Sec` Zehl .) + + Fixed "User unknown in local recipient table" reject reports to + show target recipient address, rather then sending domain, to be + consistent with other "recipient" reports. (Thanks to WC Jones + for the suggestion.) + + Fixed parsing of "Recipient address rejected" for recipient + address verification (RAV) rejects. (Thanks to Len Conrad + for the suggestion.) + + FAQ additions regarding recommendations on how to format custom + reject reports for "best" results in Pflogsumm's output and note + regarding "non-standard" syslogd's. + + +rel-1.0.14 20030909 + + Fixed bug in parsing for "Host/Domain Summary: Messages Received" + report improvement (rel-1.0.13) that resulted from (unexpected, to + me) lines such as + + ... postfix/smtpd[31430]: E02DDB04E: client=blurfl[1.2.3.4], + sasl_method=LOGIN, sasl_username=phred + +rel-1.0.13 20030907 + + The "Host/Domain Summary: Messages Received" report would show simply + "from=<>", for the host/domain, for postmaster bounces. Pflogsumm now + substitutes the client hostname or IP address for these, unless it's + from the pickup daemon, in which case "from=<>" is retained. 
(Note + that "Senders by message count/size" reports are unaffected by this + change.) + + "Senders by message count" and "Recipients by message count" reports + are now secondarily sorted by domain, host and user parts. (As a + side-effect: So are "Senders by message size" and "Recipients by + message size" but, being as the odds are against numerous senders and + recipients having the same total message sizes, this change hasn't much + effect with those.) + +rel-1.0.12 20030902 + + Rejects, warns, etc. now print sub-category totals. E.g.: + + message reject detail + --------------------- + RCPT + Relay access denied (total: 6) + + (Adapted from idea and code submitted by blake7-at-blake7-dot-org.) + + Reject, warning, etc. reports are now sorted by 2nd column (e.g.: IP + address, domain, etc.) within count. (Adapted from idea and code + submitted by David Landgren .) + + Added --no_smtpd_warnings (report) option. + + Added --no_no_msg_size (report) option. + + A couple of minor improvements to reject parsing/reporting. + +rel-1.0.11 20030617 + + This is a bug-fix release. + + There was a problem in the way pflogsumm-1.0.8 through 1.0.10 + handled the --syslog_name option: When --syslog_name was + specified, some log entries with the default "postfix" name would + be missed. This revision may introduce incompatibilities if + you're logging two or more instances of Postfix to the same log. + See the docs included in the tarball for details. + +rel-1.0.10 20030219 + + Re-worked "% rejected" calculation to include messages discarded + and added "% discarded" calculation/display. + +rel-1.0.9 20030217 + + Bugfix: If Perl's -w is specified at run-time and there were no + messages delivered or rejected, uninitialized variable warnings + would be issued in the percent rejected calculation code. Thanks + for Larry Hansford (and many others since!) for the bug report. + +rel-1.0.8 20030216 + + Bugfix: Fixed problem with "orig_to=" being parsed as + "to=". 
This resulted in *very* wrong output. Thanks to + Bjorn Swift for the report. + + Added "% rejected" to Grand Totals "rejected" figure. This is + calculated as: rejected / (delivered + rejected). (I did this + purely because it amuses me.) + + Bugfix: Fix, in reject processing, for truncated overly-long "to" + fields. Thanks to Rick Troxel for reporting the problem. + + Added --syslog_name option. Thanks to Ben Rosengart for the + suggestion. + +rel-1.0.7 20021231 + + Corrected and improved message reject/reject warn/hold/discard + parsing. Again. (Thanks to Peter Santiago for reporting the + problem that initiated these improvements.) + +rel-1.0.6 20021230 + + Added support for reporting message reject warnings, holds and + discards. + + Note: Message rejects, reject warnings, holds and discards + are all reported under the "rejects" column for the Per-Hour + and Per-Day traffic summaries. + + More aggressive verp munging (again). (Prompted, in part, by a + suggestion from Casey Peel. Thanks!) + + Verp munging now applied to sender addresses in smtpd reject + reports. + + WARNING: Please note that verp munging is highly experimental! + + Pflogsumm distribution changed to gzip'd tarball format. + + Tightened-up parsing. Thanks for Ralf Hildebrandt for noting and + reporting the problem. + + Docs at the top of pflogsumm.pl changed to POD format for automated + manpage generation. + + README added. + + Automatically-generated manpage added. + + "To Do" moved out of ChangeLog into separate file. + + Package now includes convenience Perl script for removing smtpd + stats support for those who don't have Date::Calc, don't want to + install it and don't care about smtpd stats reporting. + + Belated thanks to Len Conrad in regards to the Sender Address + Verification work in 1.0.5. + +rel-1.0.5 20021222 + + Fixed to parse smtpd rejects for Postfix versions as of 20021026. + (Retained compatibility with older versions of Postfix.) 
+ + Note: smtpd and header-/body-checks warn, hold and discard + messages are *not* currently parsed/reported. I'll need to + get some logfile entries. + + Fixed parsing to handle the new "sender address verification" + lines. + + Added "--zero_fill" option to put zeros in columns that might + otherwise be blank in some reports. (Suggestion by Matthias + Andree). + + Fixed "Message size exceeds fixed limit" parsing for reject + reporting. + +rel-1.0.4 20020224 + + Added "--no_*_detail" options. (Suppresses some of the "detail" + reports.) + + Added "--version" option. (Thanks to "Guillaume") + + Improved handling of "[ID nnnnnn some.thing]" stuff (Thanks to + Blake Dunmire) + + Repaired and optimized some of the "reject" parsing. + + Added processing and report of smtp delivery failures. + + Added --rej_add_from option: For those reject reports that list + IP addresses or host/domain names: append the email from address + to each listing. (Note: does not apply to "Improper use of SMTP + command pipelining" report.) + + +rel-1.0.3 20010520 + + Minor re-work of "reject: RCPT" parsing to account for Yet Another + Change in the logfile format. (Semi-colon changed to a comma in + "blocked using rbl.maps.vix.com,".) + + +rel-1.0.2 20010519 + + Took another whack at "verp" munging. *sigh* + + Added code to summarize "Improper use of SMTP command pipelining" + rejects by client. + + +rel-1.0.1 20010518 + + Modified to catch "reject: header" log entries changed as of + postfix release-20010228 (?). Prior versions of postfix had the + string "warning: " (where the qid normally is). Thanks to Glen + Eustace , Len Conrad + , Daniel Roesen + , Milivoj Ivkovic and + j_zuilkowski@hotmail.com (Jon Zuilkowski) for reports and/or + patches. + + Fixed a couple of "uninitialized variable" problems. + + Committed (actually starting with 20000925-01beta) to CVS. + + +20000925-01 + + Added a line to compensate for (new?) 
"[ID nnnnnn some.thing]" + sub-strings that appear in logfile entries under Sun Solaris 8. + (At least. Others?) + + Note: Upon being committed to CVS, this became rel-0.9.0. + + +20000916-01 + + Forgot to add "--problems_first" to the "usage" output and in the + synopsis at the top of the comments. + + +20000908-01beta + + Re-did what 20000907-02beta was *supposed* to be! To wit: + replaced missing "--ignore_case" bugfix, "panic" entry processing, + improvements to "fatal" and "warning" message reporting and + missing "--mailq" option. (Obviously: 20000907-02beta was + derived from the wrong code base.) + + +20000907-02beta + + Fixed bug in ISO date formatting that caused the month to be off + by one. Thanks to Kurt Andersen + for the report and the patch. + + Fixed overflow of connect time reporting into days. (Can happen + during weekly summaries for sites with large volumes of email.) + Thanks again to Kurt Andersen + for the report and the fix. + + Improved "rejects" reporting *again*. Thanks to Thomas Parmelan + for the patch. + + Added "--problems_first" option to print "problem" reports such as + bounces, defers, warnings, fatal errors, etc. before "normal" + reports. + + +19991219-02 + + Fixed bug in code that prevented "--ignore_case" from actually + doing anything. Thanks to Nadeem Hasan for + reporting this and supplying the fix. + + +19991219-01beta + + Added the following caveat to the "Notes" section of Pflogsumm: + + ------------------------------------------------------------- + IMPORTANT: Pflogsumm makes no attempt to catch/parse non- + postfix/vmailer daemon log entries. (I.e.: Unless + it has "postfix/" or "vmailer/" in the log entry, + it will be ignored.) + ------------------------------------------------------------- + + Added reporting of "panic" log messages. This was missed until + now! + + Increased reporting detail of "fatal" and "warning" entries. + (Actually, "warning" detail was increased in 19991120-01beta. 
+ Neglected to note it then.) + + +19991123-01 (unreleased) + + Added "--mailq" option. (Convenience factor.) Runs Postfix's + "mailq" command at the end of the other reports. + + ------------------------------------------------------- + NOTE: If Postfix's "mailq" command isn't in your $PATH, + you'll have to edit the "$mailqCmd" variable located + near the top of pflogsumm to path it explicitly. + ------------------------------------------------------- + + +19991120-01 + + Tried once again to improve parsing of reject log entries. + Specifically: those associated with "RCPT" rejects. + + +19991016-01 (not generally released) + + Added --smtpd_stats. Generates smtpd connection statistics. + + --------------------------------------------------------------- + NOTE: Support for --smtpd_stats requires the Date::Calc module + (available from CPAN). If you don't want to go to the trouble + of fetching & installing that module, and you don't want smtpd + stats anyway, *carefully* identify all of the code sections + delimited by "# ---Begin: SMTPD_STATS_SUPPORT---" and + "# ---End: SMTPD_STATS_SUPPORT---" and remove them. + --------------------------------------------------------------- + + +19990909-01 (not generally released) + + Added -i and --ignore_case options. Causes entire email address + to be lower-cased instead of just the host/domain part. + + Added "use locale". (This means that the sorting order within + reports may be different from before--depending on how you have + your machine's locale set.) + + +19990904-03 + + Improved "reason" parsing and reporting for bounced and deferred + messages. + + Added parsing of "cleanup" reject lines to catch PCRE/regexp + rejects. + + Added "reject" stats to per-hour and (on multi-day reports) per- + day reports. + + Improved "warnings" report to show details. + + A single message deferred multiple times showed up as multiple + deferrals--implying that multiple messages were deferred. 
Now + shows "how many messages were deferred" and "how many deferrals" + as separate stats. + + Changed display of "Grand Totals" to make it a bit more readable + (IMO). + + Added "automatic perl finder" line for those systems that don't + support the "#!" notation. + + By popular demand: added note to comments as to where pflogsumm + home page could be found :-). + + +19990413-02 + + Fixed problem with last octet of IP address getting truncated in + reports when IP address used in place of unknown hosts. + + Changed the way a few internal variables were handled to be + compatible with Perl 5.003. Don't run it under Perl 5.003 with the + "-w" perl switch, tho! It will issue lots of warnings. All tests + I performed indicated that it produces the correct output, however. + + ------------------------------------------------------------ + NOTE: While this version was tested to work with Perl 5.003, + I recommend that you upgrade to 5.004 or later. I will not + guarantee that I'll remember to do the full regression- + testing that I usually do with 5.003 as well. + ------------------------------------------------------------ + + +19990411-01 + + NOTICE: As of this version of pflogsumm.pl, the "-c" switch is + GONE! (As per the previous notice.) + + Added "--help" option to emit short usage message and bail out. + + Added "--iso_date_time" switch to change displays of dates and times + to ISO 8601 standard formats (CCYY-MM-DD and HH:MM), rather than + "Month-name Day-number CCYY" and "HHMM" formats (the default). + + Added "--verbose_msg_detail" switch. This causes the full "reason" + to be displayed for the message deferral, bounce and reject summaries. + (Note: this can result in quite long lines in the report. Also note + that there have been a couple of subtle changes in the "reason" + parsing/reporting in the default mode (no "--verbose_msg_detail".) + + Added "--verp_mung" option. The problem this addresses is "VERP" + generated (??? so far as I can tell!) 
addresses (?) of the form: + + "list-return-NN-someuser=some.dom@host.sender.dom" + + These result in mail from the same "user" and site to look like it + originated from different users, when in fact it originates from the + same "user." There are presently two "levels" of address munging + available. With no numeric argument (or any value less than 2), the + above address will be converted to: + + "list-return-ID-someuser=some.dom@host.sender.dom" + + In other words: the numeric value will be replaced with "ID". + + By specifying "--verp_mung=2", the munging is more "aggressive", + converting the above address to something like: + + "list@host.sender.dom" + + Which looks more "normal." + + (Actually: specifying anything less than 2 does the "simple" munging + and anything greater than 1 results in the more "aggressive" hack + being applied.) + + Added "--uucp_mung" switch for consistence with "--verp_mung". + + +19990321-01 + + NOTICE: As of this version of pflogsumm.pl, versions of VMailer + prior to 19981023 are no longer supported. Sorry. + Pflogsumm-19990121-01.pl will be made permanently + available from now on for those with out-of-date versions + of VMailer prior to 19981023. + + NOTICE: As of this version of pflogsumm.pl, the "-c" switch is + DEPRECIATED. This version is transitional and retains it. + The next version will not have it. Subsequent versions + may re-use it for another purpose. Use the "-h" and "-u" + switches instead. + + Added "-h" and "-u" switches to provide finer-grained control over + report output. Depreciated "-c". + + Added "deferred" and "bounced" to "Grand Totals", "by-day" and "by- + hour" reports. + + Added "by-host/domain" reports. For sent (delivered) and received + messages: lists message count, total size of messages and + host/domain. For delivered messages: also lists number of deferred + messages and average and maximum delivery time. Both reports sorted + by message count in descending order. 
+ + Grand totals also now list number of recipient and sender + hosts/domains. + + Re-wrote "by-user" data collection storage to reduce memory consumption + by in-memory hashes. + + Moved "credits" from pflogsumm.pl to this file. + + +19990121-01 + + Now accounts for forwarded messages. + + Side-effects of the above: + + . Total messages size now broken-out into total bytes received + and total bytes delivered. + . Count of forwarded messages now reported. + . Postfix-internally-generated messages (e.g.: Postmaster + notifications of bounces) are no longer counted as "received". + (They do, however, show up as "delivered".) + . Forwarded addresses no longer show up as "recipients" (just + as with aliases and mailing lists). + + Note that "delivered" will exceed "received" when messages + are forwarded because of additional header lines. + + +19990116-01 + + Added processing for "reject" log entries. + + Expanded detail of "deferred" and "bounced" log entries to include + "reason". + + +19990110-05 + + Added "messages received/delivered by hour" and "messages + received/delivered by day" reports. See the "Notes" section in the + documentation for details on how these behave. + + Broke-out total message count to "messages received" and "messages + delivered". + + (For the above two enhancements: "postfix/pickup" and "postfix/smtpd" + lines are now processed. They used to be discarded.) + + Renamed "summary" report to "Grand Totals". + + Added code to parse date & time stamps from log entries. This was + needed, in part, for the "messages per-hour/day" reports. It would + have been necessary for future enhancements in the way of date- & + time-based processing anyway. + + Added "Notes" section to docs at top of code. + + +19990109-01 + + Improved display of large integer values. + + +19990107-01 + + Bugfix only. Data for "extended detail" listing was being built + even if "-e" not specified. 
This resulted in unexpected excessive + memory consumption when crunching large amounts of data. + + Added warning about memory consumption when "-e" option specified. + + +19990103-01 + + Further improvement to "accuracy" of by-domain-then-logname sort. + (Presently used only by "extended detail" listing). For comparison + purposes: mungs "machine(s).host.dom" into "host.dom.machine(s)" so + sort is keyed on "base" domain name before machines within the + domain. Does *not* attempt to reverse the order of the "machine(s)" - + so within a particular "base" domain, may not come out in quite the + right order. ("foo.bar.some.dom" will come out before + "sales.acme.some.dom", for example.) + Also works for 2x2-style domain names. (I.e.: "some.do.co") + + +19990102-01 (never released) + + Added "mung UUCP-style bang-paths" switch (-m). + + Improved performance and "accuracy" of by-domain-then-logname sort + used by (only at present) "extended detail" listing. + + +19990101-02 + + Added "extended detail" option (-e). At present this includes only a + per-message detail listing, which lists per-message detail sorted by + sender domain, then sender username, then by queue i.d. + + Improved docs a bit. + + +19990101-01 + + Replaced warning message when message size unavailable in favor of + producing a report of these, sorted by queue i.d. Unlike the other + reports, this report header is not emitted at all if there are none of + these. (Always acts as if the -q switch had been specified). + + +19981231.01 + + Added experimental code to lower-case all domain names so that + "user@foo.dom" and "user@FOO.DOM" will come out the same. + + Added test for existence of message size value when "to=" records are + being processed. This was necessary for cases in which the logfile + entry containing the "status=sent" record is not processed at the same + time as the logfile containing the "size=nnnn" record. 
Note that this + will produce a summary that will show recipient counts without + matching recipient sizes. The only way to cure this would be to + create a separate disk file to "memorize" message sizes. (Which would + introduce a whole new raft of problems.) + + Added warning message (emitted to stderr) when the situation above is + detected. + + Fixed "usage" message to indicate you can specify files on command + line + + Wrapped a couple of long lines in the comments and code. + + Added (temporary) version numbering scheme. + + Started this log. + + Other changes/enhancements since previous un-version-numbered + versions: deals with log entries for VMailer as well as Postfix, more + robust parsing of "to=" and "from=" fields (now handles spaces in + these), eliminated double-counting of message sizes (happened when + delivery was deferred), re-structured parsing to be more robust (not- + to-mention correct!), added "grand summary" report at top (total + messages, total size, number of senders and recipients). + + +Credits + + [Note: The credits reflect suggestions and code contributions that + have actually been added. If your contribution doesn't appear + here, it may simply mean that it hasn't been added yet. (In which + case it should be on the list above.) On the other hand: if I + failed to credit you for something that *has* been added, please + let me know!] + + Paul D. Robertson + + For much testing and patience and many good suggestions on + how pflogsumm could be improved. + + Simon J Mudd + + For the following code contributions: + + Add "deferred" and "bounced" to "by hour" reports. + (I also added these to "by day" reports and "Grand + Totals".) + + "VERP" (?) address munger (less-agressive version) + + Suggestion for "by domain" delivery delay report. + + For the --smtpd_stats suggestion. + + Anders Arnholm + + For pointing out the problem with forwarded messages. 
+ + Walcir Fontanini + + For pointers to changes to make for Perl 5.003 compatibility. + (Added to 19990413-02beta.) (Which I will *try* to keep in + mind!) + + Eric Cholet + + For the --ignore_case patch. + + Kurt Andersen + + For the ISO date formatting month-off-by-one patch and the + connect time overflow fix. + + Thomas Parmelan + + For improved "rejects" reporting patch. + + Glen Eustace + + Patch to fix "reject: header" matching after Wietse changed + the logfile format. diff --git a/README b/README new file mode 100644 index 0000000..e0573bd --- /dev/null +++ b/README @@ -0,0 +1,41 @@ + +Pflogsumm README + +There's not much to installing pflogsumm, so it's all manual. + + 1. Unpack the distribution (if you're reading this, you've already + gotten that far) + + 2. Copy or move pflogsumm.pl to some directory from which you'll + want to execute it. Maybe rename it to just "pflogsumm." + Watch the ownerships and permissions. Make sure it's executable. + + E.g.: + + cp pflogsumm.pl /usr/local/bin/pflogsumm + chown bin:bin /usr/local/bin/pflogsumm + chmod 755 /usr/local/bin/pflogsumm + + 3. If there's a manual page available (pflogsumm.1), copy that to + /usr/local/man/man1 or wherever you stash local manpages. Make + sure it's world-readable. + + E.g.: + + cp pflogsumm.1 /usr/local/man/man1/pflogsumm.1 + chown bin:bin /usr/local/man/man1/pflogsumm.1 + chmod 644 /usr/local/man/man1/pflogsumm.1 + + 4. Read the man page (or the top of pflogsumm itself) for usage. + + 5. Check the FAQ (pflogsumm-faq.txt) + + 6. Configure your cron jobs if you're going to run pflogsumm on an + automatic, scheduled basis. There are tips in the manpage and + the FAQ. + +That's about it. + +As the manpage and FAQ both note: pflogsumm requires the Date::Calc +Perl module if you want to use --smtpd_stats. + diff --git a/ToDo b/ToDo new file mode 100644 index 0000000..9ac8154 --- /dev/null +++ b/ToDo @@ -0,0 +1,74 @@ + +To Be Done (Maybe) + + date ranges, "lastweek", etc.? 
+ + (options for?) break-down by local vs. non-local?, further + "drill-downs" to sender/recipient domains? + + Separate reports by-domain? (Would require that pflogsumm write files + instead of emitting to stdout? Or maybe that would be an option?) + Separate options for sender domain and receiver domain? Or perhaps + sender address and receiver address regex "filtering" options? + + lower-case domains (Done) + + don't k-ize msg counts? (Or do so at a higher boundary?) (Done) + + implement proper version numbering (interim version numbering in + place) (use SCCS or RCS?) (Done) + + add changelog (Done) + + Expand docs (in code?) + + Re-do docs to POD format? (Done) + + Improve UUCP-style bang-path handling + + Add option to use disk-based hash files instead of in-memory so + processing logfiles with lots of postfix log entries doesn't run + the machine out of memory? (This would really whack performance!) + (Maybe only needed with "-e" option.) + + Add another "step" to integer processing: "g"? (Display formatting + will presently mess up at values exceeding 999999m.) + + Internationalization? Necessary? There's English-language month + abbreviations hard-coded in pflogsumm. I'm admittedly real weak + in this area. (Covered by ISO 8601 option [below]?) + + Add option to lower-case entire addresses, rather than only the + domains. (Or make that the default, and add an option to make + it just domains?) (The RFCs say username parts shouldn't be + touched!) (Done) + + Add percentage-of-total to all (?) reports? + + Add ability to handle compressed files within pflogsumm? + (Unlikely) + + SMTP logging by host (msgs, connects) (add to new "by-domain" + reports?) (Done) + + UUCP logging of some sort? (At least what is sent to what + host/gateway. Since rmail receives incoming, don't know as I + can do anything there.) + + Option for ISO 8601 standard date & time formats. (Done) + + Option for specifying "host" (for multi-host logfiles)? 
+ + Add POSIX regexp body_checks "bypass" docs to FAQ. (Liviu Daia, + Noel Jones) + + Add "helper" program to distribution based on Wolfgang Zeikat's + MIME::Lite suggestion for the body_checks issue? (Will want to + make it read from stdin, securely create tempfile and clean up + afterward.) + + Expand on SAV reporting? (Len Conrad) + + Add "reject details" on "too many errors after " log + lines? (Len Conrad) + diff --git a/pflogsumm-faq.txt b/pflogsumm-faq.txt new file mode 100644 index 0000000..b96085a --- /dev/null +++ b/pflogsumm-faq.txt @@ -0,0 +1,754 @@ + +FAQ for Pflogsumm.pl - A Log Summarizer/Analyzer for the Postfix MTA + +Introduction + + I wouldn't have believed it. What started out mostly as a light- + hearted exercise in improving my facility with Perl--with the hope + that something useful would come out of it as well--has turned out to + be a somewhat popular utility. And as more Admins find out about + postfix, and more end up trying pflogsumm.pl, many of the questions, + suggestions, and enhancement requests are becoming "frequently + asked". So odd as it seems (to me, at any rate), it looks like it's + time for a FAQ. + + +Index of pflogsumm.pl Frequently Asked Questions (in no particular order) + + 1. Project Status + 2. "Could You Make" or "Here's A Patch To Make" Pflogsumm Do ... + 3. Requires Date:Calc Module + 4. Built-In Support for Compressed Logs + 5. Processing Multiple Log Files + 6. Time-Based Reporting and Statistics + 7. By-domain Listings + 8. Reject, Deferred and Bounced Detail Info + 9. "Orphaned" (no size) Messages + 10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + 11. Pflogsumm is generating lots of "uninitialized value" warnings + 12. Pflogsumm just doesn't work or doesn't report anything + 13. Postfix Rejects Pflogsumm Reports Because Of Body Checks + 14. Pflogsumm Reports Double Traffic When Anti-Virus Scanner Used + 15. Pflogsumm's numbers don't add up + 16. 
Hourly stats for reports run without "-d" option are halved + 17. How Do I Get Pflogsumm To Email Reports To Me Daily/Weekly/etc.? + 18. How Can I View Pflogsumm's Reports In My Web Browser? + 19. New Red Hat install - Pflogsumm no longer works! + 20. How can I best format my custom reject messages for display in + Pflogsumm's output? + 21. Pflogsumm doesn't understand my log file format + 22. Why Isn't There Any Mention Of "Monkey Butler" In The FAQ? + 23. Translating Pflogsumm (Support for Internationalization) + 24. Pflogsumm may sometimes calculate "yesterday" incorrectly + 25. Sending Logfile Samples + 26. From Where Can I Obtain Pflogsumm? + + +1. Project Status + + New work on Pflogsumm is sporadic. It pretty much does everything I + need it to do and, so far as I can tell, pretty much what most other + people need it to do. And my time is limited. + + I'll still take bug reports. I'll still fix bugs. (But I promise no + time-line.) I'll still answer questions (as time allows). And I + *may* add the occasional enhancement or whatever--as the mood + strikes--but Pflogsumm is pretty much a "finished work" as far as I'm + concerned. + + +2. "Could You Make" or "Here's A Patch To Make" Pflogsumm Do ... + + Unless it's a *bug* fix, please see: "1. Project Status" + + To the argument "But it's a patch, all you have to do is...," the + answer is: "Not quite." Every time I make a change to Pflogsumm I + have to run it through a series of regression checks to make sure the + change didn't break something. Then there's the commit, + documentation, web page update, etc. cycle. + + I'm particularly unlikely to add code to Pflogsumm to account for + non-standard Postfix log entries. "Non-standard" being defined as + "other than what Wietse's code does." Or additional stats gathering + that nobody else has requested and strikes *me* as of limited interest + or use. In addition to the development cycle, there's the issue of + "code bloat." 
Pflogsumm already takes enough (too much?) time and + memory on busy machines with large logs. I'm not prone to make this + worse for the sake of these things. + + See Also: 21. Pflogsumm doesn't understand my log file format + + +3. Requires Date::Calc Module + + Pflogsumm requires the Date::Calc module. You can download and + install the Date::Calc module from CPAN. It can be found at: + + http://search.cpan.org/search?module=Date::Calc + + Or you can remove the code that's dependent on the Date::Calc module. + For the convenience of folks that would prefer to take this approach, + I've "fenced" all such code like this: + + # ---Begin: SMTPD_STATS_SUPPORT--- + . + . + . + + . + . + . + # ---End: SMTPD_STATS_SUPPORT--- + + However, if you do this you will lose support for --smtpd_stats. + + Later versions of the Pflogsumm distribution include a script to + semi-automate removing smtpd stats support, if you so-desire. + + As of Pflogsumm-1.1.1, the presence of Date::Calc is optional. If you + don't want to use the Pflogsumm options that depend upon it, you + neither need Date::Calc, nor is it necessary to manually remove the + code that depends upon it. + + +4. Built-In Support for Compressed Logs + + I took a look at this. There is a Perl module (which I downloaded, + built, and installed here) to interface to libz, but after considering + the changes that would be necessary--and the fact that those changes + would require that potential users have to download/build/install libz + (and of the correct version) and the additional Perl module, I decided + to forego this enhancement. + + I could just open a pipe within Pflogsumm and use zcat/gunzip/gzip. + That would depend upon a) them being there [probably a safe bet-- + considering the logs somehow got into that format :-), but...] and b) + one of these either being in the path or having an environment + variable or a script variable or... 
+ + The thing is, in the latter case there's really no "savings" over + simply piping into Pflogsumm in the first place. Multiple processes + get spawned, pipes opened, etc. either way. It would add a little + convenience, is all. + + So I could do it. And there are a couple of ways I could do it. And + my mind is certainly still open on the issue. I'm just not convinced + there's a good reason to do it, is all. And I'd like to avoid + "creeping over-feature-itis" if I can. My position is *not* set in + stone on this issue. In the mean-time: + + zcat /var/log/maillog.0.gz |pflogsumm.pl + + or + + gunzip </var/log/maillog.0.gz |pflogsumm.pl + + should do the trick quite nicely for you. + + If you've a complex situation, for example: your logs aren't rotated + exactly at midnight, you might try something like: + + (zcat /var/log/maillog.0.gz; cat /var/log/maillog) \ + |pflogsumm.pl -d yesterday + + See Also: 5. Processing Multiple Log Files + 17. How Do I Get Pflogsumm To Email Reports To Me + Daily/Weekly/etc.? + + +5. Processing Multiple Log Files + + When processing multiple log files (say: an entire week's worth of logs + that are rotated daily), it is important that Pflogsumm be fed them in + chronological order. For performance and memory conservation reasons, + Pflogsumm relies on log messages "arriving" in the order in which they + were created. + + If you do something like this: + + pflogsumm /var/log/maillog* + + you might not get what you expect! Instead, try something like: + + pflogsumm `ls -rt /var/log/maillog*` + + A more complex example, where compressed logs are involved: + + (zcat `ls -rt /var/log/maillog.*.gz`; cat /var/log/maillog) \ + |pflogsumm.pl + + Obviously, this depends on the file modification times for your logs + being reflective of their chronological order. If that can't be + trusted, you're gonna have to get ugly.
Like in enumerating each + file, or as in: + + (for each in 3 2 1 0; do + zcat "/var/log/maillog.$each.gz" + done + cat /var/log/maillog) |pflogsumm.pl + + or (somewhat more efficiently--by running zcat only once): + + (zcat `for ea in 3 2 1 0; do echo "/var/log/maillog.$ea.gz"; + done`; cat /var/log/maillog) |pflogsumm.pl + + [Note: I didn't actually run these. So you would be well-advised + to double-check them.] + + See Also: 4. Built-In Support for Compressed Logs + 17. How Do I Get Pflogsumm To Email Reports To Me + Daily/Weekly/etc.? + + +6. Time-Based Reporting and Statistics + + There has been a small assortment of requests for different time + statistics reporting. And adding this would be relatively straight- + forward. (Just have to reach a consensus on exactly *what* should be + reported, and how. This could easily get out of hand!) + + There's only one *small* problem. Ironically, it's time. + + I've experimented with Pflogsumm grokking the log timestamps. As a + matter-of-fact: the enhancement added in the 19990110-05 version + required that I do some of this. My first pass was to use the Perl + timelocal() function to convert those sub-strings to an integer for + subsequent comparison operations. Imagine my surprise when + performance of the resulting code was a factor of five (5) times + slower than that of its predecessor. After a "remove the statements + until it got fast again" exercise, I found that the culprit was + timelocal(). + + As of version 19990321-01, Pflogsumm does by-domain stats reporting of + average and maximum delivery time by host/domain. And an even earlier + version added by-hour and by-day message count reporting. Anything + much beyond these is going to get "expensive." + + If/when any additional time-based stats reporting is added: I think + they are definitely going to be optional. 
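
The cost of timestamp conversion described above can be sidestepped for simple counts by treating the hour as a plain substring of the log line. The following is only a rough sketch of that idea, not how Pflogsumm itself is implemented. It assumes classic syslog timestamps ("Mon DD HH:MM:SS", so the time is the third whitespace-separated field; RFC 3339 timestamps would need a different field split), and the function name per_hour_counts is made up for illustration.

```shell
# Hypothetical helper (not part of Pflogsumm): count log lines per hour.
# Assumes classic syslog timestamps, e.g. "Feb  9 10:00:01 host ...",
# where field 3 is HH:MM:SS. The hour is sliced out as a string, so no
# date arithmetic (and no timelocal-style conversion) is involved.
per_hour_counts() {
    awk '{
        split($3, t, ":")      # t[1] is the two-digit hour
        n[t[1]]++
    }
    END {
        for (h in n) print h, n[h]
    }' "$@" | sort
}
```

Called as "per_hour_counts /var/log/maillog", it prints one "hour count" pair per line. It counts every line in the file, so filtering for Postfix entries first (with grep, say) is left to taste.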
+ + One way you can make up for Pflogsumm's deficiency in this respect is + to use good ol' Unix tools like "grep" to pre-process your log files + before feeding them to Pflogsumm. E.g.: + + grep "Feb  9" /var/log/maillog |pflogsumm what_ever_args + + Note that single-digit days-of-the-month have an additional leading + space in the logfiles, where the digit for two-digit dates would be. + + +7. By-domain Listings + + I figured on the desire for this one from the start. There are many + possibilities: + + 1) A single report, split by domain + 2) An option to limit reporting to a particular domain + + This issue is kind of tricky. The popularity of Unix amongst + SysAdmins is testimony to the beauty of being able to wire together + small, simple tools so that one can generate output to one's taste. + Anything I do is likely to make some Admins happy and others wishing + I'd done it "the other way". + + One thought that occurred is to perhaps provide a couple of options + that would allow one to limit a particular report to + + sender=regular_expression and/or recipient=regular_expression + + The problem with this solution is that an Admin desiring to emit + custom reports for multiple domains would have to re-process the same + log multiple times--once for each desired domain. + + So I'm still thinking about this one. + + +8. Reject, Deferred and Bounced Detail Info + + I've actually only received one query about this so far, but there are + bound to be more. So... + + The "detailed" information in the "Reject", "Deferred" and "Bounced" + reports is a compromise. Just take a stroll through your postfix logs + some day and observe the variation in how the "reason" for a + particular reject, defer, or bounce is reported. Without putting a + lot of static comparisons for each-and-every case into the analyzer, I + have absolutely no hope of doing this very well. + + Emitting the entire "reason" is not good, either. The entire "reason" + string can be very long.
Depending on what somebody is using to + display Pflogsumm's output, the lines may well wrap-- producing output + that is no more readable than just grepping the logs. + + And anything more I do to this end may soon be rendered moot. After + Wietse gets most of the more important functional stuff out of the + way, Postfix logging is going to be completely re-written. (Oh boy, + won't that be fun!) I'm hoping I'll be able to get some input into + the process so the formatting is more amenable to automated + processing. Wietse has indicated that such would be the case. + + Also, please note my primary objective behind Pflogsumm (besides the + entertainment value): "just enough detail to give the administrator a + ``heads up'' for potential trouble spots." It's not *supposed* to do + away with manual inspection entirely. + + For those that really want all that extra detail in the log summary + reports, specify the "--verbose_msg_detail" switch. + + See Also: 25. Sending Logfile Samples + + +9. "Orphaned" (no size) Messages + + The Problem: + + Message size is reported only by the queue manager. The message + may be delivered long-enough after the (last) qmgr log entry that + the information is not in the log(s) processed by a particular run + of pflogsumm.pl. + + The Result: + + "Orphaned" messages. These are reported by Pflogsumm as "Messages + with no size data." + + This, of course, throws off "Recipients by message size" and the + total for "bytes delivered." ("bytes in messages" in earlier + versions.) + + The Solution: + + "Memorize" message sizes by queue i.d. Easy in theory. Difficult + in practice. At least at the moment. + + You see, if Pflogsumm's going to "memorize" message sizes, it has + to have some definitive way to know when to delete a no- + longer-needed reference. Otherwise the memory file will just grow + forever. 
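
To get a feel for how many messages in a given log are affected, the queue i.d.s that show a successful delivery but no "size=" entry can be listed with standard tools. What follows is a rough sketch under stated assumptions, not Pflogsumm's own logic: it assumes short, upper-case hexadecimal queue i.d.s followed by ": " (long queue i.d.s would need a different pattern), and the function name orphaned_queue_ids is made up for illustration.

```shell
# Hypothetical helper (not part of Pflogsumm): list queue i.d.s that have a
# "status=sent" entry but no "size=" entry anywhere in the given log(s).
# Assumes short hexadecimal queue i.d.s, as in "...]: 4F2A11C0D: from=...".
orphaned_queue_ids() {
    awk '
        match($0, /: [0-9A-F]+: /) {
            # Strip the leading ": " and trailing ": " from the match.
            id = substr($0, RSTART + 2, RLENGTH - 4)
            if ($0 ~ /size=[0-9]+/)  has_size[id] = 1
            if ($0 ~ /status=sent/)  delivered[id] = 1
        }
        END {
            for (id in delivered)
                if (!(id in has_size)) print id
        }
    ' "$@" | sort
}
```

Feeding it the same files, in the same order, that a Pflogsumm run would see gives a list that can be checked against older logs by hand.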
+ + As with the "Reject, Deferred and Bounced Detail Info" issue above, + I'm hoping to get some input into future changes in logging issues. + In any event: maybe whatever comes out of the logging redesign will + provide a solution. + + As of Pflogsumm version 1.0.12, the "Messages with no size data" report + can be turned off. + + +10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + + Are you using a real old version of VMailer? As of pflogsumm.pl + version 19990220-06, versions of VMailer prior to 19981023 are no + longer supported. Sorry. Pflogsumm-19990121-01.pl will be made + permanently available from now on for those with out-of-date versions + of VMailer prior to 19981023. + + Are you processing your log files in chronological order? See item + "5. Processing Multiple Log Files". + + Pflogsumm.pl is being developed by me on my rather small-scale server + at home. There are only two users on the system. And I do no + mail-forwarding. So the log samples I have to work with are + commensurately limited. + + If there's something that Pflogsumm is not doing, or not doing right, + let me know what it is, what you think it ought to do, and send me a + representative sample of *real* log entries with which to work. + + See Also: 5. Processing Multiple Log Files + 12. Pflogsumm just doesn't work or doesn't report anything + 15. Pflogsumm's numbers don't add up + 19. New Red Hat install - Pflogsumm no longer works! + 21. Pflogsumm doesn't understand my log file format + 25. Sending Logfile Samples + + +11. Pflogsumm is generating lots of "uninitialized value" warnings + + Are you using a version of Perl lower than 5.004_04? Perhaps with a + "beta" version of pflogsumm.pl? If so, try turning off the "-w" + switch. Pflogsumm as of 19990413-02beta appeared to work correctly + with Perl 5.003 in spite of the warnings. (Those warnings didn't + appear with Perl 5.004.)
+ + I don't guarantee that I'll remember to test future versions of + pflogsumm.pl against 5.003, but I'll try to :-). + + You really should consider upgrading your Perl to 5.004 or later. + + +12. Pflogsumm just doesn't work or doesn't report anything + + Did you *download* Pflogsumm as opposed to grabbing it by + "copy-and-paste" from a browser? Copy-and-paste can result in lines + being unintentionally wrapped and hard-tabs being converted to + spaces. This will break Pflogsumm. + + Also, I've received a couple of reports by people downloading + Pflogsumm with Lynx that the download has long lines wrapped. + Naturally, this breaks Pflogsumm. + + See Also: 10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + + 19. New Red Hat install - Pflogsumm no longer works! + 21. Pflogsumm doesn't understand my log file format + + +13. Postfix Rejects Pflogsumm Reports Because Of Body Checks + + You configure Postfix to do body checks, Postfix does its thing, + Pflogsumm reports it and Postfix catches the same string in the + Pflogsumm report. There are several solutions to this. + + Wolfgang Zeikat contributed this: + + #!/usr/bin/perl + use MIME::Lite; + + ### Create a new message: + $msg = MIME::Lite->new( + From => 'your@send.er', + To => 'your@recipie.nt', + # Cc => 'some@other.com, some@more.com', + Subject => 'pflogsumm', + Date => `date`, + Type => 'text/plain', + Encoding => 'base64', + Path =>'/tmp/pflogg', + ); + + $msg->send; + + Where "/tmp/pflogg" is the output of Pflogsumm. This puts Pflogsumm's + output in a base64 MIME attachment. + + In a follow-up to a thread in the postfix-users mailing list, Ralf + Hildebrandt noted: + + "mpack does the same thing." + + The canonical FTP site for mpack is ftp.andrew.cmu.edu:pub/mpack/ + + The solution I came up with is to modify the body_checks statements to + ignore the strings when they're in a Pflogsumm report, as follows: + + Bounce anything with 6 or more "$"s in a row...
+ + /\${6,}/ REJECT + + Which, of course, catches the line in the Pflogsumm report too. + So... + + /^(?!\s+[0-9]+\s+).*?\${6,}/ REJECT + + which reads "anything with 6 or more '$'s in a row that is not a line + beginning with one or more whitespace characters, followed by one or + more digits, followed by one or more whitespace characters." + + (This is using PCRE's, btw.) + + Note that my solution will be more computationally expensive, by a + *long* way, than encoding Pflogsumm's output into a format that + body_checks won't catch. + + Robert L Mathews suggested the following solution: + + /^ {6,11}[[:digit:]]{1,6}[ km] / OK + + Placed at the beginning of a body_checks file, this will "pre-approve" + lines in Pflogsumm's output that might otherwise get caught. That's + a POSIX regexp version. A PCRE version of the same thing would be: + + /^ {6,11}\d{1,6}[ km] / OK + + +14. Pflogsumm Reports Double Traffic When Anti-Virus Scanner Used + + Sadly, there's absolutely nothing I can do about this :-(. + + The problem arises because of the way in which anti-virus scanning is + handled by Postfix. Basically, Postfix "delivers" each email to the + anti-virus scanner and the anti-virus scanner re-sends it through + Postfix. So each email really is received twice and sent/delivered + twice. + + And yes, I tried. I really, really tried. If I recall correctly, I + spent some two days mucking-about with this problem. Actually thought + I had it once or twice. But the results inevitably failed regression + testing. At the end of this, and with some more careful thought, I + realized it just wasn't possible. If you think you can prove me + wrong, please do so. I'd be quite pleased to be proven wrong on this + one. + + johnfawcett at tiscali-dot-it believes he's done it. You may find + prefiltering your log with his "prepflog" does it for you. You can + find it at . + + +15. Pflogsumm's numbers don't add up + + Pflogsumm reports more "delivered" than "received" + + Naturally.
A single email message can have multiple recipients. + + Pflogsumm reports more "rejected" than "received" + + Why doesn't delivered + deferred + bounce + rejected = received? + + Some rejects (header and body checks, for example) happen in + "cleanup," after alias lists are expanded. Thus a single received + message will be rejected multiple times: once for each recipient. + + The "size=" fields, multiplied by their "nrcpt=" fields, when added-up + yields a total higher than Pflogsumm's "bytes delivered" total. + + Pflogsumm doesn't count something delivered until it actually *is* + delivered. Nrcpt only suggests the number of intended recipients, + not how many are actually deliverable. Only if there were no + bounces, rejects, defers or other undeliverables for everything + that was received would a calculation such as that above yield the + proper value. + + Pflogsumm's "% rejected" doesn't add up + + The "percent rejected" and "percent discarded" figures are only + approximations. They are calculated as follows (example is for + "percent rejected"): + + percent rejected = + + (rejected / (delivered + rejected + discarded)) * 100 + + Given the issues discussed above, this is really the best that can + be hoped-for, IMO. + + I consistently see more "delivered" than "received." How is that + possible? + + Any message that's got multiple recipients in the "To:," + "Cc:," and "Bcc:" fields will result in a single "received" + with multiple "delivered"s, as well as, possibly, multiple + "rejects" (or reject warnings, discards or holds), depending + on where in Postfix' processing the rule was that resulted + in the reject, etc. + + See Also: 10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + + + +16. Hourly stats for reports run without "-d" option are halved + + Scenario: On day #1 of a fresh logfile, you run Pflogsumm with "-d + today" and the next day you run it with no "-d" option. The "Per-Hour + Traffic" statistics are approximately halved. 
How can this be? + + Note that when you run Pflogsumm on a logfile that contains multi-day + logfile entries, Pflogsumm automatically changes the per-hour stats to + daily traffic averages. If there's even *one* logfile entry from + another day, all of the per-hour stats will be divided by two. Unless + you rotate logfiles *precisely* at midnight--and it's unlikely you can + guarantee that happening--there's no way to prevent this. + + +17. How Do I Get Pflogsumm To Email Reports To Me Daily/Weekly/etc.? + + Excuse me? You're running a mailserver and you don't know how to use + cron to run things on a scheduled basis and pipe the output to + something that'll email it to you? + + Oh. My. Lord. + + *sigh* + + Here's my crontab entries: + + 10 0 * * * /usr/local/sbin/pflogsumm -d yesterday /var/log/syslog \ + 2>&1 |/usr/bin/mailx -s "`uname -n` daily mail stats" postmaster + + 10 4 * * 0 /usr/local/sbin/pflogsumm /var/log/syslog.0 \ + 2>&1 |/usr/bin/mailx -s "`uname -n` weekly mail stats" postmaster + + (Those are actually each a single line. I line-broke them [and + signified that with the "\"s] for readability.) + + The first generates stats for the previous day and is run *after* + midnight. The second is run against the previous week's entire log. + (I rotate my logs weekly.) + + If you rotate your logs on a different schedule, want monthly reports, + etc., I leave it as an exercise to you, the reader, to figure out how + to concatenate several logs to stdout and feed that to Pflogsumm. + + See Also: 4. Built-In Support for Compressed Logs + 5. Processing Multiple Log Files + The Unix manual pages for "cron," "crontab," "cat," + "zcat," "gzip," "gunzip," "mail," "mailx," etc. + + +18. How Can I View Pflogsumm's Reports In My Web Browser? + + Just direct Pflogsumm's output to a file, call it "something.txt" or + whatever, and look at it with your browser :). 
If you want to get + fancy, create a post-processing shell script that'll create a + date-tagged file, like "mailstats-20030216.txt". It's easy. + + See Also: Pflogsumm Through A Browser, on Pflogsumm's home page. + + +19. New Red Hat install - Pflogsumm no longer works! + + From some email exchanges with a couple of people that reported + this... + + "It appears the Pflogsumm is broken with RedHat9. I can take + the same log file and run it under Solaris9/RedHat 7.3 (perl 5.8 + on both) without a problem, but it breaks on RH9." + + "Oops. Sorry about the false alarm. This is an issue with + some of the other Perl scripts that are out there due to Red Hat + 8/9 using LANG=en_US.UTF-8 + + Changing the locale to "POSIX" fixes this... + LANG=C + + Note that Pflogsumm works fine when run through cron.daily, as + cron has different environment settings." + + "Ah, the good old RH8/9 UTF-8 strikes again. I should have + known. Setting LANG to either en_US or C fixes the problem." + + What the above means is that you have to change the "LANG" environment + variable from "en_US.UTF-8" to "en_US" or "C". E.g.: + + LANG="en_US" + export LANG + + in your shell. Or you could add these commands to your login + profile. (I.e.: $HOME/.bash_profile, if you're using bash.) Or set + the system-wide default in /etc/sysconfig/i18n. My RH boxes have LANG + set to "en_US" there, and everything seems to work fine. (If you set + it in your profile or the system-wide default, you'll need a fresh + login for it to take effect, obviously.) + + See Also: 10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + + 12. Pflogsumm just doesn't work or doesn't report anything + 21. Pflogsumm doesn't understand my log file format + + +20. How can I best format my custom reject messages for display in + Pflogsumm's output? + + Reject reason strings found in the mail log will be truncated at the + first comma (","), colon (":") or semi-colon (";"). 
If you want a + "clause" in your reject message to appear in Pflogsumm's output, + without having to specify --verbose_msg_detail, use a punctuation mark + other than one of those three, such as a dash ("-"). + + +21. Pflogsumm doesn't understand my log file format + + I've received several requests to modify Pflogsumm's log file format + regular expression matching to accommodate "non-standard" log file + formats. I'm not inclined to honour such requests. The regexp that + identifies Postfix' log file entries is nearly incomprehensible as it + is. If your log file format has extra fields (e.g.: FreeBSD syslogd + with "-v -v" specified), or, as in one case (metalog), is lacking + fields, and you insist on doing things that way, I recommend you + code-up a little pre-filter to mung the format into a standard one. + + See Also: 10. Pflogsumm misses/mis-diagnoses/mis-reports, etc. + + 12. Pflogsumm just doesn't work or doesn't report anything + 19. New Red Hat install - Pflogsumm no longer works! + + +22. Why Isn't There Any Mention Of "Monkey Butler" In The FAQ? + + A friend of mine asked me if I'd put the phrase "monkey butler" in + the FAQ. The answer is no. Pflogsumm is used by some rather large + corporations. There are credibility issues. Sorry. :) + + +23. Translating Pflogsumm (Support for Internationalization) + + Unfortunately, Pflogsumm doesn't currently have i18n support. + + It wasn't until at least Perl 5.6 that i18n was included as part of + the base distribution. Since, last time I looked, 5.005* was still + the most widely-used version of Perl (that's what I'm still running + everywhere, too), I can't put i18n in without chancing breaking + things right-and-left for the majority of my customers. + + Even with Perl 5.6 and above, it was mentioned in postfix-users, by + Liviu Daia, that + + "Perl 5.6+ has locales. 
Locales can give you localized + dates, charsets, standard error messages etc., but it + won't automatically switch languages of the strings + defined in your program. For that, you still need + gettext or something equivalent." + + So I'm not clear on the future of i18n support in Pflogsumm. But I'm + keeping an eye on things. Proper i18n support has long been one of + the top things on my own wish list! + + Prospective translators are urged to translate *only* the + stable/production versions. Beta and Alpha versions can sometimes + change rapidly. + + If you do translate Pflogsumm, let me know and I'll put a link to it + on Pflogsumm's main web page. + + +24. Pflogsumm may sometimes calculate "yesterday" incorrectly + + As Wieland Chmielewski aptly noted: + + Subroutine get_datestr incorrectly assumes that each day of + the year comprises 24 hours. In those countries which + participate in Daylight Saving Time policy there is one day + with 23 hours and one day with 25 hours. So, chances are (for + 1 hour within those days) that get_datestr actually returns + either "the day before yesterday" or "today" instead of + "yesterday" as requested. + + Right you are, Wieland, and thanks for the catch. + + Problem is, of course, there's really no clean, easy, certain fix. + The work-around is to stay well clear of DST never-never land with + your cron jobs. + + +25. Sending Logfile Samples + + Here's the deal with whatever you may send me in the way of log + samples: + + . Obfuscate them if you want. But take care not to alter them + in such a manner that they're not accurate wrt the "realism" of + the data, make sure the field formatting is not altered, and + that the order of the log entries is not altered. + + . The world is an unsafe place for your data, no matter where + it might reside. But I'll do my level best to ensure that your + data does not fall into the hands of others. + + . If you want, I'll PGP-encrypt the data when it's not in + use. + + .
You can PGP-encrypt it when you send it to me if you're + concerned. My PGP public key can be found on my Web site and at + the PGP public key servers. + + . If you want, I'll delete the sample data when the work is + done. But I would *like* to keep it around for future + regression-testing. It's your call. Let me know. + + +26. From Where Can I Obtain Pflogsumm? + + http://jimsun.LinxNet.com/postfix_contrib.html + + +Created: 15 Feb., 1999 / Last updated: 10 April, 2004 diff --git a/pflogsumm.1 b/pflogsumm.1 new file mode 100644 index 0000000..33dac0e --- /dev/null +++ b/pflogsumm.1 @@ -0,0 +1,532 @@ +.\" Automatically generated by Pod::Man 2.1801 (Pod::Simple 3.13) +.\" +.\" Standard preamble: +.\" ======================================================================== +.de Sp \" Vertical space (when we can't use .PP) +.if t .sp .5v +.if n .sp +.. +.de Vb \" Begin verbatim text +.ft CW +.nf +.ne \\$1 +.. +.de Ve \" End verbatim text +.ft R +.fi +.. +.\" Set up some character translations and predefined strings. \*(-- will +.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left +.\" double quote, and \*(R" will give a right double quote. \*(C+ will +.\" give a nicer C++. Capital omega is used to do unbreakable dashes and +.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, +.\" nothing in troff, for use with C<>. +.tr \(*W- +.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' +.ie n \{\ +. ds -- \(*W- +. ds PI pi +. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch +. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch +. ds L" "" +. ds R" "" +. ds C` "" +. ds C' "" +'br\} +.el\{\ +. ds -- \|\(em\| +. ds PI \(*p +. ds L" `` +. ds R" '' +'br\} +.\" +.\" Escape single quotes in literal strings from groff's Unicode transform. 
+.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" +.\" If the F register is turned on, we'll generate index entries on stderr for +.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index +.\" entries marked with X<> in POD. Of course, you'll have to process the +.\" output yourself in some meaningful fashion. +.ie \nF \{\ +. de IX +. tm Index:\\$1\t\\n%\t"\\$2" +.. +. nr % 0 +. rr F +.\} +.el \{\ +. de IX +.. +.\} +.\" +.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). +.\" Fear. Run. Save yourself. No user-serviceable parts. +. \" fudge factors for nroff and troff +.if n \{\ +. ds #H 0 +. ds #V .8m +. ds #F .3m +. ds #[ \f1 +. ds #] \fP +.\} +.if t \{\ +. ds #H ((1u-(\\\\n(.fu%2u))*.13m) +. ds #V .6m +. ds #F 0 +. ds #[ \& +. ds #] \& +.\} +. \" simple accents for nroff and troff +.if n \{\ +. ds ' \& +. ds ` \& +. ds ^ \& +. ds , \& +. ds ~ ~ +. ds / +.\} +.if t \{\ +. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" +. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' +. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' +. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' +. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' +. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' +.\} +. \" troff and (daisy-wheel) nroff accents +.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' +.ds 8 \h'\*(#H'\(*b\h'-\*(#H' +.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] +.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' +.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' +.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] +.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] +.ds ae a\h'-(\w'a'u*4/10)'e +.ds Ae A\h'-(\w'A'u*4/10)'E +. \" corrections for vroff +.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' +.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' +. 
\" for low resolution devices (crt and lpr) +.if \n(.H>23 .if \n(.V>19 \ +\{\ +. ds : e +. ds 8 ss +. ds o a +. ds d- d\h'-1'\(ga +. ds D- D\h'-1'\(hy +. ds th \o'bp' +. ds Th \o'LP' +. ds ae ae +. ds Ae AE +.\} +.rm #[ #] #H #V #F C +.\" ======================================================================== +.\" +.IX Title "PFLOGSUMM 1" +.TH PFLOGSUMM 1 "2010-03-20" "1.1.3" "User Contributed Perl Documentation" +.\" For nroff, turn off justification. Always turn off hyphenation; it makes +.\" way too many mistakes in technical documents. +.if n .ad l +.nh +.SH "NAME" +pflogsumm.pl \- Produce Postfix MTA logfile summary +.PP +Copyright (C) 1998\-2010 by James S. Seymour, Release 1.1.3. +.SH "SYNOPSIS" +.IX Header "SYNOPSIS" +.Vb 10 +\& pflogsumm.pl \-[eq] [\-d ] [\-\-detail ] +\& [\-\-bounce_detail ] [\-\-deferral_detail ] +\& [\-h ] [\-i|\-\-ignore_case] [\-\-iso_date_time] [\-\-mailq] +\& [\-m|\-\-uucp_mung] [\-\-no_bounce_detail] [\-\-no_deferral_detail] +\& [\-\-no_no_msg_size] [\-\-no_reject_detail] [\-\-no_smtpd_warnings] +\& [\-\-problems_first] [\-\-rej_add_from] [\-\-reject_detail ] +\& [\-\-smtp_detail ] [\-\-smtpd_stats] +\& [\-\-smtpd_warning_detail ] [\-\-syslog_name=string] +\& [\-u ] [\-\-verbose_msg_detail] [\-\-verp_mung[=]] +\& [\-\-zero_fill] [file1 [filen]] +\& +\& pflogsumm.pl \-[help|version] +\& +\& If no file(s) specified, reads from stdin. Output is to stdout. +.Ve +.SH "DESCRIPTION" +.IX Header "DESCRIPTION" +.Vb 4 +\& Pflogsumm is a log analyzer/summarizer for the Postfix MTA. It is +\& designed to provide an over\-view of Postfix activity, with just enough +\& detail to give the administrator a "heads up" for potential trouble +\& spots. +\& +\& Pflogsumm generates summaries and, in some cases, detailed reports of +\& mail server traffic volumes, rejected and bounced email, and server +\& warnings, errors and panics. 
+.Ve +.SH "OPTIONS" +.IX Header "OPTIONS" +.Vb 1 +\& \-\-bounce_detail +\& +\& Limit detailed bounce reports to the top . 0 +\& to suppress entirely. +\& +\& \-d today generate report for just today +\& \-d yesterday generate report for just "yesterday" +\& +\& \-\-deferral_detail +\& +\& Limit detailed deferral reports to the top . 0 +\& to suppress entirely. +\& +\& \-\-detail +\& +\& Sets all \-\-*_detail, \-h and \-u to . Is +\& over\-ridden by individual settings. \-\-detail 0 +\& suppresses *all* detail. +\& +\& \-e extended (extreme? excessive?) detail +\& +\& Emit detailed reports. At present, this includes +\& only a per\-message report, sorted by sender domain, +\& then user\-in\-domain, then by queue i.d. +\& +\& WARNING: the data built to generate this report can +\& quickly consume very large amounts of memory if a +\& lot of log entries are processed! +\& +\& \-h top to display in host/domain reports. +\& +\& 0 = none. +\& +\& See also: "\-u" and "\-\-*_detail" options for further +\& report\-limiting options. +\& +\& \-\-help Emit short usage message and bail out. +\& +\& (By happy coincidence, "\-h" alone does much the same, +\& being as it requires a numeric argument :\-). Yeah, I +\& know: lame.) +\& +\& \-i +\& \-\-ignore_case Handle complete email address in a case\-insensitive +\& manner. +\& +\& Normally pflogsumm lower\-cases only the host and +\& domain parts, leaving the user part alone. This +\& option causes the entire email address to be lower\- +\& cased. +\& +\& \-\-iso_date_time +\& +\& For summaries that contain date or time information, +\& use ISO 8601 standard formats (CCYY\-MM\-DD and HH:MM), +\& rather than "Mon DD CCYY" and "HHMM". +\& +\& \-m modify (mung?) UUCP\-style bang\-paths +\& \-\-uucp_mung +\& +\& This is for use when you have a mix of Internet\-style +\& domain addresses and UUCP\-style bang\-paths in the log. +\& Upstream UUCP feeds sometimes mung Internet domain +\& style address into bang\-paths. 
This option can +\& sometimes undo the "damage". For example: +\& "somehost.dom!username@foo" (where "foo" is the next +\& host upstream and "somehost.dom" was whence the email +\& originated) will get converted to +\& "foo!username@somehost.dom". This also affects the +\& extended detail report (\-e), to help ensure that by\- +\& domain\-by\-name sorting is more accurate. +\& +\& \-\-mailq Run "mailq" command at end of report. +\& +\& Merely a convenience feature. (Assumes that "mailq" +\& is in $PATH. Set the "$mailqCmd" variable to an explicit +\& path if desired.) +\& +\& \-\-no_bounce_detail +\& \-\-no_deferral_detail +\& \-\-no_reject_detail +\& +\& These switches are deprecated in favour of +\& \-\-bounce_detail, \-\-deferral_detail and +\& \-\-reject_detail, respectively. +\& +\& Suppresses the printing of the following detailed +\& reports, respectively: +\& +\& message bounce detail (by relay) +\& message deferral detail +\& message reject detail +\& +\& See also: "\-u" and "\-h" for further report\-limiting +\& options. +\& +\& \-\-no_no_msg_size +\& +\& Do not emit report on "Messages with no size data". +\& +\& Message size is reported only by the queue manager. +\& The message may be delivered long\-enough after the +\& (last) qmgr log entry that the information is not in +\& the log(s) processed by a particular run of +\& pflogsumm.pl. This throws off "Recipients by message +\& size" and the total for "bytes delivered." These are +\& normally reported by pflogsumm as "Messages with no +\& size data." +\& +\& \-\-no_smtpd_warnings +\& +\& This switch is deprecated in favour of +\& \-\-smtpd_warning_detail. +\& +\& On a busy mail server, say at an ISP, SMTPD warnings +\& can result in a rather sizeable report. This option +\& turns reporting them off. +\& +\& \-\-problems_first +\& +\& Emit "problems" reports (bounces, defers, warnings, +\& etc.) before "normal" stats.
+\& +\& \-\-rej_add_from +\& For those reject reports that list IP addresses or +\& host/domain names: append the email from address to +\& each listing. (Does not apply to "Improper use of +\& SMTP command pipelining" report.) +\& +\& \-q quiet \- don\*(Aqt print headings for empty reports +\& +\& note: headings for warning, fatal, and "master" +\& messages will always be printed. +\& +\& \-\-reject_detail <cnt> +\& +\& Limit detailed smtpd reject, warn, hold and discard +\& reports to the top <cnt>. 0 to suppress entirely. +\& +\& \-\-smtp_detail <cnt> +\& +\& Limit detailed smtp delivery reports to the top <cnt>. +\& 0 to suppress entirely. +\& +\& \-\-smtpd_stats +\& +\& Generate smtpd connection statistics. +\& +\& The "per\-day" report is not generated for single\-day +\& reports. For multiple\-day reports: "per\-hour" numbers +\& are daily averages (reflected in the report heading). +\& +\& \-\-smtpd_warning_detail <cnt> +\& +\& Limit detailed smtpd warnings reports to the top <cnt>. +\& 0 to suppress entirely. +\& +\& \-\-syslog_name=name +\& +\& Set the syslog name to look for in Postfix log entries. +\& +\& By default, pflogsumm looks for entries in logfiles +\& with a syslog name of "postfix". +\& If you\*(Aqve set a non\-default "syslog_name" parameter +\& in your Postfix configuration, use this option to +\& tell pflogsumm what that is. +\& +\& See the discussion about the use of this option under +\& "NOTES," below. +\& +\& \-u <cnt> top <cnt> to display in user reports. 0 == none. +\& +\& See also: "\-h" and "\-\-*_detail" options for further +\& report\-limiting options. +\& +\& \-\-verbose_msg_detail +\& +\& For the message deferral, bounce and reject summaries: +\& display the full "reason", rather than a truncated one. +\& +\& Note: this can result in quite long lines in the report. +\& +\& \-\-verp_mung do "VERP" generated address (?) munging.
Convert +\& \-\-verp_mung=2 sender addresses of the form +\& "list\-return\-NN\-someuser=some.dom@host.sender.dom" +\& to +\& "list\-return\-ID\-someuser=some.dom@host.sender.dom" +\& +\& In other words: replace the numeric value with "ID". +\& +\& By specifying the optional "=2" (second form), the +\& munging is more "aggressive", converting the address +\& to something like: +\& +\& "list\-return@host.sender.dom" +\& +\& Actually: specifying anything less than 2 does the +\& "simple" munging and anything greater than 1 results +\& in the more "aggressive" hack being applied. +\& +\& See "NOTES" regarding this option. +\& +\& \-\-version Print program name and version and bail out. +\& +\& \-\-zero_fill "Zero\-fill" certain arrays so reports come out with +\& data in columns that might otherwise be blank. +.Ve +.SH "RETURN VALUE" +.IX Header "RETURN VALUE" +.Vb 1 +\& Pflogsumm doesn\*(Aqt return anything of interest to the shell. +.Ve +.SH "ERRORS" +.IX Header "ERRORS" +.Vb 1 +\& Error messages are emitted to stderr. +.Ve +.SH "EXAMPLES" +.IX Header "EXAMPLES" +.Vb 1 +\& Produce a report of previous day\*(Aqs activities: +\& +\& pflogsumm.pl \-d yesterday /var/log/maillog +\& +\& A report of prior week\*(Aqs activities (after logs rotated): +\& +\& pflogsumm.pl /var/log/maillog.0 +\& +\& What\*(Aqs happened so far today: +\& +\& pflogsumm.pl \-d today /var/log/maillog +\& +\& Crontab entry to generate a report of the previous day\*(Aqs activity +\& at 10 minutes after midnight. +\& +\& 10 0 * * * /usr/local/sbin/pflogsumm \-d yesterday /var/log/maillog +\& 2>&1 |/usr/bin/mailx \-s "\`uname \-n\` daily mail stats" postmaster +\& +\& Crontab entry to generate a report for the prior week\*(Aqs activity. +\& (This example assumes one rotates one\*(Aqs mail logs weekly, some time +\& before 4:10 a.m. on Sunday.)
+\& +\& 10 4 * * 0 /usr/local/sbin/pflogsumm /var/log/maillog.0 +\& 2>&1 |/usr/bin/mailx \-s "\`uname \-n\` weekly mail stats" postmaster +\& +\& The two crontab examples, above, must actually be a single line +\& each. They\*(Aqre broken\-up into two\-or\-more lines due to page +\& formatting issues. +.Ve +.SH "SEE ALSO" +.IX Header "SEE ALSO" +.Vb 1 +\& The pflogsumm FAQ: pflogsumm\-faq.txt. +.Ve +.SH "NOTES" +.IX Header "NOTES" +.Vb 3 +\& Pflogsumm makes no attempt to catch/parse non\-Postfix log +\& entries. Unless it has "postfix/" in the log entry, it will be +\& ignored. +\& +\& It\*(Aqs important that the logs are presented to pflogsumm in +\& chronological order so that message sizes are available when +\& needed. +\& +\& For display purposes: integer values are munged into "kilo" and +\& "mega" notation as they exceed certain values. I chose the +\& admittedly arbitrary boundaries of 512k and 512m as the points at +\& which to do this\-\-my thinking being 512x was the largest number +\& (of digits) that most folks can comfortably grok at\-a\-glance. +\& These are "computer" "k" and "m", not 1000 and 1,000,000. You +\& can easily change all of this with some constants near the +\& beginning of the program. +\& +\& "Items\-per\-day" reports are not generated for single\-day +\& reports. For multiple\-day reports: "Items\-per\-hour" numbers are +\& daily averages (reflected in the report headings). +\& +\& Message rejects, reject warnings, holds and discards are all +\& reported under the "rejects" column for the Per\-Hour and Per\-Day +\& traffic summaries. +\& +\& Verp munging may not always result in correct address and +\& address\-count reduction. +\& +\& Verp munging is always in a state of experimentation. The use +\& of this option may result in inaccurate statistics with regards +\& to the "senders" count. +\& +\& UUCP\-style bang\-path handling needs more work. 
Particularly if +\& Postfix is not being run with "swap_bangpath = yes" and/or *is* being +\& run with "append_dot_mydomain = yes", the detailed by\-message report +\& may not be sorted correctly by\-domain\-by\-user. (Also depends on +\& upstream MTA, I suspect.) +\& +\& The "percent rejected" and "percent discarded" figures are only +\& approximations. They are calculated as follows (example is for +\& "percent rejected"): +\& +\& percent rejected = +\& +\& (rejected / (delivered + rejected + discarded)) * 100 +\& +\& There are some issues with the use of \-\-syslog_name. The problem is +\& that, even with $syslog_name set, Postfix will sometimes still log +\& things with "postfix" as the syslog_name. This is noted in +\& /etc/postfix/sample\-misc.cf: +\& +\& # Beware: a non\-default syslog_name setting takes effect only +\& # after process initialization. Some initialization errors will be +\& # logged with the default name, especially errors while parsing +\& # the command line and errors while accessing the Postfix main.cf +\& # configuration file. +\& +\& As a consequence, pflogsumm must always look for "postfix," in logs, +\& as well as whatever is supplied for syslog_name. +\& +\& Where this becomes an issue is where people are running two or more +\& instances of Postfix, logging to the same file. In such a case: +\& +\& . Neither instance may use the default "postfix" syslog name +\& and... +\& +\& . Log entries that fall victim to what\*(Aqs described in +\& sample\-misc.cf will be reported under "postfix", so that if +\& you\*(Aqre running pflogsumm twice, once for each syslog_name, such +\& log entries will show up in each report. +\& +\& The Pflogsumm Home Page is at: +\& +\& http://jimsun.LinxNet.com/postfix_contrib.html +.Ve +.SH "REQUIREMENTS" +.IX Header "REQUIREMENTS" +.Vb 3 +\& For certain options (e.g.: \-\-smtpd_stats), Pflogsumm requires the +\& Date::Calc module, which can be obtained from CPAN at +\& http://www.perl.com. 
+\& +\& Pflogsumm is currently written and tested under Perl 5.8.3. +\& As of version 19990413\-02, pflogsumm worked with Perl 5.003, but +\& future compatibility is not guaranteed. +.Ve +.SH "LICENSE" +.IX Header "LICENSE" +.Vb 4 +\& This program is free software; you can redistribute it and/or +\& modify it under the terms of the GNU General Public License +\& as published by the Free Software Foundation; either version 2 +\& of the License, or (at your option) any later version. +\& +\& This program is distributed in the hope that it will be useful, +\& but WITHOUT ANY WARRANTY; without even the implied warranty of +\& MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +\& GNU General Public License for more details. +\& +\& You may have received a copy of the GNU General Public License +\& along with this program; if not, write to the Free Software +\& Foundation, Inc., 59 Temple Place \- Suite 330, Boston, MA 02111\-1307, +\& USA. +\& +\& An on\-line copy of the GNU General Public License can be found +\& http://www.fsf.org/copyleft/gpl.html. +.Ve diff --git a/pflogsumm.pl b/pflogsumm.pl new file mode 100755 index 0000000..12b703d --- /dev/null +++ b/pflogsumm.pl @@ -0,0 +1,1744 @@ +#!/usr/bin/perl -w +eval 'exec perl -S $0 "$@"' + if 0; + +=head1 NAME + +pflogsumm.pl - Produce Postfix MTA logfile summary + +Copyright (C) 1998-2010 by James S. Seymour, Release 1.1.3. + +=head1 SYNOPSIS + + pflogsumm.pl -[eq] [-d <today|yesterday>] [--detail <cnt>] + [--bounce_detail <cnt>] [--deferral_detail <cnt>] + [-h <cnt>] [-i|--ignore_case] [--iso_date_time] [--mailq] + [-m|--uucp_mung] [--no_bounce_detail] [--no_deferral_detail] + [--no_no_msg_size] [--no_reject_detail] [--no_smtpd_warnings] + [--problems_first] [--rej_add_from] [--reject_detail <cnt>] + [--smtp_detail <cnt>] [--smtpd_stats] + [--smtpd_warning_detail <cnt>] [--syslog_name=string] + [-u <cnt>] [--verbose_msg_detail] [--verp_mung[=<n>]] + [--zero_fill] [file1 [filen]] + + pflogsumm.pl -[help|version] + + If no file(s) specified, reads from stdin.
Output is to stdout. + +=head1 DESCRIPTION + + Pflogsumm is a log analyzer/summarizer for the Postfix MTA. It is + designed to provide an over-view of Postfix activity, with just enough + detail to give the administrator a "heads up" for potential trouble + spots. + + Pflogsumm generates summaries and, in some cases, detailed reports of + mail server traffic volumes, rejected and bounced email, and server + warnings, errors and panics. + +=head1 OPTIONS + + --bounce_detail <cnt> + + Limit detailed bounce reports to the top <cnt>. 0 + to suppress entirely. + + -d today generate report for just today + -d yesterday generate report for just "yesterday" + + --deferral_detail <cnt> + + Limit detailed deferral reports to the top <cnt>. 0 + to suppress entirely. + + --detail <cnt> + + Sets all --*_detail, -h and -u to <cnt>. Is + over-ridden by individual settings. --detail 0 + suppresses *all* detail. + + -e extended (extreme? excessive?) detail + + Emit detailed reports. At present, this includes + only a per-message report, sorted by sender domain, + then user-in-domain, then by queue i.d. + + WARNING: the data built to generate this report can + quickly consume very large amounts of memory if a + lot of log entries are processed! + + -h <cnt> top <cnt> to display in host/domain reports. + + 0 = none. + + See also: "-u" and "--*_detail" options for further + report-limiting options. + + --help Emit short usage message and bail out. + + (By happy coincidence, "-h" alone does much the same, + being as it requires a numeric argument :-). Yeah, I + know: lame.) + + -i + --ignore_case Handle complete email address in a case-insensitive + manner. + + Normally pflogsumm lower-cases only the host and + domain parts, leaving the user part alone. This + option causes the entire email address to be lower- + cased. + + --iso_date_time + + For summaries that contain date or time information, + use ISO 8601 standard formats (CCYY-MM-DD and HH:MM), + rather than "Mon DD CCYY" and "HHMM". + + -m modify (mung?)
UUCP-style bang-paths + --uucp_mung + + This is for use when you have a mix of Internet-style + domain addresses and UUCP-style bang-paths in the log. + Upstream UUCP feeds sometimes mung Internet domain + style address into bang-paths. This option can + sometimes undo the "damage". For example: + "somehost.dom!username@foo" (where "foo" is the next + host upstream and "somehost.dom" was whence the email + originated) will get converted to + "foo!username@somehost.dom". This also affects the + extended detail report (-e), to help ensure that by- + domain-by-name sorting is more accurate. + + --mailq Run "mailq" command at end of report. + + Merely a convenience feature. (Assumes that "mailq" + is in $PATH. Set the "$mailqCmd" variable to an explicit + path if desired.) + + --no_bounce_detail + --no_deferral_detail + --no_reject_detail + + These switches are deprecated in favour of + --bounce_detail, --deferral_detail and + --reject_detail, respectively. + + Suppresses the printing of the following detailed + reports, respectively: + + message bounce detail (by relay) + message deferral detail + message reject detail + + See also: "-u" and "-h" for further report-limiting + options. + + --no_no_msg_size + + Do not emit report on "Messages with no size data". + + Message size is reported only by the queue manager. + The message may be delivered long-enough after the + (last) qmgr log entry that the information is not in + the log(s) processed by a particular run of + pflogsumm.pl. This throws off "Recipients by message + size" and the total for "bytes delivered." These are + normally reported by pflogsumm as "Messages with no + size data." + + --no_smtpd_warnings + + This switch is deprecated in favour of + --smtpd_warning_detail. + + On a busy mail server, say at an ISP, SMTPD warnings + can result in a rather sizeable report. This option + turns reporting them off. + + --problems_first + + Emit "problems" reports (bounces, defers, warnings, + etc.) before "normal" stats.
+ + --rej_add_from + For those reject reports that list IP addresses or + host/domain names: append the email from address to + each listing. (Does not apply to "Improper use of + SMTP command pipelining" report.) + + -q quiet - don't print headings for empty reports + + note: headings for warning, fatal, and "master" + messages will always be printed. + + --reject_detail <cnt> + + Limit detailed smtpd reject, warn, hold and discard + reports to the top <cnt>. 0 to suppress entirely. + + --smtp_detail <cnt> + + Limit detailed smtp delivery reports to the top <cnt>. + 0 to suppress entirely. + + --smtpd_stats + + Generate smtpd connection statistics. + + The "per-day" report is not generated for single-day + reports. For multiple-day reports: "per-hour" numbers + are daily averages (reflected in the report heading). + + --smtpd_warning_detail <cnt> + + Limit detailed smtpd warnings reports to the top <cnt>. + 0 to suppress entirely. + + --syslog_name=name + + Set the syslog name to look for in Postfix log entries. + + By default, pflogsumm looks for entries in logfiles + with a syslog name of "postfix". + If you've set a non-default "syslog_name" parameter + in your Postfix configuration, use this option to + tell pflogsumm what that is. + + See the discussion about the use of this option under + "NOTES," below. + + -u <cnt> top <cnt> to display in user reports. 0 == none. + + See also: "-h" and "--*_detail" options for further + report-limiting options. + + --verbose_msg_detail + + For the message deferral, bounce and reject summaries: + display the full "reason", rather than a truncated one. + + Note: this can result in quite long lines in the report. + + --verp_mung do "VERP" generated address (?) munging. Convert + --verp_mung=2 sender addresses of the form + "list-return-NN-someuser=some.dom@host.sender.dom" + to + "list-return-ID-someuser=some.dom@host.sender.dom" + + In other words: replace the numeric value with "ID".
+ + By specifying the optional "=2" (second form), the + munging is more "aggressive", converting the address + to something like: + + "list-return@host.sender.dom" + + Actually: specifying anything less than 2 does the + "simple" munging and anything greater than 1 results + in the more "aggressive" hack being applied. + + See "NOTES" regarding this option. + + --version Print program name and version and bail out. + + --zero_fill "Zero-fill" certain arrays so reports come out with + data in columns that might otherwise be blank. + +=head1 RETURN VALUE + + Pflogsumm doesn't return anything of interest to the shell. + +=head1 ERRORS + + Error messages are emitted to stderr. + +=head1 EXAMPLES + + Produce a report of previous day's activities: + + pflogsumm.pl -d yesterday /var/log/maillog + + A report of prior week's activities (after logs rotated): + + pflogsumm.pl /var/log/maillog.0 + + What's happened so far today: + + pflogsumm.pl -d today /var/log/maillog + + Crontab entry to generate a report of the previous day's activity + at 10 minutes after midnight. + + 10 0 * * * /usr/local/sbin/pflogsumm -d yesterday /var/log/maillog + 2>&1 |/usr/bin/mailx -s "`uname -n` daily mail stats" postmaster + + Crontab entry to generate a report for the prior week's activity. + (This example assumes one rotates one's mail logs weekly, some time + before 4:10 a.m. on Sunday.) + + 10 4 * * 0 /usr/local/sbin/pflogsumm /var/log/maillog.0 + 2>&1 |/usr/bin/mailx -s "`uname -n` weekly mail stats" postmaster + + The two crontab examples, above, must actually be a single line + each. They're broken-up into two-or-more lines due to page + formatting issues. + +=head1 SEE ALSO + + The pflogsumm FAQ: pflogsumm-faq.txt. + +=head1 NOTES + + Pflogsumm makes no attempt to catch/parse non-Postfix log + entries. Unless it has "postfix/" in the log entry, it will be + ignored.
+ + It's important that the logs are presented to pflogsumm in + chronological order so that message sizes are available when + needed. + + For display purposes: integer values are munged into "kilo" and + "mega" notation as they exceed certain values. I chose the + admittedly arbitrary boundaries of 512k and 512m as the points at + which to do this--my thinking being 512x was the largest number + (of digits) that most folks can comfortably grok at-a-glance. + These are "computer" "k" and "m", not 1000 and 1,000,000. You + can easily change all of this with some constants near the + beginning of the program. + + "Items-per-day" reports are not generated for single-day + reports. For multiple-day reports: "Items-per-hour" numbers are + daily averages (reflected in the report headings). + + Message rejects, reject warnings, holds and discards are all + reported under the "rejects" column for the Per-Hour and Per-Day + traffic summaries. + + Verp munging may not always result in correct address and + address-count reduction. + + Verp munging is always in a state of experimentation. The use + of this option may result in inaccurate statistics with regards + to the "senders" count. + + UUCP-style bang-path handling needs more work. Particularly if + Postfix is not being run with "swap_bangpath = yes" and/or *is* being + run with "append_dot_mydomain = yes", the detailed by-message report + may not be sorted correctly by-domain-by-user. (Also depends on + upstream MTA, I suspect.) + + The "percent rejected" and "percent discarded" figures are only + approximations. They are calculated as follows (example is for + "percent rejected"): + + percent rejected = + + (rejected / (delivered + rejected + discarded)) * 100 + + There are some issues with the use of --syslog_name. The problem is + that, even with $syslog_name set, Postfix will sometimes still log + things with "postfix" as the syslog_name. 
This is noted in + /etc/postfix/sample-misc.cf: + + # Beware: a non-default syslog_name setting takes effect only + # after process initialization. Some initialization errors will be + # logged with the default name, especially errors while parsing + # the command line and errors while accessing the Postfix main.cf + # configuration file. + + As a consequence, pflogsumm must always look for "postfix," in logs, + as well as whatever is supplied for syslog_name. + + Where this becomes an issue is where people are running two or more + instances of Postfix, logging to the same file. In such a case: + + . Neither instance may use the default "postfix" syslog name + and... + + . Log entries that fall victim to what's described in + sample-misc.cf will be reported under "postfix", so that if + you're running pflogsumm twice, once for each syslog_name, such + log entries will show up in each report. + + The Pflogsumm Home Page is at: + + http://jimsun.LinxNet.com/postfix_contrib.html + +=head1 REQUIREMENTS + + For certain options (e.g.: --smtpd_stats), Pflogsumm requires the + Date::Calc module, which can be obtained from CPAN at + http://www.perl.com. + + Pflogsumm is currently written and tested under Perl 5.8.3. + As of version 19990413-02, pflogsumm worked with Perl 5.003, but + future compatibility is not guaranteed. + +=head1 LICENSE + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License + as published by the Free Software Foundation; either version 2 + of the License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You may have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, + USA. + + An on-line copy of the GNU General Public License can be found + http://www.fsf.org/copyleft/gpl.html. + +=cut + +use strict; +use locale; +use Getopt::Long; +eval { require Date::Calc }; +my $hasDateCalc = $@ ? 0 : 1; + +my $mailqCmd = "mailq"; +my $release = "1.1.3"; + +# Variables and constants used throughout pflogsumm +use vars qw( + $progName + $usageMsg + %opts + $divByOneKAt $divByOneMegAt $oneK $oneMeg + @monthNames %monthNums $thisYr $thisMon + $msgCntI $msgSizeI $msgDfrsI $msgDlyAvgI $msgDlyMaxI + $isoDateTime +); + +# Some constants used by display routines. I arbitrarily chose to +# display in kilobytes and megabytes at the 512k and 512m boundaries, +# respectively. Season to taste. +$divByOneKAt = 524288; # 512k +$divByOneMegAt = 536870912; # 512m +$oneK = 1024; # 1k +$oneMeg = 1048576; # 1m + +# Constants used throughout pflogsumm +@monthNames = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec); +%monthNums = qw( + Jan 0 Feb 1 Mar 2 Apr 3 May 4 Jun 5 + Jul 6 Aug 7 Sep 8 Oct 9 Nov 10 Dec 11); +($thisMon, $thisYr) = (localtime(time()))[4,5]; +$thisYr += 1900; + +# +# Variables used only in main loop +# +# Per-user data +my (%recipUser, $recipUserCnt); +my (%sendgUser, $sendgUserCnt); +# Per-domain data +my (%recipDom, $recipDomCnt); # recipient domain data +my (%sendgDom, $sendgDomCnt); # sending domain data +# Indexes for arrays in above +$msgCntI = 0; # message count +$msgSizeI = 1; # total messages size +$msgDfrsI = 2; # number of defers +$msgDlyAvgI = 3; # total of delays (used for averaging) +$msgDlyMaxI = 4; # max delay + +my ( + $cmd, $qid, $addr, $size, $relay, $status, $delay, + $dateStr, + %panics, %fatals, %warnings, %masterMsgs, + %msgSizes, + %deferred, %bounced, + %noMsgSize, %msgDetail, + $msgsRcvd, $msgsDlvrd, $sizeRcvd, 
$sizeDlvrd, + $msgMonStr, $msgMon, $msgDay, $msgTimeStr, $msgHr, $msgMin, $msgSec, + $msgYr, + $revMsgDateStr, $dayCnt, %msgsPerDay, + %rejects, $msgsRjctd, + %warns, $msgsWrnd, + %discards, $msgsDscrdd, + %holds, $msgsHld, + %rcvdMsg, $msgsFwdd, $msgsBncd, + $msgsDfrdCnt, $msgsDfrd, %msgDfrdFlgs, + %connTime, %smtpdPerDay, %smtpdPerDom, $smtpdConnCnt, $smtpdTotTime, + %smtpMsgs +); +$dayCnt = $smtpdConnCnt = $smtpdTotTime = 0; + +# Init total messages delivered, rejected, and discarded +$msgsDlvrd = $msgsRjctd = $msgsDscrdd = 0; + +# Init messages received and delivered per hour +my @rcvPerHr = (0) x 24; +my @dlvPerHr = @rcvPerHr; +my @dfrPerHr = @rcvPerHr; # defers per hour +my @bncPerHr = @rcvPerHr; # bounces per hour +my @rejPerHr = @rcvPerHr; # rejects per hour +my $lastMsgDay = 0; + +# Init "doubly-sub-scripted array": cnt, total and max time per-hour +my @smtpdPerHr; +for (0 .. 23) { + $smtpdPerHr[$_] = [0,0,0]; +} + +$progName = "pflogsumm.pl"; +$usageMsg = + "usage: $progName -[eq] [-d <today|yesterday>] [--detail <cnt>] + [--bounce_detail <cnt>] [--deferral_detail <cnt>] + [-h <cnt>] [-i|--ignore_case] [--iso_date_time] [--mailq] + [-m|--uucp_mung] [--no_bounce_detail] [--no_deferral_detail] + [--no_no_msg_size] [--no_reject_detail] [--no_smtpd_warnings] + [--problems_first] [--rej_add_from] [--reject_detail <cnt>] + [--smtp_detail <cnt>] [--smtpd_stats] + [--smtpd_warning_detail <cnt>] [--syslog_name=string] + [-u <cnt>] [--verbose_msg_detail] [--verp_mung[=<n>]] + [--zero_fill] [file1 [filen]] + + $progName --[version|help]"; + +# Some pre-inits for convenience +$isoDateTime = 0; # Don't use ISO date/time formats +GetOptions( + "bounce_detail=i" => \$opts{'bounceDetail'}, + "d=s" => \$opts{'d'}, + "deferral_detail=i" => \$opts{'deferralDetail'}, + "detail=i" => \$opts{'detail'}, + "e" => \$opts{'e'}, + "help" => \$opts{'help'}, + "h=i" => \$opts{'h'}, + "ignore_case" => \$opts{'i'}, + "i" => \$opts{'i'}, + "iso_date_time" => \$isoDateTime, + "mailq" => \$opts{'mailq'}, + "m" => \$opts{'m'}, + "no_bounce_detail"
=> \$opts{'noBounceDetail'}, + "no_deferral_detail" => \$opts{'noDeferralDetail'}, + "no_no_msg_size" => \$opts{'noNoMsgSize'}, + "no_reject_detail" => \$opts{'noRejectDetail'}, + "no_smtpd_warnings" => \$opts{'noSMTPDWarnings'}, + "problems_first" => \$opts{'pf'}, + "q" => \$opts{'q'}, + "rej_add_from" => \$opts{'rejAddFrom'}, + "reject_detail=i" => \$opts{'rejectDetail'}, + "smtp_detail=i" => \$opts{'smtpDetail'}, + "smtpd_stats" => \$opts{'smtpdStats'}, + "smtpd_warning_detail=i" => \$opts{'smtpdWarnDetail'}, + "syslog_name=s" => \$opts{'syslogName'}, + "u=i" => \$opts{'u'}, + "uucp_mung" => \$opts{'m'}, + "verbose_msg_detail" => \$opts{'verbMsgDetail'}, + "verp_mung:i" => \$opts{'verpMung'}, + "version" => \$opts{'version'}, + "zero_fill" => \$opts{'zeroFill'} +) || die "$usageMsg\n"; + +# internally: 0 == none, undefined == -1 == all +$opts{'h'} = -1 unless(defined($opts{'h'})); +$opts{'u'} = -1 unless(defined($opts{'u'})); +$opts{'bounceDetail'} = -1 unless(defined($opts{'bounceDetail'})); +$opts{'deferralDetail'} = -1 unless(defined($opts{'deferralDetail'})); +$opts{'smtpDetail'} = -1 unless(defined($opts{'smtpDetail'})); +$opts{'smtpdWarnDetail'} = -1 unless(defined($opts{'smtpdWarnDetail'})); +$opts{'rejectDetail'} = -1 unless(defined($opts{'rejectDetail'})); + +# These go away eventually +if(defined($opts{'noBounceDetail'})) { + $opts{'bounceDetail'} = 0; + warn "$progName: \"no_bounce_detail\" is depreciated, use \"bounce_detail=0\" instead\n" +} +if(defined($opts{'noDeferralDetail'})) { + $opts{'deferralDetail'} = 0; + warn "$progName: \"no_deferral_detail\" is depreciated, use \"deferral_detail=0\" instead\n" +} +if(defined($opts{'noRejectDetail'})) { + $opts{'rejectDetail'} = 0; + warn "$progName: \"no_reject_detail\" is depreciated, use \"reject_detail=0\" instead\n" +} +if(defined($opts{'noSMTPDWarnings'})) { + $opts{'smtpdWarnDetail'} = 0; + warn "$progName: \"no_smtpd_warnings\" is depreciated, use \"smtpd_warning_detail=0\" instead\n" +} + +# If 
--detail was specified, set anything that's not enumerated to it +if(defined($opts{'detail'})) { + foreach my $optName (qw (h u bounceDetail deferralDetail smtpDetail smtpdWarnDetail rejectDetail)) { + $opts{$optName} = $opts{'detail'} unless($opts{"$optName"} != -1); + } +} + +my $syslogName = $opts{'syslogName'}? $opts{'syslogName'} : "postfix"; + +if(defined($opts{'help'})) { + print "$usageMsg\n"; + exit 0; +} + +if(defined($opts{'version'})) { + print "$progName $release\n"; + exit 0; +} + +if($hasDateCalc) { + # manually import the Date::Calc routine we want + # + # This looks stupid, but it's the only way to shut Perl up about + # "Date::Calc::Delta_DHMS" used only once" if -w is on. (No, + # $^W = 0 doesn't work in this context.) + *Delta_DHMS = *Date::Calc::Delta_DHMS; + *Delta_DHMS = *Date::Calc::Delta_DHMS; + +} elsif(defined($opts{'smtpdStats'})) { + # If user specified --smtpd_stats but doesn't have Date::Calc + # installed, die with friendly help message. + die < unprocessed") || +# die "couldn't open \"unprocessed\": $!\n"; + +while(<>) { + next if(defined($dateStr) && ! /^$dateStr/o); + s/: \[ID \d+ [^\]]+\] /: /o; # lose "[ID nnnnnn some.thing]" stuff + my $logRmdr; + + # "Traditional" timestamp format? + if((($msgMonStr, $msgDay, $msgHr, $msgMin, $msgSec, $logRmdr) = + /^(...) {1,2}(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) \S+ (.+)$/o) == 6) + { + # Convert string to numeric value for later "month rollover" check + $msgMon = $monthNums{$msgMonStr}; + } else { + # RFC 3339 timestamp format? 
+ next unless((($msgYr, $msgMon, $msgDay, $msgHr, $msgMin, $msgSec, $logRmdr) = + /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:[\+\-](?:\d{2}):(?:\d{2})|Z) \S+ (.+)$/o) == 7); + # RFC 3339 months start at "1", we index from 0 + --$msgMon; + } + + unless((($cmd, $qid) = $logRmdr =~ m#^(?:postfix|$syslogName)/([^\[:]*).*?: ([^:\s]+)#o) == 2 || + (($cmd, $qid) = $logRmdr =~ m#^((?:postfix)(?:-script)?)(?:\[\d+\])?: ([^:\s]+)#o) == 2) + { + #print UNPROCD "$_"; + next; + } + chomp; + + # If the log line's month is greater than our current month, + # we've probably had a year rollover + # FIXME: For processing old logfiles: This is a broken test! + $msgYr = ($msgMon > $thisMon? $thisYr - 1 : $thisYr); + + # the following test depends on one getting more than one message a + # month--or at least that successive messages don't arrive on the + # same month-day in successive months :-) + unless($msgDay == $lastMsgDay) { + $lastMsgDay = $msgDay; + $revMsgDateStr = sprintf "%d%02d%02d", $msgYr, $msgMon, $msgDay; + ++$dayCnt; + if(defined($opts{'zeroFill'})) { + ${$msgsPerDay{$revMsgDateStr}}[4] = 0; + } + } + + # regexp rejects happen in "cleanup" + if($cmd eq "cleanup" && (my($rejSubTyp, $rejReas, $rejRmdr) = $logRmdr =~ + /\/cleanup\[\d+\]: .*?\b(reject|warning|hold|discard): (header|body) (.*)$/o) == 3) + { + $rejRmdr =~ s/( from \S+?)?; from=<.*$//o unless($opts{'verbMsgDetail'}); + $rejRmdr = string_trimmer($rejRmdr, 64, $opts{'verbMsgDetail'}); + if($rejSubTyp eq "reject") { + ++$rejects{$cmd}{$rejReas}{$rejRmdr} unless($opts{'rejectDetail'} == 0); + ++$msgsRjctd; + } elsif($rejSubTyp eq "warning") { + ++$warns{$cmd}{$rejReas}{$rejRmdr} unless($opts{'rejectDetail'} == 0); + ++$msgsWrnd; + } elsif($rejSubTyp eq "hold") { + ++$holds{$cmd}{$rejReas}{$rejRmdr} unless($opts{'rejectDetail'} == 0); + ++$msgsHld; + } elsif($rejSubTyp eq "discard") { + ++$discards{$cmd}{$rejReas}{$rejRmdr} unless($opts{'rejectDetail'} == 0); + ++$msgsDscrdd; + } + ++$rejPerHr[$msgHr]; +
++${$msgsPerDay{$revMsgDateStr}}[4]; + } elsif($qid eq 'warning') { + (my $warnReas = $logRmdr) =~ s/^.*warning: //o; + $warnReas = string_trimmer($warnReas, 66, $opts{'verbMsgDetail'}); + unless($cmd eq "smtpd" && $opts{'noSMTPDWarnings'}) { + ++$warnings{$cmd}{$warnReas}; + } + } elsif($qid eq 'fatal') { + (my $fatalReas = $logRmdr) =~ s/^.*fatal: //o; + $fatalReas = string_trimmer($fatalReas, 66, $opts{'verbMsgDetail'}); + ++$fatals{$cmd}{$fatalReas}; + } elsif($qid eq 'panic') { + (my $panicReas = $logRmdr) =~ s/^.*panic: //o; + $panicReas = string_trimmer($panicReas, 66, $opts{'verbMsgDetail'}); + ++$panics{$cmd}{$panicReas}; + } elsif($qid eq 'reject') { + proc_smtpd_reject($logRmdr, \%rejects, \$msgsRjctd, \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($qid eq 'reject_warning') { + proc_smtpd_reject($logRmdr, \%warns, \$msgsWrnd, \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($qid eq 'hold') { + proc_smtpd_reject($logRmdr, \%holds, \$msgsHld, \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($qid eq 'discard') { + proc_smtpd_reject($logRmdr, \%discards, \$msgsDscrdd, \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($cmd eq 'master') { + ++$masterMsgs{(split(/^.*master.*: /, $logRmdr))[1]}; + } elsif($cmd eq 'smtpd') { + if($logRmdr =~ /\[\d+\]: \w+: client=(.+?)(,|$)/o) { + # + # Warning: this code in two places! 
+ # + ++$rcvPerHr[$msgHr]; + ++${$msgsPerDay{$revMsgDateStr}}[0]; + ++$msgsRcvd; + $rcvdMsg{$qid} = gimme_domain($1); # Whence it came + } elsif(my($rejSubTyp) = $logRmdr =~ /\[\d+\]: \w+: (reject(?:_warning)?|hold|discard): /o) { + if($rejSubTyp eq 'reject') { + proc_smtpd_reject($logRmdr, \%rejects, \$msgsRjctd, + \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($rejSubTyp eq 'reject_warning') { + proc_smtpd_reject($logRmdr, \%warns, \$msgsWrnd, + \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($rejSubTyp eq 'hold') { + proc_smtpd_reject($logRmdr, \%holds, \$msgsHld, + \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } elsif($rejSubTyp eq 'discard') { + proc_smtpd_reject($logRmdr, \%discards, \$msgsDscrdd, + \$rejPerHr[$msgHr], + \${$msgsPerDay{$revMsgDateStr}}[4]); + } + } + else { + next unless(defined($opts{'smtpdStats'})); + if($logRmdr =~ /: connect from /o) { + $logRmdr =~ /\/smtpd\[(\d+)\]: /o; + @{$connTime{$1}} = + ($msgYr, $msgMon + 1, $msgDay, $msgHr, $msgMin, $msgSec); + } elsif($logRmdr =~ /: disconnect from /o) { + my ($pid, $hostID) = $logRmdr =~ /\/smtpd\[(\d+)\]: disconnect from (.+)$/o; + if(exists($connTime{$pid})) { + $hostID = gimme_domain($hostID); + my($d, $h, $m, $s) = Delta_DHMS(@{$connTime{$pid}}, + $msgYr, $msgMon + 1, $msgDay, $msgHr, $msgMin, $msgSec); + delete($connTime{$pid}); # dispose of no-longer-needed item + my $tSecs = (86400 * $d) + (3600 * $h) + (60 * $m) + $s; + + ++$smtpdPerHr[$msgHr][0]; + $smtpdPerHr[$msgHr][1] += $tSecs; + $smtpdPerHr[$msgHr][2] = $tSecs if($tSecs > $smtpdPerHr[$msgHr][2]); + + unless(${$smtpdPerDay{$revMsgDateStr}}[0]++) { + ${$smtpdPerDay{$revMsgDateStr}}[1] = 0; + ${$smtpdPerDay{$revMsgDateStr}}[2] = 0; + } + ${$smtpdPerDay{$revMsgDateStr}}[1] += $tSecs; + ${$smtpdPerDay{$revMsgDateStr}}[2] = $tSecs + if($tSecs > ${$smtpdPerDay{$revMsgDateStr}}[2]); + + unless(${$smtpdPerDom{$hostID}}[0]++) { + ${$smtpdPerDom{$hostID}}[1] = 0; + 
${$smtpdPerDom{$hostID}}[2] = 0; + } + ${$smtpdPerDom{$hostID}}[1] += $tSecs; + ${$smtpdPerDom{$hostID}}[2] = $tSecs + if($tSecs > ${$smtpdPerDom{$hostID}}[2]); + + ++$smtpdConnCnt; + $smtpdTotTime += $tSecs; + } + } + } + } else { + my $toRmdr; + if((($addr, $size) = $logRmdr =~ /from=<([^>]*)>, size=(\d+)/o) == 2) + { + next if($msgSizes{$qid}); # avoid double-counting! + if($addr) { + if($opts{'m'} && $addr =~ /^(.*!)*([^!]+)!([^!@]+)@([^\.]+)$/o) { + $addr = "$4!" . ($1? "$1" : "") . $3 . "\@$2"; + } + $addr =~ s/(@.+)/\L$1/o unless($opts{'i'}); + $addr = lc($addr) if($opts{'i'}); + $addr = verp_mung($addr); + } else { + $addr = "from=<>" + } + $msgSizes{$qid} = $size; + push(@{$msgDetail{$qid}}, $addr) if($opts{'e'}); + # Avoid counting forwards + if($rcvdMsg{$qid}) { + # Get the domain out of the sender's address. If there is + # none: Use the client hostname/IP-address + my $domAddr; + unless((($domAddr = $addr) =~ s/^[^@]+\@(.+)$/$1/o) == 1) { + $domAddr = $rcvdMsg{$qid} eq "pickup"? $addr : $rcvdMsg{$qid}; + } + ++$sendgDomCnt + unless(${$sendgDom{$domAddr}}[$msgCntI]); + ++${$sendgDom{$domAddr}}[$msgCntI]; + ${$sendgDom{$domAddr}}[$msgSizeI] += $size; + ++$sendgUserCnt unless(${$sendgUser{$addr}}[$msgCntI]); + ++${$sendgUser{$addr}}[$msgCntI]; + ${$sendgUser{$addr}}[$msgSizeI] += $size; + $sizeRcvd += $size; + delete($rcvdMsg{$qid}); # limit hash size + } + } + elsif((($addr, $relay, $delay, $status, $toRmdr) = $logRmdr =~ + /to=<([^>]*)>, (?:orig_to=<[^>]*>, )?relay=([^,]+), (?:conn_use=[^,]+, )?delay=([^,]+), (?:delays=[^,]+, )?(?:dsn=[^,]+, )?status=(\S+)(.*)$/o) >= 4) + { + + if($opts{'m'} && $addr =~ /^(.*!)*([^!]+)!([^!@]+)@([^\.]+)$/o) { + $addr = "$4!" . ($1? "$1" : "") . $3 . "\@$2"; + } + $addr =~ s/(@.+)/\L$1/o unless($opts{'i'}); + $addr = lc($addr) if($opts{'i'}); + $relay = lc($relay) if($opts{'i'}); + (my $domAddr = $addr) =~ s/^[^@]+\@//o; # get domain only + if($status eq 'sent') { + + # was it actually forwarded, rather than delivered? 
+ if($toRmdr =~ /forwarded as /o) { + ++$msgsFwdd; + next; + } + ++$recipDomCnt unless(${$recipDom{$domAddr}}[$msgCntI]); + ++${$recipDom{$domAddr}}[$msgCntI]; + ${$recipDom{$domAddr}}[$msgDlyAvgI] += $delay; + if(! ${$recipDom{$domAddr}}[$msgDlyMaxI] || + $delay > ${$recipDom{$domAddr}}[$msgDlyMaxI]) + { + ${$recipDom{$domAddr}}[$msgDlyMaxI] = $delay + } + ++$recipUserCnt unless(${$recipUser{$addr}}[$msgCntI]); + ++${$recipUser{$addr}}[$msgCntI]; + ++$dlvPerHr[$msgHr]; + ++${$msgsPerDay{$revMsgDateStr}}[1]; + ++$msgsDlvrd; + if($msgSizes{$qid}) { + ${$recipDom{$domAddr}}[$msgSizeI] += $msgSizes{$qid}; + ${$recipUser{$addr}}[$msgSizeI] += $msgSizes{$qid}; + $sizeDlvrd += $msgSizes{$qid}; + } else { + ${$recipDom{$domAddr}}[$msgSizeI] += 0; + ${$recipUser{$addr}}[$msgSizeI] += 0; + $noMsgSize{$qid} = $addr unless($opts{'noNoMsgSize'}); + push(@{$msgDetail{$qid}}, "(sender not in log)") if($opts{'e'}); + # put this back later? mebbe with -v? + # msg_warn("no message size for qid: $qid"); + } + push(@{$msgDetail{$qid}}, $addr) if($opts{'e'}); + } elsif($status eq 'deferred') { + unless($opts{'deferralDetail'} == 0) { + my ($deferredReas) = $logRmdr =~ /, status=deferred \(([^\)]+)/o; + unless(defined($opts{'verbMsgDetail'})) { + $deferredReas = said_string_trimmer($deferredReas, 65); + $deferredReas =~ s/^\d{3} //o; + $deferredReas =~ s/^connect to //o; + } + ++$deferred{$cmd}{$deferredReas}; + } + ++$dfrPerHr[$msgHr]; + ++${$msgsPerDay{$revMsgDateStr}}[2]; + ++$msgsDfrdCnt; + ++$msgsDfrd unless($msgDfrdFlgs{$qid}++); + ++${$recipDom{$domAddr}}[$msgDfrsI]; + if(! 
${$recipDom{$domAddr}}[$msgDlyMaxI] || + $delay > ${$recipDom{$domAddr}}[$msgDlyMaxI]) + { + ${$recipDom{$domAddr}}[$msgDlyMaxI] = $delay + } + } elsif($status eq 'bounced') { + unless($opts{'bounceDetail'} == 0) { + my ($bounceReas) = $logRmdr =~ /, status=bounced \((.+)\)/o; + unless(defined($opts{'verbMsgDetail'})) { + $bounceReas = said_string_trimmer($bounceReas, 66); + $bounceReas =~ s/^\d{3} //o; + } + ++$bounced{$relay}{$bounceReas}; + } + ++$bncPerHr[$msgHr]; + ++${$msgsPerDay{$revMsgDateStr}}[3]; + ++$msgsBncd; + } else { +# print UNPROCD "$_\n"; + } + } + elsif($cmd eq 'pickup' && $logRmdr =~ /: (sender|uid)=/o) { + # + # Warning: this code in two places! + # + ++$rcvPerHr[$msgHr]; + ++${$msgsPerDay{$revMsgDateStr}}[0]; + ++$msgsRcvd; + $rcvdMsg{$qid} = "pickup"; # Whence it came + } + elsif($cmd eq 'smtp' && $opts{'smtpDetail'} != 0) { + # Was an IPv6 problem here + if($logRmdr =~ /.* connect to (\S+?): ([^;]+); address \S+ port.*$/o) { + ++$smtpMsgs{lc($2)}{$1}; + } elsif($logRmdr =~ /.* connect to ([^[]+)\[\S+?\]: (.+?) 
\(port \d+\)$/o) { + ++$smtpMsgs{lc($2)}{$1}; + } else { +# print UNPROCD "$_\n"; + } + } + else + { +# print UNPROCD "$_\n"; + } + } +} + +# debugging +#close(UNPROCD) || +# die "problem closing \"unprocessed\": $!\n"; + +# Calculate percentage of messages rejected and discarded +my $msgsRjctdPct = 0; +my $msgsDscrddPct = 0; +if(my $msgsTotal = $msgsDlvrd + $msgsRjctd + $msgsDscrdd) { + $msgsRjctdPct = int(($msgsRjctd/$msgsTotal) * 100); + $msgsDscrddPct = int(($msgsDscrdd/$msgsTotal) * 100); +} + +if(defined($dateStr)) { + print "Postfix log summaries for $dateStr\n"; +} + +print_subsect_title("Grand Totals"); +print "messages\n\n"; +printf " %6d%s received\n", adj_int_units($msgsRcvd); +printf " %6d%s delivered\n", adj_int_units($msgsDlvrd); +printf " %6d%s forwarded\n", adj_int_units($msgsFwdd); +printf " %6d%s deferred", adj_int_units($msgsDfrd); +printf " (%d%s deferrals)", adj_int_units($msgsDfrdCnt) if($msgsDfrdCnt); +print "\n"; +printf " %6d%s bounced\n", adj_int_units($msgsBncd); +printf " %6d%s rejected (%d%%)\n", adj_int_units($msgsRjctd), $msgsRjctdPct; +printf " %6d%s reject warnings\n", adj_int_units($msgsWrnd); +printf " %6d%s held\n", adj_int_units($msgsHld); +printf " %6d%s discarded (%d%%)\n", adj_int_units($msgsDscrdd), $msgsDscrddPct; +print "\n"; +printf " %6d%s bytes received\n", adj_int_units($sizeRcvd); +printf " %6d%s bytes delivered\n", adj_int_units($sizeDlvrd); +printf " %6d%s senders\n", adj_int_units($sendgUserCnt); +printf " %6d%s sending hosts/domains\n", adj_int_units($sendgDomCnt); +printf " %6d%s recipients\n", adj_int_units($recipUserCnt); +printf " %6d%s recipient hosts/domains\n", adj_int_units($recipDomCnt); + +if(defined($opts{'smtpdStats'})) { + print "\nsmtpd\n\n"; + printf " %6d%s connections\n", adj_int_units($smtpdConnCnt); + printf " %6d%s hosts/domains\n", adj_int_units(int(keys %smtpdPerDom)); + printf " %6d avg. connect time (seconds)\n", + $smtpdConnCnt > 0? 
($smtpdTotTime / $smtpdConnCnt) + .5 : 0; + { + my ($sec, $min, $hr) = get_smh($smtpdTotTime); + printf " %2d:%02d:%02d total connect time\n", + $hr, $min, $sec; + } +} + +print "\n"; + +print_problems_reports() if(defined($opts{'pf'})); + +print_per_day_summary(\%msgsPerDay) if($dayCnt > 1); +print_per_hour_summary(\@rcvPerHr, \@dlvPerHr, \@dfrPerHr, \@bncPerHr, + \@rejPerHr, $dayCnt); + +print_recip_domain_summary(\%recipDom, $opts{'h'}); +print_sending_domain_summary(\%sendgDom, $opts{'h'}); + +if(defined($opts{'smtpdStats'})) { + print_per_day_smtpd(\%smtpdPerDay, $dayCnt) if($dayCnt > 1); + print_per_hour_smtpd(\@smtpdPerHr, $dayCnt); + print_domain_smtpd_summary(\%smtpdPerDom, $opts{'h'}); +} + +print_user_data(\%sendgUser, "Senders by message count", $msgCntI, $opts{'u'}, $opts{'q'}); +print_user_data(\%recipUser, "Recipients by message count", $msgCntI, $opts{'u'}, $opts{'q'}); +print_user_data(\%sendgUser, "Senders by message size", $msgSizeI, $opts{'u'}, $opts{'q'}); +print_user_data(\%recipUser, "Recipients by message size", $msgSizeI, $opts{'u'}, $opts{'q'}); + +print_hash_by_key(\%noMsgSize, "Messages with no size data", 0, 1); + +print_problems_reports() unless(defined($opts{'pf'})); + +print_detailed_msg_data(\%msgDetail, "Message detail", $opts{'q'}) if($opts{'e'}); + +# Print "problems" reports +sub print_problems_reports { + unless($opts{'deferralDetail'} == 0) { + print_nested_hash(\%deferred, "message deferral detail", $opts{'deferralDetail'}, $opts{'q'}); + } + unless($opts{'bounceDetail'} == 0) { + print_nested_hash(\%bounced, "message bounce detail (by relay)", $opts{'bounceDetail'}, $opts{'q'}); + } + unless($opts{'rejectDetail'} == 0) { + print_nested_hash(\%rejects, "message reject detail", $opts{'rejectDetail'}, $opts{'q'}); + print_nested_hash(\%warns, "message reject warning detail", $opts{'rejectDetail'}, $opts{'q'}); + print_nested_hash(\%holds, "message hold detail", $opts{'rejectDetail'}, $opts{'q'}); + print_nested_hash(\%discards, 
"message discard detail", $opts{'rejectDetail'}, $opts{'q'}); + } + unless($opts{'smtpDetail'} == 0) { + print_nested_hash(\%smtpMsgs, "smtp delivery failures", $opts{'smtpDetail'}, $opts{'q'}); + } + unless($opts{'smtpdWarnDetail'} == 0) { + print_nested_hash(\%warnings, "Warnings", $opts{'smtpdWarnDetail'}, $opts{'q'}); + } + print_nested_hash(\%fatals, "Fatal Errors", 0, $opts{'q'}); + print_nested_hash(\%panics, "Panics", 0, $opts{'q'}); + print_hash_by_cnt_vals(\%masterMsgs,"Master daemon messages", 0, $opts{'q'}); +} + +if($opts{'mailq'}) { + # flush stdout first cuz of asynchronousity + $| = 1; + print_subsect_title("Current Mail Queue"); + system($mailqCmd); +} + +# print "per-day" traffic summary +# (done in a subroutine only to keep main-line code clean) +sub print_per_day_summary { + my($msgsPerDay) = @_; + my $value; + + print_subsect_title("Per-Day Traffic Summary"); + + print < $b } keys(%$msgsPerDay)) { + my ($msgYr, $msgMon, $msgDay) = unpack("A4 A2 A2", $_); + if($isoDateTime) { + printf " %04d-%02d-%02d ", $msgYr, $msgMon + 1, $msgDay + } else { + my $msgMonStr = $monthNames[$msgMon]; + printf " $msgMonStr %2d $msgYr", $msgDay; + } + foreach $value (@{$msgsPerDay->{$_}}) { + my $value2 = $value? $value : 0; + printf " %6d%s", adj_int_units($value2); + } + print "\n"; + } +} + +# print "per-hour" traffic summary +# (done in a subroutine only to keep main-line code clean) +sub print_per_hour_summary { + my ($rcvPerHr, $dlvPerHr, $dfrPerHr, $bncPerHr, $rejPerHr, $dayCnt) = @_; + my $reportType = $dayCnt > 1? 'Daily Average' : 'Summary'; + my ($hour, $value); + + print_subsect_title("Per-Hour Traffic $reportType"); + + print < 0? 
"(top $cnt)" : ""; + my $avgDly; + + print_subsect_title("Host/Domain Summary: Message Delivery $topCnt"); + + print <{$_}}[$msgCntI]) { + $avgDly = (${$hashRef->{$_}}[$msgDlyAvgI] / + ${$hashRef->{$_}}[$msgCntI]); + } else { + $avgDly = 0; + } + printf " %6d%s %6d%s %6d%s %5.1f %s %5.1f %s %s\n", + adj_int_units(${$hashRef->{$_}}[$msgCntI]), + adj_int_units(${$hashRef->{$_}}[$msgSizeI]), + adj_int_units(${$hashRef->{$_}}[$msgDfrsI]), + adj_time_units($avgDly), + adj_time_units(${$hashRef->{$_}}[$msgDlyMaxI]), + $_; + last if --$cnt == 0; + } +} + +# print "per-sender-domain" traffic summary +# (done in a subroutine only to keep main-line code clean) +sub print_sending_domain_summary { + use vars '$hashRef'; + local($hashRef) = $_[0]; + my($cnt) = $_[1]; + return if($cnt == 0); + my $topCnt = $cnt > 0? "(top $cnt)" : ""; + + print_subsect_title("Host/Domain Summary: Messages Received $topCnt"); + + print <{$_}}[$msgCntI]), + adj_int_units(${$hashRef->{$_}}[$msgSizeI]), + $_; + last if --$cnt == 0; + } +} + +# print "per-user" data sorted in descending order +# order (i.e.: highest first) +sub print_user_data { + my($hashRef, $title, $index, $cnt, $quiet) = @_; + my $dottedLine; + return if($cnt == 0); + $title = sprintf "%s%s", $cnt > 0? "top $cnt " : "", $title; + unless(%$hashRef) { + return if($quiet); + $dottedLine = ": none"; + } else { + $dottedLine = "\n" . 
"-" x length($title); + } + printf "\n$title$dottedLine\n"; + foreach (map { $_->[0] } + sort { $b->[1] <=> $a->[1] || $a->[2] cmp $b->[2] } + map { [ $_, $hashRef->{$_}[$index], normalize_host($_) ] } + (keys(%$hashRef))) + { + printf " %6d%s %s\n", adj_int_units(${$hashRef->{$_}}[$index]), $_; + last if --$cnt == 0; + } +} + + +# print "per-hour" smtpd connection summary +# (done in a subroutine only to keep main-line code clean) +sub print_per_hour_smtpd { + my ($smtpdPerHr, $dayCnt) = @_; + my ($hour, $value); + if($dayCnt > 1) { + print_subsect_title("Per-Hour SMTPD Connection Daily Average"); + + print <[0] || next; + my $avg = int($smtpdPerHr[$hour]->[0]? + ($smtpdPerHr[$hour]->[1]/$smtpdPerHr[$hour]->[0]) + .5 : 0); + if($dayCnt > 1) { + $smtpdPerHr[$hour]->[0] /= $dayCnt; + $smtpdPerHr[$hour]->[1] /= $dayCnt; + $smtpdPerHr[$hour]->[0] += .5; + $smtpdPerHr[$hour]->[1] += .5; + } + my($sec, $min, $hr) = get_smh($smtpdPerHr[$hour]->[1]); + + if($isoDateTime) { + printf " %02d:00-%02d:00", $hour, $hour + 1; + } else { + printf " %02d00-%02d00 ", $hour, $hour + 1; + } + printf " %6d%s %2d:%02d:%02d", + adj_int_units($smtpdPerHr[$hour]->[0]), + $hr, $min, $sec; + if($dayCnt < 2) { + printf " %6ds %6ds", + $avg, + $smtpdPerHr[$hour]->[2]; + } + print "\n"; + } +} + +# print "per-day" smtpd connection summary +# (done in a subroutine only to keep main-line code clean) +sub print_per_day_smtpd { + my ($smtpdPerDay, $dayCnt) = @_; + + print_subsect_title("Per-Day SMTPD Connection Summary"); + + print < $b } keys(%$smtpdPerDay)) { + my ($msgYr, $msgMon, $msgDay) = unpack("A4 A2 A2", $_); + if($isoDateTime) { + printf " %04d-%02d-%02d ", $msgYr, $msgMon + 1, $msgDay + } else { + my $msgMonStr = $monthNames[$msgMon]; + printf " $msgMonStr %2d $msgYr", $msgDay; + } + + my $avg = (${$smtpdPerDay{$_}}[1]/${$smtpdPerDay{$_}}[0]) + .5; + my($sec, $min, $hr) = get_smh(${$smtpdPerDay{$_}}[1]); + + printf " %6d%s %2d:%02d:%02d %6ds %6ds\n", + 
adj_int_units(${$smtpdPerDay{$_}}[0]), + $hr, $min, $sec, + $avg, + ${$smtpdPerDay{$_}}[2]; + } +} + +# print "per-domain-smtpd" connection summary +# (done in a subroutine only to keep main-line code clean) +sub print_domain_smtpd_summary { + use vars '$hashRef'; + local($hashRef) = $_[0]; + my($cnt) = $_[1]; + return if($cnt == 0); + my $topCnt = $cnt > 0? "(top $cnt)" : ""; + my $avgDly; + + print_subsect_title("Host/Domain Summary: SMTPD Connections $topCnt"); + + print <{$_}}[1]/${$hashRef->{$_}}[0]) + .5; + my ($sec, $min, $hr) = get_smh(${$hashRef->{$_}}[1]); + + printf " %6d%s %2d:%02d:%02d %6ds %6ds %s\n", + adj_int_units(${$hashRef->{$_}}[0]), + $hr, $min, $sec, + $avg, + ${$hashRef->{$_}}[2], + $_; + last if --$cnt == 0; + } +} + +# print hash contents sorted by numeric values in descending +# order (i.e.: highest first) +sub print_hash_by_cnt_vals { + my($hashRef, $title, $cnt, $quiet) = @_; + my $dottedLine; + $title = sprintf "%s%s", $cnt? "top $cnt " : "", $title; + unless(%$hashRef) { + return if($quiet); + $dottedLine = ": none"; + } else { + $dottedLine = "\n" . "-" x length($title); + } + printf "\n$title$dottedLine\n"; + really_print_hash_by_cnt_vals($hashRef, $cnt, ' '); +} + +# print hash contents sorted by key in ascending order +sub print_hash_by_key { + my($hashRef, $title, $cnt, $quiet) = @_; + my $dottedLine; + $title = sprintf "%s%s", $cnt? "first $cnt " : "", $title; + unless(%$hashRef) { + return if($quiet); + $dottedLine = ": none"; + } else { + $dottedLine = "\n" . "-" x length($title); + } + printf "\n$title$dottedLine\n"; + foreach (sort keys(%$hashRef)) + { + printf " %s %s\n", $_, $hashRef->{$_}; + last if --$cnt == 0; + } +} + +# print "nested" hashes +sub print_nested_hash { + my($hashRef, $title, $cnt, $quiet) = @_; + my $dottedLine; + unless(%$hashRef) { + return if($quiet); + $dottedLine = ": none"; + } else { + $dottedLine = "\n" . 
"-" x length($title); + } + printf "\n$title$dottedLine\n"; + walk_nested_hash($hashRef, $cnt, 0); +} + +# "walk" a "nested" hash +sub walk_nested_hash { + my ($hashRef, $cnt, $level) = @_; + $level += 2; + my $indents = ' ' x $level; + my ($keyName, $hashVal) = each(%$hashRef); + + if(ref($hashVal) eq 'HASH') { + foreach (sort keys %$hashRef) { + print "$indents$_"; + # If the next hash is finally the data, total the + # counts for the report and print + my $hashVal2 = (each(%{$hashRef->{$_}}))[1]; + keys(%{$hashRef->{$_}}); # "reset" hash iterator + unless(ref($hashVal2) eq 'HASH') { + print " (top $cnt)" if($cnt > 0); + my $rptCnt = 0; + $rptCnt += $_ foreach (values %{$hashRef->{$_}}); + print " (total: $rptCnt)"; + } + print "\n"; + walk_nested_hash($hashRef->{$_}, $cnt, $level); + } + } else { + really_print_hash_by_cnt_vals($hashRef, $cnt, $indents); + } +} + + +# print per-message info in excruciating detail :-) +sub print_detailed_msg_data { + use vars '$hashRef'; + local($hashRef) = $_[0]; + my($title, $quiet) = @_[1,2]; + my $dottedLine; + unless(%$hashRef) { + return if($quiet); + $dottedLine = ": none"; + } else { + $dottedLine = "\n" . "-" x length($title); + } + printf "\n$title$dottedLine\n"; + foreach (sort by_domain_then_user keys(%$hashRef)) + { + printf " %s %s\n", $_, shift(@{$hashRef->{$_}}); + foreach (@{$hashRef->{$_}}) { + print " $_\n"; + } + print "\n"; + } +} + +# *really* print hash contents sorted by numeric values in descending +# order (i.e.: highest first), then by IP/addr, in ascending order. 
+sub really_print_hash_by_cnt_vals { + my($hashRef, $cnt, $indents) = @_; + + foreach (map { $_->[0] } + sort { $b->[1] <=> $a->[1] || $a->[2] cmp $b->[2] } + map { [ $_, $hashRef->{$_}, normalize_host($_) ] } + (keys(%$hashRef))) + { + printf "$indents%6d%s %s\n", adj_int_units($hashRef->{$_}), $_; + last if --$cnt == 0; + } +} + +# Print a sub-section title with properly-sized underline +sub print_subsect_title { + my $title = $_[0]; + print "\n$title\n" . "-" x length($title) . "\n"; +} + +# Normalize IP addr or hostname +# (Note: Makes no effort to normalize IPv6 addrs. Just returns them +# as they're passed-in.) +sub normalize_host { + # For IP addrs and hostnames: lop off possible " (user@dom.ain)" bit + my $norm1 = (split(/\s/, $_[0]))[0]; + + if((my @octets = ($norm1 =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/o)) == 4) { + # Dotted-quad IP address + return(pack('C4', @octets)); + } else { + # Possibly hostname or user@dom.ain + return(join( '', map { lc $_ } reverse split /[.@]/, $norm1 )); + } +} + +# subroutine to sort by domain, then user in domain, then by queue i.d. +# Note: mixing Internet-style domain names and UUCP-style bang-paths +# may confuse this thing. An attempt is made to use the first host +# preceding the username in the bang-path as the "domain" if none is +# found otherwise. +sub by_domain_then_user { + # first see if we can get "user@somedomain" + my($userNameA, $domainA) = split(/\@/, ${$hashRef->{$a}}[0]); + my($userNameB, $domainB) = split(/\@/, ${$hashRef->{$b}}[0]); + + # try "somedomain!user"? 
+ ($userNameA, $domainA) = (split(/!/, ${$hashRef->{$a}}[0]))[-1,-2] + unless($domainA); + ($userNameB, $domainB) = (split(/!/, ${$hashRef->{$b}}[0]))[-1,-2] + unless($domainB); + + # now re-order "mach.host.dom"/"mach.host.do.co" to + # "host.dom.mach"/"host.do.co.mach" + $domainA =~ s/^(.*)\.([^\.]+)\.([^\.]{3}|[^\.]{2,3}\.[^\.]{2})$/$2.$3.$1/o + if($domainA); + $domainB =~ s/^(.*)\.([^\.]+)\.([^\.]{3}|[^\.]{2,3}\.[^\.]{2})$/$2.$3.$1/o + if($domainB); + + # oddly enough, doing this here is marginally faster than doing + # an "if-else", above. go figure. + $domainA = "" unless($domainA); + $domainB = "" unless($domainB); + + if($domainA lt $domainB) { + return -1; + } elsif($domainA gt $domainB) { + return 1; + } else { + # disregard leading bang-path + $userNameA =~ s/^.*!//o; + $userNameB =~ s/^.*!//o; + if($userNameA lt $userNameB) { + return -1; + } elsif($userNameA gt $userNameB) { + return 1; + } else { + if($a lt $b) { + return -1; + } elsif($a gt $b) { + return 1; + } + } + } + return 0; +} + +# Subroutine used by host/domain reports to sort by count, then size. +# We "fix" un-initialized values here as well. Very ugly and un- +# structured to do this here - but it's either that or the callers +# must run through the hashes twice :-(. 
+sub by_count_then_size { + ${$hashRef->{$a}}[$msgCntI] = 0 unless(${$hashRef->{$a}}[$msgCntI]); + ${$hashRef->{$b}}[$msgCntI] = 0 unless(${$hashRef->{$b}}[$msgCntI]); + if(${$hashRef->{$a}}[$msgCntI] == ${$hashRef->{$b}}[$msgCntI]) { + ${$hashRef->{$a}}[$msgSizeI] = 0 unless(${$hashRef->{$a}}[$msgSizeI]); + ${$hashRef->{$b}}[$msgSizeI] = 0 unless(${$hashRef->{$b}}[$msgSizeI]); + return(${$hashRef->{$a}}[$msgSizeI] <=> + ${$hashRef->{$b}}[$msgSizeI]); + } else { + return(${$hashRef->{$a}}[$msgCntI] <=> + ${$hashRef->{$b}}[$msgCntI]); + } +} + +# return a date string to match in log +sub get_datestr { + my $dateOpt = $_[0]; + + my $time = time(); + + if($dateOpt eq "yesterday") { + # Back up to yesterday + $time -= ((localtime($time))[2] + 2) * 3600; + } elsif($dateOpt ne "today") { + die "$usageMsg\n"; + } + my ($t_mday, $t_mon) = (localtime($time))[3,4]; + + return sprintf("%s %2d", $monthNames[$t_mon], $t_mday); +} + +# if there's a real domain: uses that. Otherwise uses the IP addr. +# Lower-cases returned domain name. +# +# Optional bit of code elides the last octet of an IPv4 address. +# (In case one wants to assume an IPv4 addr. is a dialup or other +# dynamic IP address in a /24.) +# Does nothing interesting with IPv6 addresses. 
+# FIXME: I think the IPv6 address parsing may be weak +sub gimme_domain { + $_ = $_[0]; + my($domain, $ipAddr); + + # split domain/ipaddr into separates + # newer versions of Postfix have them "dom.ain[i.p.add.ress]" + # older versions of Postfix have them "dom.ain/i.p.add.ress" + unless((($domain, $ipAddr) = /^([^\[]+)\[((?:\d{1,3}\.){3}\d{1,3})\]/o) == 2 || + (($domain, $ipAddr) = /^([^\/]+)\/([0-9a-f.:]+)/oi) == 2) { + # more exhaustive method + ($domain, $ipAddr) = /^([^\[\(\/]+)[\[\(\/]([^\]\)]+)[\]\)]?:?\s*$/o; + } + + # "mach.host.dom"/"mach.host.do.co" to "host.dom"/"host.do.co" + if($domain eq 'unknown') { + $domain = $ipAddr; + # For identifying the host part on a Class C network (commonly + # seen with dial-ups) the following is handy. + # $domain =~ s/\.\d+$//o; + } else { + $domain =~ + s/^(.*)\.([^\.]+)\.([^\.]{3}|[^\.]{2,3}\.[^\.]{2})$/\L$2.$3/o; + } + + return $domain; +} + +# Return (value, units) for integer +sub adj_int_units { + my $value = $_[0]; + my $units = ' '; + $value = 0 unless($value); + if($value > $divByOneMegAt) { + $value /= $oneMeg; + $units = 'm' + } elsif($value > $divByOneKAt) { + $value /= $oneK; + $units = 'k' + } + return($value, $units); +} + +# Return (value, units) for time +sub adj_time_units { + my $value = $_[0]; + my $units = 's'; + $value = 0 unless($value); + if($value > 3600) { + $value /= 3600; + $units = 'h' + } elsif($value > 60) { + $value /= 60; + $units = 'm' + } + return($value, $units); +} + +# Trim a "said:" string, if necessary. Add ellipses to show it. +# FIXME: This sometimes elides The Wrong Bits, yielding +# summaries that are less useful than they could be. +sub said_string_trimmer { + my($trimmedString, $maxLen) = @_; + + while(length($trimmedString) > $maxLen) { + if($trimmedString =~ /^.* said: /o) { + $trimmedString =~ s/^.* said: //o; + } elsif($trimmedString =~ /^.*: */o) { + $trimmedString =~ s/^.*?: *//o; + } else { + $trimmedString = substr($trimmedString, 0, $maxLen - 3) . 
"..."; + last; + } + } + + return $trimmedString; +} + +# Trim a string, if necessary. Add elipses to show it. +sub string_trimmer { + my($trimmedString, $maxLen, $doNotTrim) = @_; + + $trimmedString = substr($trimmedString, 0, $maxLen - 3) . "..." + if(! $doNotTrim && (length($trimmedString) > $maxLen)); + return $trimmedString; +} + +# Get seconds, minutes and hours from seconds +sub get_smh { + my $sec = shift @_; + my $hr = int($sec / 3600); + $sec -= $hr * 3600; + my $min = int($sec / 60); + $sec -= $min * 60; + return($sec, $min, $hr); +} + +# Process smtpd rejects +sub proc_smtpd_reject { + my ($logLine, $rejects, $msgsRjctd, $rejPerHr, $msgsPerDay) = @_; + my ($rejTyp, $rejFrom, $rejRmdr, $rejReas); + my ($from, $to); + my $rejAddFrom = 0; + + ++$$msgsRjctd; + ++$$rejPerHr; + ++$$msgsPerDay; + + # Hate the sub-calling overhead if we're not doing reject details + # anyway, but this is the only place we can do this. + return if($opts{'rejectDetail'} == 0); + + # This could get real ugly! + + # First: get everything following the "reject: ", etc. token + # Was an IPv6 problem here + ($rejTyp, $rejFrom, $rejRmdr) = + ($logLine =~ /^.* \b(?:reject(?:_warning)?|hold|discard): (\S+) from (\S+?): (.*)$/o); + + # Next: get the reject "reason" + $rejReas = $rejRmdr; + unless(defined($opts{'verbMsgDetail'})) { + if($rejTyp eq "RCPT" || $rejTyp eq "DATA" || $rejTyp eq "CONNECT") { # special treatment :-( + # If there are "<>"s immediately following the reject code, that's + # an email address or HELO string. There can be *anything* in + # those--incl. stuff that'll screw up subsequent parsing. So just + # get rid of it right off. + $rejReas =~ s/^(\d{3} <).*?(>:)/$1$2/o; + $rejReas =~ s/^(?:.*?[:;] )(?:\[[^\]]+\] )?([^;,]+)[;,].*$/$1/o; + $rejReas =~ s/^((?:Sender|Recipient) address rejected: [^:]+):.*$/$1/o; + $rejReas =~ s/(Client host|Sender address) .+? blocked/blocked/o; + } elsif($rejTyp eq "MAIL") { # *more* special treatment :-( grrrr... 
+ $rejReas =~ s/^\d{3} (?:<.+>: )?([^;:]+)[;:]?.*$/$1/o; + } else { + $rejReas =~ s/^(?:.*[:;] )?([^,]+).*$/$1/o; + } + } + + # Snag recipient address + # Second expression is for unknown recipient--where there is no + # "to=" field, third for pathological case where recipient + # field is unterminated, fourth when all else fails. + (($to) = $rejRmdr =~ /to=<([^>]+)>/o) || + (($to) = $rejRmdr =~ /\d{3} <([^>]+)>: User unknown /o) || + (($to) = $rejRmdr =~ /to=<(.*?)(?:[, ]|$)/o) || + ($to = "<>"); + $to = lc($to) if($opts{'i'}); + + # Snag sender address + (($from) = $rejRmdr =~ /from=<([^>]+)>/o) || ($from = "<>"); + + if(defined($from)) { + $rejAddFrom = $opts{'rejAddFrom'}; + $from = verp_mung($from); + $from = lc($from) if($opts{'i'}); + } + + # stash in "triple-subscripted-array" + if($rejReas =~ m/^Sender address rejected:/o) { + # Sender address rejected: Domain not found + # Sender address rejected: need fully-qualified address + ++$rejects->{$rejTyp}{$rejReas}{$from}; + } elsif($rejReas =~ m/^(Recipient address rejected:|User unknown( |$))/o) { + # Recipient address rejected: Domain not found + # Recipient address rejected: need fully-qualified address + # User unknown (in local/relay recipient table) + #++$rejects->{$rejTyp}{$rejReas}{$to}; + my $rejData = $to; + if($rejAddFrom) { + $rejData .= " (" . ($from? $from : gimme_domain($rejFrom)) . ")"; + } + ++$rejects->{$rejTyp}{$rejReas}{$rejData}; + } elsif($rejReas =~ s/^.*?\d{3} (Improper use of SMTP command pipelining);.*$/$1/o) { + # Was an IPv6 problem here + my ($src) = $logLine =~ /^.+? 
from (\S+?):.*$/o; + ++$rejects->{$rejTyp}{$rejReas}{$src}; + } elsif($rejReas =~ s/^.*?\d{3} (Message size exceeds fixed limit);.*$/$1/o) { + my $rejData = gimme_domain($rejFrom); + $rejData .= " ($from)" if($rejAddFrom); + ++$rejects->{$rejTyp}{$rejReas}{$rejData}; + } elsif($rejReas =~ s/^.*?\d{3} (Server configuration (?:error|problem));.*$/(Local) $1/o) { + my $rejData = gimme_domain($rejFrom); + $rejData .= " ($from)" if($rejAddFrom); + ++$rejects->{$rejTyp}{$rejReas}{$rejData}; + } else { +# print STDERR "dbg: unknown reject reason $rejReas !\n\n"; + my $rejData = gimme_domain($rejFrom); + $rejData .= " ($from)" if($rejAddFrom); + ++$rejects->{$rejTyp}{$rejReas}{$rejData}; + } +} + +# Hack for VERP (?) - convert address from somthing like +# "list-return-36-someuser=someplace.com@lists.domain.com" +# to "list-return-ID-someuser=someplace.com@lists.domain.com" +# to prevent per-user listing "pollution." More aggressive +# munging converts to something like +# "list-return@lists.domain.com" (Instead of "return," there +# may be numeric list name/id, "warn", "error", etc.?) +sub verp_mung { + my $addr = $_[0]; + + if(defined($opts{'verpMung'})) { + $addr =~ s/((?:bounce[ds]?|no(?:list|reply|response)|return|sentto|\d+).*?)(?:[\+_\.\*-]\d+\b)+/$1-ID/oi; + if($opts{'verpMung'} > 1) { + $addr =~ s/[\*-](\d+[\*-])?[^=\*-]+[=\*][^\@]+\@/\@/o; + } + } + + return $addr; +} + +### +### Warning and Error Routines +### + +# Emit warning message to stderr +sub msg_warn { + warn "warning: $progName: $_[0]\n"; +} +