[logs] Analyzing tons of logs
Anton Chuvakin
anton at chuvakin.org
Wed Mar 28 20:20:22 PDT 2007
Chetan and all,
> How do we go about log analysis if we have tons (maybe in trillions) of logs
> from lets say tcpdump (raw logs) or some firewall (like netscreen or pix)?
> What would be the best way to normalize and analyze these logs in the
> shortest possible time?
Let's see here: assuming 1 trillion records of 200 bytes each (typical
for PIX, way too small for a packet), we are looking at roughly 200TB
(about 180TiB) of data. To analyze... not just to store.
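A quick sketch of that arithmetic, in case anyone wants to plug in their own numbers. The record count and record size come from the estimate above; the 100 MB/s sustained read rate is purely an assumption for illustration, not a measured figure, and the function names are made up for this example.

```python
# Back-of-the-envelope sizing for the estimate above.
RECORDS = 10**12          # one trillion records (from the post)
BYTES_PER_RECORD = 200    # typical PIX record size (from the post)


def total_bytes(records: int, bytes_per_record: int) -> int:
    """Raw size of the data set in bytes."""
    return records * bytes_per_record


def scan_days(size_bytes: int, bytes_per_second: float) -> float:
    """Days needed for ONE sequential pass at a sustained read rate."""
    return size_bytes / bytes_per_second / 86_400


size = total_bytes(RECORDS, BYTES_PER_RECORD)
print(f"{size / 10**12:.0f} TB (decimal)")   # 200 TB
print(f"{size / 2**40:.0f} TiB (binary)")    # 182 TiB

# ASSUMPTION: 100 MB/s sustained sequential read on a single box.
print(f"{scan_days(size, 100e6):.1f} days per pass at 100 MB/s")  # 23.1 days
```

Under that assumed rate, simply reading the data once takes about three weeks on a single machine, before any parsing or normalization happens.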
So, I have a sneaking suspicion that ALL the mentioned solutions will
fail miserably, albeit without embarrassing their creators (because
that's a looooooooooot of data!). I have to admit that Jose is
probably right: you might need to write some purpose-specific code
here. Look up some old posts by Marcus Ranum (start at
http://www.andrews.hu/guru/msg583.html and look around it) for useful
tips on super-fast but purpose-specific log processing.
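To illustrate what "purpose-specific" could mean in practice, here is a minimal sketch: one streaming pass that answers exactly one question (which sources get denied most often) and does no general-purpose normalization. This is not taken from the posts referenced above. The log format, the field order, and the function name count_denied_sources are all assumptions invented for the example, and real PIX or NetScreen lines would need their own field extraction.

```python
from collections import Counter
from typing import Iterable


def count_denied_sources(lines: Iterable[str]) -> Counter:
    """Single pass over the input: count denied events per source address.

    ASSUMPTION: a hypothetical, already-simplified format, one event per line:
        <epoch_timestamp> <action> <src_ip> <dst_ip> <dst_port>
    """
    counts: Counter = Counter()
    for line in lines:
        # Split only as far as the fields we need; the rest stays unparsed.
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # skip malformed or truncated lines
        if fields[1] == "deny":
            counts[fields[2]] += 1
    return counts


# Hypothetical sample records in the assumed format.
sample = [
    "1175138422 deny 10.0.0.1 192.168.1.1 22",
    "1175138423 permit 10.0.0.2 192.168.1.1 80",
    "1175138424 deny 10.0.0.1 192.168.1.1 23",
    "garbage",
]
for src, n in count_denied_sources(sample).most_common(10):
    print(n, src)
```

Since the function accepts any iterable of lines, the same code can be fed sys.stdin or an open file, so nothing has to fit in memory except the per-source counters. At the volumes discussed here one would likely split the input across many machines and merge the counters afterwards.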
Best,
--
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
http://www.chuvakin.org
http://chuvakin.blogspot.com
http://www.info-secure.org
More information about the LogAnalysis
mailing list