[logs] Analyzing tons of logs
Jose Nazario
jose at monkey.org
Wed Mar 28 10:23:03 PDT 2007
On Wed, 28 Mar 2007, Chetan Gupta wrote:
> How do we go about log analysis if we have tons (maybe in trillions) of
> logs from lets say tcpdump (raw logs) or some firewall (like netscreen
> or pix)? What would be the best way to normalize and analyze these logs
> in the shortest possible time? Import them into a database? Use a
> commercial application like arcsight? loglogic? simple text editor like
> editplus?
don't kick me for saying this, but you haven't posed any questions about
what you're trying to address with this log analysis. traffic over time?
failed logins? attacks? application usage? server usage and utilization?
what you want to do will dictate what tools you'll use, and hence what
normalization you'll do.
first things first, make sure all logs have the same timestamp references
(ie UTC). if not, normalize that.
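
a minimal sketch of that timestamp normalization, assuming each log line already carries an ISO 8601 timestamp with a UTC offset. the function name to_utc is made up for illustration; logs without an offset need their zone worked out by other means first.

```python
from datetime import datetime, timezone

def to_utc(stamp: str) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset to a UTC string."""
    local = datetime.fromisoformat(stamp)
    if local.tzinfo is None:
        # a bare local time cannot be converted without knowing its zone
        raise ValueError("timestamp has no offset; its zone must be known first")
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_utc("2007-03-28T10:23:03-07:00"))
```

rejecting offset-less timestamps instead of guessing keeps a wrong zone from silently skewing every later correlation.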
secondly, for network traces, a few breakdowns can be useful. this step
will explode the data storage requirements, but gives you a bunch more
indices to query on:
- split all traffic into sessions and save those out in individual files
- run an IDS over it and look for known attacks and alerts
- run AV over the session payloads to look for known bad stuff
- identify what applications are in use and tag the traces that way
- organize the traces by source, dest, service (proto/port), payloads, and
  alerts. you can use a database with foreign keys or even just a
  filesystem with links
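
the "filesystem with links" idea can be sketched roughly like this, assuming the sessions have already been split into one file each. the directory names (by-src, by-dst, by-service) and the function index_session are hypothetical, and the sketch assumes a filesystem that supports symlinks.

```python
import os

def index_session(session_file: str, index_root: str, src: str, dst: str,
                  proto: str, port: int) -> list:
    """Link one saved session file under by-src, by-dst and by-service."""
    name = os.path.basename(session_file)
    views = {
        "by-src": src,
        "by-dst": dst,
        "by-service": f"{proto}-{port}",
    }
    links = []
    for view, key in views.items():
        directory = os.path.join(index_root, view, key)
        os.makedirs(directory, exist_ok=True)
        link = os.path.join(directory, name)
        # skip links that already exist so re-running the indexer is harmless
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(session_file), link)
        links.append(link)
    return links
```

each session is stored once and reachable from three directions, so "everything to port 80" or "everything from this host" becomes a directory listing. alerts and application tags could be added as further views in the same way.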
for syslog data, a number of high performance engines exist. you can dump
your data into those systems, run some analysis on the content and then
you have a nice searchable database.
for text based logs (eg those PIX logs, Windows server logs, etc), find the
lowest common denominator format that preserves the info and use that, eg
syslog.
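
one way to picture that lowest common denominator, as a sketch only: every event, whatever its origin, is flattened to a single syslog-style text line with a UTC timestamp. the field layout and the function to_common_line are assumptions for illustration, not the exact syslog wire format, and the sample message is made up.

```python
from datetime import datetime, timezone

def to_common_line(when: datetime, host: str, source: str, message: str) -> str:
    """Render one event as a single syslog-style text line with a UTC timestamp."""
    if when.tzinfo is None:
        raise ValueError("timestamp must carry a timezone")
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # collapse embedded newlines and runs of spaces so one event stays on one line
    flat = " ".join(message.split())
    return f"{stamp} {host} {source}: {flat}"

event_time = datetime(2007, 3, 28, 17, 23, 3, tzinfo=timezone.utc)
print(to_common_line(event_time, "fw1", "firewall", "example message\nsecond line"))
```

parsing each device's native format into (time, host, source, message) is the device-specific part and is left out here.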
hope that helps.
________
jose nazario, ph.d. jose at monkey.org
http://monkey.org/~jose/ http://monkey.org/~jose/secnews.html
http://www.wormblog.com/