[logs] regexless parsing, again?
E G
bronc94583 at yahoo.com
Thu Sep 13 17:12:03 PDT 2007
A few years ago, when I worked at "another company",
I did a lot of research into this area.
We looked at an approach that grouped logs together
based upon what we already knew about that type of log
source and how they are similar, rather than
"guessing" what each line was as it came in.
This came out of doing quite a bit of statistical
analysis on raw log data. I noted quite a bit of
correlation from source to source (which in itself
isn't news), and that correlation would allow us to
classify unknown data in some semi-intelligent way
and dump known entities into known "buckets".
Working with some people who were much smarter than I,
I was able to create a reverse Patricia-trie-like
structure. Think of it like typing on your
BlackBerry: it attempts to predict
the next letter and tries to complete the word you're
typing for you. The same logic can basically work in
reverse, where you use the trie structure to disassemble
a word, or a string in our case. Once you reach an end
point on the trie, it leaves you with what the data
is, according to however you have decided to classify it.
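To make that a bit more concrete, here is a rough sketch of the
general idea in Python. It is not the code I mentioned, just an
illustration: it uses a plain token trie (no Patricia-style path
compression), and the class names, the "*" wildcard convention, and
the sample sshd lines are all made up for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # token -> TrieNode
        self.label = None   # bucket name if a known pattern ends here


class LogTrie:
    WILDCARD = "*"  # hypothetical marker for a variable field (user, IP, ...)

    def __init__(self):
        self.root = TrieNode()

    def add(self, pattern, label):
        """Register a known, whitespace-tokenized pattern under a bucket."""
        node = self.root
        for token in pattern.split():
            node = node.children.setdefault(token, TrieNode())
        node.label = label

    def classify(self, line):
        """Walk the trie token by token and return the deepest bucket reached.

        A literal token match is preferred over the wildcard. There is no
        backtracking, so this is a simplification of a real matcher.
        """
        node = self.root
        label = None
        for token in line.split():
            nxt = node.children.get(token)
            if nxt is None:
                nxt = node.children.get(self.WILDCARD)
            if nxt is None:
                break  # fell off the trie; keep whatever bucket we had
            node = nxt
            if node.label is not None:
                label = node.label
        return label if label is not None else "unknown"


trie = LogTrie()
trie.add("sshd Failed password for * from *", "ssh_auth_failure")
trie.add("sshd Accepted password for * from *", "ssh_auth_success")

print(trie.classify("sshd Failed password for root from 10.0.0.5 port 22 ssh2"))
# prints "ssh_auth_failure"
print(trie.classify("kernel: eth0 link up"))
# prints "unknown"
```

Lines that fall off the trie before reaching a labeled node land in
the "unknown" bucket, which is where the statistical grouping by
source would have to take over.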
I hope that's understandable; I didn't want to write
out a book.
Anyhow, my ideas didn't end up going anywhere. They
chose to stay with the RegEx "guessing" method, as
is the standard. I had a lot of the code I developed
after I left up on SourceForge for a while, but real
life took me away from it. I might be able to dig it
up if anyone is interested.
- Erik
--- "Marcus J. Ranum" <mjr at ranum.com> wrote:
> Anton Chuvakin wrote:
> >Anybody care to restart the discussion and see what the collective
> >wisdom of loganalysis can produce?
>
> I am coding on something regarding regexless parsing as we
> speak. ETA is unknown but certainly before Xmas. It will be
> open source but not GPL.
>
> mjr.