* Storage space wasted because of empty fields in database.
Store raw logs in signed flat files:
* Logs are available in original format for most flexibility in subsequent processing.
* Integrity of each log and integrity of log sequence can be proven.
* Efficient storage with possibility to compress flat files.
The right approach: Whatever you use logs for, you need to ensure that you are working from legitimate logs, that the logs you have stored have not changed since they were received, and that no log has been added or deleted. So you need to store them with some form of proof of integrity.
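One way to picture such a proof of integrity is a keyed hash chain, where each log's tag covers both the log itself and the tag of the log before it. The sketch below is only an illustration under stated assumptions: the function names `chain_logs` and `verify_chain`, the all-zero seed, and the choice of HMAC-SHA256 are hypothetical and are not taken from any particular log management product.

```python
import hashlib
import hmac

SEED = b"\x00" * 32  # assumed starting value for the chain


def chain_logs(lines, key, seed=SEED):
    """Return (line, tag_hex) pairs; each tag covers the line and the previous tag."""
    prev = seed
    records = []
    for line in lines:
        tag = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).digest()
        records.append((line, tag.hex()))
        prev = tag
    return records


def verify_chain(records, key, seed=SEED):
    """Return the index of the first record that fails verification, or None."""
    prev = seed
    for index, (line, tag_hex) in enumerate(records):
        expected = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected.hex(), tag_hex):
            return index
        prev = expected
    return None
```

Because every tag depends on the previous one, changing a log, inserting one, or deleting one from the middle breaks verification from that point on. Removing logs from the very end is not caught by the chain alone, so in this sketch the most recent tag would also have to be recorded somewhere the attacker cannot reach.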
Before you choose between storing raw logs and normalized logs, make sure you understand what you want to do with your logs. If you want maximum flexibility, work from raw logs and process them later. If you want logs that are immediately usable for reporting or correlation, normalized storage is fine. But once a log is normalized, it can be difficult or even impossible to reconstruct the original message and prove its integrity.
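The trade-off can be softened by normalizing on demand while always keeping the raw message next to the extracted fields. The following is a minimal sketch of that idea; the regular expression targets a hypothetical syslog-style line, and the field names (`timestamp`, `host`, `program`, `pid`, `message`) are assumptions chosen for illustration. Real log formats vary widely.

```python
import re

# Assumed pattern for lines such as:
#   Jan 12 03:14:07 web01 sshd[4321]: Failed password for root ...
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<program>[^\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def normalize(raw_line):
    """Extract fields from a raw line, always keeping the original under 'raw'."""
    match = PATTERN.match(raw_line)
    if match is None:
        # Unknown format: keep the raw message so nothing is lost.
        return {"raw": raw_line}
    fields = match.groupdict()
    fields["raw"] = raw_line
    return fields
```

Keeping the `raw` field means the original message can still be checked against its stored proof of integrity, even after the extracted fields have been used for reporting or correlation.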
Performing log forensics
Now you are in an ideal situation to perform forensics: you are working from a clear stream of data, with fresh, unaltered and pure information at your fingertips, and in case of misbehavior you have all the information you need to trace it back to the culprit.
If you have ever been under attack you can certainly understand the pressure to react fast. And the first step is to understand what happened, who did it, how, what systems were affected and what needs to be done to stop the damage and prevent this from happening again.
Logs represent a gold mine for this task if you know how to leverage them, and if you have the proper tools to do so. You know that the proof of the misbehavior is somewhere in there, somewhere mixed with billions of other logs, buried in terabytes or even petabytes of data.
The process of doing forensics on a log management solution is similar to using an Internet search engine. Sometimes you know exactly what you're looking for; other times, it's a trial-and-error process. Start with keywords, then refine or modify them until you zoom in on the log or logs that explain what happened.
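The keyword-and-refine loop described above can be sketched in a few lines. This is only an illustration: the `search` function and the sample log lines are hypothetical, and a real log management solution would query an index rather than scan every line.

```python
def search(logs, include=(), exclude=()):
    """Return lines containing every 'include' keyword and no 'exclude' keyword.

    Matching is case-insensitive substring matching (an assumption made
    for simplicity).
    """
    include = [keyword.lower() for keyword in include]
    exclude = [keyword.lower() for keyword in exclude]
    hits = []
    for line in logs:
        lowered = line.lower()
        if all(k in lowered for k in include) and not any(k in lowered for k in exclude):
            hits.append(line)
    return hits


# Hypothetical sample data, for illustration only.
SAMPLE_LOGS = [
    "Jan 12 03:14:07 web01 sshd[4321]: Failed password for root from 203.0.113.9",
    "Jan 12 03:14:09 web01 sshd[4321]: Accepted password for alice from 198.51.100.7",
    "Jan 12 03:15:00 db01 sshd[99]: Failed password for root from 203.0.113.9",
    "Jan 12 03:16:00 web01 cron[12]: job started",
]

# First pass: a broad keyword.
broad = search(SAMPLE_LOGS, include=["failed password"])
# Second pass: refine by adding a host name to narrow the results.
narrow = search(SAMPLE_LOGS, include=["failed password", "db01"])
```

Each pass narrows the result set, which mirrors how you would refine a search engine query until only the relevant logs remain.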
Once you have zoomed in on the specific log or logs, you can follow the trail of the crime and understand how the breach spread from system to system, how and why the attack was successful, and which systems were affected. Each log becomes a piece of the puzzle as you answer questions such as: was the attack successful because security patches were missing, because passwords were sent in clear text while a system was in promiscuous mode, or because the firewall was misconfigured?