Feature request: extract rule group from msg when parsing #41

Closed
sevdog opened this issue Mar 20, 2017 · 4 comments

sevdog commented Mar 20, 2017

When parsing a merged rule file, for example with rule.parse_file, it is not possible to assign the group parameter, and the rule group is not parsed into the object.

However, the group is more or less standardized in the rule msg by this convention:

| msg second word | rule file |
|---|---|
| CNC | botcc.rules |
| CINS | ciarmy.rules |
| COMPROMISED | compromised.rules |
| DROP | drop.rules (also dshield.rules) |
| ACTIVEX | emerging-activex.rules |
| ATTACK_RESPONSE | emerging-attack_response.rules |
| CHAT | emerging-chat.rules |
| CURRENT_EVENTS | emerging-current_events.rules |
| DELETED | emerging-deleted.rules |
| DNS | emerging-dns.rules |
| DOS | emerging-dos.rules |
| EXPLOIT | emerging-exploit.rules |
| FTP | emerging-ftp.rules |
| GAMES | emerging-games.rules |
| ICMP_INFO | emerging-icmp_info.rules |
| ICMP | emerging-icmp.rules |
| IMAP | emerging-imap.rules |
| INAPPROPRIATE | emerging-inappropriate.rules |
| INFO | emerging-info.rules |
| MALWARE | emerging-malware.rules |
| MISC | emerging-misc.rules |
| MOBILE_MALWARE | emerging-mobile_malware.rules |
| NETBIOS | emerging-netbios.rules |
| P2P | emerging-p2p.rules |
| POLICY | emerging-policy.rules |
| POP3 | emerging-pop3.rules |
| RPC | emerging-rpc.rules |
| SCADA | emerging-scada.rules |
| SCAN | emerging-scan.rules |
| SHELLCODE | emerging-shellcode.rules |
| SMTP | emerging-smtp.rules |
| SNMP | emerging-snmp.rules |
| SQL | emerging-sql.rules |
| TELNET | emerging-telnet.rules |
| TFTP | emerging-tftp.rules |
| TROJAN | emerging-trojan.rules |
| USER_AGENTS | emerging-user_agents.rules |
| VOIP | emerging-voip.rules |
| WEB_CLIENT | emerging-web_client.rules |
| WEB_SERVER | emerging-web_server.rules |
| WEB_SPECIFIC_APPS | emerging-web_specific_apps.rules |
| WORM | emerging-worm.rules |
| TOR | tor.rules |

So, if rules are well written and no group is provided, the parser should extract it from the second word of the msg.
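
A minimal sketch of that fallback, assuming a plain dictionary lookup on the second word of the msg. The names `MSG_WORD_TO_RULE_FILE` and `group_from_msg` are hypothetical and not part of idstools, and the dictionary holds only an excerpt of the convention listed above.

```python
# Excerpt of the msg-word -> rule-file convention; extend with the other rows.
MSG_WORD_TO_RULE_FILE = {
    "CNC": "botcc.rules",
    "DROP": "drop.rules",
    "DNS": "emerging-dns.rules",
    "TROJAN": "emerging-trojan.rules",
    "WEB_SERVER": "emerging-web_server.rules",
    "TOR": "tor.rules",
}


def group_from_msg(msg, explicit_group=None):
    """Hypothetical helper: prefer the explicit group, else guess from msg."""
    if explicit_group is not None:
        return explicit_group
    words = msg.split()
    if len(words) < 2:
        return None  # no second word to look at
    # Unknown second words yield None rather than a wrong guess.
    return MSG_WORD_TO_RULE_FILE.get(words[1])


print(group_from_msg("ET TROJAN ABUSE.CH SSL Blacklist"))
```

An explicitly passed group always wins here, so existing callers that already supply one would see no change in behaviour.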

If you find this interesting I can make a PR for this feature.

@jasonish
Owner

The group field is actually meant to hold the filename the rule was extracted from (see https://github.com/jasonish/py-idstools/blob/master/idstools/scripts/rulecat.py#L749). Rulecat then has a keyword, "group:", to enable/disable rules based on their group.

I do see use for what you are suggesting, but perhaps under a different name than group? Or perhaps the filename was misnamed, and should be something else.

From this blog post http://blog.snort.org/2012/03/rule-category-reorganization.html, it looks like Talos refers to the filename and the leading all caps in the message as a category. Perhaps the rule parser should parse out this all caps data as "category" and not group - keeping group as the filename.

sevdog commented Mar 21, 2017

Referring to this attribute as category is fine.

> Perhaps the rule parser should parse out this all caps data as "category" and not group - keeping group as the filename.

I think the category is just the second uppercase word of the msg; also considering the first or the third word would cause problems. IMO the category should enrich the information given by the classtype attribute by adding context.

These are some examples of categorization:

  • ET TROJAN ABUSE.CH SSL Blacklist Malicious SSL certificate detected (Gozi MITM) should be categorized just as TROJAN, because this is the main information about the rule
  • ET TROJAN WORM_VOBFUS Requesting exe will also be categorized as TROJAN, but should not be categorized as WORM_VOBFUS, because it does not make sense to have a category for just one piece of malware
  • GPL TROJAN BackOrifice access will also be categorized as TROJAN, not GPL TROJAN
  • GPL SMTP OUTBOUND bad file attachment will be categorized as SMTP; OUTBOUND is not categorization information
  • GPL SMTP SMTP relaying denied will be categorized as SMTP; this example also shows the pattern used

Rules coming from ET follow this convention on the msg field:

<SET> <CATEGORY> <description>

ET TROJAN ABUSE.CH SSL Blacklist Malicious SSL certificate detected (Gozi MITM) should be parsed as:
set -> ET
category -> TROJAN
description -> ABUSE.CH SSL Blacklist Malicious SSL certificate detected (Gozi MITM)
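
As a rough sketch, the `<SET> <CATEGORY> <description>` convention can be split with two whitespace splits. The `EtMsg` type and `parse_et_msg` function are hypothetical names used only for illustration, and the uppercase check on the category is an assumption about what counts as "well written".

```python
from typing import NamedTuple, Optional


class EtMsg(NamedTuple):
    rule_set: str
    category: str
    description: str


def parse_et_msg(msg: str) -> Optional[EtMsg]:
    """Split an ET-style msg; return None if it does not fit the pattern."""
    parts = msg.split(None, 2)  # at most three pieces: set, category, rest
    if len(parts) < 3:
        return None
    rule_set, category, description = parts
    if not category.isupper():  # assumption: categories are written in caps
        return None
    return EtMsg(rule_set, category, description)


print(parse_et_msg(
    "ET TROJAN ABUSE.CH SSL Blacklist Malicious SSL certificate detected (Gozi MITM)"
))
```

Only the second word is taken as the category, so `ABUSE.CH` or `WORM_VOBFUS` in third position stay part of the description.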

However, looking into Talos' rules for SNORT, I noticed they use a different convention for msg:

<CATEGORY>-<SUBCATEGORY> <description>

MALWARE-BACKDOOR NetBus Pro 2.0 connection established should be parsed as:
category -> MALWARE
subcategory -> BACKDOOR
description -> NetBus Pro 2.0 connection established
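
A comparable sketch for the `<CATEGORY>-<SUBCATEGORY> <description>` form, this time using a regular expression. The pattern and the `parse_talos_msg` name are assumptions made for illustration; in particular, the character class for category names is a guess, not taken from any Talos documentation.

```python
import re

# Assumed shape: two uppercase tokens joined by a dash, then free text.
TALOS_MSG = re.compile(
    r"^(?P<category>[A-Z0-9_]+)-(?P<subcategory>[A-Z0-9_]+)\s+(?P<description>.+)$"
)


def parse_talos_msg(msg):
    """Return a dict of the three parts, or None if msg does not match."""
    match = TALOS_MSG.match(msg)
    return match.groupdict() if match else None


print(parse_talos_msg("MALWARE-BACKDOOR NetBus Pro 2.0 connection established"))
```

An ET-style msg such as `ET TROJAN ...` does not match this pattern, so the function returns None instead of a wrong split.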

With this in mind, I think a better solution would be to add some code to the parser that recognizes whether the msg field follows one of the conventions above and then parses the field accordingly (so issue #42 would also be covered by this feature).

It would be great if we could also make it work with other naming conventions, but this will require more thought.

sevdog commented Mar 22, 2017

Going deeper into the SNORT rules, I found that they use a more specific pattern to include metadata in the msg field:

  • the first uppercase word is the category
  • the first uppercase word may be a dash-separated word; in this case the first part is the main category while the second seems to be a subcategory
  • if the category is DELETED, the second word is the previous category of the rule
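
The three observations above could be sketched as follows. `parse_snort_msg` and the `deleted` flag are hypothetical, and the DELETED msg used in the last line is a made-up example, not a real rule.

```python
def parse_snort_msg(msg):
    """Hypothetical parser for the three observations listed above."""
    words = msg.split()
    deleted = bool(words) and words[0] == "DELETED"
    if deleted:
        words = words[1:]  # the next word is the rule's previous category
    if not words or not words[0].isupper():
        return None  # msg does not start with an uppercase category word
    # A dash splits the word into main category and subcategory.
    category, _, subcategory = words[0].partition("-")
    return {
        "category": category,
        "subcategory": subcategory or None,
        "deleted": deleted,
        "description": " ".join(words[1:]),
    }


print(parse_snort_msg("MALWARE-BACKDOOR NetBus Pro 2.0 connection established"))
print(parse_snort_msg("DELETED MALWARE-BACKDOOR made-up example description"))
```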

With this in mind, I think that a parser that extracts metadata from the msg field cannot work unless we tell it which convention is in use. So I think that this feature, if approved, should be enabled by giving an explicit parameter and should not be enabled by default (to avoid unnecessary and error-prone parsing).

Something like:

def parse(buf, group=None, msg_metadata=None):
    # ...
        elif name == "msg":
            # ...
            rule[name] = val
            # Only parse the msg when the caller names a convention.
            if msg_metadata:
                rule.parse_msg_metadata(msg_metadata)
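
To make the opt-in idea concrete, here is a standalone sketch that dispatches on an explicitly named convention and does nothing by default. The convention keys `"et"` and `"talos"`, the helper functions, and this free-function form of `parse_msg_metadata` are all assumptions for illustration; none of them exist in idstools.

```python
def _parse_et(msg):
    # <SET> <CATEGORY> <description>
    parts = msg.split(None, 2)
    if len(parts) < 3:
        return {}
    return {"set": parts[0], "category": parts[1], "description": parts[2]}


def _parse_talos(msg):
    # <CATEGORY>-<SUBCATEGORY> <description>
    head, _, description = msg.partition(" ")
    category, _, subcategory = head.partition("-")
    return {
        "category": category,
        "subcategory": subcategory or None,
        "description": description,
    }


# Hypothetical registry of known msg conventions.
MSG_CONVENTIONS = {"et": _parse_et, "talos": _parse_talos}


def parse_msg_metadata(msg, convention=None):
    """Return extracted metadata, or {} when no convention is requested."""
    if convention is None:
        return {}  # disabled by default, as proposed
    try:
        parser = MSG_CONVENTIONS[convention]
    except KeyError:
        raise ValueError("unknown msg convention: %r" % (convention,))
    return parser(msg)


print(parse_msg_metadata("ET TROJAN BackOrifice access", "et"))
```

Raising on an unknown convention name, rather than silently skipping, is a design choice in this sketch: a misspelled parameter then fails loudly instead of producing rules without metadata.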

sevdog commented Jul 20, 2018

Ok, this feature is out of scope.

If needed, I will implement it outside this library.

sevdog closed this as completed Jul 20, 2018