We applied scans.pl to 5550 "Potential Scan" email alerts gathered by Antonio Cesaracciu between Saturday 7 October 2006 and Wednesday 15 August 2007 (312 days). The combined file of alerts contained 1,356,214 lines, including mail headers as well as the scan data. The summary results were as follows:
~~~~~~~~~~~~~~~~scans.pl overall summary~~~~~~~~~~~~~~~~~~~~~~~
BROADCAST>30 occurs 2 times in 5550 mails
BitTorrent>=20 occurs 324 times in 5550 mails
Conditional Run Probability>50% & >100 unique ports occurs 125 times in 5550 mails
GRE occurs 29 times in 5550 mails
Gnutella>=20 occurs 302 times in 5550 mails
Grokster>=20 occurs 7 times in 5550 mails
KNOWN occurs 1029 times in 5550 mails
KaZaA>=20 occurs 7 times in 5550 mails
Likely AFS occurs 23 times in 5550 mails
PING>=60 occurs 910 times in 5550 mails
RPC>=20 occurs 11 times in 5550 mails
SKYPE>=60 occurs 3395 times in 5550 mails
SMTP>=50 occurs 8 times in 5550 mails
WinMX>=20 occurs 1 times in 5550 mails
eDonkey>=20 occurs 176 times in 5550 mails
No signature found in 446 out of 5550 mails
Frequency of emails for Mon=898/5550
Frequency of emails for Tue=929/5550
Frequency of emails for Wed=969/5550
Frequency of emails for Thu=907/5550
Frequency of emails for Fri=825/5550
Frequency of emails for Sat=499/5550
Frequency of emails for Sun=523/5550
Src hosts in visitor subnet=2830, VPN=583, with roaming DHCP=265, with CANDO data=1826
Mon Aug 27 17:27:50 2007 scans.pl finished took 620 secs., read 1356214 lines with 5550 emails from /scratch/temp/allscanning
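To turn the counts above into shares of the 5550 emails, the summary lines can be parsed mechanically. The Perl sketch below is only an illustration: the function name parse_summary_line and the two regular expressions are assumptions made here, not part of scans.pl, and they handle only the "occurs N times" and "No signature found" line layouts.

```perl
use strict;
use warnings;

# A few lines copied from the summary output above.
my @summary = (
    "SKYPE>=60 occurs 3395 times in 5550 mails",
    "KNOWN occurs 1029 times in 5550 mails",
    "No signature found in 446 out of 5550 mails",
);

# Hypothetical helper: return (label, count, total) for a summary line,
# or an empty list if the line has neither of the two layouts handled here.
sub parse_summary_line {
    my ($line) = @_;
    if ($line =~ /^(.+?) occurs (\d+) times in (\d+) mails$/) {
        return ($1, $2, $3);
    }
    if ($line =~ /^(No signature) found in (\d+) out of (\d+) mails$/) {
        return ($1, $2, $3);
    }
    return ();
}

# Print each label with its share of all emails as a percentage.
for my $line (@summary) {
    my ($label, $count, $total) = parse_summary_line($line) or next;
    printf "%-14s %5.1f%%\n", $label, 100 * $count / $total;
}
```

For the three lines shown, the shares come to about 61.2% (SKYPE), 18.5% (KNOWN) and 8.0% (no signature).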
Test Cases
Test cases for scan.pl are in ~cottrell/scan/, and test cases for scans.pl are in ~cottrell/scans/. There is also an extensive test set on net-desk2.slac.stanford.edu:/scratch/temp/allscanning.
Thresholds
The number following the >= (or >, as in BROADCAST>30) in a signature name is the threshold: the number of occurrences in the 200 flows that is required to trigger that identification.
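As a rough illustration of how such a threshold test might work, the Perl sketch below counts per-flow application labels and reports the signatures whose counts reach their thresholds. The hash %threshold, the function signatures_over_threshold and the flow labels are hypothetical; only the threshold values are taken from the signature names in the summary output, and scans.pl itself may be organized quite differently.

```perl
use strict;
use warnings;

# Hypothetical threshold table; the values are copied from signature names
# such as "BitTorrent>=20" in the summary output.
my %threshold = (BitTorrent => 20, Gnutella => 20, eDonkey => 20, SMTP => 50);

# Given a reference to a list of per-flow application labels, return the
# signature names (sorted by application) whose count meets the threshold.
sub signatures_over_threshold {
    my ($labels) = @_;
    my %count;
    $count{$_}++ for @$labels;
    my @hits;
    for my $app (sort keys %threshold) {
        my $n = $count{$app} // 0;
        push @hits, "$app>=$threshold{$app}" if $n >= $threshold{$app};
    }
    return @hits;
}

# Example: 200 flows, of which 25 look like BitTorrent and 10 like SMTP.
my @flows = (("BitTorrent") x 25, ("SMTP") x 10, ("unk") x 165);
print join(", ", signatures_over_threshold(\@flows)), "\n";
```

In this example only BitTorrent reaches its threshold; the 10 SMTP flows fall short of the 50 required.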
Port Identification
Ports are identified as follows:
######################id_port####################################
# Given the protocol and port as a string prot_port, provides
# the information on the port's purpose
#See http://www.iana.org/assignments/port-numbers
#and http://www.seifried.org/security/ports/
#Example id_port("TCP_80");
sub id_port {
  #Identify port applications using:
  # http://www.seifried.org/security/ports/
  #among other sources
  my ($prot, $port)=split(/_/,$_[0]);
  if(!defined($port) || $port eq "") {
    print "Can't find port in prot_port=$_[0]\n";
    return "-1";
  }
  #See http://compnetworking.about.com/od/p2ppeertopeer/p/shareaza.htm
  if(($port>=6345) && ($port<=6348)) {return "P2P:Gnutella:$_[0]";}
  elsif($_[0] eq "ICMP_0") {return "EchoReply";}
  elsif($port==20) {return "FTPDATA";}
  elsif($port==22) {return "SSH";}
  elsif($port==23) {return "TELNET";}
  elsif($port==25) {return "SMTP";}
  elsif($port==37) {return "Time";}
  elsif($port==53) {return "DNS";}
  elsif($port==66) {return "ORACLE";}
  elsif($port==69) {return "TFTP";}
  elsif(($port==67)||($port==68)) {return "BootP";}
  elsif($port==80) {return "HTTP";}
  elsif($_[0] eq "TCP_81") {return "HOSTS2";}
  elsif($port==123) {return "NTP";}
  #See http://www.seifried.org/security/ports/0/137.html
  elsif($_[0] eq "TCP_135") {return "MS:RPC";}
  #See http://www.seifried.org/security/ports/0/135.html
  elsif($_[0] eq "TCP_137") {return "MS:NETBIOS_NS";}
  elsif($port==110) {return "POP3";}
  elsif($port==143) {return "IMAP";}
  elsif(($_[0] eq "UDP_161") || ($_[0] eq "UDP_162")) {return "SNMP";}
  elsif($port==443) {return "HTTPS";}
  #See http://www.seifried.org/security/ports/0/445.html
  elsif($port==445) {return "MS:DS";}
  #See http://www.auditmypc.com/port/tcp-port-631.asp
  elsif($port==631) {return "IPPRINT";}
  #"Port" = 771 = 3*256+3, so it's ICMP type 3/subtype 3 == port-unreachable (JXH)
  elsif($_[0] eq "ICMP_771") {return "ICMP_PortUnreach";}
  #See http://homepage.ntlworld.com/robin.d.h.walker/cmtips/p2p.html
  elsif($port==1214) {return "P2P:KaZaA,Grokster:$_[0]";}
  elsif($_[0] eq "ICMP_2048") {return "EchoRequest";}
  #http://isc.sans.org/port.html?port=2222
  elsif($_[0] eq "UDP_2222") {return "Office_X";}
  #See http://isc.sans.org/port.html?port=3724
  elsif($port==3724) {return "P2P:WoW:$_[0]";}
  #See http://www.cisco.com/en/US/products/hw/vpndevc/ps2030/products_tech_note09186a00801e419a.shtml
  #See http://www.pam2004.org/papers/159.pdf
  #See http://www.amule.org/wiki/index.php/Firewall
  elsif(($port==4662) || ($port==4665) || ($port==4672) || ($port==4661)) {return "P2P:eDonkey_Mule:$_[0]";}
  elsif($_[0] eq "TCP_5060") {return "SIP";}
  #TCP 5168 for Windows running Trend Micro Inc's ServerProtect antivirus Software
  elsif($port==5168) {return "TrendMicro_ServerProtect";}
  elsif(($_[0] eq "TCP_5222") || ($_[0] eq "TCP_5269")) {return "Jabber";}
  elsif(($port>=6000) && ($port<=6063)) {return "X_Windows";}
  #See http://compnetworking.about.com/od/p2ppeertopeer/p/winmxclient.htm
  elsif(($port==6699) || ($port==6257)) {return "P2P:WinMX:$_[0]";}
  #See http://compnetworking.about.com/od/p2ppeertopeer/p/shareaza.htm for BitTorrent
  #and http://www.cert.org/netsa/publications/ESORICS2006-mcollins-finding-peer-to-peer-07132006.pdf for profile
  elsif($port==6881) {
    return "P2P:BitTorrent:$_[0]";
  }
  elsif(($port>=7000) && ($port<=7009)) {return "AFS";}
  #See http://www.cisco.com/en/US/docs/security/nac/appliance/configuration_guide/412/cam/m_agntd.html
  elsif($port==8906) {return "CiscoNAC:Swiss";}
  #See http://forum.utorrent.com/viewtopic.php?pid=274679
  elsif($port==60065) {return "P2P:uTorrent";}
  elsif($port>0) {return "unk";}
  else {return "";}
}
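The ICMP entries in id_port rely on the "port" field of an ICMP flow carrying the ICMP type and code, as the comment on ICMP_771 indicates (771 = 3*256+3). The Perl sketch below decodes the three ICMP values that id_port recognizes under that assumption; the function name icmp_type_code is hypothetical and does not appear in the scripts.

```perl
use strict;
use warnings;

# Split the pseudo-port of an ICMP flow into (type, code), assuming the
# encoding type*256+code suggested by the comment on ICMP_771.
sub icmp_type_code {
    my ($pseudo_port) = @_;
    return (int($pseudo_port / 256), $pseudo_port % 256);
}

# Decode the three ICMP pseudo-ports that id_port recognizes.
for my $p (0, 771, 2048) {
    my ($type, $code) = icmp_type_code($p);
    print "ICMP_$p => type $type, code $code\n";
}
```

Under this encoding ICMP_0 is type 0 (echo reply), ICMP_771 is type 3 code 3 (port unreachable) and ICMP_2048 is type 8 code 0 (echo request), which agrees with the labels returned by id_port.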
Summary
The scripts appear reasonably effective for triage. To a non-expert but interested eye, the signature identification holds up when the automated results are compared with manual scanning of the emails and with email discussions with the SLAC security experts. More work is needed on the causes of unknown signatures and on false positives; this will probably need help from the security experts, as will ranking the signatures into levels of severity. The scripts categorize over 90% of the potential scans, with < 10% having no recognizable signature. We have not made detailed experiments with changing or optimizing the various thresholds. Over 60% of the scans appear to be caused by Skype.
On average over this period there were about 18 Potential Scan alerts per day, with weekdays having nearly twice as many as weekend days. Suppose it takes an expert about a minute to bring up a Potential Scan email, review it quickly to identify a signature, and then file or delete it, assuming most of the emails are not interesting. The potential saving is then about 20 minutes/day of expert time, most of which is spent on alerts that turn out to be uninteresting. Since multiple experts receive the email alerts, effort may also be duplicated. In practice the time used is probably less than 20 minutes/day, since the emails are so frequent that they are only viewed when the expert(s) are not busy. However, this in itself raises the question of whether the emails should be reviewed by hand at all, or whether some important Potential Scans are being missed.
About 18% of the alerts come from well-known hosts, which should perhaps be excluded from generating alerts or identified as such in the alerts.
Finally, the scripts are fairly lightweight, taking roughly 0.1 seconds to analyze an email (620 seconds for 5550 emails in the run above). They run without AFS access and do not require other scripts. They could therefore be used as a filter to add extra information to each alert email, or to eliminate emails that are probably uninteresting.
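The rates quoted in this summary can be rechecked from the figures in the summary output. The Perl sketch below is a worked calculation only; the variable names are chosen here for illustration, and the dates are those given for the first and last alert.

```perl
use strict;
use warnings;
use List::Util qw(sum);
use Time::Local qw(timegm);

my $emails  = 5550;   # emails analyzed, from the summary output
my $runtime = 620;    # seconds taken by scans.pl, from the summary output

# Days between the first and last alert (7 Oct 2006 and 15 Aug 2007).
# timegm takes (sec, min, hour, mday, month counted from 0, year).
my $days = (timegm(0, 0, 0, 15, 7, 2007) - timegm(0, 0, 0, 7, 9, 2006)) / 86400;

# Per-weekday email counts from the summary output.
my %by_day = (Mon => 898, Tue => 929, Wed => 969, Thu => 907,
              Fri => 825, Sat => 499, Sun => 523);
my $weekday_mean = sum(@by_day{qw(Mon Tue Wed Thu Fri)}) / 5;
my $weekend_mean = sum(@by_day{qw(Sat Sun)}) / 2;

printf "days covered:          %d\n",   $days;
printf "alerts per day:        %.1f\n", $emails / $days;
printf "weekday/weekend ratio: %.2f\n", $weekday_mean / $weekend_mean;
printf "seconds per email:     %.2f\n", $runtime / $emails;
```

This gives 312 days, about 17.8 alerts per day, a weekday-to-weekend ratio of about 1.77, and about 0.11 seconds per email.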