This week's feature is the effective use of Transformation functions.
This excerpt is taken from the updated Reference Manual section of Ivan Ristic's book ModSecurity Handbook.
Transformation functions are used to alter input data before it is used in matching (i.e., operator execution). The input data itself is never modified: whenever you request a transformation function, ModSecurity creates a copy of the data, transforms it, and then runs the operator against the result.
Note
There are no default transformation functions, as there were in the first generation of ModSecurity (1.x).
In the following example, the request parameter values are converted to lowercase before matching:
SecRule ARGS "xp_cmdshell" "t:lowercase"
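The copy-then-match behavior described above can be pictured with a short Python sketch. This is only a rough model, not ModSecurity's implementation: the helper names (t_lowercase, rule_matches) and the sample parameter value are assumptions made up for illustration.

```python
import re

def t_lowercase(value: str) -> str:
    """Rough stand-in for t:lowercase; returns a new string."""
    return value.lower()

def rule_matches(pattern: str, value: str, transformations) -> bool:
    """Transform a working copy of the value, then run the regex once."""
    working = value
    for transform in transformations:
        working = transform(working)
    return re.search(pattern, working) is not None

# Hypothetical request parameter value using mixed case.
original = "EXEC master..XP_CmdShell 'dir'"

matched_raw = rule_matches(r"xp_cmdshell", original, [])
matched_lowercased = rule_matches(r"xp_cmdshell", original, [t_lowercase])

print(matched_raw, matched_lowercased)  # False True
print(original)  # the original input is left untouched
```

Without the transformation the lowercase pattern misses the mixed-case keyword; with it, the match succeeds while the original value stays as received.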
Multiple transformation actions can be used in the same rule, forming a transformation pipeline. The transformations will be performed in the order in which they appear in the rule.
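Because each function receives the output of the previous one, the same set of functions can produce different results in a different order. The following Python sketch illustrates this with two stand-ins: html.unescape for htmlEntityDecode and str.lower for lowercase. Both are approximations chosen for illustration (Python's entity decoding is not identical to ModSecurity's), and the payload is a made-up example.

```python
import html
import re

# Case-sensitive pattern, so detection relies on the lowercase step.
PATTERN = re.compile(r"(asfunction|javascript|vbscript|data|mocha|livescript):")

def html_entity_decode(value: str) -> str:
    """Approximation of t:htmlEntityDecode using the standard library."""
    return html.unescape(value)

def lowercase(value: str) -> str:
    """Approximation of t:lowercase."""
    return value.lower()

def run_pipeline(value: str, steps) -> str:
    """Feed the output of each step into the next one."""
    for step in steps:
        value = step(value)
    return value

# Hypothetical payload: the leading "J" is hidden in a hex entity.
payload = "&#x4A;avascript:alert(1)"

decode_then_lower = run_pipeline(payload, [html_entity_decode, lowercase])
lower_then_decode = run_pipeline(payload, [lowercase, html_entity_decode])

print(bool(PATTERN.search(decode_then_lower)))  # True
print(bool(PATTERN.search(lower_then_decode)))  # False
```

In the second ordering the uppercase "J" only appears after lowercasing has already run, so the pattern no longer matches.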
In most cases, the order in which transformations are performed is very important. In the following example, a series of transformation functions is performed to counter evasion. Performing the transformations in any other order would allow a skillful attacker to evade detection:
SecRule ARGS "(asfunction|javascript|vbscript|data|mocha|livescript):" \
"t:none,t:htmlEntityDecode,t:lowercase,t:removeNulls,t:removeWhitespace"
Warning
It is currently possible to use SecDefaultAction to specify a default list of transformation functions, which will be applied to all rules that follow the SecDefaultAction directive. However, this practice is not recommended, because it means that mistakes are very easy to make. It is recommended that you always specify the transformation functions that are needed by a particular rule, starting the list with t:none (which clears the possibly inherited transformation functions).
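The inheritance pitfall can be modeled in a few lines of Python. This is a simplified sketch of the behavior described in the warning, not ModSecurity code: the function effective_transformations and the default list are hypothetical, and only t: actions are considered.

```python
def effective_transformations(inherited, rule_actions):
    """Model: rule transformations are appended to the inherited ones,
    and t:none discards everything collected so far."""
    pipeline = list(inherited)
    for action in rule_actions:
        if not action.startswith("t:"):
            continue  # ignore non-transformation actions in this model
        name = action[2:]
        if name == "none":
            pipeline.clear()
        else:
            pipeline.append(name)
    return pipeline

# Hypothetical list set earlier by a SecDefaultAction directive.
defaults = ["lowercase", "urlDecodeUni"]

without_reset = effective_transformations(defaults, ["t:htmlEntityDecode"])
with_reset = effective_transformations(defaults, ["t:none", "t:htmlEntityDecode"])

print(without_reset)  # ['lowercase', 'urlDecodeUni', 'htmlEntityDecode']
print(with_reset)     # ['htmlEntityDecode']
```

The rule without t:none silently runs two extra transformations that its author never wrote, which is exactly the kind of mistake the warning is about.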
The OWASP ModSecurity CRS makes extensive use of transformation functions. Each rule applies a specific transformation pipeline that was developed from user feedback and testing and aims to avoid false negative issues. The following example rule is taken from the modsecurity_crs_41_sql_injection.conf file:
SecRule REQUEST_FILENAME|ARGS_NAMES|ARGS|XML:/* "\bunion\b.{1,100}?\bselect\b" \
"phase:2,rev:'2.0.8',capture,t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,t:replaceComments,t:compressWhiteSpace,ctl:auditLogParts=+E,pass,nolog,auditlog,msg:'SQL Injection Attack',id:'959047',tag:'WEB_ATTACK/SQL_INJECTION',tag:'WASCTC/WASC-19',tag:'OWASP_TOP_10/A1',tag:'OWASP_AppSensor/CIE1',tag:'PCI/6.5.2',logdata:'%{TX.0}',severity:'2',setvar:'tx.msg=%{rule.msg}',setvar:tx.sql_injection_score=+%{tx.critical_anomaly_score},setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:tx.%{rule.id}-WEB_ATTACK/SQL_INJECTION-%{matched_var_name}=%{tx.0}"
The goal of this transformation pipeline is to counteract the typical evasion techniques used in SQL Injection attacks (which we describe in more depth in the following section).
Why are transformation functions so important? One word: EVASIONS. We have blogged about the term Impedance Mismatch many times in the past. The issue is that an attacker has many ways to alter the format of an inbound payload so that it no longer matches an input validation filter, yet it remains functionally equivalent code that the back-end application will execute.
Example SQL Injection Evasion Techniques:
delete from -- lowercase
DELETE FROM -- upper-case
deLeTe fRoM -- mixed-case
Delete    From -- more than one space between keywords
DELETE\tFROM -- \t represents a TAB character
DELETE/* random data */ FROM -- use of SQL Comments
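To make the point concrete, the sketch below pushes the six variants above through rough Python approximations of three of the functions in the CRS pipeline (lowercase, replaceComments, compressWhitespace). The regular expressions are assumptions that imitate the documented behavior; they are not ModSecurity's own code.

```python
import re

def lowercase(value: str) -> str:
    return value.lower()

def replace_comments(value: str) -> str:
    """Replace each /* ... */ with one space; an unterminated comment
    is replaced through the end of the string."""
    return re.sub(r"/\*.*?(\*/|\Z)", " ", value, flags=re.DOTALL)

def compress_whitespace(value: str) -> str:
    """Collapse any run of whitespace into a single space."""
    return re.sub(r"\s+", " ", value)

def normalize(value: str) -> str:
    for step in (lowercase, replace_comments, compress_whitespace):
        value = step(value)
    return value

variants = [
    "delete from",
    "DELETE FROM",
    "deLeTe fRoM",
    "Delete    From",
    "DELETE\tFROM",
    "DELETE/* random data */ FROM",
]

normalized = {normalize(v) for v in variants}
print(normalized)  # {'delete from'}
```

All six spellings collapse to a single canonical form, so one simple pattern can cover them.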
Let's take a look at an example SQL Injection attack payload:
http://localhost/vulnerable_app.php?foo=1'%20UniOn%09(/*blah%20blah%20blah*/SeLeCt%20%20%20%20%20%20%20'1','2',PASSword%20from%20USERS)%20--%20-a
This payload uses several of the evasion techniques described in the previous section. Let's take a look at the ModSecurity debug log (set to level 9) and see how the transformation pipeline used in CRS rule id 959047 handles the payload:
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) urlDecodeUni: "1' UniOn\t(/*blah blah blah*/SeLeCt '1','2',PASSword from USERS) -- -a"
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) htmlEntityDecode: "1' UniOn\t(/*blah blah blah*/SeLeCt '1','2',PASSword from USERS) -- -a"
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) lowercase: "1' union\t(/*blah blah blah*/select '1','2',password from users) -- -a"
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) replaceComments: "1' union\t( select '1','2',password from users) -- -a"
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) compressWhitespace: "1' union ( select '1','2',password from users) -- -a"
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][4] Transformation completed in 36 usec.
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][4] Executing operator "rx" with param "\\bunion\\b.{1,100}?\\bselect\\b" against ARGS:foo.
[31/Aug/2010:11:34:17 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] Target value: "1' union ( select '1','2',password from users) -- -a"
As you can see from the debug_log, the ARGS:foo parameter payload data was normalized to remove the evasion techniques before the payload was inspected by the @rx operator.
While transformation pipelines work fairly well at identifying most evasion attacks, they are by no means perfect. In fact, attackers may abuse the fact that ModSecurity applies the specified operator only once, at the end of the transformation pipeline. Here is an example attack payload that did evade previous versions of the CRS, which inspected only more generic variables such as REQUEST_URI:
http://localhost/vulnerable_app.php?foo=%2f*&bar=%E2%80%98+UNION+SELECT+*+FROM+user+%26%23x2f*
The evasion trick this payload is attempting is to spread the SQL C-style comment across multiple parameter payloads. Let's look at how the previous CRS rule would have processed this REQUEST_URI payload and applied the transformation pipeline:
[31/Aug/2010:11:56:15 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) urlDecodeUni: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:11:56:15 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) htmlEntityDecode: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:11:56:15 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) lowercase: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 union select * from user /*"
[31/Aug/2010:11:56:15 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) replaceComments: "/vulnerable_app.php?foo= "
[31/Aug/2010:11:56:15 --0400] [localhost/sid#10080d708][rid#1025d2aa0][/vulnerable_app.php][9] T (0) compressWhitespace: "/vulnerable_app.php?foo= "
Oops... As you can see, once the replaceComments transformation function is applied, it effectively removes the SQLi payload before the operator is applied. This is a perfect example of Impedance Mismatch, where the WAF normalizes data differently from the target application.
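The effect can be reproduced with a rough Python model of the same pipeline. The stand-ins used here (unquote_plus for urlDecodeUni, html.unescape for htmlEntityDecode, and two small regex helpers) are assumptions that approximate the behavior visible in the log; they are not ModSecurity's implementation.

```python
import html
import re
from urllib.parse import unquote_plus

RULE = re.compile(r"\bunion\b.{1,100}?\bselect\b")

def replace_comments(value: str) -> str:
    """Replace each /* ... */ with one space; an unterminated comment
    swallows everything up to the end of the string."""
    return re.sub(r"/\*.*?(\*/|\Z)", " ", value, flags=re.DOTALL)

def compress_whitespace(value: str) -> str:
    return re.sub(r"\s+", " ", value)

# Approximation of: urlDecodeUni, htmlEntityDecode, lowercase,
# replaceComments, compressWhitespace.
PIPELINE = [unquote_plus, html.unescape, str.lower,
            replace_comments, compress_whitespace]

request_uri = ("/vulnerable_app.php?foo=%2f*&bar=%E2%80%98+UNION+SELECT"
               "+*+FROM+user+%26%23x2f*")

value = request_uri
for step in PIPELINE:
    value = step(value)

print(repr(value))                 # '/vulnerable_app.php?foo= '
print(RULE.search(value) is None)  # True: the operator sees no SQL at all
```

The opening comment marker in foo is never closed, so the comment-replacement step discards the rest of the URI, including the UNION SELECT in bar.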
So, how do we combat this evasion issue?
By default, operators are only applied once after the entire transformation pipeline is completed. This is a sound approach for normal everyday use as it strikes a balance between detecting attacks and not adversely affecting performance. Keep in mind that there is a latency cost each time that ModSecurity has to apply an operator to data.
There is a seldom-used action called multiMatch, whose purpose is to change when operators are applied to data.
With multiMatch, the operator is actually applied to data each time that the individual transformation function alters the data. With this approach, it is now possible to detect the previous SQL Injection bypass attempt. Here is how the multiMatch operator execution looks now:
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Executing operator "rx" with param "\\bunion\\b.{1,100}?\\bselect\\b" against REQUEST_URI_RAW.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] Target value: "/vulnerable_app.php?foo=%2f*&bar=%E2%80%98+UNION+SELECT+*+FROM+user+%26%23x2f*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Operator completed in 1 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] T (0) urlDecodeUni: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Transformation completed in 42 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Executing operator "rx" with param "\\bunion\\b.{1,100}?\\bselect\\b" against REQUEST_URI_RAW.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] Target value: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Operator completed in 1 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] T (0) htmlEntityDecode: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Transformation completed in 78 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Executing operator "rx" with param "\\bunion\\b.{1,100}?\\bselect\\b" against REQUEST_URI_RAW.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] Target value: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 UNION SELECT * FROM user /*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Operator completed in 0 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] T (0) lowercase: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 union select * from user /*"
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Transformation completed in 109 usec.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][4] Executing operator "rx" with param "\\bunion\\b.{1,100}?\\bselect\\b" against REQUEST_URI_RAW.
[31/Aug/2010:16:22:45 --0400] [localhost/sid#10080d708][rid#1040020a0][/vulnerable_app.php][9] Target value: "/vulnerable_app.php?foo=/*&bar=\xe2\x80\x98 union select * from user /*"
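The log above can be mirrored by a small Python model of the multiMatch idea: run the operator on the raw value and again after every transformation that changes it. As before, this is a sketch under stated assumptions. The function multi_match and the Python stand-ins for the transformation functions are hypothetical approximations, not ModSecurity internals.

```python
import html
import re
from urllib.parse import unquote_plus

RULE = re.compile(r"\bunion\b.{1,100}?\bselect\b")

def replace_comments(value: str) -> str:
    """Replace /* ... */ (or an unterminated comment) with one space."""
    return re.sub(r"/\*.*?(\*/|\Z)", " ", value, flags=re.DOTALL)

def compress_whitespace(value: str) -> str:
    return re.sub(r"\s+", " ", value)

PIPELINE = [unquote_plus, html.unescape, str.lower,
            replace_comments, compress_whitespace]

def multi_match(value, pipeline, rule):
    """Return the names of the stages at which the rule matched."""
    stages = [("raw", value)]
    for step in pipeline:
        transformed = step(value)
        if transformed != value:  # only re-check when the data changed
            stages.append((step.__name__, transformed))
        value = transformed
    return [name for name, staged in stages if rule.search(staged)]

request_uri = ("/vulnerable_app.php?foo=%2f*&bar=%E2%80%98+UNION+SELECT"
               "+*+FROM+user+%26%23x2f*")

hits = multi_match(request_uri, PIPELINE, RULE)
print(hits)  # ['lower']
```

In this model the match occurs right after the lowercase step, before the comment-replacement step has a chance to erase the payload. The price is one extra operator run per changed stage.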
To balance the added latency against the higher chance of false positives, the OWASP ModSecurity Core Rule Set uses the multiMatch action only conditionally: it takes effect when the administrator configures the PARANOID_MODE variable in the modsecurity_crs_10_config.conf file. When this is set, many of the rules will inspect more generic variables and use the multiMatch action. As the name indicates, PARANOID_MODE is mainly meant for people who want more aggressive detection and have a higher tolerance for false positives.
Here are a few recommended tips for using transformation functions.
Because transformation function pipelines are cumulative, you could unintentionally inherit transformation functions from a previous SecDefaultAction. It is therefore good practice to start each transformation function pipeline with "t:none", as this will clear out any existing ones.
There is no "one size fits all" magic transformation function pipeline. You need to analyze what type of attack you are targeting and then identify the methods by which attackers may try to evade detection. For instance, the transformation pipelines for detecting Cross-site Scripting (XSS) are much different from what you would use for SQL Injection.
The larger the payload is, the longer it will take to complete the transformation pipeline. This is not a big concern for inbound request data, which is generally small. Where you can run into higher latency is when you inspect outbound data (the RESPONSE_BODY variable). For this reason, you should try to limit the use of transformations when inspecting RESPONSE_BODY and instead account for mixed case within your regular expression.
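One way to read the last tip is sketched below in Python: instead of lowercasing a copy of a large response body and then matching, let the pattern itself ignore case. The body content and the error string are made up for illustration, and the inline (?i) flag is just one way of handling mixed case in a pattern.

```python
import re

# Hypothetical large response body containing a mixed-case error message.
response_body = ("<html>" + "x" * 100_000 +
                 "You have an error in your SQL Syntax</html>")

# Approach 1: lowercase a full copy of the body, then match.
lowered_copy = response_body.lower()
hit_with_transform = re.search(r"error in your sql syntax",
                               lowered_copy) is not None

# Approach 2: no transformed copy; case is handled inside the pattern.
hit_with_flag = re.search(r"(?i)error in your sql syntax",
                          response_body) is not None

print(hit_with_transform, hit_with_flag)  # True True
```

Both approaches find the message, but the second avoids building and scanning a transformed copy of the whole body.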