Have you ever seen a ModSecurity rule? One may look similar to the following:
SecRule REQUEST_URI "@endswith example.com/index.html" "id:1,log,deny,redirect:http://modsecurity.org"
This rule may look complicated, but it is extremely basic. It says: if you find a URL ending with example.com/index.html, issue an HTTP redirect to http://modsecurity.org. All rules in SecRules follow this format of a variable (here REQUEST_URI), an operator (here @endswith with its argument), and a list of actions (here id, log, deny, and redirect). While the structure of a basic rule is rather simple, underneath there is an extremely flexible language capable of handling both variables and persistence between requests. This may seem strange, as it is unlike many of the programming languages you might have encountered. In fact, what is now the SecRules language evolved out of Apache's configuration parser over many years.
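To make the variable, operator, and action structure concrete, here is a minimal Python sketch that splits a one-line rule into those three parts. It is only an illustration and not how ModSecurity parses rules: `split_rule` is a hypothetical helper, it assumes one complete rule per line, and its naive comma split would mishandle action values that themselves contain commas.

```python
import shlex


def split_rule(line):
    """Split a single-line SecRule into (variables, operator, actions).

    Toy illustration only: assumes one rule per line and no commas
    inside individual action values.
    """
    # shlex honours the double quotes, so each quoted section is one token.
    directive, variables, operator, actions = shlex.split(line)
    if directive != "SecRule":
        raise ValueError("not a SecRule directive")
    return variables, operator, actions.split(",")


rule = ('SecRule REQUEST_URI "@endswith example.com/index.html" '
        '"id:1,log,deny,redirect:http://modsecurity.org"')
variables, operator, actions = split_rule(rule)
print(variables)  # REQUEST_URI
print(operator)   # @endswith example.com/index.html
print(actions)    # ['id:1', 'log', 'deny', 'redirect:http://modsecurity.org']
```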
As part of the ongoing ModSecurity version 3.0 rewrite, SecRules' dependency on Apache and its configuration parsing capabilities was removed. This component was refactored for several reasons, the primary one being that, while ModSecurity v3 (libmodsecurity) will continue to support Apache, its revamped architecture is designed to encourage flexibility in data sources, effectively decoupling ModSecurity from any specific web server. As a result, any reliance on a web server as part of the core processing functionality would be superfluous.
The first part of this work was to identify how we were going to replace the traditional Apache-based SecRules configuration parser. The most obvious choice was to construct our own lexer and parser. This choice allowed us to formalize the SecRules language in such a way that we could easily decide whether existing SecRules samples are valid while maintaining extensibility. To accomplish this we leveraged modern versions of Bison and Flex, which are required for building libmodsecurity. While there are practical reasons for formalizing our language, doing so leads to other, less immediately practical questions. Bison allows us to represent our language as a Context Free Grammar (CFG), but the grammar alone tells us little about what the language is capable of expressing. To formalize this question we ask: "Is SecRules Turing Complete?"
Turing Completeness
Turing Completeness is often used to discuss the capabilities of a programming language. Generally, when we say something is Turing Complete, we are loosely describing a system that can simulate any single-tape Turing Machine. In the case of most modern computers we evaluate this completeness while ignoring the requirement for unlimited memory. While most imperative languages do fit this definition of Turing completeness, SecRules arguably doesn't fit the imperative mold well. In fact, if you were to compare SecRules to another language you might very well select (heh) SQL. Both languages essentially instruct the system to fetch information and perform some action with it. For instance:
SecRule ARGS "@contains test" "id:1,deny,status:404"
This tells the system to fetch all arguments that contain the word test. If any results are returned, the specified action, in this case "return a 404", is taken. Although the two are similar, it should be noted that traditional SQL92 is not Turing complete (https://www.quora.com/Is-SQL-a-Turing-complete-language). Adding to the argument that SecRules may not be Turing Complete, the language relies, often heavily, on Regular Expressions. Regular Expressions, as a language, are also known not to be Turing complete. So is SecRules Turing Complete, and if so, what sets SecRules apart from languages like Regular Expressions?
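The fetch-then-act reading of that rule can be sketched in Python. This is only an analogy under simplifying assumptions: `evaluate_contains_rule` and its `args` dictionary of request parameters are hypothetical, and a status of 200 stands in for "let the request through".

```python
def evaluate_contains_rule(args, needle="test", deny_status=404):
    """Mimic: SecRule ARGS "@contains test" "id:1,deny,status:404".

    Step 1 ("fetch"): collect every argument whose value contains needle.
    Step 2 ("act"): if anything matched, answer with the deny status.
    """
    matches = {name: value for name, value in args.items() if needle in value}
    status = deny_status if matches else 200
    return status, matches


status, matched = evaluate_contains_rule({"q": "my test query", "page": "2"})
print(status, matched)  # 404 {'q': 'my test query'}
```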
Evaluating SecRules
When evaluating SecRules for completeness we may first observe that it shares a lot in common with imperative languages, particularly in its actions. However, we also get to rely on HTTP underneath, which takes some of the load off, as we will see. But to decide if this is enough we must first look at how one evaluates Turing Completeness.
There are several ways to prove Turing completeness, from showing that a system can compute the μ-recursive functions to showing that it can express Lambda Calculus, perhaps the most famous example of a Turing Complete system. However, any system that is able to implement a single-tape Turing Machine is capable of representing each of these systems, so there are many other ways of formally proving completeness. With this in mind, implementing a game like Conway's Game of Life can be used to show that a system is Turing Complete.
While Conway's Game of Life is more commonly known, we used a slightly different game called Rule 110. Rule 110 is a cellular automaton that was introduced by Stephen Wolfram. It uses an extremely simple set of rules in order to generate the next row of the game based on the previous row.
Row Pattern | Resulting Row Value
111 | 0
110 | 1
101 | 1
100 | 0
011 | 1
010 | 1
001 | 1
000 | 0
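The table translates directly into a lookup. The Python sketch below is an illustration, not part of the ModSecurity implementation. It assumes no padding or wraparound at the edges, so each new row is two cells shorter than the previous one; `RULE_110` and `next_row` are names chosen here. Read from top to bottom, the result column spells 01101110, which is 110 in binary and gives the rule its name.

```python
# Rule 110 lookup table: three-cell neighborhood -> value of the new cell.
RULE_110 = {
    "111": "0", "110": "1", "101": "1", "100": "0",
    "011": "1", "010": "1", "001": "1", "000": "0",
}


def next_row(row):
    """Apply Rule 110 to every full three-cell window of row.

    Assumption: no padding or wraparound, so the result is two cells shorter.
    """
    return "".join(RULE_110[row[i:i + 3]] for i in range(len(row) - 2))


print(next_row("101010"))  # 1111
print(next_row("1111"))    # 00
```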
The advantage for us over Conway's Game of Life is that Rule 110 only requires comparing neighbors in one dimension, whereas Conway's Game of Life requires a board data structure and must evaluate neighbors in two dimensions. The other advantage, of course, is that Rule 110 has been proven to be Turing complete (http://www.complex-systems.com/abstracts/v15_i01_a01.html).
The ModSecurity implementation needs to take advantage of two key aspects of the SecRules language. The first is support for persistence, which is implemented via LMDB (using initcol and setvar). This allows us to store information about the state of the game between requests. The second is the capability to issue an HTTP redirect. This effectively serves as the loop of our game. While modern browsers limit the number of redirects they will follow, this is a limit imposed by the browser rather than a fundamental limit of HTTP itself.
The flow of the game is slightly complex because ModSecurity lacks native looping. The pseudocode is shown below. Variables with a prefix of 'p' are persistent variables and exist between requests. Variables with a prefix of 'r' come from the request and are passed via GET request parameters. Variables with a prefix of 'l' are limited variables and exist only while the specific request is being processed.
If the persistence is not already set up:
    Set up pFinal_state that contains the rCurrent_row parameter
    Set up an empty pNew_row
If the rStart_new_row parameter is set, save pNew_row to pFinal_state
If the rStart_new_row parameter is set, clear pNew_row
Get the first three elements of rCurrent_row and assign them to a variable lTest_string
Run lTest_string through a lookup table and append the result to pNew_row
Remove one element from rCurrent_row and save the rest to a variable lRemaining_row
If lRemaining_row has more than two elements left to process:
    Redirect back to the original site passing lRemaining_row as the value of the rCurrent_row parameter
Else:
    If pNew_row has more than two elements in it:
        Redirect to the original site with pNew_row passed as the rCurrent_row parameter and the rStart_new_row parameter set
    Else:
        Append pNew_row to pFinal_state
        We are done, append pFinal_state to the page
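The flow above can be simulated in a few lines of Python to check that the redirects really do act as the loop. This is a sketch under stated assumptions, not the ModSecurity implementation: the `store` dictionary stands in for the persistent collection, `handle_request` and `run` are hypothetical names, pFinal_state is kept as a list rather than a concatenated string, and the input is assumed to be at least three cells of 0s and 1s.

```python
RULE_110 = {
    "111": "0", "110": "1", "101": "1", "100": "0",
    "011": "1", "010": "1", "001": "1", "000": "0",
}


def handle_request(store, row, start_new_row=False):
    """Process one request; return ("redirect", params) or ("done", page)."""
    # If the persistence is not already set up, initialise it.
    if "pFinal_state" not in store:
        store["pFinal_state"] = [row]
        store["pNew_row"] = ""
    # A finished row is saved to the final state, then cleared.
    if start_new_row:
        store["pFinal_state"].append(store["pNew_row"])
        store["pNew_row"] = ""
    l_test_string = row[:3]                      # first three elements
    store["pNew_row"] += RULE_110[l_test_string]
    l_remaining_row = row[1:]                    # drop one element
    if len(l_remaining_row) > 2:
        return "redirect", {"row": l_remaining_row, "start_new_row": False}
    if len(store["pNew_row"]) > 2:
        return "redirect", {"row": store["pNew_row"], "start_new_row": True}
    store["pFinal_state"].append(store["pNew_row"])
    return "done", "\n".join(store["pFinal_state"])


def run(initial_row):
    """Follow redirects until the handler reports that it is done."""
    store = {}  # stands in for the persistent collection
    params = {"row": initial_row, "start_new_row": False}
    while True:
        kind, result = handle_request(store, **params)
        if kind == "done":
            return result
        params = result  # "follow the redirect"


print(run("101010"))
# 101010
# 1111
# 00
```

Each call to `handle_request` does a constant amount of work; only the driver loop in `run`, which plays the role of the HTTP client following redirects, repeats.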
Let's take an example. If we start with "101010", we first process '101', then move one bit forward to '010', then '101', then '010', at which point we are done with the current row. This results in the next row, '1111'. The process continues until it completes, yielding:
101010
1111
00
Implementing this pseudocode within ModSecurity (Appendix 1) produces the expected outcome.
This indicates that we are able to successfully implement Rule 110 in ModSecurity's SecRules language. From this we can conclude that SecRules is in fact Turing Complete, and capable of representing anything most modern languages are able to represent (although not necessarily as efficiently).
The following is an implementation of the pseudo code in the SecRules language:
SecContentInjection On
# Check if our persistence exists
SecRule &IP:counter "!@eq 1" "id:1,pass,nolog,msg:'Starting Persistence',initcol:ip=%{REMOTE_ADDR}"
# Functionality to clear (db persists between reboots)
SecRule ARGS:clear "@streq true" "id:100,pass,nolog,setvar:ip.pFinal_state="
SecRule ARGS:clear "@streq true" "id:101,pass,nolog,deny,status:402,setvar:ip.pNew_row="
# If the persistence is not already setup:
SecRule &ARGS:row "@eq 1" "id:2,chain,pass,nolog,msg:'initiate pNew_row'"
# Set up pFinal_state that contains the rCurrent_row parameter
# Setup an empty pNew_row.
SecRule IP:pFinal_state "@lt 1" "t:length,setvar:ip.pNew_row=,setvar:ip.pFinal_state=%{ARGS.row}"
# If the rStart_new_row parameter is set, save pNew_row to pFinal_state
SecRule ARGS:start_new_row "@streq true" "id:4,pass,nolog,setvar:ip.pFinal_state=%{ip.pFinal_state}
%{ip.pNew_row}"
# If the rStart_new_row parameter is set, clear pNew_row
SecRule ARGS:start_new_row "@streq true" "id:5,pass,nolog,setvar:ip.pNew_row="
# Get the first three elements of the rCurrent_row and assign to a variable lTest_string
SecRule ARGS:row "@rx (\d{3})\d*" "id:6,pass,nolog,msg:'got %{tx.1}',capture,setvar:tx.lTest_string=%{tx.1}"
# Run lTest_string through a lookup table and append result to our pNew_row
SecRule TX:lTest_string "@streq 111" "id:7,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}0"
SecRule TX:lTest_string "@streq 101" "id:8,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}1"
SecRule TX:lTest_string "@streq 110" "id:9,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}1"
SecRule TX:lTest_string "@streq 100" "id:10,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}0"
SecRule TX:lTest_string "@streq 011" "id:11,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}1"
SecRule TX:lTest_string "@streq 010" "id:12,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}1"
SecRule TX:lTest_string "@streq 001" "id:13,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}1"
SecRule TX:lTest_string "@streq 000" "id:14,pass,nolog,setvar:ip.pNew_row=%{ip.pNew_row}0"
# Remove one element from the rCurrent_row and save it to a variable lRemaining_row
SecRule ARGS:row "@rx \d{1}(\d*)" "id:15,pass,nolog,msg:'remain %{tx.1}',capture,setvar:tx.lRemaining_row=%{tx.1}"
# If the lRemaining_row has more than two elements left to process
# Redirect back to the original site passing lRemaining_row as the value of the rCurrent_row parameter
SecRule tx:lRemaining_row "@gt 2" "id:16,t:length,deny,redirect:http://localhost?row=%{tx.lRemaining_row}"
# Else
SecRule tx:lRemaining_row "@le 2" "id:17,t:length,msg:'new row added %{ip.pNew_row}',chain,deny,redirect:http://localhost?row=%{ip.pNew_row}&start_new_row=true"
# Append the amount of zeros (optional)
# If pNew_row has more than two elements in it
# Redirect to the original site with a pNew_row passed as the rCurrent_row parameter and the rStart_new_row parameter set
SecRule IP:pNew_row "@gt 2" "t:length"
# Else
SecRule tx:lRemaining_row "@le 2" "chain,phase:3,id:18,t:length"
# Append pNew_row to pFinal_state
# We are done, append pFinal_state to the page
SecRule IP:pNew_row "@le 2" "chain,t:length,setvar:ip.pFinal_state=%{ip.pFinal_state}
%{ip.pNew_row},append:'%{ip.pFinal_state}'"
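The two capture patterns in rules 6 and 15 do the string slicing for the whole game, so it helps to see them in isolation. The Python sketch below is illustrative only: Python's `re` module is not the regex engine ModSecurity uses, but these simple patterns behave the same way here. `re.search` is used because `@rx` matches are unanchored, and the sample value of `row` is an assumption.

```python
import re

row = "101010"  # sample value of the row parameter

# Rule id:6 uses (\d{3})\d* to capture the first three cells.
first_three = re.search(r"(\d{3})\d*", row).group(1)

# Rule id:15 uses \d{1}(\d*) to capture everything after the first cell.
remaining = re.search(r"\d{1}(\d*)", row).group(1)

print(first_three)  # 101
print(remaining)    # 01010
```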