Trustwave and Cybereason Merge to Form Global MDR Powerhouse for Unparalleled Cybersecurity Value. Learn More
Get access to immediate incident response assistance.
Get access to immediate incident response assistance.
Trustwave and Cybereason Merge to Form Global MDR Powerhouse for Unparalleled Cybersecurity Value. Learn More
If you save wide Unicode brackets (i.e. <>) into a char or varchar field, MSSQL Server will convert them into HTML brackets (i.e. <>). So, <img src=x onerror=alert('pxss')> will be converted to <img src=x onerror=alert('pxss')> compliments of the backend DB. This will likely help you sneak past server-side filters, WAFs, etc. and execute a persistent Cross-Site Scripting (PXSS) attack. As a bonus, .NET request validation will not detect it.
I still find myself shaking my head each time I come across this one. This is a known issue – I didn’t discover this. It’s not even a new issue, so it seems weird that I would waste my time to document the issue. Again. Truth be told, I’ve written about this internally for SpiderLabs. We all know about it and employ this payload in common attack scenarios and I still cannot think of a single valid reason for MSSQL to both allow and cause it. It can be an insidious little issue too because you thought you were safe…playing by the good guy rules. Here’s my attempt at explaining it. Again.
The venerable char/varchar datatypes did a fine job of allowing an application to store variable-length data and minimize its storage footprint by only using up space as the text data grew. 20+ years ago, disk space limitations were a common issue, so prudent table design which minimized waste was common sense. The char/varchar datatypes had a limited character set that would handle only ASCII characters which required only a single byte of storage for each character. Nchar/nvarchar, on the other hand, required 2 bytes in order to store Unicode characters. With these factors in mind, developers and DBAs who had no intention of storing non-ASCII characters would make the obvious choice at the time and use varchar since it would consume as little space as possible and expand to its maximum size as needed. While the issue we’re looking at affects both char and varchar, we’ll stick to talking about varchar from here on.
Fast forward a bunch of years from the 90’s, and now we have bad guys attacking applications, injecting any and every payload they can into HTTP requests. In our case, we’re going to see what happens when Unicode values (2 bytes) are saved in a database column that can only save ASCII values (1 byte). My initial thought would be to expect some sort of database error, but I would be wrong. Let’s take a look at a very unexpected behavior of SQL Server.
The particular oddity that we’ll be looking at is what SQL Server does when you insert Unicode characters into a varchar field. SQL Server will transparently attempt to map & convert Unicode characters into their visually related ASCII characters. This is where things get interesting. The wide “less-than” Unicode character looks similar to the ASCII “less-than” character, however the former is not blocked by request validation while the latter is.
For PoC purposes we can see that the wide less than (%ef%bc%9c) stored as a .NET string is 0xff1c in hex.
using System;
using System.Web;
using System.Text;
public class Program
{
public static void Main(){
string fromInput = System.Web.HttpUtility.UrlDecode("%ef%bc%9c");
int intVal = (int)fromInput[0];
Console.WriteLine("Wide angle bracket from input: {0}", fromInput);
Console.WriteLine(String.Format("Associated int-to-hex value: {0:X}", intVal));
}
};
Wide angle bracket from input: <
Associated int-to-hex value: FF1C
If this hex value is inserted into a SQL table’s varchar column, we’ll see SQL Server convert the wide less-than to a regular less-than due to a homoglyphic transformation – a given character looks like another character and is subsequently converted.
CREATE TABLE UnicodeConvert
(
col1 varchar(10) /* cannot handle unicode values */
,col2 nvarchar(10) /*can handle unicode values */
)
INSERT INTO UnicodeConvert VALUES (NCHAR(0xff1c),NCHAR(0xff1c));
select col1 as varcharData, unicode(col1) as varcharCharacterCode, col2 as storedAsUnicode, unicode(col2) as storeAsUnicodeCharacterChode from UnicodeConvert;
varcharData: < /* b00$h! that's a valid html bracket*/
varcharCharacterCode: 60
storedAsUnicode: < /* <-- that bracket will not be useful as HTML */
storeAsUnicodeCharacterChode: 65308
We see %EF%BC%9C will be the open bracket and %EF%BC%9E will be the closing bracket that is not caught by .NET request validation and (if stored in a varchar field) could easily be returned as a proper persistent Cross-Site Scripting payload.
%EF%BC%9Cscript%EF%BC%9Ealert%28%27PXSS%27%29%EF%BC%9C%2Fscript%EF%BC%9E i.e. <script>alert('PXSS')</script>
We have just saved a completely benign set of strings and let the database convert it into a malicious payload! So why is MSSQL converting Unicode to ASCII? We reached out to Microsoft to give them the opportunity to comment, but have not received a response. I’m guessing that in a kinder, gentler time, this was a helpful feature that they added. The bracket trick is just one of several transformations that could be useful for attackers. I’ve read other sources that outline using this same technique for bypassing filters and successfully executing SQL injection attacks when the conditions were right.
So, what should you do? Can filters be applied to look for known malicious Unicode patterns? Probably, but it's likely that there are also other clever bypasses out there that require some fringy, esoteric DBMS knowledge to execute. If it were up to me, I’d dump the DDL for my SQL tables and look for char/varchar and change those to nchar/nvarchar respectively. Knowing that DBAs can be resistant to change for fear of breaking legacy apps, I’d show them this post and let them know that the situation is serious, and it’s been serious for a long time. So, keep chanting “defense-in-depth”, run your WAF, apply your filters and change those data types. If Microsoft provides feedback on this issue we will update the post accordingly.
Trustwave is a globally recognized cybersecurity leader that reduces cyber risk and fortifies organizations against disruptive and damaging cyber threats. Our comprehensive offensive and defensive cybersecurity portfolio detects what others cannot, responds with greater speed and effectiveness, optimizes client investment, and improves security resilience. Learn more about us.
Copyright © 2025 Trustwave Holdings, Inc. All rights reserved.