
RLO spoofing is another Internet security reverse

December 4, 2015

You’re right to think that security on the Internet is going in the wrong direction.

Nine years ago, at least, computer industry insiders knew how the invisible bi-directional control characters in the Unicode character set could be exploited to disguise phishing websites and malware.

It’s nearly a decade later and surprisingly little has been done to mitigate the threat to ordinary computer users.

These same Unicode control characters, which are now included with all computer typefaces, also form part of the basic toolkit of work-a-day Internet hackers-slash-thieves.

Unicode bi-directional control characters are being regularly exploited to spoof filenames, Windows registry entries and perhaps even website URLs—all in order to disguise the malicious digital content used to take control of computers and steal from computer users.

The problem isn’t actually the Unicode control characters themselves but rather the characters running the Unicode Consortium, along with the various top hardware and software companies. These are the characters who have collaborated over the last 20 years to create the poorly-secured, easily compromised/easily weaponized, standards—like NTP servers, TLS/SSL and Unicode—that have helped make the Internet the dangerous environment that it has become.

And Microsoft, the maker of the operating system most targeted by every kind of malware attack, should be singled out for deliberately removing a simple security feature from the Home version of Windows that could easily be used to guard against bi-directional character spoofing.

One character set to rule them all

You may know the Unicode text character encoding standard from such historical works as the Unicode of Hammurabi, or you may be more familiar with recent works, such as the DaVinci Unicode—or you may not.

Unicode is one of those foundational things that underlies computers and the Internet—most of us should never need to know about it. It’s a 20-year-old standardized set of over 120,000 characters allowing most of the world’s writing systems—current and historical—to be rendered digitally.

Each typeface that a person uses on their computer contains only a subset of this huge Unicode “alphabet”.

For example, the version of the typeface Arial which has shipped with Microsoft Windows since 1999 is properly called Arial Unicode MS and includes 27,720 Unicode characters.

Unicode is fundamentally a good and necessary thing, and so are the invisible bi-directional control characters.

The so-called Unicode “bidi” control characters are not really characters at all but rather invisible signal switches, or style operators, to control the apparent flow of words. They exist to facilitate the display—separately and together—of scripts that run left-to-right, such as English, and right-to-left, such as Arabic and Hebrew.

Character | Meaning                 | Unicode | Usage
RLM       | Right-to-Left Mark      | U+200F  | Acts as an Arabic character.
LRM       | Left-to-Right Mark      | U+200E  | Acts as a Latin character.
RLE       | Right-to-Left Embedding | U+202B  | Treats following text as embedded right-to-left.
LRE       | Left-to-Right Embedding | U+202A  | Treats following text as embedded left-to-right.
RLO       | Right-to-Left Override  | U+202E  | Forces following characters right-to-left.
LRO       | Left-to-Right Override  | U+202D  | Forces following characters left-to-right.
PDF       | Pop Directional Format  | U+202C  | Restores direction previous to LRE, RLE, RLO, LRO.
ZWJ       | Zero Width Joiner       | U+200D  | Joins leading and trailing characters.
ZWNJ      | Zero Width Non-Joiner   | U+200C  | Breaks leading and trailing characters.
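
The code points in the table above can be cross-checked against the Unicode database that ships with Python. This is only a minimal sketch: the dictionary name BIDI_CONTROLS and the helper describe() are my own inventions for illustration, and the list covers just the nine characters in the table.

```python
import unicodedata

# Code points from the table above, keyed by their usual abbreviations.
BIDI_CONTROLS = {
    "RLM": "\u200f",
    "LRM": "\u200e",
    "RLE": "\u202b",
    "LRE": "\u202a",
    "RLO": "\u202e",
    "LRO": "\u202d",
    "PDF": "\u202c",
    "ZWJ": "\u200d",
    "ZWNJ": "\u200c",
}


def describe(abbrev):
    """Return 'U+XXXX OFFICIAL NAME' for one abbreviation in the table."""
    char = BIDI_CONTROLS[abbrev]
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


if __name__ == "__main__":
    for abbrev in BIDI_CONTROLS:
        print(abbrev, describe(abbrev))
```

All nine characters are in the Unicode "format" category, which is why they take up no space on screen.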

It’s not clear if the bidi control characters were misused prior to 2006, but by that year at least, Unicode insiders and computer security types were well aware of the malicious uses to which the bidi operators, particularly the right-to-left override (RLO) character, could be put.

Malicious hacking goes into reverse

The earliest reported use that I can find of the RLO control character being exploited to distribute malware is five years ago.

In 2010, email spam of the “I Love You” variety disguised the SASFIS malware application as a text file attachment, by placing the invisible RLO character before the “T” in the filename—turning “I-LOVE-YOU-XOXTXT.EXE” into “I-LOVE-YOU-XOXEXE.TXT”.

There were reports, beginning in April of 2012, of UK-based activists and journalists in Bahrain being targeted by a cyber-campaign, apparently mounted by the Government of Bahrain, involving emails containing executable file attachments infected with a spyware called FinSpy. The attached file was spoofed with an RLO control character, so “gpj.1bajaR.exe” became “exe.Rajab1.jpg”.

Also in 2012, there were reports of a “large scale infiltration of computer systems in the Middle East area” dubbed “Mahdi” which employed the RLO spoofing.

In July 2013, security researchers spotted a rarity, a Mac malware app called Janicab.A, disguised as a PDF file using the RLO trick. Because of the RLO character in the filename, a warning displayed by the Mac OS X operating system was confusingly displayed right-to-left.

In 2013, according to Raymond Roberts of Microsoft, the Sirefef family of malware was found to be using RLO spoofing to hide evidence of its presence in the Windows registry (a database of user settings) by mimicking a setting initiated by a Google Chrome installation. (Roberts wrote a good overview of RLO filename spoofing in 2012.)

This last summer a spam campaign was observed distributing a Windows remote access trojan (RAT) called DarkKomet. The attached file was named “Airway_Bill 23-06-2015 Ind□cod.exe”, a failed attempt to use an RLO character (indicated by the box) to disguise the malware as a Microsoft Word “.doc” file.

Seeing is believing even if you shouldn’t believe your eyes


The DarkKomet malware RLO-fail is indicative of the moderate difficulty involved in applying the RLO character to filenames and making it stick.

The Windows operating system doesn’t normally allow Unicode control characters to be pasted into filenames and possible variations in font encoding from one computer and one operating system to another can play havoc with even the best laid plans of malicious hackers.

Consider the file that I created just for this post, called:

famous-croats-arzano‮gpj.vbs

To craft the file name, I had to first perform a change to the Windows registry to allow the pasting of Unicode number codes into filenames. And, in order to protect the final product in transit via email over the Internet, I had to put it in a compressed archive.

You can download the file as a zip archive if you’d like to examine it personally. It should be safe; Windows Defender gave it a clean bill of health. And why not? It looks just like a plain ol’ harmless JPG picture file, probably of some cats, right?

While I’ll warrant that it’s harmless, it isn’t a JPG image file. If you double-click on it, instead of a picture, you’ll be presented with a series of clickable dialog boxes. It’s actually a Visual Basic Script—an executable program disguised as an image file using an RLO character operator, like so:

  • Before the right-to-left override: famous-croats-arzanogpj.vbs
  • After the right-to-left override (U+202E pasted in front of “gpj”): famous-croats-arzanosbv.jpg

To anyone familiar with such things, the fact that my “JPG” still displays a default Visual Basic file icon is a bit of a giveaway but, with more work, I could’ve swapped it for a default JPG file icon.

For that matter, I could’ve disguised the filename and icon to perfectly mimic a plain text file, an MP3 audio file, a PDF file, a Microsoft Excel spreadsheet or whatever looked safe enough to convince a user to double-click it.

And instead of hiding a harmless Visual Basic dialog box, it could’ve been covering for a truly harmful piece of malware.

This shouldn’t be possible.

For starters, the three-character file extension tells Windows what to do with a file. Windows “knows” EXEs are application programs and that VBSs are Visual Basic Scripts. If you just rename a “.VBS” or “.EXE” executable program file to a document file type such as “.JPG”, then Windows will blindly try to open the disguised program in the default program for viewing JPGs, and fail. The user will see an alert declaring: “Unknown file format, empty file or file not found!”

My renamed VBS script opens properly because it isn’t renamed at all.

The invisible Unicode RLO control character pasted between the letters “o” and “g” is telling Windows to reverse the appearance of the order of the last seven characters in the file name—but only the appearance.

The filename only looks to people like “famous-croats-arzanosbv.jpg”. Windows reads the actual filename, which is “famous-croats-arzanogpj.vbs”.
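
The gap between what people see and what Windows reads can be sketched in a few lines of Python. This is a deliberate simplification, not the real Unicode bidirectional algorithm: it just reverses whatever follows the first RLO character (up to a Pop Directional Format character, if there is one). The function names displayed_name() and real_extension() are made up for this illustration.

```python
import os

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def displayed_name(real_name):
    """Approximate what a bidi-aware file manager shows for real_name.

    Simplification: the text after the first RLO (up to a PDF, if any)
    is reversed; the full bidirectional algorithm is not implemented.
    """
    head, found, tail = real_name.partition(RLO)
    if not found:
        return real_name
    overridden, _, rest = tail.partition(PDF)
    return head + overridden[::-1] + rest


def real_extension(real_name):
    """Return the extension the operating system actually acts on."""
    return os.path.splitext(real_name)[1]


if __name__ == "__main__":
    name = "famous-croats-arzano" + RLO + "gpj.vbs"
    print(displayed_name(name))   # famous-croats-arzanosbv.jpg
    print(real_extension(name))   # .vbs
```

The same sketch reproduces the other spoofed names mentioned in this post, such as the “I Love You” attachment from 2010.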

This shouldn’t be possible either!

This is a flaw in Unicode, exploitable for more than nine years, that is well known to everyone except ordinary computer users, who have been left in the dark and essentially defenseless.

An upper and lower case of mistaken identity


Not only do Unicode bi-directional control characters still allow spoofing of malware file names, but (perhaps more seriously) they may even allow website URL spoofing, like so:

A malicious coder sets up a web domain pointing to a server, say: “http://www.evilmalware-site/” and then, to hold their malware-filled website, they set up a file directory with a nonsensical name like: “strela\moc.koobecaf.www\\:ptth”.

This gives them the following website URL:

http://www.evilmalware-site/strela\moc.koobecaf.www\\:ptth

After a right-left-override control character is pasted at the beginning of the URL, it is unchanged so far as a computer is concerned but looks completely different to computer users:

http://www.evilmalware-site/strela\moc.koobecaf.www\\:ptth

It’s unclear if the RLO control character has ever actually been exploited to spoof URLs but the possibility has been talked about for years as a kind of “homograph attack”.
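
To make the URL example above concrete, here is a rough Python sketch of what a reader would see once the RLO character is pasted in front. It assumes the display simply reverses everything after a leading RLO, which glosses over how a particular browser or mail client actually renders bidi text. The helper name apparent_text() is hypothetical.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE

# The made-up URL from the example above (raw string keeps the backslashes).
real_url = r"http://www.evilmalware-site/strela\moc.koobecaf.www\\:ptth"
spoofed_url = RLO + real_url


def apparent_text(text):
    """Reverse everything after a leading RLO (a simplification of bidi display)."""
    if text.startswith(RLO):
        return text[len(RLO):][::-1]
    return text


if __name__ == "__main__":
    # What the computer works with once the invisible character is stripped:
    print(spoofed_url.lstrip(RLO))
    # What a person would see on screen under the assumption above:
    print(apparent_text(spoofed_url))
```

Under that assumption the on-screen text begins with what looks like a Facebook address, while the machine still sees the evilmalware-site host.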

Homograph attacks exploit the resemblance of some non-Latin Unicode characters to characters in the Latin alphabet, in order to create domain names and website URLs that convincingly resemble well-known English-language websites.

paypal-english

The word “paypal” in a Latin Unicode font.

paypal-cyrillic

The word “raural” in a Cyrillic Unicode font.
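
One rough way to catch a lookalike like the one pictured above is to check whether a word mixes scripts. The sketch below is a heuristic of my own, not an established method: it treats the first word of each letter's official Unicode name as its script, which is an approximation of the real Unicode Script property. The function name scripts_in() is hypothetical.

```python
import unicodedata


def scripts_in(text):
    """Return the set of script labels used by the letters in text.

    Heuristic: the first word of a character's Unicode name
    (e.g. 'LATIN', 'CYRILLIC') stands in for its script.
    """
    scripts = set()
    for char in text:
        if char.isalpha():
            scripts.add(unicodedata.name(char, "UNKNOWN").split()[0])
    return scripts


latin_word = "paypal"
# Cyrillic er, a, u, er, a followed by a Latin "l".
lookalike_word = "\u0440\u0430\u0443\u0440\u0430l"

if __name__ == "__main__":
    print(scripts_in(latin_word))
    print(scripts_in(lookalike_word))
```

A word that reports more than one script is not necessarily malicious, but in a domain name it deserves a second look.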

My tests RLO-ing my own blog URL ( ‮https://sqwabb.wordpress.com/‬) were inconclusive, possibly due to the alias nature of WordPress.com URLs.

If I selected the reversed URL (from right-to-left) in the brackets, up to the end of “m”, and performed a right-click “open link in new tab” in Firefox, it worked. But pasting it into the address bar in Pale Moon/Firefox and in Chrome resulted in a Google search. Admittedly, the search returned nothing but my posts, with my main blog URL as the top search result.

Apparently all we can do is RLO with it

anexe

The same file displayed in XP, Windows 7, Linux and Mac OSX.

In 2013, a Malwarebytes researcher crafted a fake Windows file, named it “anncod.exe”, added an RLO character to make it read “annexe.doc” and took it for a walk through the Internet and a few different operating systems.

Insecure Windows XP wasn’t fooled because it’s too old to support Unicode control characters in file names. Windows 7, on the other hand, displayed the spoofed filename perfectly…wrong.

On a Linux system, the Windows file was effectively harmless but still deceptively named. In Mac OS X, the unusable file name displayed correctly because the Mac didn’t recognize the Windows encoding of the RLO character (but that’s easily fixed).

The takeaway is that all three desktop platforms—Windows, Linux/Unix and Mac OS X—are equally vulnerable to Unicode control character spoofing.

All the major free web email services allow document files, such as text, music and images to be attached to messages. But Google and Hotmail, in particular, do not allow applications (“executables”) to be directly attached to messages, because of the threat from viruses. And neither service is fooled by the thin disguise of an RLO character.

The only way to send such a spoofed application file as either a Gmail or Hotmail attachment is inside of a zipped archive.

You should be wary of email attachments from strangers as it is, but an attached zip file that supposedly contains nothing more than a text or image file is very suspicious in itself.
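
For anyone comfortable with Python, a zipped attachment can be inspected without extracting it. The following is a minimal sketch using the standard zipfile module. The function name suspicious_members() and the choice of which characters to flag are my assumptions, and the check only looks at file names, not file contents.

```python
import io
import zipfile

# LRM, RLM, LRE, RLE, PDF, LRO, RLO
BIDI_CONTROLS = frozenset("\u200e\u200f\u202a\u202b\u202c\u202d\u202e")


def suspicious_members(zip_source):
    """Return the member names in a zip archive that contain bidi controls.

    zip_source may be a path or a file-like object; nothing is extracted.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [name for name in archive.namelist()
                if BIDI_CONTROLS.intersection(name)]


if __name__ == "__main__":
    # Build a throwaway archive in memory just to demonstrate the check.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("notes.txt", "harmless")
        demo.writestr("ann\u202ecod.exe", "pretend program")
    buffer.seek(0)
    for name in suspicious_members(buffer):
        # ascii() makes the invisible character show up as an escape.
        print(ascii(name))
```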

Unfortunately, both AOL mail and Yahoo Mail, in keeping with their historic disregard for user security, do allow executable (Windows “EXE”) files to be directly attached to email—all the more reason why I would recommend that people be downright phobic about receiving anything from Yahoo or AOL.

A fix for Windows Pro users but the rest of us are just in a fix

For something like the last 14 years, one way that Microsoft has set the Home version of Windows apart from the Professional and Enterprise versions has been by not including something called the “Local Security Settings console”. This is an easy-to-use graphical front-end for the fairly baffling Windows registry, the storehouse of all user settings.

With the Local Security Settings console, it’s easy to create a security rule that watches for Unicode bi-directional control characters and warns the user if they click on one. Otherwise, it’s all but impossible.

So, if you have any version of Windows besides a Home version, then you can probably take advantage of the simple countermeasure against the RLO control character recommended by a 2011 advisory from the Information Technology Promotion Agency (IPA) of Japan.

The IPA’s simple six-step instructions show Windows users who have “secpol.msc” how to launch it and create a new software restriction policy rule for the RLO control character that will generate a warning message if they click on a filename using RLO spoofing.

Otherwise I have nothing much to offer people in the way of a fix. I can’t see how to create an applicable exception rule in the Windows Firewall or Windows Defender. The expedient of ripping the Local Security Settings console out of a Pro version of Windows and sticking it into a Home version has been bandied about, but it sounds far too dodgy to even try, let alone recommend.

I see no protection against RLO spoofing offered by any major antivirus software package and I haven’t a clue what to tell Linux and Mac users.
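
Short of real protection, a folder can at least be checked after the fact. The sketch below is not a fix and not anything an antivirus vendor offers; it is a hypothetical Python script that walks a directory and prints any filename containing a bidi control character, with the invisible character spelled out. The names make_visible() and find_spoofed() are my own.

```python
import os

# LRM, RLM, LRE, RLE, PDF, LRO, RLO
BIDI_CONTROLS = frozenset("\u200e\u200f\u202a\u202b\u202c\u202d\u202e")


def make_visible(name):
    """Replace each bidi control with a visible tag such as <U+202E>."""
    return "".join(
        f"<U+{ord(char):04X}>" if char in BIDI_CONTROLS else char
        for char in name
    )


def find_spoofed(root):
    """Yield (path, visible_name) for files under root whose names
    contain a bidi control character."""
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if BIDI_CONTROLS.intersection(name):
                yield os.path.join(folder, name), make_visible(name)


if __name__ == "__main__":
    # Scans the current directory; change "." to a downloads folder.
    for path, shown in find_spoofed("."):
        print("Suspicious:", shown, "in", os.path.dirname(path))
```

Because it relies only on the Python standard library, the same script should run on Windows, Linux and Mac OS X, though it only reports files and does nothing to stop one from being opened.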

Apparently two security wrongs almost make a right

I disable Microsoft’s awful “hide file extensions” feature (which is on by default), so I should say that if you have it enabled, it will slightly alter the look of RLO-spoofed filenames by stripping the reversed file extension and the period, but the fake extension will still be there.

This reminds me of the original security problem that Microsoft caused by introducing this dumb feature. Back in the early 2000s, hackers quickly realized that they could name malware applications thusly: “the-sound-of-music.mp3.exe” and Microsoft’s “wonderful” new feature would trim the file to read: “the-sound-of-music.mp3”.
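
The double-extension trick is easy to show in code. Here is a small Python sketch, assuming that hiding extensions simply drops the final extension from the displayed name. The two extension lists are illustrative and far from complete, and both function names are hypothetical.

```python
import os

# Illustrative, incomplete lists; extend them for real use.
EXECUTABLE_EXTS = {".exe", ".scr", ".com", ".bat", ".cmd", ".vbs"}
DOCUMENT_EXTS = {".mp3", ".txt", ".jpg", ".pdf", ".doc", ".xls"}


def name_with_extension_hidden(filename):
    """Approximate the name shown when known extensions are hidden."""
    return os.path.splitext(filename)[0]


def is_double_extension(filename):
    """True if an executable extension follows a document-like one."""
    stem, last = os.path.splitext(filename.lower())
    inner = os.path.splitext(stem)[1]
    return last in EXECUTABLE_EXTS and inner in DOCUMENT_EXTS


if __name__ == "__main__":
    name = "the-sound-of-music.mp3.exe"
    print(name_with_extension_hidden(name))  # the-sound-of-music.mp3
    print(is_double_extension(name))         # True
```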

And if malware authors continue to count on Windows users hiding file extensions, it can only be “default” of Microsoft, which has long shipped Windows with extension-hiding turned on by default.

Microsoft: acting in securely, I mean active in security, going way back.

3 Comments
  1. This is fascinating stuff. And I don’t get to be a smug Linux user either. I’m going to have to play with this in Debian. It’s a pity I don’t have a Windows box available as well. Heh. Maybe it’s time to dig out a WinXP installer disc and have a play in the sandbox ☺

    ~xtian

    • I’m glad someone finally read it. 🙂 For follow up, I’ve been trying to find antivirus or anti-executable programs that allow users to filter files according to defined strings in filenames but no luck yet. I’d love to figure out how to do the same in the registry in Home versions of Windows and, of course, get an idea of what Linux and Mac users can do.

      • Heh. And *I’m* glad this comment showed up and didn’t get eaten by WordPress’s spam filter. Things haven’t gotten any more useable around here since my agnosis1975 blog was active, although my idea of useable maybe deviates from the norm.

        And something like what you describe would be very useful in Linux too. Despite Linus’s intentions, desktop Linux is an edge case. Security in server land or IoT country is a serious problem, and this looks like it could be a huge issue for a Linux based web service. I’m wondering about BSD now. I still haven’t got around to experimenting with it on my old Mac.
