FAQ: Why is Microsoft Word breaking my Boolean Strings?

This is a question that I received twice in the last week: once from a puzzled sourcer asking why something wasn’t working, and once from a keen-eyed sourcer asking whether this is a known issue. Why does Microsoft Word break my Boolean strings? Here’s why:

Microsoft Word is the world’s most used word processing programme, and it formats characters into your desired fonts, layout etc. To understand why a Boolean string that you have written perfectly and correctly still doesn’t work, you have to put your nerd cap on! Each character on your keyboard has a particular numbered code, e.g. A = 65, a = 97. Standard ASCII defines 128 of these codes (extended 8-bit character sets stretch that to 256), and any ASCII table will list them all. The more complex Unicode standard covers over 110,000 characters across around 100 scripts, used for languages like Mandarin, Arabic etc.
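If you would like to see these codes for yourself, here is a minimal sketch in Python (my choice of language for illustration only; nothing in this FAQ depends on it). The built-in ord() gives the numbered code of a character and chr() goes the other way. The helper name char_codes is made up for this example.

```python
def char_codes(text):
    """Return a list of (character, decimal code) pairs for the given text."""
    return [(ch, ord(ch)) for ch in text]

# Print the codes for an upper-case and a lower-case letter.
for ch, code in char_codes("Aa"):
    print(f"{ch} = {code}")

# chr() converts a numbered code back into its character.
print(chr(65), chr(97))
```

The point to take away is that A and a are two different numbers as far as the computer is concerned, even though we read them as the same letter.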

Consider what MS Word does to particular characters without you asking, like elongating a hyphen into a dash when it sits between two words as a long pause. That elongated hyphen (an en dash) has the code 2013 in hexadecimal (U+2013, or 8211 in decimal), but the normal hyphen (or minus sign) is 45 in decimal.

Inverted commas are the same: MS Word will transform them into the lovely 66’s and 99’s you’re used to seeing (their codes are U+201C and U+201D, or 8220 and 8221 in decimal), rather than the standard, unformatted " (whose code is 34).

All of these seemingly insignificant changes that MS Word makes on your behalf have a distinct impact on your search engine results, whatever search engine that might be. Think of it this way: the search engine reads the numbered code corresponding to each character you’ve searched for, not the letter itself. It can’t read letters, only numbers (if you’re old enough to remember your c.1980s computer days, it’s all about 0’s and 1’s!).
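To check which characters are really sitting in a string, a short Python sketch like the one below can help. It relies only on the standard unicodedata module; the function name describe and the sample characters are my own choices for illustration. The samples are written as escape codes so that no editor can quietly swap them for something else.

```python
import unicodedata

def describe(text):
    """Return (character, decimal code, code point, Unicode name) per character."""
    return [
        (ch, ord(ch), f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# Plain hyphen, en dash, straight quote, left curly quote, right curly quote.
samples = "-\u2013\"\u201c\u201d"
for ch, decimal, code_point, name in describe(samples):
    print(f"{ch}  {decimal:>5}  {code_point}  {name}")
```

Paste one of your own strings in place of samples and any character whose name is not HYPHEN-MINUS or QUOTATION MARK where you expected one is a likely culprit.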

Anyway, apart from that computer science lesson, YES: MS WORD WILL DESTROY YOUR STRINGS. This is exactly why we recommend (read: insist) that you write your strings in Notepad (or the Mac equivalent, TextEdit; make sure it’s in Plain Text mode though, not Rich Text), which is a plain-text editor rather than a word processor. Notepad/TextEdit will write the basic unformatted text that your search engine can read, so your results come back as expected.
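If a string has already been through Word, it can be rescued rather than retyped. The following is a rough Python sketch of that clean-up, not a complete tool: the replacement list is a partial one that I have assumed for illustration, and the names SMART_TO_PLAIN and plain_boolean are invented here.

```python
# Partial, assumed list of "smart" punctuation and its plain equivalent.
SMART_TO_PLAIN = {
    "\u201c": '"',  # left curly double quote (the 66)
    "\u201d": '"',  # right curly double quote (the 99)
    "\u2018": "'",  # left curly single quote
    "\u2019": "'",  # right curly single quote
    "\u2013": "-",  # en dash
    "\u2014": "-",  # the longer em dash
}
_TABLE = str.maketrans(SMART_TO_PLAIN)

def plain_boolean(string):
    """Swap Word-style punctuation for the plain characters a search engine expects."""
    return string.translate(_TABLE)

# A string as it might look after being pasted out of Word.
pasted = "(sales OR \u201cbusiness development\u201d OR \u201ckey account\u201d)"
print(plain_boolean(pasted))
```

Writing the string in Notepad or TextEdit in the first place is still the safer habit, since a clean-up like this only fixes the characters it knows about.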

Yeah, but why do I still get some results?

Consider this string:

(sales OR “business development” OR “key account”)

If your quotation marks are the wrong ones, i.e. the 66’s and 99’s, the search engine will disregard them. What your search is actually looking for is therefore sales OR business AND development OR key AND account.

So it’ll present you with results that have either the word sales or business, and the word development or key, and the word account. It won’t look for the word sales or the phrase "business development" or the phrase "key account". There may very well be pages on the web with all of those words on them, but not nearly as many as pages containing just one of the synonyms for the thing you’re looking for.
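The effect can be mimicked with a toy model in Python. To be clear, this is not how any real search engine is implemented; it simply assumes, as described above, that only the straight quote (code 34) groups words into a phrase and that every other punctuation mark is thrown away. The function name search_terms is invented for this sketch.

```python
import re

def search_terms(query):
    """Toy model: straight double quotes make a phrase, other punctuation is dropped."""
    terms = []
    # Match either a straight-quoted phrase or a single run of letters/digits.
    for phrase, word in re.findall(r'"([^"]+)"|(\w+)', query):
        term = phrase or word
        if term != "OR":  # skip the operator itself
            terms.append(term)
    return terms

straight = '(sales OR "business development" OR "key account")'
curly = "(sales OR \u201cbusiness development\u201d OR \u201ckey account\u201d)"

print(search_terms(straight))  # ['sales', 'business development', 'key account']
print(search_terms(curly))     # ['sales', 'business', 'development', 'key', 'account']
```

With straight quotes the model sees three terms, two of them phrases; with curly quotes it sees five loose words, which is why the results change.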

Conclusion: Always write your strings in Notepad!

