Apache and charset

Today, while viewing statistics for my blog, I noticed one interesting search query: "Apache russian charset". Although I don't think I have ever covered this issue on my blog, I did run into this problem some time ago.

The idea is that some sites do not have a meta tag telling the browser which charset to use. This gets really painful with Russian, since there are several (I would even say many) charsets for Russian. As far as I know, when the browser does not see a meta tag for the charset, it relies on the charset in the Content-Type header sent by Apache (in fact, that header generally takes precedence even when a meta tag is present), and that charset is not correct in some cases either. For instance, if my server defaults to utf-8 and one of the virtual hosts (which I gave to my friend) uses cp-1251 (known to Apache and browsers as windows-1251), then there is trouble, unless the browser is intelligent enough to guess the encoding from the actual page content. In this case, you can override the default charset for each virtual host by putting the AddDefaultCharset directive inside the VirtualHost section, or by writing the same directive in the .htaccess file in the virtual host's document root (for this to work, the AllowOverride directive must permit it, which means it has to include FileInfo).
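The per-host override described above might look roughly like this. It is only a sketch: the host name and paths are placeholders I made up, and your existing VirtualHost section will contain other directives as well.

```apache
# Hypothetical virtual host; ServerName and DocumentRoot are placeholders.
<VirtualHost *:80>
    ServerName friend.example.com
    DocumentRoot /var/www/friend

    # Adds "charset=windows-1251" to the Content-Type header
    # of text/html and text/plain responses from this host.
    AddDefaultCharset windows-1251

    <Directory /var/www/friend>
        # Only needed if the charset is set from .htaccess instead;
        # FileInfo is the override class that covers AddDefaultCharset.
        AllowOverride FileInfo
    </Directory>
</VirtualHost>
```

If you go the .htaccess route, the file in the document root needs just the single line `AddDefaultCharset windows-1251`. Setting it in the VirtualHost section is usually the simpler choice when you control the server configuration, since it does not depend on AllowOverride at all.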

I had this problem when I gave a virtual host to my friend: he was composing his HTML files on his home workstation (which saved files in cp-1251 by default) and then uploading them over FTP.