Friday, January 30, 2009

Unicode Character Set

Before talking about this particular character set, we need to talk about what a character set is. A character set is a set of symbols used to represent a particular language or group of languages; more generally, it can be any collection of symbols one relies on to write a language.

Now that we know what a character set is, we can talk about the Unicode character set. ASCII encoded the common English character set using only the first 7 bits of a byte (code points 0 to 127), so people started using the code points from 128 to 255 without any common standard. In fact, some languages cannot be represented completely in a single byte at all, and a single byte was all that ASCII used at the time. Once the internet boomed and people started accessing one another's information, encoding became very important: the same code point was displayed as different characters on different computers, because each had its own character encoding, i.e. its own mapping of characters to code points. To solve this, Unicode defined a universal character set that contains almost all the writable characters in the world. And the best part is that a Unicode code point is just a number: it is not restricted to one byte, or to any fixed number of bytes, and an encoding such as UTF-8 or UTF-16 decides how it is stored.
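To make the points above concrete, here is a small Python sketch. The byte value 0xE9 and the Devanagari letter are just examples I picked for illustration; any byte above 127 and any non-ASCII character would show the same thing.

```python
# ASCII needs only 7 bits: every code point in plain English text is below 128.
assert all(ord(ch) < 128 for ch in "Hello")

# The same byte above 127 means different things in different legacy encodings.
raw = bytes([0xE9])
as_latin1 = raw.decode("latin-1")  # Western European: e with acute accent
as_cp1251 = raw.decode("cp1251")   # Windows Cyrillic: short i

# A Unicode code point is just a number; the encoding decides the bytes.
letter = "\u0905"                       # DEVANAGARI LETTER A
code_point = ord(letter)                # 0x905, far beyond one byte
utf8_bytes = letter.encode("utf-8")     # three bytes in UTF-8
utf16_bytes = letter.encode("utf-16-be")  # two bytes in UTF-16

print(as_latin1, as_cp1251, hex(code_point), utf8_bytes, utf16_bytes)
```

The code point stays the same in both cases; only the number of bytes used to store it changes with the encoding.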

Tuesday, January 27, 2009

Internationalization and Localization

Internationalization and localization are two buzzwords when it comes to web applications. People often get confused and think they are one and the same.

But the fact is that localization is, in some way, a prerequisite for internationalization.

Internationalization

From a web application perspective, internationalization means that input and output can be in any language. For example, say we have a comments section where we can write our thoughts in any language. When we then view all comments, our comment should be displayed exactly as we entered it.

For this to happen, we need to cater to all languages when taking input, and be able to process and store it. So every input has to be tagged with a proper character set. For Hindi text, for instance, we need a character set (a mapping of characters to numbers) that covers Hindi characters. We store the code points (the mapped numbers) and, when displaying, use the stored code points together with the character set.

That said, we either need to use a character set that covers the writable symbols of all languages, or provide a mechanism for supporting many character sets, tagging each string with the appropriate one. Luckily, we have the Unicode character set, which caters to the symbols of almost all languages.
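Here is a rough Python sketch of the comment round trip described above. The Hindi word and the choice of UTF-8 for storage are my assumptions for the example; a real application would store the bytes in a database rather than a variable.

```python
# A Hindi comment as the user typed it (the word "namaste" in Devanagari).
comment = "\u0928\u092e\u0938\u094d\u0924\u0947"

# Each character maps to a Unicode code point (a number).
code_points = [ord(ch) for ch in comment]

# Store the text as bytes, remembering which encoding was used.
stored = comment.encode("utf-8")

# Display: decoding with the same encoding gives back the original text.
shown = stored.decode("utf-8")

# Decoding with the wrong character set produces garbage instead.
garbled = stored.decode("latin-1")

print([hex(cp) for cp in code_points])
print(shown == comment, garbled == comment)
```

This is why each stored string needs to carry, or agree on, its character set: the bytes alone do not say how to read them.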

Localization

As said earlier, localization is a kind of prerequisite for internationalization. From a web application point of view, localization is nothing but presenting data with respect to a locale. A locale is a set of preferences such as language, time format, and currency; it can include anything that personalizes the application for an individual person.
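As a small illustration, a locale can be modelled as a bundle of preferences. The Locale class, its field names, and the two sample locales below are hypothetical and only meant to show the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Locale:
    """A hypothetical bundle of per-user presentation preferences."""
    language: str      # language code, e.g. "en"
    date_format: str   # strftime pattern used when showing dates
    currency: str      # currency code, e.g. "USD"


US = Locale(language="en", date_format="%m/%d/%Y", currency="USD")
FR = Locale(language="fr", date_format="%d/%m/%Y", currency="EUR")


def format_date(day, locale):
    """Render the same date the way the given locale expects it."""
    return day.strftime(locale.date_format)


posted = date(2009, 1, 27)
print(format_date(posted, US))
print(format_date(posted, FR))
```

The stored date is the same in both cases; only its presentation changes with the locale.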

For example, we might want to cater to people who don't know English. To do that, we need to present our data in a language they know. Say we want to say Hello to a French speaker: we would like to show Bonjour instead. To make this happen, we need a proper database holding translations of the things we show on our web page.
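A minimal sketch of such a translation lookup in Python follows. The dictionary stands in for the translations database, and the names TRANSLATIONS, DEFAULT_LANGUAGE and translate are made up for this example.

```python
# Hypothetical in-memory stand-in for a translations table:
# language code -> message key -> translated text.
TRANSLATIONS = {
    "en": {"greeting": "Hello"},
    "fr": {"greeting": "Bonjour"},
}
DEFAULT_LANGUAGE = "en"


def translate(key, language):
    """Return the message for key in language, falling back to English."""
    default_messages = TRANSLATIONS[DEFAULT_LANGUAGE]
    messages = TRANSLATIONS.get(language, default_messages)
    return messages.get(key, default_messages[key])


print(translate("greeting", "fr"))
print(translate("greeting", "de"))  # no German entry, falls back to English
```

Falling back to a default language is one possible design choice here; an application could just as well report the missing translation.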

Unicode and its various encodings might be the next step of internationalization from a developer's point of view.



Saturday, January 10, 2009

Enabling Rewrite Rules in Apache or XAMPP

I recently installed the XAMPP Apache server. After using it for a while, I wanted to add a rewrite rule, but it would not take effect because XAMPP does not load rewrite_module by default.



So to enable it, open your httpd.conf file and add the following line (or uncomment it if it is already there):
LoadModule rewrite_module modules/mod_rewrite.so
Then search for AllowOverride None and change it to AllowOverride All.
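If you prefer a script over grep, here is a rough Python sketch that checks a configuration text for both settings. The function name audit_httpd_conf and the sample configuration, including its directory path, are made up for illustration; point it at the contents of your own httpd.conf.

```python
import re


def audit_httpd_conf(text):
    """Return (rewrite module loaded?, line numbers with AllowOverride None)."""
    module_loaded = False
    override_none_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out directives have no effect
        if re.match(r"LoadModule\s+rewrite_module\b", stripped, re.IGNORECASE):
            module_loaded = True
        if re.match(r"AllowOverride\s+None\b", stripped, re.IGNORECASE):
            override_none_lines.append(number)
    return module_loaded, override_none_lines


# Hypothetical sample: module line commented out, overrides disabled.
sample = """\
#LoadModule rewrite_module modules/mod_rewrite.so
<Directory "C:/xampp/htdocs">
    AllowOverride None
</Directory>
"""

print(audit_httpd_conf(sample))  # (False, [3])
```

After making the two changes, the same check should report the module as loaded and an empty list of lines.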



Now, before your rewrite rule, add the following lines:
RewriteEngine on
Options +FollowSymLinks




Also, you might sometimes end up getting a 403 Forbidden error. To solve this, look in the configuration for the following lines:
Order deny,allow
allow from None

and change them to the following:
Order allow,deny
allow from all


Happy xampping