Tuesday, January 27, 2009

Internationalization and Localization

Internationalization and localization are two buzzwords that come up whenever we talk about web applications. People often confuse them, thinking the two are one and the same.

In fact they are different things, although localization is in some ways a prerequisite for internationalization.

Internationalization

From a web application perspective, internationalization means that input and output can be in any language. For example, say we have a comments section where users can write their thoughts in any language. When we view all the comments, each one should be displayed exactly as it was entered.

For this to happen, we need to accept input in any language, process it, and store it. Every input therefore has to be tagged with the proper character set. A character set is a mapping of characters to numbers, called code points. For Hindi text, for example, we need a character set that covers Hindi characters. We store the code points of those characters, and when displaying the text we use the stored code points together with the character set to get the characters back.
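As a minimal sketch of that mapping, the Python snippet below looks up the code points of a Hindi word and stores and restores it. The sample word and the choice of UTF-8 as the storage encoding are assumptions made for illustration only.

```python
# A Hindi word written in Devanagari script (sample text, chosen for illustration).
text = "नमस्ते"

# Each character maps to a number (its code point) in the character set.
code_points = [ord(ch) for ch in text]

# To store the text we turn the code points into bytes using an encoding.
# UTF-8 is assumed here; any encoding that covers the characters would do.
stored = text.encode("utf-8")

# Decoding with the same encoding gives back exactly what was entered.
restored = stored.decode("utf-8")

print([hex(cp) for cp in code_points])
print(restored == text)
```

The important point is that the same character set and encoding are used when storing and when displaying; otherwise the comment would not come back the way it was typed.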

That said, we either need to use one character set that covers the written symbols of every language, or provide a mechanism for supporting many character sets, tagging each string with the appropriate one. Luckily, we have the Unicode character set, which covers the symbols of virtually all written languages.
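The two options can be compared with a small sketch. The `store` and `load` helpers are hypothetical names invented for this example, and Latin-1 simply stands in for any character set that covers only some languages.

```python
def store(text, charset):
    """Option 1: tag every string with the character set it was encoded in."""
    return charset, text.encode(charset)


def load(tagged):
    """Use the stored tag to turn the bytes back into text."""
    charset, data = tagged
    return data.decode(charset)


# Latin-1 covers French accents, so this string can be stored with that tag.
french = store("café", "latin-1")

# Latin-1 has no Hindi characters, so the same tag cannot be used for Hindi.
try:
    store("नमस्ते", "latin-1")
    hindi_fits_latin1 = True
except UnicodeEncodeError:
    hindi_fits_latin1 = False

# Option 2: one Unicode encoding (UTF-8 here) handles all comments, no tags needed.
comments = ["café", "नमस्ते", "こんにちは"]
round_tripped = [c.encode("utf-8").decode("utf-8") for c in comments]

print(load(french), hindi_fits_latin1, round_tripped == comments)
```

With per-string tagging, every place that reads a string must also read its tag. With Unicode, the whole application can agree on a single encoding.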

Localization

As said earlier, localization is a kind of prerequisite for internationalization. From a web application point of view, localization means presenting data according to a locale. A locale is a set of preferences such as language, time format, and currency; it can include anything that tailors the application to an individual user.
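To make the idea of a locale as a set of preferences concrete, here is a rough sketch. The `Locale` class, its fields, and the two sample locales are hypothetical and heavily simplified; a real application would more likely rely on an existing locale library than on hand-written rules like these.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Locale:
    """A hypothetical, simplified bundle of per-user preferences."""
    language: str
    time_format: str        # strftime pattern for dates and times
    currency_symbol: str
    symbol_first: bool      # True: "$9.50", False: "9,50 €"
    decimal_separator: str


# Two sample locales, assumed for illustration.
EN_US = Locale("en", "%m/%d/%Y %H:%M", "$", True, ".")
FR_FR = Locale("fr", "%d/%m/%Y %H:%M", "€", False, ",")


def format_time(when, loc):
    """Render a timestamp the way this locale prefers."""
    return when.strftime(loc.time_format)


def format_price(amount, loc):
    """Render an amount with the locale's symbol and decimal separator."""
    number = f"{amount:.2f}".replace(".", loc.decimal_separator)
    if loc.symbol_first:
        return f"{loc.currency_symbol}{number}"
    return f"{number} {loc.currency_symbol}"


posted = datetime(2009, 1, 27, 14, 30)
print(format_time(posted, EN_US), format_price(9.5, EN_US))
print(format_time(posted, FR_FR), format_price(9.5, FR_FR))
```

The data (a timestamp and an amount) is the same in both cases; only its presentation changes with the locale.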

For example, we might want to cater to people who don't know English. To do that, we need to present our data in a language they know. Say we want to say Hello to a French speaker: we would like to show him Bonjour. To make this happen, we need a proper database that holds translations of everything we show on our web page.
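A translation lookup of this kind might be sketched as follows. An in-memory dictionary stands in for the database, and the `translate` function, the `greeting` key, and the fallback to English are all assumptions made for the example.

```python
# Stand-in for the translations database: language -> message key -> text.
TRANSLATIONS = {
    "en": {"greeting": "Hello"},
    "fr": {"greeting": "Bonjour"},
}

# Assumed fallback language when no translation is available.
DEFAULT_LANGUAGE = "en"


def translate(key, language):
    """Return the text for `key` in `language`, falling back to the default."""
    table = TRANSLATIONS.get(language, {})
    if key in table:
        return table[key]
    # No translation for this language: use the default language if possible,
    # otherwise show the key itself so the gap is visible on the page.
    return TRANSLATIONS[DEFAULT_LANGUAGE].get(key, key)


print(translate("greeting", "fr"))
print(translate("greeting", "de"))
```

The page templates then refer only to message keys such as `greeting`, and the language of the visitor's locale decides which text is shown.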

From a developer's point of view, the next step in internationalization would be to look at Unicode and its various encodings.


