Creating/Extending a Culture

Have you ever used a culture and wanted to extend it, for example by changing the currency symbol? Have you ever wanted to create a custom culture for formatting purposes? The answers will vary depending on the developer and the users the application targets.

Over time, the .NET Framework has added support for more and more cultures. However, not all cultures of the world are available in .NET. If you want to use a culture that is not available, or you want to support a minority within a region, you will need to create your own custom culture.

Custom cultures and regions can be created with the class System.Globalization.CultureAndRegionInfoBuilder, which resides in the assembly sysglobl.dll (you will need to reference it, of course).

Now, we are going to create a custom culture that extends the U.S. English culture. This culture is for New York. We’ll go a step further and change some of the characteristics of the base culture (en-US) to make a good example.

We’ll start by referencing the assembly sysglobl.dll and adding a using statement for the namespace System.Globalization.

sysglobl.dll is a very small assembly. It contains only one class, CultureAndRegionInfoBuilder, and one enumeration, CultureAndRegionModifiers, which we will talk about shortly.

Next, we will create a new instance of the CultureAndRegionInfoBuilder class. That class is the key class for creating/extending a culture. Plus, it allows you to register/unregister your “baby.” You must register your custom culture on the system before you start using it.

CultureAndRegionInfoBuilder builder =
new CultureAndRegionInfoBuilder("en-US-NY",
CultureAndRegionModifiers.None);

The constructor of the CultureAndRegionInfoBuilder class requires two parameters. The first is the name of the new culture. The second is one of the three values defined by the CultureAndRegionModifiers enumeration; it identifies the kind of culture being created.

The values of CultureAndRegionModifiers are:

  • None:
    For new cultures and cultures that extend existing ones.
  • Neutral:
    For neutral cultures such as English, French, etc.
  • Replacement:
    For cultures that are intended to replace an existing culture in the .NET Framework or even in Windows. An example is a custom en-US that replaces the existing culture English (United States).

In the world of cultures, every culture has a unique name, and that name follows the naming convention xx-XX. The first part, lowercase xx, is the language, like en (English), fr (French), es (Spanish), and so on. The second part, uppercase XX, is the country/region, like US, FR, and ES. So de-DE, for example, is the unique name of German in Germany.

There are exceptions to these rules. First, some languages, like Dhivehi and Konkani, are abbreviated to three letters: div and kok. In addition, there are two neutral cultures, zh-CHS and zh-CHT, for Simplified and Traditional Chinese; those have three letters in the second part. Finally, some cultures have the suffix -Cyrl or -Latn to identify the script as Cyrillic or Latin. See RFC 1766.
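As a rough illustration of the naming convention, the sketch below asks CultureInfo for the language part and the parent of a few culture names. It assumes these standard cultures are available on the machine; the class and method names are only illustrative.

```csharp
using System;
using System.Globalization;

static class CultureNameDemo
{
    // Returns the language part of a culture name, e.g. "de" for "de-DE".
    public static string LanguageOf(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).TwoLetterISOLanguageName;
    }

    // Returns the name of the parent (neutral) culture, e.g. "de" for "de-DE".
    public static string ParentOf(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).Parent.Name;
    }

    static void Main()
    {
        foreach (string name in new[] { "en-US", "fr-FR", "de-DE" })
        {
            CultureInfo culture = CultureInfo.GetCultureInfo(name);
            Console.WriteLine("{0}: language={1}, parent={2}, neutral={3}",
                name, LanguageOf(name), ParentOf(name), culture.IsNeutralCulture);
        }
    }
}
```

A specific culture such as de-DE reports its neutral parent (de), which is the same relationship we set up later through the Parent property of the builder.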

After instantiating a new object of type CultureAndRegionInfoBuilder, you need to initialize its properties with valid values. For our example, we will load the values from the parent culture en-US because we intend to extend it. Actually, there are no rules you must follow when loading the values for the new culture: you can load them from an existing culture or you can set them yourself. In addition, we will set the Parent property to the parent culture that we wish to extend.

CultureInfo parent = new CultureInfo("en-US");
builder.LoadDataFromCultureInfo(parent);
builder.LoadDataFromRegionInfo(new RegionInfo("US"));
builder.Parent = parent;

You might want to review the properties of both CultureInfo and RegionInfo and compare them to the properties of CultureAndRegionInfoBuilder to see exactly why we load both objects.

Now comes the hardest part, changing the properties to accommodate our needs and setting the name of the new culture.

builder.RegionEnglishName = "New York";
builder.RegionNativeName = "New York";
builder.CultureEnglishName = "New York (United States)";
builder.CultureNativeName = "New York (United States)";
builder.NumberFormat.CurrencySymbol =  "*";

In the last few lines we changed the English and native names of the region to “New York”, and the English and native names of the culture to “New York (United States)”.

In addition, we changed the currency symbol from the dollar sign $ to an asterisk *.

Notice the difference between the English name and the native name. The native name is the name of the culture in the language that the culture is supposed to support. For example, the English name of French is “French” and its native name is “français”.
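To see the two kinds of names side by side on an existing culture, here is a minimal sketch that reads EnglishName and NativeName from CultureInfo. The helper names are hypothetical, and the exact native spelling printed depends on the culture data installed on the machine.

```csharp
using System;
using System.Globalization;

static class CultureDisplayNames
{
    // Name of the culture written in English.
    public static string EnglishNameOf(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).EnglishName;
    }

    // Name of the culture written in its own language.
    public static string NativeNameOf(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).NativeName;
    }

    static void Main()
    {
        Console.WriteLine(EnglishNameOf("fr-FR"));
        Console.WriteLine(NativeNameOf("fr-FR"));
    }
}
```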

Here comes the hottest part, registering your new culture:

builder.Register();

You might get an exception of type InvalidOperationException if you try to re-register the culture, or if you choose a name that already exists and you are not creating a replacement culture.

Now, test your new culture:

CultureInfo newYork = new CultureInfo("en-US-NY");
double money = 100.99;
Console.WriteLine(money.ToString("C", newYork));
// "C" means currency formatting

Congratulations! You created a new culture and you can see it in action.
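If you only want to preview the effect of the currency change without registering anything, one alternative (not the approach used in this article) is to clone the NumberFormatInfo of en-US and change the symbol on the copy. This is a sketch under that assumption; the class and method names are illustrative.

```csharp
using System;
using System.Globalization;

static class CurrencyPreview
{
    // Formats an amount with en-US currency rules but a replacement symbol.
    public static string Format(decimal amount, string symbol)
    {
        // Clone() returns a writable copy; the cached en-US culture is read-only.
        NumberFormatInfo format =
            (NumberFormatInfo)CultureInfo.GetCultureInfo("en-US").NumberFormat.Clone();
        format.CurrencySymbol = symbol;
        return amount.ToString("C", format);
    }

    static void Main()
    {
        Console.WriteLine(Format(100.99m, "*")); // prints *100.99
    }
}
```

Unlike a registered custom culture, the cloned format is visible only to the code that holds it, so nothing else on the system is affected.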

There are some things that you need to take into account:

  1. You can unregister your custom culture by using the static method Unregister of the CultureAndRegionInfoBuilder class.
    CultureAndRegionInfoBuilder.Unregister("en-US-NY");
  2. If you don’t want to register the culture immediately after creating it, you can save it as XML and load it later.
    // Saving the culture
    builder.Save(@"D:\New York Culture.xml");
    // Loading the saved culture
    CultureAndRegionInfoBuilder.CreateFromLdml
    (@"D:\New York Culture.xml");
  3. Windows saves the custom cultures you create in the folder %windir%\Globalization.
  4. You are not limited to using the custom culture in your .NET code. You can go to Regional and Language Settings in Control Panel and change the culture to the one you created, and that affects the entire system.


Working with Strings with Combining Characters


Contents

Contents of this article:

  • Contents
  • Introduction
  • Writing Arabic Diacritics
  • Using the Character Map Application
  • Enumerating a String with Only Base Characters
  • Enumerating a String with Combining Characters
  • Comparing Strings
  • Try it out!

Introduction

In some languages, like Arabic and Hebrew, you attach combining characters to some letters based on the pronunciation of the word.

Combining characters are characters (like diacritics) that are combined with base characters to indicate the pronunciation of the word (this is sometimes called vocalization).

Some examples of combining characters are diacritics:

Case                            | Base Character             | Combining Character(s)                           | Result
1. Combining a single character | Arabic Letter Teh (0x062A) | Arabic Damma (0x064F)                            | Letter Teh + Damma
2. Combining two characters     | Arabic Letter Teh (0x062A) | Arabic Shadda (0x0651), Arabic Fathatan (0x064B) | Letter Teh + Shadda + Fathatan

When you combine a base character with one combining character, the string holds two characters that are displayed as one. When you combine two combining characters with a base one, you end up with three characters displayed as one, and so on.
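The difference between stored characters and displayed characters can be checked in code. The following sketch compares String.Length with StringInfo.LengthInTextElements for the two combinations from the table; the class and method names are illustrative.

```csharp
using System;
using System.Globalization;

static class CombiningCount
{
    // Number of char values stored in the string.
    public static int CharCount(string s)
    {
        return s.Length;
    }

    // Number of text elements (a base character plus its combining characters).
    public static int ElementCount(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    static void Main()
    {
        string tehDamma = "\u062A\u064F";                // Teh + Damma
        string tehShaddaFathatan = "\u062A\u0651\u064B"; // Teh + Shadda + Fathatan

        // prints "2 chars, 1 text element(s)"
        Console.WriteLine("{0} chars, {1} text element(s)",
            CharCount(tehDamma), ElementCount(tehDamma));
        // prints "3 chars, 1 text element(s)"
        Console.WriteLine("{0} chars, {1} text element(s)",
            CharCount(tehShaddaFathatan), ElementCount(tehShaddaFathatan));
    }
}
```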

Writing Arabic Diacritics

The following table summarizes the Arabic diacritics and the keyboard shortcut for each character:

Unicode Representation | Name     | Shortcut
0x064B                 | Fathatan | Shift + W
0x064C                 | Dammatan | Shift + R
0x064D                 | Kasratan | Shift + S
0x064E                 | Fatha    | Shift + Q
0x064F                 | Damma    | Shift + E
0x0650                 | Kasra    | Shift + A
0x0651                 | Shadda   | Shift + ~
0x0652                 | Sukun    | Shift + X
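To confirm that .NET treats every character in this table as a combining character, the sketch below prints the Unicode category of each code point from 0x064B to 0x0652. The class and method names are illustrative.

```csharp
using System;
using System.Globalization;

static class DiacriticCategories
{
    // True when the character is a nonspacing (combining) mark.
    public static bool IsNonSpacingMark(char c)
    {
        return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark;
    }

    static void Main()
    {
        // U+064B (Fathatan) through U+0652 (Sukun), as listed in the table.
        for (char c = '\u064B'; c <= '\u0652'; c++)
        {
            Console.WriteLine("U+{0:X4}  {1}",
                (int)c, CharUnicodeInfo.GetUnicodeCategory(c));
        }
    }
}
```

Each line reports NonSpacingMark, whereas a base letter such as Teh (0x062A) is reported as a letter.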

Using the Character Map Application

Microsoft Windows comes with an application that helps you browse the characters that a font supports. This application is called Character Map.

You can open it by typing charmap.exe into Run, or by going to Start -> Programs -> Accessories -> System Tools -> Character Map.

Character Map application

Enumerating a String with Base Characters

Now we are going to try an example. This example uses a simple word, مُـحمَّـد (Muhammad, also written Mohammad; the name of the prophet of Islam).

With its diacritics, this word consists of 9 characters, in the following order:

  1. Meem
  2. Damma (a combining character combined with the previous Meem)
  3. Kashida
  4. Hah
  5. Meem
  6. Shadda (a combining character)
  7. Fatha (a combining character; both the Shadda and the Fatha are combined with the preceding Meem)
  8. Kashida
  9. Dal

After the combining characters are attached to their bases, we end up with 6 characters, in the following order:

  1. Meem (with a Damma above)
  2. Kashida
  3. Hah
  4. Meem (with a Shadda and a Fatha above)
  5. Kashida
  6. Dal

The following code simply enumerates the string and displays a message box with each character along with its index:

// C#

string name = "مُـحمَّـد";
string result = String.Empty;

// Append each char with its index, one per line
for (int i = 0; i < name.Length; i++)
    result += String.Format("{0}\t{1}\n", i, name[i]);

MessageBox.Show(result);

' VB.NET

Dim name As String = "مُـحمَّـد"
Dim result As String = String.Empty

For i As Integer = 0 To name.Length - 1
result &= String.Format("{0}{1}{2}{3}", i, vbTab, name(i), vbNewLine)
Next

MessageBox.Show(result)

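What do we get? Nine entries. When enumerating the string this way, every char is treated as a stand-alone character, so each combining character is listed separately from its base.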

Enumerating a String with Combining Characters

The .NET Framework provides a way to enumerate strings together with their combining characters: the TextElementEnumerator and StringInfo types (both reside in the namespace System.Globalization).

The following code demonstrates how you can enumerate a string along with its combining characters:

// C#

string name = "مُـحمَّـد";
string result = String.Empty;

TextElementEnumerator enumerator =
StringInfo.GetTextElementEnumerator(name);

while (enumerator.MoveNext())
    result += String.Format("{0}\t{1}\n",
        enumerator.ElementIndex, enumerator.Current);

MessageBox.Show(result);
' VB.NET

Dim name As String = "مُـحمَّـد"
Dim result As String = String.Empty

Dim enumerator As TextElementEnumerator = _
StringInfo.GetTextElementEnumerator(name)

While enumerator.MoveNext()
result &= String.Format("{0}{1}{2}{3}", _
enumerator.ElementIndex, vbTab, _
enumerator.Current, vbNewLine)
End While

MessageBox.Show(result)
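If only the positions are needed, StringInfo.ParseCombiningCharacters returns the starting index of every text element. Below is a minimal console sketch, assuming the word is spelled exactly as in the nine-character list above (it is written with escapes here so the sequence is explicit); the class and member names are illustrative.

```csharp
using System;
using System.Globalization;

static class ElementIndexes
{
    // Meem, Damma, Kashida, Hah, Meem, Shadda, Fatha, Kashida, Dal
    public const string Name =
        "\u0645\u064F\u0640\u062D\u0645\u0651\u064E\u0640\u062F";

    // Returns the index at which each text element starts.
    public static int[] Starts(string s)
    {
        return StringInfo.ParseCombiningCharacters(s);
    }

    static void Main()
    {
        Console.WriteLine(Name.Length);                     // prints 9
        Console.WriteLine(string.Join(", ", Starts(Name))); // prints 0, 2, 3, 4, 7, 8
    }
}
```

The six indexes match the six combined characters listed earlier: the gaps after 0 and 4 are where the Damma and the Shadda plus Fatha are folded into the preceding Meem.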

Comparing Strings

Sometimes you will need to compare two strings that are identical except for their diacritics (combining characters). If you compare them the common way (using String.Compare, for instance), they are reported as different because of the combining characters.

To overcome this, you will need to use a special overload of the String.Compare method, the one used in the third check below.

The Kashida is not a letter of the Arabic alphabet; it behaves more like a space. So the option CompareOptions.IgnoreSymbols excludes it from the comparison.

// C#

string name1 = "محمد";
string name2 = "مُـحمَّـد";

// 1st check
if (name1 == name2)
MessageBox.Show("Strings are identical");
else
MessageBox.Show("Strings are different!");

// 2nd check
if (String.Compare(name1, name2) == 0)
MessageBox.Show("Strings are identical");
else
MessageBox.Show("Strings are different!");

// 3rd check
if (String.Compare(name1, name2,
System.Threading.Thread.CurrentThread.CurrentCulture,
CompareOptions.IgnoreSymbols) == 0)
MessageBox.Show("Strings are identical");
else
MessageBox.Show("Strings are different!");
' VB.NET

Dim name1 As String = "محمد"
Dim name2 As String = "مُـحمَّـد"

' 1st check
If (name1 = name2) Then
MessageBox.Show("Strings are identical")
Else
MessageBox.Show("Strings are different!")
End If

' 2nd check
If (String.Compare(name1, name2) = 0) Then
MessageBox.Show("Strings are identical")
Else
MessageBox.Show("Strings are different!")
End If

' 3rd check
If (String.Compare(name1, name2, _
System.Threading.Thread.CurrentThread.CurrentCulture, _
CompareOptions.IgnoreSymbols) = 0) Then
MessageBox.Show("Strings are identical")
Else
MessageBox.Show("Strings are different!")
End If
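When the comparison must not depend on the current culture, one alternative (a different technique from the String.Compare overload above) is to strip the nonspacing marks and the Kashida yourself and then compare ordinally. This is only a sketch: the helper names are hypothetical, and it assumes the diacritics are stored as separate combining characters, as in the examples in this article.

```csharp
using System;
using System.Globalization;
using System.Text;

static class DiacriticInsensitive
{
    const char Kashida = '\u0640'; // Arabic Tatweel

    // Removes nonspacing marks (diacritics) and the Kashida from a string.
    public static string Strip(string s)
    {
        var builder = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            bool isMark =
                CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark;
            if (!isMark && c != Kashida)
                builder.Append(c);
        }
        return builder.ToString();
    }

    // Ordinal comparison of the stripped strings.
    public static bool EqualsIgnoringDiacritics(string a, string b)
    {
        return string.Equals(Strip(a), Strip(b), StringComparison.Ordinal);
    }

    static void Main()
    {
        string name1 = "\u0645\u062D\u0645\u062F"; // the word without diacritics
        string name2 = "\u0645\u064F\u0640\u062D\u0645\u0651\u064E\u0640\u062F";

        Console.WriteLine(name1 == name2);                         // prints False
        Console.WriteLine(EqualsIgnoringDiacritics(name1, name2)); // prints True
    }
}
```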