Databases Internationalization


DB2

Character Set/Data Type

Supported Character Sets

DB2 UDB supports UTF-8 for CHAR and VARCHAR, and UTF-16 for GRAPHIC and VARGRAPHIC. See Supported territory codes and code pages for the list of supported character sets.

Setup

The character set (code set) can be specified when the database is created. The resulting database configuration can be checked with the GET DATABASE CONFIGURATION command. See GET DATABASE CONFIGURATION Command for details.
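As a minimal sketch of these two steps, the following DB2 command line processor (CLP) script creates a Unicode database and then displays its configuration. The database name i18ndb and the script file name are hypothetical, and the US territory is only an example.

```sql
-- DB2 CLP script; one way to run it is: db2 -tvf create_i18ndb.sql
-- "i18ndb" is a hypothetical database name used only for illustration.

-- Create a Unicode database: code set and territory are fixed at creation time.
CREATE DATABASE i18ndb USING CODESET UTF-8 TERRITORY US;

-- Display the configuration; the code set, code page, and territory
-- are listed among the reported parameters.
GET DATABASE CONFIGURATION FOR i18ndb;
```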

Collation/Comparison

Supported Collations

DB2 supports the following collation sequences.
COMPATIBILITY
The DB2 Version 2 collating sequence. Some collation tables have been enhanced. This option specifies that the previous version of these tables is to be used.
IDENTITY
Identity collating sequence, in which strings are compared byte for byte.
IDENTITY_16BIT
CESU-8 (Compatibility Encoding Scheme for UTF-16: 8-Bit) collation sequence as specified by the Unicode Technical Report #26, which is available at the Unicode Consortium Web site (www.unicode.org). This option can only be specified when creating a Unicode database.
UCA400_NO
The UCA (Unicode Collation Algorithm) collation sequence that is based on the Unicode Standard version 4.00 with normalization implicitly set to on. Details of the UCA can be found in the Unicode Technical Standard #10, which is available at the Unicode Consortium Web site (www.unicode.org). This option can only be used when creating a Unicode database.
UCA400_LTH
The UCA (Unicode Collation Algorithm) collation sequence that is based on the Unicode Standard version 4.00 but will sort all Thai characters according to the Royal Thai Dictionary order. Details of the UCA can be found in the Unicode Technical Standard #10 available at the Unicode Consortium Web site (www.unicode.org). This option can only be used when creating a Unicode database. Note that this collator might order Thai data differently from the NLSCHAR collator option.
NLSCHAR
System-defined collating sequence using the unique collation rules for the specific code set/territory.
Note: This option can only be used with the Thai code page (CP874). If this option is specified in non-Thai environments, the command will fail and return the error SQL1083N with Reason Code.
SYSTEM
Collating sequence that is based on the database territory. This option cannot be specified when creating a Unicode database. This is the default.

Setup

The collation can be specified only when the database is created; it cannot be changed afterwards. See CREATE DATABASE Command.
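Because the collation is fixed at creation time, it is chosen with the COLLATE USING clause of CREATE DATABASE. The sketch below uses two of the Unicode-only options from the list of supported collations; the database names are hypothetical and the territories are only examples.

```sql
-- DB2 CLP commands; the database names are hypothetical.

-- Unicode database sorted with the UCA 4.00 collation.
CREATE DATABASE ucadb USING CODESET UTF-8 TERRITORY US COLLATE USING UCA400_NO;

-- Unicode database that sorts Thai characters in Royal Thai Dictionary order.
CREATE DATABASE thaidb USING CODESET UTF-8 TERRITORY TH COLLATE USING UCA400_LTH;
```

Choosing a different collation later would mean creating a new database and moving the data into it.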

Indexing

Indexes are built according to the database collation setting.

Calendar

Supported Calendars

DB2 seems to support only the Gregorian calendar.

Setup

n.a.

Timezone

Supported Timezones

Time zones should be handled in a different layer.

Setup

n.a.

Session Locale/Formatting

Session Locale

The territory, like the code page, has to be set when the database is created. The session language is determined by the client locale; for example, if the CLP (command line processor) is launched under locale en_US, the session language will be en_US.
TODO: It is unclear how the locale can be set for the session when the connection is established through JDBC.

Date/Time

Date/time formatting is determined by the territory code. The default format for each territory can be found at Date and time formats by territory code. It is also possible to specify the format explicitly with VARCHAR_FORMAT and TIMESTAMP_FORMAT. In addition, DB2 provides TIMESTAMP_ISO to handle the ISO 8601 format.
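The three functions can be tried from the CLP as sketched below. The literal values are made up for illustration. The pattern 'YYYY-MM-DD HH24:MI:SS' is used because older DB2 releases accept only this one format string; other patterns should be checked against the documentation for the release in use.

```sql
-- Format the current timestamp as text with an explicit pattern.
VALUES VARCHAR_FORMAT(CURRENT TIMESTAMP, 'YYYY-MM-DD HH24:MI:SS');

-- Parse text into a TIMESTAMP value using the same pattern.
VALUES TIMESTAMP_FORMAT('2007-01-15 10:30:00', 'YYYY-MM-DD HH24:MI:SS');

-- Build a TIMESTAMP from a DATE; the time elements are set to zero.
VALUES TIMESTAMP_ISO(CURRENT DATE);
```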

Number

Cannot find much information about number formatting; the behavior needs to be checked with DB2.

SQL Server

Character Set/Data Type

Supported Character Sets

SQL Server supports Unicode only with nchar, nvarchar, and ntext, which are encoded in UTF-16. char, varchar, and text support only a single code page: the one associated with their collation, which by default follows the locale of the OS where SQL Server runs.

Setup

Data of type nchar, nvarchar, or ntext is always UTF-16, so no additional setup is required for the database character set. If you need to check the character set of a database, use the sp_helpdb <dbname> stored procedure.
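A short sketch of that check follows, using tempdb only because it exists on every instance. DATABASEPROPERTYEX and COLLATIONPROPERTY are not mentioned above; they are added here as a more direct way to read the same information.

```sql
-- Report database properties; the status column includes the collation.
EXEC sp_helpdb 'tempdb';

-- Read the database collation directly.
SELECT DATABASEPROPERTYEX('tempdb', 'Collation') AS DatabaseCollation;

-- Look up the code page that a given collation uses for char/varchar/text data.
SELECT COLLATIONPROPERTY('Latin1_General_CI_AS', 'CodePage') AS CodePage;
```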

Collation/Comparison

Supported Collations

SQL Server 2005 provides two groups of collations: Windows collations and SQL collations. We will use Windows collations when linguistic sorting is required. For other cases, we should use binary order as it is today.
Windows Collations
Windows collations are collations defined for SQL Server to support Windows locales. For a list of these collations, see Collation Settings in Setup. By specifying a Windows collation for SQL Server, the instance of SQL Server uses the same code pages and sorting and comparison rules as an application that is running on a computer for which you have specified the associated Windows locale. For example, the French Windows collation for SQL Server matches the collation attributes of the French locale for Windows.
See also Windows Collation Sorting Styles
SQL Collations
SQL collations are a compatibility option to match the attributes of common combinations of code-page number and sort orders that have been specified in earlier versions of SQL Server. Many of these collations support suffixes for case, accent, kana, and width sensitivity, but not always. For more information, see Using SQL Collations.
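Both groups of collations can be listed from the server itself with the fn_helpcollations() function, as in the sketch below. The French filter is only an example.

```sql
-- List every available collation with its description.
SELECT name, description FROM fn_helpcollations();

-- Narrow the list to the French Windows collations.
SELECT name, description FROM fn_helpcollations() WHERE name LIKE 'French%';

-- SQL collations are the ones whose names start with SQL_.
SELECT name, description FROM fn_helpcollations() WHERE name LIKE 'SQL[_]%';
```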

Setup

Collation settings, which include character set, sort order, and other locale-specific settings, are fundamental to the structure and function of Microsoft SQL Server databases. Within your organization, you should develop a standard for collation settings, and apply these settings at the time that you install SQL Server. Many server-to-server activities can fail or yield inconsistent results if collation settings are not consistent across servers. Select a Microsoft Windows locale to match the collation settings in other instances of SQL Server 2005; or select SQL Collations to match settings with the sort orders in earlier versions of SQL Server.
SQL Server 2005 supports setting collations at the following levels of a SQL Server 2005 instance:
  • Server level
  • Database level
  • Column level
  • Expression level
For more information on rebuilding system databases to specify a new system collation, see How to: Install SQL Server 2005 from the Command Prompt.
If the collation is not set at any level, the default sort order is determined by the OS locale. See Collation Settings in Setup for details.
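The four levels can be illustrated with the following sketch. The database name I18nDemo and the table Customer are hypothetical, and the collations are arbitrary examples.

```sql
-- Server level: read the default collation chosen at installation.
SELECT SERVERPROPERTY('Collation') AS ServerCollation;
GO

-- Database level: "I18nDemo" is a hypothetical database name.
CREATE DATABASE I18nDemo COLLATE French_CI_AS;
GO

-- Column level: declared per column in the table definition.
USE I18nDemo;
GO
CREATE TABLE Customer (
  id int,
  name nvarchar(50) COLLATE Latin1_General_CS_AS
);
GO

-- Expression level: override the column collation for a single ORDER BY.
SELECT name FROM Customer ORDER BY name COLLATE Latin1_General_CI_AI;
GO
```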

Indexing

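When creating a database schema, a collation can be specified for each column as below:

    USE tempdb
    GO
    -- Each character column can carry its own collation.
    CREATE TABLE TestTab (
      id int,
      GreekCol nvarchar(10) COLLATE Greek_CI_AS,          -- case-insensitive, accent-sensitive
      LatinCol nvarchar(10) COLLATE Latin1_General_CS_AS  -- case-sensitive, accent-sensitive
    )
    GO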
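
As a hedged follow-up to the TestTab example, an index built on one of these columns is ordered by that column's collation. The index name and the search value below are hypothetical.

```sql
USE tempdb
GO
-- The index keys are sorted with the collation of GreekCol.
CREATE INDEX IX_TestTab_GreekCol ON TestTab (GreekCol)
GO
-- A comparison made under the column's own collation can use this index.
SELECT id FROM TestTab WHERE GreekCol = N'αβγ'
GO
```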

Calendar

Supported Calendars

SQL Server supports the Gregorian and Hijri calendars through the CAST and CONVERT functions. Other calendars, such as Japanese Imperial, ROC Official, and Thai Buddhist, are not supported, so handling them needs to be done in a different layer. See also CAST and CONVERT.

Setup

The calendar is specified only at the expression level.
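In practice, "expression level" means picking a CONVERT style code. The sketch below formats the current date in both calendars; style 113 is a Gregorian style, and styles 130 and 131 are the Hijri ones.

```sql
-- The same instant rendered in the Gregorian and Hijri calendars.
SELECT CONVERT(nvarchar(40), GETDATE(), 113) AS Gregorian,
       CONVERT(nvarchar(40), GETDATE(), 130) AS HijriLong,
       CONVERT(nvarchar(40), GETDATE(), 131) AS HijriShort;
```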

Timezone

Supported Timezones

Time zones seem to be handled in a different layer.

Setup

n.a.

Session Language/Formatting

Session Locale

SQL Server's locale is the same as the regional setting of the OS. The session language can also be set with SET LANGUAGE. See Session Language and sys.syslanguages (the list of supported languages) for details.
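A minimal sketch of switching the session language follows. French is only an example, and the last statement restores the usual default.

```sql
-- List the languages known to the server and their default date formats.
SELECT langid, name, alias, dateformat FROM sys.syslanguages;

-- Switch the session language; month and day names follow it.
SET LANGUAGE French;
SELECT @@LANGUAGE AS SessionLanguage,
       DATENAME(month, GETDATE()) AS MonthName;

-- Restore the usual default.
SET LANGUAGE us_english;
```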

Date/Time

There are two date types: datetime and smalldatetime. Values with the datetime data type are stored internally by the Microsoft SQL Server 2005 Database Engine as two 4-byte integers. The first 4 bytes store the number of days before or after the base date: January 1, 1900. The base date is the system reference date. The other 4 bytes store the time of day, represented as the number of 1/300-second clock ticks after midnight. As a result, datetime values are rounded to increments of .000, .003, or .007 seconds.
The smalldatetime data type stores dates and times of day with less precision than datetime. The Database Engine stores smalldatetime values as two 2-byte integers. The first 2 bytes store the number of days after January 1, 1900. The other 2 bytes store the number of minutes since midnight.
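The effect of the limited precision can be observed with the sketch below. The literal values are made up. They use the ISO 8601 form with a T separator so that they parse the same way regardless of the session language.

```sql
-- datetime: .999 is not representable, so the value rounds up to the next day.
SELECT CAST('2007-12-31T23:59:59.999' AS datetime) AS RoundedUp;

-- datetime: .995 rounds to the nearest representable fraction, .997.
SELECT CAST('2007-12-31T23:59:59.995' AS datetime) AS RoundedTo997;

-- smalldatetime: seconds are rounded to the nearest minute.
SELECT CAST('2007-12-31T23:59:29' AS smalldatetime) AS RoundedDown,
       CAST('2007-12-31T23:59:31' AS smalldatetime) AS RoundedUpToMidnight;
```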
The date format can be set at the session level with the SET DATEFORMAT statement. If it is not set, it is determined by the session language, which can be set with SET LANGUAGE.
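The sketch below shows both mechanisms on made-up date strings. British is used only as an example of a language whose default date format is dmy.

```sql
-- Interpret ambiguous date strings as day/month/year.
SET DATEFORMAT dmy;
SELECT CAST('31/12/2007' AS datetime) AS ParsedDmy;

-- Interpret them as month/day/year instead.
SET DATEFORMAT mdy;
SELECT CAST('12/31/2007' AS datetime) AS ParsedMdy;

-- Changing the session language also switches to that language's date format.
SET LANGUAGE British;
SELECT CAST('31/12/2007' AS datetime) AS ParsedBritish;

-- Restore the usual default.
SET LANGUAGE us_english;
```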

Number

Cannot find much information about number formatting; the behavior needs to be checked with SQL Server.
