当前位置: > 数据库 > Oracle >

Oracle中针对中文进行排序

时间:2014-12-11 23:17来源:linux.it.net.cn 作者:IT

在oracle 9i之前,对中文的排序,是默认按2进制编码来进行排序的. 9i时增加了几种新的选择:

  1. 按中文拼音进行排序:SCHINESE_PINYIN_M
  2. 按中文部首进行排序:SCHINESE_RADICAL_M
  3. 按中文笔画进行排序:SCHINESE_STROKE_M

而oracle 9i是对中文的排序是默认按拼音排序(并不是指NLS_SORT = SCHINESE_PINYIN_M,而是说SQL中不指定NLS_SORT时对中文列排序时默认按拼音)的,跟之前的2进制编码排序有所不同.具体用法如下:

  1. 直接写在sql中,例如:
    1. SELECT * FROM TEAM ORDER BY NLSSORT(排序字段名,'NLS_SORT = SCHINESE_PINYIN_M');
    2. SELECT * FROM TEAM ORDER BY NLSSORT(排序字段名,'NLS_SORT = SCHINESE_STROKE_M');
    3. SELECT * FROM TEAM ORDER BY NLSSORT(排序字段名,'NLS_SORT = SCHINESE_RADICAL_M');
  2. 配置在初始化参数NLS_SORT中,这可以在数据库创建时指定,也可以通过alter session来修改.如果是前者,则在所有session中生效.例如:
    1. 使用select * from NLS_SESSION_PARAMETERS;语句可以看到NLS_SORT的值.
    2. 更改配置文件:alter system set nls_sort='SCHINESE_PINYIN_M' scope=spfile;
    3. 更改session:alter SESSION set NLS_SORT = SCHINESE_PINYIN_M;

这里要额外注意一下性能问题,按oracle官方文档的解释,oracle在对中文列建立索引时,是按照2进制编码进行排序的,所以如果NLS_SORT被设置为BINARY时,排序则可以利用索引.如果不是2进制排序,而是使用上面介绍的3种针对中文的特殊排序,则oracle无法使用索引,会进行全表扫描.这点一定要注意,多用plsql工具比较一下执行效率.解决方法是,在此列上建立linguistic index.例如:CREATE INDEX nls_index ON my_table (NLSSORT(name, 'NLS_SORT = SCHINESE_PINYIN_M'));

以下是oracle文档中的原文:

Note:
Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys. Thus the optimizer can use an index to satisfy the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set to any linguistic sort, the optimizer must include a full table scan and a full sort in the execution plan.



(责任编辑:IT)
------分隔线----------------------------