SCWS Simple Word Segmentation Functions
$Id: php_scws_manual.txt,v 1.6 2010/01/29 06:48:55 hightman Exp $
This manual was written by hightman on 2007.06.07
Web page: http://www.ftphp.com/scws
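==Introduction==
SCWS is a simple word segmentation engine. It splits an input text string according to the configured options and returns each word in array form. It was written for Chinese and supports the gbk and utf-8 character sets; with suitable changes to the dictionary it can also segment other multi-byte languages (such as Japanese and Korean). Besides segmentation, it offers a simple keyword statistics feature that ranks words with a simple built-in algorithm. The extension binds the libscws code directly.
Note: for more information, visit http://www.ftphp.com/scws
==Requirements==
This extension requires scws-1.x.x.
==Installation==
This is a PHP extension that must be downloaded and compiled separately. Only source code is available at present, and it has only been test-compiled under php4 on Unix-family platforms. After downloading, build it directly with ./configure --enable-scws. After installation, add the corresponding lines to php.ini. The extension line is required and the scws.default.* lines are optional (marked in red and gray, respectively, in the original manual):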
[scws]
extension = scws.so
scws.default.charset = gbk
scws.default.fpath = /usr/local/etc/scws
==Runtime configuration==
scws.default.charset
scws.default.fpath (default = NULL) , Changeable = PHP_INI_ALL
For further details and definitions of the PHP_INI_* constants, see the PHP manual.
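As a rough sketch of how these two settings could be read from a script: the helper name scws_default_settings below is hypothetical and not part of the extension. It relies only on PHP's standard ini_get(), which returns false when a setting is not registered (for example when the extension is not loaded). The gbk fallback mirrors the default charset stated under scws_set_charset, and the null fallback mirrors the NULL default listed for scws.default.fpath.

```php
<?php
// Hypothetical helper (not part of the scws extension): read the two
// runtime settings, falling back to the documented defaults.
function scws_default_settings(): array
{
    $charset = ini_get('scws.default.charset');
    $fpath   = ini_get('scws.default.fpath');

    return [
        // ini_get() yields false for an unregistered setting.
        'charset' => ($charset === false || $charset === '') ? 'gbk' : $charset,
        'fpath'   => ($fpath === false || $fpath === '') ? null : $fpath,
    ];
}

print_r(scws_default_settings());
```

Since scws.default.fpath is listed as PHP_INI_ALL, it should also be possible to change it at run time with ini_set('scws.default.fpath', ...) before sending any text.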
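==Resource types==
This extension defines one resource type: an scws pointer, which points to the scws object being operated on.
==Predefined constants==
This extension module does not define any constants.
==Predefined classes==
This is a built-in pseudo-class similar to Directory. To create an instance, use the scws_new() function rather than calling new SimpledCWS directly;
otherwise the object will not contain the handle pointer and cannot operate correctly. Its methods are: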
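class SimpledCWS {
  (methods correspond to the handle-taking scws_* functions listed below, without the leading scws_handle argument)
};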
Example 1. Word segmentation using class methods
<?php
$so = scws_new();
$so->set_charset('gbk');
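// set_dict and set_rule are not called here, so the system automatically
// tries the dictionary and rule files under the path specified in the ini.
// Sample text: "I am Chinese, I know the C++ language, and I also have many T-shirts"
// (this script must be saved in the same charset as the one set above).
$so->send_text("我是一个中国人,我会C++语言,我也有很多T恤衣服");
// get_result returns one batch of words per call; loop until it returns false.
while ($tmp = $so->get_result())
{
  print_r($tmp);
}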
$so->close();
?>
Example 2. Extracting high-frequency words using functions
<?php
$sh = scws_open();
scws_set_charset($sh, 'gbk');
scws_set_dict($sh, '/path/to/dict.xdb');
scws_set_rule($sh, '/path/to/rules.ini');
$text = "我是一个中国人,我会C++语言,我也有很多T恤衣服";
scws_send_text($sh, $text);
$top = scws_get_tops($sh, 5);
print_r($top);
?>
Note:
For convenience, if no dictionary or rule set has been loaded when the send_text method or the scws_send_text function is called, the system automatically looks in scws.default.fpath (an ini setting) for the dictionary matching the current charset. Dictionary and rule files are named dict[.charset].xdb and rules[.charset].ini. When the charset is gbk the bracketed part is omitted: use dict.xdb and rules.ini, not dict.gbk.xdb.
In addition, the input text, the dictionary, and the rule file must all use the same charset. If that charset is not the default gbk, call set_charset or scws_set_charset to set it; otherwise unexpected errors may occur.
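The naming rule can be sketched in plain PHP. The helper scws_default_files is hypothetical and only illustrates the dict[.charset].xdb / rules[.charset].ini pattern; it is not how the extension itself resolves files. The 'utf8' token used as the charset part of the file name in the second call is an assumption.

```php
<?php
// Hypothetical helper: build the default dictionary and rule file paths
// for a charset, following the naming rule described in the note.
function scws_default_files(string $fpath, string $charset = 'gbk'): array
{
    // gbk is the default charset, so its files carry no charset suffix.
    $cs     = strtolower($charset);
    $suffix = ($cs === 'gbk') ? '' : '.' . $cs;
    $dir    = rtrim($fpath, '/');

    return [
        'dict' => $dir . '/dict' . $suffix . '.xdb',
        'rule' => $dir . '/rules' . $suffix . '.ini',
    ];
}

print_r(scws_default_files('/usr/local/etc/scws'));          // dict.xdb, rules.ini
print_r(scws_default_files('/usr/local/etc/scws', 'utf8'));  // dict.utf8.xdb, rules.utf8.ini
```

The returned paths could then be passed to scws_set_dict and scws_set_rule when explicit loading is preferred over the automatic lookup.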
==Function list==
mixed scws_new(void)
Description: Creates and returns a SimpledCWS class object.
Parameters: none
Returns: the class object handle on success, false on failure
mixed scws_open(void)
Description: Creates and returns a word segmentation handle.
Parameters: none
Returns: an scws handle on success, false on failure
bool scws_close(resource scws_handle)
Description: Closes an open scws segmentation handle.
Returns: always true
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_set_charset(resource scws_handle, string charset)
Description: Sets the charset of the dictionary, the rule set, and the text to be segmented. The system default is gbk.
Returns: always true
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_add_dict(resource scws_handle, string dict_path [, int mode])
Description: Adds a dictionary for segmentation; the most recently added dictionary is searched first.
Returns: true on success, false on failure
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_set_dict(resource scws_handle, string dict_path [, int mode])
Description: Sets the dictionary used for segmentation and clears the existing dictionary list.
Returns: true on success, false on failure
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_set_rule(resource scws_handle, string rule_path)
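Description: Sets the new-word recognition rule set used for segmentation (for recognizing personal names, place names, numbers, times, dates, and the like).
Returns: true on success, false on failure
Parameters: scws_handle is the handle previously returned by scws_open.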
bool scws_set_ignore(resource scws_handle, bool yes)
Description: Sets whether certain special punctuation marks and similar symbols are removed from the segmentation results.
Returns: always true
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_set_multi(resource scws_handle, int mode)
Description: Sets whether the segmentation results use compound splitting; for example, "中国人" is returned as three words: "中国" + "人" + "中国人".
Returns: always true
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_set_duality(resource scws_handle, bool yes)
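Description: Sets whether stray single characters are automatically combined using two-character (bigram) segmentation.
Returns: always true
Parameters: scws_handle is the handle previously returned by scws_open.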
bool scws_send_text(resource scws_handle, string text)
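Description: Sends the text to be segmented.
Returns: true on success, false on failure
Parameters: scws_handle is the handle previously returned by scws_open.
Note 1: Internally the system only adds a reference to the text, so no memory is wasted however long the text is.
Note 2: When this function runs, if no dictionary or rule set has been loaded, it automatically tries to find them in the default directory specified in the ini.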
mixed scws_get_result(resource scws_handle)
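Description: Returns a batch of segmented words from the text set by send_text.
Returns: an array of segmented words on success, or false when there are no more words.
Parameters: scws_handle is the handle previously returned by scws_open.
Note 1: After each send_text, call this function in a loop until it returns false, because the number of words returned per call is not fixed.
Note 2: Each returned word entry contains the keys: word (string, the word itself), idf (float, inverse document frequency), off (long, position in the text), attr (string, part-of-speech tag).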
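The loop-until-false pattern from Note 1 can be sketched as a small helper. scws_collect is a hypothetical name, not part of the extension, and the stub closure below only stands in for scws_get_result so the sketch runs without the extension. The idf, off, and attr values in the sample batches are illustrative placeholders, not real segmenter output.

```php
<?php
// Hypothetical helper: call $next repeatedly until it returns false and
// merge every returned batch into one flat list of word entries.
function scws_collect(callable $next): array
{
    $words = [];
    while (($batch = $next()) !== false) {
        foreach ($batch as $entry) {
            $words[] = $entry;
        }
    }
    return $words;
}

// Stand-in for scws_get_result(): yields two batches, then false.
$batches = [
    [['word' => 'C++', 'idf' => 0.0, 'off' => 0, 'attr' => 'en']],
    [['word' => 'PHP', 'idf' => 0.0, 'off' => 4, 'attr' => 'en']],
];
$stub = function () use (&$batches) {
    return $batches ? array_shift($batches) : false;
};

$all = scws_collect($stub);
echo implode(' ', array_column($all, 'word')), "\n";
```

With a real handle, the callable would presumably be function () use ($sh) { return scws_get_result($sh); }.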
mixed scws_get_tops(resource scws_handle [, int limit [, string attr]] )
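Description: Returns the list of the most important keywords computed by the system from the text set by send_text.
Returns: an array of the computed keywords on success, otherwise false. Each element contains the same data as get_result plus an additional times key.
Parameters: scws_handle is the handle previously returned by scws_open.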
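As an illustration of working with the returned list, the following sketch formats entries as word(times). The helper scws_format_tops is hypothetical, and the sample array is made-up data shaped after the description above (only the word and times keys are used), not actual output of scws_get_tops.

```php
<?php
// Hypothetical helper: turn a list shaped like the scws_get_tops() result
// into "word(times)" strings.
function scws_format_tops(array $tops): array
{
    $out = [];
    foreach ($tops as $entry) {
        $out[] = $entry['word'] . '(' . $entry['times'] . ')';
    }
    return $out;
}

// Made-up sample data; a real call would be scws_get_tops($sh, 5).
$sample = [
    ['word' => 'C++', 'times' => 2],
    ['word' => 'PHP', 'times' => 1],
];

echo implode(', ', scws_format_tops($sample)), "\n";
```

Because scws_get_tops is documented to return false when it fails, a caller would want to check for false before passing the result to a helper like this.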
mixed scws_get_words(resource scws_handle, string attr )
Description: Returns the keywords whose part of speech matches the requested attr, based on the text set by send_text.
Returns: an array of the matching words on success, otherwise false.
Parameters: scws_handle is the handle previously returned by scws_open.
bool scws_has_words(resource scws_handle, string attr )
Description: Reports whether the text set by send_text contains any keyword whose part of speech matches the requested attr.
Returns: true if there is one, false if there is none.
Parameters: scws_handle is the handle previously returned by scws_open.
mixed scws_version(void)
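Description: Returns the scws version and name information.
Returns: string
Parameters: none
Class object usage follows the function usage; the only difference is that the first argument is not passed (it is taken automatically from the object's handle property).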