[Wangsiriguleng, Nasun-urtu, Siqintu] - Article - (Chinese) - A Reordering Method of Chinese-Mongolian Statistical Machine Translation
JOURNAL OF CHINESE INFORMATION PROCESSING
Vol. 25, No. 4, Jul. 2011
Article ID: 1003-0077(2011)04-0088-05
CLC number: TP391    Document code: A

A Reordering Method of Chinese-Mongolian Statistical Machine Translation

Wangsiriguleng1, Siqintu2, Nasan-urt3
(1. Computer and Information Engineering College, Inner Mongolia Normal University, Huhhot, Inner Mongolia 010022, China; 2. Network Center, Inner Mongolia Normal University, Huhhot, Inner Mongolia 010022, China; 3. The Institute of Mongolian Studies, Inner Mongolia University, Huhhot, Inner Mongolia 010021, China)

Abstract: In our study of phrase-based Chinese-Mongolian statistical machine translation, we found serious word-order errors in the translation results. After comparing the word order of Chinese and Mongolian sentences, this paper proposes a method that reorders Chinese sentences according to Mongolian word order. It then describes the design of the reordering rules and the reordering algorithm, and finally reports concrete experiments. The experiments show that the method clearly improves the performance of the existing Chinese-Mongolian machine translation system.
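Key words: Chinese-Mongolian statistical machine translation system; reordering; rule

Received: 2011-04-27; Revised: 2011-05-27
Foundation item: National Natural Science Foundation of China (61063014)
About the authors: Wangsiriguleng (born 1970), main research interest: natural language processing; Siqintu (born 1972), main research interests: Mongolian information processing and computer networks; Nasan-urt (born 1959), main research interests: Mongolian information processing and computational linguistics.

1 Introduction

For Chinese-Mongolian machine translation we have successively carried out rule-based research [1], example-based research [2], and research on phrase-based statistical methods [3]. Chinese and Mongolian differ considerably in word order, and we found serious word-order problems in the phrase-based Chinese-Mongolian statistical machine translation system. In the actual output of the phrase-based system, sentences whose word order is the same in both languages are translated well. For example:

(1) Input (Chinese): "Where do I withdraw the money for this remittance slip?"
    Output: ENE MONGGO G0BIGVLHV HAGVDASV, HAMIG_A J0G0S GARGAJV ABHV BVI?
(2) Input (Chinese): "How is the weather today?"
    Output: ONODOR-UN CAG AGVR-VN BAYIDAL YAMAR BVI?

In these two sentences the Chinese and Mongolian word orders are basically the same, and the translations are good. For sentences whose word order differs, however, the results are far less satisfactory and show serious word-order problems. For example:

(1) Input (Chinese): "I would like to join a tour group."
    Output: BI 0R0LCAY_A GEJU B0D0JV NIGEN JIGVLCILAL-VN BOLHOM.
(2) Input (Chinese): "I do not have a spoon."
    Output: UGEI BI NIGE HALBAG_A.

Such translations are unacceptable, mainly because the word order of the output does not follow Mongolian sentence order. The cause is the difference in word order between the two languages: a phrase-based translation model can only resolve word-order adjustments to a limited extent. We therefore consider using syntactic rules to reorder sentences within the Chinese-Mongolian statistical machine translation system. This paper proposes a method that reorders Chinese sentences according to Mongolian word order; the method clearly improves both the NIST and BLEU scores of the phrase-based Chinese-Mongolian statistical machine translation system.

2 Chinese Sentence Reordering Based on Mongolian Word Order

Phrase-based statistical machine translation can only handle short-distance, local reordering. To support long-distance reordering, researchers have begun to use syntax-based reordering methods. With the development of statistical machine translation technology, syntax-based statistical machine translation has become a research focus in recent years. Syntax-based statistical methods introduce syntax into the translation model, use syntactic structures as translation units, and establish correspondences between structures. In a syntax-based model these correspondences are described by translation rules, which can be acquired automatically from a bilingual corpus or summarized manually.

References [4] and [5] both apply linguistic syntactic analysis in a source-language preprocessing stage. The rules in [4] are learned automatically from a bilingual corpus, and BLEU improved by 10% in a French-English translation experiment. For German-to-English translation, [5] summarizes a set of reordering rules for German clauses from linguistic knowledge. Reference [6] proposes a probabilistic approach that combines syntactic analysis with phrase-based statistical machine translation: for a given sentence and its parse tree, the n-best reordered forms are generated and used as input to the phrase-based decoder. In a Chinese-to-English experiment this method improved BLEU by 1.56%.

We propose reordering Chinese sentences according to Mongolian word order within the phrase-based Chinese-Mongolian statistical machine translation system. The input Chinese sentence is first segmented and part-of-speech tagged and then parsed; the parts that need reordering are reordered according to the reordering rules, so that the word order of the Chinese sentence is as close as possible to that of the Mongolian sentence; finally the reordered Chinese sentence is sent to the statistical decoder. The basic framework is shown in Fig. 1.

Fig. 1 Framework of the system with the reordering module

In a Chinese-Mongolian translation system, the analysis of Chinese is a prerequisite. To reorder, Chinese sentences must first be segmented and parsed. We chose two tools from the Institute of Computing Technology, Chinese Academy of Sciences: ICTCLAS and ICTPROP.

ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) is a Chinese lexical analysis system developed by the Institute of Computing Technology, Chinese Academy of Sciences. It uses a Chinese lexical analysis method based on a cascaded hidden Markov model [7]. The system has since been developed up to ICTCLAS 3.0; we use ICTCLAS 1.0 to segment and part-of-speech tag Chinese sentences. ICTPROP is a probabilistic natural-language parser developed by the same institute [8]. Its parsing algorithm is a chart-based algorithm, and its result is the single parse tree with the highest probability. We use ICTPROP to parse Chinese sentences.

In our experiments the Chinese parser had a low accuracy on longer Chinese sentences. To raise the parser's accuracy, we first split a Chinese sentence into several clauses, then parse and reorder each clause, and finally merge the reordered clauses in their original order into one sentence. In general, when a Chinese sentence is translated into Mongolian, the order of the clauses stays unchanged, so this splitting does not affect the translation of the sentence as a whole. We split Chinese sentences into clauses at the comma (,), semicolon (;), colon (:), exclamation mark (!), question mark (?), and full stop (。).

Sentence reordering relies on Chinese-Mongolian sentence correspondence rules. We therefore first study these correspondence rules, extract from them the rules that involve a change of word order, and then implement the sentence reordering module. The reordering rules and the reordering algorithm are described in turn below.

3 Reordering Rules

We divide the Chinese syntactic rules into two parts: those whose word order is the same as in Mongolian, and those whose word order differs. We consider only the rules whose word order differs and give their corresponding forms.

The rule format is defined as:

    X => Y [$ Z]

where X is a phrase-structure syntactic rule of the source language, Chinese, and reflects the Chinese phrase order; we call it the analysis part, that is, the Chinese parsing rule. Y gives the order that the grammatical constituents of the Chinese rule take when mapped to Mongolian; this is the reordering part, which we call the reordering rule. Z is a constraint that a grammatical constituent must satisfy when the rule is applied. The constraint part is optional.

The parsing rule X has the form α -> β, where α is the left-hand side of the rule, a Chinese phrase tag (see Table 1), and β is the right-hand side, consisting of one or more tags. A tag may be a phrase tag or a part-of-speech tag (see Table 2), and tags are separated by a single space.

Table 1  Sentence and phrase tags in the probabilistic parser and their rule counts
Tag   Meaning                      Rules
s     sentence                     5
np    noun phrase                  53
ap    adjective phrase             41
pp    prepositional phrase         19
dj    simple sentence              82
sp    locative phrase              12
dp    adverb phrase                5
tp    time phrase                  35
fj    complex sentence             11
vp    verb phrase                  90
mcp   numeral phrase               7
zj    full sentence                8
mp    numeral-classifier phrase    8
Total                              376

The reordering rule Y has the form A(γ1 γ2 ... γn), where A is the Mongolian phrase tag corresponding to the Chinese phrase tag on the left-hand side of the rule, and each γi is either a correspondence item or an inserted element. A correspondence item has the form B/C, meaning that the Chinese grammatical constituent C of the parsing rule is the constituent B in Mongolian; B is a Mongolian part-of-speech or phrase tag (see Table 3), and C is a Chinese part-of-speech or phrase tag. An inserted element consists of a Mongolian part-of-speech or phrase tag D together with a Mongolian word or suffix S; it means that S must be inserted at that position when the rule is applied.

The constraint Z is a lexical constraint with the format C = S or C != S. It restricts the word of a grammatical constituent in the parsing rule to be (or not to be) one of the words given in S.

Table 2  Chinese part-of-speech tag set
a: adjective             h: prefix component      c: conjunction           d: adverb
b: distinguishing word   k: suffix component      i: idiom                 p: preposition
o: onomatopoeia          j: abbreviation          q: measure word          r: pronoun
y: modal particle        w: punctuation           v: verb                  e: interjection
l: idiomatic expression  t: time word             n: noun                  u: auxiliary
s: locative word         x: non-morpheme character m: numeral              g: morpheme
z: status word           f: direction word

Table 3  Mongolian part-of-speech tag set
N: noun; R: pronoun; I: interjection; X: idiomatic expression; V: verb; D: adverb; F: affix (attached element); Y; A: adjective; H; E; Z; M: numeral; U; J; P; Q: measure word; G: postposition; K: idiom; T: time word; S: modal particle; L; O: locative word; C: conjunction; W: punctuation

For example:

    vp -> vp np => VP(NP/np VP/vp) $ vp != 是

The parsing rule X is vp -> vp np, meaning that a Chinese vp consists of a vp followed by an np. The reordering rule Y is VP(NP/np VP/vp), meaning that the Chinese verb phrase also corresponds to a verb phrase VP in Mongolian, that this VP consists of an NP followed by a VP, and that the NP corresponds to the np and the VP to the vp of the parsing rule; the word order clearly differs. The constraint Z is vp != 是, meaning that the rule is applied only when the word of the vp on the right-hand side of the parsing rule is not 是.

The Chinese phrase tags are the phrase tag set of the Chinese probabilistic parser (Table 1). The Chinese parts of speech follow the Peking University tag set chosen for ICTCLAS part-of-speech tagging, that is, the Chinese part-of-speech tag set specified in the Grammatical Knowledge-base of Contemporary Chinese [9] (Table 2), 26 tags in total.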
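The rule format X => Y [$ Z] described above can be illustrated with a small parser. The following is a minimal sketch, not the authors' implementation: the names `Rule`, `parse_rule`, and `apply_rule` are hypothetical, inserted elements are not handled, correspondence items are matched to constituents by Chinese tag from left to right (which leaves rules with repeated tags such as vp -> vp vp ambiguous), and the constraint word is read here as 是.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Child = Tuple[str, List[str]]  # (Chinese tag, words covered by the constituent)


@dataclass
class Rule:
    lhs: str                                    # Chinese phrase tag left of "->"
    rhs: List[str]                              # Chinese tags right of "->"
    target: str                                 # Mongolian phrase tag A
    items: List[Tuple[str, str]]                # (Mongolian tag B, Chinese tag C)
    constraint: Optional[Tuple[str, str, str]]  # (tag, "=" or "!=", word)


# X => Y [$ Z], e.g. "vp -> vp np => VP(NP/np VP/vp) $ vp != 是"
RULE_RE = re.compile(
    r"^\s*(\S+)\s*->\s*(.+?)\s*=>\s*(\S+?)\s*\((.+?)\)\s*"
    r"(?:\$\s*([^\s!=]+)\s*(!=|=)\s*(\S+))?\s*$"
)


def parse_rule(text: str) -> Rule:
    """Split one rule line into parsing part, reordering part and constraint."""
    m = RULE_RE.match(text)
    if m is None:
        raise ValueError(f"not a reordering rule: {text!r}")
    lhs, rhs, target, body, c_tag, c_op, c_word = m.groups()
    items = [tuple(item.split("/", 1)) for item in body.split()]
    constraint = (c_tag, c_op, c_word) if c_tag else None
    return Rule(lhs, rhs.split(), target, items, constraint)


def apply_rule(rule: Rule, label: str, children: List[Child]) -> List[Child]:
    """Return the children in Mongolian order, or unchanged if the rule does not apply."""
    if label != rule.lhs or [tag for tag, _ in children] != rule.rhs:
        return children
    if rule.constraint:
        c_tag, c_op, c_word = rule.constraint
        for tag, words in children:
            if tag == c_tag:
                same = "".join(words) == c_word
                if (c_op == "=") != same:   # constraint violated
                    return children
                break
    remaining = list(children)
    reordered = []
    for _mongolian_tag, chinese_tag in rule.items:
        idx = next(i for i, (tag, _) in enumerate(remaining) if tag == chinese_tag)
        reordered.append(remaining.pop(idx))
    return reordered


rule = parse_rule("vp -> vp np => VP(NP/np VP/vp) $ vp != 是")
print(apply_rule(rule, "vp", [("vp", ["join"]), ("np", ["a", "tour-group"])]))
```

English glosses stand in for the Chinese words in the usage line. A constituent whose vp word is 是 is returned unchanged because the constraint blocks the rule.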
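The Mongolian part-of-speech tags follow the national standard "Mongolian Word Tag Set for Information Processing", adopted in November 2008 [10]. Its first-level tag set specifies 25 tags (Table 3). For Mongolian phrases, the Framework Design of the Mongolian Grammatical Information Dictionary [11] divides Mongolian phrases into 10 classes: noun phrase (NP), adjective phrase (AP), verb phrase (VP), pronoun phrase (RP), numeral phrase (MCP), numeral-classifier phrase (MP), direction-word phrase (OP), time phrase (TP), adverb phrase (DP), and postposition phrase (GP). To meet practical needs, and with reference to [12] and [13], we also add the subject-predicate phrase (DJ), the modal-particle phrase (SP), and one further phrase type (XP) to the Mongolian phrase tag set, giving 13 Mongolian phrase tags in total.

With the rule format and the tag sets fixed, we wrote the rules for reordering Chinese sentences according to Mongolian word order. On the basis of Chinese word segmentation, part-of-speech tagging, and parsing, we studied the phrase structure of each Chinese sentence against the corresponding Mongolian grammar and derived 12 correspondence rules that involve a change of word order. We call these the Chinese-Mongolian reordering rules: 7 verb-phrase reordering rules, 3 prepositional-phrase reordering rules, and 2 subject-predicate phrase reordering rules. They are described in detail in [14].

4 Reordering Algorithm

Based on the reordering rules, we designed a reordering algorithm and implemented a reordering module. The module converts the rules from file form into a linked list, converts the output of the probabilistic parser into a binary tree to make reordering easier, and finally performs the actual reordering.

The basic framework of the reordering algorithm is:
(1) build the rule list from the rule file;
(2) reorder the tagged sentence according to the rules;
(3) auxiliary tools.

Steps (1) and (2) correspond to a rule-loading module and a reordering module. The rule-loading module converts the reordering rules, stored as a text file, into a linked-list structure for convenient processing. Its algorithm is as follows:

    open the rule file; rule index I = 0;
    WHILE (the rule file has not ended)
    {
        read one rule;
        split the rule into its parsing part and its reordering part;
        separate the left-hand side from the parsing part;
        add the pair (left-hand side, index I) to the MAP;
        separate the right-hand side;
        analyze the rule and build the conv_node for the right-hand side that belongs to this left-hand side;
        add the reordering rule to the linked list conv_list;
        I++;
    }

The reordering module first converts the output of the Chinese probabilistic parser into a binary tree using the left-child, right-sibling representation, and then reorders the sentence on this binary tree. Reordering uses a post-order traversal of the binary tree:

    Conv_tree(struct_mynode *root)
    {
        if (root == NULL) return;    /* empty subtree: nothing to do */
        Conv_tree(root->child);
        Conv_tree(root->brother);
        if (root->child != NULL)
        {
            take the left-hand side of the current rule;
            take the right-hand side of the current rule;
            look up the current rule in rulenum to obtain its index I;
            use index I to look up the reordering rule in CONV_LIST;
            if one is found, reorder according to the reordering rule;
        }
    }

We embed the reordering module in the statistical machine translation system, where a switch selects whether reordering is performed. The reordering module and the statistical decoder are independent of each other and do not interfere.

5 Experiments and Analysis

We use the CWMT2009 Chinese-Mongolian evaluation corpus. The experimental corpora are listed in Table 4, and the results on the development and test sets are given in Table 5. Here, the baseline system [3] is the phrase-based Chinese-Mongolian statistical machine translation system; QF denotes clause splitting at punctuation; JF denotes syntactic reordering.

Table 4  Experimental corpora
Corpus                                          Size
Training: Chinese-Mongolian bilingual corpus    67,000 sentence pairs; a dictionary of 220,000 entries
Mongolian monolingual corpus                    62,000 sentences + 67,000 sentences (Mongolian side of the bilingual corpus)
Development set                                 400 sentences, 4 Mongolian reference translations each
Test set                                        400 sentences, 4 Mongolian reference translations each

Table 5 shows that clause splitting improves system performance fairly clearly, and that the reordering rules bring a further improvement on both the development set and the test set. On the development set, BLEU rises by 1.06 percentage points after splitting and by a further 2.07 points after reordering. On the test set, BLEU rises by 1.84 points after splitting, while reordering on top of splitting adds only 0.37 points.

Table 5  Results on the development and test sets
System                Development set (400), BLEU/%    Test set (400), BLEU/%
baseline              22.55                            24.89
baseline + QF         23.61                            26.73
baseline + QF + JF    25.68                            27.10

To test the effect of the syntactic reordering module in isolation, we manually removed from the development and test sets the sentences with lexical or syntactic analysis errors. This left 261 correctly analyzed Chinese sentences out of the 400 in the development set and 244 out of the 400 in the test set, on which we repeated the experiment. The results are given in Table 6. They indicate that, under otherwise identical conditions, using linguistic knowledge such as lexical and syntactic analysis improves the performance of phrase-based Chinese-Mongolian statistical machine translation. On the development set, BLEU rises by 0.27 points after splitting and by a further 3.59 points after reordering. On the test set, BLEU rises by 2.15 points after splitting, and reordering on top of splitting adds 0.65 points.

Table 6  Results on the filtered development and test sets
System                Development set (261), BLEU/%    Test set (244), BLEU/%
baseline              23.50                            27.06
baseline + QF         23.77                            29.21
baseline + QF + JF    27.36                            29.86

The following is an example of the intermediate steps and the translation produced by the system with the reordering module (English glosses stand for the Chinese words).

Input: "I would like to join a tour group."
After segmentation and tagging: I/r want/v join/v a/m tour-group/n ./w
After reordering: I/r a/m tour-group/n join/v want/v ./w
Rule used: vp -> vp np => VP(NP/np VP/vp) $ vp != 是
Rule used: vp -> vp vp => VP(VP/vp VP/vp)
After removing tags: I a tour-group join want .
Translation 2: BI NIGE JIGVLCILAL-VN BOLHOM-DU 0R0LCAHV GEJU B0D0JV BAYIN_A.

The experimental results show that, for Chinese and Mongolian, two languages with considerable differences in word order, reordering Chinese sentences according to Mongolian word order is effective in improving system performance. As a next step, we will refine the reordering rules and enlarge the corpus to further improve the phrase-based Chinese-Mongolian statistical machine translation system.

References
[1] Nasun-urtu, Liu Qun, Badma-odsar. On a Chinese-Mongolian machine-aided translation system [J]. 2001: 91-95.
[2] Hou Hongxu, Liu Qun, Nasun-urtu. Example-based Chinese-Mongolian machine translation [J]. Journal of Chinese Information Processing, 2007, 21(4): 65-72.
[3] Wangsiriguleng, Siqintu, Nasun-urtu. Research on phrase-based Chinese-Mongolian statistical machine translation [J]. Computer Engineering and Applications, 2010, (5): 138-142.
[4] Fei Xia, Michael McCord. Improving a Statistical MT System with Automatically Learned Rewrite Patterns [C] // Proceedings of COLING 2004. 2004.
[5] Michael Collins, Philipp Koehn, Ivona Kucerova. Clause Restructuring for Statistical Machine Translation [C] // Proceedings of ACL 2005. 2005.
[6] Chi-Ho Li, Dongdong Zhang, Mu Li, Ming Zhou, Minghui Li, Yi Guan. A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation [C] // Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, June 2007: 720-727.
[7] Liu Qun, Zhang Huaping, Yu Hongkui, et al. Chinese lexical analysis based on a cascaded hidden Markov model [J]. Journal of Computer Research and Development, 2004, 41(8): 1421-1429.
[8] Zhang Hao, Liu Qun, Bai Shuo. Structural-context-dependent probabilistic parsing [C] // The First Student Workshop on Computational Linguistics (SWCL2002). 2002.
[9] Yu Shiwen, et al. The Grammatical Knowledge-base of Contemporary Chinese: A Complete Specification [M]. Beijing: Tsinghua University Press, 1998.
[10] Nasun-urtu, et al. Mongolian Word Tag Set for Information Processing, national standard [S]. Nov. 2008.
[11] Nasun-urtu. Framework Design of the Mongolian Grammatical Information Dictionary [D]. Inner Mongolia University, doctoral dissertation, 2000.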
[12] Badma-odsar. Research on Chinese-Mongolian phrase correspondence rules for machine translation [M]. Hohhot, Mar. 2006.
[13] Dahubaiyila. Research on automatic recognition of basic verb phrases in Mongolian [D]. Hohhot: Inner Mongolia University, doctoral dissertation, 2005.
[14] Wang Siriguleng, Siqintu, Nasun-urtu. The research on reordering rule of Chinese-Mongolian statistical machine translation [J]. Advanced Materials Research, Vols. 268-270 (2011): 2185-2190.
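As a supplementary illustration of the post-order reordering procedure in Section 4, the sketch below walks a left-child, right-sibling tree and permutes the children of every node that matches a rule. It is a simplified sketch under stated assumptions, not the authors' code: `Node`, `tree`, `conv_tree`, and `leaves` are hypothetical names, the rule table maps (parent tag, child tags) directly to a new child order, lexical constraints are ignored, and the swap assumed for vp -> vp vp is inferred from the worked example in Section 5.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    label: str                         # tag, or the word itself at a leaf
    child: Optional["Node"] = None     # first child
    brother: Optional["Node"] = None   # next sibling


def children_of(node: Node) -> List[Node]:
    """Collect the children by following the sibling chain."""
    kids, cur = [], node.child
    while cur is not None:
        kids.append(cur)
        cur = cur.brother
    return kids


def set_children(node: Node, kids: List[Node]) -> None:
    """Relink the sibling chain so that `kids` become the children in this order."""
    node.child = kids[0] if kids else None
    for left, right in zip(kids, kids[1:]):
        left.brother = right
    if kids:
        kids[-1].brother = None


def tree(label: str, *kids: Node) -> Node:
    node = Node(label)
    set_children(node, list(kids))
    return node


# Hypothetical rule table: (parent tag, child tags) -> new order of the children.
Rules = Dict[Tuple[str, Tuple[str, ...]], Tuple[int, ...]]
RULES: Rules = {
    ("vp", ("vp", "np")): (1, 0),   # vp -> vp np => VP(NP/np VP/vp)
    ("vp", ("vp", "vp")): (1, 0),   # assumed swap for vp -> vp vp
}


def conv_tree(root: Optional[Node], rules: Rules) -> None:
    """Post-order traversal: subtrees and siblings first, then the node itself."""
    if root is None:
        return
    conv_tree(root.child, rules)
    conv_tree(root.brother, rules)
    kids = children_of(root)
    if kids:
        order = rules.get((root.label, tuple(k.label for k in kids)))
        if order is not None:
            set_children(root, [kids[i] for i in order])


def leaves(node: Optional[Node]) -> List[str]:
    """Read the words off the tree from left to right."""
    if node is None:
        return []
    own = [node.label] if node.child is None else leaves(node.child)
    return own + leaves(node.brother)


sentence = tree(
    "dj",
    tree("r", tree("I")),
    tree(
        "vp",
        tree("vp", tree("v", tree("want"))),
        tree(
            "vp",
            tree("vp", tree("v", tree("join"))),
            tree("np", tree("m", tree("a")), tree("n", tree("tour-group"))),
        ),
    ),
)
conv_tree(sentence, RULES)
print(" ".join(leaves(sentence)))
```

The tree uses English glosses for the Chinese words of the example sentence. Because the traversal is post-order, the inner verb phrase is reordered before the outer one, so both rule applications compose into the final word order.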