# This is a test file grabbed from
#
# http://crl.nmsu.edu/~mleisher/ucdata.html
#
# A test with capital letters treated as RTL
#
# I tested yudit-2.7.beta13 with a modified bidiclass file, so that
# capital letters are treated as AL, or R.
#
# I typed in what I saw on screen, with no document embedding set.
#
# The first line is the original
# The second line is when capital letters are treated as R
# The third line is when capital letters are treated as AL
# (The third line may be omitted when it is the same as the second.)
# The difference between the reference code and this:
#
# Problem #1:
# Original : TEST ~~~23%%% ONCE abc
# Yudit R  : abc ECNO 23%%%~~~ TSET
# Yudit AL : abc ECNO %%%23~~~ TSET
# Reference: abc ECNO 23%%%~~~ TSET
# This can be explained by Reference Code having Hebrew letters.
#
# Problem #2:
# Original : SOLVE 1*5 1-5 1/5 1+5
# Yudit R  : 1+5 1/5 1-5 5*1 EVLOS
# Yudit AL : 5+1 5/1 5-1 5*1 EVLOS
# Reference: 5+1 5/1 5-1 5*1 EVLOS
# This can be explained if Reference Code treats all letters
# as Arabic letters. This contradicts problem #1.
#
# So what is going on? Let's track it down:
# ----- from http://crl.nmsu.edu/~mleisher/ucdata.html -----
# Input : TEST ~~~23%%% ONCE abc
# Output: abc ECNO 23%%%~~~ TSET
#
# Input : SOLVE 1*5 1-5 1/5 1+5
# Output: 5+1 5/1 5-1 5*1 EVLOS
# ----------------------------------------------------------
# I have downloaded the Java reference code from the Unicode Consortium:
#
# http://www.unicode.org/unicode/reports/tr9/BidiReferenceJava/
#
# And I ran it. To produce a result identical to what the test web site
# claims as “reference code” I had to switch the RL directionality
# to Arabic for the second test case.
# Here is what I got:
# -------------------- BidiReferenceJava --------------------
# -hebrew:
# TEST ~~~23%%% ONCE abc
# abc ECNO 23%%%~~~ TSET
#
# SOLVE 1*5 1-5 1/5 1+5
# 1+5 1/5 1-5 5*1 EVLOS
#
# -arabic:
# TEST ~~~23%%% ONCE abc
# abc ECNO %%%23~~~ TSET
#
# SOLVE 1*5 1-5 1/5 1+5
# 5+1 5/1 5-1 5*1 EVLOS
# ------------------------------------------------------------
# So the problem is solved:
#
# I thought that he always runs the reference code either with
# Arabic or with Hebrew, consistently. Well, he never said that.
# So the http://crl.nmsu.edu/~mleisher/ucdata.html Reference
# column sometimes contains Arabic and sometimes Hebrew
# context, mixed, with no clear indication which is which.
#
# Gaspar Sinai, Tokyo 2002-11-15
#
# So here are my test results, typed in from Yudit screen:
# --------------------------------------------------------
car is THE CAR in arabic
car is RAC EHT in arabic

CAR IS the car IN ENGLISH
HSILGNE NI the car SI RAC

he said "IT IS 123, 456, OK"
he said "KO ,456 ,123 SI TI"

he said "IT IS (123, 456), OK"
he said "KO ,(456 ,123) SI TI"

he said "IT IS 123,456, OK"
he said "KO ,123,456 SI TI"

he said "IT IS (123,456), OK"
he said "KO ,(123,456) SI TI"

HE SAID "it is 123, 456, ok"
"it is 123, 456, ok" DIAS EH

<H123>shalom</H123>
<123H/>shalom<123H>

SAALAM
MALAAS

HE SAID "it is a car!" AND RAN
NAR DNA "!it is a car" DIAS EH

HE SAID "it is a car!x" AND RAN
NAR DNA "it is a car!x" DIAS EH

-2 CELSIUS IS COLD
DLOC SI SUISLEC -2

-10% CHANGE
EGNAHC -10%

SOLVE 1*5 1-5 1/5 1+5
1+5 1/5 1-5 5*1 EVLOS
5+1 5/1 5-1 5*1 EVLOS

THE RANGE IS 2.5..5
5..2.5 SI EGNAR EHT

#
# Adapted from one of the FriBidi test files.
#
he said "IT IS A CAR!"
he said "RAC A SI TI!"

he said "IT IS A CAR!X"
he said "X!RAC A SI TI"

(TEST) abc
abc (TSET)

abc (TEST)
abc (TSET)

#@$ TEST
TSET $@#

TEST 23 ONCE abc
abc ECNO 23 TSET

TEST ~~~23%%% ONCE abc
abc ECNO 23%%%~~~ TSET
abc ECNO %%%23~~~ TSET

TEST abc ~~~23%%% ONCE abc
abc ECNO abc ~~~23%%% TSET

TEST abc@23@cde ONCE
ECNO abc@23@cde TSET

TEST abc 23 cde ONCE
ECNO abc 23 cde TSET

TEST abc 23 ONCE cde
cde ECNO abc 23 TSET

Xa 2 Z
Z a 2X