m***@gmail.com
2006-05-04 14:32:50 UTC
Hi,
I am using Sadahiro Tomoyuki's Lingua::JA::Sort::JIS module to sort
Japanese names of stores. I have come close to achieving the order my
client has asked for but am having a little difficulty matching their
request exactly. The problem seems to be collating kana glyphs with
manyogana glyphs. (Please excuse me if I am misusing any terms - this
is my first introduction to Japanese.)
Here is an example of 13 store names ordered with
Lingua::JA::Sort::JIS::msort:
1. $B0K@*C0(B JR$B5~ETE9(B
2. $B%"%Z%C%/%9(B $BJ!;3(B
3. $B%"%_%e%W%i%6(B $B</;yEg(B
4. $B%*%/%N(B $***@n(B
5. $B$5$/$iLnI42_E9(B $***@gBf(B
6. $B$5$D$^20(B $B</;yEg(B
7. $B%9%?%s%9(B $BJF;R(B
8. $B$=$4$&(B $B?@8ME9(B
9. $B$=$4$&(B $***@iMUE9(B
10. $B$=$4$&(B $BBg5\E9(B
11. $B$=$4$&(B $B2#IME9(B
12. $B%@%$%"%b%s%I%7%F%#%"%k%k(B $B3`86(B
13. $B%K%e!<%:(B $B7'K\(B
My client tells me that entry 1 should actually come after entry 3 and
before entry 4. Going by this description of manyogana:
http://en.wikipedia.org/wiki/Manyogana
I think they are saying that the glyph 伊 should be collated according to
its katakana adaptation イ, which makes sense.
Note I'm basing many of my statements on staring at and comparing these
glyphs online and so I might be far off.
So my questions are:
1. Is my client correct in their ordering?
2. I believe I've tried all the combinations of collation levels and
kanji classes in the Lingua::JA::Sort::JIS jcmp function but have not
achieved the desired ordering. Have I perhaps missed the correct
combination?
3. Is the solution to first convert the manyogana characters to
katakana and then do the msort? If so, does anyone know of a Perl module
that does this, or a reference I could use programmatically, unlike the
image on the page linked above?
4. Can anyone think of any other glyphs or classes of Japanese glyphs
similar to manyogana that I should be worried about?
Thanks for any help you can give me!
Best,
Mike