T.R | Title | User | Personal Name | Date | Lines |
---|
1041.1 | | VMSMKT::KENAH | There are no mistakes in Love... | Mon Apr 19 1993 12:40 | 6 |
1041.2 | "1812" par l'duc de la terHorst... | TLE::JBISHOP | | Mon Apr 19 1993 12:52 | 14 |
| It's worse than you think.
I saw a library's ordering rules (in a C.S. article).
There were rules about sorting books with titles containing
numbers (spell it out), and books in foreign languages containing
numbers (spell it out in the foreign language), and titles
consisting only of numbers, and books in non-Latin scripts,
and names with prefixes separated by a space (de, von), and
names with prefixes not separated by a space (e.g. terHorst)
and on and on.
If I can remember the article, I'll post a reference.
-John Bishop
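
A rough sketch of the "spell it out" rule for numbers in titles, in Python. Everything here is an assumption made for illustration, not the library's actual rules: the helper names (`spell_below_100`, `spell_number`, `filing_key`) are hypothetical, only English is covered, and four-digit numbers are read as years. Foreign-language titles and name prefixes would each need rules of their own.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def spell_below_100(n):
    """Spell an integer in the range 0..99 in English."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")

def spell_number(digits):
    """Spell a run of digits; anything the sketch cannot handle is left alone."""
    if len(digits) <= 2:
        return spell_below_100(int(digits))
    # Assumption: a four-digit number is read as a year (1812 -> eighteen twelve).
    if len(digits) == 4 and digits[0] != "0":
        century, rest = int(digits[:2]), int(digits[2:])
        if rest == 0:
            return spell_below_100(century) + " hundred"
        if rest < 10:
            return spell_below_100(century) + " oh " + ONES[rest]
        return spell_below_100(century) + " " + spell_below_100(rest)
    return digits

def filing_key(title):
    """Sort key: digits replaced by their spelled-out form, case ignored."""
    spelled = re.sub(r"\d+", lambda m: spell_number(m.group()), title)
    return spelled.casefold()

titles = ["Zorba", "1812 Overture", "Emma"]
print(sorted(titles, key=filing_key))
```

With this key "1812 Overture" files under E, between nothing and "Emma", instead of ahead of every letter.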
|
1041.3 | Was there any life before computers? | KETJE::HAENTJENS | Beware of Counterfeit | Tue Apr 20 1993 08:58 | 18 |
1041.4 | I'm a rational sort | RAGMOP::T_PARMENTER | Human. All too human. | Tue Apr 20 1993 10:32 | 21 |
1041.5 | | SMURF::BINDER | Deus tuus tibi sed deus meus mihi | Tue Apr 20 1993 13:26 | 29 |
1041.6 | | VMSMKT::KENAH | There are no mistakes in Love... | Tue Apr 20 1993 14:20 | 10 |
1041.7 | | CALS::DESELMS | | Tue Apr 20 1993 15:32 | 5 |
| RE: -1
Great example...
- Jim
|
1041.8 | Me too | AUSSIE::WHORLOW | Bushies do it for FREE! | Tue Apr 20 1993 19:32 | 10 |
| G'day,
Minor rathole ..
There is 'Sans Souci' in Australia... It's a suburb of Sydney...
derek
PS where would that fit in the Sanssouci / SANSSOUCI /Sans-Souci...
scheme?
|
1041.9 | | JIT081::DIAMOND | Pardon me? Or must I be a criminal? | Tue Apr 20 1993 21:56 | 9 |
| Re .5
>>A character range can include a multicharacter collating element
>>enclosed within bracket-period delimiters ([. and .]).
[...]
>>When using Spanish collation rules, [[.ch.]] is treated as an RE
>>matching the sequence ch, while [ch] is treated as an RE matching
>>c or h. In addition, [a-[.ch.]] matches a, b, c, and ch.
How do they do it in a character set that doesn't have [ and ] ?
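
For readers who have not met collating elements, the quoted rule can be imitated in a few lines of Python. This is only a sketch under stated assumptions: it is not how any POSIX regex engine is implemented, it handles lowercase only, and the names `ALPHABET`, `RANK`, `elements` and `in_range` are made up for the example. It does not address the missing-bracket question.

```python
# Assumption: traditional Spanish order, with "ch" and "ll" as single
# collating elements that sort after "c" and "l" respectively.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
            "v", "w", "x", "y", "z"]
RANK = {element: i for i, element in enumerate(ALPHABET)}

def elements(text):
    """Split text into collating elements, preferring the two-letter ones."""
    out, i = [], 0
    while i < len(text):
        two = text[i:i + 2]
        if len(two) == 2 and two in RANK:
            out.append(two)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return out

def in_range(element, low, high):
    """True if element lies in the bracket range [low-high] in collation order."""
    return RANK[low] <= RANK[element] <= RANK[high]

print(elements("chico"))                                   # "ch" is one element
print([e for e in ALPHABET if in_range(e, "a", "ch")])     # what [a-[.ch.]] covers
```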
|
1041.10 | Difference between tilde and squiggly thing | KETJE::HAENTJENS | Beware of Counterfeit | Wed Apr 21 1993 08:10 | 21 |
1041.11 | My rathole or yours? | FORTY2::KNOWLES | DECspell snot awl ewe kneed | Wed Apr 21 1993 09:50 | 20 |
1041.12 | | KETJE::HAENTJENS | Beware of Counterfeit | Wed Apr 21 1993 11:52 | 10 |
1041.13 | | VMSMKT::KENAH | blah blah blah GINGER | Wed Apr 21 1993 14:43 | 15 |
| Is this an accurate synopsis?
1. Different European languages have developed ordering rules that are
internally consistent.
2. You are trying to develop more general ordering rules, rules that
incorporate different languages' rules while maintaining internal
consistency as well as consistency with each individual language.
In addition, it sounds like you're trying to distinguish between
similar but distinct words and word groupings.
3. Finally, the ordering scheme you develop must be implemented on a
computer, since computers are valuable tools for tasks like ordering.
Do any of the existing standards (ISO, XPG) deal with this topic?
|
1041.14 | | VMSMKT::KENAH | blah blah blah GINGER | Wed Apr 21 1993 17:11 | 4 |
| I re-read .0 and see that it states POSIX-compliant systems support
multilevel ordering -- which POSIX standard is it a part of?
andrew
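
To make "multilevel ordering" concrete, here is a hedged Python sketch. The four levels and their priority (letters, then accents, then case, then punctuation) are an assumption chosen for illustration, not the rules from .0 or from any POSIX locale, and `multilevel_key` is a hypothetical name. A later level only matters when all earlier levels tie.

```python
import unicodedata

def multilevel_key(s):
    """Build a four-level sort key; tuples compare level by level."""
    decomposed = unicodedata.normalize("NFD", s)   # split accents from letters
    base = [c for c in decomposed if c.isalnum()]
    level1 = "".join(c.casefold() for c in base)                      # letters only
    level2 = "".join(c for c in decomposed if unicodedata.combining(c))  # accents
    level3 = "".join("1" if c.isupper() else "0" for c in base)       # lowercase first
    level4 = "".join(c for c in decomposed                             # spaces, hyphens
                     if not c.isalnum() and not unicodedata.combining(c))
    return (level1, level2, level3, level4)

names = ["Sans-Souci", "SANSSOUCI", "Sanssouci", "Sans Souci", "sanssouci"]
print(sorted(names, key=multilevel_key))
```

All five spellings tie on the first two levels, so they stay together in a sorted list and are separated only by case and then by the space or hyphen. On a real system, `locale.strxfrm` would be the way to get the platform's own collation rather than a hand-built key.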
|
1041.15 | 9945-2.2 | KETJE::HAENTJENS | Beware of Counterfeit | Thu Apr 22 1993 06:34 | 9 |
1041.16 | | VMSMKT::KENAH | blah blah blah GINGER | Thu Apr 22 1993 10:21 | 4 |
| Thanks for the POSIX and XPG references -- I think I'll check 'em
out (I believe one of my colleagues has a copy of XPG4).
andrew
|
1041.17 | | NOVA::FISHER | DEC Rdb/Dinosaur | Thu Apr 22 1993 10:31 | 33 |
1041.18 | Ordering with Sanscrit | KETJE::HAENTJENS | Beware of Counterfeit | Thu Apr 22 1993 12:11 | 17 |
1041.19 | %^} | VMSMKT::KENAH | blah blah blah GINGER | Thu Apr 22 1993 17:47 | 4 |
| SANSCRIT would probably wind up somewhere else in American English -
that's because the usual transliteration is SANSKRIT.
andrew
|
1041.20 | let those R's rip | RAGMOP::T_PARMENTER | Human. All too human. | Tue Apr 27 1993 10:22 | 14 |
1041.21 | ARR, Matey! | CALS::DESELMS | | Tue Apr 27 1993 10:56 | 6 |
| A "flipped R" is just like a trilled R, except that instead of the tongue
tapping the roof of your mouth a bunch of times, it only taps the roof of
the mouth once. It is indeed exactly the same as "dd" in "ladder".
Pronounce Spanish with an American ARR and they'll laugh in your face.
- Jim
|
1041.22 | | NOVA::FISHER | DEC Rdb/Dinosaur | Thu Apr 29 1993 11:11 | 6 |
| But rr in Spanish also has no special collation rule [that I have
seen].
Is rr collated after rz?
ed
|
1041.23 | | NOTIME::SACKS | Gerald Sacks ZKO2-3/N30 DTN:381-2085 | Thu Apr 29 1993 17:46 | 3 |
| re .20:
It's an alveolar flap.
|
1041.24 | Knuth, of course | TLE::JBISHOP | | Fri Aug 06 1993 15:58 | 7 |
| re .2
See Knuth's _Sorting_and_Searching_ (his volume 3), pp 7..9
for some library sorting rules, e.g. "Ignore initial articles,
unless not in nominative case...".
-John Bishop
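
The first half of that rule is easy to sketch in Python; the second half is not. The version below is an illustration only, under loud assumptions: English articles only, no notion of grammatical case (so the "unless not in nominative case" exception is simply not handled), and `title_key` is a hypothetical name, not anything from Knuth.

```python
ARTICLES = ("the", "a", "an")  # assumption: English titles only

def title_key(title):
    """Sort key that skips one leading article, if the title has more words."""
    words = title.casefold().split()
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)

titles = ["The Hobbit", "A Tale of Two Cities", "Emma"]
print(sorted(titles, key=title_key))
```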
|
1041.25 | | VMSMKT::KENAH | | Fri Aug 06 1993 16:29 | 5 |
| A question came up in another conference -- does Digital support
Cyrillic alphabets?
I'm embarrassed to ask this, because I don't know whether ISO Latin-1
includes Cyrillic alphabets. (We *do* support ISO Latin-1, don't we?)
|
1041.26 | Nope. | SMURF::BINDER | Sapientia Nulla Sine Pecunia | Fri Aug 06 1993 16:41 | 17 |
| Re .25
> I'm embarrassed to ask this, because I don't know whether ISO Latin-1
> includes Cyrillic alphabets.
It doesn't.
Producing International Products -- Software handbook
(Identification Number A-MN-ELEN467-00-0 Rev B)
...says this:
The ISO Latin Alphabet No. 1 has been developed by the International
Organization for Standards (ISO) as the standard character set for the
Western European languages. It will eventually supersede the DEC
Multinational Character Set. Further ISO character sets are being
developed to cover European languages not based on the Latin Alphabet.
|
1041.27 | | VMSMKT::KENAH | | Fri Aug 06 1993 17:10 | 7 |
| Thanks.
So: does Digital support Cyrillic alphabets?
Also: Does Digital support ISO Latin-1?
andrew
|
1041.28 | | REGENT::BROOMHEAD | Don't panic -- yet. | Fri Aug 06 1993 17:21 | 8 |
| ISO Latin-1 is Digital's default character set -- so, yes, we support
it.
ISO Latin-Cyrillic (ISO 8859-5 (which is not ISO Latin-5)) is provided
on a few of our printers (dot matrix ones) and can be added via a
cartridge on our ANSI laser printers. So, yes, we support it.
Ann B.
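
A quick way to see the difference between the two character sets, sketched in Python using its standard codecs. The helper name `can_encode` and the sample strings are just for illustration; the point is that Cyrillic letters have no code points in ISO 8859-1 but fit in one byte each in ISO 8859-5.

```python
def can_encode(text, codec):
    """True if every character of text exists in the given character set."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

cyrillic = "Привет"
print(can_encode(cyrillic, "iso-8859-1"))      # Latin-1 has no Cyrillic
print(can_encode(cyrillic, "iso-8859-5"))      # Latin/Cyrillic does
print(len(cyrillic.encode("iso-8859-5")))      # one byte per letter
print(can_encode("señor", "iso-8859-5"))       # and 8859-5 lacks the Spanish ñ
```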
|
1041.29 | | VMSMKT::KENAH | | Fri Aug 06 1993 18:27 | 9 |
| Thank you, Ann. I didn't realize ISO Latin-1 was our default,
although (based on Dick's description) it's obvious.
How about Cyrillic support at the user-interface level?
andrew
P.S. I'm tracking this question through another path within Digital;
should I get an expanded answer, I'll post it here.
|
1041.30 | | NRSTA2::KALIKOW | Supplely Chained | Fri Aug 06 1993 18:40 | 5 |
| Hey andrew -- Keep us posted on whether you get the answer thru
"official" or "other" channels faster than this employee-interest
notesfile... It'd be great if we could get you out of the BOX
faster... :-)
|
1041.31 | | ISTWI1::KINACI | Walk thru this world | Mon Aug 09 1993 09:47 | 16 |
| I think Cyrillic is ISO-Latin 2 is it not?
I know there is some Cyrillic support out there and there is more to
come once the Fonts acquired from Monotype go into distribution.
I've been informed that we will have a wide scale test for the various
fonts. I will be working on testing ISO-Latin 5 for Turkey, for example.
I know that there is a Cyrillic version of DECterm. Hold on, I am not
sure if we are talking full UI localization or if there is just character
set support. But the latter definitely exists. I know there was work
being done to get EPROMs which support Cyrillic for VT420 type terminals.
I believe this has been completed. I also know that the Cyrillic version
of ALL-IN-1 V3.0 should be shipping soon.
Suz
|
1041.32 | | VMSMKT::KENAH | | Mon Aug 09 1993 10:03 | 6 |
| So far, the clear winner is the Employee-Interest conferences.
Of course, the informal channels have given me pointers to more
formal channels, so the lines are getting blurred.
Of course without the informal channels, I never would have found
the formal channels...
|
1041.33 | Who can answer Andrew's question? | REGENT::BROOMHEAD | Don't panic -- yet. | Mon Aug 09 1993 13:49 | 14 |
| Suz,
Nope, it's ISO Latin-Cyrillic, with no number in sight.
Andrew,
"How about Cyrillic support at the user-interface level?"
I can't answer that. All I can tell you is I have the Cyrillic fonts
from Monotype that Suzan mentioned, but I don't know who is to pay
to make them into cartridges or soft fonts, or even which fonts (type-
faces) I should concentrate on.
Ann B.
|
1041.34 | | ISTWI1::KINACI | Walk thru this world | Mon Aug 09 1993 15:24 | 20 |
| Hi Ann!
Nice to run into you here.
RE the fonts. You probably know that Israel is going to be running a
Fonts Q.A. Project in early September, where we will all get to test
our own fonts. I suspect that will be when we will get a broader picture
of what is out there.
As for who pays... well.. I am told by very reliable sources that
corporate will pay for the internationalization of products deemed
necessary by the involved subsidiaries, starting in FY '94. We've
submitted a prioritized list of what we need, and as far as I know
the funding discussions should be well under way at this time. Past
experience indicates that it will be the beginning of Calendar year
1994 before we see much of anything.
I hear all this will change come FY'95.. Keep your fingers crossed!
Suz
|
1041.35 | | 4GL::LASHER | Working... | Tue Aug 10 1993 09:50 | 4 |
| While y'all are looking into this, could you also check to see whether
DECwindows supports Orthodox icons?
Lew Lasher
|
1041.36 | Spanish Alphabetical Order Simplified | REGENT::BROOMHEAD | Don't panic -- yet. | Mon May 02 1994 14:26 | 45 |
| <<< NOTED::DISK$NOTES7:[NOTES$LIBRARY_7OF4]WORLDWIDE.NOTE;2 >>>
-< Worldwide -- International Product Issues >-
================================================================================
Note 525.0 Change in Spanish collating rules No replies
R2ME2::HINXMAN "It's waiting for it that's so tryin" 39 lines 2-MAY-1994 07:58
--------------------------------------------------------------------------------
Days in dictionary numbered for two in Spanish alphabet
=======================================================
Associated Press (Boston Globe 1994-05-01)
MADRID - The world's more than 300 million Spanish speakers now have
two fewer letters in their alphabet to worry about, a mostly bookkeeping move
that won almost unanimous support but disturbed some traditionalists.
The Association of Spanish Language Academies, meeting in Madrid for
its 10th annual congress, voted last week to eliminate the "Ch" and "Ll" from
the Spanish alphabet.
The two letters, which historically have had their own separate
headings in dictionaries, now will be listed under other letters. Words
beginning with "Ch", like "chico", will fall under the letter "C", and words
beginning with "Ll", like "llama", will fall under the letter "L".
The move does not change pronunciation, usage or spelling. It was
made mainly to simplify dictionaries and make Spanish more computer-
compatible with English.
Pushing for the change was Spain, a member of the 12-nation European
Union. The EU has urged its members to implement measures that aid
translation and computer standardization.
Cuban delegate Luisa Campuzano said she favored the change "because it
means that dictionaries will be easier to use. But arguments related to the
European Union shouldn't be brought up. Our talks are along scientific lines
and nothing more."
The vote Wednesday was 17 in favor, one opposed and three abstaining.
Ecuador voted "no" and Panama, Nicaragua and Uruguay abstained.
"It's not that the letters are disappearing, they're just being put
in a different place in the dictionary," said a Madrid artist, Maria Gato.
"I don't think most people are upset."
Guatemala supported the change, but one Guatemalan delegate, Mario
Alberto Carrera, referred to the simplification as "killing" part of the
language.
"The two letters have succumbed to the dictates of the market and the
Anglo-Saxon world," Carrera said.
Some dictionaries, including the highly respected Maria Moliner, had
already made the change.
The Spanish alphabet now has 27 letters - the 26 contained in the
English alphabet plus a stylized "n" (ñ).
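
The practical effect of the change on a sorted word list can be sketched in Python. This is an illustration under assumptions, not any dictionary's or vendor's algorithm: `spanish_key` is a hypothetical helper, accented vowels are not handled (`LETTERS.index` would raise `ValueError` on them), and the word list is invented.

```python
LETTERS = "abcdefghijklmnñopqrstuvwxyz"   # the 27 letters after the change

def spanish_key(word, digraphs=()):
    """Sort key; each digraph listed sorts just after its first letter."""
    key, i = [], 0
    w = word.lower()
    while i < len(w):
        two = w[i:i + 2]
        if two in digraphs:
            # (rank of first letter, 1) places "ch" after every plain "c".
            key.append((LETTERS.index(two[0]), 1))
            i += 2
        else:
            key.append((LETTERS.index(w[i]), 0))
            i += 1
    return key

words = ["chico", "cuna", "dama", "llama", "luz", "cama"]
old = sorted(words, key=lambda w: spanish_key(w, ("ch", "ll")))  # separate letters
new = sorted(words, key=spanish_key)                              # 1994 rule
print(old)
print(new)
```

Under the old rule "chico" follows "cuna" and "llama" follows "luz"; under the new one they move up to sit among the other C and L words.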
|
1041.37 | | NOVA::FISHER | Tay-unned, rey-usted, rey-ady | Thu May 05 1994 10:46 | 9 |
| aye, the contrariness of it all....
One of the fun parts of "internationalizing Rdb" was to ensure that
"c*" did not MATCH "chxyz" when Spanish was the collating sequence
in use.
Drat!
ed
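
For the curious, the behaviour described can be sketched in Python. This is emphatically not Rdb's implementation, just an illustration of the idea with hypothetical names (`DIGRAPHS`, `elements`, `starts_with`): a prefix pattern is compared collating element by collating element rather than character by character.

```python
DIGRAPHS = ("ch", "ll")  # assumption: traditional Spanish collating elements

def elements(text, digraphs=DIGRAPHS):
    """Split text into collating elements, treating each digraph as one unit."""
    out, i = [], 0
    while i < len(text):
        two = text[i:i + 2].lower()
        if two in digraphs:
            out.append(two)
            i += 2
        else:
            out.append(text[i].lower())
            i += 1
    return out

def starts_with(value, prefix, digraphs=DIGRAPHS):
    """Collation-aware stand-in for matching the pattern prefix + '*'."""
    v, p = elements(value, digraphs), elements(prefix, digraphs)
    return v[:len(p)] == p

print(starts_with("chxyz", "c"))               # "ch" is not "c" followed by "h"
print(starts_with("chxyz", "ch"))
print(starts_with("chxyz", "c", digraphs=()))  # after the 1994 change
```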
|
1041.38 | | JIT081::DIAMOND | $ SET MIDNIGHT | Mon May 16 1994 05:47 | 10 |
| Re .36
> "The two letters have succumbed to the dictates of the market and the
>Anglo-Saxon world," Carrera said.
Cute opinion. Has the Library of Congress changed their lexicography
to consider Mc as Mc instead of as Mac? If they did or will, they're
succumbing to the dictates of the market and the Spanish world.
-- Norman Diamond
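
For anyone who has not met the "Mc files as Mac" convention, a minimal Python sketch follows. It makes no claim about what the Library of Congress does or did; `surname_key` is a hypothetical helper and the names are invented.

```python
def surname_key(name):
    """Sort key that files a leading "Mc" as if it were spelled "Mac"."""
    folded = name.casefold()
    if folded.startswith("mc"):
        folded = "mac" + folded[2:]
    return folded

names = ["Mabry", "McDonald", "MacArthur", "Madison"]
print(sorted(names, key=surname_key))     # Mc interfiled with Mac
print(sorted(names, key=str.casefold))    # Mc filed as spelled
```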
|