Any foreign character mapping charts available?

Post by jmev » Thu, 05 Feb 2004 00:02:35


I'm in the US and constantly have to take data input from other
countries. Some of this data has characters which I can't read,
since it's input from other-language keyboards. This prevents me from
reading the name and passing it to a master database for proper
storage and reporting. Can anyone tell me if there is a chart mapping
non-English characters to English equivalents? We can use any of the
extended characters, such as Ü (as in MÜNCHEN, Germany).

The problem is that in some words, multiple characters stand in for one,
such as "BAÃ‘ARES", which is a Spanish entry that translates to
BAÑARES; "DÃœSSELDORF" is the German entry I often find for
DÜSSELDORF. Many such entries are providing me with a new hobby, the
likes of which I'd like to give up.

So, any assistance on this will merit great admiration and gratitude,
and maybe even a *** pop (limit one per solution, please--allow 4-6
weeks for delivery).

Thanks in advance
 
 
 

Post by Colleyvill » Thu, 05 Feb 2004 07:38:21


Not available.






Please ignore the above posting folks.

Tony
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.yqcomputer.com/

 
 
 

Post by mich » Thu, 05 Feb 2004 08:35:52

See

http://www.yqcomputer.com/

for the Windows "default ANSI" [sic] code pages, and

http://www.yqcomputer.com/

for the Windows "default OEM" code pages.


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
 
 
 

Post by jmev » Fri, 06 Feb 2004 06:53:00

MichKa:

Thanks for the great reference. Unfortunately, it isn't quite a
bull's-eye. I can reproduce any of the characters brought into my
Access 97 database, but knowing what they translate to is the
question. If there is a single int'l character, like É (character
0201), I can leave it as is. However, usually, there are two
characters that represent a third. For example, in the French word
DÃ‰FENSE as well as the Spanish word MÃ‰XICO, I can assume the Ã‰
translates to É (character 0201), but I can't assume that in all cases.
For one, I don't recognize names in most languages, and second, the same
character can be used in different ways in different countries. For
example, the Ãœ in ZÃœRICH translates to Ü (char. 0220), as in ZÜRICH, but
that's for Germany. For Switzerland, I think it's different, as in
HANS-JÃœRGEN translating to HANS-JUERGEN, and MÃœNCHENSTEIN to
MUENCHENSTEIN. By the way, those translations are from searches I've
done on Google, so I may be off on my research.

From the above examples, can you suggest a more accurate approach to
translating the data in question? Of course, if you're going to
suggest becoming proficient in 27 different languages, please be
informed that my brain is begging for a vacation as it is (and I just
started this project :-).

Thanks again, in advance.

J
 
 
 

Post by mich » Fri, 06 Feb 2004 10:01:17

You need to talk to the people who are doing the conversion to say what code
page is being used -- it is unreasonable to try to guess (though I would
guess UTF-8 if I had to).
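If the UTF-8 guess is right, the two-character pairs would be UTF-8 bytes that were displayed through a single-byte Windows code page. Here is a minimal Python sketch of undoing that. It assumes the code page involved is Windows-1252, which the thread does not confirm, and the function name repair_mojibake is made up for illustration.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mis-decoded as Windows-1252.

    Assumption: the wrong code page was cp1252. If the text cannot be
    mapped back to bytes, or those bytes are not valid UTF-8, the input
    is returned unchanged rather than guessed at.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

With this, repair_mojibake("DÃœSSELDORF") gives "DÜSSELDORF", while a value that is already correct, such as "MÜNCHEN" or "MADRID", comes back untouched. It only helps when the source really was UTF-8, so asking the people who do the conversion is still the reliable route.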


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
 
 
 

Post by jmev » Fri, 06 Feb 2004 23:13:14

I was afraid you'd say that, but I guess you're right. I just thought
this was a common enough issue for others to perhaps have created some
kind of table they could share (or sell?).

Thanks again for your references.

JV
 
 
 

Post by mich » Sat, 07 Feb 2004 04:12:13

Well, you have not really said enough about what the format is that someone
would have a notion of what table you need.


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.

 
 
 

Post by jmev » Wed, 11 Feb 2004 02:06:51

I apologize. I'm not trying to be ambiguous, but I'm not sure what you
mean by format. The format of the file, as in Access 97? The format of the
language? In that case, it's multiple, but I can get a list of what
I've experienced thus far and list them, if that will be of help. I
thought of putting together a simple find and replace table, where
every occurrence of certain characters would then translate to another
character "acceptable" to the target database application. From what I
understand, the reason this is not practical is that two people from
the same country can use different Encoding settings in their
browsers, resulting in the same keystroke translated to a slightly
different character.

I also learned recently that some of the Germanic countries are
replacing some characters, such as the Ü character being replaced by
UE. This, as was explained to me by a more educated individual, was
due to the difficulties of reproducing such characters on the web by
users from other countries. How this will affect my efforts is another
question to consider.
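As a rough illustration of the find-and-replace table idea, here is a Python sketch that applies the German UE-style spellings first and then strips accents from whatever is left. The table contents and the name fold_accents are assumptions for the example, not an official chart, and it only works on data that has already been decoded correctly.

```python
import unicodedata

# Hypothetical table: German convention of writing umlauts as vowel + E.
GERMAN_TABLE = {
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
}

def fold_accents(name: str, table=GERMAN_TABLE) -> str:
    """Replace table entries, then drop remaining accent marks."""
    # Compose first so "U" + combining diaeresis matches the table too.
    composed = unicodedata.normalize("NFC", name)
    replaced = "".join(table.get(ch, ch) for ch in composed)
    # Decompose and remove combining marks (É -> E, Ñ -> N).
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Here fold_accents("MÜNCHENSTEIN") gives "MUENCHENSTEIN" and fold_accents("MÉXICO") gives "MEXICO". Letters with no accent to strip, such as Ø, pass through unchanged, so the result is not guaranteed to be plain ASCII, and the right table differs by language.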

I am all open to suggestions and references to existing resources
(free or for purchase).

Thanks again.

JV
 
 
 

Post by mich » Wed, 11 Feb 2004 02:14:46

"jmev7" < XXXX@XXXXX.COM > wrote...


I mean where is the data coming from?


Hmmm.... so, how is the data being entered? What is the default system
locale of the machine on which it is being entered, and what is the
database's collation? And what is your default system locale now, when you
are looking at the data?


This would not be correct -- encoding translations are not done at the
keystroke level. If you look at the links I gave, all of those code pages
overlap each other such that the same code point means different things,
depending on which one you are looking at.
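The overlap is easy to see by decoding one byte value under several code pages. This is a small Python sketch; the three code pages chosen (Windows-1252, Windows-1251, and OEM 437) are only examples, and decode_byte is a made-up helper name.

```python
def decode_byte(value: int, codepages=("cp1252", "cp1251", "cp437")) -> dict:
    """Show what a single byte value means under each code page."""
    raw = bytes([value])
    # The byte never changes; only the interpretation of it does.
    return {cp: raw.decode(cp) for cp in codepages}
```

For byte 220 (hex DC), decode_byte(0xDC) returns Ü for cp1252, the Cyrillic letter Ь for cp1251, and the block character ▄ for cp437. The stored value is identical in all three cases, which is why the reader has to know which code page the writer used.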


Well, this is actually okay -- if they use the other form then that is what
is stored. And users know what it means.


Truly?

If you are doing multilingual work then you MUST consider upgrading to any
version of Access that uses a Unicode version of Jet (2000, 2002, or 2003).
This is the only way that you will be able to get good results in this area
without doing a LOT of work. Better for it to all work without adding
tremendous hacks....


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
 
 
 

Post by jmev » Wed, 11 Feb 2004 22:54:00


Well, the data is entered on websites throughout the world as the
company collects data from visitors. There is no default system
locale, as the data is entered from each visitor's system (home or
office) worldwide. My default system has the encoding set to Western
European (ISO). I believe the system collecting the data is an Oracle
system, which is situated in the US. The fact is, I'm weak on these
issues, so I'm going to have to delve deeper, and you're providing me
with some good questions to ask.


<snipped my stuff>

I think it's the overlap that is giving me the difficulty. If striking
a "U" always produced a "U", that would be great. If it produces a "Ü"
from one system and a "UE" from another, and they are both correct,
that's fine. When it produces a "Ãœ" combination, I then have to trap
it.


Actually, Access 2003 is my goal, as I figured that would help solve
this issue at least to some extent. Due to client restrictions and
inhibitions, we must continue to use the original version for now. I'm
sure you've dealt with clients that say no to progress, yet demand it
in other ways.
 
 
 

Post by mich » Wed, 11 Feb 2004 23:15:03

"jmev7" < XXXX@XXXXX.COM > wrote...

There IS a default system locale for the machine that hosts the database,
and there is a database collation. It is the direct reason why you are
having problems here.

Well, the problem is that you are trying to store multilingual data in a
non-multilingual database. This will essentially corrupt the data.
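A quick way to see the kind of corruption meant here is to push a name through a single code page and back. This Python sketch only imitates what a non-Unicode store does; it is not how Jet 3.5 or Access 97 is implemented, and round_trip is a hypothetical helper.

```python
def round_trip(text: str, codepage: str) -> str:
    """Store text in one code page and read it back.

    Characters the code page cannot represent are replaced with "?",
    which is one common way the loss shows up in stored data.
    """
    stored = text.encode(codepage, errors="replace")
    return stored.decode(codepage)
```

A Western European name such as "MÜNCHEN" survives round_trip(..., "cp1252") intact, but the Polish city "Łódź" comes back as "?ód?" because Ł and ź have no slot in that code page. The same name survives a Central European code page (cp1250), yet no single code page covers every visitor's language, which is the case for a Unicode store.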

You may have to really INSIST here, as there is no good way to support that
which cannot be supported. I used to consult for companies in your client's
situation and make a lot of money providing solutions in such cases, but it
is really only possible if you find someone with the expertise to come up
with an extreme solution. In the meantime, it is crucial that they
understand that ANY upgrade from the version they have is designed to work
here (and that Jet 4.0 itself is free!).


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.