properly utf-8 encoding of LT language codes

1gangleri
Oct 8, 2010, 7:50 am

Hi! This is basically a repost of
http://www.librarything.com/topic/82757
Language code « vol » - « Volapük » is using Unicode Character 'REPLACEMENT CHARACTER' in catalog, etc.

http://www.librarything.com/language.php?l=pro&alllanguages=1
Language: Proven�al (to 1500)

http://www.librarything.com/language.php?l=vol&alllanguages=1
Language: Volap�k

Regards Reinhardt

2PaulFoley
Oct 8, 2010, 10:51 pm

It actually has ü (and ç) in there, it's just incorrectly encoded as Latin-1 instead of UTF-8. The table mapping ISO-639 codes to names is bad.
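
To illustrate the diagnosis above, here is a minimal sketch of how Latin-1 bytes read as UTF-8 turn into the replacement character, and how re-encoding repairs them. The function names are hypothetical and this is not LibraryThing's actual code; it only assumes the stored names are Latin-1 bytes while the page is served as UTF-8.

```python
def misread_latin1_as_utf8(text: str) -> str:
    """Simulate the bug: store text as Latin-1, then decode it as UTF-8."""
    raw = text.encode("latin-1")  # 'ü' becomes the single byte 0xFC
    # 0xFC is not valid UTF-8, so the decoder substitutes U+FFFD.
    return raw.decode("utf-8", errors="replace")


def repair_latin1_bytes(raw: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (assumes raw really is Latin-1)."""
    return raw.decode("latin-1").encode("utf-8")


broken = misread_latin1_as_utf8("Volapük")
fixed = repair_latin1_bytes("Volapük".encode("latin-1")).decode("utf-8")
print(broken)  # the ü shows up as U+FFFD
print(fixed)   # the ü survives
```

The repair has to run on the original Latin-1 bytes; once a string has been saved with U+FFFD in it, the original character can no longer be recovered from that string alone.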

3timspalding
Oct 24, 2010, 10:10 pm

Assigned to Casey, because this is kicking my ass.

4timspalding
Jan 12, 2011, 8:42 pm

Update on why this isn't fixed yet (see http://www.librarything.com/topic/107331).

Assigned to another employee.

5caseydurfee
Feb 25, 2011, 1:41 am

Fixed, at long last.

6gangleri
Mar 20, 2011, 4:23 pm

Thanks a lot!

8gangleri
Edited: Mar 19, 2012, 12:44 pm

adding some search keywords here:
Unicode Character 'REPLACEMENT CHARACTER' (U+FFFD) �
http://www.fileformat.info/info/unicode/char/fffd/index.htm

9gangleri
Edited: Mar 30, 2012, 8:07 pm

follow up message:
/topic/134442 - "properly utf-8 encoding of LT messages (in all languages) - � detected again"

10gangleri
Mar 30, 2012, 3:07 pm

changed the bug category
reopening and closing by member afterwards

11gangleri
Mar 30, 2012, 3:08 pm

closing by member

12brightcopy
Mar 30, 2012, 3:16 pm

Err... WHY? If they mark it as fixed, there's really no reason to reopen it unless it wasn't fixed. Tim himself frequently marks bugs as fixed even after they've been closed by members. I think he likes being able to see how many bugs he's fixed. Did you just reopen it by mistake and then have to close it?

13gangleri
Edited: Mar 30, 2012, 8:14 pm

>12 brightcopy: Did you see >8 gangleri: ?
follow up message:
/topic/134442 - "properly utf-8 encoding of LT messages (in all languages) - � detected again"
----
"Volapük" was fixed in the past. Then I saw the problem again. I have no clue who can edit the LT English master messages and insert old bugs again. Now it is ridiculous. The only one concerned about replacement characters seems to be me; I see few serious contributions relating to UTF-8, and now questions like this. All required information is in the follow-up topic.

14brightcopy
Mar 30, 2012, 9:39 pm

The question was why you reopened and closed it.

15gangleri
Edited: Mar 30, 2012, 11:58 pm

>14 brightcopy: It was assigned to the category "Site-wide problems (6)". I learned recently that translations use the category "Non-English LibraryThing". This is also why it was difficult to find this thread.

>5 caseydurfee: stated that it (the actual topic) is fixed (according to the posted URL). I confirmed this in >7 gangleri: and it is still fine today: http://www.librarything.com/language.php?l=mos&alllanguages=1

Sorry, but the system is not transparent. It is not like the MediaWiki messages, where you can trace back exactly what happened and who made a change when (and if you are more skilled you may look in svn to see which messages have been added).
If you tell a child "Do not cross this red light!" that is not the same as telling them "Never ever cross a red light!" It is difficult to know how many instances of "Volapük" and "Mooré" (the red lights, and more generally the � characters) are in the DB of all LT master English messages. It is not transparent who is allowed to add messages to the database. (I noticed today that via the "Role" field in edit book anyone (any ha$ke*r) can add new messages. You may say this is as intended, but other sites would see this as a se§cu*ri$ty ho$le.)
The topic referred to in >9 gangleri: is generalized, stating exactly that all instances of the replacement character need to be fixed in master messages. I see no way to explain it more clearly. Please let the people fixing the bug post their questions first.
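
As a rough sketch of what "finding all instances" could look like, the snippet below scans a set of message strings for U+FFFD. The `messages` dictionary and the function name are hypothetical stand-ins; how LT actually stores its master messages is not known from this thread.

```python
REPLACEMENT_CHARACTER = "\ufffd"


def find_damaged_messages(messages):
    """Return the sorted keys of all messages that contain U+FFFD."""
    return sorted(
        key for key, text in messages.items()
        if REPLACEMENT_CHARACTER in text
    )


# Hypothetical sample data modelled on the language names in this thread.
messages = {
    "vol": "Volap\ufffdk",
    "pro": "Proven\ufffdal (to 1500)",
    "mos": "Mooré",
}
damaged = find_damaged_messages(messages)
print(damaged)
```

A scan like this only locates the damaged entries; the correct text still has to be restored from the original source, because U+FFFD does not record which character it replaced.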

16brightcopy
Mar 30, 2012, 11:19 pm

Oh man, you're making my brain hurt here.

Is the confusion that you don't realize that "Fixed" is a final, closed status, just like "Closed by member"?