From 397b61df257f72a8ce90792985f76497ba735da4 Mon Sep 17 00:00:00 2001
From: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Date: Tue, 20 Feb 2007 23:02:35 +0000
Subject: Use ASCII KCODE to prevent problems like missing characters or
 matching failures when clients send messages in something other than UTF-8

---
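Note: the ChangeLog entry in this patch describes what happens when
non-UTF-8 bytes are matched under a UTF-8 KCODE. The sketch below is
only a rough modern analogue, not the code touched by this commit:
$KCODE has no effect since Ruby 1.9, so it uses per-string encodings
and the /n regexp flag instead. The constant and helper names
(VOWELS, I_GRAVE_ONWARD, valid_utf8?, bytewise_tail) are hypothetical
and do not come from rbot.

```ruby
# Bytes quoted in the ChangeLog entry: Latin-1 for a, e, i, o, u with
# grave accents. String#b returns a copy tagged ASCII-8BIT (binary).
VOWELS = "\340\350\354\362\371".b

# Hypothetical helper: true when the bytes would be valid if read as UTF-8.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Byte-wise pattern: the /n flag makes the regexp ASCII-8BIT, so "\xEC"
# and "." each match exactly one byte, whatever the client's charset was.
I_GRAVE_ONWARD = /\xEC.*/n

# Hypothetical helper: the bytes from the first "\xEC" to the end of the
# line, or nil when that byte does not occur in the message.
def bytewise_tail(message)
  match = I_GRAVE_ONWARD.match(message.b)
  match && match[0]
end
```

Under these assumptions, valid_utf8?(VOWELS) is false, because the
bytes are not a well-formed UTF-8 sequence, while bytewise_tail(VOWELS)
still returns the last three bytes instead of stopping at "\354".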
 ChangeLog | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 358aab5f..403e8c41 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -6,6 +6,16 @@
 	<yaohan.chen@gmail.com>. People take turns to continue a chain of
 	words by saying words that begin with the final letter(s) of the
 	previous word.
+	* IRC messages are not UTF-8: Most of the string processing across
+	rbot is done against IRC messages, which do not have a well-defined
+	encoding. Although many clients now use UTF-8, there is no
+	guarantee that an arbitrary string received from IRC will be UTF-8
+	encoded. We have to force ASCII (byte-wise, charset-agnostic) matching
+	because otherwise some strings cause problems. For example, the byte
+	sequence "\340\350\354\362\371" (the vowels a, e, i, o, u in Latin-1,
+	each with a grave accent) is only considered up to the "\354" (i with
+	grave accent), so either the rest of the message is ignored or the
+	matching fails.
 
 2007-02-18  Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
 
-- 
cgit v1.2.3