<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: UTF-8 encoding/decoding in C</title>
	<atom:link href="http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/</link>
	<description>My blog about web development and such</description>
	<lastBuildDate>Sat, 14 Jan 2012 14:56:56 +0100</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Ernst</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-271</link>
		<dc:creator>Ernst</dc:creator>
		<pubDate>Mon, 14 Nov 2011 20:36:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-271</guid>
		<description>Hello Pontus,

Thanks a lot. I only need the decode part of it. We are struggling with ä, ü and ö (German). I will test the use cases we need, and I will also add some error checking for our purposes. You&#039;ve saved my day ;-)

Cu
      Ernst</description>
		<content:encoded><![CDATA[<p>Hello Pontus,</p>
<p>Thanks a lot. I only need the decode part of it. We are struggling with ä, ü and ö (German). I will test the use cases we need, and I will also add some error checking for our purposes. You&#8217;ve saved my day <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Cu<br />
      Ernst</p>
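<p>For the decode-only case with some error checking, a minimal sketch might look like the following. The name utf8_to_latin1 is hypothetical and is not part of utf8.h; it assumes the target is ISO-8859-1, so anything above U+00FF or any malformed sequence is rejected.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of utf8.h: decodes a UTF-8 string into a
 * newly allocated ISO-8859-1 string. Returns NULL on malformed input, on
 * code points above U+00FF, or if allocation fails. Caller frees. */
char *utf8_to_latin1(const char *src)
{
  assert(src != NULL);
  size_t len = strlen(src);
  char *out = malloc(len + 1); /* output is never longer than the input */
  size_t i = 0, o = 0;

  if (out == NULL)
    return NULL;

  while (i < len) {
    unsigned char b = (unsigned char)src[i];
    if (b < 0x80) {
      /* plain ASCII is copied through unchanged */
      out[o++] = (char)b;
      i += 1;
    } else if ((b == 0xC2 || b == 0xC3) && i + 1 < len &&
               ((unsigned char)src[i + 1] & 0xC0) == 0x80) {
      /* two-byte sequence covering U+0080..U+00FF */
      unsigned char next = (unsigned char)src[i + 1];
      out[o++] = (char)(((b & 0x03) << 6) | (next & 0x3F));
      i += 2;
    } else {
      /* truncated, malformed, or outside ISO-8859-1 */
      free(out);
      return NULL;
    }
  }
  out[o] = '\0';
  return out;
}
```

<p>Called with the UTF-8 bytes for ä, ü and ö, this should give back the single bytes 0xE4, 0xFC and 0xF6. The result has to be freed by the caller.</p>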
]]></content:encoded>
	</item>
	<item>
		<title>By: Pontus</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-270</link>
		<dc:creator>Pontus</dc:creator>
		<pubDate>Mon, 14 Nov 2011 09:48:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-270</guid>
		<description>I had forgotten about this! Now I have changed the sources according to Rodrigo&#039;s suggestions. But I can&#039;t guarantee it&#039;s bug free since I haven&#039;t tested it thoroughly.

https://github.com/poppa/PlayStation/tree/master/c/utf8</description>
		<content:encoded><![CDATA[<p>I had forgotten about this! Now I have changed the sources according to Rodrigo&#8217;s suggestions. But I can&#8217;t guarantee it&#8217;s bug free since I haven&#8217;t tested it thoroughly.</p>
<p><a href="https://github.com/poppa/PlayStation/tree/master/c/utf8" rel="nofollow">https://github.com/poppa/PlayStation/tree/master/c/utf8</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ernst Scheller</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-269</link>
		<dc:creator>Ernst Scheller</dc:creator>
		<pubDate>Sun, 13 Nov 2011 15:24:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-269</guid>
		<description>Hello Pontus,

I don&#039;t quite understand Rodrigo&#039;s changes, and as far as I can see, the modifications are not in your code. I have no example to verify Rodrigo&#039;s changes. Will you update your downloadable code?

Kind regards
                 Ernst</description>
		<content:encoded><![CDATA[<p>Hello Pontus,</p>
<p>I don&#8217;t quite understand Rodrigo&#8217;s changes, and as far as I can see, the modifications are not in your code. I have no example to verify Rodrigo&#8217;s changes. Will you update your downloadable code?</p>
<p>Kind regards<br />
                 Ernst</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pontus</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-259</link>
		<dc:creator>Pontus</dc:creator>
		<pubDate>Tue, 26 Jul 2011 22:38:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-259</guid>
		<description>Thanks a bunch for your contribution Rodrigo. I&#039;ll add it to the downloadable code.</description>
		<content:encoded><![CDATA[<p>Thanks a bunch for your contribution Rodrigo. I&#8217;ll add it to the downloadable code.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rodrigo P. A.</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-258</link>
		<dc:creator>Rodrigo P. A.</dc:creator>
		<pubDate>Sun, 24 Jul 2011 16:56:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-258</guid>
		<description>Hi again, I found a problem when using this with Brazilian Portuguese (pt-BR). This sample is not encoded correctly:

&quot;Olá Mundo&quot;

I found the bug in this function: xml_utf8_encode

I changed it to:


static char *xml_utf8_encode(const char *s, int len, int *newlen,
		      const XML_Char *encoding)
{
  int pos = len;
  int size;
  char *newbuf;
  unsigned int c;
  unsigned short (*encoder)(unsigned char) = NULL;
  xml_encoding *enc = xml_get_encoding(encoding);

  *newlen = 0;
  if (enc)
    encoder = enc-&gt;encoding_function;
  else
    /* If the target encoding was unknown, fail */
    return NULL;

  if (encoder == NULL) {
    /* If no encoder function was specified, return the data as-is.
     */
    newbuf = (char*)emalloc(len + 1);
    memcpy(newbuf, s, len);
    *newlen = len;
    newbuf[*newlen] = &#039;\0&#039;;
    return newbuf;
  }

  /* Start with room for the input plus a terminating NUL; the buffer
   * grows on demand inside the loop. */
  size = len + 1;
  newbuf = (char*)emalloc(size);
  while (pos &gt; 0) {
    c = encoder ? encoder((unsigned char)(*s)) : (unsigned short)(*s);
    /* Changed: grow the buffer before it can overflow. The exact
     * condition was lost from this comment, so this bound is assumed. */
    if (*newlen + 4 &gt;= size) {
      size += 16; /* add 16 bytes to the buffer */
      newbuf = (char*)erealloc(newbuf, size);
    }
    if (c &lt; 0x80) {
      newbuf[(*newlen)++] = (char) c;
    }
    else if (c &lt; 0x800) {
      newbuf[(*newlen)++] = (0xc0 &#124; (c &gt;&gt; 6));
      newbuf[(*newlen)++] = (0x80 &#124; (c &amp; 0x3f));
    }
    else if (c &lt; 0x10000) {
      newbuf[(*newlen)++] = (0xe0 &#124; (c &gt;&gt; 12));
      newbuf[(*newlen)++] = (0x80 &#124; ((c &gt;&gt; 6) &amp; 0x3f));
      newbuf[(*newlen)++] = (0x80 &#124; (c &amp; 0x3f));
    }
    else if (c &lt; 0x200000) {
      newbuf[(*newlen)++] = (0xf0 &#124; (c &gt;&gt; 18));
      newbuf[(*newlen)++] = (0x80 &#124; ((c &gt;&gt; 12) &amp; 0x3f));
      newbuf[(*newlen)++] = (0x80 &#124; ((c &gt;&gt; 6) &amp; 0x3f));
      newbuf[(*newlen)++] = (0x80 &#124; (c &amp; 0x3f));
    }
    pos--;
    s++;
  }

  newbuf[*newlen] = 0;
  //newbuf = erealloc(newbuf, (*newlen)+1);
  return newbuf;
}


and now it works fine!

Thank you</description>
		<content:encoded><![CDATA[<p>Hi again, I found a problem when using this with Brazilian Portuguese (pt-BR). This sample is not encoded correctly:</p>
<p>&#8220;Olá Mundo&#8221;</p>
<p>I found the bug in this function: xml_utf8_encode</p>
<p>I changed it to:</p>
<p>static char *xml_utf8_encode(const char *s, int len, int *newlen,<br />
		      const XML_Char *encoding)<br />
{<br />
  int pos = len;<br />
  int size;<br />
  char *newbuf;<br />
  unsigned int c;<br />
  unsigned short (*encoder)(unsigned char) = NULL;<br />
  xml_encoding *enc = xml_get_encoding(encoding);</p>
<p>  *newlen = 0;<br />
  if (enc)<br />
    encoder = enc-&gt;encoding_function;<br />
  else<br />
    /* If the target encoding was unknown, fail */<br />
    return NULL;</p>
<p>  if (encoder == NULL) {<br />
    /* If no encoder function was specified, return the data as-is.<br />
     */<br />
    newbuf = (char*)emalloc(len + 1);<br />
    memcpy(newbuf, s, len);<br />
    *newlen = len;<br />
    newbuf[*newlen] = &#039;\0&#039;;<br />
    return newbuf;<br />
  }</p>
<p>  /* Start with room for the input plus a terminating NUL; the buffer<br />
   * grows on demand inside the loop. */<br />
  size = len + 1;<br />
  newbuf = (char*)emalloc(size);<br />
  while (pos &gt; 0) {<br />
    c = encoder ? encoder((unsigned char)(*s)) : (unsigned short)(*s);<br />
    /* Changed: grow the buffer before it can overflow. The exact<br />
     * condition was lost from this comment, so this bound is assumed. */<br />
    if (*newlen + 4 &gt;= size) {<br />
      size += 16; /* add 16 bytes to the buffer */<br />
      newbuf = (char*)erealloc(newbuf, size);<br />
    }<br />
    if (c &lt; 0x80) {<br />
      newbuf[(*newlen)++] = (char) c;<br />
    }<br />
    else if (c &lt; 0x800) {<br />
      newbuf[(*newlen)++] = (0xc0 | (c &gt;&gt; 6));<br />
      newbuf[(*newlen)++] = (0x80 | (c &amp; 0x3f));<br />
    }<br />
    else if (c &lt; 0x10000) {<br />
      newbuf[(*newlen)++] = (0xe0 | (c &gt;&gt; 12));<br />
      newbuf[(*newlen)++] = (0x80 | ((c &gt;&gt; 6) &amp; 0x3f));<br />
      newbuf[(*newlen)++] = (0x80 | (c &amp; 0x3f));<br />
    }<br />
    else if (c &lt; 0x200000) {<br />
      newbuf[(*newlen)++] = (0xf0 | (c &gt;&gt; 18));<br />
      newbuf[(*newlen)++] = (0x80 | ((c &gt;&gt; 12) &amp; 0x3f));<br />
      newbuf[(*newlen)++] = (0x80 | ((c &gt;&gt; 6) &amp; 0x3f));<br />
      newbuf[(*newlen)++] = (0x80 | (c &amp; 0x3f));<br />
    }<br />
    pos--;<br />
    s++;<br />
  }</p>
<p>  newbuf[*newlen] = 0;<br />
  //newbuf = erealloc(newbuf, (*newlen)+1);<br />
  return newbuf;<br />
}</p>
<p>and now it works fine!</p>
<p>Thank you</p>
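<p>To illustrate why a buffer of only len bytes is too small for a sample like &quot;Olá Mundo&quot;, here is a minimal sketch that assumes the input is ISO-8859-1 and allocates the worst case up front instead of growing the buffer. The name latin1_to_utf8 is hypothetical; it is not the xml_utf8_encode function above and does not use emalloc.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the xml_utf8_encode above: encodes an
 * ISO-8859-1 string as UTF-8 in a newly allocated buffer. Returns NULL
 * if allocation fails. Caller frees. */
char *latin1_to_utf8(const char *src)
{
  assert(src != NULL);
  size_t len = strlen(src);
  /* Each input byte becomes at most two output bytes, plus the NUL. */
  char *out = malloc(len * 2 + 1);
  size_t o = 0;

  if (out == NULL)
    return NULL;

  for (size_t i = 0; i < len; i++) {
    unsigned char c = (unsigned char)src[i];
    if (c < 0x80) {
      /* ASCII stays a single byte */
      out[o++] = (char)c;
    } else {
      out[o++] = (char)(0xC0 | (c >> 6));   /* lead byte: 110xxxxx */
      out[o++] = (char)(0x80 | (c & 0x3F)); /* continuation: 10xxxxxx */
    }
  }
  out[o] = '\0';
  return out;
}
```

<p>The 9-byte ISO-8859-1 input &quot;Olá Mundo&quot; comes out as 10 bytes, because á (0xE1) becomes the pair 0xC3 0xA1. Allocating 2 * len + 1 wastes a little memory but avoids any reallocation inside the loop.</p>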
]]></content:encoded>
	</item>
	<item>
		<title>By: Rodrigo P. A.</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-257</link>
		<dc:creator>Rodrigo P. A.</dc:creator>
		<pubDate>Sun, 24 Jul 2011 14:39:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-257</guid>
		<description>Hi, the returned pointers need to be freed after use. For example,

create this function:

#include &lt;stdlib.h&gt; /* for free() */

void utf8_clean(void *ptr)
{
	if ( ptr ) free(ptr);
}

and use it like this:

#include &quot;utf8.h&quot;

int main(int argc, char **argv)
{
  char *iso_str = &quot;Pontus Östlund&quot;;
  char *utf8_str;

  utf8_str = utf8_encode(iso_str);
  iso_str  = utf8_decode(utf8_str);
  utf8_clean( utf8_str );
  utf8_clean( iso_str );

  return 0;
}</description>
		<content:encoded><![CDATA[<p>Hi, the returned pointers need to be freed after use. For example,</p>
<p>create this function:</p>
<p>#include &lt;stdlib.h&gt; /* for free() */</p>
<p>void utf8_clean(void *ptr)<br />
{<br />
	if ( ptr ) free(ptr);<br />
}</p>
<p>and use it like this:</p>
<p>#include &quot;utf8.h&quot;</p>
<p>int main(int argc, char **argv)<br />
{<br />
  char *iso_str = &quot;Pontus Östlund&quot;;<br />
  char *utf8_str;</p>
<p>  utf8_str = utf8_encode(iso_str);<br />
  iso_str  = utf8_decode(utf8_str);<br />
  utf8_clean( utf8_str );<br />
  utf8_clean( iso_str );</p>
<p>  return 0;<br />
}</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pontus</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-159</link>
		<dc:creator>Pontus</dc:creator>
		<pubDate>Wed, 06 Jan 2010 10:38:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-159</guid>
		<description>Now, I haven&#039;t really tested these functions thoroughly so be careful, there might be some bugs in there!</description>
		<content:encoded><![CDATA[<p>Now, I haven&#8217;t really tested these functions thoroughly so be careful, there might be some bugs in there!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: yctai</title>
		<link>http://www.poppa.se/blog/utf-8-encodingdecoding-in-c/comment-page-1/#comment-158</link>
		<dc:creator>yctai</dc:creator>
		<pubDate>Wed, 06 Jan 2010 08:01:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.poppa.se/blog/?p=10#comment-158</guid>
		<description>This is REALLY what I need! Thanks a lot:)</description>
		<content:encoded><![CDATA[<p>This is REALLY what I need! Thanks a lot:)</p>
]]></content:encoded>
	</item>
</channel>
</rss>

