<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pontus Östlund</title>
	<atom:link href="http://www.poppa.se/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.poppa.se/blog</link>
	<description>My blog about web development and such</description>
	<lastBuildDate>Thu, 19 Aug 2010 22:07:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Pike project &#8211; module stub creator</title>
		<link>http://www.poppa.se/blog/pike-project-module-stub-creator/</link>
		<comments>http://www.poppa.se/blog/pike-project-module-stub-creator/#comments</comments>
		<pubDate>Thu, 19 Aug 2010 22:06:31 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[GTK]]></category>
		<category><![CDATA[Pike]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=455</guid>
		<description><![CDATA[
I recently began learning how to create Pike modules in C. The Pike module C API seems great and once you&#8217;ve sorted things out the modules are easy to build and install. Nonetheless, when creating a C module from scratch there are a couple of files you need and some configuration of those before [...]]]></description>
			<content:encoded><![CDATA[<p><a href="/blog/data/images/pike-project.png" class="no-file"><img src="/blog/data/images/pike-project.png/530" alt="Screenshot of Pike project" /></a></p>
<p>I recently began learning how to create <a href='http://pike.ida.liu.se/'>Pike</a> modules in C. The Pike module C API seems great, and once you&#8217;ve sorted things out the modules are easy to build and install. Nonetheless, when creating a C module from scratch there are a couple of files you need, and some configuration of those, before everything is set to go. This is where &#8220;pike-project&#8221; comes into play.</p>
<p>Pike-project is a simple <a href='http://www.gtk.org/'>GTK</a> program (it also works as a command line tool) written in Pike itself. The program creates the basics for a working Pike C module, or a plain installable Pike module. Then you can just start programming.</p>
<p>The program is available at my <a href='http://github.com/poppa/Pike-Modules/blob/master/tools/pike-project'>Github repository</a>.</p>
<p>BTW! At the moment I suppose it only works on Linux.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/pike-project-module-stub-creator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pingly &#8211; a Roxen module for automatic Twingly pinging</title>
		<link>http://www.poppa.se/blog/pingly-a-roxen-module-for-automatic-twingly-pinging/</link>
		<comments>http://www.poppa.se/blog/pingly-a-roxen-module-for-automatic-twingly-pinging/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 09:49:55 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Roxen]]></category>
		<category><![CDATA[Pike]]></category>
		<category><![CDATA[Share]]></category>
		<category><![CDATA[Twingly]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=451</guid>
		<description><![CDATA[Twingly is a blog search engine that focuses on indexing the &#8220;blogosphere&#8221; rather than being a generic search engine. Twingly has a ping service that lets you ping Twingly when you have new content on your blog so that Twingly can head over there and index the new content asap.
Since Roxen CMS has event hooks [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.twingly.com">Twingly</a> is a blog search engine that focuses on indexing the &#8220;blogosphere&#8221; rather than being a generic search engine. Twingly has a ping service that lets you ping Twingly when you have new content on your blog so that Twingly can head over there and index the new content asap.</p>
<p>Since <a href="http://www.roxen.com">Roxen</a> CMS has event hooks, this module listens for newly published files and automatically notifies Twingly when it finds one.</p>
<p>The only thing needed is to set up a config file in the SiteBuilder&#8217;s workarea so that this module knows under which paths newly published content should notify Twingly and with which arguments. But all this is documented in the module.</p>
<p><strong>One note!</strong> If you run a replicated environment, install this module on <strong>one</strong> of the frontend servers, not the backend. If it is installed on a backend, Twingly might head over to your site before the new content has been replicated.</p>
<p><a href="/blog/data/scripts/pingly.pike">The Pingly Roxen module</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/pingly-a-roxen-module-for-automatic-twingly-pinging/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>JavaScript minifier filter module for Roxen</title>
		<link>http://www.poppa.se/blog/javascript-minifier-filter-module-for-roxen/</link>
		<comments>http://www.poppa.se/blog/javascript-minifier-filter-module-for-roxen/#comments</comments>
		<pubDate>Mon, 28 Jun 2010 12:17:46 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Misc]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Pike]]></category>
		<category><![CDATA[Roxen]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=441</guid>
		<description><![CDATA[Nowadays web sites and web applications tend to be more and more JavaScript driven which results in humongous JavaScript files. It&#8217;s not uncommon to have several hundred kilobytes of JavaScript on a site. Of course web browsers cache stuff like JavaScript so that it only is requested from the server once. But judging from [...]]]></description>
			<content:encoded><![CDATA[<p>Nowadays web sites and web applications tend to be more and more JavaScript driven which results in humongous JavaScript files. It&#8217;s not uncommon to have several hundred kilobytes of JavaScript on a site. Of course web browsers cache stuff like JavaScript so that it is only requested from the server once. But judging from the visitor logs at work most people only visit our site once a month or so, which means that the cache will have expired and all those scripts have to be requested again on the first visit.</p>
<p>Now, there are several ways to compact JavaScripts: <a href="http://dean.edwards.name/packer/">Packer</a>, <a href="http://yuilibrary.com/downloads/#yuicompressor">YUI Compressor</a>, <a href="http://shrinksafe.dojotoolkit.org/">ShrinkSafe</a>, <a href="http://www.crockford.com/javascript/jsmin.html">jsmin</a> and many more. Some of these just remove redundant white space and comments, some obfuscate the code and shorten variable and function names and what not. Many of these scripts and programs are very fine but they require you to manually minify your scripts, and that&#8217;s just a hassle! </p>
<p>But since we use <a href="http://www.roxen.com">Roxen</a> CMS at work things get much easier if you write your own <em>Roxen filter module</em> which automatically minifies JavaScripts on the fly, given they meet certain criteria. And so I did!</p>
<p>I ported the original <a href="http://www.crockford.com/javascript/jsmin.html">jsmin</a> code written in C to <a href="http://pike.ida.liu.se">Pike</a>. Then it was just a matter of creating a simple filter module for Roxen. And then it was all done.</p>
<p>You can use two criteria to determine if a script should be minified or not:</p>
<ol>
<li><strong>Path glob:</strong> In the module settings you can specify any number of directory globs or full paths. If a requested JavaScript either is in a directory matching a glob or is a direct match it will be minified.</li>
<li><strong>Query string variable:</strong> In the module settings you can define a variable name. If a variable with that name exists in the query string of the request, the JavaScript will be minified. So:
<pre>&lt;script type="text/javascript" src="myscript.js?jsmin=1"&gt;&lt;/script&gt;</pre>
<p>will minify <code>myscript.js</code>.</p></li>
</ol>
<p>And that&#8217;s that!</p>
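<p>The two criteria above can be sketched in a few lines of JavaScript. This is only an illustration and not the code of the Roxen module (which is written in Pike): the names <code>globToRegExp</code>, <code>shouldMinify</code> and the shape of the <code>settings</code> object are made up for the example, and only the <code>*</code> and <code>?</code> wildcards are handled.</p>
<pre><code lang="js">
// Hypothetical helper: turn a simple glob ("*" and "?" only) into a RegExp.
function globToRegExp(glob) {
  // Escape every regexp meta character except the two wildcards
  var escaped = glob.replace(/[.+^${}()|[\]\\]/g, function (ch) {
    return "\\" + ch;
  });
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Hypothetical decision function: path is the requested file, query is an
// object with the query string variables, settings holds the two criteria.
function shouldMinify(path, query, settings) {
  // Criterion 2: the configured variable exists in the query string
  if (settings.queryVariable) {
    if (Object.prototype.hasOwnProperty.call(query, settings.queryVariable)) {
      return true;
    }
  }

  // Criterion 1: the file itself, or its directory, matches a glob
  var dir = path.slice(0, path.lastIndexOf("/") + 1);
  return settings.globs.some(function (glob) {
    var re = globToRegExp(glob);
    return re.test(path) || re.test(dir);
  });
}

var settings = { globs: ["/js/*", "/static/app.js"], queryVariable: "jsmin" };
console.log(shouldMinify("/js/lib/jquery.js", {}, settings));     //-> true
console.log(shouldMinify("/other/x.js", { jsmin: "1" }, settings)); //-> true
console.log(shouldMinify("/other/x.js", {}, settings));           //-> false
</code></pre>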
<p><a href="/blog/data/scripts/jsmin.pike">jsmin Roxen filter module</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/javascript-minifier-filter-module-for-roxen/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Roxen Application Launcher 0.4.5</title>
		<link>http://www.poppa.se/blog/roxen-application-launcher-0-4-5/</link>
		<comments>http://www.poppa.se/blog/roxen-application-launcher-0-4-5/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 21:52:52 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Roxen]]></category>
		<category><![CDATA[Github]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[Vala]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=421</guid>
		<description><![CDATA[
Okay, here comes an update of my Roxen Application Launcher (come again?) for Linux. 
There are no major changes in this release. The connection to the Roxen server is now stored in a shared object so that it can use a &#8220;keep-alive&#8221; connection. Not that I think it matters a great deal.
There&#8217;s now an option to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="/blog/data/images/ral-045.jpg" class="no-file"><img src="/blog/data/images/ral-045.jpg/530" alt="Screenshot of Roxen Application Launcher" /></a></p>
<p>Okay, here comes an update of my <a href="http://roxen.se">Roxen</a> Application Launcher (<a href="/blog/stuff/#roxen-applauncher">come again?</a>) for <a href="http://www.linux.com/">Linux</a>. </p>
<p>There are no major changes in this release. The connection to the Roxen server is now stored in a shared object so that it can use a &#8220;keep-alive&#8221; connection. Not that I think it matters a great deal.</p>
<p>There&#8217;s now an option to change the behavior of the application window&#8217;s close button so that it hides the application to the tray &#8211; or notification area as it&#8217;s called in <a href="http://gnome.org">Gnome</a> &#8211; rather than closing the application. </p>
<p>More <a href="http://live.gnome.org/Vala">Vala</a> programming to the people &#8211; <a href="http://github.com/poppa/Roxen-Application-Launcher">Sources at Github</a>.</p>
<p><a href="/blog/data/roxenlauncher-0.4.5.tar.gz">Roxen Application Launcher 0.4.5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/roxen-application-launcher-0-4-5/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Blue Screen of Death</title>
		<link>http://www.poppa.se/blog/blue-screen-of-death/</link>
		<comments>http://www.poppa.se/blog/blue-screen-of-death/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 13:21:00 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Misc]]></category>
		<category><![CDATA[Rant]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=406</guid>
		<description><![CDATA[
You don&#8217;t get surprised when you get to work one morning and your monitors look like this, given your computer is running the blessing from Redmond!
Man I wish I was running Linux at work as well! 
(And no: I hardly drink any Coca Cola at all  )
]]></description>
			<content:encoded><![CDATA[<p><img src="/blog/data/images/bs.jpg/530" /></p>
<p>You don&#8217;t get surprised when you get to work one morning and your monitors look like this, given your computer is running the blessing from <a href="http://maps.google.se/maps?hl=sv&#038;safe=active&#038;ie=UTF8&#038;q=redmond+microsoft&#038;fb=1&#038;gl=se&#038;hq=microsoft&#038;hnear=Redmond,+WA,+USA&#038;ei=kBhsS4X9F43r-AbL4rCGBA&#038;ved=0CB0QtgMwAA&#038;z=13&#038;iwloc=A">Redmond</a>!</p>
<p>Man I wish I was running Linux at work as well! </p>
<p>(And no: I hardly drink any Coca Cola at all <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/blue-screen-of-death/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>GTK hacking in Pike</title>
		<link>http://www.poppa.se/blog/gtk-hacking-in-pike/</link>
		<comments>http://www.poppa.se/blog/gtk-hacking-in-pike/#comments</comments>
		<pubDate>Mon, 18 Jan 2010 23:40:59 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[GTK]]></category>
		<category><![CDATA[Pike]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=386</guid>
		<description><![CDATA[I&#8217;ve found out that it&#8217;s great fun programming desktop applications and of course it gets more fun the more you learn. Now I&#8217;m doing a Twitter client in Pike &#8211; my favorite programming language &#8211; mostly because I wanted to try out GTK programming in Pike. I use the good Twitter client Pino &#8211; written [...]]]></description>
			<content:encoded><![CDATA[<p><a href="/blog/data/images/tweepi.jpg" class="no-file"><img src="/blog/data/images/tweepi.jpg/220" class="alignright" alt="Tweepi, the Twitter client written in Pike" /></a>I&#8217;ve found out that it&#8217;s great fun programming desktop applications and of course it gets more fun the more you learn. Now I&#8217;m doing a <a href="http://twitter.com">Twitter</a> client in <a href="http://pike.ida.liu.se">Pike</a> &#8211; my favorite programming language &#8211; mostly because I wanted to try out <a href="http://www.gtk.org/">GTK</a> programming in Pike. I use the good Twitter client <a href="http://pino-app.appspot.com/">Pino</a> &#8211; written in <a href="http://live.gnome.org/Vala/">Vala</a> &#8211; and I have borrowed the concept and layout from it. I call it <strong>Tweepi</strong>.</p>
<p>The only major difference between Tweepi and Pino &#8211; besides being written in different programming languages &#8211; is that Pino uses WebKit to draw the status messages whereas I am using good old GTK widgets &#8211; and I guess there are no bindings to WebKit in Pike for that matter <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>One thing I noticed is that the <code>Gtk.Label</code> widget sucks at displaying longer texts that wrap over several lines. Since the label widget handles some HTML formatting I thought that it would be suitable for displaying the status messages, but the text looked like shit, with line breaks wherever it felt like. And the <code>Gtk.TextView</code> widget doesn&#8217;t handle formatting by default, so I Googled some and found that you can format text in <code>Gtk.TextView</code>s by inserting <code>Gtk.TextTag</code>s at desired positions. And since Pike has the most awesome HTML parser it was just a matter of sending the text through the parser, creating some <code>Gtk.TextTag</code>s and inserting them at the corresponding positions in the text buffer. (Well, actually it wasn&#8217;t that easy, but with some help from a Python class I found on the web it was doable.)</p>
<p>So now I have a start at something that is a <code>Gtk.HtmlTextView</code> &#8211; actually it inherits <code>Gtk.TextView</code> but has an additional method <code>insert_html_text(string text)</code> &#8211; and albeit quite simple at the moment it&#8217;s worth continuing on. The code for the <code>HtmlTextView</code> is available at my <a href="http://github.com/poppa/Pike-Modules/blob/master/Misc.pmod/GTK2.pmod/module.pmod">Github repository</a>.</p>
<p>In general I find the GTK implementation in Pike to be pretty OK, but there is some verbose and tedious stuff, like getting the text from a <code>Gtk.TextView</code>: </p>
<pre><code lang="pike">
Gtk.TextBuffer b = my_textview->get_buffer();
string text = b->get_text(b->get_start_iter(), b->get_end_iter(), 0);
</code></pre>
<p>which in Vala and C# would be done like:</p>
<pre><code lang="vala">
// Vala
string text = my_textview.get_buffer().text;

// C#
string text = myTextView.Buffer.Text;
</code></pre>
<p>Anyway! Tweepi isn&#8217;t done yet but I think I have solved the most tedious stuff and it&#8217;s starting to become useful. It&#8217;ll probably be done in a couple of weeks and I will of course release the sources then.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/gtk-hacking-in-pike/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Roxen Application Launcher 0.4.4</title>
		<link>http://www.poppa.se/blog/roxen-application-launcher-0-4-4/</link>
		<comments>http://www.poppa.se/blog/roxen-application-launcher-0-4-4/#comments</comments>
		<pubDate>Wed, 13 Jan 2010 22:58:28 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Roxen]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[GTK]]></category>
		<category><![CDATA[Vala]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=368</guid>
		<description><![CDATA[So, here&#8217;s a new release of the Roxen Application Launcher for Linux (RAL). The previous versions used my homemade (and rather sloppy) HTTP client which didn&#8217;t handle redirects or secure connections &#8211; thank you tec for the feedback &#8211; since I had some major problems getting libsoup working with binary files like images and [...]]]></description>
			<content:encoded><![CDATA[<p>So, here&#8217;s a new release of the Roxen Application Launcher for Linux (RAL). The previous versions used my homemade (and rather sloppy) HTTP client which didn&#8217;t handle redirects or secure connections &#8211; thank you <a href="/blog/new-roxen-application-launcher-for-linux-written-in-vala/#comments">tec</a> for the feedback &#8211; since I had some major problems getting <code>libsoup</code> working with binary files like images and such. Binary files were heavily scrambled when read from or written to disk, so I made my own simple HTTP client that kept the data as a byte array to prevent some underlying libraries (GLib) from fiddling with it.</p>
<p>But I solved the <code>libsoup</code> issue so now the RAL handles redirects and secure connections. This is how I solved it:</p>
<h2>The <code>libsoup</code> issue</h2>
<p>When uploading a file back to the <a href="http://roxen.com">Roxen</a> server I use <code>IOChannel</code> (<code>g_io_channel</code> in plain C) instead of <code>Gio</code>. So the upload works like this:</p>
<pre><code lang="vala">
var sess = new Soup.SessionSync();
var mess = new Soup.Message("PUT", get_uri());
mess.request_headers.append("Cookie", get_cookie());
mess.request_headers.append("Translate", "f");

IOChannel ch = new IOChannel.file(local_file, "r");
ch.set_encoding(null); // Enables reading of binary data
string data;
size_t len;
ch.read_to_end(out data, out len);

mess.request_body.append(Soup.MemoryUse.COPY, data, len);
sess.send_message(mess);
</code></pre>
<p>And that seems to work like a charm!</p>
<p>When downloading data it&#8217;s a bit more tricky! Of course I tried using <code>IOChannel</code> in this case too, but that made no difference. Downloaded images ended up 4 bytes long! But then I thought: You can make your own C bindings in Vala (remember the Vala compiler generates C code) through what are called Vapi files. So what I did was write a C function that takes a <code>SoupMessageBody</code> object/struct passed from Vala and writes the data part to a file given as argument.</p>
<pre><code lang="cpp">
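#include &lt;stdio.h&gt;
#include &lt;glib.h&gt;
#include &lt;libsoup/soup.h&gt;

/* Writes the raw bytes of a SoupMessageBody to the given file. */
gboolean save_soup_data(SoupMessageBody *data, const char *file)
{
  FILE *fh;

  /* "wb": open in binary mode so the bytes are written untouched */
  if ((fh = fopen(file, "wb")) == NULL) {
    fprintf(stderr, "Unable to open file \"%s\" for writing!\n", file);
    return FALSE;
  }

  size_t wrote = fwrite(data->data, 1, (size_t)data->length, fh);
  fclose(fh);

  /* A short write means the file on disk is incomplete: report failure */
  if (wrote != (size_t)data->length) {
    fprintf(stderr, "wrote (%lu) != data->length (%lu). Data may have been "
                    "truncated\n", (unsigned long)wrote,
                    (unsigned long)data->length);
    return FALSE;
  }

  return TRUE;
}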
</code></pre>
<p>And this was then made available to Vala by the following Vapi file:</p>
<pre><code lang="vala">
[CCode (cprefix = "", lower_case_cprefix = "", cheader_filename = "")]
namespace Soppa // Soppa is Swedish for Soup <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />
{
  [CCode (cname = "save_soup_data")]
  public bool save_soup_data(Soup.MessageBody data, string file);
}
</code></pre>
<p>And this is what the actual Vala code downloading the files looks like:</p>
<pre><code lang="vala">
var sess = new Soup.SessionSync();
var mess = new Soup.Message("GET", get_uri());
mess.request_headers.append("Cookie", get_cookie());
mess.request_headers.append("Translate", "f");
sess.send_message(mess);

if (mess.status_code == Soup.KnownStatusCode.OK) {
  // Here I call the C function made available through the Vapi file
  if (Soppa.save_soup_data(mess.response_body, local_file)) {
    message("The file was downloaded and written to disk OK");
  }
  else {
    message("Failed writing data to disk!");
  }
}
</code></pre>
<p>So that&#8217;s that on that! <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
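<p>As an aside on the 4 byte downloads mentioned above: one plausible explanation, and it is only an assumption that has not been checked against the Vala or GLib code involved, is that the binary data was handled as a NUL-terminated string somewhere along the way. A JPEG/JFIF file starts with the bytes <code>FF D8 FF E0</code> followed by a zero byte, so anything that stops at the first zero byte sees exactly 4 bytes. The JavaScript sketch below only illustrates that arithmetic; the helper name <code>cStringLength</code> is made up for the example.</p>
<pre><code lang="js">
// The first bytes of a typical JPEG/JFIF file: FF D8 FF E0, a two byte
// segment length (00 10) and the identifier "JFIF" followed by a zero byte.
var jpegStart = Uint8Array.from(
  [0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46, 0x00]);

// Hypothetical helper: the length as seen by code that treats the data as
// a NUL-terminated C string, i.e. everything up to the first zero byte.
function cStringLength(bytes) {
  var nul = bytes.indexOf(0);
  return nul === -1 ? bytes.length : nul;
}

console.log(cStringLength(jpegStart)); //-> 4
console.log(jpegStart.length);         //-> 11
</code></pre>
<p>Passing the data around together with an explicit length, as the C function above does with <code>data->length</code>, avoids depending on a terminator at all.</p>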
<h2>The notification</h2>
<p><img src="/blog/data/images/libnotify.png" class="alignright"/> I also &#8211; just for fun &#8211; implemented a notification mechanism through <code>libnotify</code>. Since I believe that can be rather annoying it&#8217;s not activated by default but can easily be activated by a checkbox in the user interface.</p>
<h2>The packages</h2>
<p>The Roxen Application Launcher for Linux can be downloaded from the <a href="http://github.com/poppa/Roxen-Application-Launcher/downloads"><strong>download page</strong></a> at <a href="http://github.com">Github</a>, where the <a href="http://github.com/poppa/Roxen-Application-Launcher"><strong>work in progress sources</strong></a> are also available, or from the link below!</p>
<p><a href="/blog/data/roxenlauncher-0.4.4.tar.gz">Roxen Application Launcher 0.4.4</a></p>
<p>Stay black!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/roxen-application-launcher-0-4-4/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Extracting text from PDFs</title>
		<link>http://www.poppa.se/blog/extracting-text-from-pdfs/</link>
		<comments>http://www.poppa.se/blog/extracting-text-from-pdfs/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 16:24:09 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Textifyer]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=355</guid>
		<description><![CDATA[
Unwanted line breaks in text copied from PDF
Anybody working with information sooner or later has to copy and paste text from PDF-files. And anybody who has done that knows what a pain in the a** that is! You get actual line breaks from the visual line breaks in the PDF. The other day I began [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/blog/data/images/textifyer-3.png/530" alt="Unwanted line breaks in text copied from PDF" /><br />
<small><em>Unwanted line breaks in text copied from PDF</em></small></p>
<p>Anybody working with information sooner or later has to copy and paste text from PDF-files. And anybody who has done that knows what a pain in the a** that is! You get actual line breaks from the visual line breaks in the PDF. The other day I began a job where I have to copy and paste text from a whole bunch of PDF files and it didn&#8217;t take long before I almost exploded with anger <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p><strong>So I thought:</strong> Why not make a simple application that extracts the text from the PDF and &#8211; as far as possible &#8211; normalizes the unwanted line breaks.</p>
<h2>And then there was Textifyer</h2>
<p>So I fired up <em>Visual C# Express</em> and started hacking. I soon found the <a href="http://www.pdfbox.org">PDFbox</a> component &#8211; using <a href="http://ikvm.net">IKVM.NET</a> &#8211; and it didn&#8217;t take long before I had some code that actually extracted the text from a PDF file. (<a href="http://www.codeproject.com/KB/string/pdf2text.aspx">a PDF extraction in C#  howto</a>)</p>
<p>I figured out how to detect unwanted line breaks: Each line with an unwanted line break ends with a space character. Lines with a wanted line break don&#8217;t (in 99% of the cases). So it is just a matter of looping over the lines: if a line ends with a space, skip adding a line break and let the next line continue in the same text buffer. </p>
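<p>The trailing space rule is simple enough to sketch in a few lines. The snippet below is a JavaScript illustration of the idea, not Textifyer&#8217;s actual C# code, and the function name <code>normalizeLineBreaks</code> is made up for the example.</p>
<pre><code lang="js">
// Join lines that end with a space (unwanted, visual line breaks) and keep
// the line breaks after lines that do not (wanted line breaks).
function normalizeLineBreaks(text) {
  var out = [];
  var buffer = "";

  text.split(/\r?\n/).forEach(function (line) {
    buffer += line;
    if (!line.endsWith(" ")) {
      // A wanted line break: flush the collected text as one line
      out.push(buffer);
      buffer = "";
    }
  });

  // Text left over when the very last line ended with a space
  if (buffer !== "") {
    out.push(buffer.replace(/\s+$/, ""));
  }

  return out.join("\n");
}

console.log(normalizeLineBreaks("Lorem ipsum \ndolor sit amet.\nNew paragraph."));
//-> Lorem ipsum dolor sit amet.
//-> New paragraph.
</code></pre>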
<p><img src="/blog/data/images/textifyer-2.png/530" alt="Unwanted line breaks removed" /><br />
<small><em>Unwanted line breaks removed</em></small></p>
<p>So now I just have to clean up the interface and bug test the program &#8211; which will happen automatically since I&#8217;m copying and pasting from a whole bunch of PDFs at the moment. When I feel it&#8217;s working alright I will release the program. There&#8217;s really nothing hardcore about it anyway <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p><img src="/blog/data/images/textifyer.png/530" alt="Textifyer: Drag-n-drop enabled" /><br />
<small><em>Of course there&#8217;s drag-n-drop support!</em></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/extracting-text-from-pdfs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bitlyfier &#8211; A Bit.ly client for GNOME</title>
		<link>http://www.poppa.se/blog/bitlyfier-a-bit-ly-client-for-gnome/</link>
		<comments>http://www.poppa.se/blog/bitlyfier-a-bit-ly-client-for-gnome/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 23:46:14 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[Vala]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=333</guid>
		<description><![CDATA[ For those of us tweeting &#8211; or sharing web addresses in general &#8211; the long addresses with extensive query strings you want to share aren&#8217;t too user friendly. So we have Bit.ly, among others, that lets you shorten a URL &#8211; or give it an alias if you like &#8211; and also gives you [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/blog/data/images/bitlyfier/bitlyfier-about.png" alt="Bitlyfier" class="alignright nobg"/> For those of us tweeting &#8211; or sharing web addresses in general &#8211; the long addresses with extensive query strings you want to share aren&#8217;t too user friendly. So we have <a href="http://bit.ly">Bit.ly</a>, among others, that lets you shorten a URL &#8211; or give it an alias if you like &#8211; and also gives you statistics on how many clicks it has and if it&#8217;s shared on Twitter and what not. </p>
<p>Since I&#8217;m on a quest to learn the programming language <a href="http://live.gnome.org/Vala/">Vala</a> I thought: why not make a Bit.ly desktop client for <a href="http://gnome.org/">GNOME</a>? So I did!</p>
<h2>The desktop client</h2>
<p>There&#8217;s really nothing extraordinary about it, in fact it&#8217;s quite simple. Put a long URL in the input field and hit &#8220;OK&#8221;. You&#8217;ll get the shortened URL back in the same input field.</p>
<p><em>NOTE! The screenshots show the Swedish translation but the interface is originally in English.</em></p>
<p><em><small>Shortening a long URL</small></em><br />
<img src="/blog/data/images/bitlyfier/bitlyfier-2.png" alt="Shortening an URL with Bitlyfier" /></p>
<p><em><small>The shortened URL</small></em><br />
<img src="/blog/data/images/bitlyfier/bitlyfier-3.png" alt="The Bit.ly shortened URL" /></p>
<p>To use the application you will of course need a Bit.ly account. The first time Bitlyfier is launched it will ask for your Bit.ly account settings. Just fill in your username and API key (it&#8217;s found on your account page at <a href="http://bit.ly/account">http://bit.ly/account</a>).</p>
<p><em><small>Bitlyfier account settings</small></em><br />
<img src="/blog/data/images/bitlyfier/bitlyfier-settings.png" alt="The bitlyfier settings dialog" /></p>
<h2>The command line interface</h2>
<p>For the hacker in you, Bitlyfier can also be used as a command line tool. These are the options:</p>
<pre><code lang="none">
Usage:
  bitlyfier [OPTION...] - Bitlyfier, URL shortener/expander

Help Options:
  -h, --help        Show help options

Application Options:
  -e, --expand      Expands the given URL
  -s, --shorten     Shortens the given URL
  -n, --no-gui      Sets the application in command line mode
  -g, --gconf       Invokes setting username and apikey
</code></pre>
<p>NOTE! You should quote the value of the &#8216;-s&#8217; flag. If the URL to be shortened<br />
contains a query string with ampersands, the shell will cut the URL off at the first ampersand if it&#8217;s not<br />
quoted. </p>
<p>So to shorten a long URL, run:</p>
<pre>  user@machine:~$ bitlyfier -n -s "http://domain.com/long/url/to/shorten"</pre>
<h2>The Vala Bitly API classes</h2>
<p>The Bitly API class I&#8217;ve written can of course be used standalone (it&#8217;s located in <code><a href="http://bit.ly/4DsuVg">src/bitly.vala</a></code> in the sources package downloadable below). Here&#8217;s an example of usage:</p>
<pre><code lang="vala">
// main.vala
// Compile: valac --pkg gee-1.0 --pkg json-glib-1.0 --pkg libsoup-2.4 -o main main.vala bitly.vala

int main(string[] argv)
{
  Bitly.Api api = new Bitly.Api("username", "R_the_api_key");
  Bitly.Response response = api.shorten("http://domain.com/the/long/url");
  stdout.printf("Short URL: %s\n", response.get_string("shortUrl"));

  response = api.stats("A2ma2z");
  stdout.printf("Clicks: %d\n", response.get_integer("clicks"));

  return 0;
}
</code></pre>
<p>You can read more about the Bit.ly API and what the API methods do at <a href="http://bit.ly/6HIqjS">http://bit.ly/6HIqjS</a>.</p>
<h2>The sources</h2>
<p>The development sources of this application are available at <a href="http://bit.ly/7QFHvC"><strong>Bitlyfier at Github</strong></a>. The current stable release can be found at the <a href="http://github.com/poppa/Bitlyfier/downloads"><strong>Download page</strong></a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/bitlyfier-a-bit-ly-client-for-gnome/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>JavaScript URI class</title>
		<link>http://www.poppa.se/blog/javascript-uri-class/</link>
		<comments>http://www.poppa.se/blog/javascript-uri-class/#comments</comments>
		<pubDate>Tue, 22 Dec 2009 17:31:22 +0000</pubDate>
		<dc:creator>Pontus</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Vala]]></category>

		<guid isPermaLink="false">http://www.poppa.se/blog/?p=319</guid>
		<description><![CDATA[The other day I needed a URI class for JavaScript. I was doing some stuff where I needed to alter certain parts of a URI. I bet there are a couple of URI classes for JavaScript out there but I can be a bit nit-picky about code and how it&#8217;s written  
Anyway, I had a [...]]]></description>
			<content:encoded><![CDATA[<p>The other day I needed a URI class for JavaScript. I was doing some stuff where I needed to alter certain parts of a URI. I bet there are a couple of URI classes for JavaScript out there but I can be a bit nit-picky about code and how it&#8217;s written <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Anyway, I had a URI parser <acronym title="Regular Expression">regexp</acronym> lying around which I wrote for a <a href="http://live.gnome.org/Vala/">Vala</a> class (before I found the <code>Soup.URI</code> class) and I thought that since that&#8217;s reusable I could hack up a JavaScript URI class myself. So I did!</p>
<p>Here are some examples of usage:</p>
<pre><code lang="js">
var uri = new URI("http://poppa.se/blog/javascript-uri-class/");
console.log(uri.scheme); //-> http
console.log(uri.host);   //-> poppa.se
console.log(uri.path);   //-> /blog/javascript-uri-class/
console.log(uri.port);   //-> 80
</code></pre>
<p>Now, if we want to alter the host so that it contains <code>www</code> we do:</p>
<pre><code lang="js">
uri.host = "www.poppa.se";
console.log(uri.toString()); //-> http://www.poppa.se/blog/javascript-uri-class/
</code></pre>
<p>It&#8217;s also easy to alter query string variables:</p>
<pre><code lang="js">
var uri = new URI("http://host.com/?name=poppa&#038;lang=se");
uri.variables["name"] = 'Günther';
uri.variables["lang"] = 'de';
console.log(uri.toString()); //-> http://host.com/?name=Günther&#038;lang=de
</code></pre>
<p>And I think that&#8217;s pretty smooth <img src='http://www.poppa.se/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href="/blog/data/scripts/uri.js">Download the URI class</a></p>
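<p>For readers curious about how a regexp can split a URI into the parts used above, here is a minimal sketch. It is not the code of the downloadable class: it uses the generic regular expression from RFC 3986 (Appendix B), the function name <code>parseUri</code> is made up for the example, and falling back to port 80 for <code>http</code> is an assumption made to mirror the output shown earlier. IPv6 hosts are not handled.</p>
<pre><code lang="js">
// The regular expression from RFC 3986, Appendix B. The groups of interest:
// 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
var URI_RE = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Assumed default ports, used when the URI does not state one
var DEFAULT_PORTS = { http: 80, https: 443 };

// Hypothetical helper: returns a plain object with the URI parts
function parseUri(str) {
  var m = URI_RE.exec(str);
  var scheme = m[2] || "";
  var authority = m[4] || "";
  // Drop any "user:password@" part, then split the host from the port
  var hostPort = authority.split("@").pop().split(":");

  return {
    scheme: scheme,
    host: hostPort[0],
    port: hostPort[1] ? parseInt(hostPort[1], 10)
                      : (DEFAULT_PORTS[scheme] || null),
    path: m[5],
    query: m[7] || "",
    fragment: m[9] || ""
  };
}

var parts = parseUri("http://poppa.se/blog/javascript-uri-class/");
console.log(parts.scheme); //-> http
console.log(parts.host);   //-> poppa.se
console.log(parts.path);   //-> /blog/javascript-uri-class/
console.log(parts.port);   //-> 80
</code></pre>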
]]></content:encoded>
			<wfw:commentRss>http://www.poppa.se/blog/javascript-uri-class/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
