<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>dan&#039;s linux blog</title>
	<atom:link href="http://www.dark.ca/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dark.ca</link>
	<description>direct from the mysterious land of the sysadmin</description>
	<lastBuildDate>Tue, 22 Nov 2011 09:00:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Elasticsearch backup strategies</title>
		<link>http://www.dark.ca/2011/11/22/elasticsearch-backup-strategies/</link>
		<comments>http://www.dark.ca/2011/11/22/elasticsearch-backup-strategies/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 09:00:41 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[How-to]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[elasticsearch]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=299</guid>
		<description><![CDATA[Hello again! Today we&#8217;re going to talk about backup strategies for Elasticsearch. One popular way to make backups of ES requires the use of a separate ES node, while another relies entirely on the underlying file system of a given set of ES nodes. The ES-based approach: Bring up an independent (receiving) ES node on a [...]]]></description>
			<content:encoded><![CDATA[<p>Hello again! Today we&#8217;re going to talk about backup strategies for <a href="http://elasticsearch.org" target="_blank">Elasticsearch</a>. One popular way to make backups of ES requires the use of a separate ES node, while another relies entirely on the underlying file system of a given set of ES nodes.</p>
<p>The ES-based approach:</p>
<ul>
<li>Bring up an independent (receiving) ES node on a machine that has network access to the actual ES cluster.</li>
<li>Trigger a script to perform a full index import <em>from</em> the ES cluster <em>to</em> the receiving node.</li>
<li>Since the receiving node is unique, every shard will be represented on said node.</li>
<li>Shut down the receiving node.</li>
<li>Preserve the <code>/data/</code> directory from the receiving node.</li>
</ul>
<p>The file system-based approach:</p>
<ul>
<li>Identify a <em>quorum</em> of nodes in the ES cluster.</li>
<li>Quorum is necessary in order to ensure that all of the shards are represented.</li>
<li>Trigger a script that will preserve the <code>/data/</code> directory of each selected node.</li>
</ul>
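<p>The coverage requirement above can be checked before any files are copied. The following is only a sketch under stated assumptions: the node names, the shard layout, and the <code>missing_shards</code> helper are all hypothetical, and in practice the layout would have to come from the cluster itself.</p>
<pre>
def missing_shards(shards_by_node, selected_nodes, total_shards):
    """Return the sorted shard numbers that no selected node holds."""
    covered = set()
    for node in selected_nodes:
        covered.update(shards_by_node.get(node, ()))
    return sorted(set(range(total_shards)) - covered)


# Hypothetical layout: five shards (0 to 4) spread over three nodes.
layout = {
    "es01": {0, 1, 2},
    "es02": {2, 3},
    "es03": {0, 3, 4},
}

print(missing_shards(layout, ["es01", "es02"], 5))
print(missing_shards(layout, ["es01", "es03"], 5))
</pre>
<p>An empty list means that the selected nodes hold a copy of every shard between them; anything else names the shards that the backup would miss.</p>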
<p>At first glance the file system-based approach appears simpler &#8211; and it is &#8211; but it comes with some drawbacks, notably the fact that coherency is <em>impossible to guarantee</em> due to the amount of time required to preserve <code>/data/</code> on each node. In other words, if data changes on a node between the start and end times of the preservation mechanism, those changes may or may not be backed up. Furthermore, from an operational perspective, restoring nodes from individual shards may be problematic.</p>
<p>The ES-based approach does not have the coherency problem; however, beyond the fact that it is more complex to implement and maintain, it is also more costly in terms of service delivery. The actual import process itself requires a large number of requests to be made to the cluster, and the resulting resource consumption on both the cluster nodes and the receiving node is non-trivial. On the other hand, having a single, coherent representation of every shard in one place may pay dividends during a restoration scenario.</p>
<p>As is often the case, there is no one solution that is going to work for everybody all of the time &#8211; different environments have different needs, which call for different answers.  That said, if your <em>primary</em> goal is a consistent, coherent, and complete backup that can be easily restored when necessary (and overhead be damned!), then the ES-based approach is clearly the superior of the two.</p>
<h3>import it !</h3>
<p>Regarding the ES-based approach, it may be helpful to take a look at a simple import script as an example.  How about a quick and dirty Perl script (based on something from <a href="https://github.com/noplay" target="_blank">Noplay</a>) ?</p>
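<pre>use strict;
use warnings;
use ElasticSearch;

# The receiving (backup) node that will hold the copy.
my $local = ElasticSearch-&gt;new(
    servers =&gt; 'localhost:9200'
);
# One member of the live cluster; no_refresh tells the client to use only
# the listed server rather than fetching the live node list.
my $remote = ElasticSearch-&gt;new(
    servers    =&gt; 'cluster_member:9200',
    no_refresh =&gt; 1
);

# Walk the whole 'content' index with a scan-type scrolled search,
# keeping each scroll context alive for five minutes.
my $source = $remote-&gt;scrolled_search(
    index       =&gt; 'content',
    search_type =&gt; 'scan',
    scroll      =&gt; '5m'
);
# Feed every document from the cluster into the receiving node.
$local-&gt;reindex(source=&gt;$source);</pre>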
<p>You&#8217;ll want to replace the relevant elements with something sane for your environment, of course.</p>
<p>As for preserving the resulting <code>/data/</code> directory (in either method), I will leave that as an exercise to the reader, since there are simply too many equally relevant ways to go about it.  It&#8217;s worth noting that the import method doesn&#8217;t need to be complex <em>at all</em> &#8211; in fact, it really shouldn&#8217;t be, since complex backup schemes tend to have more chances for failure than is necessary.</p>
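<p>For what it is worth, one of those many ways is a plain timestamped tarball. The sketch below is an assumption-laden illustration rather than a recommendation: the <code>archive_data_dir</code> helper and the <code>es-data</code> label are made up for the example, and the demo runs against a throwaway directory instead of a real (stopped) ES node.</p>
<pre>
import os
import tarfile
import tempfile
import time


def archive_data_dir(data_dir, dest_dir, label="es-data"):
    """Write data_dir into a timestamped .tar.gz under dest_dir."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(dest_dir, "%s-%s.tar.gz" % (label, stamp))
    # Store paths relative to the directory name (for example "data/...").
    top_level = os.path.basename(os.path.normpath(data_dir))
    with tarfile.open(target, "w:gz") as tar:
        tar.add(data_dir, arcname=top_level)
    return target


# Demo on a scratch directory that stands in for the real /data/.
with tempfile.TemporaryDirectory() as scratch:
    data_dir = os.path.join(scratch, "data")
    os.makedirs(os.path.join(data_dir, "nodes"))
    archive = archive_data_dir(data_dir, scratch)
    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(names)
</pre>
<p>On a real receiving node, the function would be pointed at the actual data path after the node has been shut down, with the destination on separate storage.</p>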
<p>Happy indexing!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/11/22/elasticsearch-backup-strategies/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Send your logs to the cloud; Loggly vs. Papertrail</title>
		<link>http://www.dark.ca/2011/11/17/send-your-logs-to-the-cloud-loggly-vs-papertrail/</link>
		<comments>http://www.dark.ca/2011/11/17/send-your-logs-to-the-cloud-loggly-vs-papertrail/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 16:31:45 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[web service]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=321</guid>
		<description><![CDATA[Centralised cloud-based logging.  It sounds tasty &#8211; and it is &#8211; but who should you go with?  Well, Loggly and Papertrail are the only games in town when it comes to the aforementioned service; the only other competitor in this space is Splunk Storm, but their offering &#8211; well-pedigreed though it may be &#8211; is [...]]]></description>
			<content:encoded><![CDATA[<p>Centralised cloud-based logging.  It sounds tasty &#8211; and it is &#8211; but who should you go with?  Well, <a href="http://loggly.com/" target="_blank">Loggly</a> and <a href="https://papertrailapp.com/" target="_blank">Papertrail</a> are the only games in town when it comes to the aforementioned service; the only other competitor in this space is <a href="https://www.splunkstorm.com/" target="_blank">Splunk Storm</a>, but their offering &#8211; well-pedigreed though it may be &#8211; is strictly in private beta at this time, and therefore cannot really be considered a valid option.</p>
<p>The fact of the matter is that Loggly and Papertrail are, at a high level, functionally identical. They offer more or less the same bouquet of functionality, including alert triggers, aggregate visualisation, and even map reduce tools for data mining and reporting. Loggly has been around longer, and has a better track record for open-source involvement, meaning that the eco-system around their service is more mature; however, that doesn&#8217;t mean that they are necessarily superior to Papertrail in terms of the actual service.</p>
<p>My suggestion: If you&#8217;re in a hurry, flip a coin and go with one or the other. If you have the time, you should go ahead and try both out for a bit; Papertrail has a 7-day free trial programme, and Loggly is free (in perpetuity) for sufficiently small amounts of data and retention (which is no problem if you&#8217;re just poking around).</p>
<p>I&#8217;m <em>very</em> interested in hearing about actual user experiences with either or both, so please don&#8217;t hesitate to add a comment or drop me a line directly via the contact form.</p>
<p><strong>Edit</strong>: From <a href="http://spootnik.org/" target="_blank">@pyr</a> : « you  can also consider <a href="http://www.datadoghq.com" target="_blank">@datadoghq</a> which has a different take on the issue but might fit the bill. »</p>
<p><strong>Edit 2</strong>: From the comments, there&#8217;s also <a href="https://logentries.com/">Logentries</a>, which I don&#8217;t personally have any experience with, but which appears to have a reasonably comprehensive offering as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/11/17/send-your-logs-to-the-cloud-loggly-vs-papertrail/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Heavyweight tilt : GitHub vs. Bitbucket</title>
		<link>http://www.dark.ca/2011/11/08/heavyweight-tilt-github-vs-bitbucket/</link>
		<comments>http://www.dark.ca/2011/11/08/heavyweight-tilt-github-vs-bitbucket/#comments</comments>
		<pubDate>Tue, 08 Nov 2011 09:42:38 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[web service]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=284</guid>
		<description><![CDATA[When it comes to code hosting on The Internets today, GitHub is absolutely the hottest, trendiest service going &#8211; but it&#8217;s not alone. Right now, the primary direct competitor to GitHub is Bitbucket, and choosing the best service for you or your company can be a less than obvious scenario &#8211; so let&#8217;s break it [...]]]></description>
			<content:encoded><![CDATA[<p>When it comes to code hosting on The Internets today, <a href="https://github.com/" target="_blank">GitHub</a> is absolutely the hottest, trendiest service going &#8211; but it&#8217;s not alone. Right now, the primary direct competitor to GitHub is <a href="https://bitbucket.org/">Bitbucket</a>, and choosing the best service for you or your company can be a less than obvious scenario &#8211; so let&#8217;s break it down, shall we?</p>
<p>GitHub is generally considered to be the most popular code hosting and collaboration site out there today. They have an excellent track record for innovation and evolution of their service, and they put their money where their mouth is, notably by promoting and releasing their own internal tools into the open source community.  Their site offers a buffet of ever-improving facilities for collaborative activity, notably including an integrated issue tracker and excellent code comparison tools, among others. To be fair, not every feature has had the same level of care and attention paid to it, and as a result, some elements feel quite a bit more mature than others; however, again, they never stop trying to make things better.</p>
<p>Bitbucket looks <em>a lot</em> like GitHub.  That&#8217;s a fact.  I don&#8217;t honestly know which one came first, but it&#8217;s clear that today they&#8217;re bouncing off of each other in terms of design, features, and functionality.  You can more or less transpose your user experience between the two sites without missing too much of a beat, so for a casual user looking to contribute here and there, you get two learning curves for the price of one (nice).  Bitbucket&#8217;s pace of evolution is (perhaps) less blistering, but they too are capable of rolling out new and improved toys over time.</p>
<h3>let&#8217;s get down to brass tacks</h3>
<p>Both services offer the same basic functionality, which is the ability to create an account, and associate that account with any number of <em>publicly-accessible</em> repositories; however, if you want a <em>private</em> repository, GitHub will make you pay for it, whereas Bitbucket offers it gratis.  There, as it is said, lies the rub.  More on this later.</p>
<p>One of the big differences between the two services lies in their respective origins: GitHub remains an independent start-up, whereas Bitbucket (although once independent) was acquired by &#8211; and is now strongly associated with &#8211; Atlassian (of JIRA fame). It is my opinion that this affects the cultural make-up of Bitbucket in subtle ways, leading to a more corporate take on development, deployment, and importantly, community relations and involvement.  Take a look at their respective blogs (go ahead, I&#8217;ll wait).</p>
<p>A quick scan of the past few months from each blog will reveal some important differences:</p>
<ul>
<li>GitHub&#8217;s release schedule is more aggressive, with improvements and new features coming more regularly, whereas Bitbucket places greater emphasis on their tight integration with JIRA, Jenkins, and other industry tools.</li>
<li>Bitbucket advertises paid services and software on their blog, whereas GitHub advertises open source projects.</li>
<li>Bitbucket&#8217;s blog has one recent author, whereas GitHub&#8217;s blog has many recent authors.</li>
<li>GitHub hosts more community events (notably drinkups, heh) over a greater geographic area than Bitbucket (and their posts have more community response overall).</li>
</ul>
<p>Also, check out GitHub&#8217;s &#8220;<a href="https://github.com/about" target="_blank">about us</a>&#8221; page &#8211; <a href="http://www.quora.com/Brogramming/How-does-a-programmer-become-a-brogrammer" target="_blank">brogrammers</a> abound!  I&#8217;d compare the group to Bitbucket, but as it so happens, they don&#8217;t have an analogous page.</p>
<p>Previously I mentioned that GitHub would like you to pay for private repositories.  This is obviously part of their revenue scheme (and who can blame them for wanting to get that cheese?), but it also has the side-effect of making people <em>choose</em> to willingly host their projects publicly.  This has ended up creating a (very) large community of active participants representing a variety of languages and interests, which in turn results in more projects, and so on and so forth.  This feedback loop is interesting since it auto-builds popularity: the more people use it, the more people will use it.</p>
<p>These observations are, in no way, objective statements of the superiority of one platform over the other &#8211; they <em>are</em>, however, indicative of cultural differences between the two companies.  This is (or, at least, should be) a non-trivial element when deciding which service is right for you or your organisation.  For example, I&#8217;m a beer-drinking open source veteran that works in start-ups and small companies, so culturally my preferences are different than those of a suit-wearing system architect, working for a thousand-person consulting firm.  One isn&#8217;t necessarily better than the other &#8211; they&#8217;re just not the same (and that&#8217;s OK).</p>
<h3>but wait, there&#8217;s more</h3>
<p>Alright, here comes the shocker: for paid services (i.e. private repositories), GitHub is <em>much more expensive</em> than Bitbucket.  As in nowhere near the same price.  At all.  How can this be?  Well, I&#8217;m not privy to the financials of either company (if I were, I doubt I&#8217;d have written this post), but hey, the money for all those great open source projects, drinkups, and (bluntly) <em>salaries</em> has to come from somewhere &#8211; and while Bitbucket has Atlassian&#8217;s pockets backing them, GitHub has to stand on their own successes, and live with their own failures.</p>
<p>The two services are not dissimilar technically speaking, so it&#8217;s really up to you to decide which culture is better suited for your project.  Do you just need a spot to put your private project, that you program alone, isolated from the greater Internet?  Bitbucket.  Do you have a public project that you&#8217;d like other people to discover, hack on together, and build a community around?  GitHub.  As for paid services, well I suppose that comes down to whether you want to pay extra to support what GitHub is doing or not.</p>
<p>Now, let&#8217;s be fair, for a lot of companies, &#8220;culture&#8221; is an irrelevant factor in their purchasing department &#8211; cost is the only concern.  Fair enough.  But let&#8217;s say you&#8217;ve got a team of developers, all of whom already have their own projects on GitHub, are familiar with the tools and processes, and have a network of fellow hackers built-in and ready to go.  In that case, perhaps culture is worth something after all.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/11/08/heavyweight-tilt-github-vs-bitbucket/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Improvements in Cassandra 1.0, briefly stated</title>
		<link>http://www.dark.ca/2011/11/03/improvements-in-cassandra-1-0-briefly-stated/</link>
		<comments>http://www.dark.ca/2011/11/03/improvements-in-cassandra-1-0-briefly-stated/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 11:53:55 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[cassandra]]></category>
		<category><![CDATA[nosql]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=275</guid>
		<description><![CDATA[Datastax recently announced the availability of Cassandra 1.0 (stable), and along with that announcement, they made a series of blog posts (1, 2, 3, 4, 5) about many of the great new features and improvements that the current version brings to the table. For those of you looking for an executive summary of those posts, [...]]]></description>
			<content:encoded><![CDATA[<p>Datastax recently announced the availability of Cassandra 1.0 (stable), and along with that announcement, they made a series of blog posts (<a href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression">1</a>, <a href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management">2</a>, <a href="http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra">3</a>, <a href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance">4</a>, <a href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-windows-service-new-cql-clients-and-more">5</a>) about many of the great new features and improvements that the current version brings to the table.</p>
<p>For those of you looking for an executive summary of those posts, you&#8217;re in luck, cause I&#8217;ve got your back on this one.</p>
<ul>
<li>New multi-layer approach to compression that provides improvements to <em>both</em> write and (especially) read operations.</li>
<li>Said compression strategy also yields potentially significant disk space savings.</li>
<li>Leverages the <a href="http://jna.java.net/">JNA library</a> in order to provide in-memory caching; this procedure is optimised for garbage collection, resulting in a more efficient collection and a smaller overall footprint.</li>
<li>A much improved compaction strategy results in less costly compaction runs, improving overall performance on each node.</li>
<li>Fewer requests are made over the network, and said requests are smaller in size, improving overall performance across the cluster.</li>
</ul>
<p>In short, 1.0 is a very significant, very important upgrade to 0.8 (et al.), and one which will likely bring it to the forefront of the hardcore big data / nosql scene at large.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/11/03/improvements-in-cassandra-1-0-briefly-stated/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nagios plugin to parse the result of a MySQL query</title>
		<link>http://www.dark.ca/2011/07/25/nagios-plugin-to-parse-the-result-of-a-mysql-query/</link>
		<comments>http://www.dark.ca/2011/07/25/nagios-plugin-to-parse-the-result-of-a-mysql-query/#comments</comments>
		<pubDate>Mon, 25 Jul 2011 09:29:09 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=269</guid>
		<description><![CDATA[Hello all !  As before, I wrote a Nagios plugin that will perform an arbitrary MySQL query and parse the results.  It is written in Ruby (and tested against 1.9 only, so ymmv).  If that sounds interesting to you, check out my GitHub. Usage ./check_mysql_query.rb [-f &#60;config_file&#62;] -q 'SELECT etc...' -h, --help Help! -v, --verbose [...]]]></description>
			<content:encoded><![CDATA[<p>Hello all !  As before, I wrote a Nagios plugin that will perform an arbitrary MySQL query and parse the results.  It is written in Ruby (and tested against 1.9 <strong>only</strong>, so ymmv).  If that sounds interesting to you, check out my <a href="https://github.com/phrawzty">GitHub</a>.</p>
<pre>Usage ./check_mysql_query.rb [-f &lt;config_file&gt;] -q 'SELECT etc...'
    -h, --help                       Help!
    -v, --verbose                    Human output
    -q, --query 'QUERY'              The query to execute
    -s, --result STRING              Expected (string) result. No need for -w or -c.
    -r, --regex REGEX                Expected (string) result expressed as regular expression. No need for -w or -c.
    -w, --warn VALUE                 Warning threshold
    -c, --crit VALUE                 Critical threshold
    -f, --file CONFIG_FILE           Config file (replaces the switches below).
        --host                       MySQL host
        --database                   MySQL database
        --user                       MySQL user
        --pass                       MySQL pass</pre>
<p>A YAML config file can be used to populate the MySQL criteria (instead of using the arguments each time). Example :</p>
<pre>---
:mysql_host: 'db01.your.net'
:mysql_database: 'name_of_database'
:mysql_user: 'username'
:mysql_pass: 'p455w0rd'</pre>
<p>If a result of either type string or regular expression is specified :</p>
<ul>
<li>A match is OK and anything else is CRIT.</li>
<li>The warn / crit thresholds will be ignored.</li>
</ul>
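<p>The match-or-critical rule above is simple enough to sketch in a few lines. This is an illustration in Python rather than the Ruby code of the plugin itself; the <code>check_result</code> helper is hypothetical, and whether the real plugin anchors its regular expressions or searches anywhere in the string is an assumption here (the sketch searches).</p>
<pre>
import re

# Standard Nagios plugin exit codes for the two possible outcomes.
OK = 0
CRITICAL = 2


def check_result(actual, expected=None, pattern=None):
    """Return OK when the query result matches, CRITICAL otherwise."""
    text = str(actual)
    if expected is not None:
        matched = text == expected
    elif pattern is not None:
        # Assumption: the pattern may match anywhere in the result.
        matched = re.search(pattern, text) is not None
    else:
        raise ValueError("either expected or pattern is required")
    return OK if matched else CRITICAL


print(check_result("green", expected="green"))
print(check_result("17 rows", pattern=r"^\d+ rows$"))
print(check_result("yellow", expected="green"))
</pre>
<p>No warning state appears anywhere in this path, which is why the warn and crit thresholds have nothing to do when a string or regular expression is supplied.</p>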
<p>The <code>--warn</code> and <code>--crit</code> arguments conform to the threshold format guidelines noted here :<br />
<a href="http://nagiosplug.sourceforge.net/developer-guidelines.html">http://nagiosplug.sourceforge.net/developer-guidelines.html</a></p>
<p>How you choose to implement the plugin is, of course, up to you. Here are some suggestions :</p>
<pre># check a mysql query for a string result
define command {
command_name check_mysql_query-string
command_line /&lt;path&gt;/check_mysql_query.rb -f /&lt;path&gt;/check_mysql_query.yml -q '$ARG1$' -s '$ARG2$'
}
# do the same but check against a regex
define command {
command_name check_mysql_query-regex
command_line /&lt;path&gt;/check_mysql_query.rb -f /&lt;path&gt;/check_mysql_query.yml -q '$ARG1$' -r '$ARG2$'
}
# and finally some standard integer results (good for warn and crit levels)
define command {
command_name check_mysql_query-int
command_line /&lt;path&gt;/check_mysql_query.rb -f /&lt;path&gt;/check_mysql_query.yml -q '$ARG1$' -w '$ARG2$' -c '$ARG3$'
}</pre>
<p>Finally, I invite you to peruse the commit history for the list of contributors :<br />
<a href="https://github.com/phrawzty/check_mysql_query/commits/master">https://github.com/phrawzty/check_mysql_query/commits/master</a></p>
<p>Github pull requests welcome !</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/07/25/nagios-plugin-to-parse-the-result-of-a-mysql-query/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nagios plugin to parse JSON from an HTTP response</title>
		<link>http://www.dark.ca/2011/03/04/nagios-plugin-to-parse-json-from-an-http-response/</link>
		<comments>http://www.dark.ca/2011/03/04/nagios-plugin-to-parse-json-from-an-http-response/#comments</comments>
		<pubDate>Fri, 04 Mar 2011 14:47:33 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[common tools]]></category>
		<category><![CDATA[nagios]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=262</guid>
		<description><![CDATA[Hello all !  I wrote a plugin for Nagios that will parse JSON from an HTTP response.  If that sounds interesting to you, feel free to check out my Github.  The plugin itself is written in Ruby &#8211; 1.9 initially, but it&#8217;s compatible with earlier versions thanks to some excellent contributions from other Githubbers.  Pull [...]]]></description>
			<content:encoded><![CDATA[<p>Hello all !  I wrote a plugin for Nagios that will parse JSON from an HTTP response.  If that sounds interesting to you, feel free to check out my <a href="https://github.com/phrawzty/check_http_json">Github</a>.  The plugin itself is written in Ruby &#8211; 1.9 initially, but it&#8217;s compatible with earlier versions thanks to some excellent contributions from other Githubbers.  Pull requests welcome !</p>
<pre>Usage: ./check_http_json.rb -u &lt;URI&gt; -e &lt;element&gt; -w &lt;warn&gt; -c &lt;crit&gt;
 -h, --help                       Help info
 -v, --verbose                    Human output
 -u, --uri URI                    Target URI
 -e, --element ELEMENT            Desired element (ex. foo=&gt;bar=&gt;ish is foo.bar.ish)
 -r, --result STRING              Expected (string) result. No need for -w or -c.
 -w, --warn VALUE                 Warning threshold
 -c, --crit VALUE                 Critical threshold
 -t, --timeout SECONDS            Wait before HTTP timeout</pre>
<p>The <code>--result</code> argument expects a string; if the values match, it&#8217;s OK, and if not, it&#8217;s CRIT.<br />
If <code>--result</code> is specified, then <code>--warn</code> and <code>--crit</code> will be ignored.</p>
<p>Speaking of, <code>--warn</code> and <code>--crit</code> conform to the (official?) <a href="http://nagiosplug.sourceforge.net/developer-guidelines.html">threshold format guidelines</a>, so that&#8217;s neat.</p>
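<p>For readers who have not met that threshold format before, here is a rough sketch of how such a range string can be interpreted. It is written in Python for illustration and is not taken from the plugin; the <code>parse_threshold</code> and <code>is_alert</code> names are hypothetical. A range is <code>[@]start:end</code>, a bare number means zero up to that number, an empty end means no upper limit, a tilde as the start means no lower limit, and a leading at sign inverts the test.</p>
<pre>
def parse_threshold(spec):
    """Return (low, high, invert) for a Nagios-style range string."""
    invert = spec.startswith("@")
    if invert:
        spec = spec[1:]
    if ":" in spec:
        start, end = spec.split(":", 1)
    else:
        start, end = "0", spec  # a bare "10" means the range 0 to 10
    low = float("-inf") if start == "~" else float(start or 0)
    high = float("inf") if end == "" else float(end)
    return low, high, invert


def is_alert(value, spec):
    """Return True when value should raise an alert for this range."""
    low, high, invert = parse_threshold(spec)
    # Clamping leaves the value unchanged only if it lies in [low, high].
    inside = min(max(value, low), high) == value
    return inside if invert else not inside


# With a warning range of "4:" and a critical range of "3:",
# three nodes is a warning and two nodes is critical.
print(is_alert(3, "4:"), is_alert(3, "3:"))
print(is_alert(2, "4:"), is_alert(2, "3:"))
</pre>
<p>Read this way, a range of <code>4:</code> alerts as soon as the value drops below four, which is a handy shape for &#8220;at least N of something&#8221; checks.</p>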
<p>Finally, the script makes a couple of unapologetic assumptions:<br />
- The response is pure JSON.<br />
- None of the elements contain periods, since it uses that character to flatten the JSON.</p>
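<p>To make the second assumption concrete, this is roughly what flattening with periods looks like. The sketch is in Python and is not the plugin&#8217;s own code; the <code>flatten</code> helper is hypothetical, and treating list positions as path components is an assumption of the example.</p>
<pre>
import json


def flatten(node, prefix=""):
    """Collapse nested JSON into a single dict keyed by dotted paths."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)  # assumption: list indices join the path
    else:
        return {prefix: node}
    flat = {}
    for key, value in items:
        path = "%s.%s" % (prefix, key) if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


doc = json.loads('{"foo": {"bar": {"ish": 42}}, "status": "green"}')
flat = flatten(doc)
print(flat["foo.bar.ish"])
print(flat["status"])
</pre>
<p>A key that already contains a period would be indistinguishable from a nested path after this step, hence the restriction.</p>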
<p>How you choose to implement the plugin is, of course, up to you.  Here&#8217;s one suggestion:</p>
<pre># check json from http
define command{
 command_name    check_http_json-string
 command_line    /etc/nagios3/plugins/check_http_json.rb -u 'http://$HOSTNAME$:$ARG1$/$ARG2$' -e '$ARG3$' -r '$ARG4$'
}
define command{
 command_name    check_http_json-int
 command_line    /etc/nagios3/plugins/check_http_json.rb -u 'http://$HOSTNAME$:$ARG1$/$ARG2$' -e '$ARG3$' -w '$ARG4$' -c '$ARG5$'
}

# make use of http json check
define service{
 service_description     elasticsearch-cluster-status
 check_command           check_http_json-string!9200!_cluster/health!status!green
}
define service{
 service_description     elasticsearch-cluster-nodes
 check_command           check_http_json-int!9200!_cluster/health!number_of_nodes!4:!3:
}</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/03/04/nagios-plugin-to-parse-json-from-an-http-response/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RabbitMQ plugin for Collectd</title>
		<link>http://www.dark.ca/2011/02/24/rabbitmq-plugin-for-collectd/</link>
		<comments>http://www.dark.ca/2011/02/24/rabbitmq-plugin-for-collectd/#comments</comments>
		<pubDate>Thu, 24 Feb 2011 12:37:50 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[collectd]]></category>
		<category><![CDATA[common tools]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[uncommon tools]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=255</guid>
		<description><![CDATA[Hello all, I wrote a rudimentary RabbitMQ plugin for Collectd.  If that sounds interesting to you, feel free to take a look at my GitHub.  The plugin itself is written in Python and makes use of the Python plugin for Collectd. It will accept four options from the Collectd plugin configuration : Locations of binaries [...]]]></description>
			<content:encoded><![CDATA[<p>Hello all,</p>
<p>I wrote a rudimentary RabbitMQ plugin for Collectd.  If that sounds interesting to you, feel free to take a look at my <a href="https://github.com/phrawzty/rabbitmq-collectd-plugin">GitHub</a>.  The plugin itself is written in Python and makes use of the Python plugin for Collectd.</p>
<p>It will accept four options from the Collectd plugin configuration :</p>
<p style="padding-left: 30px;">Locations of binaries :</p>
<pre style="padding-left: 30px;">RmqcBin = /usr/sbin/rabbitmqctl
PmapBin = /usr/bin/pmap
PidofBin = /bin/pidof</pre>
<p style="padding-left: 30px;">Logging :</p>
<pre style="padding-left: 30px;">Verbose = false</pre>
<p>It will attempt to gather the following information :</p>
<p style="padding-left: 30px;">From « rabbitmqctl list_queues » :</p>
<pre style="padding-left: 30px;">messages
memory
consumers</pre>
<p style="padding-left: 30px;">From « pmap » of « beam.smp » :</p>
<pre style="padding-left: 30px;">memory mapped
memory writeable/private (used)
memory shared</pre>
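<p>As a rough idea of what gathering the queue figures involves, here is a small parsing sketch. It is not the plugin&#8217;s actual code: the <code>parse_list_queues</code> helper, the sample text, and the assumed column order (name, messages, memory, consumers, separated by tabs) are all assumptions made for the example.</p>
<pre>
def parse_list_queues(output):
    """Turn tab-separated list_queues rows into a dict keyed by queue."""
    stats = {}
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip banner lines and anything unexpected
        name, messages, memory, consumers = fields
        stats[name] = {
            "messages": int(messages),
            "memory": int(memory),
            "consumers": int(consumers),
        }
    return stats


# Hypothetical output, as if the four columns had been requested.
sample = (
    "Listing queues ...\n"
    "mail\t12\t34560\t2\n"
    "logs\t0\t8216\t1\n"
    "...done.\n"
)

queues = parse_list_queues(sample)
print(queues["mail"]["messages"])
print(queues["logs"]["consumers"])
</pre>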
<p>Props to Garret Heaton for inspiration and conceptual guidance from his « <a href="https://github.com/powdahound/redis-collectd-plugin">redis-collectd-plugin</a> ».</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2011/02/24/rabbitmq-plugin-for-collectd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>how to use the Distributed Numeric Assignment (DNA) plug-in in 389 Directory Server</title>
		<link>http://www.dark.ca/2010/04/20/how-to-use-the-distributed-numeric-assignment-dna-plug-in-in-389-directory-server/</link>
		<comments>http://www.dark.ca/2010/04/20/how-to-use-the-distributed-numeric-assignment-dna-plug-in-in-389-directory-server/#comments</comments>
		<pubDate>Tue, 20 Apr 2010 09:17:17 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[How-to]]></category>
		<category><![CDATA[centos]]></category>
		<category><![CDATA[fedora]]></category>
		<category><![CDATA[ldap]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=212</guid>
		<description><![CDATA[Hello everybody !  Today&#8217;s post is about the Distributed Numeric Assignment (or « DNA ») plug-in for the 389 Directory Server (also known as the Fedora, Red Hat, and CentOS Directory Servers).  Although this plug-in has existed for quite some time, there isn&#8217;t a whole lot of documentation about how to implement it in a [...]]]></description>
			<content:encoded><![CDATA[<p>Hello everybody !  Today&#8217;s post is about the <a href="http://directory.fedoraproject.org/wiki/DNA_Plugin">Distributed Numeric Assignment</a> (or « DNA ») plug-in for the 389 Directory Server (also known as the Fedora, Red Hat, and CentOS Directory Servers).  Although this plug-in has existed for quite some time, there isn&#8217;t a whole lot of documentation about how to implement it in a real-world scenario.  I recently submitted some documentation to the maintainer of the <a href="http://directory.fedoraproject.org/wiki/Documentation">389 wiki</a>, but since i&#8217;m not sure how, when, or in what form that documentation will come to exist on their site, i thought i&#8217;d expand on it here as well.  If you&#8217;ve made it this far, i&#8217;m going to assume that you&#8217;re already familiar with the basics of LDAP, and already have an instance of Directory Server up and running &#8211; if not, i suggest you take a look through the official <a href="http://www.redhat.com/docs/manuals/dir-server/8.1/install/index.html">Red Hat documentation</a> in order to get you started.</p>
<p>By way of some background, it is worth noting that my basic requirement was simply to have a centralised back-end for authenticating SSH logins to the various machines in our park.  The actual numerical values for the UID and GID fields did not need to be the same, they simply needed to be both extant and unique for each user, with the further caveat that they should not collide with any existing values that might be defined locally on the machines.  This is a very basic set of requirements, so it is an excellent starting point for our example.  The first step is to activate the DNA plug-in via the console :</p>
<pre>[TAB] Servers and Applications
Domain -&gt; Server -&gt; Server Group -&gt; Directory Server
[SECTION] Configuration
Server -&gt; Plug-ins -&gt; Distributed Numeric Assignment
[X] Enable plug-in
Save</pre>
<p>The Directory Server needs to be restarted in order for the activation to take effect.  This can either be done via the console, or via the command-line as normal.  The next step is to define how DNA will interact with new user data ; this is different from configuring the plug-in itself, in that we will be setting up a layer in between the plug-in and the user data that will allow certain values to be generated automatically (which is, of course, the end goal of this exercise).  Consider the following two LDIF snippets :</p>
<pre># uids
dn: cn=UID numbers,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
objectClass: top
objectClass: extensibleObject
cn: UID numbers
dnatype: uidNumber
dnamagicregen: 99999
dnafilter: (objectclass=posixAccount)
dnascope: dc=example,dc=com
dnanextvalue: 1000

# gids
dn: cn=GID numbers,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
objectClass: top
objectClass: extensibleObject
cn: GID numbers
dnatype: gidNumber
dnamagicregen: 99999
dnafilter: (|(objectclass=posixAccount)(objectclass=posixGroup))
dnascope: dc=example,dc=com
dnanextvalue: 1000</pre>
<p>As you can see, they are nearly identical.  This configuration activates the DNA magic-number functionality for the UID and GID fields as shown in the Posix attributes section of the console, though the values used may require further explanation.  The only particular requirement for the magic number (specified by the « dnamagicregen » field) is that it be a value that cannot occur naturally, which is to say a value that would not be generated by the DNA plug-in, nor set manually at any time.  The default value is « 0 », but since this is clearly a number with meaning on the average Posix system, i would recommend a suitably large number that is unlikely to ever be used, such as « 99999 ».  Non-numerical values can technically be used too ; however, these will not be acceptable to the console, so unless you&#8217;re using a third-party interface (or doing everything from the command line), a numerical value must be used.</p>
<p>The « dnanextvalue » field indicates where the count will start from.  As noted previously, in order to avoid collisions with existing local entries on the various machines, i chose a start point of « 1000 », which was more than acceptable in my environment.  Once these two snippets are integrated via the <a href="http://www.google.ca/search?q=ldapmodify">command line</a>, simply re-start the Directory Server (again), and you&#8217;re good to go.  From now on, any time that a new user is created with the value « 99999 » entered into either (or both) of the UID and GID Posix fields, DNA will automagically generate real values as appropriate.</p>
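<p>To make that a little more concrete, here is a minimal sketch of a new user entry that asks DNA for both values.  Everything specific in it is an assumption for illustration only : the « ou=People » container, the « jdoe » user, and the home directory are hypothetical, and your schema may require different object classes.</p>
<pre># hypothetical user ; 99999 is the magic number configured above
dn: uid=jdoe,ou=People,dc=example,dc=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
objectClass: posixAccount
uid: jdoe
cn: John Doe
sn: Doe
uidNumber: 99999
gidNumber: 99999
homeDirectory: /home/jdoe
loginShell: /bin/bash</pre>
<p>Assuming the OpenLDAP client tools and a bind DN of « cn=Directory Manager » (both assumptions ; the file name is hypothetical too), an LDIF file like this one, or the two DNA snippets themselves, could be loaded with something along these lines :</p>
<pre>ldapmodify -x -D "cn=Directory Manager" -W -a -f new-user.ldif</pre>
<p>If the plug-in is active, reading the entry back should show generated numbers in place of the magic value.</p>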
<p>Hope that helps &#8211; enjoy !</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2010/04/20/how-to-use-the-distributed-numeric-assignment-dna-plug-in-in-389-directory-server/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>CPAN RPMs in RHEL / CentOS : generation, conflict, and solutions</title>
		<link>http://www.dark.ca/2010/04/08/cpan-rpms-in-rhel-centos/</link>
		<comments>http://www.dark.ca/2010/04/08/cpan-rpms-in-rhel-centos/#comments</comments>
		<pubDate>Thu, 08 Apr 2010 15:25:24 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[How-to]]></category>
		<category><![CDATA[centos]]></category>
		<category><![CDATA[common tools]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[rpm]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=185</guid>
		<description><![CDATA[Hello all !  Today we&#8217;re going to take a look at a somewhat obscure problem that &#8211; once encountered &#8211; can cause nothing but headaches for a system administrator.  The problem relates to conflicts in CPAN RPM packages, and what can be done to work around the issue.  If you&#8217;ve made it this far, i&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<p>Hello all !  Today we&#8217;re going to take a look at a somewhat obscure problem that &#8211; once encountered &#8211; can cause nothing but headaches for a system administrator.  The problem relates to conflicts in CPAN RPM packages, and what can be done to work around the issue.  If you&#8217;ve made it this far, i&#8217;m going to assume a couple of things : you&#8217;re comfortable with RPMs and repositories, have worked with a .spec file before, and you know what Perl modules are.  Good ?  Ok, let&#8217;s go.</p>
<p><em>Edit : About a week after i posted this article, the pastebin i uploaded the examples to disappeared.  Maybe it will come back &#8211; i don&#8217;t know &#8211; but if not, sorry for the broken links&#8230;</em></p>
<p><a href="http://www.cpan.org/">CPAN</a> is an enormous collection of Perl modules.  If you&#8217;ve ever written a Perl script, there&#8217;s a good chance you&#8217;ve used a module that &#8211; at one point or another &#8211; came from this archive.  One of the really neat features of CPAN is the <a href="http://search.cpan.org/~jhi/perl-5.8.0/lib/CPAN.pm#Interactive_Mode">interactive manner</a> in which modules can be downloaded and installed from the archive using Perl right from the command line (frankly, if you&#8217;re reading this post, there&#8217;s a good chance you&#8217;ve used this feature, too).  This is a fairly common way to install new modules and add functionality to your system, especially if you&#8217;re coding for local use (i.e. on your personal box).</p>
<p>It&#8217;s useful, but it&#8217;s not perfect, and one of the key areas where it starts to fail is scalability : if you&#8217;ve got a bunch of machines, and you need to SSH into each one to interactively install a CPAN module or two, it&#8217;s going to be a hassle.  Likewise, CPAN doesn&#8217;t often find its way into the hearts and minds of enterprise Red Hat or CentOS environments, where the official policy is often to install software via RPM only (for support, administration, and sanity reasons, this is often the case).</p>
<p>Luckily, some of the most commonly used CPAN modules exist as RPMs in the default repositories.  Some, but not all (and not even « many ») &#8211; for this, there are other repositories available.  Some examples :</p>
<ul>
<li><a href="http://fedoraproject.org/wiki/EPEL">EPEL</a></li>
<li><a href="http://dag.wieers.com/rpm/">Dag</a></li>
<li><a href="https://rpmrepo.org/RPMforge/">RPMForge</a></li>
<li><a href="http://rpm.mag-sol.com/">Magnum Solutions</a></li>
</ul>
<p>That last one &#8211; Magnum &#8211; is particularly interesting given the subject of our post today.  From their info page :</p>
<blockquote><p>At Magnum we have a firm rule that all CPAN modules on our machines are installed from RPMs. The Fedora and Centos projects build RPMs for many CPAN modules, but there are always ones missing and the ones that are available often lag behind the most up to date versions.  For that reason, we build a lot of RPMs of CPAN modules. And we don&#8217;t want to keep that work to ourselves, so on these pages we make them available for anyone to download.</p></blockquote>
<p>Their RPMs are generated automagically using a great tool called « cpanspec », which does exactly what you think it does : given a CPAN tarball, it will generate a .spec file suitable for building an installable RPM.  It is available in the standard repositories, and can be installed easily via YUM as normal, so go ahead and do that now.  Ok, example time : say you needed HTML::Laundry, but after a quick peek through your repositories, it becomes readily apparent that an RPM is not available.  Thanks to cpanspec, all is not lost :</p>
<pre>[build@host-119 ~]$ wget http://search.cpan.org/CPAN/authors/id/S/ST/STEVECOOK/HTML-Laundry-0.0103.tar.gz
[build@host-119 ~]$ cpanspec --packager "build &lt;build@domain.ext&gt;" HTML-Laundry-0.0103.tar.gz</pre>
<p>We just downloaded the tarball right from the CPAN website, and ran cpanspec against it.  The « --packager » argument simply defines the person who&#8217;s generating the .spec, and doesn&#8217;t necessarily have to be anything accurate.  Go ahead and try it for yourself.  Now take a look at the resulting .spec file (or on the pastebin <a href="http://pastebin.ca/1858756">here</a>).  As you can see, it fills in all the fields, including the critical (and often tricky-to-determine) « BuildRequires » and « Requires » items.  Frankly, it&#8217;s solid gold, and it has made the lives of CentOS / RHEL admins all over the world much easier.</p>
<p>That said, it&#8217;s not perfect, and there are times when you might run into problems.  Actually, you may run into two problems in particular.  The first is conflicts over ownership, which arises when multiple RPMs claim to be responsible for the same file (or files, or directories, or features, or whatever).  The second is more nefarious : an RPM that writes files to the system without declaring ownership for them &#8211; a condition often referred to as « clobbering ».  The former is irritating, but at least it&#8217;s not destructive, unlike the latter, which can cause all manner of headaches.  To illustrate these two problems, let&#8217;s take a look at another example (this one being decidedly more real-world than that of Laundry above) : <a href="http://search.cpan.org/dist/CGI.pm/lib/CGI.pm">CGI.pm</a>.</p>
<p>The <a href="http://pastebin.ca/1858885">.spec file</a> that is generated from this tarball is functional and correct, and we can build an installable RPM out of it, so at first all appears well.  Again, go ahead and try for yourself &#8211; i&#8217;ll wait.  You may wish to capture the build output for review &#8211; otherwise, check the <a href="http://pastebin.ca/1858890">pastebin</a>.  I&#8217;d like to draw your attention to the « Installing » lines.  By trimming the « Installing /var/tmp/perl-CGI.pm.3.49-1-root-root » element from each of those lines, we can see the actual paths and files that this RPM will install to.  Examples :</p>
<pre>/usr/lib/perl5/vendor_perl/5.8.8/CGI.pm
/usr/lib/perl5/vendor_perl/5.8.8/CGI/Cookie.pm
/usr/lib/perl5/vendor_perl/5.8.8/CGI/Util.pm
/usr/share/man/man3/CGI.3pm
/usr/share/man/man3/CGI::Pretty.3pm
/usr/share/man/man3/CGI::Cookie.3pm</pre>
<p>At first glance this looks perfectly acceptable.  But look what happens when we try to install the resulting RPM (clipped for brevity) :</p>
<pre>[root@host-119 build]# rpm -iv /usr/src/redhat/RPMS/noarch/perl-CGI.pm-3.49-1.noarch.rpm
Preparing packages for installation...
file /usr/share/man/man3/CGI.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64
file /usr/share/man/man3/CGI::Cookie.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64
file /usr/share/man/man3/CGI::Pretty.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64</pre>
<p>As it turns out, the Perl package that comes with RHEL / CentOS already contains CGI.pm.  This is normal, since it&#8217;s so popular, and is included as a convenience.  Thus, RPM &#8211; in an attempt to preserve the coherence of the package management system &#8211; refuses to install overtop of the existing owned files.  This is a fine illustration of the first of the two problems previously noted : conflicts over ownership.  As i mentioned above, it&#8217;s aggravating, but it&#8217;s not a bug &#8211; it&#8217;s a feature, and it&#8217;s doing exactly what it&#8217;s designed to do.  Irritating, but not ultimately dire.</p>
<p>If you look carefully, though, it&#8217;s also an illustration of the second problem.  Note the list of files that are conflicting.  Look back to the list of files that the package contains &#8211; notice anything missing from the conflicts list ?  That&#8217;s right &#8211; the actual module files (*.pm) are not showing conflicts, which means they&#8217;d get overwritten without complaint by RPM.  You might be thinking « who cares ? that&#8217;s what i want » right now, but trust me, it&#8217;s not what you want.  Imagine that this CGI package, with this version of CGI.pm, gets installed, and then later you upgrade the Perl package &#8211; your CGI.pm files will get overwritten by the Perl package, because as far as RPM is concerned, Perl owns those files.  All of a sudden, things break because you had scripts that relied on your particular version, but since you just upgraded Perl, you think (quite naturally) that the problem could be anywhere &#8211; where do you even start looking ?</p>
<p>Imagine the headache if there are multiple administrators, multiple servers, multiple data centres, and multiple clients paying multiple dollars.  No fun at all.</p>
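<p>If you want to check who owns what on your own machine before installing anything, RPM can tell you.  The following is only a sketch : the first command asks which installed package owns a given file, and the second lists the CGI-related files that the installed « perl » package claims.  The output is omitted here, since it will vary with your system and package versions.</p>
<pre>[root@host-119 ~]# rpm -qf /usr/share/man/man3/CGI.3pm.gz
[root@host-119 ~]# rpm -ql perl | grep CGI</pre>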
<p>So how can we upgrade CGI.pm, using an RPM, without running into these problems ?  As is often the case, the answer is deceptively simple, but not immediately obvious.  Ultimately what we want to accomplish is twofold :</p>
<ul>
<li>Avoid the man conflicts.</li>
<li>Ensure that the existing owned module files are not clobbered by our new package.</li>
</ul>
<p>Concerning the man pages &#8211; and i&#8217;m going to be perfectly blunt here &#8211; the solution is to simply not install them, since, of course, they&#8217;re already there.  As for avoiding a clobbering condition, this requires a little bit of investigation into how Perl modules and libraries are stored on an RHEL / CentOS machine.  Consider the following output :</p>
<pre>[root@host-119 ~]# ls -d /usr/lib64/perl5/*
/usr/lib64/perl5/5.8.8  /usr/lib64/perl5/site_perl  /usr/lib64/perl5/vendor_perl</pre>
<p>What&#8217;s it all mean ?  Well, the « 5.8.8 » directory is the default directory as defined by the Perl architecture, and is system and platform-agnostic, which is to say that it&#8217;s (supposed to be) the same on every system.  The « vendor_perl » directory contains everything that is specific to RHEL / CentOS (the « vendor » of the distribution).  As you may recall from the rpmbuild output above, this is where the RPM wants to install the modules (thus creating the clobbering condition).</p>
<p>There&#8217;s a third directory there, promisingly named « site_perl » ; as the name implies, this is where site-specific files are stored, which is to say items that are neither part of the default Perl architecture, nor part of the RHEL / CentOS distribution.  As you&#8217;ve no doubt guessed by now, site_perl is where we&#8217;re going to put our new modules.</p>
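<p>It is worth spelling out why this works : Perl searches the directories listed in « @INC » in order, and on a typical build the site directories come before the vendor and core ones, so a module installed under site_perl should be found ahead of the copy shipped with the distribution.  As a quick sketch (assuming a stock Perl ; output omitted because the paths differ from system to system), you can check both the configured locations and the search order like so :</p>
<pre>[root@host-119 ~]# perl -V:installsitelib -V:installvendorlib -V:installprivlib
[root@host-119 ~]# perl -le 'print for @INC'</pre>
<p>If site_perl does not appear ahead of vendor_perl in that second listing, the approach described here will not behave as expected on that machine.</p>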
<p>Luckily for us, the only thing that needs to be changed is the .spec file &#8211; and we even get a headstart, since cpanspec does most of the heavy lifting for us.  Examining the <a href="http://pastebin.ca/1858885">.spec file</a> once more, we see the following lines of note (again, cut for brevity) :</p>
<pre>%build
%{__perl} Makefile.PL INSTALLDIRS=vendor
%files
%{perl_vendorlib}/*</pre>
<p>These indicate that the target installation directory is that of the vendor, which is normally the case, and thus the default setting.  Since we want to install to the site directory, we make the following changes :</p>
<pre>%build
%{__perl} Makefile.PL INSTALLDIRS=site
%files
%{perl_sitelib}/*</pre>
<p>That solves our clobbering problem quite nicely, but what about the man files ?  As i mentioned above, the idea is to simply avoid installing them altogether, but since they&#8217;re generated automatically during the build process, how can we exclude them ?  What i&#8217;m about to present is a bit of a hack, but it&#8217;s absolutely effective, and ultimately quite clean : we delete them after they&#8217;ve been generated, and then don&#8217;t declare them in the file list.  Some items are already being potentially deleted by default, so let&#8217;s go ahead and add our own line into the mix :</p>
<pre>find $RPM_BUILD_ROOT -depth -type d -exec rmdir {} 2&gt;/dev/null \;
# destroy manified man, man.
find $RPM_BUILD_ROOT -type f -name '*.3pm' -exec rm -f {} \;</pre>
<p>This will look for all of the « manified » man files and remove them from the build tree.  All that&#8217;s left now is to remove them from the file list.  This is as simple as deleting (or commenting out) their sole declaration :</p>
<pre>#%{_mandir}/man3/*</pre>
<p>Another option is to simply use the « --excludedocs » argument when installing the RPM.  I opted to remove the docs altogether in order to ensure that the package can be installed without errors by anyone else without needing to know about the argument requirement ahead of time (and to facilitate automated rollouts).</p>
<p>What you&#8217;ll end up with is a .spec file that <a href="http://pastebin.ca/1858951">looks like this</a>.  Go ahead and build your RPM &#8211; it&#8217;ll install without conflicts and without danger.  This is a technique that can be used for other CPAN packages as well, so go ahead and install everything you&#8217;ve always wanted.</p>
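<p>For reference, the whole build-and-verify cycle might look something like the sketch below.  It assumes that the modified spec is saved as « perl-CGI.pm.spec » (a guess at the file name), that the CGI.pm tarball is already in your rpmbuild SOURCES directory, and that the resulting RPM lands in the same path as in the earlier example.  Listing the package contents before installing is a cheap way to confirm that everything now sits under site_perl and that no man pages remain ; the last command simply prints the version of CGI.pm that Perl actually loads.</p>
<pre>[build@host-119 ~]$ rpmbuild -ba perl-CGI.pm.spec
[build@host-119 ~]$ rpm -qlp /usr/src/redhat/RPMS/noarch/perl-CGI.pm-3.49-1.noarch.rpm
[root@host-119 ~]# rpm -ivh /usr/src/redhat/RPMS/noarch/perl-CGI.pm-3.49-1.noarch.rpm
[root@host-119 ~]# perl -MCGI -le 'print $CGI::VERSION'</pre>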
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2010/04/08/cpan-rpms-in-rhel-centos/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>workaround for slow shared folders in Virtualbox 3.x</title>
		<link>http://www.dark.ca/2010/01/29/workaround-for-slow-shared-folders-in-virtualbox-3-x/</link>
		<comments>http://www.dark.ca/2010/01/29/workaround-for-slow-shared-folders-in-virtualbox-3-x/#comments</comments>
		<pubDate>Fri, 29 Jan 2010 14:02:09 +0000</pubDate>
		<dc:creator>dan</dc:creator>
				<category><![CDATA[How-to]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[uncommon tools]]></category>
		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://www.dark.ca/?p=179</guid>
		<description><![CDATA[Happy 2010 fair readers !  I hope that all is well with you and yours.  Let&#8217;s get right to business : Virtualbox has a feature that allows you to access the host OS&#8217;s file system from the guest OS (shared folders), which is super useful, but not exactly perfectly implemented.  In particular, there are known, [...]]]></description>
			<content:encoded><![CDATA[<p>Happy 2010 fair readers !  I hope that all is well with you and yours.  Let&#8217;s get right to business : <a href="http://www.virtualbox.org/">Virtualbox</a> has a feature that allows you to access the host OS&#8217;s file system from the guest OS (<a href="http://www.virtualbox.org/manual/UserManual.html#sharedfolders">shared folders</a>), which is super useful, but not exactly perfectly implemented.  In particular, there are known, <a href="http://www.google.com/search?q=virtualbox+shared+folder+slow">documented performance issues</a> in certain scenarios, such as when accessing a Linux host via a Windows guest (which, as you might imagine, is a pretty regular sort of activity).</p>
<p>One common (?) workaround is to install and configure Samba on the Linux host, then access it from the Windows guest like one would access any network server.  The problem here is that it requires that Samba be installed and configured, which <em>can</em> be a pain in the, well, you know.  Furthermore, the connection will be treated like any other, and the traffic will travel up and down the network stack, which is fundamentally unnecessary since the data is, physically speaking, stored locally.</p>
<p>Instead, here&#8217;s another workaround, one that keeps things simple, <em>and</em> solves the performance problem : just map the shared folder to a local drive in the guest OS.  It&#8217;s that easy.  For those of us who aren&#8217;t too familiar with the Windows explorer interface (me included, heh), there are tonnes of <a href="http://www.google.com/search?q=windows+map+shared+folder+to+local+drive">step by step instructions</a> available.  For whatever reason (i suspect Netbios insanity), accessing the network share via a mapped drive manages to <em>avoid</em> whatever condition creates the lag problems, resulting in rapid, efficient access to the underlying filesystem.</p>
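<p>If you would rather skip the explorer interface altogether, the same mapping can be done from a command prompt inside the Windows guest.  This is only a sketch : « X: » and « shared » are placeholders for a free drive letter and for whatever name you gave the shared folder in the Virtualbox settings, while « vboxsvr » is the server name under which Virtualbox exposes shared folders to Windows guests.</p>
<pre>net use X: \\vboxsvr\shared</pre>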
<p>Hope that helps &#8211; enjoy !</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dark.ca/2010/01/29/workaround-for-slow-shared-folders-in-virtualbox-3-x/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

