Continuing the theme from my last post, I've recently started working my way down the list of existing object-storage implementations.
Tahoe-LAFS is a well-established project which looked like a good fit for my needs:
Getting the system up and running on four nodes was very simple: set up a single, simple "introducer" - a well-known node that all hosts can use to find each other - and then set up four daemons for storage.
When files are uploaded they are split into chunks, and these chunks are then distributed amongst the various nodes. There are some configuration settings which determine how many chunks files are split into (10 by default), how many chunks are required to rebuild the file (3 by default) and how many copies of the chunks will be created.
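These parameters correspond, if I recall the key names correctly, to the encoding settings in each client's tahoe.cfg; a stanza along these lines (the values shown are the defaults just mentioned) controls them:

```ini
[client]
# shares.needed of shares.total shares are required to rebuild a file
shares.needed = 3
shares.total = 10
# servers-of-happiness: how widely the shares must be spread
shares.happy = 7
```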
The biggest problem I have with Tahoe is that there is no rebalancing support: set up four nodes and the space becomes full? You can add more nodes, and new uploads go to the new nodes, while old chunks stay on the old ones. Similarly, if you change your replication counts because you're suddenly more/less paranoid, this doesn't affect existing files.
In my perfect world you'd distribute blocks around pretty optimistically, and I'd probably run more services:
The storage nodes would have the primitives "List all blocks", "Get block", "Put block", and using that you could ensure that each node had sent its data to at least N other nodes. This could be done in the background.
The indexer would be responsible for keeping track of which blocks live where, and which blocks are needed to reassemble upload N. There's probably more that it could do.
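To make that concrete, here's a toy Ruby sketch of those ideas - all class and method names are my own invention, not Tahoe code: storage nodes exposing the three primitives, plus a background pass that pushes under-replicated blocks out to more nodes.

```ruby
require 'digest'

# Hypothetical in-memory storage node exposing the three primitives
# described above: list all blocks, get a block, put a block.
class StorageNode
  def initialize
    @blocks = {}
  end

  def list_blocks
    @blocks.keys
  end

  def get_block(id)
    @blocks[id]
  end

  def put_block(data)
    id = Digest::SHA256.hexdigest(data)
    @blocks[id] = data
    id
  end
end

# Background rebalancing pass: ensure every known block is held by
# at least `copies` nodes, pushing missing blocks where needed.
def replicate(nodes, copies)
  all_ids = nodes.flat_map(&:list_blocks).uniq
  all_ids.each do |id|
    holders = nodes.select { |n| n.get_block(id) }
    next if holders.size >= copies
    data = holders.first.get_block(id)
    (nodes - holders).take(copies - holders.size).each { |n| n.put_block(data) }
  end
end
```

Because the rebalancing pass only needs the three primitives, it could run anywhere - on the nodes themselves, or as a separate service alongside the indexer.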
Over the bank holiday weekend I made two batches of jam: rhubarb & ginger and rhubarb & orange. I made a small batch last year - which we've not yet eaten - but it's been quite a while since I've made this much, and made it for sale.
This year I remembered to grade the rhubarb first, so that each batch was made from stems of similar diameter, which means that they cook evenly and you don't end up with a heterogeneous mixture - which is bad.
If you use a Linux or Unix box with bash or zsh, and you haven't come across Liquid Prompt, then I suggest you head there right now to install it. I'm loving having more info on the status line, especially the version-control details, but even having CPU load and temperature along with battery life right under where I'm typing is really useful.
It turns out that a Raspberry Pi does a very good job of being a print server for Google Cloud Print. Thanks to https://matthew.mceachen.us/blog/add-google-cloudprint-wifi-access-to-your-older-printer-with-a-raspberry-pi-1342.html I can now print at home directly from my phone!
Update: Replacing the battery and retraining the receiver fixed it. I suppose it must have had enough juice to flash the LED but not transmit.
A few days ago my CurrentCost started reading just dashes. There's also no transmitter icon, so I think it's not receiving anything from the transmitter. It looks like this:
I went and fished the transmitter box out of the meter closet expecting its batteries to be dead, but it still has its red LED flashing periodically, so I don’t think it’s that.
I did the thing where you hold down the button on the transmitter for 9 seconds and also hold down the V button on the display to make them pair. The display showed its “searching” screen for a while but then it went back to how it looks above.
Anyone had that happen before? It’s otherwise worked fine for 4 years or so (batteries replaced once).
The Debian Ruby team had its first sprint in 2014. The experience was very positive, and it was decided to do it again in 2015. Last April, the team once more met at the IRILL offices in Paris, France.
The participants worked to improve the quality of Ruby packages in Debian, including fixing release-critical and security bugs, improving metadata and packaging code, and triaging test failures on the Debian Continuous Integration service.
The sprint also served to prepare the team infrastructure for the future Debian 9 release:
work on the gem2deb packaging helper, to improve the semi-automated generation of Debian source packages from existing standards-compliant Ruby packages on Rubygems;
an effort to prepare the switch to Ruby 2.2, the latest stable release of the Ruby language, which was released after the Debian testing suite was already frozen for the Debian 8 release.
Left to right: Christian Hofstaedtler, Tomasz Nitecki, Sebastien Badia and Antonio Terceiro.
A full report with technical details has been posted to the relevant Debian mailing lists.
The UK has just had its General Election. Labour failed miserably to increase their vote. The SNP picked up loads of votes and seats - mostly because Scottish voters felt betrayed by the failure to deliver anything after they agreed to remain in the union. The Liberal Democrats lost votes and seats aplenty, as expected. The result is that we now have a weak Conservative government with a slim majority - one that will no doubt destroy itself as the swivel-eyed loons on the far right of the party start to make increasingly unrealistic demands on the rest of it.
The nutters in the Home Office, with the Liberal Democrat "sanity" checks removed, will now demand ever-increasing powers to snoop on everything we do, so that they can protect us from whatever problem they have invented to scare us with next...
I now feel compelled to support the Open Rights Group with my money as well as my moral support. If the lunatics aren't stopped then we'll have no civil liberties left.
This evening I've been mostly playing with removing duplicate content. I've had this idea for the past few days about object-storage, and obviously in that context if you can handle duplicate content cleanly that's a big win.
The naive implementation of object-storage involves splitting uploaded files into chunks, storing them separately, and writing database-entries such that you can reassemble the appropriate chunks when the object is retrieved.
If you store chunks on-disk, by the hash of their contents, then things are nice and simple.
The end result is that you might upload the file /etc/passwd, split that into four-byte chunks, and then hash each chunk using SHA256.
This leaves you with some database entries, and a bunch of files on-disk:

/tmp/hashed/ef267892ee080862c96a8d2d05de62f48e20f0875f27379e7d58c73ea4455bf1
/tmp/hashed/a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921
..
/tmp/hashed/3805b0245bc8375be7125ae228eef711552ac082ffb9bf8756e2964a2393a9de
In my toy code I wrote out the data in 4-byte chunks, which is grossly inefficient. But the value of using such small pieces is that there are liable to be a lot of collisions, and that means we save space. It is a trade-off.
So the main thing I was experimenting with was the size of the chunks. Make them too small and you lose on I/O, due to the overhead of writing out so many small files; but you gain because collisions are common.
The rough testing I did involved using chunks of 16, 32, 128, 255, 512, 1024, 2048, and 4096 bytes. As sizes went up the overhead shrank, but also so did the collisions.
Unless you can handle the case of users uploading a lot of files like /bin/ls - which are going to collide 100% of the time with prior uploads - using larger chunks just didn't win as much as I thought it would.
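The measurement itself is straightforward; a rough Ruby sketch of the experiment (my reconstruction, not the original code) just counts total versus unique chunk hashes at each size - the gap between the two is the space saved by collisions:

```ruby
require 'digest'

# For a given chunk size, return [total_chunks, unique_chunks];
# the difference between the two is the saving from duplicates.
def collision_stats(data, chunk_size)
  hashes = data.bytes.each_slice(chunk_size).map do |slice|
    Digest::SHA256.hexdigest(slice.pack('C*'))
  end
  [hashes.size, hashes.uniq.size]
end

# Driver for the sizes tested above, over some sample corpus:
#   data = File.binread('/bin/ls')
#   [16, 32, 128, 255, 512, 1024, 2048, 4096].each do |size|
#     total, unique = collision_stats(data, size)
#     puts "%4d bytes: %6d chunks, %6d unique" % [size, total, unique]
#   end
```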
I wrote a toy server using Sinatra & Ruby, which handles the splitting and hashing, and stores the block-IDs in SQLite. It's not so novel, given that it took only an hour or so to write.
The downside of my approach is also immediately apparent: all the data must live on a single machine, so that reassembly works in the simple fashion. That's possible, even with lots of content, if you use GlusterFS or similar, but it's probably not a great approach in general. If you have large-capacity storage available locally then this might work well enough for storing backups, etc, but .. yeah.
This weekend has been all about migrations:
I've migrated several more systems to the Jessie release of Debian GNU/Linux. No major surprises, and now I'm in a good state.
I have 18 hosts, and now 16 of them are running Jessie. One of them I won't touch for a while, and the other is a KVM-host which runs about 8 guests - so I won't upgrade that for a while (because I want to schedule the shutdown of the guests for the host-reboot).
I've started migrating my passwords to pass, which is a simple shell wrapper around GPG. I generated a new password-managing key, and started migrating the passwords.
I dislike that account-names are stored in plaintext, but that seems known and unlikely to be fixed.
I've "solved" the problem by dividing all my accounts into "Those that I wish to disclose post-death" (i.e. "banking", "amazon", "facebook", etc, etc), and those that are "never to be shared". The former are migrating, the latter are not.
(Yeah I'm thinking about estates at the moment, near-death things have that effect!)
The waist-to-height and waist-to-hip ratios are apparently better predictors of future health issues than the media-friendly BMI. They also have the advantage that the only thing you need is a tape measure - which is a lot cheaper than an accurate weighing scale.
My weight continues to melt away, but more importantly my waist has started to shrink. While my weight has come down at an even 750 g per week, until now my waistline hasn't changed much. This morning's weigh-in showed the largest waist shrinkage so far. While I can now wear one size smaller trousers, I've still got a long way to go to get to a healthy ratio.
It looks like I'll be spending a lot of time working with puppet over the coming weeks.
I've set up some toy deployments on virtual machines, and have converted several of my own hosts to using it, rather than my own slaughter system.
When it comes to Puppet some things are good and some things are bad, as expected, and as with any similar tool (even my own). At the moment I'm just aiming for consistency and making sure I can control all the systems - BSD, Debian GNU/Linux, Ubuntu, Microsoft Windows, etc.
Little changes are making me happy though - rather than using a local git pre-commit hook to validate puppet manifests I'm now doing that checking on the server-side via a git pre-receive hook.
Doing it on the server-side means that I can never forget to add the local hook, and future colleagues can similarly never make that mistake and commit malformed puppetry.
It is almost a shame there isn't a decent collection of example git-hooks, for doing things like this puppet-validation. Maybe there is and I've missed it.
It only crossed my mind because I've had to write several of these recently - a hook to rebuild a static website when the repository has a new markdown file pushed to it, a hook to validate syntax when pushes are attempted, and another hook to deny updates if the C-code fails to compile.
There's a new sheriff in town. And her name is Jessie. We're happy to announce the release of Debian 8.0, codenamed Jessie.
Want to install it? Choose your favourite installation media among Blu-ray Discs, DVDs, CDs and USB sticks. Then read the installation manual. For cloud users Debian also offers pre-built OpenStack images ready to use.
Do you want to celebrate the release? Share the banner from this blog in your blog or your website!
About a decade ago I decided to lose some weight. I've always been overweight for my height - or undertall for my weight. I managed to reduce my weight slowly over a number of months by removing snacks & junk, and with basic portion control. Combined with more exercise, I managed to shed quite a few kilos.
My diet and exercise regime has largely remained constant: I don't eat too much junk, have plenty of fresh fruit and vegetables in my diet, and in summer bike quite a bit. However, over time more snacks sneaked in, and portions started to grow again. While I wasn't as heavy as I was a decade ago, I was definitely heavier than I should be.
While I'm still highly dubious of the Body Mass Index (BMI), it being based on flawed maths, I clearly need to target a much lower weight than the last time I reduced my mass. The BMI suggests about 75 kg for my height (1.7 m), and at the moment I'm 83.5 kg and falling at a target rate of about 100 g per day. If I stay on track that's about 26 weeks on my current diet before I tweak it to level off.
So far I've stuck to a pretty even rate of about 111 g per day, and I've only had one period of food cravings - after a bike ride on an empty stomach, which was to be expected - and that was satisfied with some fruit and a drink of water.
I've also managed to drop a trouser size, going from 91.5 cm being tight, through being loose, to 86.5 cm being wearable if a little tight after a meal. According to the height to waist theory - which has better science behind it than the BMI - I should aim to wear 81.5 cm trousers and they should be loose.
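The arithmetic behind that target is simple; a quick Ruby check (the 0.5 threshold is the commonly quoted rule of thumb, not from the post itself):

```ruby
# Waist-to-height ratio; the usual guideline is to keep it below 0.5.
def waist_to_height(waist_cm, height_cm)
  waist_cm / height_cm.to_f
end

# With the numbers above: 86.5 cm at 1.7 m comes out just over the
# line, while the 81.5 cm target comes in just under it.
```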
Today I upgraded my main web-host to the Jessie release of Debian GNU/Linux.
I performed the upgrade by changing wheezy to jessie in the sources.list file, then ran:

apt-get update
apt-get dist-upgrade
For some reason this didn't upgrade my kernel, which remained the 3.2.x version. That failed to boot, due to some udev/systemd issues (lots of "waiting for job: udev /dev/vda", etc, etc). To fix this I logged into my KVM-host, chrooted into the disk image (which I mounted via the use of kpartx), and installed the 3.16.x kernel, before rebooting into that.
All my websites seemed to be OK, but I made some changes regardless. (This was mostly for "neatness", using Debian packages instead of gems, and installing the attic package rather than keeping the source-install I'd made to /opt/attic.)
The only surprise was the significant upgrade of the Net::DNS perl-module. Nothing that a few minutes work didn't fix.
Now that I've upgraded, the SSL issue I had with redirections is no longer present. So it was a worthwhile thing to do.