LUG Community Blogs

Debian Bits: DebConf17 Schedule Published!

Planet HantsLUG - Tue, 25/07/2017 - 00:15

The DebConf17 orga team is proud to announce that over 120 activities have been scheduled so far, including 45- and 20-minute talks, team meetings, and workshops, as well as a variety of other events.

Most of the talks and BoFs will be streamed and recorded, thanks to our amazing video team!

We'd like to remind you that Saturday August 5th is also our much anticipated Open Day! This means a program for a wider audience, including special activities for newcomers, such as an AMA session about Debian, a beginners' workshop on packaging, a thoughtful talk about freedom with regard to today's popular gadgets, and more.

In addition to the published schedule, we'll provide rooms for ad-hoc sessions where attendees will be able to schedule activities at any time during the whole conference.

The current schedule is available at https://debconf17.debconf.org/schedule/

The schedule is also available through an XML feed, which you can consume with ConfClerk in Debian or with Giggity on Android devices: https://debconf17.debconf.org/schedule/mobile/

We look forward to seeing you in Montreal!

Categories: LUG Community Blogs

Jonathan McDowell: Learning to love Ansible

Planet ALUG - Mon, 24/07/2017 - 18:41

This post attempts to chart my journey towards getting usefully started with Ansible to manage my system configurations. It’s a high level discussion of how I went about doing so and what I got out of it, rather than including any actual config snippets - there are plenty of great resources out there that handle the actual practicalities of getting started much better than I could.

I’ve been convinced about the merits of configuration management for machines for a while now; I remember conversations about producing an appropriate set of recipes to reproduce our haphazard development environment reliably over 4 years ago. That never really got dealt with before I left, and as managing systems hasn’t been part of my day job since then I never got around to doing more than working my way through the Puppet Learning VM. I do, however, continue to run a number of different Linux machines - a few VMs, a hosted dedicated server and a few physical machines at home and my parents’. In particular I have a VM which handles my parents’ email, and I thought that was a good candidate for trying to properly manage. It’s backed up, but it would be nice to be able to redeploy that setup easily if I wanted to move provider, or do hosting for other domains in their own VMs.

I picked Ansible, largely because I wanted something lightweight and the agentless design appealed to me. All I really need to do is ensure Python is on the host I want to manage and everything else I can bootstrap using Ansible itself. Plus it meant I could use the version from Debian testing on my laptop and not require backports on the stable machines I wanted to manage.
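
Even that Python prerequisite can be handled from Ansible itself. Here's a minimal sketch of a bootstrap play (my own illustration, not from the post): the raw module runs over plain SSH, so it works before Python exists on the target, and the package name assumes a pre-buster Debian where Ansible still wants Python 2.

- hosts: all
  gather_facts: no
  tasks:
    # raw needs no Python on the remote host, unlike normal modules
    - name: Ensure Python is present for normal Ansible modules
      raw: test -e /usr/bin/python || (apt-get -y update && apt-get -y install python)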

My first attempt was to write a single Ansible YAML file which did all the appropriate things for the email VM; installed Exim/Apache/Roundcube, created users, made sure the appropriate SSH keys were in place, installed configuration files, etc, etc. This did the job, but I found myself thinking it was no better than writing a shell script to do the same things.
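For illustration, that monolithic playbook was roughly this shape (a reconstructed sketch, not the actual file; the host name, user name and paths are invented):

- hosts: mailvm
  become: true
  tasks:
    - name: Install the mail stack
      apt:
        name: [exim4, apache2, roundcube]
        state: present
    - name: Create a mail user
      user:
        name: exampleuser
    - name: Install the Exim configuration
      copy:
        src: files/exim4.conf
        dest: /etc/exim4/exim4.conf
    # ...and so on for SSH keys, vhosts, certificates, etc.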

Things got a lot better when, instead of concentrating on a single host, I looked at what commonality was shared between hosts. I started with simple things: Debian is my default distro, so I created an Ansible role debian-system which configured APT and ensured package updates were installed. Then I added a task to set up my own account and install my SSH keys. I was then able to deploy those two basic steps across a dozen different machine instances. At one point I got an ARM64 VM from Scaleway to play with, and it was great to be able to just add it to my Ansible hosts file and run the playbook against it to get my basic system setup.
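
In playbook terms that gives you something like the following (a sketch under my own assumptions; only the debian-system role name comes from the post, the username and key path are placeholders):

- hosts: all
  become: true
  roles:
    - debian-system        # configure APT, ensure updates are installed
  tasks:
    - name: Create my account
      user:
        name: myuser       # illustrative username
        groups: sudo
        append: yes
    - name: Install my SSH public key
      authorized_key:
        user: myuser
        key: "{{ lookup('file', 'files/id_rsa.pub') }}"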

Adding email configuration got trickier. In addition to my parents’ email VM I have my own email hosted elsewhere (along with a whole bunch of other users) and the needs of both systems are different. Sitting down and trying to manage both configurations sensibly forced me to do some rationalisation of the systems, pulling out the commonality and then templating the differences. Additionally I ended up using the lineinfile module to edit the Debian supplied configurations, rather than rolling out my own config files. This helped ensure more common components between systems. There were also a bunch of differences that had grown out of the fact each system was maintained by hand - I had about 4 copies of each Let’s Encrypt certificate rather than just putting one copy in /etc/ssl and pointing everything at that. They weren’t even in the same places on different systems. I unified these sorts of things as I came across them.
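
A lineinfile task of the sort described might look like this (illustrative values: MAIN_TLS_CERTIFICATE is a real Debian exim4 macro, but the file and certificate path are just examples of the unified /etc/ssl layout mentioned above):

- name: Point exim4 at the single shared certificate
  lineinfile:
    path: /etc/exim4/exim4.conf.localmacros
    regexp: '^MAIN_TLS_CERTIFICATE'
    line: 'MAIN_TLS_CERTIFICATE = /etc/ssl/mail.example.org.pem'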

Throughout the process of this rationalisation I was able to easily test using containers. I wrote an Ansible role to create systemd-nspawn based containers, doing all of the LVM + debootstrap work required to produce a system which could then be managed by Ansible. I then pointed the same configuration as I was using for the email VM at this container, and could verify at each step along the way that the results were what I expected. It was still a little nerve-racking when I switched over the live email config to be managed by Ansible, but it went without a hitch as hoped.
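
The container role essentially automates steps of this shape (a sketch with a hypothetical volume group and container name, not the actual role):

- name: Create a logical volume for the container root
  lvol:
    vg: vg0
    lv: testct
    size: 5g
- name: Create a filesystem on it
  filesystem:
    fstype: ext4
    dev: /dev/vg0/testct
- name: Mount it where systemd-nspawn looks for machines
  mount:
    path: /var/lib/machines/testct
    src: /dev/vg0/testct
    fstype: ext4
    state: mounted
- name: debootstrap a minimal Debian into the volume
  command: debootstrap stretch /var/lib/machines/testct
  args:
    creates: /var/lib/machines/testct/etc/debian_version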

I still have a lot more configuration to switch to being managed by Ansible, especially on the machines which handle a greater number of services, but it’s already proved extremely useful. To prepare for a jessie to stretch upgrade I fired up a stretch container and pointed the Ansible config at it. Most things just worked, and the minor issues that did turn up I was able to fix in that instance, leaving me confident that the live system could be upgraded smoothly. Or when I want to roll out a new SSH key I can just add it to the Ansible setup and kick off an update; no need to worry about whether I’ve updated it everywhere, or correctly removed the old one.

So I’m a convert. Things were made a bit more difficult by starting with existing machines that I didn’t want to disrupt too much, but going forward I’ll be using Ansible to roll out any new machines or services I need, and I expect to find those deployments much easier now that I have a firm grasp of the tools available.

Categories: LUG Community Blogs

Andy Smith: 12 hours of lvmcache

Planet HantsLUG - Fri, 21/07/2017 - 20:59

In the previous post I noted that the performance of lvmcache was still increasing and it might be worth testing it for longer than 3 hours.

Here’s a 12 hour test:

$ cd /srv/cache/fio && FIOTEST=lvmcache-12h fio ~/lvmcache-12h.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [6272KB/0KB/0KB /s] [1568/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=11130: Fri Jul 21 16:37:30 2017
  read : io=136145MB, bw=3227.2KB/s, iops=806, runt=43200062msec
    slat (usec): min=3, max=586402, avg=14.27, stdev=619.54
    clat (usec): min=2, max=1517.9K, avg=9897.80, stdev=29334.14
     lat (usec): min=71, max=1517.9K, avg=9912.72, stdev=29344.74
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 110], 10.00th=[ 113], 20.00th=[ 119],
     | 30.00th=[ 124], 40.00th=[ 129], 50.00th=[ 133], 60.00th=[ 143],
     | 70.00th=[ 157], 80.00th=[11840], 90.00th=[30848], 95.00th=[56576],
     | 99.00th=[136192], 99.50th=[179200], 99.90th=[309248], 99.95th=[382976],
     | 99.99th=[577536]
    lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.03%
    lat (usec) : 250=76.84%, 500=0.26%, 750=0.13%, 1000=0.13%
    lat (msec) : 2=0.25%, 4=0.02%, 10=1.19%, 20=6.67%, 50=8.59%
    lat (msec) : 100=3.93%, 250=1.77%, 500=0.18%, 750=0.02%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.44%, sys=1.63%, ctx=34524570, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=34853153/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=11131: Fri Jul 21 16:37:30 2017
  read : io=134521MB, bw=3188.7KB/s, iops=797, runt=43200050msec
    slat (usec): min=3, max=588479, avg=14.35, stdev=613.38
    clat (usec): min=2, max=1530.3K, avg=10017.42, stdev=29196.28
     lat (usec): min=70, max=1530.3K, avg=10032.43, stdev=29207.06
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 109], 10.00th=[ 112], 20.00th=[ 118],
     | 30.00th=[ 124], 40.00th=[ 127], 50.00th=[ 133], 60.00th=[ 143],
     | 70.00th=[ 157], 80.00th=[12352], 90.00th=[31360], 95.00th=[57600],
     | 99.00th=[138240], 99.50th=[179200], 99.90th=[301056], 99.95th=[370688],
     | 99.99th=[561152]
    lat (usec) : 4=0.01%, 20=0.01%, 50=0.01%, 100=0.04%, 250=76.56%
    lat (usec) : 500=0.26%, 750=0.12%, 1000=0.13%
    lat (msec) : 2=0.26%, 4=0.02%, 10=1.20%, 20=6.75%, 50=8.65%
    lat (msec) : 100=4.01%, 250=1.82%, 500=0.17%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.45%, sys=1.60%, ctx=34118324, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=34437257/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=270666MB, aggrb=6415KB/s, minb=3188KB/s, maxb=3227KB/s, mint=43200050msec, maxt=43200062msec

Disk stats (read/write):
    dm-2: ios=69290239/1883, merge=0/0, ticks=690078104/246240, in_queue=690329868, util=100.00%, aggrios=23098543/27863, aggrmerge=0/0, aggrticks=229728200/637782, aggrin_queue=230366965, aggrutil=100.00%
  dm-1: ios=247/64985, merge=0/0, ticks=36/15464, in_queue=15504, util=0.02%, aggrios=53025553/63449, aggrmerge=0/7939, aggrticks=7413340/14760, aggrin_queue=7427028, aggrutil=16.42%
  xvdc: ios=53025553/63449, merge=0/7939, ticks=7413340/14760, in_queue=7427028, util=16.42%
  dm-0: ios=53025306/6403, merge=0/0, ticks=7417028/1852, in_queue=7419784, util=16.42%
  dm-3: ios=16270078/12201, merge=0/0, ticks=681767536/1896032, in_queue=683665608, util=100.00%, aggrios=16270077/12200, aggrmerge=1/1, aggrticks=681637744/1813744, aggrin_queue=683453224, aggrutil=100.00%
  xvdd: ios=16270077/12200, merge=1/1, ticks=681637744/1813744, in_queue=683453224, util=100.00%

It’s still going up, slowly. The cache hit rate was 76.53%. In the 30 minute test the hit rate was 73.64%.

Over 30 minutes the average IOPS was 1,484.

Over 12 hours the average IOPS was 1,603.

I was kind of hoping to reach the point where the hit rate is so high that it just takes off like bcache does and we start to see tens of thousands of IOPS, but it wasn’t to be.

24 hours of lvmcache

…so I went ahead and ran the same thing for 24 hours.

I’ve skipped the first 2 hours of results since we know what they look like. It appears to still be going up, although the results past 20 hours leave some doubt there.

Here’s the full fio output.

$ cd /srv/cache/fio && FIOTEST=lvmcache-24h fio ~/lvmcache-24h.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [7152KB/0KB/0KB /s] [1788/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=14676: Sat Jul 22 21:34:12 2017
  read : io=278655MB, bw=3302.6KB/s, iops=825, runt=86400091msec
    slat (usec): min=3, max=326, avg=12.43, stdev= 6.97
    clat (usec): min=1, max=1524.1K, avg=9673.02, stdev=28748.45
     lat (usec): min=71, max=1525.7K, avg=9686.11, stdev=28748.87
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 106], 10.00th=[ 111], 20.00th=[ 116],
     | 30.00th=[ 119], 40.00th=[ 125], 50.00th=[ 131], 60.00th=[ 139],
     | 70.00th=[ 155], 80.00th=[11456], 90.00th=[30336], 95.00th=[55552],
     | 99.00th=[134144], 99.50th=[177152], 99.90th=[305152], 99.95th=[374784],
     | 99.99th=[569344]
    lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
    lat (usec) : 100=0.03%, 250=77.16%, 500=0.22%, 750=0.12%, 1000=0.12%
    lat (msec) : 2=0.23%, 4=0.02%, 10=1.18%, 20=6.65%, 50=8.54%
    lat (msec) : 100=3.84%, 250=1.70%, 500=0.17%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.47%, sys=1.64%, ctx=70653446, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=71335660/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=14677: Sat Jul 22 21:34:12 2017
  read : io=283280MB, bw=3357.4KB/s, iops=839, runt=86400074msec
    slat (usec): min=3, max=330, avg=12.44, stdev= 6.98
    clat (usec): min=2, max=1515.9K, avg=9514.83, stdev=28128.86
     lat (usec): min=71, max=1515.2K, avg=9527.92, stdev=28129.29
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 109], 10.00th=[ 112], 20.00th=[ 118],
     | 30.00th=[ 123], 40.00th=[ 126], 50.00th=[ 133], 60.00th=[ 141],
     | 70.00th=[ 157], 80.00th=[11328], 90.00th=[29824], 95.00th=[55040],
     | 99.00th=[132096], 99.50th=[173056], 99.90th=[292864], 99.95th=[362496],
     | 99.99th=[544768]
    lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.03%
    lat (usec) : 250=77.29%, 500=0.23%, 750=0.11%, 1000=0.12%
    lat (msec) : 2=0.23%, 4=0.02%, 10=1.18%, 20=6.65%, 50=8.49%
    lat (msec) : 100=3.81%, 250=1.66%, 500=0.15%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.47%, sys=1.67%, ctx=71794214, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=72519640/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=561935MB, aggrb=6659KB/s, minb=3302KB/s, maxb=3357KB/s, mint=86400074msec, maxt=86400091msec

Disk stats (read/write):
    dm-2: ios=143855123/29, merge=0/0, ticks=1380761492/1976, in_queue=1380772508, util=100.00%, aggrios=47953627/25157, aggrmerge=0/0, aggrticks=459927326/7329, aggrin_queue=459937080, aggrutil=100.00%
  dm-1: ios=314/70172, merge=0/0, ticks=40/15968, in_queue=16008, util=0.01%, aggrios=110839338/72760, aggrmerge=0/2691, aggrticks=15300392/17100, aggrin_queue=15315432, aggrutil=16.92%
  xvdc: ios=110839338/72760, merge=0/2691, ticks=15300392/17100, in_queue=15315432, util=16.92%
  dm-0: ios=110839024/5279, merge=0/0, ticks=15308540/1768, in_queue=15312588, util=16.93%
  dm-3: ios=33021544/20, merge=0/0, ticks=1364473400/4252, in_queue=1364482644, util=100.00%, aggrios=33021544/19, aggrmerge=0/1, aggrticks=1364468920/4076, aggrin_queue=1364476064, aggrutil=100.00%
  xvdd: ios=33021544/19, merge=0/1, ticks=1364468920/4076, in_queue=1364476064, util=100.00%

So: 1,664 average IOPS (825 + 839), and a 77.05% cache hit rate (110,839,338 / (71,335,660 + 72,519,640) × 100).
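
If you want to recompute that hit rate from the issued-IO totals above, a quick check with bc:

$ echo 'scale=6; 110839338 / (71335660 + 72519640) * 100' | bc
77.049100

which rounds to the 77.05% quoted.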

Not sure I can be bothered to run a multi-day test on this now!

Categories: LUG Community Blogs

Andy Smith: bcache and lvmcache

Planet HantsLUG - Wed, 19/07/2017 - 07:06
Background

Over at BitFolk we offer both SSD-backed storage and HDD-backed archive storage. The SSDs we use are really nice and basically have removed all IO performance problems we have ever encountered in the past. I can’t begin to describe how pleasant it is to just never have to think about that whole class of problems.

The main downside of course is that SSD capacity is still really expensive. That’s why we introduced the HDD-backed archive storage: for bulk storage of things that didn’t need to have high performance.

You’d really think though that by now there would be some kind of commodity tiered storage that would allow a relatively small amount of fast SSD storage to accelerate a much larger pool of HDDs. Of course there are various enterprise solutions, and there is also ZFS where SSDs could be used for the ZIL and L2ARC while HDDs are used for the pool.

ZFS is quite appealing but I’m not ready to switch everything to that yet, and I’m certainly not going to buy some enterprise storage solution. I also don’t necessarily want to commit to putting all storage behind such a system.

I decided to explore Linux’s own block device caching solutions.

Scope

I’ve restricted the scope to the two solutions which are part of the mainline Linux kernel as of July 2017, these being bcache and lvmcache.

lvmcache is based upon dm-cache which has been included with the mainline kernel since April 2013. It’s quite conservative, and having been around for quite a while is considered stable. It has the advantage that it can work with any LVM logical volume no matter what the contents. That brings the disadvantage that you do need to run LVM.

bcache has been around for a little longer but is a much more ambitious project. Being completely dedicated to accelerating slow block devices with fast ones, it is claimed to be able to achieve higher performance than other caching solutions, but as it's much more complicated than dm-cache there are still bugs being found. It also requires you to format your block devices as bcache before you use them for anything.

Test environment

I’m testing this on a Debian testing (buster) Xen virtual machine with a 20GiB xvda virtual disk containing the main operating system. That disk is backed by a software (md) RAID-10 composed of two Samsung sm863 SSDs. It was also used for testing the baseline SSD performance from the directory /srv/ssd.

The virtual machine had 1GiB of memory but the pagecache was cleared between each test run in an attempt to prevent anything being cached in memory.

A 5GiB xvdc virtual disk was provided, backed again on the SSD RAID. This was used for the cache role both in bcache and lvmcache.

A 50GiB xvdd virtual disk was provided, backed by a pair of Seagate ST4000LM016-1N2170 HDDs in software RAID-1. This was used for the HDD backing store in each of the caching implementations. The resulting cache device was mounted at /srv/cache.

Finally a 50GiB xvde virtual disk also backed on HDD was used to test baseline HDD performance, mounted at /srv/slow.

The filesystem in use in all cases was ext4 with default options. In the dom0, the deadline I/O scheduler was used in all cases.

TL;DR, I just want graphs

In case you can’t be bothered to read the rest of this article, here’s just the graphs with some attempt at interpreting them. Down at the tests section you’ll find details of the actual testing process and more commentary on why certain graphs were produced.

git test graphs

Times to git clone and git grep.

fio IOPS graphs

These are graphs of IOPS across the 30 minutes of testing. There’s two important things to note about these graphs:

  1. They’re a Bezier curve fitted to the data points which are one per second. The actual data points are all over the place, because achieved IOPS depends on how many cache hits/misses there were, which is statistical.
  2. Only the IOPS for the first job is graphed. Even when using the per_job_logs=0 setting my copy of fio writes a set of results for each job. I couldn’t work out how to easily combine these so I’ve shown only the results for the first job.

    For all tests except bcache (sequential_cutoff=0) you just have to bear in mind that there is a second job working in parallel doing pretty much the same amount of IOPS. Strangely for that second bcache test the second job only managed a fraction of the IOPS (though still more than 10k IOPS) and I don’t know why.

IOPS over time for all tests

Well, those results are so extreme that it kind of makes it hard to distinguish between the low-end results.

A couple of observations:

  • SSD is incredibly and consistently fast.
  • For everything else there is a short steep section at the beginning which is likely to be the effect of HDD drive cache.
  • With sequential_cutoff set to 0, bcache very quickly reaches near-SSD performance for this workload (4k reads, 90% hitting 10% of data that fits entirely in the bcache). This is probably because the initial write put data straight into cache as it’s set to writeback.
  • When starting with a completely empty cache, lvmcache is no slouch either. It’s not quite as performant as bcache but that is still up near the 48k IOPS per process region, and very predictable.
  • When sequential_cutoff is left at its default of 4M, bcache performs much worse, though still blazing compared to an HDD on its own. At the end of this 30 minute test performance was still increasing, so it might be worth performing a longer test.
  • The performance of lvmcache when starting with a cache already full of junk data seems to be not that much better than HDD baseline.

IOPS over time for low-end results

Leaving the high-performers out to see if there is anything interesting going on near the bottom of the previous graph.

Apart from the initial spike, HDD results are flat as expected.

Although the lvmcache (full cache) results in the previous graph seemed flat too, looking closer we can see that performance is still increasing, just very slowly. It may be interesting to test for longer to see if performance does continue to increase.

Both HDD and lvmcache have a very similar spike at the start of the test so let’s look closer at that.

IOPS for first 30 seconds

For all the lower-end performers the first 19 seconds are steeper and I can only think this is the effect of HDD drive cache. Once that is filled, HDD remains basically flat, lvmcache (full cache) increases performance more slowly and bcache with the default sequential_cutoff starts to take off.

SSDs don’t have the same sort of cache and bcache with no sequential_cutoff spikes up too quickly to really be noticeable at this scale.

3-hour lvmcache test

Since it seemed like lvmcache with a full cache device was still slowly increasing in performance, I ran a 3-hour test on that one.

Skipping the first 20 minutes which show stronger growth, even after 3 hours there is still some performance increase happening. It seems like even a full cache would eventually promote read hot spots, but it could take a very very long time.

The tests

This being caching, we would only really expect it to be effective for reads of already-known data. Sequential reads on an HDD aren’t too bad but what really makes things feel slow are random reads all over the place: the time taken to seek to a new location on an HDD is really a killer for performance. So, we’d like a test that involves multiple reads of the same data but non-sequential.

git tests

I decided to do a git clone of the Linux kernel and then do a git grep of this, which would be guaranteed to be random access. The git grep step was done repeatedly to see if we could spot the cache being warmed up, also with the kernel’s pagecache being cleared after each step to avoid a situation where the entirety of the kernel source ended up in RAM. The clone process was also timed just to see if there would be any effect on write time.

fio tests

Despite git being a real application, the workload itself (searching for a string over and over) seemed rather too synthetic and simplistic. I decided to also try using the excellent fio to gather some detailed statistics. This is also synthetic of course, relating to no real workload, but it’s quite sophisticated and hopefully of interest.

The fio job spec used in all cases was as follows:

[partial]
ioengine=libaio
direct=1
iodepth=8
numjobs=2
readwrite=randread
random_distribution=zipf:1.2
bs=4k
size=2g
unlink=1
runtime=1800
time_based=1
per_job_logs=1
log_avg_msec=1000
write_iops_log=/var/tmp/fio-${FIOTEST}

This causes two 2GiB files of random data to be written out and then two processes read from them for 30 minutes with up to 8 IOs in flight at once for each process. Between the use of direct=1 and the fact that there is 4GiB of data and only 1GiB of RAM, that should be enough to avoid testing any memory cache.

log_avg_msec=1000 is specified in order to restrict the IOPS log to one entry per second; otherwise you get one entry per IO. The log is kept in memory until fio completes, and given the sheer number of IOs the SSDs can service, an unreduced log would cause the fio process to run the VM out of memory. This directive results in just one averaged entry per 1000ms, which is adequate for graphing.

unlink is used to make sure that data files are deleted after each test run. Leaving the files in place would cause additional promotions into cache, which would favour tests that are run multiple times.

By default when told to do a random read, fio would ensure it reads the entire file once only (before starting again, if the test is time-based). That would not be any use for testing cache warming as there would only ever be one read from each part of the file. To avoid this, a Zipf distribution is used so that read locations are chosen according to a somewhat repetitive distribution, simulating hotspots. I have no real idea about this part so if anyone has any input on that I’d be interested to hear, but my understanding is that a Zipf distribution with exponent 1.2 should cause about 91% of the reads to be from about 10% of the blocks.

In fact I would be very interested to hear of other fio job configurations that people think would be useful here.

If you’re unfamiliar with the tool, fio output can appear quite daunting. There’s a wealth of information in there though, so it’s worth looking into if you ever need to test storage performance. Too many times people just reach for dd or bonnie as performance testing tools, but apart from the most trivial of tests they are just too simple. I found this article by Albert Tobey useful in understanding fio's output.

Each test run was performed against:

  1. A purely SSD-backed filesystem.
  2. A purely HDD-backed filesystem.
  3. A filesystem on top of a bcache device, sequential_cutoff=4M (default).
  4. A filesystem on top of a bcache device, sequential_cutoff=0.
  5. A filesystem on top of an lvmcache device, empty cache.
  6. A filesystem on top of an lvmcache device, cache full of junk.

Baseline SSD

git grep test

$ cd /srv/ssd
$ /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Counting objects: 5948690, done.
remote: Compressing objects: 100% (124799/124799), done.
remote: Total 5948690 (delta 427756), reused 549517 (delta 425675)
Receiving objects: 100% (5948690/5948690), 1.21 GiB | 39.61 MiB/s, done.
Resolving deltas: 100% (4961965/4961965), done.
Checking out files: 100% (59844/59844), done.
195.45 user 22.10 system 2:37.89 elapsed 137% CPU
$ cd /srv/ssd/linux-stable
$ for i in {1..20}; do
> echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null ; \
> /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git --no-pager grep -q blah ; \
> sleep 1; \
> done
0.98 user 1.28 system 0:02.28 elapsed 99% CPU
0.95 user 1.15 system 0:02.18 elapsed 96% CPU
0.89 user 1.30 system 0:02.23 elapsed 98% CPU
1.04 user 1.13 system 0:02.25 elapsed 96% CPU
0.95 user 1.22 system 0:02.24 elapsed 96% CPU
0.95 user 1.23 system 0:02.23 elapsed 97% CPU
1.00 user 1.16 system 0:02.24 elapsed 96% CPU
1.00 user 1.16 system 0:02.23 elapsed 96% CPU
0.89 user 1.28 system 0:02.22 elapsed 97% CPU
0.97 user 1.20 system 0:02.24 elapsed 97% CPU
1.00 user 1.18 system 0:02.23 elapsed 97% CPU
0.96 user 1.21 system 0:02.23 elapsed 97% CPU
0.96 user 1.22 system 0:02.24 elapsed 97% CPU
1.01 user 1.17 system 0:02.24 elapsed 97% CPU
0.94 user 1.22 system 0:02.24 elapsed 96% CPU
1.00 user 1.16 system 0:02.24 elapsed 96% CPU
0.98 user 1.18 system 0:02.24 elapsed 96% CPU
0.92 user 1.24 system 0:02.24 elapsed 97% CPU
0.92 user 1.28 system 0:02.25 elapsed 98% CPU
1.09 user 1.07 system 0:02.23 elapsed 97% CPU

The initial git clone took 157.89 seconds and each git grep reliably took about 2.24 seconds.

fio test

$ cd /srv/ssd/fio && FIOTEST=ssd fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [409.8MB/0KB/0KB /s] [105K/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=29306: Wed Jul 19 05:17:39 2017
  read : io=379033MB, bw=215628KB/s, iops=53906, runt=1800001msec
    slat (usec): min=2, max=1547, avg= 4.38, stdev= 1.68
    clat (usec): min=26, max=4350, avg=141.54, stdev=29.67
     lat (usec): min=94, max=4353, avg=146.39, stdev=29.65
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 108], 10.00th=[ 112], 20.00th=[ 119],
     | 30.00th=[ 124], 40.00th=[ 129], 50.00th=[ 135], 60.00th=[ 141],
     | 70.00th=[ 149], 80.00th=[ 161], 90.00th=[ 179], 95.00th=[ 197],
     | 99.00th=[ 243], 99.50th=[ 262], 99.90th=[ 314], 99.95th=[ 338],
     | 99.99th=[ 406]
    lat (usec) : 50=0.01%, 100=0.14%, 250=99.08%, 500=0.77%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=14.30%, sys=39.70%, ctx=50601996, majf=0, minf=16
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=97032448/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=29307: Wed Jul 19 05:17:39 2017
  read : io=358188MB, bw=203769KB/s, iops=50942, runt=1800001msec
    slat (usec): min=2, max=225, avg= 4.46, stdev= 1.71
    clat (usec): min=21, max=4385, avg=150.08, stdev=43.00
     lat (usec): min=69, max=4391, avg=155.01, stdev=42.98
    clat percentiles (usec):
     | 1.00th=[ 104], 5.00th=[ 110], 10.00th=[ 115], 20.00th=[ 121],
     | 30.00th=[ 127], 40.00th=[ 133], 50.00th=[ 137], 60.00th=[ 145],
     | 70.00th=[ 155], 80.00th=[ 171], 90.00th=[ 199], 95.00th=[ 237],
     | 99.00th=[ 322], 99.50th=[ 350], 99.90th=[ 414], 99.95th=[ 438],
     | 99.99th=[ 498]
    lat (usec) : 50=0.01%, 100=0.10%, 250=95.82%, 500=4.06%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=13.80%, sys=37.97%, ctx=48695297, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=91696255/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=737221MB, aggrb=419396KB/s, minb=203769KB/s, maxb=215627KB/s, mint=1800001msec, maxt=1800001msec

Disk stats (read/write):
  xvda: ios=188724848/22, merge=0/28, ticks=27221784/0, in_queue=27267336, util=100.00%

This provides a nice illustration of just how fast the SSDs are: each process managed a very consistent 52 thousand IOPS and between the two of them they managed to read ~720GiB of data in 30 minutes.

Baseline HDD

git grep test

$ cd /srv/slow
$ /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Counting objects: 5948690, done.
remote: Compressing objects: 100% (124799/124799), done.
remote: Total 5948690 (delta 427756), reused 549517 (delta 425675)
Receiving objects: 100% (5948690/5948690), 1.21 GiB | 38.36 MiB/s, done.
Resolving deltas: 100% (4961965/4961965), done.
Checking out files: 100% (59844/59844), done.
194.61 user 21.48 system 3:42.23 elapsed 97% CPU
$ cd /srv/slow/linux-stable
$ for i in {1..20}; do
> echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null ; \
> /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git --no-pager grep -q blah ; \
> sleep 1; \
> done
1.22 user 1.60 system 0:23.60 elapsed 11% CPU
1.35 user 1.32 system 0:23.08 elapsed 11% CPU
1.24 user 1.43 system 0:22.49 elapsed 11% CPU
1.15 user 1.57 system 0:22.37 elapsed 12% CPU
1.25 user 1.44 system 0:22.09 elapsed 12% CPU
1.26 user 1.48 system 0:22.60 elapsed 12% CPU
1.20 user 1.52 system 0:22.47 elapsed 12% CPU
1.21 user 1.48 system 0:22.82 elapsed 11% CPU
1.14 user 1.55 system 0:22.45 elapsed 12% CPU
1.28 user 1.49 system 0:22.32 elapsed 12% CPU
1.21 user 1.48 system 0:22.43 elapsed 12% CPU
1.26 user 1.51 system 0:22.95 elapsed 12% CPU
1.21 user 1.49 system 0:22.33 elapsed 12% CPU
1.23 user 1.45 system 0:22.54 elapsed 11% CPU
1.19 user 1.51 system 0:22.08 elapsed 12% CPU
1.26 user 1.42 system 0:22.42 elapsed 12% CPU
1.20 user 1.51 system 0:22.47 elapsed 12% CPU
1.19 user 1.53 system 0:22.42 elapsed 12% CPU
1.15 user 1.56 system 0:22.48 elapsed 12% CPU
1.24 user 1.47 system 0:22.10 elapsed 12% CPU

The initial git clone took 222.23 seconds and each git grep reliably took about 22.53 seconds.

Of course, a significant part of the clone time was spent downloading the data.

fio test

$ cd /srv/slow/fio && FIOTEST=hdd fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [3795KB/0KB/0KB /s] [948/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=29158: Wed Jul 19 04:33:19 2017
  read : io=3276.3MB, bw=1863.8KB/s, iops=465, runt=1800078msec
    slat (usec): min=2, max=132, avg=13.00, stdev= 8.10
    clat (usec): min=59, max=830656, avg=17151.07, stdev=36630.33
     lat (usec): min=64, max=830668, avg=17164.94, stdev=36630.39
    clat percentiles (usec):
     | 1.00th=[ 70], 5.00th=[ 78], 10.00th=[ 88], 20.00th=[ 111],
     | 30.00th=[ 143], 40.00th=[ 205], 50.00th=[ 988], 60.00th=[ 1496],
     | 70.00th=[15424], 80.00th=[26240], 90.00th=[50432], 95.00th=[82432],
     | 99.00th=[177152], 99.50th=[224256], 99.90th=[346112], 99.95th=[403456],
     | 99.99th=[544768]
    lat (usec) : 100=15.10%, 250=26.82%, 500=3.79%, 750=1.99%, 1000=2.47%
    lat (msec) : 2=10.84%, 4=0.39%, 10=2.91%, 20=10.92%, 50=14.68%
    lat (msec) : 100=6.53%, 250=3.21%, 500=0.34%, 750=0.02%, 1000=0.01%
  cpu          : usr=0.39%, sys=1.17%, ctx=822685, majf=0, minf=16
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=838728/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=29159: Wed Jul 19 04:33:19 2017
  read : io=3328.4MB, bw=1893.4KB/s, iops=473, runt=1800076msec
    slat (usec): min=2, max=157, avg=13.20, stdev= 8.22
    clat (usec): min=4, max=1046.5K, avg=16882.45, stdev=35737.35
     lat (usec): min=64, max=1046.6K, avg=16896.52, stdev=35737.42
    clat percentiles (usec):
     | 1.00th=[ 70], 5.00th=[ 78], 10.00th=[ 88], 20.00th=[ 111],
     | 30.00th=[ 143], 40.00th=[ 211], 50.00th=[ 996], 60.00th=[ 1528],
     | 70.00th=[15296], 80.00th=[25728], 90.00th=[49920], 95.00th=[81408],
     | 99.00th=[173056], 99.50th=[218112], 99.90th=[329728], 99.95th=[391168],
     | 99.99th=[522240]
    lat (usec) : 10=0.01%, 50=0.01%, 100=15.03%, 250=26.59%, 500=3.86%
    lat (usec) : 750=2.07%, 1000=2.49%
    lat (msec) : 2=10.70%, 4=0.36%, 10=3.03%, 20=11.45%, 50=14.46%
    lat (msec) : 100=6.49%, 250=3.17%, 500=0.30%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.47%, sys=1.15%, ctx=834258, majf=0, minf=14
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=852044/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=6604.6MB, aggrb=3757KB/s, minb=1863KB/s, maxb=1893KB/s, mint=1800076msec, maxt=1800078msec

Disk stats (read/write):
  xvde: ios=1690665/12, merge=0/15, ticks=28764332/2848, in_queue=28767992, util=100.00%

Both jobs together managed an average of 938 IOPS here, which is pretty weak but still considerably more than a pair of HDDs should be able to do (100–150 IOPS each, so 300 if you were amazingly lucky). This indicates that the drive cache must have been involved, and indeed the IOPS rate does very rapidly increase within the first few seconds of the test run but reaches a stable maximum within 20 seconds. That marks the time when the disk cache is full and cache hit rate can’t be maintained.

A total of 6,604.6MiB was read during the 30 minute test.

bcache

Setup

By default bcache uses writethrough cache mode, where writes go to both the backing store and the cache and are only acknowledged once they’ve completed at the backing store. In this mode bcache will never accelerate writes, but it is safe to lose the cache device at any time.

I felt comfortable switching the bcache device to writeback mode as the underlying storage used for the cache is already reliable. In writeback mode, writes are acknowledged as soon as they are completed on the cache device and are later sequentially written to the backing device.

This should result in higher write performance as well as the higher read performance we’d expect from using a cache.

$ sudo make-bcache -B /dev/xvdd
UUID:               cf0353eb-ac08-45bd-a3d5-f8b92fbc0fb8
Set UUID:           8c06d317-e735-41f5-8dec-52da26a37883
version:            1
block_size:         1
data_offset:        16
$ sudo mkfs.ext4 -L cached /dev/bcache0
mke2fs 1.43.4 (31-Jan-2017)
Discarding device blocks: done
Creating filesystem with 13107198 4k blocks and 3276800 inodes
Filesystem UUID: 7db8568e-282b-4fa2-bf19-a42143708a0d
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424
Allocating group tables: done
Writing inode tables: done
Creating journal (65536 blocks): done
Writing superblocks and filesystem accounting information: done
$ sudo make-bcache -C /dev/xvdc
UUID:               ae581052-4449-460d-8cf5-b11d6467403d
Set UUID:           2fcc7937-9f5f-4089-9400-552675929ea5
version:            0
nbuckets:           10240
block_size:         1
bucket_size:        1024
nr_in_set:          1
nr_this_dev:        0
first_bucket:       1
$ sudo bcache-super-show /dev/xvdc
sb.magic                ok
sb.first_sector         8 [match]
sb.csum                 206ADF979437131E [match]
sb.version              3 [cache device]
dev.label               (empty)
dev.uuid                ae581052-4449-460d-8cf5-b11d6467403d
dev.sectors_per_block   1
dev.sectors_per_bucket  1024
dev.cache.first_sector  1024
dev.cache.cache_sectors 10484736
dev.cache.total_sectors 10485760
dev.cache.ordered       yes
dev.cache.discard       no
dev.cache.pos           0
dev.cache.replacement   0 [lru]
cset.uuid               2fcc7937-9f5f-4089-9400-552675929ea5
$ echo 2fcc7937-9f5f-4089-9400-552675929ea5 | sudo tee /sys/block/bcache0/bcache/attach >/dev/null
$ echo writeback | sudo tee /sys/block/bcache0/bcache/cache_mode >/dev/null
$ sudo mount /dev/bcache0 /srv/cache
$ sudo chown andy: /srv/cache

git grep test

$ cd /srv/cache
$ /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Counting objects: 5948690, done.
remote: Compressing objects: 100% (124799/124799), done.
remote: Total 5948690 (delta 427755), reused 549518 (delta 425675)
Receiving objects: 100% (5948690/5948690), 1.21 GiB | 40.30 MiB/s, done.
Resolving deltas: 100% (4961964/4961964), done.
Checking out files: 100% (59844/59844), done.
194.88 user 21.30 system 3:34.10 elapsed 100% CPU
$ cd /srv/cache/linux-stable
$ for i in {1..20}; do
> echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null ; \
> /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git --no-pager grep -q blah ; \
> sleep 1; \
> done
1.03 user 1.40 system 0:03.02 elapsed 80% CPU
1.01 user 1.25 system 0:02.55 elapsed 88% CPU
1.05 user 1.21 system 0:02.63 elapsed 85% CPU
0.90 user 1.34 system 0:02.56 elapsed 87% CPU
0.96 user 1.28 system 0:02.59 elapsed 86% CPU
0.90 user 1.35 system 0:02.54 elapsed 88% CPU
1.04 user 1.22 system 0:02.51 elapsed 90% CPU
1.00 user 1.25 system 0:02.50 elapsed 89% CPU
0.93 user 1.34 system 0:02.54 elapsed 89% CPU
1.02 user 1.26 system 0:02.53 elapsed 90% CPU
0.91 user 1.43 system 0:02.67 elapsed 88% CPU
0.94 user 1.39 system 0:02.57 elapsed 90% CPU
1.04 user 1.28 system 0:02.57 elapsed 90% CPU
1.00 user 1.37 system 0:02.58 elapsed 91% CPU
1.02 user 1.33 system 0:02.84 elapsed 82% CPU
1.11 user 1.21 system 0:02.58 elapsed 90% CPU
1.03 user 1.30 system 0:02.57 elapsed 90% CPU
0.96 user 1.37 system 0:02.58 elapsed 90% CPU
1.02 user 1.36 system 0:02.62 elapsed 90% CPU
0.92 user 1.41 system 0:02.62 elapsed 89% CPU

The initial git clone took 214.10 seconds and each git grep on average took about 2.61 seconds. It can be seen that the first run is slightly slower but very quickly things settle down.

The git grep results are very encouraging being only 11.6% of the time taken for the baseline HDD tests, or almost 9x faster. Each run was only 0.37 seconds slower than when run directly against SSD.

The initial clone results are disappointing though, being only 8.13 seconds faster than when run against baseline HDD. It looks like most data must have bypassed the cache and waited for HDD. bcache does bypass cache for sequential writes which is what the git clone would have done, and we can prove this:

$ cat /sys/block/bcache0/bcache/stats_total/bypassed
270M

So yes, there we go, all 270MiB bypassed the cache and waited on HDD completion.

We can give bcache hints about when to consider writes sequential, and at the extreme end tell it to never consider writes sequential, so let’s try that.

$ echo 0 | sudo tee /sys/block/bcache0/bcache/sequential_cutoff >/dev/null
$ cd /srv/cache
$ rm -r linux-stable
$ /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Counting objects: 5948690, done.
remote: Compressing objects: 100% (124799/124799), done.
remote: Total 5948690 (delta 427756), reused 549517 (delta 425675)
Receiving objects: 100% (5948690/5948690), 1.21 GiB | 39.51 MiB/s, done.
Resolving deltas: 100% (4961965/4961965), done.
Checking out files: 100% (59844/59844), done.
195.10 user 23.53 system 2:39.87 elapsed 136% CPU
$ cat /sys/block/bcache0/bcache/stats_total/bypassed
299M

This time only 29MiB of data bypassed cache, and the time for clone was 159.87 seconds. That’s 62.36 seconds faster than against HDD, and only 1.98 seconds slower than against SSD!

fio test – sequential_cutoff=4M

$ cd /srv/cache/fio && FIOTEST=bcache-sc4m fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [119.9MB/0KB/0KB /s] [30.7K/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=8640: Thu Jul 20 18:13:08 2017
  read : io=45237MB, bw=25734KB/s, iops=6433, runt=1800041msec
    slat (usec): min=3, max=5707, avg= 6.14, stdev=10.97
    clat (usec): min=2, max=1238.4K, avg=1235.14, stdev=10675.04
     lat (usec): min=70, max=1238.5K, avg=1241.70, stdev=10675.36
    clat percentiles (usec):
     | 1.00th=[ 101], 5.00th=[ 104], 10.00th=[ 106], 20.00th=[ 108],
     | 30.00th=[ 111], 40.00th=[ 117], 50.00th=[ 119], 60.00th=[ 121],
     | 70.00th=[ 122], 80.00th=[ 124], 90.00th=[ 133], 95.00th=[ 151],
     | 99.00th=[35584], 99.50th=[67072], 99.90th=[148480], 99.95th=[185344],
     | 99.99th=[305152]
    lat (usec) : 4=0.01%, 50=0.01%, 100=0.37%, 250=97.11%, 500=0.08%
    lat (usec) : 750=0.02%, 1000=0.02%
    lat (msec) : 2=0.03%, 4=0.01%, 10=0.13%, 20=0.65%, 50=0.87%
    lat (msec) : 100=0.46%, 250=0.24%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=1.81%, sys=6.79%, ctx=10722687, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=11580733/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=8641: Thu Jul 20 18:13:08 2017
  read : io=43743MB, bw=24885KB/s, iops=6221, runt=1800037msec
    slat (usec): min=3, max=5703, avg= 6.27, stdev=10.53
    clat (usec): min=2, max=1081.7K, avg=1277.44, stdev=10884.69
     lat (usec): min=70, max=1081.8K, avg=1284.13, stdev=10884.99
    clat percentiles (usec):
     | 1.00th=[ 100], 5.00th=[ 103], 10.00th=[ 104], 20.00th=[ 106],
     | 30.00th=[ 107], 40.00th=[ 109], 50.00th=[ 111], 60.00th=[ 117],
     | 70.00th=[ 120], 80.00th=[ 122], 90.00th=[ 129], 95.00th=[ 145],
     | 99.00th=[37632], 99.50th=[69120], 99.90th=[150528], 99.95th=[189440],
     | 99.99th=[317440]
    lat (usec) : 4=0.01%, 20=0.01%, 50=0.01%, 100=0.68%, 250=96.73%
    lat (usec) : 500=0.07%, 750=0.02%, 1000=0.02%
    lat (msec) : 2=0.03%, 4=0.01%, 10=0.13%, 20=0.68%, 50=0.90%
    lat (msec) : 100=0.48%, 250=0.25%, 500=0.02%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=1.74%, sys=6.57%, ctx=10425775, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=11198290/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=88981MB, aggrb=50618KB/s, minb=24884KB/s, maxb=25734KB/s, mint=1800037msec, maxt=1800041msec

Disk stats (read/write):
  bcache0: ios=22775854/880, merge=0/0, ticks=28668080/36768, in_queue=197531580, util=100.00%, aggrios=11393419/289009, aggrmerge=0/182, aggrticks=14329508/435284, aggrin_queue=14764868, aggrutil=100.00%
  xvdc: ios=22230796/570128, merge=0/348, ticks=2656456/37312, in_queue=2693812, util=67.39%
  xvdd: ios=556042/7890, merge=0/17, ticks=26002560/833256, in_queue=26835924, util=100.00%

The ~15k IOPS per process achieved towards the end of this test are quite impressive and suggest that there is more scope for performance increase in a longer test. The combined average IOPS across the 30 minutes was “only” 12.7k, though: vastly superior to an HDD, and we would be glad to take it.

22,779,023 IOs were issued (11,580,733+11,198,290) of which 22,230,796 (97.6%) hit the cache. That’s a really great result, but we learned earlier that setting sequential_cutoff to 0 allows us to get data we write straight into the cache, so let’s see how that changes things.

fio test – sequential_cutoff=0

$ cd /srv/cache/fio && FIOTEST=bcache-sc0 fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [276.1MB/0KB/0KB /s] [70.1K/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=9190: Thu Jul 20 20:34:49 2017
  read : io=367029MB, bw=208799KB/s, iops=52199, runt=1800001msec
    slat (usec): min=3, max=350673, avg= 5.49, stdev=54.90
    clat (usec): min=22, max=558335, avg=145.49, stdev=593.01
     lat (usec): min=67, max=558342, avg=151.41, stdev=597.19
    clat percentiles (usec):
     | 1.00th=[ 102], 5.00th=[ 106], 10.00th=[ 109], 20.00th=[ 116],
     | 30.00th=[ 120], 40.00th=[ 123], 50.00th=[ 127], 60.00th=[ 133],
     | 70.00th=[ 139], 80.00th=[ 153], 90.00th=[ 185], 95.00th=[ 223],
     | 99.00th=[ 310], 99.50th=[ 338], 99.90th=[ 442], 99.95th=[ 868],
     | 99.99th=[20352]
    lat (usec) : 50=0.01%, 100=0.27%, 250=96.55%, 500=3.11%, 750=0.02%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
    lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=13.23%, sys=41.79%, ctx=46793234, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=93959531/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=9191: Thu Jul 20 20:34:49 2017
  read : io=116645MB, bw=66357KB/s, iops=16589, runt=1800020msec
    slat (usec): min=3, max=86150, avg= 6.31, stdev=22.07
    clat (usec): min=16, max=581877, avg=473.49, stdev=3466.14
     lat (usec): min=67, max=581884, avg=480.26, stdev=3466.62
    clat percentiles (usec):
     | 1.00th=[ 102], 5.00th=[ 107], 10.00th=[ 110], 20.00th=[ 116],
     | 30.00th=[ 120], 40.00th=[ 123], 50.00th=[ 126], 60.00th=[ 131],
     | 70.00th=[ 135], 80.00th=[ 143], 90.00th=[ 161], 95.00th=[ 183],
     | 99.00th=[14656], 99.50th=[23680], 99.90th=[47360], 99.95th=[59648],
     | 99.99th=[94720]
    lat (usec) : 20=0.01%, 50=0.01%, 100=0.23%, 250=97.81%, 500=0.37%
    lat (usec) : 750=0.03%, 1000=0.02%
    lat (msec) : 2=0.02%, 4=0.02%, 10=0.18%, 20=0.66%, 50=0.58%
    lat (msec) : 100=0.08%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=4.86%, sys=16.81%, ctx=24164179, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=29861123/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=483674MB, aggrb=275153KB/s, minb=66357KB/s, maxb=208798KB/s, mint=1800001msec, maxt=1800020msec

Disk stats (read/write):
  bcache0: ios=123818417/28, merge=0/0, ticks=27945560/5768, in_queue=429639892, util=100.00%, aggrios=61919282/259095, aggrmerge=0/25, aggrticks=13927904/125270, aggrin_queue=14059068, aggrutil=100.00%
  xvdc: ios=123345393/500268, merge=0/35, ticks=17064408/30904, in_queue=17107112, util=99.99%
  xvdd: ios=493172/17923, merge=0/15, ticks=10791400/219636, in_queue=11011024, util=100.00%

123,820,654 (93,959,531+29,861,123) IOs issued, 123,345,393 (99.62%) hit cache. It’s just ~2% higher than the previous test, but what a difference it makes. The IOPS log indicates that performance very quickly increases to near-SSD levels, averaging 68.8k IOPS across 30 minutes.

Interestingly unlike all other tests (even unlike bcache with the default sequential_cutoff), the second job doesn’t manage anywhere near the same level of performance as the first job. It’s 52.2k IOPS vs. 16.6k IOPS. Perhaps at this high level of cache utilisation there is some sort of locking going on. Might be worth following up with upstream.

lvmcache

Setup

$ sudo pvcreate /dev/xvdc
  Physical volume "/dev/xvdc" successfully created.
$ sudo pvcreate /dev/xvdd
  Physical volume "/dev/xvdd" successfully created.
$ sudo vgcreate cachedvg /dev/xvd{d,c}
  Volume group "cachedvg" successfully created
$ sudo lvcreate -L50m -n lv_cache_meta cachedvg /dev/xvdc
  Rounding up size to full physical extent 52.00 MiB
  Logical volume "lv_cache_meta" created.
$ sudo lvcreate -L4g -n lv_cache cachedvg /dev/xvdc
  Logical volume "lv_cache" created.
$ sudo lvcreate -l100%PVS -n lv cachedvg /dev/xvdd
  Logical volume "lv" created.
$ sudo lvs
  LV            VG       Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv            cachedvg -wi-a----- 50.00g
  lv_cache      cachedvg -wi-a-----  4.00g
  lv_cache_meta cachedvg -wi-a----- 52.00m
$ sudo lvconvert --type cache-pool --cachemode writeback --poolmetadata cachedvg/lv_cache_meta cachedvg/lv_cache
  Converted cachedvg/lv_cache to cache pool.
$ sudo lvconvert --type cache --cachepool cachedvg/lv_cache cachedvg/lv
  Logical volume cachedvg/lv is now cached.
$ sudo lvs -a -o +devices
  LV               VG       Attr       LSize  Pool       Origin     Data%  Meta%  Move Log Cpy%Sync Convert Devices
  lv               cachedvg Cwi-a-C--- 50.00g [lv_cache] [lv_corig] 0.00   1.01            0.00             lv_corig(0)
  [lv_cache]       cachedvg Cwi---C---  4.00g                       0.00   1.01            0.00             lv_cache_cdata(0)
  [lv_cache_cdata] cachedvg Cwi-ao----  4.00g                                                               /dev/xvdc(13)
  [lv_cache_cmeta] cachedvg ewi-ao---- 52.00m                                                               /dev/xvdc(0)
  [lv_corig]       cachedvg owi-aoC--- 50.00g                                                               /dev/xvdd(0)
  [lvol0_pmspare]  cachedvg ewi------- 52.00m                                                               /dev/xvdc(1037)
$ sudo mkfs.ext4 -L cached /dev/cachedvg/lv
mke2fs 1.43.4 (31-Jan-2017)
Creating filesystem with 13106176 4k blocks and 3276800 inodes
Filesystem UUID: bff28530-339a-403d-889d-dd42f4465ab5
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424
Allocating group tables: done
Writing inode tables: done
Creating journal (65536 blocks): done
Writing superblocks and filesystem accounting information: done
$ sudo mount /dev/cachedvg/lv /srv/cache
$ sudo chown andy: /srv/cache

git grep test

$ cd /srv/cache
$ /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Counting objects: 5948690, done.
remote: Compressing objects: 100% (124799/124799), done.
remote: Total 5948690 (delta 427755), reused 549518 (delta 425675)
Receiving objects: 100% (5948690/5948690), 1.21 GiB | 35.05 MiB/s, done.
Resolving deltas: 100% (4961964/4961964), done.
Checking out files: 100% (59844/59844), done.
196.00 user 23.40 system 2:43.13 elapsed 134% CPU
$ cd /srv/cache/linux-stable
$ for i in {1..20}; do
> echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null ; \
> /usr/bin/time -f '%U user %S system %E elapsed %P CPU' \
> git --no-pager grep -q blah ; \
> sleep 1; \
> done
1.00 user 1.58 system 0:06.41 elapsed 40% CPU
0.93 user 1.46 system 0:04.22 elapsed 56% CPU
1.09 user 1.30 system 0:04.09 elapsed 58% CPU
1.02 user 1.39 system 0:03.89 elapsed 62% CPU
0.97 user 1.41 system 0:03.89 elapsed 61% CPU
1.03 user 1.34 system 0:03.52 elapsed 67% CPU
1.06 user 1.30 system 0:03.60 elapsed 65% CPU
0.94 user 1.46 system 0:03.79 elapsed 63% CPU
1.05 user 1.33 system 0:03.28 elapsed 72% CPU
1.00 user 1.39 system 0:03.81 elapsed 62% CPU
1.03 user 1.37 system 0:03.25 elapsed 74% CPU
1.05 user 1.37 system 0:03.30 elapsed 73% CPU
1.07 user 1.30 system 0:03.18 elapsed 74% CPU
1.07 user 1.32 system 0:03.28 elapsed 72% CPU
0.96 user 1.46 system 0:02.95 elapsed 82% CPU
1.07 user 1.31 system 0:03.14 elapsed 76% CPU
0.98 user 1.40 system 0:03.04 elapsed 78% CPU
0.96 user 1.42 system 0:03.19 elapsed 74% CPU
1.00 user 1.40 system 0:03.12 elapsed 77% CPU
0.89 user 1.48 system 0:02.89 elapsed 82% CPU

The initial git clone took 163.13 seconds, not far off the native SSD timing (157.89 seconds). Each git grep on average took about 3.35 seconds. These started off slower (6.41 seconds) but settled down, with the last 15 runs averaging 3.29 seconds.

fio test – empty cache

$ cd /srv/cache/fio && FIOTEST=lvmcache-empty fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [373.5MB/0KB/0KB /s] [95.7K/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=6652: Thu Jul 20 12:14:06 2017
  read : io=333831MB, bw=189912KB/s, iops=47478, runt=1800001msec
    slat (usec): min=2, max=245, avg= 6.41, stdev= 2.17
    clat (usec): min=29, max=234093, avg=159.69, stdev=401.74
     lat (usec): min=69, max=234098, avg=166.57, stdev=401.75
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 108], 10.00th=[ 113], 20.00th=[ 119],
     | 30.00th=[ 124], 40.00th=[ 129], 50.00th=[ 135], 60.00th=[ 143],
     | 70.00th=[ 153], 80.00th=[ 175], 90.00th=[ 221], 95.00th=[ 274],
     | 99.00th=[ 370], 99.50th=[ 410], 99.90th=[ 502], 99.95th=[ 996],
     | 99.99th=[16768]
    lat (usec) : 50=0.01%, 100=0.16%, 250=92.76%, 500=6.97%, 750=0.05%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.02%, 50=0.01%
    lat (msec) : 100=0.01%, 250=0.01%
  cpu          : usr=12.04%, sys=44.39%, ctx=42073879, majf=0, minf=16
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=85460611/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=6653: Thu Jul 20 12:14:06 2017
  read : io=313896MB, bw=178572KB/s, iops=44642, runt=1800001msec
    slat (usec): min=2, max=335, avg= 6.55, stdev= 2.25
    clat (usec): min=22, max=212646, avg=170.22, stdev=721.29
     lat (usec): min=63, max=212656, avg=177.25, stdev=721.32
    clat percentiles (usec):
     | 1.00th=[ 103], 5.00th=[ 108], 10.00th=[ 112], 20.00th=[ 118],
     | 30.00th=[ 122], 40.00th=[ 127], 50.00th=[ 131], 60.00th=[ 137],
     | 70.00th=[ 145], 80.00th=[ 157], 90.00th=[ 177], 95.00th=[ 197],
     | 99.00th=[ 258], 99.50th=[ 314], 99.90th=[12096], 99.95th=[15424],
     | 99.99th=[28032]
    lat (usec) : 50=0.01%, 100=0.22%, 250=98.57%, 500=0.91%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.02%, 10=0.11%, 20=0.11%, 50=0.03%
    lat (msec) : 100=0.01%, 250=0.01%
  cpu          : usr=11.36%, sys=42.77%, ctx=41146564, majf=0, minf=14
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=80357365/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=647726MB, aggrb=368484KB/s, minb=178571KB/s, maxb=189912KB/s, mint=1800001msec, maxt=1800001msec

Disk stats (read/write):
    dm-2: ios=165815070/5, merge=0/0, ticks=27172896/564, in_queue=27297688, util=100.00%, aggrios=55273566/7259, aggrmerge=0/0, aggrticks=8986252/1498, aggrin_queue=9007888, aggrutil=100.00%
  dm-1: ios=2/19047, merge=0/0, ticks=0/3072, in_queue=3072, util=0.14%, aggrios=165379217/17690, aggrmerge=0/4083, aggrticks=23890596/3616, aggrin_queue=23926592, aggrutil=100.00%
  xvdc: ios=165379217/17690, merge=0/4083, ticks=23890596/3616, in_queue=23926592, util=100.00%
  dm-0: ios=165379215/2726, merge=0/0, ticks=23917816/860, in_queue=23979056, util=100.00%
  dm-3: ios=441483/6, merge=0/0, ticks=3040940/564, in_queue=3041536, util=75.31%, aggrios=441483/5, aggrmerge=0/1, aggrticks=3040896/464, aggrin_queue=3041312, aggrutil=75.30%
  xvdd: ios=441483/5, merge=0/1, ticks=3040896/464, in_queue=3041312, util=75.30%

We can see that the first job issued 85,460,611 IOs and the second job issued 80,357,365 IOs, for a total of 165,817,976 IOs. Of these, 165,379,217 (99.74%) came from xvdc, the cache (SSD) device. Only 441,483 (0.27%) came from xvdd, the backing (HDD) device. Again this is very aggressive cache population, with a combined average of 92k IOPS being achieved over 30 minutes.

80k IOPS are achieved within 20 seconds of the start of testing, which is probably helped by the disk cache, but the steady maximum of ~90k IOPS is still achieved by 40 seconds in. Average IOPS ended up at 88% that of the SSD.

This is incredible performance, but I could only get it to happen the first time, against an empty cache device. Since the next test run will unlink the previous run’s files, the cache device will at that point be full of data that is of no use as it’s been deleted. The next test will write out 4GiB of new, uncached data.

fio test – full cache

$ cd /srv/cache/fio && FIOTEST=lvmcache-full fio ~/partial.fio
partial: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
fio-2.16
Starting 2 processes
partial: Laying out IO file(s) (1 file(s) / 2048MB)
partial: Laying out IO file(s) (1 file(s) / 2048MB)
Jobs: 2 (f=2): [r(2)] [100.0% done] [6072KB/0KB/0KB /s] [1518/0/0 iops] [eta 00m:00s]
partial: (groupid=0, jobs=1): err= 0: pid=6857: Thu Jul 20 12:58:00 2017
  read : io=5266.1MB, bw=2996.2KB/s, iops=749, runt=1800111msec
    slat (usec): min=3, max=117, avg=12.93, stdev= 7.50
    clat (usec): min=58, max=1060.8K, avg=10663.38, stdev=31025.49
     lat (usec): min=73, max=1060.8K, avg=10676.99, stdev=31025.92
    clat percentiles (usec):
     |  1.00th=[   105],  5.00th=[   112], 10.00th=[   117], 20.00th=[   125],
     | 30.00th=[   127], 40.00th=[   133], 50.00th=[   137], 60.00th=[   153],
     | 70.00th=[   179], 80.00th=[ 14144], 90.00th=[ 33024], 95.00th=[ 59136],
     | 99.00th=[144384], 99.50th=[191488], 99.90th=[337920], 99.95th=[411648],
     | 99.99th=[618496]
    lat (usec) : 100=0.29%, 250=74.34%, 500=0.46%, 750=0.20%, 1000=0.21%
    lat (msec) : 2=0.47%, 4=0.03%, 10=1.30%, 20=7.26%, 50=9.23%
    lat (msec) : 100=4.09%, 250=1.88%, 500=0.22%, 750=0.02%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.48%, sys=1.52%, ctx=1335232, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1348336/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8
partial: (groupid=0, jobs=1): err= 0: pid=6858: Thu Jul 20 12:58:00 2017
  read : io=5171.2MB, bw=2941.7KB/s, iops=735, runt=1800133msec
    slat (usec): min=3, max=320, avg=13.05, stdev= 7.56
    clat (usec): min=63, max=1145.3K, avg=10861.20, stdev=30827.62
     lat (usec): min=72, max=1145.3K, avg=10874.93, stdev=30828.07
    clat percentiles (usec):
     |  1.00th=[   109],  5.00th=[   116], 10.00th=[   120], 20.00th=[   126],
     | 30.00th=[   131], 40.00th=[   133], 50.00th=[   141], 60.00th=[   155],
     | 70.00th=[   185], 80.00th=[ 14912], 90.00th=[ 33536], 95.00th=[ 59136],
     | 99.00th=[142336], 99.50th=[189440], 99.90th=[333824], 99.95th=[411648],
     | 99.99th=[602112]
    lat (usec) : 100=0.33%, 250=73.29%, 500=0.50%, 750=0.22%, 1000=0.22%
    lat (msec) : 2=0.52%, 4=0.03%, 10=1.38%, 20=7.58%, 50=9.63%
    lat (msec) : 100=4.18%, 250=1.88%, 500=0.21%, 750=0.02%, 1000=0.01%
    lat (msec) : 2000=0.01%
  cpu          : usr=0.43%, sys=1.55%, ctx=1310390, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1323812/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
   READ: io=10438MB, aggrb=5937KB/s, minb=2941KB/s, maxb=2996KB/s, mint=1800111msec, maxt=1800133msec

Disk stats (read/write):
    dm-2: ios=2671993/27, merge=0/0, ticks=28764200/8380, in_queue=28773988, util=100.00%, aggrios=891886/15335, aggrmerge=0/0, aggrticks=9387100/6312, aggrin_queue=9393456, aggrutil=100.00%
  dm-1: ios=173/42639, merge=0/0, ticks=16/9316, in_queue=9336, util=0.24%, aggrios=1967684/42191, aggrmerge=0/3788, aggrticks=291076/9440, aggrin_queue=300476, aggrutil=15.68%
    xvdc: ios=1967684/42191, merge=0/3788, ticks=291076/9440, in_queue=300476, util=15.68%
  dm-0: ios=1967511/3340, merge=0/0, ticks=291184/1240, in_queue=292496, util=15.55%
  dm-3: ios=707976/28, merge=0/0, ticks=27870100/8380, in_queue=27878536, util=100.00%, aggrios=707976/12, aggrmerge=0/16, aggrticks=27870028/2908, aggrin_queue=27872968, aggrutil=100.00%
    xvdd: ios=707976/12, merge=0/16, ticks=27870028/2908, in_queue=27872968, util=100.00%

Testing with an already-full cache produces dramatically different results. We can see that a total of 2,672,148 (1,348,336 + 1,323,812) IOs were issued, with 1,967,684 (73.64%) coming from the cache and 707,976 (26.49%) coming from the backing device. Although on the face of it that sounds like a good hit rate, it was poor enough to severely limit the average IOPS to 1,484 across both jobs. We saw earlier how much difference there was between a 99.6% and a 97.6% cache hit rate; now we see the difference between ~99% and ~73%.

1,484 IOPS is a 58% increase on the baseline HDD's 938 IOPS (1.58x the baseline), so it's nothing to be sniffed at, and as I say it may be the case that it only gets better over a much longer time span.
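
For clarity, here's the arithmetic behind those figures as a throwaway Perl sketch; the constants are simply the numbers from the fio output above:

#!/usr/bin/perl
# Sanity-check the cache hit rate and uplift figures quoted above.
use strict;
use warnings;

my $total_ios   = 1_348_336 + 1_323_812;  # IOs issued by the two fio jobs
my $cache_ios   = 1_967_684;              # reads served by xvdc (SSD cache)
my $backing_ios =   707_976;              # reads served by xvdd (backing HDD)

printf "hit rate:  %.2f%%\n", 100 * $cache_ios / $total_ios;        # 73.64%
printf "miss rate: %.2f%%\n", 100 * $backing_ios / $total_ios;      # 26.49%
printf "uplift:    %.2fx the 938 IOPS HDD baseline\n", 1_484 / 938; # 1.58x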

Even after this 30 minute test the performance of the lvmcache was still very slowly going up, so I decided to run another test across 3 hours instead of 30 minutes. The performance did continue to increase, very slowly, peaking at about 1,640 aggregate IOPS after 3 hours. It seems to be quite difficult to evict data from the cache in favour of new data that is being read, whereas when the cache was empty everything that was read was promoted immediately. Possibly if this were left for long enough there would eventually be enough cache hits to achieve performance similar to the first run.

bcache did not suffer from this problem: every test run against bcache performed the same, except for the effect of preloading the cache device by messing with sequential_cutoff. I suspect that this is because bcache is a write cache: when set to writethrough or writeback mode, writes go through the cache and are cached immediately. lvmcache is not a write cache; it will only promote things when they are read.

It seems like lvmcache needs a very large number of reads before it will evict data from the cache and bring new data in.

Further tests?

If you have any further tests you'd like to suggest, please leave a comment or a tweet. I think it would be good to come up with more fio tests for this, but bear in mind that they really need to exercise random read IO with a repetitive element, so that a cache can actually be effective. If you read every part of 100GiB equally often then there are no hot spots, and a comparatively small cache will not be effective, no matter what technology is used.

Real-world benchmarks would also be good, so perhaps a really complex SQL SELECT on a data set that won't fit in memory (1GiB) but will fit in the SSD cache (5GiB).

It might be interesting to use a cache device much smaller than the working set. These tests laid out two 2GiB files then used a Zipf distribution to randomly read 10% of each (so two sets of ~205MiB data) while the cache devices were 4 or 5GiB in size. It is difficult to guess what the hot spots would be on a production file system, but a 90/10 or 80/20 distribution does seem reasonable.

What may be less reasonable is starting off with such a small data set. What would it be like to try to cache 1TiB with 10GiB of SSD? I’d rather not write out 1TiB of data to check, but I could simulate by writing out 51GiB (5%) and having a cache device that is only 512MiB. On a 90/10 distribution it would still want to cache ~5.2GiB and only have 512MiB to put it in.
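
As a back-of-envelope check on that scaling argument, here's a quick Perl sketch; the only assumption is the 90/10 split described above:

#!/usr/bin/perl
# Estimate the hot set for the proposed scaled-down simulation.
use strict;
use warnings;

my $data_gib  = 1024 * 0.05;       # write out 5% of 1TiB: 51.2GiB
my $hot_gib   = $data_gib * 0.10;  # 90/10 distribution: ~10% of the data is hot
my $cache_gib = 512 / 1024;        # deliberately undersized cache: 0.5GiB

printf "hot set %.1fGiB vs cache %.1fGiB (~%.0fx oversubscribed)\n",
    $hot_gib, $cache_gib, $hot_gib / $cache_gib;  # ~5.1GiB vs 0.5GiB, ~10x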

Conclusions

SSDs are lovely but they’re still really expensive. Although they are fun for turning out impressive benchmarks, in the real world you only need things to be fast enough. If you want consistently great performance and you have the money, buy some decent SSDs. If you don’t have the money then there is a whole world of trade-offs.

bcache, from what I can measure, provides really impressive performance, but I am concerned about relying upon it. The existing implementation is no longer the focus of its main developer's attention, as he has moved on to bcachefs, a complete filesystem in its own right that builds upon the bcache concept.

I’m really conservative when it comes to filesystems and I know it can take years, even decades, to get to feature parity with all the nasty bugs hammered out. I also worry about continued maintenance of a separate filesystem.

lvmcache on the other hand is providing only a very modest performance boost. In many cases it won’t be worth it. I do like its simplicity though and I’m already very familiar with LVM.

The way that any logical volume can have a cache attached without needing to start over with the filesystem makes lvmcache very flexible; it also means that some volumes can be cached and others not. Furthermore with a slow PV and a fast PV both in the volume group, if necessary an individual logical volume can be allocated entirely from the fast PV.

I may experiment with bcache some more in situations where I can quickly restore the data in event of calamity, or fairly easily move to a different solution if bcache becomes abandonware. I may experiment with lvmcache in places where a modest speedup would be welcome.

Categories: LUG Community Blogs

Daniel Silverstone (Kinnison): Yay, finished my degree at last

Planet ALUG - Tue, 18/07/2017 - 22:56

A little while back, in June, I sat my last exam for what I hoped would be the last module in my degree. For seven years, I've been working on a degree with the Open University and have been taking advantage of the opportunity to have a somewhat self-directed course load by taking the 'Open' degree track. When asked why I bothered to do this, I guess my answer has been a little varied. In principle it's because I felt like I'd already done a year's worth of degree and didn't want it wasted, but it's also because I have been, in the dim and distant past, overlooked for jobs simply because I had no degree and thus was an easy "bin the CV".

Fed up with this, I decided to commit to the Open University and thus began my journey toward 'qualification' in 2010. I started by transferring the level 1 credits from my stint at UCL back in 1998/1999 which were in a combination of basic programming in Java, some mathematics including things like RSA, and some psychology and AI courses which at the time were aiming at a degree called 'Computer Science with Cognitive Sciences'.

Then I took level 2 courses, M263 (Building blocks of software), TA212 (The technology of music) and MS221 (Exploring mathematics). I really enjoyed the mathematics course and so...

At level 3 I took MT365 (Graphs, networks and design), M362 (Developing concurrent distributed systems), TM351 (Data management and analysis - which I ended up hating), and finally finishing this June with TM355 (Communications technology).

I received an email this evening telling me the module result for TM355 had been posted, and I logged in to find I had done well enough to be offered my degree. I could have claimed my degree 18+ months ago, but I persevered through another two courses in order to qualify for an honours degree, which I have now been awarded. Since I don't particularly fancy any ceremonial awarding, I just went through the clicky clicky and accepted my qualification of 'Bachelor of Science (Honours) Open, Upper Second-class Honours (2.1)', which grants me the letters 'BSc (Hons) Open (Open)' and which, knowing me, will likely never even make it onto my CV because I'm too lazy.

It has been a significant effort, over the course of the past few years, to complete a degree without giving up too much of my personal commitments. In addition to earning the degree, I have worked, for six of the seven years it has taken, for Codethink doing interesting work in and around Linux systems and Trustable software. I have designed and built Git server software which is in use in some universities, and many companies, along with a good few of my F/LOSS colleagues. And I've still managed to find time to attend plays, watch films, read an average of 2 novel-length stories a week (some of which were even real books), and be a member of the Manchester Hackspace.

Right now, I'm looking forward to a stress free couple of weeks, followed by an immense amount of fun at Debconf17 in Montréal!

Categories: LUG Community Blogs

Jonathan McDowell: Just because you can, doesn't mean you should

Planet ALUG - Mon, 17/07/2017 - 19:41

There was a recent Cryptoparty Belfast event that was aimed at a wider audience than usual; rather than concentrating on how to protect oneself on the internet, the 3 speakers concentrated more on why you might want to. As seems to be the way these days, I was asked to say a few words about the intersection of technology and the law. I think people were most interested in all the gadgets on show at the end, but I hope they got something out of my talk. It was a very high-level overview of some of the issues around the Investigatory Powers Act - if you're familiar with it then I'm not adding anything new here, just trying to give some sense of why it's a bad thing from both a technological and a legal perspective.

Download

Categories: LUG Community Blogs

Chris Lamb: Installation birthday

Planet ALUG - Fri, 14/07/2017 - 11:08

Fancy receiving congratulations on the anniversary of when you installed your system?

Installing the installation-birthday package on your Debian machines will celebrate each birthday of your system by automatically sending a message to the local system administrator.

The birthday is based on the date the system itself was installed, not the date the package was installed.

installation-birthday is available in Debian 9 ("stretch") via the stretch-backports repository, as well as in the testing and unstable distributions:

$ apt install installation-birthday

Enjoy, and patches welcome. :)

Categories: LUG Community Blogs

Steve Engledow (stilvoid): An evening of linux on the desktop

Planet ALUG - Mon, 10/07/2017 - 23:14

Last time, I wrote about trying a few desktop environments to see what's out there, keep things fresh, and keep me from complacency. Well, as with desktop environments, so with text editors: I briefly decided to try a few of the more recent code editors that are around these days. Lured in by their pleasing, modern visuals and their promises of a smooth, integrated experience, I'd been meaning to give them a go for a while. Needless to say, as a long-time vim user, I just found myself frustrated that I couldn't get things done as efficiently in any of those editors as I could in vim ;) I tried installing vim keybindings in Atom but it just wasn't the same, as only a very limited subset of the functionality was there. As for the integrated environment, when you have tmux running by default, everything's integrated anyway.

And, as with editors, so once again with desktop environments. I've decided to retract my previous hasty promise and not bother trying any other environments; i3 is more than fine :)

However, I did spend some time this evening making things a bit prettier so here are some delicious configs for posterity:

Configs

Xresources

I've switched back to xterm from urxvt because, er... dunno.

Anyway, I set some nice colours for terminals and some magic stuff that makes man pages all colourful :)

XTerm*faceName: xft:Hack:regular:size=12
*termName: xterm-256color

! Colourful man pages
*VT100.colorBDMode: true
*VT100.colorBD: cyan
*VT100.colorULMode: true
*VT100.colorUL: darkcyan
*VT100.colorITMode: true
*VT100.colorIT: yellow
*VT100.veryBoldColors: 518

! terminal colours
*foreground: #CCCCCC
*background: #2B2D2E

!black darkgray
*color0: #2B2D2E
*color8: #808080
!darkred red
*color1: #FF0044
*color9: #F92672
!darkgreen green
*color2: #82B414
*color10: #A6E22E
!darkyellow yellow
*color3: #FD971F
*color11: #E6DB74
!darkblue blue
*color4: #266C98
*color12: #7070F0
!darkmagenta magenta
*color5: #AC0CB1
*color13: #D63AE1
!darkcyan cyan
*color6: #AE81FF
*color14: #66D9EF
!gray white
*color7: #CCCCCC
*color15: #F8F8F2

Vimrc

Nothing exciting here except for discovering a few options I hadn't previous known about:

" Show a marker at the 80th column to encourage nice code set colorcolumn=80 highlight ColorColumn ctermbg=darkblue " Scroll the text when we're 3 lines from the top or bottom set so=3 " Use browser-style incremental search set incsearch " Override the default background colour in xoria256 to match the terminal background highlight Normal ctermbg=black " I like this theme colorscheme xoria256 i3

I made a few colour tweaks to my i3 config so I get colours that match my new Xresources. One day, I might see if it's easy enough to have them both read colour definitions from the same place so I don't have to define things twice.
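
I haven't actually tried that yet, but a minimal sketch of the idea might look something like this (hypothetical output file names and a heavily abbreviated palette; treat it as a starting point rather than anything I actually run):

#!/usr/bin/perl
# Hypothetical sketch: render Xresources and i3 colour definitions from one
# palette so each colour only needs defining once. File names are made up.
use strict;
use warnings;

my %palette = (
    foreground => '#CCCCCC',
    background => '#2B2D2E',
    color0     => '#2B2D2E',
    color1     => '#FF0044',
    # ... and so on for the rest of the palette ...
);

open my $xr, '>', "$ENV{HOME}/.Xresources.colours" or die "Xresources: $!";
printf $xr "*%s: %s\n", $_, $palette{$_} for sort keys %palette;
close $xr;

open my $i3, '>', "$ENV{HOME}/.config/i3/colours" or die "i3: $!";
printf $i3 "set \$%s %s\n", $_, $palette{$_} for sort keys %palette;
close $i3;

The Xresources fragment could then be merged at login with xrdb -merge, while the i3 fragment would need concatenating into the real config by a small wrapper, since i3 itself doesn't offer an obvious way to include external files.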

The result

Here's what it looks like:

Categories: LUG Community Blogs

Steve Kemp: bind9 considered harmful

Planet HantsLUG - Mon, 10/07/2017 - 22:00

Recently there was another bind9 security update released by the Debian Security Team. I thought that was odd, so I've scanned my mailbox:

  • 11 January 2017
    • DSA-3758 - bind9
  • 26 February 2017
    • DSA-3795-1 - bind9
  • 14 May 2017
    • DSA-3854-1 - bind9
  • 8 July 2017
    • DSA-3904-1 - bind9

So in the year to date there have been 7 months, in 3 of them nothing happened, but in 4 of them we had bind9 updates. If these trends continue we'll have another 2.5 updates before the end of the year.

I don't run a nameserver. The only reason I have bind-packages on my system is for the dig utility.

Rewriting a compatible version of dig in Perl should be trivial, thanks to the Net::DNS::Resolver module:

These are about the only commands I ever run:

dig -t a steve.fi +short
dig -t aaaa steve.fi +short
dig -t a steve.fi @8.8.8.8

I should do that then. Yes.
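
Something along these lines ought to cover those three invocations - an untested sketch using Net::DNS, with a made-up script name, rather than a drop-in dig replacement:

#!/usr/bin/perl
# Untested sketch of a minimal dig stand-in using Net::DNS.
# Usage: mini-dig TYPE NAME [SERVER]   e.g. mini-dig a steve.fi 8.8.8.8
use strict;
use warnings;
use Net::DNS;

my ($type, $name, $server) = @ARGV;
my $res = Net::DNS::Resolver->new;
$res->nameservers($server) if defined $server;

my $reply = $res->query($name, uc $type)
    or die $res->errorstring . "\n";

for my $rr ($reply->answer) {
    # A/AAAA records have an address; fall back to the full RR text.
    print $rr->can('address') ? $rr->address : $rr->string, "\n";
}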

Categories: LUG Community Blogs

Jonathan McDowell: Going to DebConf 17

Planet ALUG - Mon, 10/07/2017 - 18:54

Completely forgot to mention this earlier in the year, but delighted to say that in just under 4 weeks I’ll be attending DebConf 17 in Montréal. Looking forward to seeing a bunch of fine folk there!

Outbound:

2017-08-04 11:40 DUB -> 13:40 KEF WW853
2017-08-04 15:25 KEF -> 17:00 YUL WW251

Inbound:

2017-08-12 19:50 YUL -> 05:00 KEF WW252
2017-08-13 06:20 KEF -> 09:50 DUB WW852

(Image created using GIMP, fonts-dkg-handwriting and the DebConf17 Artwork.)

Categories: LUG Community Blogs

Daniel Silverstone (Kinnison): Gitano - Approaching Release - Access Control Changes

Planet ALUG - Sat, 08/07/2017 - 10:31

As mentioned previously I am working toward getting Gitano into Stretch. A colleague and friend of mine (Richard Maw) did a large pile of work on Lace to support what we are calling sub-defines. These let us simplify Gitano's ACL files, particularly for individual projects.

In this posting, I'd like to cover what has changed with the access control support in Gitano, so if you've never used it then some of this may make little sense. Later on, I'll be looking at some better user documentation in conjunction with another friend of mine (Lars Wirzenius) who has promised to help produce a basic administration manual before Stretch is totally frozen.

Sub-defines

With a more modern lace (version 1.3 or later) there is a mechanism we are calling 'sub-defines'. Previously if you wanted to write a ruleset which said something like "Allow Steve to read my repository" you needed:

define is_steve user exact steve
allow "Steve can read my repo" is_steve op_read

And, as you'd expect, if you also wanted to grant read access to Jeff then you'd need yet another set of defines:

define is_jeff user exact jeff
define is_steve user exact steve
define readers anyof is_jeff is_steve
allow "Steve and Jeff can read my repo" readers op_read

This, while flexible (and still entirely acceptable), is wordy for small rulesets, and so we added sub-defines to create this syntax:

allow "Steve and Jeff can read my repo" op_read [anyof [user exact jeff] [user exact steve]]

Of course, this is generally neater for simpler rules; if you wanted to add another user then it might make sense to go for:

define readers anyof [user exact jeff] [user exact steve] [user exact susan]
allow "My friends can read my repo" op_read readers

The nice thing about this sub-define syntax is that it's basically usable anywhere you'd use the name of a previously defined thing, they're compiled in much the same way, and Richard worked hard to get good error messages out from them just in case.

No more auto_user_XXX and auto_group_YYY

As a result of the above being implemented, the support Gitano previously grew for automatically defining users and groups has been removed. The approach we took was pretty inflexible and risked compilation errors if a user was deleted or renamed, and so the sub-define approach is much much better.

If you currently use auto_user_XXX or auto_group_YYY in your rulesets then your upgrade path isn't bumpless but it should be fairly simple:

  1. Upgrade your version of lace to 1.3
  2. Replace any auto_user_FOO with [user exact FOO] and similarly for any auto_group_BAR to [group exact BAR].
  3. You can now upgrade Gitano safely.

No more 'basic' matches

Since Gitano first gained support for ACLs using Lace, we had a mechanism called 'simple match' for basic inputs such as groups, usernames, repo names, ref names, etc. Simple matches looked like user FOO or group !BAR. The match syntax grew more and more arcane as we added Lua pattern support (refs ~^refs/heads/${user}/). When we wanted to add proper PCRE regex support we added a syntax of the form user pcre ^/.+?..., where the middle word could be any of: exact, prefix, suffix, pattern, or pcre. We had a complex set of rules for exactly what the sigils at the start of the match string might mean and in what order, and it was getting unwieldy.

To simplify matters, none of the "backward compatibility" remains in Gitano. You instead MUST use the explicit what how match form (e.g. user exact steve). To make this slightly more natural to use, we have added a bunch of aliases: is for exact, starts and startswith for prefix, and ends and endswith for suffix. In addition, the kind of match can be prefixed with a ! to invert it, and for natural-looking rules not is an alias for !is.

This means that your rulesets MUST be updated to support the more explicit syntax before you update Gitano, or else nothing will compile. Fortunately this form has been supported for a long time, so you can do this in three steps.

  1. Update your gitano-admin.git global ruleset. For example, the old form of the defines used to contain define is_gitano_ref ref ~^refs/gitano/ which can trivially be replaced with: define is_gitano_ref ref prefix refs/gitano/
  2. Update any non-zero rulesets your projects might have.
  3. You can now safely update Gitano

If you want a reference for making those changes, you can look at the Gitano skeleton ruleset which can be found at https://git.gitano.org.uk/gitano.git/tree/skel/gitano-admin/rules/ or in /usr/share/gitano if Gitano is installed on your local system.

Next time, I'll likely talk about the deprecated commands which are no longer in Gitano, and how you'll need to adjust your automation to use the new commands.

Categories: LUG Community Blogs

Steve Kemp: I've never been more proud

Planet HantsLUG - Tue, 04/07/2017 - 22:00

This morning I remembered I had a beefy virtual server set up to run some kernel builds on (from when I was playing with Linux security modules), and I figured before I shut it down I should use the power to run some fuzzing.

As I was writing some code in Emacs at the time I figured "why not fuzz emacs?"

After a few hours this was the result:

deagol ~ $ perl -e 'print "`" x ( 1024 * 1024 * 12);' > t.el
deagol ~ $ /usr/bin/emacs --batch --script ./t.el
..
..
Segmentation fault (core dumped)

Yup, evaluating a Lisp file caused a segfault due to a stack overflow (no security implications). I've never been more proud, even though I have contributed code to GNU Emacs in the past.

Categories: LUG Community Blogs

Debian Bits: New Debian Developers and Maintainers (May and June 2017)

Planet HantsLUG - Sun, 02/07/2017 - 13:30

The following contributors got their Debian Developer accounts in the last two months:

  • Alex Muntada (alexm)
  • Ilias Tsitsimpis (iliastsi)
  • Daniel Lenharo de Souza (lenharo)
  • Shih-Yuan Lee (fourdollars)
  • Roger Shimizu (rosh)

The following contributors were added as Debian Maintainers in the last two months:

  • James Valleroy
  • Ryan Tandy
  • Martin Kepplinger
  • Jean Baptiste Favre
  • Ana Cristina Custura
  • Unit 193

Congratulations!

Categories: LUG Community Blogs

Daniel Silverstone (Kinnison): F/LOSS activity, June 2017

Planet ALUG - Sat, 01/07/2017 - 13:59

It seems to be becoming popular to send a mail each month detailing your free software work for that month. I have been slowly ramping my F/LOSS activity back up, after years away where I worked on completing my degree. My final exam for that was in June 2017 and as such I am now in a position to try and get on with more F/LOSS work.

My focus, as you might expect, has been on Gitano, which reached 1.0 in time for Stretch's release and which is now heading gently toward a 1.1 release that we have timed for Debconf 2017. My friend and colleague Richard has been working hard on Gitano and related components during this time too, and I hope that Debconf will be an opportunity for him to meet many of my Debian friends as well. But enough of that, back to the F/LOSS.

We've been running Gitano developer days roughly monthly since March of 2017, and the June developer day was attended by myself, Richard Maw, and Richard Ipsum. You are invited to read the wiki page for the developer day if you want to know exactly what we got up to, but a summary of my involvement that day is:

  • I chaired the review of our current task state for the project
  • I chaired the decision on the 1.1 timeline.
  • I completed a code branch which adds rudimentary hook support to Gitano and submitted it for code review.
  • I began to learn about git-multimail since we have a need to add support for it to Gitano

Other than that, related to Gitano during June I:

  • Reviewed Richard Ipsum's lua-scrypt patches for salt generation
  • Reviewed Richard Maw's work on centralising Gitano's patterns into a module.
  • Responded to reviews of my hook work, though I need to clean it up some before it'll be merged.

My non-Gitano F/LOSS related work in June has been entirely centred around the support I provide to the Lua community in the form of the Lua mailing list and website. The host on which it's run is ailing, and I've had to spend time trying to improve and replace that.

Hopefully I'll have more to say next month. Perhaps by doing this reporting I'll get more F/LOSS done. Of course, July's report will be sent out while I'm in Montréal for debconf 2017 (or at least for debcamp at that point) so hopefully more to say anyway.

Categories: LUG Community Blogs

Chris Lamb: Free software activities in June 2017

Planet ALUG - Sat, 01/07/2017 - 00:29

Here is my monthly update covering what I have been doing in the free software world (previous month):

  • Updated travis.debian.net, my hosted service for projects that host their Debian packaging on GitHub to use the Travis CI continuous integration platform to test builds:
    • Support Debian "buster". (commit)
    • Set TRAVIS=true environment variable when running autopkgtests. (#45)
  • Updated the documentation in django-slack, my library to easily post messages to the Slack group-messaging utility to link to Slack's own message formatting documentation. (#66)
  • Added "buster" support to local-debian-mirror, my package to easily maintain and customise a local Debian mirror via the DebConf configuration tool. (commit)
Reproducible builds

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users.

The motivation behind the Reproducible Builds effort is to allow verification that no flaws have been introduced — either maliciously or accidentally — during this compilation process by promising identical results are always generated from a given source. Multiple third-parties then can come to a consensus on whether a build was compromised or not.

I have generously been awarded a grant from the Core Infrastructure Initiative to fund my work in this area.

This month I:

  • Chaired our monthly IRC meeting. (Summary, logs, etc.)
  • Presented at Hong Kong Open Source Conference 2017.
  • Presented at LinuxCon China.
  • Submitted the following patches to fix reproducibility-related toolchain issues within Debian:
    • cracklib2: Ensuring /var/cache/cracklib/src-dicts are reproducible. (#865623)
    • fontconfig: Ensuring the cache files are reproducible. (#864082)
    • nfstrace: Make the PDF footers reproducible. (#865751)
  • Submitted 6 patches to fix specific reproducibility issues in cd-hit, janus, qmidinet, singularity-container, tigervnc & xabacus.
  • Submitted a wishlist request to the TeX mailing list to ensure that PDF files are reproducible even if generated from a difficult path after identifying underlying cause. (Thread)
  • Categorised a large number of packages and issues in the Reproducible Builds notes.git repository.
  • Worked on publishing our weekly reports. (#110, #111, #112 & #113)
  • Updated our website with 13 missing talks (e291180), updated the metadata for some existing talks (650a201) and added OpenEmbedded to the projects page (12dfcf0).

I also made the following changes to our tooling:

diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues.


strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build.

  • Add libarchive-cpio-perl with the !nocheck build profile. (01e408e)
  • Add dpkg-dev dependency build profile. (f998bbe)


Debian

My activities as the current Debian Project Leader are covered in my "Bits from the DPL" email to the debian-devel-announce mailing list. However, I:

Debian LTS

This month I have been paid to work 16 hours on Debian Long Term Support (LTS). In that time I did the following:

  • "Frontdesk" duties, triaging CVEs, etc.
  • Issued DLA 974-1 fixing a command injection vulnerability in picocom, a dumb-terminal emulation program.
  • Issued DLA 972-1 which patches a double-free vulnerability in the openldap LDAP server.
  • Issued DLA 976-1 which corrects a buffer over-read vulnerability in the yodl ("Your Own Document Language") document processor.
  • Issued DLA 985-1 to address a vulnerability in libsndfile (a library for reading/writing audio files) where a specially-crafted AIFF file could result in an out-of-bounds memory read.
  • Issued DLA 990-1 to fix an infinite loop vulnerability in the expat, an XML parsing library.
  • Issued DLA 999-1 for the openvpn VPN server — if clients used a HTTP proxy with NTLM authentication, a man-in-the-middle attacker could cause the client to crash or disclose stack memory that was likely to contain the proxy password.
Uploads
  • bfs (1.0.2-1) — New upstream release, add basic/smoke autopkgtests.
  • installation-birthday (5) — Add some basic autopkgtest smoke tests and correct the Vcs-{Git,Browser} headers.
  • python-django:
    • 1:1.11.2-1 — New upstream minor release & backport an upstream patch to prevent a test failure if the source is not writable. (#816435)
    • 1:1.11.2-2 — Upload to unstable, use !nocheck profile for build dependencies that are only required for tests and various packaging updates.

I also made the following non-maintainer uploads (NMUs):

  • kluppe (0.6.20-1.1) — Fix segmentation fault caused by passing a truncated pointer instead of a GtkType. (#863421)
  • porg (2:0.10-1.1) — Fix broken LD_PRELOAD path for libporg-log.so. (#863495)
  • ganeti-instance-debootstrap (0.16-2.1) — Fix "illegal option for fgrep" error by using "--" to escape the search needle. (#864025)
  • pavuk (0.9.35-6.1) — Fix segmentation fault when opening the "Limitations" window due to pointer truncation in src/gtkmulticol.[ch]. (#863492)
  • timemachine (0.3.3-2.1) — Fix two segmentation faults in src/gtkmeter.c and gtkmeterscale.c caused by passing a truncated pointers using guint instead of a GtkType. (#863420)
  • jackeq (0.5.9-2.1) — Fix another segmentation fault caused by passing a truncated pointer instead of a GtkType. (#863416)
Debian bugs filed
  • debhelper: Don't run dh_installdocs if nodoc is specified in DEB_BUILD_PROFILES? (#865869)
  • python-blessed: Non-deterministically FTBFS due to unreliable timing in tests. (#864337)
  • apt: Please print a better error message if zero certificates are loaded from the system CA store. (#866377)
FTP Team

As a Debian FTP assistant I ACCEPTed 16 packages: faceup, golang-github-andybalholm-cascadia, haskell-miniutter, libplack-builder-conditionals-perl, libprelude, lua-argparse, network-manager-l2tp, node-gulp-concat, node-readable-stream, node-stream-assert, node-xterm, pydocstyle, pytest-bdd, python-iso3166, python-zxcvbn & stressant.

Categories: LUG Community Blogs

Bring-A-Box, Saturday 11 June 2016, All Saints, Mitcham

Surrey LUG - Fri, 15/04/2016 - 19:54
Start: 2016-06-11 12:00 End: 2016-06-11 12:00

We have regular sessions on the second Saturday of each month. Bring a 'box', bring a notebook, bring anything that might run Linux, or just bring yourself and enjoy socialising/learning/teaching or simply chilling out!

This month's meeting is at the All Saints Centre, Mitcham, Surrey.  CR4 4JN

New members are very welcome. We're not a cliquey bunch, so you won't feel out of place! Usually between 15 and 30 people come along.

Categories: LUG Community Blogs

Bring-A-Box, Saturday 14th May 2016

Surrey LUG - Fri, 15/04/2016 - 19:50
Start: 2016-05-14 12:00 End: 2016-05-14 12:00

Venue to be found.  Watch this space!  No!  Better still, find a venue and discuss it on the mailing list!

Categories: LUG Community Blogs

Bring-A-Box, Saturday 9th April 2016, Station pub, W Byfleet

Surrey LUG - Thu, 07/04/2016 - 16:04
Start: 2016-04-09 12:00 End: 2016-04-09 12:00

We have regular sessions on the second Saturday of each month. Bring a 'box', bring a notebook, bring anything that might run Linux, or just bring yourself and enjoy socialising/learning/teaching or simply chilling out!

This month's meeting is at the Station Pub in West Byfleet, Surrey.

New members are very welcome. We're not a cliquey bunch, so you won't feel out of place! Usually between 15 and 30 people come along.

Categories: LUG Community Blogs

Bring-A-Box, Saturday 12th March 2016, Lion Brewery, Ash

Surrey LUG - Thu, 10/03/2016 - 23:09
Start: 2016-03-12 12:00 End: 2016-03-12 12:00

We have regular sessions on the second Saturday of each month. Bring a 'box', bring a notebook, bring anything that might run Linux, or just bring yourself and enjoy socialising/learning/teaching or simply chilling out!

This month's meeting is at the Lion Brewery Pub in Ash, Surrey.

New members are very welcome. We're not a cliquey bunch, so you won't feel out of place! Usually between 15 and 30 people come along.

Categories: LUG Community Blogs

Bring-A-Box, Saturday 13th February 2016, Merstham

Surrey LUG - Mon, 08/02/2016 - 12:58
Start: 2016-02-13 12:00 End: 2016-02-13 12:00

We have regular sessions on the second Saturday of each month. Bring a 'box', bring a notebook, bring anything that might run Linux, or just bring yourself and enjoy socialising/learning/teaching or simply chilling out!

This month's meeting is at The Feathers Pub, Merstham

42 High St, Merstham, Redhill, Surrey, RH1 3EA
01737 645643 · http://www.thefeathersmerstham.co.uk

NOTE the pub opens at 12 Noon.

Categories: LUG Community Blogs