Hello. It is 2 PM. This morning, I took Eamon to the Maritime museum. It’s an OK museum, at least so far as children are concerned. They have plenty of kids’ stuff but it’s not nearly as exciting as the science museum, where Eamon is actually interested in the artifacts.

Yesterday, I got a MySQL sysbench test running. Today, I am going to pivot slightly. I was running things on a crappy (2 CPUs with 2 GB memory) VM yesterday. Today, I want to use an Intel MacBook Pro I have lying around. But I’d like to install Linux on it because, well, Linux is much easier to work with.

I actually already tried to install Debian 11 on it, but it couldn’t detect my network card, so I don’t think it even installed networking drivers, and it also can’t seem to launch any graphical interface. So I’m going to spend some time seeing if I can unfuck it. If I can unfuck it, I’ll install Ubuntu on it instead. The Ubuntu install procedure should go smoother.

Alright, so I’m dropped into a bash shell. How do I configure networking? I looked up “Debian networking” and it mentioned I can use network-manager. This sounds familiar. Do I have it installed? apt list does not show anything networking-related. It might live somewhere on the USB stick I used to install Debian, but, fuck it, I’m just going to reinstall macOS on that laptop and then burn Ubuntu onto that USB stick. (It’s also the only computer I have with a USB-A hole).

Meanwhile, yesterday I started playing with sysbench. Today I want to establish a benchmark while setting limits using docker compose. I actually began that yesterday.

As of 2023, resource limits and requests can be defined in a way that’ll be familiar to your average Kubernetes user.

    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1000M
        reservations:
          cpus: '1'
          memory: 1000M

How did I come up with these values?

The VM has 2 GB memory, and isn’t using much of it. So I figured I’d give MySQL half. In production, you’d give MySQL 80-90% of the overall memory allowance, depending on how much headroom you need for other crap you may need to run on the VM (e.g. useless AV software required by compliance), and you’d also tune innodb_buffer_pool_size to fully utilize it. In our case, it doesn’t matter because the whole database will fit into memory.

Regarding CPU, well, there are only 2 CPUs, and I figured I’ll give 1 to MySQL and the other CPU will be shared by everything else. This is, of course, really coarse, because Docker doesn’t control the entire system. So MySQL might not get a full CPU to itself because the OS might need to run other stuff. But, there isn’t anything really going on on this VM, so I doubt it’ll be an issue.

So, how can I verify that these limits actually do anything? Well, let me start with the limits commented out, and see if my p95 query duration is similar to yesterday’s (~16 ms). Yep, today it’s 17.63ms. Now, let me turn on the limits. Nope, nothing changed. Well, that’s probably because MySQL’s needs fit entirely within my limits. Let me set them much lower.

Let’s be evil and set the CPU to like 0.1. (By the way, you may be wondering what that means. It means that the kernel slices time into fixed periods, and within each period, MySQL can only be busy 10% of the time.) Ah, I can’t even run the benchmark because MySQL couldn’t handle the connection. Interesting! Anyway, when I increased the CPU limit to 0.5, the test ran. The p95 query duration is now 46.63ms! Clearly, MySQL was hitting the limit.

I could also tell because to the side I was running docker stats, which prints container resource utilization. Actually, let me show you the output. It’s useful. This is what the output of docker stats looks like while a benchmark is running.

CONTAINER ID   NAME                                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O       PIDS
6b25c2535a0b   sysbench-mysql-basic-mysql-1                         50.57%    380.1MiB / 1000MiB    38.01%    1.46MB / 26.2MB   1.19MB / 30MB   40
63d59065b9fd   sysbench-mysql-basic_sysbench_run_run_fc3ee393dab3   10.17%    3.574MiB / 1.936GiB   0.18%     4.05MB / 227kB    918kB / 0B      2

Most importantly, look at the CPU % for sysbench-mysql-basic-mysql-1 (which is the container for the mysql service): about 50%. That matches the 0.5 CPU limit, so MySQL is pinned right at it.
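If you want to pull that one number out instead of eyeballing the table, here is a minimal sketch. It assumes the default docker stats table layout shown above, where the container name is the second whitespace-separated field and CPU % is the third. The helper name container_cpu is made up for this example.

```shell
# Print the CPU % column for one container, given `docker stats` output on stdin.
# Assumption: default table layout (NAME is field 2, CPU % is field 3).
# `container_cpu` is a hypothetical helper name, not part of Docker.
container_cpu() {
  awk -v name="$1" '$2 == name { print $3 }'
}

# Usage, with the container name from this post:
#   docker stats --no-stream | container_cpu sysbench-mysql-basic-mysql-1
```

The --no-stream flag makes docker stats print a single snapshot and exit, which is what you want when piping it into something else.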

Anyway, now that I have confirmed that the resource limits actually do something, I’m going to increase the CPU limit back to 1. p95 query duration is back to 17ms. From docker stats I can see that we used about 70% of 1 CPU.
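For the curious, the arithmetic behind those fractional CPU limits can be sketched like this. It assumes Docker's default scheduler period of 100000 microseconds (100 ms), which is what a plain cpus value maps onto as far as I know. The function name is made up for illustration.

```shell
# Translate a Compose `cpus` value into a CPU quota in microseconds.
# Assumption: Docker's default period of 100000 us (100 ms) is in effect.
# `cpus_to_quota_us` is a hypothetical helper name.
cpus_to_quota_us() {
  awk -v cpus="$1" -v period_us=100000 'BEGIN { printf "%.0f\n", cpus * period_us }'
}

cpus_to_quota_us 0.1   # 10000: runnable for 10 ms out of every 100 ms
cpus_to_quota_us 0.5   # 50000
cpus_to_quota_us 1     # 100000
```

So a limit of 0.5 doesn't mean "half a core, always". It means up to 50 ms of CPU time per 100 ms window, after which the container is throttled until the next window starts. That throttling is plausibly where the extra p95 latency came from.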

Now, how can I push it a little harder? An easy way would be to increase the number of sysbench threads. MySQL 8 is single-threaded in the sense that no parallelism will be used in the service of a single transaction. The current sysbench load can be handled with only 70% of one CPU. If I increase sysbench to 2 threads while MySQL is still limited to 1 CPU, I should expect p95 duration to rise above 17ms. If I then allow MySQL to use up to 2 CPUs, duration should be close to 17ms again. Let’s verify!

According to help, I can set the number of threads with --threads N. Also, I realized that there’s a cleanup option. Should I be cleaning up and preparing between tests? Before I proceed to increasing threads, let me see what happens when I clean up. Well, actually, is there any documentation on this? I shouldn’t need to experiment in order to figure out how to mix the prepare/run/cleanup commands. OK, an outdated-looking manpage says that some tests should be re-prepared between runs. The test’s Lua script itself seems to delete its records after creating them. Also, in the shell I verified that the sbtest1 table created by the prepare command is unchanged after a test (it would be smart to actually diff them, but it would be a bit of a digression).

Aha, with 2 threads, I max’ed out CPU utilization. But p95 rose only to 23ms! Not bad. Let me see if it falls back to 16-18ms if I increase the limit to 2 CPUs. Nice – p95 is at 18ms. It makes sense that it’s not quite as low as before, as I have only 2 CPUs so we have to juggle both the sysbench threads as well as the MySQL threads.

Alright, what now? Well, on the side, I’ve been trying to unfuck my 2014 MacBook. I had to reinstall macOS using Apple’s internet recovery thing, but that installed version 10.10, which is so old that it can’t even do HTTPS right, so the Mac support site doesn’t load, which means I couldn’t even download a newer macOS version (Sierra) (also not available in the Mac app store). So I had to download it on my other computer and move it over to the old one, and then I was able to upgrade. Now I’ll install Ubuntu, except this time I won’t delete the Mac partition. This thing has four 2.8GHz i7 cores, so it’s much faster than the VM I’ve been using.

I want to manage this machine using Ansible, so I’ll set that up now. Well, actually, an SSH server isn’t installed on the machine, so I need to do that first.

On the Ubuntu host, as the dmitry user (which I created when installing Ubuntu):

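sudo apt-get update
sudo apt-get install -y openssh-server curl
# ~/.ssh may not exist yet on a fresh install
mkdir -p ~/.ssh
chmod 0700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
# overwrite authorized_keys with my public keys from GitHub
curl https://github.com/dmazin.keys > ~/.ssh/authorized_keys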

Hooray, I can SSH in now! I found the IP of the machine by looking at the connected clients in my router’s awful UI. I’ll reserve its IP so it’s easier to find it later. I’ll add its IP to /etc/hosts so I can connect to it easier next time. Its hostname is dmitry-mbp-15, so I’ll just do that for consistency.

Alright, now I can do some Ansibling. Well, wait. First, I want to figure out how to shut the laptop lid without the computer shutting down.

Ah, what the hell! I keep disconnecting. Why?!!! All I wanted was to run Ubuntu on a computer sitting 2 feet away from me. I mean, it’s not surprising that something horrible is happening: the laptops are separated by a power line networking kit, so maybe some massive packet loss is happening. Seemingly, the issue stopped for now. So let me continue.

First, I should create a new Ansible user and set it up. I’m doing these commands as the dmitry user.

sudo adduser ansible
sudo usermod -aG sudo ansible

sudo -u ansible mkdir /home/ansible/.ssh
sudo -u ansible chmod 0700 /home/ansible/.ssh
sudo -u ansible touch /home/ansible/.ssh/authorized_keys
sudo -u ansible chmod 0600 /home/ansible/.ssh/authorized_keys
curl https://github.com/dmazin.keys | sudo -u ansible tee -a /home/ansible/.ssh/authorized_keys

The reason I’m doing sudo -u ansible is so that the file ownership isn’t all screwed up. In the last command, I originally did this: sudo -u ansible curl https://github.com/dmazin.keys > /home/ansible/.ssh/authorized_keys but the > redirect is performed by my own shell (running as dmitry) before sudo even starts, so sudo doesn’t affect who opens the file. So instead I piped the output into a tee process running as ansible.
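Here is a small, sudo-free way to convince yourself that the redirect belongs to the calling shell and not to the command. It's only a demonstration of the principle. The command name is deliberately bogus, and the two alternatives at the bottom are left as comments because they need sudo and a network connection.

```shell
# The shell opens the redirect target before it looks up or runs the command,
# so the file gets created by *my* shell even though the command never starts.
demo_dir=$(mktemp -d)
no-such-command-for-demo > "$demo_dir/out" 2>/dev/null || true
ls "$demo_dir"   # out

# Applied to the keys file, two ways to make the write happen as ansible:
#   curl https://github.com/dmazin.keys | sudo -u ansible tee -a /home/ansible/.ssh/authorized_keys
#   sudo -u ansible sh -c 'curl https://github.com/dmazin.keys >> /home/ansible/.ssh/authorized_keys'
```

In the sh -c variant the whole string, redirect included, is interpreted by a shell that is already running as ansible, which is why it works where the original command didn't.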

Now, I set up a simple playbook, but ansible-playbook isn’t working: Failed to connect to the host via ssh: ansible@dmitry-mbp-15: Permission denied (publickey,password).", "unreachable": true. I guess I need to pass the password (-k flag). When I do so, I now get a different error: to use the 'ssh' connection type with passwords, you must install the sshpass program. But, look, I don’t even want password authentication. Authentication should be entirely via keys. So, I need to disable that in the sshd config: PasswordAuthentication no.

Huh, now I’m unable to SSH in! I think that means that SSH pub key authentication was never working; I was simply using password authentication. Alright, well, why isn’t it working? Let’s see what the SSH logs say. First, in /etc/ssh/sshd_config, I need to change the log level to VERBOSE. Otherwise, failed attempts won’t be logged (the Red Book recommends always using this setting). And then I need to reload sshd. Note that you don’t need to restart, a HUP is enough: systemctl reload sshd. Another note: when you do this, existing connections are untouched. This is why I’m able to still be logged in to this host. Anyway, the logs say that I’m trying these two keys, and neither of them is authorized:

Mar  7 17:10:04 dmitry-mbp-15 sshd[9239]: Failed publickey for dmitry from 192.168.0.10 port 53572 ssh2: ED25519 SHA256:Kdmkt6Co73JdAFvAvlpYV8v8oqHkuDwQw0VYk5MPgUg
Mar  7 17:10:04 dmitry-mbp-15 sshd[9239]: Failed publickey for dmitry from 192.168.0.10 port 53572 ssh2: ED25519 SHA256:6Jmdl1/EwxkNMnGumJw19tBQDsTm7/GswkSFJjuRsMU

Indeed, I guess I haven’t uploaded either of those keys to GitHub:

-> % curl https://github.com/dmazin.keys
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAsl83tdYrC5JWQC96SJonp1VxMdbhdi0s/O3vaghEWv

Oops. Well, let me upload one of them: gh ssh-key add ~/.ssh/dmitry-mba-m1.pub. Weird, it says I already uploaded it.

Alright, then, it must be a permissions issue or something else. Let me try specifying the key when I SSH, just for sanity (the output in the logs above must format the key differently than the output in dmazin.keys). Well, that’s bizarre! When I do so, ssh works.

Ah, well, it’s because I have configured SSH to use 1Password to provide ssh keys for all hosts, so the two public keys in my sshd log output above are coming from there. OK, well, that’s what I’ll do. I think it’s a good system. I’ll create a new private key in 1Password and use it for dmitry-mbp-15. Instead of uploading the pub keys to GitHub, I’ll just copy/paste them, which is not nearly as cool.
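In hindsight, comparing fingerprints directly would have made the mismatch obvious sooner. A minimal sketch, assuming the sshd log lines look like the ones above and that ssh-keygen is available. The two function names are made up, and the log path in the usage comment is an assumption about a stock Ubuntu setup.

```shell
# Fingerprints of the keys sshd rejected, read from auth-log lines on stdin.
# Assumption: the fingerprint is the last field, as in the log lines above.
rejected_fingerprints() {
  awk '/Failed publickey/ { print $NF }' | sort -u
}

# Fingerprints of public keys (authorized_keys format) read from stdin.
# `ssh-keygen -l` prints "<bits> <fingerprint> <comment> (<type>)" per key.
authorized_fingerprints() {
  ssh-keygen -lf - | awk '{ print $2 }' | sort -u
}

# Usage (log path is an assumption; the URL is the one from this post):
#   sudo grep sshd /var/log/auth.log | rejected_fingerprints
#   curl https://github.com/dmazin.keys | authorized_fingerprints
```

If the two lists have nothing in common, the client is offering keys the server has never heard of, which is exactly what was happening here.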

Alright, the playbook works now! Well, not really: now I get this mysterious error: Failed to lock apt for exclusive operation. Oh, right, here’s the playbook.

- name: General setup
  hosts: all
  tasks:
    - name: install some nice packages
      apt:
        update_cache: true
        pkg:
          - vim
          - zsh
          - fzf
          - ripgrep

It seems that this is happening during the apt-get update step. Will I get the same error if I try it directly? Yes, and I get permissions errors. Ah, right, it’s because I was executing the playbook without elevated privileges. I need to pass -Kb so that it tries to escalate privileges, and asks for a password. Wee, now it works!

I did one more thing: I changed a setting so that I can close the laptop lid without the host kicking me off. Here’s the final playbook.

Well, that’s enough for today! It’s nearly 6. Tomorrow, I’ll install Docker Engine on this machine and get the sysbench test running. I’m curious what throughput and p95 duration I can squeeze out of it!

If you’d like to discuss this post, please do reach out! My email is dm@cyberdemon.org.