Archive for the ‘Tech’ Category

Async Rust: Why is it so Fast?

Wednesday, March 27th, 2024

Everyone knows that a web server, for example, will be able to serve up a lot more traffic if it is written using Rust async code rather than straight OS threads. Lots of people have the numbers to prove it.

But I have wondered why, and none of the explanations I have read have explained it to me. They all seem to boil down to a vague “because”.

Well, today, listening to the podcast Rustacean Station, the episode “Asynchronous Programming in Rust with Carl Fredrik Samson”, I think I figured it out. Yes they gave more explanation, not quite satisfying, but it helped, so I hit pause and started talking to myself like a crazy person (handy to be driving alone in cases like this) and I think I get it. Here it is.

Disclaimer

I am not experienced with Rust async code. I am still learning, I am certain to get things at least a little wrong, possibly a lot. I’m largely writing this for myself, but I’m putting it in public on the theory that others might find it illuminating even in its errors.

Multitasking Techniques

The problem is how to do multitasking, efficiently, so as to get the most work done. Let’s look at different approaches, in sequence, to land with a better understanding of the virtues of Rust’s async abilities.

Processes

This is the oldest of these approaches. Here the operating system uses its god-like abilities to set up individual processes, each with an apparently complete computer at its disposal. Each process runs its own program, can do what it wants, can (attempt to) access any memory location, and can happily crash when it messes up too badly, but likely with no other process noticing. Individual processes are so well isolated that unless they go to special effort to communicate with each other, or really load the machine with heavy work, they can be completely ignorant of any other process even existing.

Processes are very general purpose. They are also relatively expensive: for the OS to switch context from one process to another, everything that it can access has to be changed, and that is work. Doing that work to switch processes dozens or even hundreds of times a second is no problem, but don’t try to do it hundreds of thousands of times a second. Being general purpose has a cost.

Threads

Threads are more lightweight than are processes, but at least on Linux, they are kind of the same. The way to start a new process on Linux is to fork a copy of oneself. There will now be two of you, clones of each other, both running at once, which seems a bit wrong and pointless. But you wrote the program they are running, and there is a way for the two copies to each check and see which copy it is. One can discover it is the original and the other discover it is the copy, and they can do different things in the two cases. A common thing for a new process to do is to start running an entirely new program.
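To make the fork-and-check idea concrete, here is a minimal sketch in Rust (staying with this post’s language of choice). It assumes the libc crate for the raw fork() call; that crate choice, and the trivial demo itself, are mine, just for illustration.

// A minimal fork-and-check sketch, assuming the libc crate.
// fork() returns twice: once in the original (with the child's pid)
// and once in the copy (with 0), so each side can tell which it is.
fn main() {
    // SAFETY: fine for a tiny single-threaded demo like this one.
    let pid = unsafe { libc::fork() };
    match pid {
        -1 => eprintln!("fork failed"),
        0 => println!("I am the copy (child), pid {}", std::process::id()),
        child => println!("I am the original; my copy is pid {}", child),
    }
}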

So what is a thread? Well, when forking a Linux process one can choose how much is shared between the old and the new. In the case of a thread they still share the same memory. Sharing memory means they can cooperate in their work, providing the sharing is done correctly.

Threads do not share everything. Each thread has its own copy of the CPU registers, and each has its own stack. So we are still pretty general purpose: the different threads could be working on different problems independently, but more likely they cooperate, though that is a lot harder to program correctly.

The primary value of threads isn’t that their context is faster to switch (though threads are a little faster); the motivation for threads is to be able to very efficiently share memory, so programs can take advantage of multiple CPUs for some single purpose.

Green Threads

Green threads are very much like threads, except they live entirely in userland. Instead of the OS using its god-like powers to switch execution from one thread to another, the switching happens within the process. But the process doesn’t have any god-like powers over itself other than self-discipline. So green threads need to be written in a slightly special way to make the context switching possible at all. When context switches do happen there is still a similar amount of work needed, but there is a little win in that the OS doesn’t have to do it, so no context switch into kernel mode. Remember, any system call into the Linux kernel takes longer than a local function call.

So green threads do switch faster than OS threads can, and early Rust, before 1.0, did have some sort of green threads. But they are long gone.

Preemptive vs. Cooperative Multitasking

Time for a little detour. Everything up to this point has been preemptive multitasking. At one moment one program (or thread) is running, then whoosh, the next moment suddenly another program or thread will start running instead. The code that is being run and then not run doesn’t have any control, nor even any easy knowledge, of this change.

In cooperative multitasking these unanticipated changes do not happen; rather, cooperative multitasking requires that code regularly yield the CPU and let other code run. If it doesn’t yield, other code can’t run. This is both good and bad. The good is that if code has valuable and important work it is doing, it won’t be interrupted. The bad is that if some code is churning away on something unimportant it can keep other code from running. Cooperative multitasking assumes responsible—cooperative—behavior.

Fun fact: The original Macintosh, way back when, before you were born (not all of you, but an awful lot of you), had cooperative multitasking. It worked amazingly well, but a lot of people scoffed, they wanted “real” multitasking. Meaning preemptive multitasking. But cooperative multitasking is also real.

Rust Knows More

Async Rust benefits from being Rust.

I like to say that writing a multithreaded program is easy, but maintaining a multithreaded program is mostly impossible. (And writing a multithreaded program of any size takes long enough that some of the effort is effectively maintenance before you are done…which means you are screwed.)

Rust makes multithreaded programming possible (if not easy) by insisting that the program explicitly say a bunch of stuff that in other languages would be in the comments. “Be sure to use this mutex before accessing that data structure.”, for example.

The result is Rust knows a lot about your program and how all data is shared and not shared. This is key to async Rust.

Rust Async: Cooperative

Rust async is cooperative multitasking. (Mostly: you can also spawn threads and that is no longer just cooperative.)

When you write async Rust code you can be assured that once your code is executing the Rust async stuff will not halt you to start running some other async code, except at clear points, where you know this might happen.

These context switches can happen whenever you call code that would block; normally this is any code that might wait for IO, such as waiting for network activity, waiting for a disk, or waiting for a user to type something. Or, you can explicitly yield with yield_now(), or maybe you have a reason to sleep() for a specific amount of time. But the key point is that execution can only switch from one hunk of code to another hunk of code at specific and well defined points.
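Here is a minimal sketch of what those switch points look like in practice. It assumes the Tokio runtime (just one of the runtime choices), and count_bytes is an invented example function; the only places this task can be suspended are the marked .await points.

use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;

// A hypothetical task: connect somewhere, read bytes, count them.
// Execution can only be suspended at the `.await` points below;
// everything between them runs without interruption.
async fn count_bytes(addr: &str) -> std::io::Result<usize> {
    // Possible switch point: waiting for the connection to complete.
    let mut stream = TcpStream::connect(addr).await?;

    let mut buf = [0u8; 4096];
    let mut total = 0;
    loop {
        // Possible switch point: waiting for the socket to be readable.
        let n = stream.read(&mut buf).await?;
        if n == 0 {
            break; // connection closed
        }
        total += n; // ordinary code, never interrupted by the runtime

        // Optional explicit switch point: politely let other tasks run.
        tokio::task::yield_now().await;
    }
    Ok(total)
}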

Here async draws on Rust’s requirement that we be precise about what data is where and who has access to it for what purposes. I don’t know the details, but given all that information about data, plus knowledge of all the locations where execution might change, the Rust compiler works out a state machine for how to do all of the possible execution transitions. (I’m a bit amazed that this is possible, but I’ll believe it works, people use it for real work.)
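To make the state machine idea a little more concrete, here is a tiny hand-written future, roughly the shape of thing the compiler generates from an async fn with a single .await in it. This is only an illustrative sketch (TwoStep is an invented name, and the real generated code is more involved), but it shows the trick: the “where was I?” information lives in an enum, not on a stack.

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// A hand-rolled future with two states: still waiting on an inner
// future, or already finished.
enum TwoStep<F: Future<Output = u64>> {
    Waiting(F), // before the inner future has completed
    Done,       // after we have produced our output
}

impl<F: Future<Output = u64> + Unpin> Future for TwoStep<F> {
    type Output = u64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        match &mut *self {
            TwoStep::Waiting(inner) => match Pin::new(inner).poll(cx) {
                // The inner future finished: advance the state machine
                // and hand back our (doubled) result.
                Poll::Ready(v) => {
                    *self = TwoStep::Done;
                    Poll::Ready(v * 2)
                }
                // Not ready yet: the runtime will poll us again later.
                Poll::Pending => Poll::Pending,
            },
            TwoStep::Done => panic!("polled after completion"),
        }
    }
}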

And here is where the efficiency seems to lie: To switch context from one async execution path to another async execution path nothing really has to happen: no stacks need to be swapped out, no CPU register files need to be swapped, no MMU configurations need to be touched, nothing has to happen inside the kernel, rather we are still just running a Rust program, and the Rust program is just doing something else, not that different from branches you explicitly put in your own code.

Put another way, Rust’s passion for zero cost abstractions applies here, too. Rust only has to mess with the minimum to start working on something different, and that means it will be fast.

At least I think so. Note I used the weaselly word “just” twice in that paragraph. Always be suspicious of the word “just” (along with the words “exactly” and “simply”, and …).

Concluding

The same way that Rust doesn’t have a garbage collector, because it largely manages to analyze away the whole problem at compile time, Rust async seems to make context switching costs go away by mostly analyzing away the code that implements context switches, at compile time. This is what makes it fast.

In async Rust there is still a “runtime” (more than one mutually incompatible runtime option to choose from, annoyingly), and it still needs to have code that talks to the OS to find out about IO that has unblocked and connect that up with pending awaits in your Rust code, and a scheduler that decides what to run next when blocked execution is unblocked. This is well smaller than a big OS. It is still a Rust crate, part of what the compiler will be optimizing along with your code, and something to be aware of. When you see an “async” or an “await” in code a lot of magic is happening in there.

Stuff I will learn more about as I dig deeper and use async more.

-kb

Epilogue

It occurs to me that there is another important reason for async being faster than multithreaded systems. Not just Rust async, but other async systems, too: the scheduler has a lot of information about the program being run. It will have choices in what to schedule (if more than one task becomes unblocked, which should run?). It can maybe put the two together and be smart about what it schedules.

For example, if there happens to be a crossbeam channel that is filled by only one task, but is drained by many, and if several are waiting to pull work out, it might be really smart to give more priority to the one that puts work in, so as to unblock those ready to drain it. The scheduler has information about every blocked task and can know dependencies between them. Clever programming in the scheduler might make a really big difference in total performance.

Also, the scheduler is in a position to keep statistics about system performance. To the extent work loads follow patterns it might be able to dynamically tune how it schedules unblocked work. For example, if the code that fills the hypothetical crossbeam channel runs very quickly but the code that drains it takes a lot of time and can be processed in parallel, maybe statistics can reveal that and be used to put more priority on filling the channel.

Some of these sorts of tricks can be and are done by OS schedulers, too, but an async scheduler runs inside the program and can have a lot more information with which to make such choices.

I suspect this is going to be an area of ongoing work, and I suppose it makes it a little less annoying that we have multiple Rust async runtimes, if that multitude allows more innovation to happen.

©2024 Kent Borg

Using emacs as a Rust IDE

Friday, March 1st, 2024

Turns out I don’t like Helix very much. The problem? My fingers know emacs. I hate emacs, but it is what my fingers know.

So I decided to figure out how to get emacs to be a Rust IDE. This was made a little tricky because I had attempted to do so a few years ago, when things were rougher, and the solution was to start over and not try to fix the old. Luckily I do not customize emacs a bunch, so starting over isn’t such a problem. I am not going to get into the details of how I did it, but I installed:

  • rust-analyzer
  • eglot
  • lsp-mode
  • lsp-ui
  • company
  • lsp-treemacs
I don’t know that that is a sensible collection, but they do do some nice stuff. This is really a cheat-sheet for myself, but I put it in public just in case anyone else finds it useful.
(This is a work-in-progress.)
  • mouse over something that has a definition and a popup should appear
  • M-. to see something’s definition
  • M-? to search for occurrences
  • M-, to navigate backwards
  • start typing and auto completion options should appear
  • M-x eglot- TAB to see various commands
  • M-x rust-run-clippy
  • right mouse
  • note the Flymake menu
©2024 Kent Borg

Helix, Terminal-based Rust IDE

Monday, February 26th, 2024

I was skimming through the latest 2023 Rust Survey, and I noticed the 5th most popular “editor or IDE setup” is Helix. (Just ahead of emacs, even!) What the heck is Helix?

Helix seems to be:

  • Rust IDE, seems to do other languages, too, but I’m interested in Rust.
  • Heavily inspired by vim,
  • Written in Rust
  • Terminal-based (run remotely on some distant or headless target machine)
  • Open source
Based on vim? Ugh. I long ago learned vi (and vim) just enough to be able to get it to do basic stuff. Intriguing. Let me give it a try.

This post? It started as my own cheat sheet, and I decided it might be useful to others. The goal is to tell you the very basics, enough to use Helix, not enough to get good at it. (I’m not.)

Installation

I decided to build it myself, they said:

$ git clone https://github.com/helix-editor/helix
 […]
$ cd helix/
$ cargo install --path helix-term --locked
 […]

The “--locked” option gave me a very unnerving warning:

warning: package `cc v1.0.85` in Cargo.lock is yanked in registry `crates-io`, consider running without --locked

But without that option it failed, so I kept the “--locked”.

On my couple-year-old laptop it took about 5 minutes. The first time I tried to build it on a Raspberry Pi Zero W, running inside an emacs shell buffer, it died after filling up all of its 512 MB of RAM and 100 MB of “swap”, before it finished. I’m trying again, not inside emacs, I’ll let you know.

Also:

$ ln -Ts $PWD/runtime ~/.config/helix/runtime

That last part is necessary to have syntax highlighting work, etc.

Using Helix, Setting the Stage

I do not find it obvious how to use Helix, but it is based on vim, so what would I expect?

First, how do I even run it? No, typing “helix” will not work. Because that would be too obvious. And because nostalgia and tradition, I suppose, the executable is called “hx”. Run that and you are in. (But do you know how to get back out? Keep reading.)

Second, as it is based on vim (which is based on vi), it is very modal. The design of vi dates back to the beginning of time (1976), when $5,000 (in today’s dollars) would buy an ADM-3A CRT terminal, which could display 24 lines of 80 characters each! It was the hot new technology, and it didn’t even have dedicated arrow keys. The computer mouse, and graphical user interfaces of any sort, had barely been seen in public. And “power users” would scoff at such silliness for many years to come. This is a modal, keyboard-based interface.

In vi there are two enormous modes. You can be in “insert mode”, where typing “hjkl” results in “hjkl” appearing in your document, or you can be in “normal mode” where pressing “hjkl” moves the cursor left, down, up, and then right. Tending to leave you back where you started.

We are about to get very modal.

Insert Mode and Normal Mode

When you first run Helix you will be in normal mode. In the bottom left corner it will say “NOR” indicating NORmal mode, and when you are in insert mode it will say “INS” for INSert mode.

Time for our first cheat sheet items:

  • “i” puts you in insert mode. This is where you can type stuff. (And because current keyboards have arrow keys, move around, too.)
  • Type stuff and it will be inserted into your file.
  • Arrow keys, pageup, pagedown, home, end, backspace, forward delete all do as you would hope.
  • Escape key gets you out of insert mode, back to normal mode, where the letter keys do not type those letters, but do other things. No matter what you are doing in Helix, the escape key seems to generally be a safe thing to press; wherever you are, press it a few times until you get back to familiar territory.
Everything is organized around these two big modes: normal mode and insert mode.

Command Mode

The normal mode is for single-keystroke stuff. But opening and saving files isn’t single keystroke territory (file names can be long), so there needs to be a command mode that accepts multiple keystrokes.

  • “:” – command mode commands all begin with a colon.
  • “:quit” (or “:q”), followed by another press of enter, will quit; though it will stop if you have unsaved changes.
  • “:q!” will quit without saving changes. If you are in a panic and need to get out (without saving changes) press escape a couple times, type “:q!”, then enter, and you will be free again.
  • “:write” (or “:w”) will save changes to the current file.
  • “:w somefile.rs” will save to a file called “somefile.rs”.
  • “:open differentfile.txt” (or “:o” …) will open a new or existing file called “differentfile.txt”.
  • “:reload” (or “:rl”) reverts the current buffer, discarding any changes.
  • Escape will get you out of this mini mode and any command you have partially typed, and put you back to the regular normal mode.
In Helix pressing “:” will immediately show you lots of available commands, and I don’t know what most of them are. I don’t think there is a way to navigate through these with arrow keys; the text seems to be there merely to guide further typing. As you enter more letter keys the number of possible completions will be reduced accordingly. This feels like a dangerous mode to me, because other than the name of the command, I can find no on-screen documentation of these commands. Explore here, but maybe carefully.

More Normal Mode: Real Cheat Sheet, Finally

At this point I think you know enough to do the most basic editing. Hooray! But only just barely enough. So press escape, to be sure you are back in normal mode (NOR in the corner), and let the cheat sheet begin!

We all make mistakes:

  • u – undo
  • U – redo

Navigation:

  • hjkl – move cursor, pretend those letter keys are labeled: ⇠⇣⇡⇢ (those letters might already be under your fingers, so maybe it is worth learning them, or maybe you are on a vintage ADM-3A terminal with no dedicated arrow keys—but probably just use arrow keys for now).
  • w – word forward
  • W – WORD (larger concept of word) forward
  • b – back one word
  • B – back one WORD
  • e – end of word
  • pageup – page up
  • pagedown – page down
  • ctrl-u – half page up
  • ctrl-d – half page down
  • nnnG – goto line nnn
  • 4k (or 4 then up arrow) – move up 4 lines, etc.
Searching:
  • /xyz – search forward for “xyz”
  • ?xyz – search backwards for “xyz”
Searching is also another mode. While typing your search string Helix will show you the next text that matches your search string so far. Press the enter key and the search is done. At that point you are back to normal mode.

Once out of search mode:

  • n – next matching search string.
  • N – previous matching search string.

Making a selection—another mode:

  • v – enter (or leave) select mode
This is a little like holding the mouse button down in a graphical editor: combined with navigation, you can make a selection. A bit like dragging a mouse through text.

Once you have a selection you can use copy and paste:

  • y – copy (yank) selection
  • p – paste after selection
  • P – paste before selection
  • d – delete selection
  • x – shortcut to select a line, without being in selection mode.
  • xx – select two lines, etc.
  • xd – select line, and delete it, etc.
At this point you have enough to use Helix as a basic editor: run and quit the program; type stuff; open, save, and close files; navigate through your file; search; copy and paste; undo and redo. Nothing very fancy. Time for another mode, where they seem to keep all the fancy stuff.

Insert Mode

The simplest thing to do in insert mode is to type text. In addition those extra keys that didn’t exist in 1976 also do what you would expect: arrow keys, backspace, forward delete, home, end, page up, page down. Beyond them there are a lot of other things that can be done:
  • ctrl-k – delete to end of line.
  • tab – when offered a completion item, move to the next one.

Space Bar Mode

When in normal mode, pressing the space bar puts you in an interactive menu world (and the escape key still works to get you out). Press space and cool options appear, each begins with a single letter followed by a little explanatory text, press the space bar again (or press escape or an arrow key) and this menu disappears. Press the key for one of the letters and you will get a new menu. I appreciate that the choices in this menu system appear to be pretty clear and safe to explore. Go take a look.

Here are some of the editor things I’ve discovered in my exploring, and exploring here is nice:
  • f – file picker. Cooler than the “:o” feature of the command mode. Once you are in the list of files, the up and down keys, page up and down keys, all work as you might hope.
  • w – window picker. In here you can split your current window horizontally or vertically. The result is tiled panes not overlapping windows, but that’s a good thing. You can close a window pane. You can move from one of these window panes to another. This is starting to turn into a useful editor!
  • b – buffer picker. You can have more than one file open at a time. How many files you have open and how many window panes you have displayed (and what they display) are different things. This editor is getting more useful.
  • y, p – more copy and paste features.
Here are—finally—some of the Rust-specific things I have discovered:
  • s – symbol picker. Choose from the variables, structs, constants, functions, etc., that you have actually defined, rather than mistyping them from memory.
  • S – workspace symbol picker. Like the symbol picker, but seems to only offer the public symbols.
  • r – rename. Seems to work on variables, struct names and members, function names, etc. Cool.
  • / – global search.
  • k – Documentation for whatever thing the cursor is in. (To scroll the result use ctrl-u and ctrl-d.)
  • a – perform code action. A bunch of very Rust-specific options that you should explore a little, maybe once all this other stuff has settled in.

Conclusion

So far Helix looks good. By being limited to a keyboard and having no GUI, and being based on vim, it seems a lot more constrained than the other, flashier, disorienting, featuritis-plagued IDEs that I have tried in recent years. The space bar mode is nicely self-documenting, as opposed to all the magic single-key “What did I just do when I bumped that key‽‽” of VS Code or some JetBrains product. At least for me all this makes Helix a lot easier to learn.

Epilogue

Oh, and my native Raspberry Pi Zero W build keeps failing as I try various experiments. I would like this to work because I’m programming some Pi Zero-specific hardware…

Followups: Syntax Highlighting

I hate low-contrast, gray-on-gray displays, so to have more choices:

$ git clone https://github.com/CptPotato/helix-themes.git
$ cd helix-themes
$ ./build.sh
$ mkdir ~/.config/helix/themes/
$ cp build/* ~/.config/helix/themes/

Now I can select the theme I want by editing ~/.config/helix/config.toml. Um, I think I got my config.toml from the configuration section of the Helix docs.

©2024 Kent Borg

Python is a Great Prototyping Language…but One Should Never Ship a Prototype

Tuesday, May 26th, 2020

I really like how Python lets me start to get things working before everything is working. I can fire up an interactive debugger and immediately start playing with some library I Googled up and think I might need, quickly get it doing stuff, plug it in to other code and quickly get the whole thing doing useful stuff.

I can get my Python program in a useful state before I have really decided what I wanted it to do, and well before I have stopped to think hard about the best way to do it.

This kind of exploratory programming is exactly what is needed to develop a prototype. But never “ship” the prototype!

Here is an analogy to the physical world: there are prototyping materials that are easy to work with but are not as durable nor economical as are materials suited to real manufacturing. For an extreme example, automobile bodies used to be prototyped, at least in part, with modeling clay. And the very properties that make modeling clay good for prototyping make it terrible for manufacturing. (Take it to go buy a Christmas tree, strapped on top.)

Similarly, in the case of Python, the key property that makes it good for prototyping, makes it terrible for “real” programs: Probably the biggest thing that makes Python powerful is precisely that it allows the programmer to defer so many decisions. What kind of parameter does the function take? “A parameter called X!” Not very useful. Even if the parameter is called something like “address_list”, that only hints–it might not actually be a list, maybe the address “list” is in a dictionary and the keys are customer numbers. (Likely.) And even if we really honestly know the address_list is a Python list. Okay…a list of what? Let’s guess dictionaries, Python loves dictionaries. And what will be in the dictionary? Whatever anyone anywhere else in the code might manage to put in there–or remove from there. And it gets worse: Some programmers think it is cool to put “**kwargs” in the parameters, which means we don’t even know what the parameters to the function are! We have to examine every line of code that might call this function to see what the possible parameters are, and even then you will see (you just know it) that some of that code is going to be passing a dictionary that is only known at runtime.

The fact the programmer doesn’t have to decide what s/he is doing can give the illusion that real programming is happening really fast, but there is an illusion there. A dangerous and beguiling illusion. Worse, of course, is when such dynamic features are actively abused (see kwargs), but merely deciding to use a simple list yet having no good way to pin down what is in it is such a rich place to hide bugs.

Strongly Typed Language: Python

There is this idea that compiled languages such as C (or C++, all the weaknesses of C without the virtues of being a small and elegant language) are strongly typed but an interpreted language such as Python is not. This is half-right.

In C you have to say what kind of data goes into your variable.

In Python you can put whatever you want in your variable–a string, a boolean, some kind of number, some enormous data structure, a function, or None. Not only can you put whatever you want in there, you can change it at your whim; in one line you might set your variable to an instance of some class, and a couple of lines later (or in a different thread, if it can get a chance to run) set your variable to 42. Python is very liberal about such things.

But this doesn’t mean Python isn’t strongly typed! It is very strongly typed, it just doesn’t make up its mind about types until the last possible moment, at runtime. Repeatedly. Every time through your loop.

In fact, Python does almost nothing but constantly check the types of things. It takes much longer to check the types of two variables for adding than it takes to actually add them. (To check whether they are numbers and that adding is sensible, and how to add these particular numbers–assuming they proved to be numbers. Python needs to check a lot before it can do the addition.)
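For contrast, here is a minimal sketch in Rust: the type check happens once, at compile time, and what runs is just the addition.

// The types of `a` and `b` are settled at compile time, so the
// compiler checks once that adding them makes sense. At runtime
// this is a single integer addition, with no type inspection.
fn add(a: i64, b: i64) -> i64 {
    a + b
}

fn main() {
    println!("{}", add(40, 2)); // prints 42
    // add("forty", 2) would be rejected by the compiler, not at runtime.
}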

Deferred Work Doesn’t Go Away

It is presumably important to you that when the Python code runs it not crash. One would think. In which case doing that clever thing of instantiating a class instance from a variable at one moment and doing arithmetic on 42 the next had better be done right because the reverse operations will not work. Even doing unclever things, such as misspelling a variable name and accidentally doing arithmetic on a class definition with a similar spelling is a bad idea.

And though Python will catch both of these mistakes if you make them, it will only do so if you exercise the right lines of code with the right (unfortunate) values. And only if the right person is watching in the right way will it do any good.

It is really hard to thoroughly exercise code. And in the case of a very dynamic language like Python the permutations are so great that it really isn’t possible.

Yes, Compilers are Annoying

In statically typed, compiled languages, it is more work to make the compiler happy, but a benefit is the compiler will prevent these sorts of errors. It is less work in total to catch a type problem up-front than to have to do it in the debugger and in vague bug reports from users. Unless you are planning to defer some of the work forever, planning on never finding and fixing some of the bugs…

Yes, Compilers are Inflexible

Yes. And in a good way, if it prevents accidentally doing arithmetic on a class definition.

But what about cases where one needs to be clever? Maybe not so clever as to mess with class definitions at runtime, but something more conventional, such as wanting either a value like 42 or some other flag value (such as Python’s None), isn’t that reasonable?

Yes. And compiled languages allow such things. Some in safe ways, even.
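Rust’s Option is one example of doing it safely; here is a minimal sketch (lookup is an invented example). The “no value” case is part of the type, and the compiler forces you to handle it before you can touch the 42, so the flag value can never wander into arithmetic by accident.

// Either a value like 42 or an explicit "no value" flag,
// expressed in the type itself.
fn lookup(key: &str) -> Option<i64> {
    if key == "answer" {
        Some(42)
    } else {
        None // like Python's None, but the compiler makes you deal with it
    }
}

fn main() {
    // We cannot do arithmetic on an Option directly; we have to say
    // what happens when there is no value.
    match lookup("answer") {
        Some(n) => println!("doubled: {}", n * 2),
        None => println!("no value to work with"),
    }
}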

(Some Compilers are Nice)

The Rust compiler is demanding but in exchange lots of bugs simply won’t exist once the compiler is happy.

Rust: Not as slow as Python without being as low-level as C.

Prototypes are Expensive to Operate

I would like to see some hard numbers, but it feels to me like Python must spend a hundred times as much effort constantly checking the runtime type of every bit of data as it does doing real work on that data. Certainly Python is not very efficient, whatever the ratio. How much carbon is released just because of Python?

-kb, the Kent who is looking for opportunities to finally get good at Rust.

P.S. Comments are broken and have been for some time. Sorry.

©2020 Kent Borg

Kent’s Super-Simple, Excellent Password Advice

Thursday, September 22nd, 2016

This excellent advice is simple, in fact its excellence depends upon being simple. Complicated is the enemy of security. If you follow this advice you will be among a very rare elite in how secure your passwords will be.

Four parts:

1. Write down your passwords. On real paper, with a real pen or pencil, and keep the list safe. If you want to get fancy, maybe don’t quite tell the truth, at least not the whole truth, maybe leave something off each password (something you will remember), so if someone finds the list they won’t quite know any of the passwords on the list. And keep the list safe.

2. Now that you can keep track of what your passwords are, never recycle passwords between accounts. So, if someone breaks into one site, your other accounts aren’t at risk. (Today’s news, as I write this, is that information on 500,000,000 accounts was stolen from Yahoo.) Don’t reuse passwords in different places.

3. When you make up a new password, dream up something you think no one will guess. (I know, you already do that.) Now, to be extra secure, add something even you couldn’t guess. Maybe look at the time, exactly how many minutes past the hour? Include that in the password. Or look around you, pick something else—but pick something you could not anticipate—and include it as part of the password.

4. Keep this entirely manual, the whole approach is low-tech for a reason. Computers are usually pretty insecure. (Ask Yahoo…) Don’t automate any of it, because that’s really hard to do safely (ask Yahoo), keep it manual. Don’t even photocopy your password list, because copiers are really computers these days. Don’t take a picture of the list, because cameras are also computers these days. Yes, backups are good, but sorry that has to be manual. The benefit is, as long as you keep all of this manual, you can trust your common sense, because you will understand every aspect, and you have real expertise in manual stuff because you can see it.

That’s it. Low-tech as hell, which means most techies will hate it, but who cares that it’s controversial as hell? It’s smart. Because it is simple.

-kb

P.S. And I really am so very sorry you can’t use a password manager program, but they are just too complicated, they will have security problems, admit it, you know it in your heart they will. Don’t trust them.

Snowden, the Movie

Friday, September 16th, 2016

I went to one of the first Boston matinees of the movie Snowden today.

It was all very familiar territory: it could have been boring or–as with any subject I know a lot about–it could have been excruciating in its errors. It was neither. It held my attention, it did not disappoint.

But was it a good movie? I usually have tons of opinions, I fret over whether a movie hits the ten-minute mark right, whether the script is “economical”, whether characters are compelling, whether the plot is interesting. In this case I can’t say, I am not unbiased: I am an American. And this is really important material–important to any American.

I do know it was at least a competent movie, because it had me wanting to cry. I knew Edward Snowden was a hero, but Oliver Stone tugs for tears. At least from me.

Is it a great movie? Probably not, just because great movies are rare. But I don’t know. Ask me in a few years, I’ll know better. But right now I am kinda choked up over a man whose illusions were shattered, followed by his world being shattered as he followed his conscience with selfless acts.

Another bit of praise: Usually it is painful to see a movie on a topic that I know something about, worse if the movie is technical, and far worse if it is about a technical topic I know something about. This movie did well by that measure.

-kb, the Kent who thinks the three branches of government should not be secret legislative measures, implemented by secret executive orders and agencies, overseen by secret courts.

©2016 Kent Borg

Touchscreen Password Idea

Monday, February 1st, 2016

Passwords are a problem, and lots of people say they are doomed, but I have seen no good alternatives, so I sometimes think about making them better.

Touchscreens are important, yet it is really hard to enter good passwords on them.

Also, I would like to do more of a “key exchange” when entering my password. I use different computers and I don’t reuse passwords between these computers, which means I sometimes enter a password for the wrong computer. Oops! Some sort of richer interaction with the other end would prevent this.

So here is my (embryonic) idea.

Have the password be a location in a virtual 3D space. Use the 3D hardware capabilities of phones and tablets and have the user drag around the screen to drive to the location that is the password. By having a different randomly chosen starting point in the 3D space for each login attempt, a simple “key logger” is made more difficult, as is reading screen smudges. By having more of the space revealed as the user navigates, the computer has to reveal more information in response to the user’s input, making it more of a “key exchange” and making the space richer, and so lengthening the password.

Put another way: a complex 3D space, uniquely generated for each user. The password is a “secret button” somewhere in the space. To authenticate, the computer starts the user in some random location and the user flies through the space and touches the secret button but no other.

Shoulder surfing is a problem, but once the user gets good at it s/he might be swooping through so fast that a casual observer might have a hard time realizing what just happened. Particularly if there were a needle-threading aspect where some routes are good and others are not.

By using the full power of the GPU it also puts a limit on how far away a man-in-the-middle could be. (Which makes remote authentication tricky.)

By drawing on the user’s motor skills there might be a way to drop the password down in the brain so the user doesn’t know it in a way that can be told to others. Make the password more like a customized motor skill.

-kb

©2016 Kent Borg

An Idea for Doing Background Removal from a Sequence of Stationary Images, Manual-Style [Updated]

Monday, January 11th, 2016

Update: Finally looking at implementing this, I realize that while thinking of that fully populated tree is probably good for understanding it, I don’t need to store anything but the left edge. When a new frame comes in, I will calculate a new left edge based on the new frame and the previous left edge.

My memory requirements for a size N triangle are then N-1 (I don’t need to save the result if no one will ask for it again), and while calculating I need to store N frames plus whatever my image processing library uses, etc. The fact this scales linearly with the length of my background history is nice, I can go long for cheap. The time to calculate does scale with the length of the history, but it is still linear.

Another thought: natural vision systems pretty much only see change, make something stand still long enough and it will go away. It might make sense to spend the linear time and memory to compute a long history, but allow the caller to choose how quickly stationary objects disappear; compute the whole left edge to maintain the chain of history, but choose to look at a more recent step.

A final correction: This is not really parallelizable, the library doing the underlying image processing could well parallelize, but these steps need to be done in sequence.

Back to the original post…

[Warning, this is completely techie musing about computer vision by someone who doesn't really know much about computer vision. But heck, sometimes those who don't know the right way to do something occasionally come up with something cool.]

How about something like this. Maintain a triangular poly tree where at the base is a history of recent frames.

                              x
                             / \
                            x   x
                           / \ / \
                          x   x   x
                         / \ / \ / \
                        x   x   x   x
                       / \ / \ / \ / \
                      x   x   x   x   x
                     / \ / \ / \ / \ / \
                    x   x   x   x   x   x
Newer              / \ / \ / \ / \ / \ / \              Older
 <-               x   x   x   x   x   x   x               ->
                 / \ / \ / \ / \ / \ / \ / \
                x   x   x   x   x   x   x   x
               / \ / \ / \ / \ / \ / \ / \ / \
              x   x   x   x   x   x   x   x   x
             / \ / \ / \ / \ / \ / \ / \ / \ / \
            x   x   x   x   x   x   x   x   x   x
           / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
          x   x   x   x   x   x   x   x   x   x   x
         / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
        x   x   x   x   x   x   x   x   x   x   x   x
       / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
      x   x   x   x   x   x   x   x   x   x   x   x   x
     / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
    x   x   x   x   x   x   x   x   x   x   x   x   x   x
   / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
  x   x   x   x   x   x   x   x   x   x   x   x   x   x   x
 / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
x   x   x   x   x   x   x   x   x   x   x   x   x   x   x   x
A 16-base tree then has 120 computed vertices above its base. The 16 Xs along the base are 16 historical frames. At 5fps, this covers a history of about three seconds.

Each X in the first row above the base is made by taking the two images below it and doing:

  • an absdiff;
  • a threshold of the result to make contours of the areas in common; and
  • using those contours, a masking of one of the frames to make a masked image of what is in common between the two.

This is a reduction operation. We start with whole frames and we produce new frames that are at most whole frames, but quite likely reduced (masked) frame areas.

At this point every pixel that has made it up to the second row is in some sense a good pixel, it has matched some other pixel.

We create the rest of the tree by continuing to do pair-wise operations on the images below, but the operation for the rest of the tree is a bit different from that first row of operations.

  • To begin with we do the same operation, the matching and reduction (for any area in both masks, if the pixels match then they get added to the output mask sent up to the next level).
  • But then we do a supplementing operation: for any pixels in one input mask but not the other input mask, they get added to the output mask and included in the output sent up to the next level. This continues at each layer to yield one masked image at the top.

I won’t know what this looks like until I see it, but imagine something moving through time, casting a shadow from under the pyramid, maybe tampering with, say, 6 frames. Looking up the tree, it can only influence the triangle above it for 5 layers up, then it gets out-voted by constant stuff from before or after it in time.

This poly tree scheme is expensive and so can only go back a short distance in time. The number of operations in that whole triangle, to compute the apex, is great, too much for a cheap CPU to calculate per frame at any reasonable frame rate. So instead we trade memory for CPU. We keep most all the output data from each new frame’s computation, and just compute the change for each. What is that computation of change?

  • Age out the oldest frame: remove the 16 Xs that go down the right edge;
  • Add the newest frame: a new X on the left at the bottom row; and
  • Do the 15 calculations necessary to put 15 new Xs up the left edge above that new frame.
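Here is a rough sketch, in Rust, of what that per-frame left-edge update might look like. Everything in it is invented for illustration (the Frame type, the match threshold, the exact combine rule); real code would lean on an image library for the absdiff/threshold/mask steps described above, and my own experiments so far have been in Python with OpenCV.

// A sketch of the incremental left-edge update. A "Frame" here is just
// a grid of optional pixels, row-major: None means masked out.
type Pixel = u8;
type Frame = Vec<Option<Pixel>>;

const MATCH_THRESHOLD: u8 = 8; // how close two pixels must be to "match"

// Pairwise operation used above the first row: keep matching pixels
// (reduction), then add pixels present in only one input (supplement).
fn combine(newer: &Frame, older: &Frame) -> Frame {
    newer
        .iter()
        .zip(older.iter())
        .map(|(a, b)| match (a, b) {
            // Both present: keep only if they (roughly) match.
            (Some(x), Some(y)) if x.abs_diff(*y) <= MATCH_THRESHOLD => Some(*x),
            (Some(_), Some(_)) => None,
            // Present in only one input: supplement it into the output.
            (Some(x), None) => Some(*x),
            (None, Some(y)) => Some(*y),
            (None, None) => None,
        })
        .collect()
}

// One frame's worth of work: given the previous left edge (bottom to
// top, N entries for an N-frame history) and a new raw frame, compute
// the new left edge. Its last element is the apex: the current
// background estimate.
fn update_left_edge(prev_edge: &[Frame], new_frame: Frame) -> Vec<Frame> {
    let mut edge = Vec::with_capacity(prev_edge.len());
    edge.push(new_frame); // the new X at the bottom of the left edge
    for level in 0..prev_edge.len().saturating_sub(1) {
        // Each new X combines the X just below it on the new edge with
        // its neighbor at the same level on the previous edge.
        let next = combine(&edge[level], &prev_edge[level]);
        edge.push(next);
    }
    edge
}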

Motivation: I have played with OpenCV and the cv2.BackgroundSubtractorMOG() and cv2.BackgroundSubtractorMOG2() background removal functions and I don’t like them.

First, they aren’t working for me: old background information never ages out, a big change in scene continues to be included in the foreground and never displaces the old background.

Second, they are too slow. Particularly MOG2. I can’t keep up with a reasonable frame rate on a Raspberry Pi 2.
Falling back on a simple absdiff for motion detection, I discovered I can, in place of the MOG2 that was falling behind, do a stupid loop of 30 absdiffs and not fall behind. With this scheme I estimate I will have to do about that much work. And, unlike MOG2, this can be parallelized to multiple threads which can run on the multiple CPUs of a Raspberry Pi 2.
There are probably better ways, but it was easier to think up this one than to go read a couple books on computer vision. And it looks pretty easy to build. I just need to find the time to try it. And how do I efficiently represent a polytree in Python without breaking my brain? Will it be easy or hard? Will a nice recursive model work…?

-kb

©2016 Kent Borg

Which Gadgets Do I Need, Again

Saturday, October 17th, 2015

Four years ago I asked the question “What gadgets do I need?”, and it seemed time to revisit the question.

First, what gadgets do I have?

Pebble Watch

This is my most-present gadget these days. I like that it is limited, it is too small a screen to offer a rich experience, and it is too small a case to have a large battery. So it is limited in its ambition and achieves that goal beautifully. It is an accessory to my phone.

Smart Phone

My Android phone (Nexus 4, soon to be Nexus 5x) is my next most present gadget, it is pretty much always on me (I am not a “set it down” user, I have it on my belt). I am paranoid about keeping it charged, so not only do I charge it at night, I charge it in the car, I charge it at my desk at work.

But I don’t use it as much as I used to. I do not obsessively check it a hundred times a day, I let my Pebble let me know if something notable has happened; the battery life has gone up since I got the Pebble. I also use it less because of my next gadget, my tablet.

Nexus 7 Tablet

I now have the “2013” version of the Nexus 7, which was a nice improvement on the first one (that I broke), and it is a shame they don’t sell something like it anymore.

It took me a while to figure out what it is good for, but I finally realized it is good for everything. I have it in my “purse” which is always close at hand, or I have it in my hand. It is also small enough to fit in most of my back pockets, I don’t sit down with it there, but I can free up my hands. I don’t have cellular data service for it, but wifi is common, and I can turn my phone into a wifi hotspot when I need to.

Notebook Computer

I don’t use it as much as I once did. I don’t carry it around as much as I did. But if I need to do “computing” or writing or real web surfing, it is critical. A real keyboard, a lot of capacity and power, and–notably to me–a computer that I largely control. I run Linux on my notebook and I can do what I want with it. On my other devices someone else is pretty much in control and I am just a user.

It is also a nice way to read the replica version of the New York Times. A big enough screen to handle the current physical layout of the real paper.

The old Thinkpad X230 I bought shortly after my earlier gadget post is getting physically tattered, but it is still a nice machine. The addition of an SSD and more RAM have kept it very useful. And the “Displayport” jack that I didn’t appreciate at first (“What’s that?”) is useful now. And the two USB jacks that are the newer “super speed” flavor (“What’s that?”) are even more useful now. The PCMCIA slot has been useless and the missing modem I might have once wondered over has not been missed–it has been a very nice computer.

I carry it in a bag to and from work every day, I always bring that bag with me if I travel, and even on a day trip I usually bring it, but I don’t just carry the computer alone with me anymore, I have other toys, the computer has been displaced somewhat.

E-Book Reader

I have a Kindle “Paperwhite”, the second version of that. And it fits nicely in my purse. I can use the Kindle software on my phone or tablet, but they have disadvantages:

  • Distractions, the Paperwhite is limited in a good way (see Pebble Watch);
  • Battery life, my Android gizmos can’t touch the battery life of the e-reader;
  • Better display, easier on the eyes, good in bright light;
  • Better user interface, the builtin dictionary works better.

One shortcoming: The Kindle is slow and has a more limited display. If I am reading something that is not just linear text, something that has layout and pictures and charts that don’t reflow well for the Kindle screen, it doesn’t work so well. So where a given recipe might work well on a Kindle, cookbooks do not. And anything much like a textbook that is more than just text also does not work well.

I do sometimes jump from device to device, and having my reading synchronized is nice. I will read the bulk of a book on the Kindle but read bits here and there while waiting for the washing machine to finish, maybe–because I have my phone on me when I don’t have the Kindle.

I wish I made more time to read.

Ipod

Poor Ipod, I still carry it around (in the same bag as my notebook computer) but I don’t use it so much anymore. Why? Two reasons: Streaming radio stations on my Android devices (not streaming services, I actually like real radio stations from around the world) are my more common choice for music.

The second reason is the rather limited user interface on the Ipod. It has a lot of capacity and battery life, but it is hard to find things. One of my pet peeves is it is nearly impossible to listen to anything that came on more than one CD, because the persons who did the data entry were not consistent, and it is really hard to find the other discs of the opera, say. Further frustrations come from Apple neglecting their Itunes program on the Mac, they have made a ton of stupid changes that make it really hard to use.

Camera, Little

I no longer carry a little camera. My Android devices have pretty good cameras in them.

Camera, Big

Still photographs aren’t going away, but “still photography” is. That is, a conventional exposure, with a specific shutter speed, lens opening, focal length, focus distance, “film” speed–these are going away. There is a lot of much richer data collection that can produce a still picture with much more power–but that is another post I might get to later. In the meantime, I do have a big, old fashioned-ish, DSLR that I needed for a specific project (35-mm slide digitizing, but that’s another post, too), and I sometimes get exercise by also carrying this camera.

This is not well resolved territory. It weighs a ton, but it has marvelous resolution and can see in the dark better than I can.

Hiking GPS

My old Garmin is feeling lonely, who buys a hiking GPS anymore? For about a year there I couldn’t even find it and feared I had left it on an airplane (I hadn’t). The point is it is at risk of being pointless. But it is not obsolete yet.

It works offline, when I am without cell coverage, where Google Maps becomes worthless. But I also have an offline map program that works well on my Android devices. Recently on a hike I pulled out my Nexus 7 several times for a good map of where I was, and for finding where the trail we wanted branched off. No cell coverage, but it worked anyway.

I still carried the Garmin on that hike, it is much tougher, much better battery life (replaceable batteries!), and has a screen that works in sunlight. And it is small enough to have out and recording where I have gone, willing to help me retrace my way back to civilization. I also still use it in the car, knowing I can leave the main road and explore and it will show me a dotted line of where I have been.

But I also lost it for about a year and got along quite well without it.

I haven’t quite resolved where I keep it, hence my ability to lose it. Interesting how this will turn out.

I do worry that Garmin will lose interest in this model and quit selling maps, I should update my current North America maps if I can, my current data is starting to get obsolete. But I think Garmin isn’t interested in this business model, I think they want to sell me an expiring subscription, they want me more online. At that point they will lose me, and maybe I’ll find some armored “phone” with long battery life and use it as my new hiking GPS.

Extra Ipod Nano

It is the larger model Nano, not the really tiny square one. I won it in a drawing at some seminar. It has a fair amount of capacity–many, many hours of music. And it has an FM radio. And it is tiny enough to keep in my purse. But I seldom use it.

It does not have great battery life, but it is a different battery from my phone, I can wear it down without worry. I just have to remember to recharge it occasionally. I forget I have it.

Smart phones should have FM radios in them. Some have the hardware but the software is missing. For me it is worth carrying, but I don’t know that this device really has a niche that will last, I suspect not.

Missing Item: Big Tablet?

There is one thing I fear I need to add to my load: a large tablet computer. I want a big and very detailed screen for looking at detailed information: Maps and pictures and other graphics. Google announced one for later this year that might be tempting. I think this would go in my heavy bag with my notebook computer. I seriously doubt I would carry it around as I do with my Nexus 7, but I think it would be nice for specific purposes.

Hey! New York Times: I want the replica edition for Android!

Radio

Something I didn’t really talk about in my previous gadget post is a radio. I listen to the radio a lot and I think 4-years ago I was still carrying around a portable radio, listening to NPR. Now I use my Androids for that. It worries me that I can’t easily listen to local radio if something goes wrong with larger technical infrastructure, and that is the real reason I try to keep that Ipod Nano charged.

Other

In the bag with my computer (a bag that is seldom far), I have a dual USB charger, and a reasonable collection of cables, including a short AC extension cord with three outlets on the end–very handy in airports when I would like to share a rare outlet. And an external USB battery pack so I can revive a phone, or run that Ipod Nano for hours on end if I need to. In my purse I have tiny car and AC to USB power adapters, and a couple USB cables. Oh, and don’t forget a little flashlight on my keyring, so much better than using my phone as a light.

Conclusions

Though my Android devices fill a lot of functions, they haven’t completely displaced that many gadgets: the Ipod, hiking GPS, and radio are endangered, but not yet banished. Good thing I am still young enough to haul around so much crap.

Something this technology has displaced: a lot of paper. We still buy travel books but we don’t carry them around much. We very seldom buy paper maps. (And when I do get a paper map I sometimes photograph it with my Nexus 7 and use it that way. I also photograph the big maps at the trail head instead of trying to just remember them.) Highway maps are long gone from our routine. And I miss them, spreading out a big map is still nice. That is why I want a big tablet, I think it will fill that desire.

There is still a lot happening in the gadget department. I wonder whether 4-years from now a followup posting would show more or less change? Will my load finally start to shrink?

-kb
©2015 Kent Borg

Pebble Battery Life

Tuesday, March 10th, 2015

I decided that, cool as the shake-to-light feature is on the Pebble watch, I would turn it off. It saves battery life. And, frankly, it can be annoying if one sleeps with a watch or is in a movie theater.

But I don’t know how long my year-plus-old watch lasts in this setting because I have taken to setting it to charge when I get up in the morning, before I take a shower. I put it back on when I get dressed. Yes, I know the watch can go in the shower, but it still gets in the way of, say, washing my wrist.

-kb

©2015 Kent Borg