What is kwtd?
The kwt package releases both kwt and kwtd. One can think of kwtd as a coordinator for user interaction via the browser, and kwt as the executor that actually runs the commands. A guide is available for a demo.
- First, kwtd needs to be run. Convenience scripts are available to run it at startup.
- Once kwtd is running, it is accessible on http://127.0.0.1:7171. At this point it can only show and ingest menus, but not run the commands.
- Then a kwt executor must be connected to kwtd, declaring a channel and a menu, e.g. kwt start --channel 0 --menu python-demo.
- Finally, a user can navigate to said channel on kwtd and run commands defined in said menu.
The web stack of Gorilla + VueJS + Bootstrap was chosen for simplicity and convenience. A deliberate decision was made not to use npm, since it is foreign to the Go ecosystem, which complicates continuous deployment (normal Go container images won't have npm available) and worsens the developer experience. Building includes these steps.
- Download 3rd party assets by building and running
go-assets-builder to build 3rd party and project assets into
- Build the web server with
Points to note.
- No internet is required for
- The guideline for layout and styling is simplicity (laziness).
- Asset files are obviously not bundled, which is fine since we only serve localhost.
- To help develop the frontend, the env variable ASSETS_ROOT can be set so that kwtd reads assets from the file system.
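The ASSETS_ROOT switch can be pictured with a minimal sketch like the one below. It is not kwtd's actual code: the function name assetsFS, the /assets/ URL prefix, and the stand-in for the bundled file system are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// assetsFS picks where static assets are read from.
// root is the value of ASSETS_ROOT; bundled stands in for the assets
// compiled into the binary (a hypothetical placeholder in this sketch).
func assetsFS(root string, bundled http.FileSystem) (http.FileSystem, string) {
	if root != "" {
		// Development: read straight from disk so edits show up on reload.
		return http.Dir(root), "disk"
	}
	// Default: serve what was compiled into the binary.
	return bundled, "bundled"
}

func main() {
	// http.Dir(".") is only a stand-in for the bundled file system.
	fs, source := assetsFS(os.Getenv("ASSETS_ROOT"), http.Dir("."))
	fmt.Println("serving assets from:", source)

	// The /assets/ prefix is an assumption, not a documented kwtd route.
	mux := http.NewServeMux()
	mux.Handle("/assets/", http.StripPrefix("/assets/", http.FileServer(fs)))
	_ = mux // a real server would pass mux to http.ListenAndServe
}
```

With this shape, a frontend change only needs a browser reload when ASSETS_ROOT is set, while the default build stays a single self-contained binary.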
The kwtd server is the central exchange in the system. Its job is to pass WebSocket messages between the kwt executor and the browser. Each channel contains an exchange loop, defined in exch.go (github), using Go channels (different from kwt channels).
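To give a feel for what an exchange loop built on Go channels might look like, here is a minimal sketch. It is not the code in exch.go: the Message type, the function name exchange, and the four-channel layout are assumptions, and the WebSocket reads and writes are replaced by plain channel sends.

```go
package main

import "fmt"

// Message is a hypothetical envelope; the real message types may differ.
type Message struct {
	Token string
	Body  string
}

// exchange relays Commands from the browser side to the executor side and
// Outputs in the opposite direction, until done is closed.
func exchange(commands, outputs <-chan Message,
	toExecutor, toBrowser chan<- Message, done <-chan struct{}) {
	for {
		select {
		case cmd := <-commands:
			toExecutor <- cmd
		case out := <-outputs:
			toBrowser <- out
		case <-done:
			return
		}
	}
}

func main() {
	commands := make(chan Message)
	outputs := make(chan Message)
	toExecutor := make(chan Message, 1)
	toBrowser := make(chan Message, 1)
	done := make(chan struct{})
	go exchange(commands, outputs, toExecutor, toBrowser, done)

	// The browser sends a Command, the executor receives it.
	commands <- Message{Token: "t0", Body: "echo hello"}
	fmt.Println("executor got:", (<-toExecutor).Body)

	// The executor sends one line of Output, the browser receives it.
	outputs <- Message{Token: "t0", Body: "hello"}
	fmt.Println("browser got:", (<-toBrowser).Body)

	close(done) // breaks the loop
}
```

A single select keeps both directions in one goroutine, so closing one signal channel is enough to tear the loop down.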
- A kwt executor initiates the exchange by opening a WebSocket connection to the channel's back endpoint /ws/back/0, with the menu name as a parameter.
- If the channel is available, the kwtd server initializes the exchange loop, loads the menu (the server-side copy of the menu is used), creates a random validation token, and marks the channel as occupied.
- The executor GETs the channel endpoint /channel/0 to receive the menu and the validation token, and initializes its own exchange loop.
- The executor waits for a Command, renders and executes it, and sends the Output back line by line until the loop is broken by the server or Ctrl+C.
- Once the exchange loop is established, the browser can connect to /channel/0 to send Commands and receive Outputs.
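The "sends the Output back line by line" step in the list above could be sketched roughly as follows. This is an illustration under assumptions, not kwt's implementation: runAndStream is a hypothetical name, and the send callback stands in for writing one WebSocket message per line.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

// runAndStream starts a command and calls send once per line of its stdout.
// send is a hypothetical stand-in for writing one WebSocket message.
func runAndStream(send func(line string), name string, args ...string) error {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	// Read until the command closes stdout, forwarding each line as it arrives.
	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		send(scanner.Text())
	}
	scanErr := scanner.Err()

	// Wait must come after all reads from the pipe have finished.
	waitErr := cmd.Wait()
	if scanErr != nil {
		return scanErr
	}
	return waitErr
}

func main() {
	// "echo" is just a convenient demo command on Unix-like systems.
	err := runAndStream(func(line string) {
		fmt.Println("output:", line)
	}, "echo", "hello")
	if err != nil {
		fmt.Println("command failed:", err)
	}
}
```

Forwarding each line as soon as it is scanned is what lets the browser show progress for long-running commands instead of waiting for them to exit.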
Some particularities make this different from a traditional web server.
- Each channel accommodates at most one executor and one browser session. We are basically trying to run commands in a terminal, which is a strictly one-to-one interaction model. If a second agent tries to connect to a channel from either side, it will be refused.
- Each time an exchange loop is established between an executor and a server, a random validation token is generated. The browser keeps this token and attaches it to Commands. If the executor sees a token it didn't expect, it won't execute the Command.
- The number of channels is hardcoded to three, which is not ideal.
- The HTTP port is configurable on both
- The server listens on 127.0.0.1 only, so just local traffic is allowed, for security.
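The one-executor-per-channel rule and the validation token described above can be illustrated with a minimal sketch. The type slot and the functions claim, release, and executorAccepts are hypothetical names, and the 16-byte hex token is an assumption; the source only says the token is random.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// slot is a hypothetical model of the executor side of one kwtd channel.
type slot struct {
	mu       sync.Mutex
	occupied bool
}

var errOccupied = errors.New("channel already occupied")

// claim occupies the channel for one executor and returns a fresh random token.
// A second claim is refused until release is called.
func (s *slot) claim() (string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.occupied {
		return "", errOccupied
	}
	buf := make([]byte, 16) // token length is an assumption
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	s.occupied = true
	return hex.EncodeToString(buf), nil
}

// release frees the channel, e.g. after the executor disconnects.
func (s *slot) release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.occupied = false
}

// executorAccepts models the executor-side check: a Command only runs if it
// carries the token issued when the current exchange loop was established.
func executorAccepts(expected, got string) bool {
	return expected != "" && got == expected
}

func main() {
	var ch slot
	token, err := ch.claim()
	fmt.Println("first executor connected:", err == nil)
	_, err = ch.claim()
	fmt.Println("second executor refused:", err != nil)
	fmt.Println("matching token runs:", executorAccepts(token, token))
	fmt.Println("stale token runs:", executorAccepts(token, "stale"))
}
```

Because a new token is issued on every claim, a browser tab left open from an earlier session carries a stale token and its Commands are ignored.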
The browser receives one WebSocket message per line of command output. All of the outputs are kept for the channel, and can optionally be displayed by toggling the Logs switch. The logs can be cleared by going back to Home. The output of the latest command is kept above the log display, and it can be rendered as markdown by toggling the Markdown switch. The markdown feature is implemented in the hope that its formatting capabilities enable more useful commands to be built, e.g. with hyperlinks, tables, or images.
The web interface was part of the first vision of the system, because I wanted a tool that allows non-developers to run commands easily. The core of the architecture is a surrogate terminal program that is stuffed with declarative configurations and controlled by a UI. A contrasting but more popular approach to desktop productivity is to run all the actual commands (or call the APIs) from the UI application itself. Successful examples include Raycast, Command E, and Alfred. Here is why I did it differently.
The wheels have already been invented. The terminal is very good at lots of things, and I already know the command to do XYZ. I shouldn't have to invest extra time to configure a productivity tool that works in an unfamiliar way. For example, I already know aws commands very well, and I'd like to configure the tool with this knowledge instead of reading the docs on how the tool handles MySQL databases and configures credentials.
The execution is controlled. Running a terminal program makes me feel safe. I can Ctrl+C it or close the terminal to kill it, read its logs to know it's alive, or configure its environment to run the correct version of Python. In contrast, a process running behind a UI application is much less controllable, especially when faced with problems.
The tool is easy to trust. Since the surrogate terminal program runs inside an environment defined by the user, the tool does not have to store passwords or credentials. The system is also auditable, because all of the business logic is defined declaratively in config files and the tool itself is dumb. The tool can be trusted because it knows very little.
On the other hand, I do use Alfred on a daily basis. Its aesthetics are definitely better than a browser UI, and it is probably easier to ship lots of business logic as an application bundle than as a small application with a bunch of config files. Maybe there is room to converge.