KWT Goes Web: Revolutionizing CLI Interaction with KWTD (Part 3)

In part 3 of our series on kwt, we'll explore the implementation and motivation behind kwtd, the web interface for kwt. If you haven't already, please install kwt by following the guide and read part 1 and part 2 for context.

What is kwtd?

The kwt package releases both kwt and kwtd. You can think of kwtd as a coordinator for user interaction via the browser, while kwt is the executor that actually runs the commands. A guide is available for a demo.

Here's how the system works:

  1. First, kwtd needs to be run. Convenience scripts are available to run it at startup.
  2. Once kwtd is running, it's accessible at http://127.0.0.1:7171. At this point it can show and ingest menus, but it cannot run commands on its own.
  3. A kwt executor must be connected to kwtd, declaring a channel and a menu, e.g., kwt start --channel 0 --menu python-demo.
  4. Finally, a user can navigate to the specified channel on kwtd and run commands defined in the chosen menu.

KWTD Demo

Web Stack: Simplicity by Design

The web stack of Gorilla + Vue.js + Bootstrap was chosen for simplicity and convenience. npm was deliberately left out: it is foreign to the Go ecosystem, so depending on it would complicate continuous deployment (standard Go container images don't ship npm) and worsen the developer experience. The build process has three steps:

  1. Download third-party assets by building and running cmd/prebuild/main.go.
  2. Use go-assets-builder to compile third-party and project assets into assets.go.
  3. Build the web server with assets.go.

Key points to note:

  • No internet connection is required for kwtd to work.
  • The guideline for layout and styling is simplicity (and admittedly, a bit of laziness).
  • Asset files are not bundled, which is fine since we only serve localhost.
  • To aid frontend development, the environment variable ASSETS_ROOT can be set so kwtd reads assets from the file system.

Communication: WebSocket

The kwtd server acts as the central exchange in the system, passing WebSocket messages between the kwt executor and the browser. Each channel contains an exchange loop, defined in exch.go (GitHub) and built on Go channels (not to be confused with kwt channels).
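To make the idea of an exchange loop concrete, here is a minimal sketch of a relay built on Go channels. It is not the contents of exch.go: the types Command, Output, and Exchange and their fields are assumptions made for illustration, and the WebSocket reads and writes are replaced by plain channel sends.

```go
package main

import "fmt"

// Command and Output are hypothetical message types for this sketch.
type Command struct{ Token, Text string }
type Output struct{ Line string }

// Exchange holds the Go channels for one kwt channel (assumed layout).
type Exchange struct {
	FromFront chan Command  // commands read from the browser socket
	ToBack    chan Command  // commands to write to the executor socket
	FromBack  chan Output   // output lines read from the executor socket
	ToFront   chan Output   // output lines to write to the browser socket
	Done      chan struct{} // closed to break the loop
}

func NewExchange() *Exchange {
	return &Exchange{
		FromFront: make(chan Command),
		ToBack:    make(chan Command),
		FromBack:  make(chan Output),
		ToFront:   make(chan Output),
		Done:      make(chan struct{}),
	}
}

// Run relays messages in both directions until Done is closed.
func (e *Exchange) Run() {
	for {
		select {
		case c := <-e.FromFront:
			select {
			case e.ToBack <- c:
			case <-e.Done:
				return
			}
		case o := <-e.FromBack:
			select {
			case e.ToFront <- o:
			case <-e.Done:
				return
			}
		case <-e.Done:
			return
		}
	}
}

func main() {
	e := NewExchange()
	go e.Run()

	// Fake executor: answers every command with a single output line.
	go func() {
		for c := range e.ToBack {
			e.FromBack <- Output{Line: "ran: " + c.Text}
		}
	}()

	// Fake browser: sends one command and waits for its output.
	e.FromFront <- Command{Text: "python --version"}
	fmt.Println((<-e.ToFront).Line)
	close(e.Done)
}
```

Keeping the relay in one goroutine per kwt channel means the socket handlers on either side only talk to Go channels and never to each other directly.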

The communication flow works as follows:

  1. The kwt executor initiates the exchange by opening a WebSocket connection to the channel's back endpoint /ws/back/0 with the menu name as a parameter.
  2. If the channel is available, the kwtd server initializes the exchange loop, loads the requested menu from its server-side definitions, creates a random validation token, and marks the channel as occupied.
  3. The executor performs a GET request to the channel endpoint /channel/0 to receive the menu and the validation token, then initializes its own exchange loop.
  4. The executor waits for a Command, renders and executes it, and sends the Output back line by line until the loop is broken by the server or Ctrl+C.
  5. Once the exchange loop is established, the browser can connect to /ws/front/0 and /channel/0 to send Commands and receive Outputs.
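Step 2 of the flow above, claiming a free channel and minting a validation token, might look roughly like the following. This is a sketch under assumptions: the Channel type, its fields, and the 16-byte hex token are illustrative choices, not details taken from kwtd.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// ErrOccupied is returned when an executor is already attached.
var ErrOccupied = errors.New("channel is occupied")

// Channel is a hypothetical server-side record for one kwt channel.
type Channel struct {
	mu       sync.Mutex
	occupied bool
	menu     string
	token    string
}

// newToken returns 16 random bytes as a 32-character hex string.
func newToken() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// Occupy claims the channel for one executor and returns a fresh token.
func (c *Channel) Occupy(menu string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.occupied {
		return "", ErrOccupied
	}
	tok, err := newToken()
	if err != nil {
		return "", err
	}
	c.occupied, c.menu, c.token = true, menu, tok
	return tok, nil
}

// Release frees the channel, for example when the executor disconnects.
func (c *Channel) Release() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.occupied, c.menu, c.token = false, "", ""
}

func main() {
	var ch Channel
	tok, err := ch.Occupy("python-demo")
	fmt.Println("first executor:", len(tok), "char token, err =", err)

	_, err = ch.Occupy("python-demo")
	fmt.Println("second executor: err =", err)
}
```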

This setup has some unique characteristics:

  • Each channel accommodates at most one executor and one browser session, reflecting the one-to-one mapped interaction model of running commands in a terminal.
  • A random validation token is generated each time an exchange loop is established between an executor and a server. The browser keeps this token and attaches it to Commands. If the executor sees an unexpected token, it won't execute the Command.
  • The number of channels is hardcoded to three, which is not ideal.
  • The HTTP port is configurable on both kwt (--master) and kwtd (env PORT).
  • The server listens only on 127.0.0.1, so it accepts local traffic only, for security reasons.
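The token check on the executor side can be illustrated with a few lines of Go. The Command and Executor types here are hypothetical, and the constant-time comparison is a design choice of this sketch rather than something the original confirms.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// Command is a hypothetical message sent by the browser.
type Command struct {
	Token string
	Name  string
}

// Executor remembers the token it received when the exchange was set up.
type Executor struct {
	token string
}

// Accept reports whether the command carries the expected token.
// An executor without a token accepts nothing.
func (e Executor) Accept(c Command) bool {
	if e.token == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(e.token), []byte(c.Token)) == 1
}

func main() {
	ex := Executor{token: "token-from-handshake"}
	cmds := []Command{
		{Token: "token-from-handshake", Name: "list"},
		{Token: "stale-token", Name: "list"},
	}
	for _, c := range cmds {
		if ex.Accept(c) {
			fmt.Println("executing", c.Name)
		} else {
			fmt.Println("rejected", c.Name)
		}
	}
}
```

A browser tab left open from an earlier session would still hold the old token, so its Commands would be rejected once a new exchange loop is established.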

Display: Markdown Support

The browser receives a WebSocket message per line of command output. All outputs are kept for the channel and can be optionally displayed by toggling the Logs switch. The logs can be cleared by going back to Home. The output of the latest command is kept above the log display and can be rendered as markdown by toggling the Markdown switch.

The markdown feature was implemented with the hope that its formatting capabilities would enable more useful commands to be built, such as those including hyperlinks, tables, or images.

Motivation: Reinventing the Wheel?

The web interface was part of the original vision for the system, driven by a desire to create a tool that lets non-developers run commands easily. At its core, the architecture allows a surrogate terminal program to be populated with declarative configurations and controlled by a UI.

This approach contrasts with more popular desktop productivity tools like Raycast, Command E, and Alfred, which typically run all the actual commands (or call the APIs) from the UI application itself. Here's why I chose a different path:

  1. The wheels have been invented: The terminal is excellent at many things, and I already know the commands to do XYZ. I shouldn't have to invest extra time to configure a productivity tool that works in an unfamiliar way. For example, I already know mysql and aws commands very well, so I'd like to configure the tool with this knowledge, rather than reading documentation on how the tool handles MySQL databases and configures credentials.

  2. Execution is controlled: Running a terminal program feels safe. I can use Ctrl+C to stop it or close the terminal to kill it, read its logs to know it's alive, or configure its environment to run the correct version of Python. In contrast, a process running behind a UI application is much less controllable, especially when faced with problems.

  3. The tool is easy to trust: Since the surrogate terminal program runs inside an environment defined by the user, the tool doesn't have to store passwords or credentials. The system is also auditable because all of the business logic is defined declaratively in config files, and the tool itself is relatively simple. The tool can be trusted because it knows very little.

On the other hand, I do use Alfred daily. Its aesthetics are definitely better than a browser UI, and it's probably easier to ship lots of business logic as an application bundle instead of as a small application with a bunch of config files. Perhaps there's room for convergence in the future.

Conclusion

KWTD, as the web interface for KWT, brings a unique approach to command-line productivity tools. By leveraging the power of existing terminal commands and wrapping them in a user-friendly web interface, it strikes a balance between ease of use and the flexibility of the command line. While there's always room for improvement, particularly in areas like aesthetics and bundling, the current implementation offers a solid foundation for both developers and non-developers to streamline their command-line workflows.