by Alex Bunardzic

Breaking The Fourth Wall In Software

Or, Everything Old Is New Again

The phenomenon of breaking the fourth wall is well known in the world of theater and cinematography. The breaking of the so-called ‘fourth wall’ is typically brought about by one of the protagonists in the movie suddenly turning toward the camera and addressing the viewing audience, thus breaking the illusion that we’re witnessing a real-life event.

But how does breaking the fourth wall work in software?

Early Human-Computer Interfaces

The first computers were big, expensive, and finicky. The way humans interacted with computers at that time was typically by feeding them stacks of punched cards.

Early Interfaces Were Intimidating

You obviously needed a university degree in order to operate computers.

Early Interfaces Were Clunky

A lot of buttons and switches and dials and levers. Intimidating and clunky.

Breakthrough — Text!

The late 1960s and early 1970s witnessed the introduction of the so-called computer terminal. It emulated a typewriter for entering commands and displayed the results of the evaluated text on a monitor that looked like a TV screen.

Text Is Intuitive

Pretty much all people find text to be very intuitive, close to the way we think and speak. It feels much more natural to speak to the machine than to twiddle the knobs, flip the switches, and pull the levers (not to mention punch the cards or rewire the circuits).

But Computers Are Clunky

In the early days of computers, if you typed a wrong command or used the wrong syntax, the computer would throw a tantrum. Temperamental beasts!

Only These People Knew How To Talk To Computers

You may recognize some faces on the group photo below.

Replace Text With Graphical Interface — The Desktop Metaphor

The next step was to use pictorial representation to shield people from having to memorize awkward commands and syntax when operating computers. The idea was to present users with some familiar scenery, for example their desktop. Everyone is familiar with the idea of a desktop with file folders containing files and a trash can at the side of the desk.

This Graphical User Interface (GUI) was deemed to be even more intuitive than text.

GUIs Quickly Mushroomed Into Something Scary and Non-Intuitive

How is the interface below intuitive? It is as frustrating as the arcane and awkward syntax that the early computers insisted on when processing text.

A Picture Is Worth A Thousand Words

True. But what if most of those thousand words are gibberish? What’s the worth of that?

Bottom Line: People Find GUIs Frustrating

GUIs typically present us with too much information at once. Then the onus is on us to digest all that and try to make sense out of it.

GUIs also tend to enforce a ‘one size fits all’ approach, which is not very user-centric.

So What’s The Solution Then?

What if, instead of this poorly thought out buffer that consists of an intermediary graphical representation, we were to revert to plain text again? It is, after all, much easier to focus on and follow simple discussion threads than it is to navigate hairy, convoluted GUIs.

But Computers Are Brittle And Will Not Be As Forgiving As GUIs

We have grown to depend on GUIs as we would depend on training wheels. Installing training wheels gives us a sense of safety — we cannot fall, and yet we can somehow move forward and get to our destination.

Can’t Get Very Far Using Training Wheels

Training wheels are okay for pedaling around our back yard, but we can’t use them effectively in real-life situations.

How To Remove Training Wheels And Learn To Ride Properly?

Break the fourth wall!

How do we break the fourth wall? Stop pushing pixels!

How can we stop pushing pixels?

Frustrating Example

Say we order something online. The next week we may be wondering about the status of our order (hasn’t, for some reason, arrived yet). Frustrated, we open the browser, go to the online store, log in, and then try to navigate to the order status page.

To reach the order status page, we have to navigate through a veritable jungle of confusing menus, shifting layouts, poorly styled links (often barely visible on the page), and so on. To add insult to injury, these elements keep changing, so we cannot rely on our muscle memory from previous navigation sessions.

Less Frustrating Example

What if, instead of doing all of the above gymnastics and acrobatics, we just do the following:

Go to the command line (say in Messenger or Slack etc.) and type ‘@merchant_name what’s the status of my order?’

That way, we let the merchant service (e.g. Amazon or Etsy or eBay etc.) do the legwork on our behalf.

Which of the two ‘check order status’ experiences is more intuitive?

Guess What — We Just Broke The Fourth Wall In Software!

By forgoing the GUI, we have switched to interacting with some online service using plain text. And it felt quite natural. Notice how, by doing that, we were not expected to undergo any training.

How’s that possible? Simple — instead of interacting with the bare metal computing machinery, we got in touch with a sophisticated chatbot whose role is to know how to parse and interpret colloquial English text.
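To make the idea of interpreting colloquial text a little more concrete, here is a minimal sketch in plain Ruby. The intent names and patterns are hypothetical, and a real chatbot would likely rely on far more capable language processing; the point is only that loose matching replaces strict command syntax.

```ruby
# Hypothetical intent table: each intent is recognized by a loose pattern
# rather than by strict command syntax.
INTENTS = {
  order_status: /\b(status|where)\b.*\border\b/i,
  help:         /\bhelp\b/i
}.freeze

# Returns the first matching intent symbol, or :unknown so the bot can
# ask a clarifying question instead of failing.
def detect_intent(text)
  INTENTS.each do |intent, pattern|
    return intent if text.match?(pattern)
  end
  :unknown
end

puts detect_intent("what's the status of my order?")
```

Returning :unknown rather than raising an error is what would let a bot ask for clarification instead of throwing a tantrum.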

How Do We Get Involved With A Chatbot?

We /invite it to our channel. For example, say we find out there is a chatbot that specializes in restaurant recommendations. We wish to get in touch with that bot, and after finding out the name of the bot (say, restobot), we ‘hire’ that bot to work for us by typing:

/invite @restobot

Once invited to your channel, this bot will always remain online, listening attentively for its name to get mentioned.
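Listening for a name can be pictured with a small sketch. This is a simplification under stated assumptions: the helper name mentioned? is made up for illustration, and real messaging platforms typically encode mentions in their own format rather than as raw text.

```ruby
# Returns true when the message addresses the bot by name, e.g. "@restobot ...".
# The comparison is case-insensitive; bot_name is a plain string without the "@".
def mentioned?(text, bot_name)
  text.match?(/(^|\s)@#{Regexp.escape(bot_name)}\b/i)
end

puts mentioned?("@restobot where can I get sushi nearby?", "restobot")
puts mentioned?("I was talking about restobots in general", "restobot")
```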

The Bot Is The Buffer

Similar to how a GUI was the buffer between us, human users, and the cold, bare computing machinery, bots are now replacing GUIs as that warm and fuzzy buffer. Bots are shielding us from having to deal with the temperamental machinery by translating our plain English commands into something that the underlying computing services can understand and work with.

What’s The Value Proposition Of Bots?

Bots are attentive to human needs and sensitive to human frailty.

So Is This A Revolutionary Change?

Not really. It’s the natural outcome of the advancements we’ve made in the field of human-computer interaction. So it’s more of an evolutionary change.

In actuality, this conversation-based interface is not all that different from operating computers via GUIs. If we examine a bit more closely what’s going on beneath the surface of typical GUI processing, we’ll find the following scenario:

  • A user wants to ask the computer to do something
  • User goes to the screen/page where they get presented with one or more input fields
  • These input fields, sometimes referred to as text boxes, accept text from the user
  • The GUI then listens to the user’s gestures, such as ‘send’ or ‘submit’ gesture
  • Once the event signalling the expected gesture occurs, the GUI turns around and sends text to the underlying servers

GUIs Are Also Text-Driven

Similar to how bots operate, GUIs also possess the knowledge of how to collect text from users and then formulate the collected values using the strict syntax that the back-end computers can understand.
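As a rough illustration of that last step, the sketch below takes values a user might have typed into a form and serializes them into the strict key=value syntax of an HTTP form submission, using Ruby’s standard URI library. The field names and values are invented for the example.

```ruby
require "uri"

# Hypothetical values a user typed into a web form's text boxes.
form_fields = { "name" => "Ada Lovelace", "phone" => "555 0100" }

# On "submit", the collected values are serialized into the strict
# key=value&key=value syntax that an HTTP back end expects.
request_body = URI.encode_www_form(form_fields)
puts request_body  # name=Ada+Lovelace&phone=555+0100
```

Whatever the form looked like on screen, the back end only ever sees this line of text.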

So if that’s the case, where do pixels come into play?

Most of the time, pixels are being used as decoration. They typically sugarcoat the screen, or a web page, and dress it up in a robe that looks more familiar to the users, for example by dressing up a web form to look similar to a paper form.

Doing that eases the tension users may feel when attempting to work with computers. The intent is to demystify the interaction, and make it feel similar to everyday interactions one may encounter when dealing with various non-virtual services.

Remove Pixelated Decorations, And What Are We Left With?

One word — microcopy.

What is microcopy?

In the above example, microcopy is any text placed next to the GUI control. In a GUI form, we may ask users to enter their phone number. Oftentimes, people are not sure they want to do that, and wonder why we would need their personal information. So we place a simple, direct sentence in brackets, right next to the caption asking for the phone number, explaining the purpose of that request. For example, “we need your phone number for shipping-related questions”.

Or, we may offer a microcopy that is a bit more verbose, such as in the example below:

Conversational Thread

If we imagine removing all pixels and with it the graphical user interface, what we’re left with is a simple conversational thread that gets recorded between the user and the computer.

What Are The Advantages Of Conversational Interfaces?

  • Intuitive
  • Sensitive to human frailty (the bot will try to clarify human request if initially not clearly understood)
  • Familiar (everyone is already fully accustomed to chatting with family/friends/coworkers)
  • Consistent experience across all devices (immune to any concerns/issues related to layouts, fonts, colors, etc.)
  • Guarantees full user ownership of the conversation — fully personalized discussion thread is forever recorded and owned by the human user (full transparency, complete audit)

Conversational Commerce

As we’re moving into the post-web 2.0 world, the universal slogan ‘content is king’ now becomes ‘commerce is king’. In the web 2.0 world, when a user completes a transaction via GUI, all the steps that transpired between the user and the online service may have been recorded by the back-end service, but are opaque to the user. In the world of conversational commerce, every step that transpired between the user and the online service is recorded in the conversational thread and is fully owned by the user.

Conversational Interface Experience Is Similar To Regular Customer Support Experience

Similar to how calling a 1–800 number was a mainstream customer support channel before the emergence of web 2.0 and mobile apps, we’re moving back into conversing directly with customer support. Only this time, instead of being put on indefinite hold and forced to listen to horrible muzak, we’re conversing with bots that are always on and are much faster, more accurate, and more detailed than a human workforce.

And same as with the 1–800 scenarios, if our call for some reason cannot get resolved in a satisfactory manner, we can easily escalate. In the old regime we would ask the customer service representative to let us talk with their supervisor, and in the new regime we would ask the bot-agent to put us in touch with a human operator.

Let’s Create Our Own Bot Now!

Perhaps the best way to grasp this transition from graphical to text-based user interface is to roll up our sleeves and create a bot from the ground up. Creating a bot is quite easy, because the tools necessary for building bots have been largely commoditized. Still, I feel that merely creating a bot would not be an efficient or convincing demonstration of the importance of conversational commerce. That’s why I’m proposing that we learn here not only how to create a bot, but also how to create a bot that is capable of doing something useful for us.

For example, let’s create a bot that will help us get in touch with some e-commerce service by using plain text as the user interface.

Create Online Commerce Service First

For the sake of brevity, let’s create a simple e-commerce site that will host an inventory of products. Those products will be offered for sale, and some of the products on sale will also be offered at a discount price.

We will use a state-of-the-art web development framework (Ruby on Rails) for building this service. If you don’t have the Rails framework installed, please refer to the Rails site for instructions on how to get it installed on your computer.

Once installed, we use Rails to create a new site. Open the terminal and type:

rails new your_site_name

Rails will then create the new project for you, and once you navigate to your new project (by typing cd your_site_name), you are ready to create the inventory of products to be hosted on the new site. We will create a resource called Product, and will then assign several attributes to it:

rails generate scaffold Product name:string price:decimal on_special:boolean discount_percentage:integer description:text

The above command will create the resource called Product and will implement the product attributes: the product name, its price, whether or not it’s on special, the discount percentage, and a description.
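To see how these attributes might work together, here is a small plain-Ruby sketch. It does not use the generated Rails model; ProductSketch and sale_price are hypothetical names introduced only to show how on_special and discount_percentage could determine the price a customer pays. BigDecimal is used because that is the Ruby type Rails maps decimal columns to.

```ruby
require "bigdecimal"

# Plain-Ruby stand-in for the scaffolded Product model (not the Rails class).
ProductSketch = Struct.new(:name, :price, :on_special, :discount_percentage)

# Hypothetical helper: the price a customer pays after the discount, if any.
def sale_price(product)
  return product.price unless product.on_special
  (product.price * (100 - product.discount_percentage) / 100).round(2)
end

hat   = ProductSketch.new("Hat", BigDecimal("20.00"), true, 25)
scarf = ProductSketch.new("Scarf", BigDecimal("12.50"), false, 0)

puts sale_price(hat).to_s("F")
puts sale_price(scarf).to_s("F")
```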

Now is the time to create a database where the inventory of products will get stored. We do that by using the specifications that got created with the previous command. The command to create and install the products database is as follows:

rake db:migrate

The only thing left to do is to start the server and verify that the web site we’ve just created is working as expected:

rails s

Maintain Inventory Of Products

Now that we’ve created our products database and our web site, we should navigate to it and add some products. Open up the web browser and navigate to the http://localhost:3000/products URL.

Of course, the product inventory page will be empty, because we haven’t added any products yet. Let’s do that by clicking on the ‘New Product’ link.

After entering some values, we click on the “Create Product” button and the product is now added to the inventory. Let’s enter a few more products (remembering to click on the “On special” checkbox for some of them).

Now that we have several products in our inventory, time to build a conversational commerce bot. What will be the usefulness of that bot? In order to keep things simple, we will endow this bot with the ability to answer text commands enquiring about the products that are on special.

Where Will Our Bot Live?

A bot must be able to listen to text messages arriving from users, and the best way to make that happen is to add the bot to some messaging platform. Currently, the most attractive messaging platform for adding bots is Slack, so we’re going to use it to demonstrate how to build conversational commerce.

Sign up with Slack (if you’re not a member already), and then go to:

https://yourteam.slack.com/services/new/bot

You will be asked to specify the name of your bot. Let’s call our bot ‘gofer’.

After clicking the “Add bot integration” button, we will be able to set our gofer up on Slack. First things first, let’s choose the icon that will represent our bot. I choose my favourite robot, Bender.

We can also add the bot’s first and last name and a description outlining the bot’s capabilities.

After saving the integration, we notice the API Token; this token is extremely important, as it allows the integration between our handcrafted bot and the Slack platform. Let’s copy the value of that API Token for future reference.

Final Step — Crafting Our Bot

Now’s the time to open the source code of our e-commerce product inventory site. We must add the bot to this site, because the bot will be able to utilize the services built into our inventory site and answer the questions coming from the Slack users.

First thing we need to do is navigate to the config folder in our product inventory site and create a new file. That file will contain the Slack API Token. We can name this file anything we want; I prefer to keep its name simple, so I call it api.rb. This file will consist of only one line of code:

ENV['SLACK_API_TOKEN'] = 'xoxb-23830295172-r5CzhzDnUSZQfUfXWmR'

Next we need to tell the Rails framework to load that API Token during the initialization phase. We open the config/environment.rb file, and add the following lines of code:

# Load the Slack API Token from config/api.rb, if that file is present
api = File.join(Rails.root, 'config', 'api.rb')
load(api) if File.exist?(api)

Now that we have declared Slack API Token and instructed Rails to load it, we need to add our bot to the project. The best way to do that is to navigate to the app folder, and create a new folder simply named bots.

Create a new file in the app/bots folder, and name it real_time_messaging.rb. This file will deal with the thread used for our bot to listen for the incoming messages. Add these lines to the file, and save it:

$:.unshift File.dirname(__FILE__)
Thread.abort_on_exception = true
Thread.new do
  Gofer.run
end

You have probably noticed that in the file above we have mentioned Gofer; we managed to get ahead of ourselves by mentioning the bot we haven’t created yet. But that’s okay, because we’re not ready yet to kick start the bot service listening on the channel.

So the real challenge now is to figure out how to craft our bot named Gofer. For the sake of brevity, we will cheat here by leveraging the commodity service known as Slack Ruby Bot. Leveraging this commodity allows us to save time that would otherwise have been spent on coding the low-level web sockets processing, which is a fairly involved exercise.

The quickest way to leverage this commodity is to open the Gemfile found in the root of the project, and add the following line to it:

gem 'slack-ruby-bot'

Save the file, then go to the command line at the root of the project and run:

bundle install

When the installation completes, we will have baked in our Slack Ruby Bot commodity service, which we will leverage when creating our Gofer bot.

But before we jump into crafting the bot logic, we need to complete one more step related to the underlying plumbing needed for the Slack bot to work properly. Navigate to the config/initializers folder, and create a new file simply called bot.rb. This is a simple file consisting of only one line of code:

require File.join(Rails.root, 'app/bots/real_time_messaging')

It is simply instructing Rails to load the real_time_messaging.rb file on initialization. And if we look back into the contents of the real_time_messaging.rb file, we will see that, once the web site boots, it will also run a thread that is responsible for running the Gofer bot.

And finally, on to creating the bot logic! Create a new file in the app/bots folder, and name it gofer.rb. This file will declare Gofer bot as inheriting its capabilities from the commodity we’ve just installed — SlackRubyBot::Bot.

class Gofer < SlackRubyBot::Bot
end

This bot inherits some rudimentary capabilities from SlackRubyBot, such as the ability to respond to commands. And these commands are what we’ll be teaching this bot, telling it how to respond to each command it receives.

Let’s start with something extremely simple: let’s teach our Gofer bot how to handle the ‘help’ command. Add the following command definition inside the Gofer class in the gofer.rb file:

command 'help' do |bot, thread|
  bot.say(channel: thread.channel, text: 'Help is on its way.')
end

This command is going to use the bot to get it to display the text ‘Help is on its way.’ to the channel from where it was asked for help.

Save the file, go to the command line and start the server (rails s). You will now notice additional messages on the command line when the server starts:

Now that our bot Gofer is successfully connected to our Slack team, we can test it. Go to your Slack team, and you will see that the gofer bot is online (there is a green semaphore light next to its name). Click on its name, and then type ‘help’. You will see that the bot immediately responds with the text we’ve given it above.

OK, neat, so now we see that our bot is working. But how do we get it to tell us what products are offered on special at the moment? Simple — we just add a new command (let’s call it ‘promo’ for simplicity) and instruct the bot to gather information on the products with discounted prices and send us the list back.
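The code for the ‘promo’ command is not reproduced in this text, so what follows is only a sketch of one way it could look, not the original implementation. The message-building part is written in plain Ruby so it can run on its own; PromoItem and promo_message are hypothetical names, and the commented-out wiring assumes the Product model generated earlier and the same command style used for ‘help’.

```ruby
# Plain-Ruby stand-in for Product records; in the Rails app these would come
# from something like Product.where(on_special: true) (an assumption).
PromoItem = Struct.new(:name, :price, :discount_percentage)

# Hypothetical helper: turns the discounted products into one chat message.
def promo_message(products)
  return "Nothing is on special right now." if products.empty?
  lines = products.map do |product|
    format("%s: %d%% off (regular price $%.2f)",
           product.name, product.discount_percentage, product.price)
  end
  (["On special today:"] + lines).join("\n")
end

# Inside the Gofer class, the helper could be wired up roughly like this:
#   command 'promo' do |bot, thread|
#     bot.say(channel: thread.channel,
#             text: promo_message(Product.where(on_special: true)))
#   end

puts promo_message([PromoItem.new("Hat", 20.0, 25)])
```

Because the list is built each time the command arrives, any change made in the product inventory would show up in the bot’s next answer.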

Save the file, restart the server, and flip over to Slack to ask gofer what’s on special.

Just to verify that the bot is indeed working in real time, go to the product inventory and make some changes. For example, remove the discount for the Hat and maybe add a discount for some other product. After you do that and ask gofer what’s on promo, it will tell you all the latest details.

Conclusion

The latest trends show that more and more people tend to spend the majority of their online time chatting. As of early 2016, almost 1 billion active users are spending time on Messenger and other chat apps. As people are getting familiar with texting with their friends, family, and coworkers, they are also slowly becoming acclimatized to chatting with bots. That experience is offering a more intuitive way to get things done, and all signs indicate that this new way of interacting with computers is what people at large seem to prefer.

We have attempted to illustrate how this transition will work by walking the readers through a hands-on session on how to create their own bots. Once the first trial bot gets created, we realize that the sky is the limit: there are so many useful things these bots can do, so let’s get cracking!

Update

I was invited by the RED Academy to deliver a talk on Conversational Commerce and the Bot ‘revolution’. The talk was recorded and can be viewed below:

Intrigued? Want to learn more about the bot revolution? Read more detailed explanations here:

How To Build A Stateful Bot
The Age of Self-Serve is Coming to an End
Only No Ux Is Good UX
Stop Building Lame Bots!
Four Types Of Bots
Is There A Downside To Conversational Interfaces?
Are Bots just a Fad? Are GUIs really Superior?
How to Design a Bot Protocol
Bots Are The Anti-Apps
How Much NLP Do Bots Need?
Screens Are For Consumption, Not For Interaction