
Lesson 3: The state-machine model
Objective: Learn how to use the state-machine model to make CGI programs flow more naturally.

Use the State-Machine Model to Make CGI Programs Flow Naturally

Using the state-machine model in CGI (Common Gateway Interface) programs can greatly enhance their flow and user experience, particularly in web applications where maintaining the state of a user's interaction across multiple requests is crucial. CGI programs, by default, are stateless: each request is treated as an independent transaction, and the program doesn't retain user state or session information between requests. The state-machine model can help overcome this limitation by explicitly managing states and transitions, leading to a more natural flow in CGI programs. Here's how you can implement this model:
  1. Define States: First, identify the different states your CGI program can be in. These states should represent significant points in the interaction flow with the user. For example, in a shopping cart application, states could include `Browsing`, `Adding Item`, `Viewing Cart`, `Checkout`, and `Payment`.
  2. Define Events: Events are user actions or system actions that trigger transitions between states. In a CGI context, events might be triggered by form submissions, hyperlink clicks, or any input from the user. For each state you've defined, determine what events can occur and where these events should lead.
  3. Implement State Transitions: For each event in a given state, implement the logic that determines the next state. This often involves processing user input, updating session data, and generating the appropriate CGI output (like HTML pages) for the next state. It's crucial to ensure that the program responds correctly to each possible event in every state.
  4. Maintain Session State: Since HTTP is stateless, use session management techniques to maintain the state across multiple CGI requests. This can involve cookies, URL rewriting, hidden form fields, or server-side session storage. The session data should store the current state of the user's interaction so that each request can be contextualized correctly.
  5. Design Stateful Interactions: Design your CGI program's user interface and interactions around the state-machine concept. Ensure that at any given state, the user is presented with options that make sense for that state and that lead to valid subsequent states. This might mean customizing forms, links, and navigation elements based on the current state.
  6. Error Handling: Implement robust error handling to manage unexpected events or invalid state transitions. This includes providing useful feedback to the user and ensuring the program can gracefully return to a valid state.
  7. Testing: Thoroughly test your CGI program across all states and transitions to ensure that the flow is natural and intuitive and that all possible paths lead to valid outcomes. Pay special attention to edge cases and error conditions.
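
The steps above can be sketched as a transition table. The following is a minimal illustration, not a complete application: the event names (`add_item`, `view_cart`, `checkout`, and so on) and the helper names `next_state` and `allowed_events` are assumptions made up for this sketch, and only the state names come from the shopping cart example in step 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical transition table for the shopping cart example:
# $TRANSITIONS{current state}{event} names the next state.
my %TRANSITIONS = (
    'Browsing'     => { add_item => 'Adding Item', view_cart => 'Viewing Cart' },
    'Adding Item'  => { done     => 'Browsing',    view_cart => 'Viewing Cart' },
    'Viewing Cart' => { continue => 'Browsing',    checkout  => 'Checkout' },
    'Checkout'     => { confirm  => 'Payment',     back      => 'Viewing Cart' },
    'Payment'      => {},    # terminal state in this sketch
);

# Return the next state, or undef when the state is unknown or the
# event is not allowed in that state (step 6, error handling).
sub next_state {
    my ($state, $event) = @_;
    return undef unless defined $state && defined $event;
    return undef unless exists $TRANSITIONS{$state};
    return $TRANSITIONS{$state}{$event};
}

# List the events that are valid in a state; the page for that state
# would offer exactly these as links or buttons (step 5).
sub allowed_events {
    my ($state) = @_;
    return sort keys %{ $TRANSITIONS{$state} || {} };
}

print "Viewing Cart + checkout -> ", next_state('Viewing Cart', 'checkout'), "\n";
print "Events in Checkout: ", join(', ', allowed_events('Checkout')), "\n";
```

Keeping the transitions in one data structure means an invalid request, such as a `confirm` event arriving while the user is still in `Browsing`, is detected by a single lookup rather than by checks scattered through the program.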

Example: Imagine a simple CGI-based survey application. The states might be `Start`, `Question 1`, `Question 2`, `Finish`. Each question page submission is an event leading to the next state. The CGI script checks the session data to determine the current state and displays the appropriate question. Upon completion, it transitions to the `Finish` state, where it might display a thank you message and store the survey results.

By employing the state-machine model, you can make your CGI programs more interactive and intuitive, closely mimicking the stateful behavior seen in more complex web applications, despite the stateless nature of HTTP.
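
The survey flow might be sketched as follows. This is an illustration under stated assumptions: it carries the next state in a hidden form field rather than in server-side session data, and the names `render_state`, `escape_html`, and the `answer` field are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The survey states from the example, in the order they are visited.
my @STATES   = ('Start', 'Question 1', 'Question 2', 'Finish');
my %IS_STATE = map { $_ => 1 } @STATES;

# Map each state to the one that follows it; 'Finish' has no successor.
my %NEXT;
$NEXT{ $STATES[$_] } = $STATES[ $_ + 1 ] for 0 .. $#STATES - 1;

# Escape text before placing it in HTML, so a value cannot break out
# of the attribute or element that holds it.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build the page for one state. The hidden "state" field tells the
# next request which state to process.
sub render_state {
    my ($state) = @_;
    return "<h1>Error: unknown state</h1>\n"
        unless defined $state && $IS_STATE{$state};
    return "<h1>Thank you for completing the survey</h1>\n"
        if $state eq 'Finish';

    my $next = escape_html($NEXT{$state});
    my $html = "<h1>" . escape_html($state) . "</h1>\n";
    $html .= qq{<form method="POST">\n};
    $html .= qq{<input type="text" name="answer">\n} if $state ne 'Start';
    $html .= qq{<input type="hidden" name="state" value="$next">\n};
    $html .= qq{<input type="submit" value="Continue">\n</form>\n};
    return $html;
}

print render_state('Question 1');
```

A real survey would also need to carry or store the answers collected so far; that part is left out of this sketch.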


The state-machine model (sometimes called the multiple-state automaton, but you knew that already) is a technique for automatically restoring the state of flow whenever the program is recalled from another iteration of itself. This works by passing enough information back to the user's browser so that it can recall the program with enough context to continue where it left off. Each time the program is called, it will be presented with a complete package of data representing its current state and a token representing the next state for processing. Generally, the state-management data is passed in hidden fields within an HTML document. Here is a simple example (statemach.cgi) for you to examine.
You do not need to write any HTML to interact with statemach.cgi; the program generates its own HTML.

statemach.cgi

#!/usr/bin/perl
# statemach.cgi
# a simple state-machine example
# constants
$CRLF = "\x0d\x0a";
$servername = $ENV{'SERVER_NAME'};  # this server
$scriptname = $ENV{'SCRIPT_NAME'};  # this program's URI
# how to call back
$callback = "http://$servername$scriptname";  
# need this at the top of all CGI progs
print "Content-type: text/html$CRLF$CRLF";
# get the query vars, if any
%query = getquery();

# if there's no data, assume this is the first iteration
$state = 'first' unless %query;
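# copy only the fields this program expects into variables;
# turning every submitted name into a variable would let a request
# overwrite any global, such as $callback
$state = $query{'state'} if exists $query{'state'};
$fun   = $query{'fun'};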

# the main jump table
if    ($state eq 'first'   ) { first()    }
elsif ($state eq 'validate') { validate() }
else                         { unknown()  }

exit;
# STATE SCREENS


sub first{
print <<FIRST;
<title>First Time</title>
<h1>Is this your first time?</h1>
<form method=POST action="$scriptname">

<p> This is your first time through the state-machine. 

<p> Are you having fun yet?  

<input type=checkbox name=fun>
<input type=hidden name=state value=validate>
<input type=submit value=" tell me about it ">

</form>
FIRST
}
sub validate
{
if ($fun) { isfun() }
else { notfun() }
}
sub isfun
{
print <<ISFUN;
<title>Having Fun</title>

<h1>Glad you are having fun!</h1>

ISFUN
}
sub notfun
{
print <<NOTFUN;
<title>Sorry</title>

<h1>Sorry, that's the wrong answer</h1>

<h3>Come back when you are having fun!</h3>

NOTFUN
}

sub unknown
{
print <<UNK;
<title>Huh?</title>

<h1>Error: Confused</h1>

<p> Um...I do not know how I got here. Sorry. 

UNK
}


# UTILITY ROUTINES
# getquery
# returns hash of CGI query strings
sub getquery
{
my $method = $ENV{'REQUEST_METHOD'};
my ($query_string, $pair);
my %query_hash;

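$query_string = $ENV{'QUERY_STRING'} if $method eq 'GET';
# read exactly CONTENT_LENGTH bytes; a POST body need not end in a newline
read(STDIN, $query_string, $ENV{'CONTENT_LENGTH'} || 0) if $method eq 'POST';
# a bare return yields an empty list, so %query stays empty in the caller
return unless $query_string;

foreach $pair (split(/&/, $query_string)) {
  # split before decoding so an encoded = or & stays inside its field
  my ($name, $value) = split(/=/, $pair, 2);
  next unless defined $name;
  foreach ($name, $value) {
    next unless defined;
    tr/+/ /;
    s/%([\da-f]{2})/chr(hex($1))/ieg;
  }
  $query_hash{$name} = $value;
  }
return %query_hash;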
}

# printvars
# diagnostic to print the environment and CGI variables
sub printvars{
  print "<p>Environment:<br>\n";
  foreach $e (sort keys %ENV) {
     print "<br><tt>$e => $ENV{$e}</tt>\n";
  }
  print "<p>Form Vars:<br>\n";
  foreach $name (sort keys %query) {
    print "<br><tt>$name => [$query{$name}]</tt>\n"; 
  }
}
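
As the number of states grows, the if/elsif jump table can be replaced by a hash of code references. The following is a sketch of that alternative, not part of statemach.cgi: the names `screen_first`, `screen_validate`, `screen_unknown`, and `run_state` are hypothetical, and the handlers return their HTML instead of printing it so that the dispatch logic can be exercised on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One handler per state; each receives the query hash by reference.
sub screen_first   { return "<h1>Is this your first time?</h1>\n" }
sub screen_unknown { return "<h1>Error: Confused</h1>\n" }
sub screen_validate {
    my ($query) = @_;
    return $query->{fun}
        ? "<h1>Glad you are having fun!</h1>\n"
        : "<h1>Sorry, that's the wrong answer</h1>\n";
}

# The jump table: state name => handler.
my %DISPATCH = (
    first    => \&screen_first,
    validate => \&screen_validate,
);

# Pick the handler for the submitted state. An empty query means this
# is the first iteration; anything unrecognized goes to screen_unknown.
sub run_state {
    my ($query) = @_;
    my $state   = %$query ? $query->{state} : 'first';
    my $handler = (defined $state && $DISPATCH{$state}) || \&screen_unknown;
    return $handler->($query);
}

print run_state({ state => 'validate', fun => 'on' });
```

Adding a state then means adding one sub and one entry in `%DISPATCH`, and a state name that is not in the table can never reach code it was not meant to call.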

Lexers More Generally

To write a lexer in a language like C, one typically writes a loop that reads the input, one character at a time, and which runs a state machine, returning a complete token to the caller when the state machine definition says to. Alternatively, we could use a program like lex, whose input is a definition of the tokens we want to recognize, and whose output is a state machine program in C.

In Perl, explicit character-by-character analysis of input is slow. But Perl has a special feature whose sole purpose is to analyze a string character-by-character and to run a specified state machine on it; the character loop is done internally in compiled C, so it is fast. This feature is regex matching. To match a string against a regex, Perl examines the string one character at a time and runs a state machine as it goes. The structure of the state machine is determined by the regex. This suggests that regexes can act like lexers, and in fact Perl's regexes have been extended with a few features put in expressly to make them more effective for lexing.

As an example, let us consider a calculator program. The program will accept an input in the following form:
a = 12345679 * 6
b=a*9; c=0
print b

This will perform the indicated computations and print the result, 666666666. The first stage of processing this input is to tokenize it. Tokens are integer numerals; variable names, which have the form /^[a-zA-Z_]\w*$/; parentheses; the operators +, -, *, /, **, and =; and the special directive print. Also, newlines are significant, since they terminate expressions, while other whitespace is unimportant except insofar as it separates tokens that might otherwise be considered a single token. (For example, printb is a variable name, as opposed to print b, which is not.)
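
A tokenizer for this input might be sketched with the `/g` and `/c` match modifiers and the `\G` anchor, which together let each match resume where the previous one stopped. This is an illustration under stated assumptions rather than a definitive lexer: the token type names and the `tokenize` sub are hypothetical, and treating `;` as a terminator alongside the newline is an assumption based on the sample input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a list of [type, text] pairs for a string of calculator input.
sub tokenize {
    my ($input) = @_;
    my @tokens;
    # Each successful /gc match advances pos($input); a failed match
    # leaves it alone because of /c, so the next branch can try again
    # from the same position via \G.
    while ((pos($input) || 0) < length $input) {
        if ($input =~ /\G[^\S\n]+/gc) {
            next;                                   # skip blanks, keep newlines
        }
        elsif ($input =~ /\G(;|\n)/gc) {
            push @tokens, ['TERMINATOR', $1];
        }
        elsif ($input =~ /\G(\d+)/gc) {
            push @tokens, ['INTEGER', $1];
        }
        elsif ($input =~ /\G([a-zA-Z_]\w*)/gc) {
            # the whole word is matched first, so "printb" is an identifier
            push @tokens, [$1 eq 'print' ? 'PRINT' : 'IDENTIFIER', $1];
        }
        elsif ($input =~ /\G(\*\*|[-+*\/=()])/gc) {
            push @tokens, ['OPERATOR', $1];         # ** is tried before *
        }
        else {
            die "Unexpected character at offset " . (pos($input) || 0) . "\n";
        }
    }
    return @tokens;
}

for my $token (tokenize("a = 12345679 * 6\nb=a*9; c=0\nprint b\n")) {
    my ($type, $text) = @$token;
    $text = '' if $type eq 'TERMINATOR';            # do not echo raw newlines
    print "$type $text\n";
}
```

Because the identifier pattern consumes the longest run of word characters before the result is compared with `print`, no special handling is needed to tell `printb` apart from `print b`.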

State machine - Quiz

Take a brief multiple-choice quiz about details of the state-machine model.