In this article, we'll look at the Common Gateway Interface (CGI), which lets you set up scripts to do server-side processing. Such scripts are usually the best way to process form input.
In the simplest case, the request consists of a single GET command line, followed by a blank line:
    GET /~gedetil/form/form.html HTTP/1.0

The server's response has a similar structure: status line, MIME-format headers, blank line, and message body. For a typical GET request, the server responds with the requested document:

    HTTP/1.0 200 Document follows
    Date: Mon, 25 Mar 1996 20:32:23 GMT
    Server: NCSA/1.5
    Content-type: text/html
    Last-modified: Mon, 25 Mar 1996 18:48:57 GMT
    Content-length: 1010

    ... document content ...

All of these transactions are normally transparent to the user, except when an error occurs. However, when dealing with CGI programs, it's important to know about the structure of transactions.
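To make the request and response structure concrete, here is a minimal shell sketch. The function names build_get_request and response_body are made up for this illustration; they simply follow the layout described above (request line plus blank line, and headers separated from the body by a blank line).

```shell
#!/bin/sh
# Sketch only: build_get_request and response_body are hypothetical helpers.

# Print a minimal HTTP/1.0 request: one request line, then a blank line.
# HTTP lines end in CR LF, so the blank line is just a second CR LF.
build_get_request() {
    printf 'GET %s HTTP/1.0\r\n\r\n' "$1"
}

# Read a full response on standard input and print only the message body,
# that is, everything after the first blank line (with or without a CR).
response_body() {
    awk 'body { print; next } /^\r?$/ { body = 1 }'
}
```

Assuming a tool such as nc is available, the two could be combined along the lines of `build_get_request /~gedetil/form/form.html | nc www.example.org 80 | response_body`, where the host name is only a placeholder.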
The server still handles the communication with the client, but when a request is received for a CGI program, the server invokes the program, and passes it the information it needs. Since the server has already read in the client's command and any MIME headers, this information is passed along via environment variables.
The REQUEST_METHOD variable will be set to either GET or POST, depending
on the method used to submit the request. In the case of a GET request,
QUERY_STRING will contain the URL-encoded string that was submitted
(which was sent as part of the URL on the request). In the case of a
POST request, CONTENT_TYPE and CONTENT_LENGTH will contain the values of
the equivalent MIME headers. CONTENT_TYPE should be set to
"application/x-www-form-urlencoded" for a normal form submission.
CONTENT_LENGTH will indicate how many bytes of data must be read from
the standard input, in order to obtain the URL-encoded form data.
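The variables described above suggest a simple dispatch on the request method. The following is a minimal sketch of how a script might fetch the raw form data itself; the function name read_form_data is an assumption made for this example, and dd is used only as one way to read an exact number of bytes.

```shell
#!/bin/sh
# Sketch: fetch the raw URL-encoded form data, whichever way it arrived.
# read_form_data is a hypothetical helper name used only for this example.
read_form_data() {
    case "$REQUEST_METHOD" in
        GET)
            # GET: the data came in on the URL, after the "?".
            printf '%s' "$QUERY_STRING"
            ;;
        POST)
            # POST: read exactly CONTENT_LENGTH bytes from standard input.
            # Reading one byte at a time is slow but never reads too far.
            dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
            ;;
        *)
            # Any other method is not handled by this sketch.
            return 1
            ;;
    esac
}
```

A script would typically capture the result with something like `data=$(read_form_data)` and then go on to decode it.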
So much for the input side. For output, the CGI program is more in
control. The server will usually worry about sending the initial status
line, but the CGI program must output the MIME headers, a blank line,
then a message body. At the very least, a "Content-type:" header must
be sent, so that the client will know what sort of message is to follow.
(The server doesn't know what the CGI program will send as output, so
it's up to the program to say so itself.) The message body can be a
simple text message, or a more elaborate document, either pulled from a
file or generated on the fly.
    #!/bin/sh

    # Start with MIME-format message header:
    echo "Content-type: text/plain"
    echo ""

    # Process information, and output message:
    date

When the server calls this script, it simply outputs the Content-type header and a blank line, then it invokes the UNIX date command to output a single line of text, containing the system date.
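The same pattern works when the message body is an HTML document generated on the fly. Here is a small variation on the date script, offered as a sketch; the function name emit_date_page and the HTML layout are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: a CGI response whose body is HTML built at request time.
# emit_date_page is a hypothetical helper name.
emit_date_page() {
    now=$(date)

    # MIME header first, then the mandatory blank line.
    echo "Content-type: text/html"
    echo ""

    # Message body, generated on the fly with a here-document.
    cat <<EOF
<html>
<head><title>Current date</title></head>
<body><p>The system date is: $now</p></body>
</html>
EOF
}
```

Calling `emit_date_page` as the last line of the script would send the complete response. Only the Content-type value and the body differ from the plain-text version.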
As far as the HTTP transactions are concerned, the request is the same as for a normal document file, except that the CGI program name is given:
    GET /~gedetil/cgi-bin/date.sh HTTP/1.0

The server's reply will consist of a few lines of its own, followed by the CGI program's output, verbatim:
    HTTP/1.0 200 Document follows
    Date: Mon, 25 Mar 1996 23:09:06 GMT
    Server: NCSA/1.5
    Content-type: text/plain

    Mon Mar 25 17:09:06 CST 1996

If we wanted to pass input to the script, we could have added it to the GET request, by adding a string such as "?query-string" on to the end of the file name. Everything after the "?" character would have been passed to the script in the QUERY_STRING environment variable.

Querysh will take care of determining whether a GET or POST method was
used, read the URL-encoded data from the appropriate source, decode it,
split the fields, and set the value of each field in an environment
variable (which is easy to process in a script).
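To give a feel for the decoding step, here is a rough sketch of the kind of work involved. It is not querysh's actual implementation, which this article does not show; urldecode and decode_fields are hypothetical names, and the sketch prints QSH_-prefixed name=value lines rather than setting variables.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from querysh.

# Decode one URL-encoded string: "+" means space, %XX is a byte in hex.
# Plus signs are handled first, so that an encoded "%2B" still becomes "+".
# Note: some awk versions treat codes above 127 as multi-byte characters.
urldecode() {
    printf '%s\n' "$1" | awk '
    BEGIN { hex = "0123456789abcdef" }
    {
        gsub(/\+/, " ")
        out = ""
        while (match($0, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            out = out substr($0, 1, RSTART - 1)
            h = tolower(substr($0, RSTART + 1, 2))
            code = (index(hex, substr(h, 1, 1)) - 1) * 16 \
                 + index(hex, substr(h, 2, 1)) - 1
            out = out sprintf("%c", code)
            $0 = substr($0, RSTART + 3)
        }
        printf "%s", out $0
    }'
}

# Split a query string on "&", split each pair at the first "=",
# and print one decoded QSH_name=value line per field.
decode_fields() {
    saved_ifs=$IFS
    IFS='&'
    set -f                      # no file name expansion while splitting
    for pair in $1; do
        name=${pair%%=*}
        value=${pair#*=}
        printf 'QSH_%s=%s\n' "$(urldecode "$name")" "$(urldecode "$value")"
    done
    set +f
    IFS=$saved_ifs
}
```

For instance, `decode_fields 'status=New&fullname=Gilbert+Detillieux'` would print one line for each of the two fields. Printing the pairs, instead of evaluating them as shell assignments, keeps the sketch from executing anything a user has typed.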
For example, given our earlier sample form, the URL-encoded string would be something like this:
    status=New&membnum=&fullname=Gilbert+Detillieux&address=123+Mulberry+Lane%0D%0AWinnipeg%2C+MB%0D%0AR3R+3R3&phone=%28204%29+555-1212&email=firstname.lastname@example.org

Querysh would decode this, and set the following variables for the script to use:
    QSH_Fields="QSH_status QSH_membnum QSH_fullname QSH_address QSH_phone QSH_email"
    QSH_status="New"
    QSH_membnum=""
    QSH_fullname="Gilbert Detillieux"
    QSH_address="123 Mulberry Lane[CRLF]Winnipeg, MB[CRLF]R3R 3R3"
    QSH_phone="(204) 555-1212"
    QSH_email="firstname.lastname@example.org"

Here, "[CRLF]" is used to denote an actual CR/LF pair of characters in the string. With the form data in this format, we're all set to write a script.
    #!/usr/local/etc/httpd/querysh
    #?/bin/sh

    if [ -z "$QSH_Fields" ]
    then
        # No form fields given -- send empty form
        cat <<!
    Content-type: text/html

    ... HTML for an empty form ...
    !
        exit
    fi

    ... process the form data ...

    cat <<!
    Content-type: text/html

    ... HTML for a status message ...
    !

The first couple lines are to let the system know where to find querysh, so it can be run, and to let querysh know what shell it should then run to interpret the rest of the script. The next part of the script checks to see if there are any fields that were set. (It's possible that a blank form was submitted, or that the script was invoked directly, without any form data being passed.) In that case, a sensible thing to do might be to simply return an HTML document containing a blank form to be filled out.
The script then goes on to do some processing, based on the submitted data, and finally, it should output something back to the client, such as a document, a status message, or some output based on the request.
Treating form input with caution is particularly important in shell scripts, where this input may be passed along to other commands, or interpreted by the shell itself. A malicious user could pass along input containing special characters for the shell, and possibly even commands to be run by the shell. So, be careful how you use input data in your script, and make sure you check all fields thoroughly before you make use of them.
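One common way to check fields is to accept only values that match a strict pattern, and reject everything else. The helpers below are a sketch of that idea; is_number and is_plain_email are hypothetical names, and the allowed character sets are assumptions that a real script would adjust to its own fields.

```shell
#!/bin/sh
# Sketch: accept a field only if it matches a strict whitelist.
# is_number and is_plain_email are hypothetical helpers for illustration.

# Succeeds only for a non-empty string of decimal digits.
is_number() {
    case "$1" in
        '' | *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Succeeds only if the value has text on both sides of an "@" and contains
# no characters outside a small safe set. This is a sanity check, not a
# full e-mail address validation.
is_plain_email() {
    case "$1" in
        *[!A-Za-z0-9@._+-]*) return 1 ;;
        ?*@?*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A script might then test `is_number "$QSH_membnum"` before passing the value to any other command. Quoting every variable expansion, as in that call, also keeps the shell from splitting or expanding the value.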
This article is copyrighted by MUUG and the specific author(s). You are granted permission to duplicate it for non-commercial purposes only, provided it is not modified and includes this copyright notice as well as all author credits and attributions.