Sybernet / Supplied Packages Reference
Release 3.00
June 15, 2002
backwards forwards

HTTP.DOCUMENT

The DOCUMENT package provides subprograms for operating on URL's. With it, you can use PL/SQL to open a URL, parse its contents and display anything you may find useful in the process.

The package specification and package body are available to public. You are encouraged to view the actual source when in doubt on any of these functions or procedures.

Many of the functions and procedures in this package are overloaded. That means (for example) that you can call different flavors of OPEN. In all cases, the first syntax is the easiest to use. The other forms are only important (and can otherwise be ignored) when you need more functionality than what this package allows.

Constants

DOCUMENT defines the following constants:

DOCUMENT_FOLLOWS    constant    int:=200;           -- Document Follows.
REDIRECT            constant    int:=302;           -- Redirect.
USE_LOCAL_COPY      constant    int:=304;           -- Use local copy.
BAD_REQUEST         constant    int:=400;           -- Bad request.
AUTH_REQUIRED       constant    int:=401;           -- Authorization required.
FORBIDDEN           constant    int:=403;           -- Forbidden.
NOT_FOUND           constant    int:=404;           -- Not found.
SERVER_ERROR        constant    int:=500;           -- Server error.
NOT_IMPLEMENTED     constant    int:=501;           -- Not implemented.

Globals

DOCUMENT defines the following global variables:

CONTENT_LENGTH                  int;                -- size of response
CONTENT_TYPE                    varchar(30);        -- text/html (for example)
DISPOSITION                     varchar(255);       -- inline / attachment 'filename=filename'
COOKIE                          varchar(255);       -- cookie
RESPONSE                        clob;               -- output buffer

Summary of Subprograms

Subprogram Description
CHARINDEX function
 
Searches for a pattern.
CLOSE procedure
 
Closes a URL.
GETC function
 
Retrieves a character from a URL.
GETS function
 
Retrieves a string from a URL.
OPEN function
 
Opens a URL.
STATUS_CODE function
 
Retrieves the STATUS_CODE associated with this URL.
STRINDEX function
 
Searches for a pattern and retrieves the string index.
SUBSTRING function
 
Retrieves a substr() from a URL.
WRITE procedure
 
Displays the current URL on your browser.

OPEN function

This function opens a URL and stores it as a character LOB inside of the DOCUMENT package.

When you call OPEN, the specified URL is retrieved. If successful, the following global variables are assigned:

CONTENT_LENGTH represents the number of characters read.

CONTENT_TYPE represents the type of document retrieved. If the CONTENT_TYPE is neither text/html nor text/plain, the CLOB (RESPONSE) is Ascii encoded.

DISPOSITION represents the disposition for this document. Normally (and by default), the DISPOSITION will be inline. If this is suppose to be an attachment, then that is what is returned instead.

COOKIE represents the HTTP Cookie returned by this site (if any). Although you cannot pass cookies back with DOCUMENT.OPEN (Syntax 1), you can do this by calling function DOCUMENT.OPEN (Syntax 2 or 3).

Syntax 1

HTTP.DOCUMENT.OPEN
(
    URL             IN VARCHAR2
)
RETURN BINARY_INTEGER;

Parameters

Parameter Description
url
 
Universal resource locator.

Syntax 2

HTTP.DOCUMENT.OPEN
(
    HTTPS           IN BOOLEAN        := FALSE
,   SERVER_NAME     IN VARCHAR2
,   PORT            IN BINARY_INTEGER := 80
,   PATH_INFO       IN VARCHAR2       := NULL
,   QUERY_STRING    IN VARCHAR2       := NULL
,   REMOTE_HOST     IN VARCHAR2       := NULL
,   REQUEST_METHOD  IN VARCHAR2       := NULL
,   HTTP_USER_AGENT IN VARCHAR2       := NULL
,   HTTP_COOKIE     IN VARCHAR2       := NULL
,   POST            IN VARCHAR2       := NULL
)
RETURN BINARY_INTEGER;

Parameters

Parameter Description
HTTPS
 
HTTP (FALSE) or HTTPS (TRUE). TRUE is currently not supported and OPEN will return BAD_REQUEST if you attempt to do so.
SERVER_NAME
 
The host or IP address of the server you are contacting.
PORT
 
(optional) The port number for this host.
PATH_INFO
 
(optional) The path to the specified file or script (CGI).
QUERY_STRING
 
(optional) A comma separated list of parameters passed to this script.
REMOTE_HOST
 
(optional) This determines your IP address.
REQUEST_METHOD
 
(optional) Valid values are GET or POST. If you specify POST, then you also want to pass the post request in the POST parameter.
HTTP_USER_AGENT
 
(optional) If you want to pass the value of the user agent, use this parameter.
HTTP_COOKIE
 
(optional) This is the cookie that is passed to this SERVER_NAME. Normally cookies are not passed to a web server unless you want to return the cookie (HTTP.DOCUMENT.COOKIE) that was assigned by a prior call to OPEN.
POST
 
(optional) If REQUEST_METHOD is POST, then this parameter determines the contents of your post request.

Syntax 3

HTTP.DOCUMENT.OPEN
(
    SERVER_NAME     IN  VARCHAR2
,   PORT            IN  BINARY_INTEGER := 80
,   PATH_INFO       IN  VARCHAR2       := NULL
,   QUERY_STRING    IN  VARCHAR2       := NULL
,   REMOTE_HOST     IN  VARCHAR2       := NULL
,   REQUEST_METHOD  IN  VARCHAR2       := NULL
,   HTTP_USER_AGENT IN  VARCHAR2       := NULL
,   HTTP_COOKIE     IN  VARCHAR2       := NULL
,   POST            IN  VARCHAR2       := NULL
,   CONTENT_LENGTH  OUT BINARY_INTEGER
,   CONTENT_TYPE    OUT VARCHAR2
,   DISPOSITION     OUT VARCHAR2
,   COOKIE          OUT VARCHAR2
,   RESPONSE        OUT CLOB
)
RETURN BINARY_INTEGER;

Parameters

Parameter Description
SERVER_NAME
 
The host or IP address of the server you are contacting.
PORT
 
(optional) The port number for this host.
PATH_INFO
 
(optional) The path to the specified file or script (CGI).
QUERY_STRING
 
(optional) A comma separated list of parameters passed to this script.
REMOTE_HOST
 
(optional) This determines your IP address.
REQUEST_METHOD
 
(optional) Valid values are GET or POST. If you specify POST, then you also want to pass the post request in the POST parameter.
HTTP_USER_AGENT
 
(optional) If you want to pass the value of the user agent, use this parameter.
HTTP_COOKIE
 
(optional) This is the cookie that is passed to this SERVER_NAME. Normally cookies are not passed to a web server unless you want to return the cookie (HTTP.DOCUMENT.COOKIE) that was assigned by a prior call to OPEN.
POST
 
(optional) If REQUEST_METHOD is POST, then this parameter determines the contents of your post request.
CONTENT_LENGTH
 
The number of characters returned in this document.
CONTENT_TYPE
 
The document type of this document.
DISPOSITION
 
The disposition of this document. Normally, DISPOSITION is inline. If this is an attachment, the attachment name is included.
COOKIE
 
The cookie returned by this call to OPEN.
RESPONSE
 
The lob locator for this document.

Returns

OPEN returns the corresponding status code for this URL. If the call to OPEN is successful, the status code will be DOCUMENT_FOLLOWS, which means you can use this package to manipulate this information. Any other result constitutes an error of some sort. Whether the result is successful or not, it is important that you call CLOSE to deallocate the internal CLOB that is created.

OPEN is an overloaded function. The easiest way to call OPEN is to simply pass a URL. That version of OPEN parses the URL and calls the second form of this function. The third form (called by the second) can be used externally if you need to declare the CLOB internally.

Example

The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL:

DECLARE 
    VAR BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        HTTP.DOCUMENT.WRITE(VAR);
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

CLOSE procedure

This procedure closes the character LOB you created by calling OPEN. CLOSE doesn't really close the URL because that is done when the LOB was created in OPEN.

Syntax 1

HTTP.DOCUMENT.CLOSE;

Parameters

None.

Syntax 2

HTTP.DOCUMENT.CLOSE (RESPONSE CLOB);

Parameters

Parameter Description
response
 
The output CLOB returned by calling function HTTP.OPEN (Syntax 3)

CLOSE is an overloaded procedure. When using the DOCUMENT package, use the first method. If you are calling HTTP.OPEN (Syntax 3) directly, then you can use the second method to delete your lob locator.

Returns

None.

Example

The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL:

DECLARE 
    VAR BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        HTTP.DOCUMENT.WRITE(VAR);
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

GETC function

This function retrieves the next character from an opened URL. You can also use this function to index any position in a URL. The index (if specified) is zero-relative. If the index specified is out of bounds or GETC is called after the last character has been retrieved, a NULL char is returned.

Syntax

HTTP.DOCUMENT.GETC(I BINARY_INTEGER:=POSITION) RETURN BINARY_INTEGER;

Parameters

Parameter Description
I
 
(optional) character index.

Returns

GETC returns the next character from this URL or the character at a specified index. NULL is returned if the index (implied or not) is invalid.

Example

The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETC behaves.

DECLARE 
    VAR BINARY_INTEGER;
    C CHAR;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        LOOP
            C:=HTTP.DOCUMENT.GETC();
            EXIT WHEN C IS NULL;
            HTTP.WRITE(C);
        END LOOP;
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

Example

The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETC behaves.

DECLARE 
    VAR BINARY_INTEGER;
    I BINARY_INTEGER :=0;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        WHILE (I < HTTP.DOCUMENT.CONTENT_LENGTH) LOOP
            HTTP.WRITE(HTTP.DOCUMENT.GETC(I));
            I:=I+1;
        END LOOP;
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

GETS function

This function retrieves the next string from an opened URL. You can also use this function to index any string in a URL. The index (if specified) is zero-relative. If the index specified is out of bounds or GETS is called after the last string has been retrieved, a NULL string is returned.

Syntax

HTTP.DOCUMENT.GETS(I BINARY_INTEGER:=POSITION) RETURN VARCHAR2;

Parameters

Parameter Description
I
 
(optional) string index.

Returns

GETS returns the next string from this URL or the string at a specified index. NULL is returned if the index (implied or not) is invalid.

A string is any HTML tag or anything between two HTML tags. Consider the following HTML page:

<HTML>
<BODY>
Hello World
</BODY>
</HTML>
In this example there are actually 7 strings. Calling GETS on this page would return the following strings:
<HTML>         -- 1st tag
               -- line-feed
<BODY>         -- 2nd tag
Hello World    -- Hello World with a line-feed
</BODY>        -- 3rd tag
               -- line-feed
</HTML>        -- 4th tag

Example

The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETS behaves.

DECLARE 
    VAR BINARY_INTEGER;
    S VARCHAR(32767);
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        LOOP
            S:=HTTP.DOCUMENT.GETS();
            EXIT WHEN S IS NULL;
            HTTP.WRITE(S);
        END LOOP;
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

SUBSTRING function

This function performs a substr() from the current URL.

Syntax

HTTP.DOCUMENT.SUBSTRING
(
    OFFSET          BINARY_INTEGER
,   AMOUNT          BINARY_INTEGER
)
RETURN VARCHAR2;

Parameters

Parameter Description
OFFSET
 
The one-relative offset or staring position.
AMOUNT
 
The number of characters to be retrieved.

Returns

SUBSTRING returns a VARCHAR2 from the current URL.

Example

The following block illustrates how to print the first 6 characters of our URL. Since this is an HTML tag, you may want to turn on HTMLEncode to see the results:

DECLARE 
    VAR BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        HTTP.WRITE(HTTP.DOCUMENT.SUBSTRING(1,6));
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

CHARINDEX function

This function performs a instr() from the current URL.

Syntax

HTTP.DOCUMENT.CHARINDEX
(
    TARGET          VARCHAR2
,   OFFSET          BINARY_INTEGER := 1
,   OCCURENCE       BINARY_INTEGER := 1
)
RETURN BINARY_INTEGER;

Parameters

Parameter Description
TARGET
 
Pattern to be tested for. The pattern is a character string. The maximum size of the pattern is 16383 bytes.
OFFSET
 
Absolute offset in characters at which the pattern matching is to start.
OCCURENCE
 
Occurrence number, starting at 1.

Returns

Offset of the start of the matched pattern in characters. It returns 0 if the pattern is not found.

Example

The following block illustrates how to find the offset of "Sybernet."

DECLARE 
    VAR BINARY_INTEGER;
    I BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        I:=HTTP.DOCUMENT.CHARINDEX('Sybernet');
        IF (I=0) THEN
            HTTP.WRITE('Not Found');
        ELSE
            HTTP.WRITE('Found at offset ',I);
        END IF;
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

STRINDEX function

This function searches for the string index of the specified target. While CHARINDEX returns the character position, STRINDEX returns the string index that could subsequently be used in a call to GETS. Although the search is case-sensitive, you can use Oracle's wild card characters (% and _).

Syntax

HTTP.DOCUMENT.STRINDEX
(
    TARGET          VARCHAR2
,   OFFSET          BINARY_INTEGER := 1
,   OCCURENCE       BINARY_INTEGER := 1
)
RETURN BINARY_INTEGER;

Parameters

Parameter Description
TARGET
 
Pattern to be tested for. The pattern is a character string.
OFFSET
 
Zero relative index at which the pattern matching is to start.
OCCURENCE
 
Occurrence number, starting at 1.

Returns

Index of the string that matches this pattern. It returns -1 if the pattern is not found.

Example

The following block illustrates how to find the opening stock value for the symbol UIS:

DECLARE 
    VAR BINARY_INTEGER;
    I BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://finance.lycos.com/home/stocks/quotes.asp?symbols=NYSE%3AUIS');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
        I:=HTTP.DOCUMENT.STRINDEX('Open');
        HTTP.WRITE(HTTP.DOCUMENT.GETS(i+6));
    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

Example

The following example searches the SRI white pages and displays everyone's name that contains denis. The names are in an HTML table, but this example knows that each row is preceded by a hidden input field and that the name is 4 strings after this tag:

DECLARE 
    VAR BINARY_INTEGER;
    I BINARY_INTEGER;
    OCCURENCE BINARY_INTEGER := 1;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://mis.sri.com/cgi-bin/Sybernet.cgi$http.dbo.sp_html_whitepages?filter=cgi,BUTTON=QUERY,query=denis');
    LOOP
        i:=HTTP.DOCUMENT.STRINDEX('<input type=hidden name=MAILTO%',1,OCCURENCE);
        EXIT WHEN i = -1;
        HTTP.WRITE(HTTP.DOCUMENT.GETS(i+4),'<br>');
        OCCURENCE:=OCCURENCE+1;
    END LOOP;
    HTTP.DOCUMENT.CLOSE();
END;

Example

The following example determines the traffic status to and on the Dumbarton Bridge:

DECLARE 
    VAR BINARY_INTEGER;
    I BINARY_INTEGER;
    S VARCHAR(2000);
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://www.kcbs.com/pages/kcbs/traffic/traffic_index.nsp');
    IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN

        I:=HTTP.DOCUMENT.STRINDEX('%dumbarton%');
        S:=HTTP.DOCUMENT.GETS(I+5);

        IF (UPPER(S) LIKE '%NO INCIDENTS REPORTED%') THEN

            HTTP.WRITELN('<img src="http://Sybernet.sri.com/oracle/thumbsup.gif">');
            HTTP.WRITELN('<br>');
            HTTP.WRITELN(S);

        ELSE

            HTTP.WRITELN('<img src="http://Sybernet.sri.com/oracle/smile.yuk.gif">');
            HTTP.WRITELN('<br>');

            WHILE (UPPER(S) != '<TD>') LOOP
                I:=I+1;
                S:=HTTP.DOCUMENT.GETS(I);
            END LOOP;

            WHILE (UPPER(S) != '</TD>') LOOP
                I:=I+1;
                S:=HTTP.DOCUMENT.GETS(I);
                IF (SUBSTR(S,1,1) != '<') THEN
                    HTTP.WRITELN(S);
                END IF;
            END LOOP;

        END IF;

    ELSE

        HTTP.WRITELN(HTTP.DOCUMENT.STATUS_CODE(VAR));

    END IF;
    HTTP.DOCUMENT.CLOSE();
END;

WRITE procedure

This procedure writes the current URL to your browser. Obviously, there is no reason on Earth to use this package to open a URL and then display it on your browser. It is simpler to just go to that URL directly. The important bit, however, is that WRITE does illustrate the most efficient way to display a URL.

When you call OPEN, the specified URL is retrieved. If successful, the following global variables are assigned:

WRITE ignores DISPOSITION and COOKIE, but does use CONTENT_TYPE to determine how this information is to be displayed. If text/html or text/plain, WRITE simply passes the current URL (a CLOB) to HTTP.HTTP_SELECT_CLOB. All other types are ASCII encoded while calling HTTP_SELECT_CLOB.

If the VAR result passed to WRITE is anything other than DOCUMENT_FOLLOWS, an appropriate error message is written instead.

Syntax 1

HTTP.DOCUMENT.WRITE (VAR BINARY_INTEGER);

Parameters

Parameter Description
VAR
 
The return result from calling OPEN.

Syntax 2

HTTP.DOCUMENT.WRITE (VAR BINARY_INTEGER, RESPONSE CLOB);

Parameters

Parameter Description
VAR
 
The return result from calling OPEN.
RESPONSE
 
The output CLOB returned by calling function HTTP.OPEN (Syntax 3)

Returns

None.

WRITE is an overloaded procedure. When using the DOCUMENT package, use the first method. If you are calling HTTP.OPEN (Syntax 3) directly. then you can use the second method to write your lob locator.

Example

The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL. While the previous examples tested for a return status of DOCUMENT_FOLLOWS, this isn't necessary because DOCUMENT.WRITE will do this as well and display an appropriate message for all return results:

DECLARE 
    VAR BINARY_INTEGER;
BEGIN
    VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
    HTTP.DOCUMENT.WRITE(VAR);
    HTTP.DOCUMENT.CLOSE();
END;

STATUS_CODE function

This function maps the return result returned by HTTP.DOCUMENT.OPEN to its English mnemonic value (or vice versa).

Syntax 1

HTTP.DOCUMENT.STATUS_CODE(VAR BINARY_INTEGER) RETURN VARCHAR2;

Parameters

Parameter Description
VAR
 
The return result returned by calling HTTP.DOCUMENT.OPEN.

Syntax 2

HTTP.DOCUMENT.STATUS_CODE(VAR VARCHAR2) RETURN BINARY_INTEGER;

Parameters

Parameter Description
VAR
 
A mnemonic value.

Returns

STATUS_CODE returns either the mnemonic value (a varchar) corresponding to the return result gotten from calling OPEN or the numeric value corresponding to same.

Example

The following example illustrates both flavors of STATUS_CODE():

BEGIN
    HTTP.WRITE(HTTP.DOCUMENT.STATUS_CODE(HTTP.DOCUMENT.DOCUMENT_FOLLOWS));
    HTTP.WRITE(HTTP.DOCUMENT.STATUS_CODE('DOCUMENT_FOLLOWS'));
END;

Sybernet is a trademark of SRI International.
Copyright © 1996-2008 SRI International. All Rights Reserved.
Denis D. Workman / http://Sybernet.sri.com/