|
Sybernet / Supplied Packages Reference
Release 3.00 June 15, 2002 |
|
The DOCUMENT package provides subprograms for operating on URL's. With it, you can use PL/SQL to open a URL, parse its contents and display anything you may find useful in the process.
The package specification and package body are available to public. You are encouraged to view the actual source when in doubt on any of these functions or procedures.
Many of the functions and procedures in this package are overloaded. That means (for example) that you can call different flavors of OPEN. In all cases, the first syntax is the easiest to use. The other forms are only important (and can otherwise be ignored) when you need more functionality than what this package allows.
DOCUMENT defines the following constants:
DOCUMENT_FOLLOWS constant int:=200; -- Document Follows. REDIRECT constant int:=302; -- Redirect. USE_LOCAL_COPY constant int:=304; -- Use local copy. BAD_REQUEST constant int:=400; -- Bad request. AUTH_REQUIRED constant int:=401; -- Authorization required. FORBIDDEN constant int:=403; -- Forbidden. NOT_FOUND constant int:=404; -- Not found. SERVER_ERROR constant int:=500; -- Server error. NOT_IMPLEMENTED constant int:=501; -- Not implemented.
DOCUMENT defines the following global variables:
CONTENT_LENGTH int; -- size of response CONTENT_TYPE varchar(30); -- text/html (for example) DISPOSITION varchar(255); -- inline / attachment 'filename=filename' COOKIE varchar(255); -- cookie RESPONSE clob; -- output buffer
| Subprogram | Description |
|---|---|
CHARINDEX function |
Searches for a pattern. |
CLOSE procedure |
Closes a URL. |
GETC function |
Retrieves a character from a URL. |
GETS function |
Retrieves a string from a URL. |
OPEN function |
Opens a URL. |
STATUS_CODE function |
Retrieves the STATUS_CODE associated with this URL. |
STRINDEX function |
Searches for a pattern and retrieves the string index. |
SUBSTRING function |
Retrieves a substr() from a URL. |
WRITE procedure |
Displays the current URL on your browser. |
This function opens a URL and stores it as a character LOB inside of the DOCUMENT package.
When you call OPEN, the specified URL is retrieved. If successful, the following global variables are assigned:
CONTENT_LENGTH represents the number of characters read.
CONTENT_TYPE represents the type of document retrieved. If the CONTENT_TYPE is neither text/html nor text/plain, the CLOB (RESPONSE) is Ascii encoded.
DISPOSITION represents the disposition for this document. Normally (and by default), the DISPOSITION will be inline. If this is suppose to be an attachment, then that is what is returned instead.
COOKIE represents the HTTP Cookie returned by this site (if any). Although you cannot pass cookies back with DOCUMENT.OPEN (Syntax 1), you can do this by calling function DOCUMENT.OPEN (Syntax 2 or 3).
HTTP.DOCUMENT.OPEN
(
URL IN VARCHAR2
)
RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
url |
Universal resource locator. |
HTTP.DOCUMENT.OPEN
(
HTTPS IN BOOLEAN := FALSE
, SERVER_NAME IN VARCHAR2
, PORT IN BINARY_INTEGER := 80
, PATH_INFO IN VARCHAR2 := NULL
, QUERY_STRING IN VARCHAR2 := NULL
, REMOTE_HOST IN VARCHAR2 := NULL
, REQUEST_METHOD IN VARCHAR2 := NULL
, HTTP_USER_AGENT IN VARCHAR2 := NULL
, HTTP_COOKIE IN VARCHAR2 := NULL
, POST IN VARCHAR2 := NULL
)
RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
HTTPS |
HTTP (FALSE) or HTTPS (TRUE). TRUE is currently not supported and OPEN will return BAD_REQUEST if you attempt to do so. |
SERVER_NAME |
The host or IP address of the server you are contacting. |
PORT |
(optional) The port number for this host. |
PATH_INFO |
(optional) The path to the specified file or script (CGI). |
QUERY_STRING |
(optional) A comma separated list of parameters passed to this script. |
REMOTE_HOST |
(optional) This determines your IP address. |
REQUEST_METHOD |
(optional) Valid values are GET or POST. If you specify POST, then you also want to pass the post request in the POST parameter. |
HTTP_USER_AGENT |
(optional) If you want to pass the value of the user agent, use this parameter. |
HTTP_COOKIE |
(optional) This is the cookie that is passed to this SERVER_NAME. Normally cookies are not passed to a web server unless you want to return the cookie (HTTP.DOCUMENT.COOKIE) that was assigned by a prior call to OPEN. |
POST |
(optional) If REQUEST_METHOD is POST, then this parameter determines the contents of your post request. |
HTTP.DOCUMENT.OPEN
(
SERVER_NAME IN VARCHAR2
, PORT IN BINARY_INTEGER := 80
, PATH_INFO IN VARCHAR2 := NULL
, QUERY_STRING IN VARCHAR2 := NULL
, REMOTE_HOST IN VARCHAR2 := NULL
, REQUEST_METHOD IN VARCHAR2 := NULL
, HTTP_USER_AGENT IN VARCHAR2 := NULL
, HTTP_COOKIE IN VARCHAR2 := NULL
, POST IN VARCHAR2 := NULL
, CONTENT_LENGTH OUT BINARY_INTEGER
, CONTENT_TYPE OUT VARCHAR2
, DISPOSITION OUT VARCHAR2
, COOKIE OUT VARCHAR2
, RESPONSE OUT CLOB
)
RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
SERVER_NAME |
The host or IP address of the server you are contacting. |
PORT |
(optional) The port number for this host. |
PATH_INFO |
(optional) The path to the specified file or script (CGI). |
QUERY_STRING |
(optional) A comma separated list of parameters passed to this script. |
REMOTE_HOST |
(optional) This determines your IP address. |
REQUEST_METHOD |
(optional) Valid values are GET or POST. If you specify POST, then you also want to pass the post request in the POST parameter. |
HTTP_USER_AGENT |
(optional) If you want to pass the value of the user agent, use this parameter. |
HTTP_COOKIE |
(optional) This is the cookie that is passed to this SERVER_NAME. Normally cookies are not passed to a web server unless you want to return the cookie (HTTP.DOCUMENT.COOKIE) that was assigned by a prior call to OPEN. |
POST |
(optional) If REQUEST_METHOD is POST, then this parameter determines the contents of your post request. |
CONTENT_LENGTH |
The number of characters returned in this document. |
CONTENT_TYPE |
The document type of this document. |
DISPOSITION |
The disposition of this document. Normally, DISPOSITION is inline. If this is an attachment, the attachment name is included. |
COOKIE |
The cookie returned by this call to OPEN. |
RESPONSE |
The lob locator for this document. |
OPEN returns the corresponding status code for this URL. If the call to OPEN is successful, the status code will be DOCUMENT_FOLLOWS, which means you can use this package to manipulate this information. Any other result constitutes an error of some sort. Whether the result is successful or not, it is important that you call CLOSE to deallocate the internal CLOB that is created.
OPEN is an overloaded function. The easiest way to call OPEN is to simply pass a URL. That version of OPEN parses the URL and calls the second form of this function. The third form (called by the second) can be used externally if you need to declare the CLOB internally.
The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL:
DECLARE
VAR BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
HTTP.DOCUMENT.WRITE(VAR);
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This procedure closes the character LOB you created by calling OPEN. CLOSE doesn't really close the URL because that is done when the LOB was created in OPEN.
HTTP.DOCUMENT.CLOSE;
HTTP.DOCUMENT.CLOSE (RESPONSE CLOB);
| Parameter | Description |
|---|---|
response |
The output CLOB returned by calling function HTTP.OPEN (Syntax 3) |
CLOSE is an overloaded procedure. When using the DOCUMENT package, use the first method. If you are calling HTTP.OPEN (Syntax 3) directly, then you can use the second method to delete your lob locator.
None.
The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL:
DECLARE
VAR BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
HTTP.DOCUMENT.WRITE(VAR);
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This function retrieves the next character from an opened URL. You can also use this function to index any position in a URL. The index (if specified) is zero-relative. If the index specified is out of bounds or GETC is called after the last character has been retrieved, a NULL char is returned.
HTTP.DOCUMENT.GETC(I BINARY_INTEGER:=POSITION) RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
I |
(optional) character index. |
GETC returns the next character from this URL or the character at a specified index. NULL is returned if the index (implied or not) is invalid.
The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETC behaves.
DECLARE
VAR BINARY_INTEGER;
C CHAR;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
LOOP
C:=HTTP.DOCUMENT.GETC();
EXIT WHEN C IS NULL;
HTTP.WRITE(C);
END LOOP;
END IF;
HTTP.DOCUMENT.CLOSE();
END;
The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETC behaves.
DECLARE
VAR BINARY_INTEGER;
I BINARY_INTEGER :=0;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
WHILE (I < HTTP.DOCUMENT.CONTENT_LENGTH) LOOP
HTTP.WRITE(HTTP.DOCUMENT.GETC(I));
I:=I+1;
END LOOP;
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This function retrieves the next string from an opened URL. You can also use this function to index any string in a URL. The index (if specified) is zero-relative. If the index specified is out of bounds or GETS is called after the last string has been retrieved, a NULL string is returned.
HTTP.DOCUMENT.GETS(I BINARY_INTEGER:=POSITION) RETURN VARCHAR2;
| Parameter | Description |
|---|---|
I |
(optional) string index. |
GETS returns the next string from this URL or the string at a specified index. NULL is returned if the index (implied or not) is invalid.
A string is any HTML tag or anything between two HTML tags. Consider the following HTML page:
<HTML> <BODY> Hello World </BODY> </HTML>In this example there are actually 7 strings. Calling GETS on this page would return the following strings:
<HTML> -- 1st tag
-- line-feed
<BODY> -- 2nd tag
Hello World -- Hello World with a line-feed
</BODY> -- 3rd tag
-- line-feed
</HTML> -- 4th tag
The following block illustrates how you might print an entire URL. This is clearly not the most efficient way to print a URL, but does illustrate how GETS behaves.
DECLARE
VAR BINARY_INTEGER;
S VARCHAR(32767);
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
LOOP
S:=HTTP.DOCUMENT.GETS();
EXIT WHEN S IS NULL;
HTTP.WRITE(S);
END LOOP;
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This function performs a substr() from the current URL.
HTTP.DOCUMENT.SUBSTRING
(
OFFSET BINARY_INTEGER
, AMOUNT BINARY_INTEGER
)
RETURN VARCHAR2;
| Parameter | Description |
|---|---|
OFFSET |
The one-relative offset or staring position. |
AMOUNT |
The number of characters to be retrieved. |
SUBSTRING returns a VARCHAR2 from the current URL.
The following block illustrates how to print the first 6 characters of our URL. Since this is an HTML tag, you may want to turn on HTMLEncode to see the results:
DECLARE
VAR BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
HTTP.WRITE(HTTP.DOCUMENT.SUBSTRING(1,6));
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This function performs a instr() from the current URL.
HTTP.DOCUMENT.CHARINDEX
(
TARGET VARCHAR2
, OFFSET BINARY_INTEGER := 1
, OCCURENCE BINARY_INTEGER := 1
)
RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
TARGET |
Pattern to be tested for. The pattern is a character string. The maximum size of the pattern is 16383 bytes. |
OFFSET |
Absolute offset in characters at which the pattern matching is to start. |
OCCURENCE |
Occurrence number, starting at 1. |
Offset of the start of the matched pattern in characters. It returns 0 if the pattern is not found.
The following block illustrates how to find the offset of "Sybernet."
DECLARE
VAR BINARY_INTEGER;
I BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
I:=HTTP.DOCUMENT.CHARINDEX('Sybernet');
IF (I=0) THEN
HTTP.WRITE('Not Found');
ELSE
HTTP.WRITE('Found at offset ',I);
END IF;
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This function searches for the string index of the specified target. While CHARINDEX returns the character position, STRINDEX returns the string index that could subsequently be used in a call to GETS. Although the search is case-sensitive, you can use Oracle's wild card characters (% and _).
HTTP.DOCUMENT.STRINDEX
(
TARGET VARCHAR2
, OFFSET BINARY_INTEGER := 1
, OCCURENCE BINARY_INTEGER := 1
)
RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
TARGET |
Pattern to be tested for. The pattern is a character string. |
OFFSET |
Zero relative index at which the pattern matching is to start. |
OCCURENCE |
Occurrence number, starting at 1. |
Index of the string that matches this pattern. It returns -1 if the pattern is not found.
The following block illustrates how to find the opening stock value for the symbol UIS:
DECLARE
VAR BINARY_INTEGER;
I BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://finance.lycos.com/home/stocks/quotes.asp?symbols=NYSE%3AUIS');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
I:=HTTP.DOCUMENT.STRINDEX('Open');
HTTP.WRITE(HTTP.DOCUMENT.GETS(i+6));
END IF;
HTTP.DOCUMENT.CLOSE();
END;
The following example searches the SRI white pages and displays everyone's name that contains denis. The names are in an HTML table, but this example knows that each row is preceded by a hidden input field and that the name is 4 strings after this tag:
DECLARE
VAR BINARY_INTEGER;
I BINARY_INTEGER;
OCCURENCE BINARY_INTEGER := 1;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://mis.sri.com/cgi-bin/Sybernet.cgi$http.dbo.sp_html_whitepages?filter=cgi,BUTTON=QUERY,query=denis');
LOOP
i:=HTTP.DOCUMENT.STRINDEX('<input type=hidden name=MAILTO%',1,OCCURENCE);
EXIT WHEN i = -1;
HTTP.WRITE(HTTP.DOCUMENT.GETS(i+4),'<br>');
OCCURENCE:=OCCURENCE+1;
END LOOP;
HTTP.DOCUMENT.CLOSE();
END;
The following example determines the traffic status to and on the Dumbarton Bridge:
DECLARE
VAR BINARY_INTEGER;
I BINARY_INTEGER;
S VARCHAR(2000);
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://www.kcbs.com/pages/kcbs/traffic/traffic_index.nsp');
IF (VAR = HTTP.DOCUMENT.DOCUMENT_FOLLOWS) THEN
I:=HTTP.DOCUMENT.STRINDEX('%dumbarton%');
S:=HTTP.DOCUMENT.GETS(I+5);
IF (UPPER(S) LIKE '%NO INCIDENTS REPORTED%') THEN
HTTP.WRITELN('<img src="http://Sybernet.sri.com/oracle/thumbsup.gif">');
HTTP.WRITELN('<br>');
HTTP.WRITELN(S);
ELSE
HTTP.WRITELN('<img src="http://Sybernet.sri.com/oracle/smile.yuk.gif">');
HTTP.WRITELN('<br>');
WHILE (UPPER(S) != '<TD>') LOOP
I:=I+1;
S:=HTTP.DOCUMENT.GETS(I);
END LOOP;
WHILE (UPPER(S) != '</TD>') LOOP
I:=I+1;
S:=HTTP.DOCUMENT.GETS(I);
IF (SUBSTR(S,1,1) != '<') THEN
HTTP.WRITELN(S);
END IF;
END LOOP;
END IF;
ELSE
HTTP.WRITELN(HTTP.DOCUMENT.STATUS_CODE(VAR));
END IF;
HTTP.DOCUMENT.CLOSE();
END;
This procedure writes the current URL to your browser. Obviously, there is no reason on Earth to use this package to open a URL and then display it on your browser. It is simpler to just go to that URL directly. The important bit, however, is that WRITE does illustrate the most efficient way to display a URL.
When you call OPEN, the specified URL is retrieved. If successful, the following global variables are assigned:
WRITE ignores DISPOSITION and COOKIE, but does use CONTENT_TYPE to determine how this information is to be displayed. If text/html or text/plain, WRITE simply passes the current URL (a CLOB) to HTTP.HTTP_SELECT_CLOB. All other types are ASCII encoded while calling HTTP_SELECT_CLOB.
If the VAR result passed to WRITE is anything other than DOCUMENT_FOLLOWS, an appropriate error message is written instead.
HTTP.DOCUMENT.WRITE (VAR BINARY_INTEGER);
| Parameter | Description |
|---|---|
VAR |
The return result from calling OPEN. |
HTTP.DOCUMENT.WRITE (VAR BINARY_INTEGER, RESPONSE CLOB);
| Parameter | Description |
|---|---|
VAR |
The return result from calling OPEN. |
RESPONSE |
The output CLOB returned by calling function HTTP.OPEN (Syntax 3) |
None.
WRITE is an overloaded procedure. When using the DOCUMENT package, use the first method. If you are calling HTTP.OPEN (Syntax 3) directly. then you can use the second method to write your lob locator.
The following block retrieves the default home page from Sybernet.sri.com. It prints the entire page by calling DOCUMENT.WRITE() and then closes this URL. While the previous examples tested for a return status of DOCUMENT_FOLLOWS, this isn't necessary because DOCUMENT.WRITE will do this as well and display an appropriate message for all return results:
DECLARE
VAR BINARY_INTEGER;
BEGIN
VAR:=HTTP.DOCUMENT.OPEN('http://Sybernet.sri.com/default.html');
HTTP.DOCUMENT.WRITE(VAR);
HTTP.DOCUMENT.CLOSE();
END;
This function maps the return result returned by HTTP.DOCUMENT.OPEN to its English mnemonic value (or vice versa).
HTTP.DOCUMENT.STATUS_CODE(VAR BINARY_INTEGER) RETURN VARCHAR2;
| Parameter | Description |
|---|---|
VAR |
The return result returned by calling HTTP.DOCUMENT.OPEN. |
HTTP.DOCUMENT.STATUS_CODE(VAR VARCHAR2) RETURN BINARY_INTEGER;
| Parameter | Description |
|---|---|
VAR |
A mnemonic value. |
STATUS_CODE returns either the mnemonic value (a varchar) corresponding to the return result gotten from calling OPEN or the numeric value corresponding to same.
The following example illustrates both flavors of STATUS_CODE():
BEGIN
HTTP.WRITE(HTTP.DOCUMENT.STATUS_CODE(HTTP.DOCUMENT.DOCUMENT_FOLLOWS));
HTTP.WRITE(HTTP.DOCUMENT.STATUS_CODE('DOCUMENT_FOLLOWS'));
END;