SNOBOL Pattern-Matching In Java

The SNOBOL language was developed at Bell Labs in the 1960s. Its statement syntax, though unique, is based on labels and GOTOs, which are best left in the 60s.

Its pattern-matching capabilities, on the other hand, surpass the regular expressions used for pattern-matching in modern languages, and are a lot more programmer-friendly.

This ability of the SNOBOL language is best illustrated by the artificial-intelligence program (Heuristic Analysis of Language) available from this web-site. Its first, simpler version was implemented in SNOBOL, and its source-code was only a page and a half long!

Being accustomed to SNOBOL pattern-matching, I found pattern-matching with Java regular expressions needlessly difficult.

Wouldn't it be nice, I thought, if I could have the full generality of SNOBOL pattern-matching within an object-oriented, Java application?

Having already created the compiler (developed to accomplish earlier tasks), I began to seriously consider making this possible. In 2005, I not only adapted the compiler, but also developed the pattern-matching code, and even an Integrated Development Environment (IDE) for coding and testing SNOBOL patterns.

That IDE also generates the Java source-code statements for putting a pattern into a Java program.

I demonstrated this software (which I call JSnobolApp) when I worked for Intermountain HealthCare, and after that, the software languished for several years.

Now that I am making software I developed available on the Internet, I took this project up again at the end of 2015. The result of that more recent work appears on this web-site.

What Is JSnobol

The heart of JSnobol is a class named “JSnobol”, which contains the vast majority of the code. It is available in binary form, as a JAR-file, which can be added to your Java project's build-path.

The full version of the application is available in binary and source-code form (terms negotiated), by sending an e-mail to:

aere@aeresrealm.com

The JSnobol class was created using information from the manual I used in college:

The Snobol4 Programming Language, by R.E. Griswold, J.F. Poage, and I.P. Polonsky, printing number 13-815357-4. Mine is not the most recent version of that manual.

A more recent version of the manual is available at the following link:

Snobol 4 Programming Language Manual

Since you need to know something about the JSnobol class in order to develop JSnobol patterns, it is briefly documented here.

You define (and compile) SNOBOL patterns by calling JSnobol's setPattern(name, patternText) method.

Before pattern matching begins, you can (optionally) set variables used during the pattern-match by calling JSnobol's setVariable(name, value) method.

To perform pattern matching, you call either JSnobol's match(stringToMatch, patternName) method (for normal pattern-matching), or its match(stringBufferToMatch, patternName, replacementString) method for match-and-replace pattern-matching.

After pattern-matching, variables assigned during pattern-match may be queried by calls to JSnobol's getVariable(name) method.

In a nut-shell, that's how you use it.
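The call sequence above can be sketched in Java. The source here names the four methods but not their parameter or return types, so everything in the sketch is an assumption. It uses a hypothetical PatternEngine interface that copies only the method names, and a toy RegexStandIn class backed by java.util.regex so that it runs without the JAR. The stand-in treats the pattern text as a Java regular expression, not as SNOBOL syntax, and the variable name “matched” is made up for illustration. It shows the order of calls, not the behavior of the real class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical interface: method NAMES come from the text above;
// parameter and return types are assumptions and may differ in JSnobol.
interface PatternEngine {
    void setPattern(String name, String patternText);
    void setVariable(String name, String value);
    boolean match(String stringToMatch, String patternName);
    boolean match(StringBuffer stringBufferToMatch, String patternName, String replacementString);
    String getVariable(String name);
}

// Toy stand-in so the sketch runs. It is NOT JSnobol: it compiles the
// pattern text as a java.util.regex pattern instead of SNOBOL syntax.
class RegexStandIn implements PatternEngine {
    private final Map<String, Pattern> patterns = new HashMap<>();
    private final Map<String, String> variables = new HashMap<>();

    public void setPattern(String name, String patternText) {
        patterns.put(name, Pattern.compile(patternText));
    }

    public void setVariable(String name, String value) {
        variables.put(name, value);
    }

    public boolean match(String stringToMatch, String patternName) {
        Matcher m = patterns.get(patternName).matcher(stringToMatch);
        if (!m.find()) {
            return false;
        }
        // Hypothetical variable standing in for an assignment made during the match.
        variables.put("matched", m.group());
        return true;
    }

    public boolean match(StringBuffer stringBufferToMatch, String patternName, String replacementString) {
        Matcher m = patterns.get(patternName).matcher(stringBufferToMatch);
        if (!m.find()) {
            return false;
        }
        // Assumption: match-and-replace edits the buffer in place.
        stringBufferToMatch.replace(m.start(), m.end(), replacementString);
        return true;
    }

    public String getVariable(String name) {
        return variables.get(name);
    }
}

class WorkflowSketch {
    public static void main(String[] args) {
        PatternEngine engine = new RegexStandIn();

        engine.setPattern("Word", "bea?r?ds?");           // 1. define (and compile) a pattern
        engine.setVariable("imageNum", "0");              // 2. optionally preset variables
        boolean ok = engine.match("two beards", "Word");  // 3. plain match
        System.out.println(ok + " / " + engine.getVariable("matched")); // 4. query variables

        StringBuffer text = new StringBuffer("two beards");
        engine.match(text, "Word", "roots");              // match-and-replace variant
        System.out.println(text);
    }
}
```

With the real JAR on the build-path, the same four steps would be made against the JSnobol class itself, using whatever signatures its documentation gives.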

For more details of the JSnobol class, click on the web-page link below:

The JSnobol Class

Extensions To The SNOBOL Language

There are a few extensions to the SNOBOL language, which make things easier:

Text Specification

Non-space text, within patterns being defined, need not be specified with quotes surrounding it. If such text is the name of a pattern, a reference to that pattern will be generated. If it is the name of a variable (a value has been assigned to it) the value of the variable will be used. Otherwise, the text itself will be used.

This doesn't apply to text that affects the syntax of the language, which must always be enclosed in quotation marks.

If you don't want text interpreted as the name of a pattern (or the name of a variable), simply enclose it in quotation marks, and it will be treated simply as a literal string.
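The lookup order described above (pattern name first, then variable, then literal text) can be pictured with a small Java sketch. The class, enum, and method names below are hypothetical and written only for illustration; this is not the code inside the JSnobol compiler.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: mimics the resolution order described in the text.
// None of these names come from the JSnobol class.
class TokenResolution {
    enum Kind { PATTERN_REFERENCE, VARIABLE_VALUE, LITERAL }

    // True when the token is wrapped in matching single or double quotes.
    static boolean isQuoted(String token) {
        if (token.length() < 2) {
            return false;
        }
        char first = token.charAt(0);
        char last = token.charAt(token.length() - 1);
        return first == last && (first == '\'' || first == '"');
    }

    static Kind classify(String token,
                         Map<String, String> patterns,
                         Map<String, String> variables) {
        if (isQuoted(token)) {
            return Kind.LITERAL;            // quoted text is always a literal string
        }
        if (patterns.containsKey(token)) {
            return Kind.PATTERN_REFERENCE;  // a defined pattern name wins first
        }
        if (variables.containsKey(token)) {
            return Kind.VARIABLE_VALUE;     // then a variable that has a value
        }
        return Kind.LITERAL;                // otherwise the text itself is used
    }

    public static void main(String[] args) {
        Map<String, String> patterns = new HashMap<>();
        patterns.put("Sep", "(\"\\n\" | \" \" | RPOS(0))");
        Map<String, String> variables = new HashMap<>();
        variables.put("imageNum", "0");

        for (String token : new String[] {"Sep", "imageNum", "bear", "'Sep'"}) {
            System.out.println(token + " -> " + classify(token, patterns, variables));
        }
    }
}
```

The last token shows the escape hatch: quoting “Sep” forces it to be treated as literal text even though a pattern of that name exists.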

As with SNOBOL, strings may be delimited either using single-quotes ('), or double-quotes (").

Defining Sub-Patterns

Anywhere within a pattern being specified, a sub-pattern can easily be defined.

Where you want the sub-pattern to begin, specify the sub-pattern's name, followed immediately by a colon (:), followed by a space. The sub-pattern can thereafter be referenced by its name, and consists of everything from its name to the end of the current parenthetical level (or the end of the pattern itself).

For example, consider the pattern:

(Sep: ("\n" | " " | RPOS(0)))

(LetterCombinations: ((be | bea | bear) (ds | d) | (ro | roo | roos) (ts | t)))

(ArbnoPat: (ARBNO(LetterCombinations $ OUTPUT $ image ?(imageNum = imageNum + 1) Sep)))

The sub-patterns “Sep”, “LetterCombinations”, and “ArbnoPat” are created, and can be used by name within the pattern-matching environment.
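For comparison, here is roughly how the alternation inside “LetterCombinations” might be written with java.util.regex. This is plain Java, not JSnobol code, and it covers only that one sub-pattern. The class and method names are made up for the example. Note that the regular expression has no counterpart to naming a sub-pattern in place and reusing it by name elsewhere.

```java
import java.util.regex.Pattern;

// Plain java.util.regex rendering of the LetterCombinations alternation.
// Class and method names are hypothetical; JSnobol is not involved here.
class LetterCombinationsRegex {
    static final Pattern LETTER_COMBINATIONS = Pattern.compile(
            "(?:be|bea|bear)(?:ds|d)|(?:ro|roo|roos)(?:ts|t)");

    // True when the whole word is one of the letter combinations.
    static boolean matchesWhole(String word) {
        return LETTER_COMBINATIONS.matcher(word).matches();
    }

    public static void main(String[] args) {
        for (String word : new String[] {"beards", "bed", "roost", "roots", "bears"}) {
            System.out.println(word + " -> " + matchesWhole(word));
        }
    }
}
```

Of the five words tried, only “bears” fails, since neither “ds” nor “d” can follow “bear” there.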

Upper And Lower Case

Though the SNOBOL compiler I used (on a Univac 1100 computer) was strictly upper-case, as was everything documented in my SNOBOL 4 programming manual, upper- and lower-case pattern-matching is now supported. The keyword “&CASESENSITIVE” has been defined to control case-sensitive (value 1) matching, or case-insensitive (value 0) matching.

Although it is not required (keywords and SNOBOL primitive functions can be either upper- or lower-case), it is good practice (and style) to specify SNOBOL primitive functions and keywords in upper-case.

Assignment Statements Within Patterns

SNOBOL-style assignment statements can be included in patterns, such as the “?(imageNum = imageNum + 1)” evaluated expression in the example above. Evaluated expressions are run in concatenation-mode, as opposed to match-mode, though the success or failure of the evaluated statement can affect the pattern-match.

The JSnobolApp Application

The Integrated Development Environment for developing and testing JSnobol patterns is called “JSnobolApp”, and is a Java application making use of the JSnobol class.

This application allows you to code and test JSnobol patterns in their original syntax, without worrying about supplying the various escape-sequences required within Java strings.

The IDE actually generates the Java code for you, which can then be cut-and-pasted into your Java program.
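To see why the escape-sequences are a nuisance, here is a small sketch that turns raw pattern text into a Java string literal by hand. The helper toJavaLiteral is hypothetical and written just for this page; it is not the generator inside the IDE, and the code the IDE really emits may look different.

```java
// Hypothetical helper showing the escaping that a hand-written Java
// string literal needs; this is not the IDE's actual code generator.
class JavaLiteralSketch {
    static String toJavaLiteral(String raw) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash becomes two backslashes
                case '"':  sb.append("\\\""); break; // double-quote gets a backslash
                case '\n': sb.append("\\n");  break; // real newline becomes \n
                case '\t': sb.append("\\t");  break; // real tab becomes \t
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The "Sep" sub-pattern as it would be typed in its original syntax.
        String raw = "(Sep: (\"\\n\" | \" \" | RPOS(0)))";
        System.out.println("pattern text : " + raw);
        System.out.println("Java literal : " + toJavaLiteral(raw));
    }
}
```

Even this one short sub-pattern picks up a backslash in front of every quote and a doubled backslash, which is exactly the clutter the IDE spares you.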

The application uses (and asks permission to initially create) a folder called “JSnobolGUI”, within your home folder. Everything you develop using the application will be stored in that folder, unless you browse to some other location to save or load it.

The documentation (help information) for the JSnobolApp can be accessed by clicking on the link below. Since the sand-box version can't fire-up your browser to look at it, you should browse there first yourself, and look it over.

JSnobolApp Help Information

Trying Out The Application

I planned to offer this application via Java Web Start, as with the other software on this web-site. Unfortunately, it appears there is no way to launch an application coded as a Java Single-Frame Application (having string resource-bundles) via Java Web Start.

A further problem with this plan is that a single-frame application cannot even be launched while following the Java sand-box security limitations.

Nevertheless, I did prepare a sand-box version (that does not violate the sand-box security limitations within the code I wrote). Unfortunately, Java cannot assure you of that (as it can using Java Web Start), so you have to take my word for it. At least, it is a signed (by Laeramin LLC) JAR-file.

Even in the Java sand-box version, there are several projects you can load and try out, including the test pattern used to test the application, and a pattern for matching Health Level 7 (HL7) messages.

If you've followed the instructions on the Software page for obtaining Java, you can try out the application now. It will run the 'sand-box' (few privileges) version, so it can't access resources on your machine (resulting in buttons being grayed-out, and having to manually browse to obtain the help info), but you can check out much of the application's functionality.

Since Java Web Start can't be used for it, you'll have to download it yourself.

For some browsers, you can simply click on the link below, choose the “Save File” option in the menu that appears, and click the “OK” button.

For other browsers, right-click on the link below, and choose “Save Link As” (or something similar) from the pop-up menu.

JSnobolApp 1.3 (sand-box version)

Once it is downloaded, you can (on Windows or Mac OS X) simply browse to where you downloaded it (your Downloads folder, if you use Firefox), and double-click on it.

On Linux, you need to create a desktop launcher for it, with the following statement to be executed:

java -jar JSnobolApp.jar

(You will need to prefix “JSnobolApp.jar” with the path to where you put the downloaded JAR-file, such as “java -jar Downloads/JSnobolApp.jar”.)

Alternatively, you could enter the above command (modified as necessary) in a Run dialog (from the menu), or type it in a terminal session (after cd-ing to the directory where the file is saved).

If you like what you see, the full version of the application is available in binary and source-code form (terms negotiated), by sending an e-mail to:

aere@aeresrealm.com

Give it a try – see what you think!

Other SNOBOL resources to check out:

http://www.snobol4.org


