SNOBOL Pattern-Matching In Java

The SNOBOL language was developed at Bell Labs in the 1960s. Its statement syntax, though unique, is based on labels and GOTOs, which are best left in the 60s.

Its pattern-matching capabilities, on the other hand, surpass the regular expressions used for pattern-matching in modern languages, and are a lot more programmer-friendly.

This ability is best illustrated by the fact that the first, simpler version of the artificial-intelligence program (Heuristic Analysis of Language), available from this web-site, was implemented in SNOBOL, and its source-code was only a page and a half long!

Having been accustomed to SNOBOL pattern-matching, I found doing the same work with Java regular-expressions needlessly difficult.

Wouldn't it be nice, I thought, if I could have the full generality of SNOBOL pattern-matching within an object-oriented Java application?

Having already created the compiler (developed to accomplish earlier tasks), I began to seriously consider making this possible, and in 2005 I not only adapted the compiler, but also developed the pattern-matching code, and even an Integrated Development Environment (IDE) for coding and testing SNOBOL patterns.

That IDE also generates the Java source-code statements for putting a pattern into a Java program.

I demonstrated this software (which I call JSnobolApp) when I worked for Intermountain HealthCare, and after that, the software languished for several years.

Now that I am making software I developed available on the Internet, I took this project up again at the end of 2015. The result of that more recent work appears on this web-site.

What Is JSnobol

The heart of JSnobol is a class named “JSnobol”, which contains the vast majority of the code. It is available in binary form, as a JAR-file, which can be added to your Java project's build-path.

The full version of the application is available in binary and source-code form (terms negotiated), by sending an e-mail to:

aere@aeresrealm.com

The JSnobol class was created using information from the manual I used in college:

The Snobol4 Programming Language, by R.E. Griswold, J.F. Poage, and I.P. Polonsky, printing number 13-815357-4. Mine is not the most recent version of that manual.

The manual (a more recent version) is available at the following link:

Snobol 4 Programming Language Manual

Since you need to know something about the JSnobol class in order to develop JSnobol patterns, it is briefly documented here.

You define (and compile) SNOBOL patterns by calling JSnobol's setPattern(name, patternText) method.

Before pattern-matching begins, you can (optionally) set variables used during the pattern-match by calling JSnobol's setVariable(name, value) method.

To perform pattern matching, you call either JSnobol's match(stringToMatch, patternName) method (for normal pattern-matching), or its match(stringBufferToMatch, patternName, replacementString) method for match-and-replace pattern-matching.

After pattern-matching, variables assigned during the pattern-match may be queried by calls to JSnobol's getVariable(name) method.

In a nutshell, that's how you use it.
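The call order described above can be sketched as follows. This is only an illustration: the interface is a hypothetical mirror of the four method names, with parameter and return types assumed (the real signatures are documented on the JSnobol class page), and the stand-in implementation is not JSnobol. It treats the pattern text as a literal substring and records it in a variable named "matched" (an invented name), purely so the sequence can be run without the JAR-file.

```java
import java.util.HashMap;
import java.util.Map;

public class CallOrderSketch {

    /** Hypothetical mirror of the method names above; all types are assumptions. */
    interface SnobolLike {
        void setPattern(String name, String patternText);
        void setVariable(String name, String value);
        boolean match(String stringToMatch, String patternName);
        String getVariable(String name);
    }

    /** Stand-in, NOT JSnobol: the pattern text is used as a literal substring. */
    static class LiteralStandIn implements SnobolLike {
        private final Map<String, String> patterns = new HashMap<>();
        private final Map<String, String> variables = new HashMap<>();

        public void setPattern(String name, String patternText) {
            patterns.put(name, patternText);
        }

        public void setVariable(String name, String value) {
            variables.put(name, value);
        }

        public boolean match(String stringToMatch, String patternName) {
            String literal = patterns.get(patternName);
            if (literal == null || !stringToMatch.contains(literal)) {
                return false;
            }
            variables.put("matched", literal); // invented variable name
            return true;
        }

        public String getVariable(String name) {
            return variables.get(name);
        }
    }

    /** Runs the four steps in order; returns the matched text, or null on failure. */
    static String run(String subject) {
        SnobolLike matcher = new LiteralStandIn();
        matcher.setPattern("Word", "bear");        // 1. define the pattern
        matcher.setVariable("imageNum", "0");      // 2. optionally preset variables
        boolean ok = matcher.match(subject, "Word"); // 3. perform the match
        return ok ? matcher.getVariable("matched") : null; // 4. query variables
    }

    public static void main(String[] args) {
        System.out.println(run("a bear sleeps"));
    }
}
```

With the real class, the same four steps would be made against a JSnobol object, and the pattern text would be genuine SNOBOL syntax rather than a literal.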

For more details of the JSnobol class, click on the web-page link below:

The JSnobol Class

Extensions To The SNOBOL Language

There are a few extensions to the SNOBOL language, which make things easier:

Text Specification

Non-space text, within patterns being defined, need not be specified with quotes surrounding it. If such text is the name of a pattern, a reference to that pattern will be generated. If it is the name of a variable (that is, a value has been assigned to it), the value of the variable will be used. Otherwise, the text itself will be used.

This doesn't apply to text that affects the syntax of the language, which must always be enclosed in quotation marks.

If you don't want text interpreted as the name of a pattern (or the name of a variable), simply enclose it in quotation marks, and it will be treated simply as a literal string.

As with SNOBOL, strings may be delimited using either single-quotes (') or double-quotes (").

Defining Sub-Patterns

Anywhere within a pattern being specified, a sub-pattern can easily be defined.

Where you want the sub-pattern to begin, specify the sub-pattern's name, followed immediately by a colon (:), followed by a space. The sub-pattern can thereafter be referenced by its name, and consists of everything from its name to the end of the current parenthetical level (or the end of the pattern itself).

For example, in the pattern:

(Sep: ("\n" | " " | RPOS(0)))

(LetterCombinations: ((be | bea | bear) (ds | d) | (ro | roo | roos) (ts | t)))

(ArbnoPat: (ARBNO(LetterCombinations $ OUTPUT $ image ?(imageNum = imageNum + 1) Sep)))

the sub-patterns “Sep”, “LetterCombinations”, and “ArbnoPat” are created, and can be used by name within the pattern-matching environment.
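For readers who think in java.util.regex terms, the example pattern above can be approximated as shown below. This is a rough sketch for comparison, not JSnobol output: the class and method names are invented, `\z` stands in for RPOS(0), the loop stands in for ARBNO, and adding each match to a list stands in for the `$ image` assignment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LetterCombinationsRegex {

    // (be | bea | bear) (ds | d) | (ro | roo | roos) (ts | t)
    static final String LETTER_COMBINATIONS =
            "(?:(?:be|bea|bear)(?:ds|d)|(?:ro|roo|roos)(?:ts|t))";

    // "\n" | " " | RPOS(0): a newline, a space, or the end of the subject
    static final String SEP = "(?:\\n| |\\z)";

    static final Pattern WORD =
            Pattern.compile("(" + LETTER_COMBINATIONS + ")" + SEP);

    /** Collects consecutive words from the start of the subject, ARBNO-style. */
    static List<String> images(String subject) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(subject);
        int pos = 0;
        while (pos < subject.length()) {
            m.region(pos, subject.length());
            if (!m.lookingAt()) {
                break;             // next word does not match: stop repeating
            }
            found.add(m.group(1)); // the word, without its separator
            pos = m.end();         // continue right after the separator
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(images("beads roots bed"));
    }
}
```

Note how the regular expression needs non-capturing groups and doubled backslashes where the SNOBOL form uses plain names and parentheses.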

Upper And Lower Case

Though the SNOBOL compiler I used (on a Univac 1100 computer) was strictly upper-case, as was everything documented in my SNOBOL 4 Programming Manual, both upper and lower-case pattern-matching are now supported. The keyword “&CASESENSITIVE” has been defined to select case-sensitive (value 1) or case-insensitive (value 0) matching.

Although it is not required (keywords and SNOBOL primitive functions can be either upper or lower-case), it is good practice (and style) to specify SNOBOL primitive functions and keywords in upper-case.

Assignment Statements Within Patterns

Snobol-style assignment statements can be included in patterns, such as the “?(imageNum = imageNum + 1)” evaluated expression in the example above. Evaluated expressions are run in concatenation-mode, as opposed to match-mode, though the success of the evaluated statement can affect the pattern-match.

The JSnobolApp Application

The Integrated Development Environment for developing and testing JSnobol patterns is called “JSnobolApp”, and is a Java application making use of the JSnobol class.

This application allows you to code and test JSnobol patterns in their original syntax, without worrying about supplying the various escape-sequences required within Java strings.

The IDE actually generates the Java code for you, which can then be cut-and-pasted into your Java program.
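To give a sense of the escaping involved, here is a minimal sketch that turns pattern text into a Java string literal by hand. It is not the IDE's generator: the class and method names are invented, and it assumes the pattern text contains the two characters backslash and n exactly as typed.

```java
public class EscapedPatternText {

    /** Escapes backslashes, double quotes, and newlines for a Java string literal. */
    static String toJavaLiteral(String patternText) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : patternText.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The Sep sub-pattern, already escaped once so this file compiles.
        String typed = "(Sep: (\"\\n\" | \" \" | RPOS(0)))";
        System.out.println(typed);                // the pattern as typed
        System.out.println(toJavaLiteral(typed)); // the literal Java needs
    }
}
```

Every quotation mark and backslash in the pattern gains a backslash of its own, which is the bookkeeping the IDE takes off your hands.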

The application uses (and asks permission to initially create) a folder called “JSnobolGUI”, within your home folder. Everything you develop using the application will be stored in that folder, unless you browse to some other location to save or load it.

The documentation (help information) for JSnobolApp can be accessed by clicking on the link below. Since the sand-box version can't fire up your browser to look at it, you should browse there first yourself, and look it over.

JSnobolApp Help Information

Trying Out The Application

You are also welcome to download and install the application, which will even let you scan text files on your machine using Snobol pattern-matching.

There are several projects you can load and try out, including the test pattern used to test the application, and a pattern for matching Health Level 7 (HL7) messages.

If you've followed the instructions on the Software page for obtaining Java, you can try out the application now.

To download it, right-click on the link below, and choose “Save Link As” (or whatever similar choice is offered in the popup-menu by your browser).

JSnobolApp 1.3 Jar-File

With Java installed (on MacOS and Linux, where the JDK is installed), you can verify the authenticity of the software by entering the following command in a terminal session, having first changed to the directory (such as Downloads) where the jar-file is stored:

jarsigner -verify JSnobolApp-ap.jar

On Windows, you can run it as an application, by opening it (double-clicking on it). If you plan to use it a lot, you can put it in your desktop folder.

On MacOS, you need to control-click (or 2-finger click, or right-click) on the downloaded jar-file, and specify that it be opened by the Java launcher. The first time you try to open it, it will complain that it can’t identify the developer, and ask if you want to open it anyway. After choosing to open it, you won’t have to answer that again.

On Linux, you will probably have to right-click on the jar-file, select “Properties” from the popup-menu, and specify that it be opened by “OpenJDK Java 11 Runtime”, or alternatively, “java -jar”. You will also have to change its permissions to make it executable.

The application is available in source-code form (subject to terms to be negotiated), by sending an e-mail to:

aere@aeresrealm.com

Supporting it as an open-source project (for the right person) is within the realm of possibility.

Give it a try – see what you think!


