senf: the mustardy sensitive number finder

senf is a portable tool for finding sensitive numbers. Use this tool to identify files on your system that may have Social Security Numbers (SSNs) or Credit Card Numbers (CCNs). The latest version can always be found at the senf site.

Download pre-built release binary

The latest release build can be downloaded at

Other senf-related products for superb purchase agreeings

senf is licensed under a Creative Commons License; see

Warning! [About senf]

Warning: Do not have a false sense of security after running this program

It is important to understand what senf is, and what senf is not

What senf is
What senf is not
What all this means
Which reduces to (even though it's longer)!

This tool is not to be regarded as the end-all in your effort to ensure your computer is free of SSN/CCN records. It simply will report to you files that contain numbers that could pose a security threat. Remember, it looks for strings of numbers -- and the typical computer has lots of these.


Java 1.6 JRE

No matter what system you're running on, you need the Java 1.6 runtime (JRE) or later. You do not need the full Java 1.6 SDK, although it will also work because it includes the runtime.

System path

We assume that the Java interpreter is in your environment path (meaning, no matter where we try to run "java" from, it will run). The JRE installer should modify your system path to include the Java interpreter.

If you get a strange error message saying something along the lines of "If you see this, senf did not run!" then chances are your path is not set up correctly. Unfortunately, the solution to this is beyond the scope of this text.


Once the JRE is installed, all you need to do is copy senf.jar and the seeds folder to some folder on the computer that is going to run the scan. You might also want to copy the configuration files to the same folder.


Brief Note

On some Operating Systems (typically Windows and Mac OS X), simply double-clicking on the senf.jar file will launch the program automatically. If this works, you can skip the rest of this section.


1. Open a command prompt.
2. Navigate to the folder in which senf is installed.
3. Run java -jar senf.jar (with optional arguments).

Linux and Mac OS X

1. Open a command shell.
2. Navigate to the folder in which senf is installed.
3. Run java -jar senf.jar (with optional arguments).

Using senf

Usage: senf [OPTIONS]

Option  Default        Effect
-q      off            quiet mode (display no output)
-v      off            verbose mode (display everything)
-e      off            print error messages to the screen
-p      working dir    Set the path to start scanning from
-l      off            Set modified-date check; files last modified before this date are skipped
-f      infinite       Set the max file size to scan; end size (no spaces) with 'g' for gigs, 'm' for megs, 'k' for kilobytes, and nothing for bytes
-m      15             Set minimum number of times to match a CCN/SSN pattern before reporting a file
-o      senf_DATE.txt  Set the name of the file (including path, if you like) where log information will be saved
-al     on             Append the current log to the end of the file if it already exists
-ac     off            Append configuration information to the end of the output log
-nl     off            Do not use a log file
-g      off            Hide the GUI
-as     off            Auto-start scanning (ignored when -g is specified)
-h      n/a            Display this help and exit
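
As an illustration of the size format accepted by -f, here is a minimal sketch of how such a value could be turned into a byte count. This is not senf's own parsing code: the class and method names are hypothetical, and the sketch assumes 1024-based units and lower-case suffixes only.

```java
final class SizeLimitSketch {
    /**
     * Parses a size such as "512", "10k", "2m" or "1g" into bytes.
     * Assumption: k, m and g are powers of 1024; senf may define them differently.
     */
    static long parseSize(String text) {
        String s = text.trim();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier = 1L;
        switch (s.charAt(s.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024L; break;
            case 'g': multiplier = 1024L * 1024L * 1024L; break;
            default: break; // no suffix: the value is already in bytes
        }
        // Strip the suffix, if there was one, and parse the remaining digits.
        String digits = (multiplier == 1L) ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    public static void main(String[] args) {
        System.out.println(parseSize("10k")); // 10240
    }
}
```

With these assumptions, a limit of 2m would skip any file larger than 2097152 bytes.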

By default, senf prints only matched files to the screen; not all output is shown.


Also, note that this program may take a while to complete. Again, by default, the only things it prints to the screen are possible matches (i.e., no errors), so it may print nothing for a while and look frozen, but it's (probably) not.

As of the Sasuke.188 release, senf provides a GUI for ease of use. The GUI offers a results viewer to help the user quickly identify what was flagged by senf as being sensitive. Results appear in the central pane of the senf window as they are found; if an entry is clicked on, the senf Analyzer will pop up, showing the applicable matches in the file.

Configuration files


senf uses the file senf.conf to load default settings.


As of the Haku version, senf uses an ACL in place of the old whitelist/blacklist system. The ACL is contained in the file senf.acl, and can be modified either by editing the file, or through the senf GUI.

ACL entries have three columns. The first column denotes whether to allow or deny matches. The second dictates what type of match to look for. The third contains the expression to search for. Possible entries for each row are listed below.


An example "senf.acl" file is included with common entries. In the case of two conflicting entries, the entry listed first overrules the later entry.
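
The "first entry wins" rule can be sketched as follows. This is only an illustration, not senf's ACL code: the class names are hypothetical, the match-type column is left out, and the sketch assumes that an expression matches when it occurs as a substring of the text being checked.

```java
import java.util.Arrays;
import java.util.List;

final class AclSketch {
    /** One simplified ACL row: an allow/deny flag and the expression to look for. */
    static final class Entry {
        final boolean allow;
        final String expression;

        Entry(boolean allow, String expression) {
            this.allow = allow;
            this.expression = expression;
        }
    }

    /**
     * Returns the decision of the first entry whose expression occurs in the
     * candidate text, or defaultAllow when no entry applies.
     */
    static boolean isAllowed(List<Entry> entries, String candidate, boolean defaultAllow) {
        for (Entry entry : entries) {
            if (candidate.contains(entry.expression)) {
                return entry.allow; // earlier entries overrule later ones
            }
        }
        return defaultAllow;
    }

    public static void main(String[] args) {
        List<Entry> acl = Arrays.asList(
                new Entry(false, "/tmp/"),      // deny, listed first
                new Entry(true, "/tmp/keep"));  // allow, listed second
        // Prints false: both entries apply, and the deny entry comes first.
        System.out.println(isAllowed(acl, "/tmp/keep/a.txt", true));
    }
}
```

Swapping the two entries would flip the result, which is why the order of rows in the file matters.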


senf relies on the Apache Tika library to parse file types. This library should be placed in the "lib" folder in order for senf to function properly. While use of this library does allow senf to scan many file types, there is a caveat: currently, Tika makes use of temporary files while scanning. Under normal circumstances, these are deleted when they are no longer used, but in certain circumstances (e.g. a JVM crash) they might not get cleaned up.

How senf Works

The way senf scans has changed drastically with the release of Haku. There are four important parts to senf Haku: Parsers, Seeds, Streams, and Stream Sources.


A Stream is something that senf can scan, and implements the class senfStream. An example of a "Stream" is a text file.

Stream Sources

A Stream Source is something that contains streams, and implements the class senfStreamSource. An example of a Stream Source is a directory, or a zip file.


A Seed is something that senf will look for in a Stream. Seeds implement the class Seed. As of senf Sasuke.188, Seeds are modular. This means that seeds may be added/removed from the "seeds" directory to modify what senf will or will not search for within a Stream. At the moment, senf includes a Seed for both Social Security Numbers and Credit Card numbers.


Parsers are the objects that tell the senf engine what each "object" that is to be scanned should be scanned as. That is, the Parser tells senf what type of Stream or StreamSource each senfObject should be cast as. Parsers implement the class senfParser, and are modular.


senf looks for certain patterns to reduce false positives. Those patterns are described here. These patterns cannot be used to find every conceivable incarnation of the numbers senf searches for. However, if you have suggestions for improving the algorithms (and, better, known false negatives to back up your suggestions) please let us know.

Credit card numbers


There are a number of valid credit card formats. senf supports only the 16 digit formats. This includes Mastercard, some (but evidently not all) VISA, and Discover. It does NOT include, for example, American Express.


Credit card numbers may be one long string of numbers (nnnnnnnnnnnnnnnn), or may be separated into groups of four digits (nnnn-nnnn-nnnn-nnnn). There are, of course, as many ways to delimit groups of digits as can be imagined; senf only counts matches that use either no separator, or only one of:

Luhn check

Credit cards must pass a Luhn mod 10 check to be considered valid.
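
For readers unfamiliar with the check, here is a minimal sketch of the standard Luhn mod 10 algorithm. It is not taken from the senf source; the class and method names are hypothetical, and skipping non-digit characters (so that separated numbers can be passed in directly) is a choice made for this example.

```java
final class LuhnSketch {
    /**
     * Returns true if the digits in the given text pass the Luhn mod 10 check.
     * Non-digit characters such as separators are ignored.
     */
    static boolean passesLuhn(String number) {
        int sum = 0;
        int digitCount = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = number.length() - 1; i >= 0; i--) {
            char c = number.charAt(i);
            if (c < '0' || c > '9') {
                continue; // skip separators
            }
            int d = c - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
            digitCount++;
        }
        return digitCount > 0 && sum % 10 == 0;
    }

    public static void main(String[] args) {
        // 4111111111111111 is a widely used test number that passes the check.
        System.out.println(passesLuhn("4111-1111-1111-1111")); // true
    }
}
```

A random 16-digit string passes this check about one time in ten, so the check reduces false positives but does not eliminate them.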

Social Security Numbers

Formats and separators

Socials are detected in both single string (nnnnnnnnn) and grouped (nnn-nn-nnnn) formats; permitted separators are the same as for credit card numbers.
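
A pattern of this shape can be sketched with a regular expression. The sketch below is not senf's actual pattern: the class name is hypothetical, and the separator set (hyphen or space) is an assumption made for the example. A backreference requires the same separator, or none, in both positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SsnFormatSketch {
    // Group 2 captures the separator (possibly empty); \2 demands the same one again.
    // The lookarounds stop the pattern from matching inside a longer run of digits.
    // Assumption: only '-' and ' ' are accepted as separators in this sketch.
    private static final Pattern CANDIDATE =
            Pattern.compile("(?<!\\d)(\\d{3})([- ]?)(\\d{2})\\2(\\d{4})(?!\\d)");

    /** Returns every candidate found in the text, reduced to its nine digits. */
    static List<String> findCandidates(String text) {
        List<String> found = new ArrayList<String>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            found.add(m.group(1) + m.group(3) + m.group(4));
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [123456789, 987654321]
        System.out.println(findCandidates("id 123-45-6789 and 987654321"));
    }
}
```

Mixed separators such as 123-45 6789 are rejected, as are nine digits embedded in a longer number.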

Validity checking

Socials are verified against their area (the first three digits), according to the Social Security Administration's current list of valid high groups. In addition, group and serial numbers may not be all zeroes.
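
The zero checks and the area lookup could look roughly like the sketch below. It is a simplification rather than senf's implementation: the class name is hypothetical, and the Social Security Administration's high group list is replaced by a caller-supplied set of area numbers, because the real list is not reproduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SsnValiditySketch {
    /**
     * Checks a nine-digit candidate: the area (first three digits) must be in
     * knownAreas, and neither the group (next two digits) nor the serial
     * (last four digits) may be all zeroes.
     * Assumption: knownAreas stands in for the official high group list.
     */
    static boolean isPlausible(String nineDigits, Set<String> knownAreas) {
        if (nineDigits == null || !nineDigits.matches("\\d{9}")) {
            return false;
        }
        String area = nineDigits.substring(0, 3);
        String group = nineDigits.substring(3, 5);
        String serial = nineDigits.substring(5);
        if (group.equals("00") || serial.equals("0000")) {
            return false;
        }
        return knownAreas.contains(area);
    }

    public static void main(String[] args) {
        // The area set here is made up purely for demonstration.
        Set<String> areas = new HashSet<String>(Arrays.asList("123"));
        System.out.println(isPlausible("123006789", areas)); // false: group is 00
    }
}
```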

Developers developers developers developers


To compile the source code and build a runnable jar file, just run the included script. Or, to do it manually:

javac -cp lib/tika.jar senf/*.java streams/*.java streamsources/*.java seeds/*.java parsers/*.java
jar cmf manifest senf.jar senf/*.class senf/images/* streams/*.class streamsources/*.class


Seeds are objects which implement the senf.Seed interface. At runtime, senf will load any seeds located in the "seeds" directory and use them when scanning.

Okay, that's all

Thank you for using senf! Feedback and questions are welcome; email