Hadrian Wiki
This wiki provides documentation for the Hadrian ecosystem of tools.
First steps
- How to install: start here!
- Tutorial 1: Building and testing a small model in Titus (Python)
- Tutorial 2: Building a small model in Aurelius (R)
- Tutorial 3: Inspecting a model in PFA-Inspector (command line)
- Tutorial 4: Executing a model in hadrian-standalone (JVM)
- Basic models: simple examples in Titus PrettyPFA
- Moving windows: How to add moving windows to your preprocessing
- Segmentation: How to subdivide a model into segments
- Concurrency: Running multiple scoring engines simultaneously
Hadrian (Java/Scala/JVM)
Hadrian is a complete implementation of PFA in Scala, which can be accessed through any JVM language, principally Java. It focuses on model deployment, so it is flexible (can run in restricted environments) and fast.
- Complete API reference (Scaladoc)
- Performance table
- Using Hadrian directly
- Ready-to-use Hadrian wrappers:
  - Hadrian-Standalone: command-line program that reads data from standard input and writes it to standard output: use this for testing or a simple shell-based workflow
  - Hadrian-MR: Hadoop executable that runs two PFA files as mapper and reducer, with built-in secondary sort: use this for running fast Hadoop jobs with no compilation
  - Hadrian-GAE: Java servlet that runs PFA as a service in Google App Engine or any servlet container, such as Tomcat, JBoss, or WildFly: this is the backend for scoringengine.org
  - Hadrian-Actors: actor-based network of interacting PFA scoring engines: use this for building data pipelines in a JVM
Titus (Python)
Titus is a complete, independent implementation of PFA in pure Python. It focuses on model development, so it includes model producers and PFA manipulation tools in addition to runtime execution.
- Complete API reference (Sphinx)
- Loading, validating, and executing PFA on the Python command prompt
- PFA development tools
- Model producers in Titus
  - CUSUM tutorial: an example of building a model primarily with PrettyPFA
  - K-means reference: building cluster models with Titus
  - CART reference: building decision trees with Titus
  - Transformations producer: coordinates operations on Numpy arrays in the producer stage with PFA code in the runtime scoring engine, for developing pre- and post-processors
- Ready-to-use Titus scripts:
  - pfainspector: command-line tool (with history and tab-complete) for inspecting PFA documents (or other JSON): use this to diagnose faulty PFA
  - pfachain: turns a linear sequence of PFA files into a combined PFA file, with schema-checking and renaming to avoid namespace collisions
  - pfaexternalize: moves large data blocks from a PFA file into external JSON for faster loading (uses ijson)
  - pfarandom: given an input and output schema, creates a PFA file that fits these schemas (the PFA file ignores its input and generates random outputs)
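For orientation, the kind of document that the Titus tools above load, validate, and execute can be sketched with nothing but the Python standard library. The example below is a minimal illustration, not taken from the Titus documentation: the variable names and the field check are assumptions made here, and the check only looks for the three top-level fields (`input`, `output`, `action`) rather than performing real PFA validation.

```python
import json

# A minimal PFA document: it accepts a double and returns that value plus 100.
# "input" and "output" are Avro type schemas; "action" is a list of expressions.
pfa_document = {
    "input": "double",
    "output": "double",
    "action": [{"+": ["input", 100]}],
}

# Serialize to JSON, the form in which PFA documents are stored and exchanged.
pfa_json = json.dumps(pfa_document, indent=2)
print(pfa_json)

# Illustrative sanity check only (an assumption of this sketch, not Titus's
# validator): confirm the top-level fields used above are present.
REQUIRED_FIELDS = {"input", "output", "action"}
missing = REQUIRED_FIELDS - set(json.loads(pfa_json))
print("missing top-level fields:", sorted(missing))
```

With Titus installed, a JSON string like `pfa_json` is what would be handed to the engine for execution, typically through `PFAEngine.fromJson` in `titus.genpy`; consult the API reference listed above for the exact calls and signatures.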
Aurelius (R)
Aurelius is a toolkit for generating PFA in the R programming language. It focuses on porting models to PFA from their R equivalents. To validate or execute scoring engines, Aurelius sends them to Titus through rPython (so both must be installed).
Antinous (Model development in Jython)
Antinous is a model-producer plugin for Hadrian that allows Jython code to be executed anywhere a PFA scoring engine would go. It also has a library of model-producing algorithms.