Commit

Updated documentation
vruusmann committed Dec 8, 2024
1 parent b073f0b commit 464faf8
Showing 2 changed files with 78 additions and 5 deletions.
50 changes: 50 additions & 0 deletions NEWS.md
@@ -1,3 +1,53 @@
# 0.112.0 #

## Breaking changes

* Required Python 3.8 or newer.

This requirement stems from underlying package requirements, most notably that of the NumPy package (`numpy>=1.24`).

Portions of the SkLearn2PMML package may be usable with earlier Python versions.
For example, the `sklearn2pmml.sklearn2pmml(estimator, pmml_path)` utility function should work with Python 2.7, and with Python 3.4 or newer.

* Migrated setup from `distutils` to `setuptools`.

* Migrated unit tests from `nose` to `pytest`.

Testing a source checkout of the package:

```
python -m pytest .
```

## New features

* Added a command-line interface to the `sklearn2pmml.sklearn2pmml()` utility function.

Sample usage:

```
python -m sklearn2pmml --input pipeline.pkl --output pipeline.pmml
```

Getting help:

```
python -m sklearn2pmml --help
```

* Added `sklearn2pmml` command-line application.

Sample usage:

```
sklearn2pmml -i pipeline.pkl -o pipeline.pmml
```

## Minor improvements and fixes

None.


# 0.111.2 #

## Breaking changes
33 changes: 28 additions & 5 deletions README.md
@@ -9,13 +9,13 @@ This package is a thin Python wrapper around the [JPMML-SkLearn](https://github.

# News and Updates #

The current version is **0.111.2** (4 December, 2024):
The current version is **0.112.0** (8 December, 2024):

```
pip install sklearn2pmml==0.111.2
pip install sklearn2pmml==0.112.0
```

See the [NEWS.md](https://github.com/jpmml/sklearn2pmml/blob/master/NEWS.md#01112) file.
See the [NEWS.md](https://github.com/jpmml/sklearn2pmml/blob/master/NEWS.md#01120) file.

# Prerequisites #

@@ -38,14 +38,37 @@ pip install --upgrade git+https://github.com/jpmml/sklearn2pmml.git

# Usage #

## Command-line application ##

The `sklearn2pmml` module is executable.
The main application loads the estimator object from the Pickle file (`-i` or `--input`; supports `joblib`, `pickle` or `dill` variants), performs the conversion, and saves the result to a PMML file (`-o` or `--output`):

```
python -m sklearn2pmml --input pipeline.pkl --output pipeline.pmml
```

Getting help:

```
python -m sklearn2pmml --help
```

On some platforms, the [Pip](https://pypi.org/project/pip/) package installer additionally makes the main application available as a top-level command:

```
sklearn2pmml --input pipeline.pkl --output pipeline.pmml
```
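
When the conversion needs to be triggered from another Python program, the module-style invocation above can be wrapped in a child process. The following is a minimal sketch, not part of the package: the `build_command` and `convert` helper names are hypothetical, and it assumes that the application signals failure with a non-zero exit code. Only the `--input` and `--output` options documented above are used.

```python
import subprocess
import sys


def build_command(input_path, output_path):
    """Build the argument list for `python -m sklearn2pmml` (hypothetical helper)."""
    return [
        sys.executable, "-m", "sklearn2pmml",
        "--input", str(input_path),
        "--output", str(output_path),
    ]


def convert(input_path, output_path):
    """Run the conversion in a child process.

    Assumption: a failed conversion exits with a non-zero status, in which
    case `check=True` raises `subprocess.CalledProcessError`.
    """
    subprocess.run(build_command(input_path, output_path), check=True)


# Show the command that would be executed, without running it.
print(" ".join(build_command("pipeline.pkl", "pipeline.pmml")))
```

Using `sys.executable` rather than a bare `python` keeps the child process in the same interpreter (and virtual environment) as the caller, which matters when the top-level `sklearn2pmml` command is not on the path.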

## Library ##

A typical workflow can be summarized as follows:

1. Create a `PMMLPipeline` object, and populate it with pipeline steps as usual. Class `sklearn2pmml.pipeline.PMMLPipeline` extends class `sklearn.pipeline.Pipeline` with the following functionality:
1. Create a `PMMLPipeline` object, and populate it with pipeline steps as usual. The `sklearn2pmml.pipeline.PMMLPipeline` class extends the `sklearn.pipeline.Pipeline` class with the following functionality:
* If the `PMMLPipeline.fit(X, y)` method is invoked with a `pandas.DataFrame` or `pandas.Series` object as the `X` argument, then its column names are used as feature names. Otherwise, feature names default to "x1", "x2", .., "x{number_of_features}".
* If the `PMMLPipeline.fit(X, y)` method is invoked with a `pandas.Series` object as the `y` argument, then its name is used as the target name (for supervised models). Otherwise, the target name defaults to "y".
2. Fit and validate the pipeline as usual.
3. Optionally, compute and embed verification data into the `PMMLPipeline` object by invoking the `PMMLPipeline.verify(X)` method with a small but representative subset of training data.
4. Convert the `PMMLPipeline` object to a PMML file in local filesystem by invoking utility method `sklearn2pmml.sklearn2pmml(pipeline, pmml_destination_path)`.
4. Convert the `PMMLPipeline` object to a PMML file in the local filesystem by invoking the `sklearn2pmml.sklearn2pmml(estimator, pmml_path)` utility function.
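
The naming rules from step 1 can be illustrated with a small standalone sketch. This is not the package's implementation: the `resolve_feature_names` and `resolve_target_name` helpers are hypothetical, and treating a missing name (`None`) the same as "no name available" is an assumption made for illustration.

```python
def resolve_feature_names(column_names, number_of_features):
    """Return the given column names, or fall back to "x1", "x2", ...

    Hypothetical helper mirroring the documented defaults.
    """
    if column_names is not None:
        return [str(name) for name in column_names]
    return ["x{}".format(index + 1) for index in range(number_of_features)]


def resolve_target_name(series_name):
    """Return the given series name, or fall back to "y" (assumption for None)."""
    return series_name if series_name is not None else "y"


# Named inputs keep their names; unnamed inputs fall back to the defaults.
print(resolve_feature_names(["Sepal.Length", "Sepal.Width"], 2))
print(resolve_feature_names(None, 3))
print(resolve_target_name("Species"))
print(resolve_target_name(None))
```

In practice this means that fitting on `pandas` objects yields a PMML document with meaningful field names, whereas fitting on plain arrays yields the generic "x1", "x2", .. and "y" names.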

Developing a simple decision tree model for the classification of iris species:

