Commit 358c7b2

adds summarize to docs and history
1 parent 6b7bdea commit 358c7b2

2 files changed, +60 -2 lines changed

HISTORY.rst

+1
@@ -8,6 +8,7 @@ History
 - Fix missing distribution field in new models
 - Add new Field class to deal with BigML auto-generated ids
 - Add by_name flag to predict methods to avoid reverse name lookups
+- Add summarize method in models to generate class grouped printed output
 
 0.4.0 (2012-08-20)
 ~~~~~~~~~~~~~~~~~~

docs/index.rst

+59-2
@@ -352,8 +352,8 @@ where the `source['object']` status is set to `UPLOADING` and its `progress`
 is periodically updated with the current uploading
 progress ranging from 0 to 1. When upload completes, this structure will be
 replaced by the real resource info as computed by BigML. Therefore source's
-status will eventually be (as it is in the synchronous upload case) ``WAITING``
-or ``QUEUED``.
+status will eventually be (as it is in the synchronous upload case)
+``WAITING`` or ``QUEUED``.
 
 You can retrieve the updated status at any time using the corresponding get
 method. For example, to get the status of our source we would use::
@@ -727,6 +727,63 @@ and that can be useful to make the model actionable right away with ``local_mode
         if (petal_length <= 2.45):
             return 'Iris-setosa'
 
+Summary generation
+------------------
+
+You can also print the model from the point of view of the classes it predicts
+with ``local_model.summarize()``.
+It shows a header section with the training data initial distribution per class
+(instances and percentage) and the final predicted distribution per class.
+
+Then each class distribution is detailed. First, a header section
+shows the percentage of the total data that belongs to the class (in the
+training set and in the predicted results) and the rules applicable to
+all the
+instances of that class (if any). Just after that, a detail section shows
+each of the leaves in which the class members are distributed.
+They are sorted in descending
+order by the percentage of predictions of the class that fall into that leaf
+and also show the full rule chain that leads to it.
+
+::
+
+    Data distribution:
+        Iris-setosa: 33.33% (50 instances)
+        Iris-versicolor: 33.33% (50 instances)
+        Iris-virginica: 33.33% (50 instances)
+
+
+    Predicted distribution:
+        Iris-setosa: 33.33% (50 instances)
+        Iris-versicolor: 33.33% (50 instances)
+        Iris-virginica: 33.33% (50 instances)
+
+
+
+
+    Iris-setosa : (data 33.33% / prediction 33.33%) petal length <= 2.45
+        · 100.00%: petal length <= 2.45
+
+
+    Iris-versicolor : (data 33.33% / prediction 33.33%) petal length > 2.45
+        · 94.00%: petal length > 2.45 and petal width <= 1.65 and petal length <= 4.95
+        · 2.00%: petal length > 2.45 and petal width <= 1.65 and petal length > 4.95 and sepal length <= 6.05 and sepal width > 2.45
+        · 2.00%: petal length > 2.45 and petal width > 1.65 and petal length <= 5.05 and sepal width > 2.9 and sepal length <= 5.95
+        · 2.00%: petal length > 2.45 and petal width > 1.65 and petal length <= 5.05 and sepal width > 2.9 and sepal length > 5.95 and petal length > 4.95
+
+
+    Iris-virginica : (data 33.33% / prediction 33.33%) petal length > 2.45
+        · 76.00%: petal length > 2.45 and petal width > 1.65 and petal length > 5.05
+        · 12.00%: petal length > 2.45 and petal width > 1.65 and petal length <= 5.05 and sepal width <= 2.9
+        · 6.00%: petal length > 2.45 and petal width <= 1.65 and petal length > 4.95 and sepal length > 6.05
+        · 4.00%: petal length > 2.45 and petal width > 1.65 and petal length <= 5.05 and sepal width > 2.9 and sepal length > 5.95 and petal length <= 4.95
+        · 2.00%: petal length > 2.45 and petal width <= 1.65 and petal length > 4.95 and sepal length <= 6.05 and sepal width <= 2.45
+
+You can also use ``local_model.get_data_distribution()`` and
+``local_model.get_prediction_distribution()`` to obtain the basic training and
+prediction distribution
+information as a list (suitable for drawing histograms or further processing).
+
 Running the Tests
 -----------------
 
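
The header section and the per-class leaf ordering described in the new "Summary generation" docs can be illustrated with a small standalone sketch. It does not call the BigML bindings and is not their implementation: the helper names ``class_distribution``, ``format_distribution`` and ``rank_leaves`` are hypothetical, and the ``[class name, instance count]`` list shape is only an assumption about what a distribution "as a list" might look like.

```python
from collections import Counter


def class_distribution(labels):
    """Return [[class_name, instance_count], ...] sorted by class name.

    Assumption: this list-of-pairs shape is illustrative only; the docs
    just say the distribution is returned "as a list".
    """
    counts = Counter(labels)
    return [[name, counts[name]] for name in sorted(counts)]


def format_distribution(title, distribution):
    """Render a distribution in the style of the summary header section."""
    total = sum(count for _, count in distribution)
    lines = ["%s:" % title]
    for name, count in distribution:
        share = 100.0 * count / total if total else 0.0
        lines.append("    %s: %.2f%% (%d instances)" % (name, share, count))
    return "\n".join(lines)


def rank_leaves(leaves):
    """Turn (instance_count, rule) pairs into (percentage, rule) pairs,
    sorted in descending order by the share of the class in each leaf."""
    total = sum(count for count, _ in leaves)
    if not total:
        return []
    shares = [(100.0 * count / total, rule) for count, rule in leaves]
    return sorted(shares, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical training labels matching the iris example (50 per class).
    labels = (["Iris-setosa"] * 50 + ["Iris-versicolor"] * 50
              + ["Iris-virginica"] * 50)
    print(format_distribution("Data distribution", class_distribution(labels)))
    # Hypothetical leaf counts for a single class; rule text is abbreviated.
    for share, rule in rank_leaves([(1, "rule A"), (47, "rule B"), (2, "rule C")]):
        print("    · %.2f%%: %s" % (share, rule))
```

Because the distribution is a plain list of pairs in this sketch, it could be passed directly to a plotting routine to draw a histogram, which is the use case the docs mention.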