This repository was archived by the owner on Jul 3, 2024. It is now read-only.
README.md: 17 additions & 27 deletions
@@ -1,4 +1,3 @@
1
-
*** WORK-IN-PROGRESS ***
2
1
# Score streaming data with a machine learning model
3
2
4
3
In this code pattern, we will be streaming online shopping data and using the data to track the products that each customer has added to their cart. We will build a k-means clustering model with scikit-learn to group customers according to the contents of their shopping carts. The cluster assignment can be used to predict additional products to recommend.
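To make the clustering idea concrete, here is a minimal sketch of grouping carts with scikit-learn's `KMeans`. It is not the notebook from this code pattern: the product list, the sample carts, the `cart_to_vector` helper, and the choice of two clusters are all assumptions made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical product catalog; the real pattern uses its own product list.
PRODUCTS = ["milk", "bread", "eggs", "hammer", "nails", "paint"]


def cart_to_vector(cart):
    """Encode a cart (iterable of product names) as a 0/1 vector over PRODUCTS."""
    items = set(cart)
    return [1 if product in items else 0 for product in PRODUCTS]


# Made-up carts: three grocery-style carts and three hardware-style carts.
carts = [
    ["milk", "bread"],
    ["milk", "bread", "eggs"],
    ["bread", "eggs"],
    ["hammer", "nails"],
    ["hammer", "nails", "paint"],
    ["nails", "paint"],
]
X = np.array([cart_to_vector(cart) for cart in carts])

# Fit k-means with two clusters; n_init and random_state are set explicitly
# so repeated runs give the same grouping.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Assign a new cart to the nearest cluster. Products popular in that cluster
# but missing from the cart are candidates to recommend.
new_cart = cart_to_vector(["milk", "eggs"])
cluster = int(model.predict(np.array([new_cart]))[0])
print("cluster for new cart:", cluster)
```

The cluster numbers themselves are arbitrary labels. What matters is that carts with similar contents receive the same label.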
@@ -32,17 +31,13 @@ Using the Streams Flows editor, we will create a streaming application with the
32
31
1. [Verify access to your IBM Streams instance on Cloud Pak for Data](#1-Verify-access-to-your-IBM-Streams-instance-on-Cloud-Pak-for-Data)
33
32
1. [Create a new project in Cloud Pak for Data](#2-create-a-new-project-in-cloud-pak-for-data)
34
33
1. [Build and store a model](#2-build-and-store-a-model)
35
-
36
-
1. [Create a Streams Flow in Cloud Pak for Data](#6-create-a-streams-flow-in-cloud-pak-for-data)
37
-
1. [Create a Streams Flow with Kafka as source](#7-create-a-streams-flow-with-kafka-as-source)
38
-
1. [Use Streams Flows option to generate a notebook](#8-use-streams-flows-option-to-generate-a-notebook)
39
-
1. [Run the generated Streams Flow notebook](#9-run-the-generated-streams-flow-notebook)
34
+
1. [Create and run a Streams Flow application](#4-create-and-run-a-streams-flow-application)
40
35
41
36
### 1. Verify access to your IBM Streams instance on Cloud Pak for Data
42
37
43
38
Once you log in to your `Cloud Pak for Data` instance, ensure that your administrator has provisioned an instance of `IBM Streams` and has given your user access to the instance.
44
39
45
-
To see the available services, click on the `Services` icon. Search for `Streams`. You should see an `Enabled` indicator for Streams. `Watson Studio` and `Watson Machine Learning` also need to be enabled to build and deploy the model.
40
+
To see the available services, click on the `Services` icon. Search for `Streams`. You should see an `Enabled` indicator for `Streams`. `Watson Studio` and `Watson Machine Learning` also need to be enabled to build and deploy the model.
@@ -60,9 +55,7 @@ Click on `New project +`. Then select `Create an empty project` and enter a uniq
60
55
61
56
### 2. Build and store a model
62
57
63
-
We will build a model using a Jupyter notebook and scikit-learn. We're using a k-means classifier to group customers based on the contents of their shopping carts. Later, we will use that model to predict which group a customer is most likely to go in so that we can anticipate additional products to recommend.
64
-
65
-
Once we have built and stored the model, it will be available for deployment so that it can be used in our streaming application.
58
+
We will build a model using a Jupyter notebook and scikit-learn. We're using a k-means clustering model to group customers based on the contents of their shopping carts. Once we have built and stored the model, it will be available for deployment so that it can be used in our streaming application.
66
59
67
60
#### Import the notebook into your project
68
61
@@ -79,13 +72,15 @@ Fill in the following information:
When you import the notebook you will be put in edit mode. Before running the notebook, you need to configure one thing. Edit the `WML Credentials` cell to set the `url` to the URL you use to access Cloud Pak for Data.
81
+
When you import the notebook you will be put in edit mode. Before running the notebook, you need to configure one thing:
82
+
83
+
* Edit the `WML Credentials` cell to set the `url` to the URL you use to access Cloud Pak for Data.
89
84
90
85

91
86
@@ -95,24 +90,24 @@ Select `Cell > Run All` to run the notebook. If you prefer, you can use the `Run
95
90
96
91
### 2. Associate the deployment space with the project
97
92
98
-
The notebook created a deployment space named "Shopping Cart k-means Model" and stored the model there.
93
+
The notebook created a deployment space named `ibm_streams_with_ml_model_deployment_space` and stored the model there.
99
94
100
95
Inside your new project, select the `Settings` tab and click on `Associate a deployment space +`.
> NOTE: As you've probably already noticed, red error indicators tell you when required settings are missing. Some operators also require a source and/or target connection. When an operator has a red spot on it, you can hover over it to see what the errors are.
135
+
139
136
Click on the canvas object to see its associated properties. From the list of available data types in the `Topic` drop-down list, select `Clickstream`.
As you've probably already noticed, red error indicators tell you when required settings are missing. For example, we've seen settings that were required. You will also see that some operators require a source and/or target connection.
156
-
157
-
When an operator has a red spot on it, you can hover over it to see what the erros are.
158
-
159
150
#### Add a Code operator
160
151
161
152
* From the `Processing and Analytics` list, select and drag the `Code` operator onto the canvas
162
153
* Connect the `Filter` operator's target to this `Code` operator's source (using drag-and-drop like we did earlier)
163
154
* Click on the Code operator to see its associated properties. Select `Python 3.6` as the `Coding Language`.
164
155
* In the `Code` property, paste in the following code:
165
156
```python
166
-
YOU MUST EDIT THE SCHEMA and add all attributes that you are returning as output.
167
157
#
168
158
# Preinstalled Python packages can be viewed from the Settings pane.
169
159
# In the Settings pane you can also install additional Python packages.
@@ -181,7 +171,7 @@ When an operator has a red spot on it, you can hover over it to see what the err
181
171
def init(state):
182
172
# do something once on flow initialization and save in the state object
@@ -215,7 +205,7 @@ When an operator has a red spot on it, you can hover over it to see what the err
215
205
216
206
The output now contains just the customer_id and an array indicating which products are in the cart. This is the format we need to pass to our model.
217
207
218
-
> Notice: We used the Code operator specifically to arrange our shopping cart data for scoring, but if you take another look at the Code operator you'll see that it is a very powerful operator where you can put in the Python code to do whatever manipulation you need in your streaming application.
208
+
> NOTE: We used the Code operator specifically to arrange our shopping cart data for scoring, but if you take another look at the Code operator you'll see that it is a very powerful operator where you can put in the Python code to do whatever manipulation you need in your streaming application.