docs/4.3/qse-getting-started.markdown (7 additions, 18 deletions)
@@ -123,26 +123,24 @@ You can connect to external data sources using toolkits. A toolkit is a reusabl
Streams includes toolkits that support the most popular systems like [HDFS](https://github.com/IBMStreams/streamsx.hdfs), [HBase](https://github.com/IBMStreams/streamsx.hbase), [Kafka](https://ibmstreams.github.io/streamsx.kafka/docs/user/overview/), Active MQ and more.
- Refer to the [Product Toolkits Overview](https://developer.ibm.com/streamsdev/docs/product-toolkits-overview/) for a full list of toolkits included in Streams.
+ Refer to the [Product Toolkits Overview](https://www.ibm.com/support/knowledgecenter/de/SSCRJU_4.3.0/com.ibm.streams.ref.doc/doc/spltoolkits_intro.html) for a full list of toolkits included in Streams.
**Find more toolkits on GitHub**
In addition to the toolkits included in the install, [IBMStreams on GitHub](https://github.com/ibmstreams) includes open-source projects, providing a platform that enables Streams to rapidly add support for emerging technologies. It also includes sample applications and helpful utilities.
- For a list of open-source projects hosted on GitHub, see: [IBM Streams GitHub Projects Overview](https://developer.ibm.com/streamsdev/docs/github-projects-overview/).
### Streams and SPSS
SPSS is predictive analytics software that enables you to build predictive models from your data. Your application can perform real-time predictive scoring by running these predictive models using the SPSS operators.
- To learn about Streams can integrate with SPSS: [Streams and SPSS Lab](https://developer.ibm.com/streamsdev/docs/spss-analytics-toolkit-lab/).
+ To learn how Streams can integrate with SPSS, see the [Streams and SPSS Lab](https://ibmstreams.github.io/streamsx.documentation/docs/spss/spss-analytics/).
### Streams and Microsoft Excel
<img src="/streamsx.documentation/images/qse/BargainIndex1.jpg" alt="Streams and Excel" style="width: 60%;"/>
- IBM Streams integrates with Microsoft Excel, allowing you to see, analyze and visualize live streaming data in an Excel worksheet. This article helps you get started: [Streams for Microsoft Excel](https://developer.ibm.com/streamsdev/docs/streams-4-0-streams-for-microsoft-excel/)
+ IBM Streams integrates with Microsoft Excel, allowing you to see, analyze and visualize live streaming data in an Excel worksheet.
In the following demo, we show how you can build a marketing dashboard from real-time data using Excel.
@@ -152,20 +150,14 @@ Video: Streams and Excel Demo
### Operational Decision Manager (ODM)
- IBM Streams integrates with ODM rules, allowing you to create business rules, construct rule flows, and create and deploy rules applications to analyze data and automate decisions in real-time. This article helps you get started: [ODM Toolkit Lab](https://developer.ibm.com/streamsdev/docs/rules-toolkit-lab/)
+ IBM Streams integrates with ODM rules, allowing you to create business rules, construct rule flows, and create and deploy rules applications to analyze data and automate decisions in real time. This article helps you get started: [ODM Toolkit Lab](https://community.ibm.com/community/user/cloudpakfordata/viewdocument/integrating-business-rules-in-real?CommunityKey=c0c16ff2-10ef-4b50-ae4c-57d769937235&tab=librarydocuments)
### Integration with IBM InfoSphere Data Governance Catalog
With IBM InfoSphere Data Governance Catalog integration, developers can easily discover the data and schema that are available for use. By building data lineage with your Streams application, you can quickly see and control how data is consumed.
- To get started, see the [Streams Governance Quick Start Guide](../governance/governance-quickstart/).
- ### SparkMLLib in Streams
- To get started, follow this development guide:
+ To get started, see the [Streams Governance Quick Start Guide](https://ibmstreams.github.io/streamsx.documentation/docs/4.2/governance/governance-quickstart/).
- * [SparkMLLib Getting Started Guide](https://developer.ibm.com/streamsdev/docs/getting-started-with-the-spark-mllib-toolkit/)
### Apache Edgent (aka Open Embedded Streams) Integration
The following Streams resources can help you connect with the Streams community and get support when you need it:
- * **[Streamsdev](https://developer.ibm.com/streamsdev/)** - This resource is a developer-to-developer website maintained by the Streams Development Team. It contains many useful articles and getting started material. Check back often for new articles, tips and best practices to this website.
- * **[Streams Forum](https://www.ibmdw.net/answers/questions/?community=streamsdev&sort=newest&refine=none)** - This forum enables you to ask, and get answers to your questions, related to IBM Streams. If you have questions, start here.
- * **[IBMStreams on GitHub](http://ibmstreams.github.io)** - Streams is shipped with many useful toolkits out of the box. IBMStreams on GitHub contains many open-source toolkits. For a list of available toolkits available on GitHub, see this web page: [IBMStreams GitHub Toolkits](https://developer.ibm.com/streamsdev/docs/github-projects-overview/).
- * **[IBM Streams Support](http://www.ibm.com/support/entry/portal/Overview/Software/Information_Management/InfoSphere_Streams)** - This website provides information about IBM Streams downloads, technical support tools, documentation, and other resources.
- * **[IBM Streams Product Site](http://www.ibm.com/analytics/us/en/technology/stream-computing/)** - This website provides a broad range of information and resources about Streams and related topics.
+ * **[Streams Community](https://ibm.biz/streams-community)** - This resource is a developer-to-developer website maintained by the Streams Development Team. It contains many useful articles and getting started material.
+ * **[IBMStreams on GitHub](http://ibmstreams.github.io)** - Streams is shipped with many useful toolkits out of the box. IBMStreams on GitHub contains many open-source toolkits.
docs/python/1.6/python-appapi-devguide-4.md (206 additions, 2 deletions)
@@ -40,7 +40,7 @@ This section will discuss how to use the most common functions and transforms in
* [Split the stream into dedicated streams](#split_func)
* [Joining streams](#union)
* [Sharing data between Streams applications](#publish)
+ * [Defining a stream's schema](#schema)
<a id="intro"></a>
@@ -292,7 +292,7 @@ Reading from a file or using a file within your Streams application can be done
However, you must use `Topology.add_file_dependency` to ensure that the file or its containing directory will be available at runtime.
- Note: If you are using **IBM Cloud Pak for Data**, this [post discusses how to use a data set in your Streams Topology](https://developer.ibm.com/streamsdev/2019/04/23/tip-for-ibm-cloud-private-for-data-how-to-use-local-data-sets-in-your-streams-python-notebook/).
+ Note: If you are using **IBM Cloud Pak for Data**, this [post discusses how to use a data set in your Streams Topology](https://community.ibm.com/community/user/cloudpakfordata/viewdocument/how-to-use-local-files-in-a-streams?CommunityKey=c0c16ff2-10ef-4b50-ae4c-57d769937235&tab=librarydocuments).
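For illustration, here is a minimal sketch of declaring a file dependency and reading the bundled file at runtime. This is not the guide's original sample (the original snippet follows); the file name `data.csv`, the CSV parsing and the `etc` location are assumptions.

~~~python
# A minimal sketch, not the guide's original sample: the file name 'data.csv',
# the CSV parsing and the 'etc' location are illustrative assumptions.
import csv
import os

import streamsx.ec
from streamsx.topology.topology import Topology


def read_rows():
    # At runtime the bundled file is resolved relative to the application directory.
    path = os.path.join(streamsx.ec.get_application_directory(), 'etc', 'data.csv')
    with open(path) as f:
        for row in csv.reader(f):
            yield row


topo = Topology('FileDependencyExample')
# Bundle the local file with the application; 'etc' places it under <application dir>/etc.
topo.add_file_dependency('data.csv', 'etc')
rows = topo.source(read_rows)
rows.print()
~~~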
~~~python
topo = Topology("ReadFromFile")
@@ -2255,3 +2255,207 @@ The contents of your output file look something like this:
For more information, see [Publish-subscribe overview](https://streamsxtopology.readthedocs.io/en/stable/streamsx.topology.html#publish-subscribe-overview).
+ <a id="schema"></a>
+ ## Defining a stream's schema
+ A stream represents an unbounded flow of tuples with a declared schema so that each tuple on the stream complies with the schema.
+ A stream's schema may be one of:
+ * **StreamSchema** structured schema - a tuple is a sequence of attributes, and an attribute is a named value of a specific type.
+ * **Json** - a tuple is a JSON object.
+ * **String** - a tuple is a string.
+ * **Python** - a tuple is any Python object, effectively an untyped stream.
+ The application below uses the `Stream.map()` callable between a data source and a data sink callable:
+ The diagram contains labels for `stream1`, `stream2` and `outputSchema` since they are used in the code block and table below. Each SPL operator output port and corresponding stream are defined by a schema. In a Python topology application, `CommonSchema.Python` is the default schema for Python operators.
+ In this sample the output schema is defined with the `schema` parameter of the `map()` function.
+ The table below contains examples of the schema definition and the corresponding SPL schema that is generated by `streamsx.topology` when creating the application.
+ | Schema type | Schema in Python | Schema in generated SPL |
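To make the schema kinds and the `schema` parameter concrete, here is a hedged sketch. The attribute names and the SPL type noted in the comments are assumptions based on the usual Python-to-SPL type mapping, not the rows of the original table.

~~~python
# A hedged sketch of the schema kinds; attribute names and the generated SPL type
# in the comments are assumptions, not the original table's contents.
import typing

from streamsx.topology.topology import Topology
from streamsx.topology.schema import CommonSchema, StreamSchema


class Reading(typing.NamedTuple):
    sensor_id: str
    value: float


topo = Topology('SchemaKinds')
# Default schema of a Python source is CommonSchema.Python (untyped stream).
src = topo.source(lambda: [{'sensor_id': 'a', 'value': 1.0}])

# Structured schema via a named tuple; assumed SPL type: tuple<rstring sensor_id, float64 value>
structured = src.map(lambda t: (t['sensor_id'], t['value']), schema=Reading)

# The same structured schema expressed as an SPL tuple type string.
spl_style = src.map(lambda t: (t['sensor_id'], t['value']),
                    schema=StreamSchema('tuple<rstring sensor_id, float64 value>'))

as_json = src.map(lambda t: t, schema=CommonSchema.Json)           # each tuple is a JSON object
as_string = src.map(lambda t: str(t), schema=CommonSchema.String)  # each tuple is a string
~~~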
+ So far in this *development guide*, we don't use schemas explicitly. But in a large application it is good design to define structured schemas.
+ In certain cases, you must have a schema different from `CommonSchema.Python`:
+ * when writing an application that uses different kinds of callables (Streams SPL operators), because the Python schema is not supported in SPL Java primitive and SPL C++ primitive operators.
+ * when using **publish** and **subscribe** between different applications (if one application is **not** using Python operators); see the sketch after this list.
+ * when creating a job as a service endpoint to consume or produce data via REST using **EndpointSink** or **EndpointSource** from [streamsx.service](https://streamsxtopology.readthedocs.io/en/stable/streamsx.service.html)
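For the publish/subscribe case above, a minimal sketch of how two applications might exchange JSON tuples; the topic name `events/json` and the attribute names are assumptions.

~~~python
# A minimal sketch; the topic name 'events/json' and attribute names are assumptions.
from streamsx.topology.topology import Topology
from streamsx.topology.schema import CommonSchema

# Publishing application: use a non-Python schema so non-Python subscribers can receive the tuples.
pub_topo = Topology('Publisher')
events = pub_topo.source(lambda: [{'id': 1, 'msg': 'hello'}])
events.publish('events/json', schema=CommonSchema.Json)

# Subscribing application (typically a separate topology submitted as its own job).
sub_topo = Topology('Subscriber')
received = sub_topo.subscribe('events/json', schema=CommonSchema.Json)
received.print()
~~~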
+ ### Structured Schema
+ Structured schema can be declared in a number of ways:
+ * An instance of `typing.NamedTuple`
+ * An instance of `StreamSchema`
+ * A string of the format `tuple<...>` defining the attribute names and types.
+ * A string containing a namespace-qualified SPL stream type (e.g. `com.ibm.streams.geospatial::FlightPathEncounterTypes.Observation3D`)
+ Structured schemas provide type safety and efficient network serialization when compared to passing a dict using Python streams.
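A short sketch of the declaration styles listed above; the attribute names are illustrative, and the namespace-qualified type is the one cited in the list.

~~~python
# A sketch of the four declaration styles; attribute names are illustrative.
import typing

from streamsx.topology.schema import StreamSchema


# 1. A typing.NamedTuple class
class Observation(typing.NamedTuple):
    id: str
    latitude: float
    longitude: float

# 2. An instance of StreamSchema
schema_a = StreamSchema('tuple<rstring id, float64 latitude, float64 longitude>')

# 3. A plain string in tuple<...> format
schema_b = 'tuple<rstring id, float64 latitude, float64 longitude>'

# 4. A namespace-qualified SPL stream type from a toolkit
schema_c = 'com.ibm.streams.geospatial::FlightPathEncounterTypes.Observation3D'
~~~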
+ #### Topology.source()
+ * **No** support of explicit schema definition
+ * Generates **CommonSchema.Python** by **default**
+ * Use a type hint on the "source" callable to generate a structured schema stream
+ In the sample below, the **type hint** `-> Iterable[SampleSourceSchema]` is added to the `__call__(self)` method of the class used as the callable in your source.
+ The structured schema `SampleSourceSchema` is defined as a named tuple.
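A minimal sketch of the described pattern, with assumed attribute names and generated values.

~~~python
# A hedged sketch of the described pattern; class and attribute names are assumptions.
import itertools
import random
import typing

from streamsx.topology.topology import Topology


class SampleSourceSchema(typing.NamedTuple):
    id: str
    num: int


class SampleSource:
    def __call__(self) -> typing.Iterable[SampleSourceSchema]:
        # The return type hint gives the source stream a structured schema
        # instead of the default CommonSchema.Python.
        for i in itertools.count():
            yield SampleSourceSchema(id=str(random.randint(0, 100)), num=i)


topo = Topology('TypeHintSource')
stream1 = topo.source(SampleSource())
stream1.print()
~~~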
+ #### Structured schema passing styles (dict vs. named tuple)
+ In the former example the source callable returned a *dict*. You can also return *named tuple* objects, and in both cases the tuples are passed to downstream callables in *named tuple* style.
+ *Does a type hint replace the use of specifying the schema parameter when calling the map transform?*
+ If `schema` is set, then the return type is defined by the schema parameter. Otherwise, if `schema` is not set, the return type hint on `func` defines the schema of the returned stream, defaulting to `CommonSchema.Python` if no type hints are present.
+ Below is the same sample using *dict* style in the "source" callable; the type hint with the *named tuple* schema causes the tuples to be passed in *named tuple* style to the `map()` callable.
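A hedged sketch of that variant: the source emits *dict* objects, while the named-tuple type hint makes downstream callables receive *named tuple* objects. Class and attribute names are assumptions.

~~~python
# A hedged sketch; names are assumptions. The source yields dicts, but the type hint
# declares a named tuple schema, so downstream callables see named tuples.
import typing

from streamsx.topology.topology import Topology


class SampleSourceSchema(typing.NamedTuple):
    id: str
    num: int


class SampleSourceDict:
    def __call__(self) -> typing.Iterable[SampleSourceSchema]:
        # dict style: keys must match the schema attribute names.
        for i in range(10):
            yield {'id': str(i), 'num': i}


def add_one(tpl):
    # tpl arrives as a named tuple because of the type hint on the source.
    return {'id': tpl.id, 'num': tpl.num + 1}


topo = Topology('DictSourceNamedTupleHint')
stream1 = topo.source(SampleSourceDict())
stream2 = stream1.map(add_one, schema=SampleSourceSchema)
stream2.print()
~~~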
+ The following sample uses the SPL operator [streamsx.standard.utility.Sequence](https://streamsxstandard.readthedocs.io/en/latest/generated/streamsx.standard.utility.html#streamsx.standard.utility.Sequence), which generates tuples with the structured schema [streamsx.standard.utility.SEQUENCE_SCHEMA](https://streamsxstandard.readthedocs.io/en/latest/generated/streamsx.standard.utility.html#streamsx.standard.utility.SEQUENCE_SCHEMA).
+ Here you can see the difference from the previous sample: the tuples are passed to the Python callable in *dict* style (see the `Delta()` class used in `streams1.map(Delta())`). Furthermore, this sample demonstrates how to extend a structured schema with the [streamsx.topology.schema.StreamSchema.extend](https://streamsxtopology.readthedocs.io/en/stable/streamsx.topology.schema.html#streamsx.topology.schema.StreamSchema.extend) function. In the `map()` callable the new attribute `d` is set.
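A hedged sketch of the idea: it replaces the `Sequence` operator with a plain Python source and a stand-in schema (the real `SEQUENCE_SCHEMA` layout may differ), and shows `StreamSchema.extend` adding the attribute `d` with a `Delta` callable that receives tuples in *dict* style.

~~~python
# A hedged sketch: a plain Python source and a stand-in schema replace the Sequence
# operator; the real SEQUENCE_SCHEMA layout may differ from the one assumed here.
from streamsx.topology.topology import Topology
from streamsx.topology.schema import StreamSchema

# Stand-in for the structured schema produced by the Sequence operator.
SEQ_SCHEMA = StreamSchema('tuple<uint64 seq, float64 value>')
# Extend the structured schema with the new attribute d, as in the sample.
OUT_SCHEMA = SEQ_SCHEMA.extend(StreamSchema('tuple<float64 d>'))


class Delta:
    """Difference between consecutive values; tuples arrive in dict style."""
    def __init__(self):
        self._last = None

    def __call__(self, tpl):
        d = 0.0 if self._last is None else tpl['value'] - self._last
        self._last = tpl['value']
        tpl['d'] = d  # set the new attribute added via extend()
        return tpl


topo = Topology('ExtendSchema')
src = topo.source(lambda: ({'seq': i, 'value': float(i * i)} for i in range(10)))
streams1 = src.map(lambda t: t, schema=SEQ_SCHEMA)  # structured (SPL) schema stream
streams2 = streams1.map(Delta(), schema=OUT_SCHEMA)
streams2.print()
~~~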
Depending on the problem at hand, a developer might choose to create an IBM Streams application in a particular programming language. To this end, the 'streamsx.topology' project supports APIs in Java, Scala, Python, and IBM Streams Processing Language (SPL). Regardless of the language used to develop and submit the application, however, it becomes necessary to monitor the application while it is running. By monitoring the application, you can observe runtime information regarding the application or its environment, for example: