a) Relational queries on data streams
The following table compares traditional relational algebra and stream processing with respect to input data, execution, and output results.
Relational Algebra / SQL | Stream processing |
---|---|
A relation (or table) is a bounded (multi-)set of tuples. | A stream is an infinite sequence of tuples. |
A query executed on batch data (such as a table in a relational database) has access to the complete input data. | A streaming query cannot access all data when it starts and must "wait" for data to be streamed in. |
A batch query terminates after it has produced a fixed-size result. | A streaming query continuously updates its result based on the records it receives and never completes. |
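
The contrast in the table can be illustrated with a minimal Python sketch. It is not Flink code: the names `batch_count` and `streaming_count` are hypothetical, and a short finite list stands in for what would be an unbounded stream in practice.

```python
from typing import Iterable, Iterator, List


def batch_count(table: List[str]) -> int:
    """Batch query: sees the complete, bounded input and returns one final result."""
    return len(table)


def streaming_count(stream: Iterable[str]) -> Iterator[int]:
    """Streaming query: emits an updated result after every record it receives."""
    count = 0
    for _ in stream:
        count += 1
        yield count


# Assumption: a small list of user names stands in for the input data.
clicks = ["alice", "bob", "alice"]

batch_result = batch_count(clicks)                     # one fixed-size result
stream_results = list(streaming_count(iter(clicks)))   # one update per record
```

The batch function returns only after it has seen every row, while the generator yields a new intermediate result for each incoming record and would keep doing so for as long as records arrive.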
Despite these differences, processing streams with relational queries and SQL is not impossible. Advanced relational database systems offer a feature called Materialized Views.
A materialized view is defined as a SQL query, just like a regular virtual view. In contrast to a virtual view, a materialized view caches the result of the query, so the query does not need to be evaluated when the view is accessed. A common challenge with caching is preventing the cache from serving outdated results: a materialized view becomes outdated when the base tables of its defining query are modified. Eager View Maintenance is a technique that updates a materialized view as soon as its base tables are updated.
The connection between eager view maintenance and SQL queries on streams becomes obvious if we consider the following:
- A database table is the result of a stream of INSERT, UPDATE, and DELETE DML statements, often called a changelog stream.
- A materialized view is defined as a SQL query. To update the view, the query continuously processes the changelog streams of the view's base relations.
- The materialized view is the result of the streaming SQL query.
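
As a rough illustration of these three points, the sketch below maintains a materialized view eagerly while a changelog stream is applied to its base table. It is plain Python, not a database or Flink API. The class `ClickCountView`, the `clicks` table, and the view query `SELECT user, COUNT(*) FROM clicks GROUP BY user` are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# A changelog record: (operation, primary key, new value or None for DELETE).
Change = Tuple[str, int, Optional[str]]


class ClickCountView:
    """Hypothetical view for: SELECT user, COUNT(*) FROM clicks GROUP BY user."""

    def __init__(self) -> None:
        self.base_table: Dict[int, str] = {}  # click id -> user
        self.view: Counter = Counter()        # user -> number of clicks

    def apply(self, change: Change) -> None:
        """Apply one DML change to the base table and update the view at once."""
        op, key, user = change
        if op in ("UPDATE", "DELETE"):
            # Retract the old row's contribution to the view.
            old_user = self.base_table.pop(key)
            self.view[old_user] -= 1
            if self.view[old_user] == 0:
                del self.view[old_user]
        if op in ("INSERT", "UPDATE"):
            # Add the new row's contribution to the view.
            self.base_table[key] = user
            self.view[user] += 1


# Assumption: a short, finite changelog stands in for an unbounded stream.
changelog = [
    ("INSERT", 1, "alice"),
    ("INSERT", 2, "bob"),
    ("INSERT", 3, "alice"),
    ("UPDATE", 2, "alice"),
    ("DELETE", 1, None),
]

mv = ClickCountView()
for change in changelog:
    mv.apply(change)

view_snapshot = dict(mv.view)
```

Because the view is adjusted inside `apply`, reading `mv.view` never requires re-running the aggregation over the base table, and the view is never stale with respect to the changes applied so far.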