
NiFi in Trucking IoT on HDF

Creating a NiFi DataFlow

Introduction

We know the role NiFi plays in this Trucking IoT application. Let's analyze the NiFi DataFlow to learn how it was built, diving into the process of configuring controller services and processors.

Outline

NiFi Components

Check out the Core Concepts of NiFi to learn more about the NiFi Components used in creating a NiFi DataFlow.

Starting to Build a NiFi DataFlow

Before we begin building our NiFi DataFlow, let’s make sure we start with a clean canvas.

  • Press CTRL-A or COMMAND-A to select the entire canvas
  • On the Operate Palette, click DELETE

Note: You may need to empty the queues before deleting the DataFlow. Do this by **right-clicking** a non-empty queue, then selecting **Empty queue**.

Setting up Schema Registry Controller Service

As the first step in building the DataFlow, we needed to set up a NiFi Controller Service called HortonworksSchemaRegistry. Go to the Operate Palette, click on the gear icon, then select the Controller Services tab. To add a new controller service, you would press the "+" icon in the top right of the table. However, since the service has already been created, we will reference it to see how a user would connect NiFi with Schema Registry.

HortonworksSchemaRegistry

Properties Tab of this Controller Service

| Property | Value |
| --- | --- |
| Schema Registry URL | http://sandbox-hdf.hortonworks.com:7788/api/v1 |
| Cache Size | 1000 |
| Cache Expiration | 1 hour |

A schema is used to categorize the data into separate categories: TruckData and TrafficData. These schemas will be applied to the data by the ConvertRecord processor.

From the configuration in the table above, we can see the URL that allows NiFi to interact with Schema Registry, the maximum number of schemas that can be cached, and the amount of time until the schema cache expires and NiFi must communicate with Schema Registry again.
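The caching behavior described above can be illustrated with a small sketch. This is a hypothetical illustration of a TTL cache, not NiFi's actual implementation; the `SchemaCache` class and `fake_fetch` function are made up for the example.

```python
import time

class SchemaCache:
    """Minimal TTL-cache sketch mirroring the Cache Size / Cache Expiration
    settings of the HortonworksSchemaRegistry controller service."""

    def __init__(self, fetch_fn, max_size=1000, ttl_seconds=3600):
        self.fetch_fn = fetch_fn     # called on a cache miss (e.g. an HTTP GET)
        self.max_size = max_size     # Cache Size = 1000
        self.ttl = ttl_seconds       # Cache Expiration = 1 hour
        self._entries = {}           # schema name -> (schema, fetched_at)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is not None:
            schema, fetched_at = entry
            if time.monotonic() - fetched_at < self.ttl:
                return schema        # fresh: no registry round-trip needed
        schema = self.fetch_fn(name) # expired or missing: ask the registry
        if len(self._entries) >= self.max_size:
            self._entries.pop(next(iter(self._entries)))  # evict oldest entry
        self._entries[name] = (schema, time.monotonic())
        return schema

calls = []
def fake_fetch(name):
    """Stand-in for a real Schema Registry REST call."""
    calls.append(name)
    return {"name": name, "type": "record"}

cache = SchemaCache(fake_fetch, max_size=2, ttl_seconds=3600)
cache.get("trucking_data_truck_enriched")
cache.get("trucking_data_truck_enriched")  # second call served from cache
print(len(calls))  # 1
```

Within the expiration window, repeated lookups of the same schema cost no registry round-trips, which is the point of the Cache Size and Cache Expiration settings.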

Building GetTruckingData

NiFi Data Simulator – Generates data of two types, TruckData and TrafficData, as CSV strings.

GetTruckingData

Keep the configurations in the Setting Tab, Scheduling Tab, and Properties Tab as defaults.

Configuring RouteOnAttribute

RouteOnAttribute – Filters TruckData and TrafficData types into two separate flows from GetTruckingData.

RouteOnAttribute

Right-click on the processor and select the Configure option to see the different configuration tabs and their parameters. In each tab, you will see the following configurations:

Setting Tab

| Setting | Value |
| --- | --- |
| Automatically Terminate Relationships | unmatched |

The rest should be kept as default.

Scheduling Tab

Keep the default configurations.

Properties Tab

| Property | Value |
| --- | --- |
| Routing Strategy | Route to Property name |
| TrafficData | ${dataType:equals('TrafficData')} |
| TruckData | ${dataType:equals('TruckData')} |
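With the Route to Property name strategy, each property name becomes a relationship, and each expression becomes a check against the FlowFile's `dataType` attribute; anything that matches no property goes to unmatched, which is auto-terminated. A hypothetical plain-Python sketch of that logic (not NiFi code; the `route_on_attribute` function and the sample FlowFiles are made up):

```python
def route_on_attribute(flowfiles):
    """Sketch of RouteOnAttribute with Routing Strategy = Route to Property name.
    FlowFiles whose dataType attribute matches a property name are routed to the
    relationship of that name; the rest go to 'unmatched'."""
    routes = {"TruckData": [], "TrafficData": [], "unmatched": []}
    for ff in flowfiles:
        data_type = ff["attributes"].get("dataType")
        if data_type in ("TruckData", "TrafficData"):  # ${dataType:equals(...)}
            routes[data_type].append(ff)
        else:
            routes["unmatched"].append(ff)             # auto-terminated in NiFi
    return routes

flowfiles = [
    {"attributes": {"dataType": "TruckData"},   "content": "a|b|c"},
    {"attributes": {"dataType": "TrafficData"}, "content": "d|e|f"},
    {"attributes": {"dataType": "Other"},       "content": ""},
]
routes = route_on_attribute(flowfiles)
print(len(routes["TruckData"]), len(routes["TrafficData"]), len(routes["unmatched"]))
# 1 1 1
```

The two matched relationships become the two separate flows (truck and traffic) that the rest of the DataFlow processes independently.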

Building EnrichTruckData

EnrichTruckData – Adds weather data (fog, wind, rain) to the content of each FlowFile coming in from RouteOnAttribute's TruckData queue.

EnrichTruckData

Learn more about building the GetTruckingData processor in the Coming Soon: “Custom NiFi Processor – Trucking IoT” tutorial.
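The enrichment step can be sketched as appending weather fields to each pipe-delimited TruckData record. This is an illustration only; the field names, order, and values below are assumptions, not the tutorial's exact enrichment schema:

```python
import random

rng = random.Random(42)  # seeded so the sketch is reproducible

def enrich_truck_data(csv_line):
    """Sketch of EnrichTruckData: appends weather fields (foggy, rainy, windy)
    to a pipe-delimited TruckData record. Hypothetical fields and values."""
    foggy = rng.choice([0, 1])
    rainy = rng.choice([0, 1])
    windy = rng.choice([0, 1])
    return f"{csv_line}|{foggy}|{rainy}|{windy}"

line = "1488767711734|26|1|driver1|107|Route-7"
enriched = enrich_truck_data(line)
print(enriched.count("|") - line.count("|"))  # 3 fields were appended
```

The downstream CSVReader must then be configured with a schema that includes these extra fields, which is why the enriched flow uses the trucking_data_truck_enriched schema below.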

Configuring ConvertRecord: TruckData

ConvertRecord – Uses a Controller Service to read incoming CSV TruckData FlowFiles from the EnrichTruckData processor and another Controller Service to write them out as Avro TruckData FlowFiles.

ConvertRecordTruckData

Right-click on the processor and select the Configure option to see the different configuration tabs and their parameters. In each tab, you will see the following configurations:

Setting Tab

| Setting | Value |
| --- | --- |
| Automatically Terminate Relationships | failure |

Scheduling Tab

Keep the default configurations.

Properties Tab

| Property | Value |
| --- | --- |
| Record Reader | CSVReader – Enriched Truck Data |
| Record Writer | AvroRecordWriter – Enriched Truck Data |

In the Operate Palette, you can find more information on the controller services used with this processor:

CSVReader – Enriched Truck Data

Properties Tab of this Controller Service

| Property | Value |
| --- | --- |
| Schema Access Strategy | Use 'Schema Name' Property |
| Schema Registry | HortonworksSchemaRegistry |
| Schema Name | trucking_data_truck_enriched |
| Schema Text | ${avro.schema} |
| Date Format | No value set |
| Time Format | No value set |
| Timestamp Format | No value set |
| CSV Format | Custom Format |
| Value Separator | \| |
| Treat First Line as Header | false |
| Quote Character | |
| Escape Character | \ |
| Comment Marker | No value set |
| Null String | No value set |
| Trim Fields | true |
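The custom CSV format above (pipe separator, no header line, backslash escape, trimmed fields) maps roughly onto Python's `csv` dialect options. A sketch, with an illustrative record whose field values are assumptions (the real field layout comes from the Avro schema in Schema Registry):

```python
import csv
import io

# Sample enriched TruckData record -- values are illustrative only.
raw = "1488767711734|26|1|driver1|107|Route-7|38.95|-92.32|0|1|0\n"

reader = csv.reader(
    io.StringIO(raw),
    delimiter="|",    # Value Separator = |
    escapechar="\\",  # Escape Character = \
)
# Treat First Line as Header = false, so every line is data (no header skip);
# Trim Fields = true is approximated by stripping each field.
rows = [[field.strip() for field in row] for row in reader]
print(rows[0][:3])  # ['1488767711734', '26', '1']
```

A reader configured this way yields one record per line, which the AvroRecordWriter below then serializes against the same named schema.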

AvroRecordWriter – Enriched Truck Data

| Property | Value |
| --- | --- |
| Schema Write Strategy | HWX Content-Encoded Schema Reference |
| Schema Access Strategy | Use 'Schema Name' Property |
| Schema Registry | HortonworksSchemaRegistry |
| Schema Name | trucking_data_truck_enriched |
| Schema Text | ${avro.schema} |

Configuring ConvertRecord: TrafficData

ConvertRecord – Uses a Controller Service to read incoming CSV TrafficData FlowFiles from RouteOnAttribute's TrafficData queue and another Controller Service to write them out as Avro TrafficData FlowFiles.

ConvertRecordTrafficData

Right-click on the processor and select the Configure option to see the different configuration tabs and their parameters. In each tab, you will see the following configurations:

Setting Tab

| Setting | Value |
| --- | --- |
| Automatically Terminate Relationships | failure |

Scheduling Tab

Keep the default configurations.

Properties Tab

| Property | Value |
| --- | --- |
| Record Reader | CSVReader – Traffic Data |
| Record Writer | AvroRecordWriter – Traffic Data |

In the Operate Palette, you can find more information on the controller services used with this processor:

CSVReader – Traffic Data

Properties Tab of this Controller Service

| Property | Value |
| --- | --- |
| Schema Access Strategy | Use 'Schema Name' Property |
| Schema Registry | HortonworksSchemaRegistry |
| Schema Name | trucking_data_traffic |
| Schema Text | ${avro.schema} |
| Date Format | No value set |
| Time Format | No value set |
| Timestamp Format | No value set |
| CSV Format | Custom Format |
| Value Separator | \| |
| Treat First Line as Header | false |
| Quote Character | |
| Escape Character | \ |
| Comment Marker | No value set |
| Null String | No value set |
| Trim Fields | true |

AvroRecordWriter – Traffic Data

| Property | Value |
| --- | --- |
| Schema Write Strategy | HWX Content-Encoded Schema Reference |
| Schema Access Strategy | Use 'Schema Name' Property |
| Schema Registry | HortonworksSchemaRegistry |
| Schema Name | trucking_data_traffic |
| Schema Text | ${avro.schema} |

Configuring PublishKafka_1_0: TruckData

PublishKafka_1_0 – Receives FlowFiles from the ConvertRecord – TruckData processor and sends each FlowFile's content as a message to the Kafka topic trucking_data_truck_enriched using the Kafka Producer API.

PublishKafka_TruckData

Right-click on the processor and select the Configure option to see the different configuration tabs and their parameters. In each tab, you will see the following configurations:

Setting Tab

| Setting | Value |
| --- | --- |
| Automatically Terminate Relationships | failure, success |

Scheduling Tab

Keep the default configurations.

Properties Tab

| Property | Value |
| --- | --- |
| Kafka Brokers | sandbox-hdf.hortonworks.com:6667 |
| Security Protocol | PLAINTEXT |
| Topic Name | trucking_data_truck_enriched |
| Delivery Guarantee | Best Effort |
| Key Attribute Encoding | UTF-8 Encoded |
| Max Request Size | 1 MB |
| Acknowledgment Wait Time | 5 secs |
| Max Metadata Wait Time | 30 sec |
| Partitioner class | DefaultPartitioner |
| Compression Type | none |
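Conceptually, PublishKafka maps each FlowFile to one Kafka producer record: the FlowFile content becomes the message value, and a key attribute (if present) becomes the message key. The sketch below only builds the records and does no network I/O; a real producer client (not shown) would send them to sandbox-hdf.hortonworks.com:6667 with the settings above. The `to_producer_records` function and `kafka.key` fallback shown here are an assumption for illustration:

```python
def to_producer_records(flowfiles, topic="trucking_data_truck_enriched"):
    """Sketch of PublishKafka_1_0: one FlowFile -> one Kafka message.
    value = FlowFile content bytes; key = kafka.key attribute if present.
    No network I/O -- this only shows the FlowFile -> record mapping."""
    records = []
    for ff in flowfiles:
        key = ff["attributes"].get("kafka.key")    # UTF-8 Encoded key, optional
        value = ff["content"].encode("utf-8")
        if len(value) > 1 * 1024 * 1024:           # Max Request Size = 1 MB
            raise ValueError("message exceeds max request size")
        records.append({"topic": topic, "key": key, "value": value})
    return records

records = to_producer_records(
    [{"attributes": {}, "content": "1488767711734|26|1|driver1"}]
)
print(records[0]["topic"])  # trucking_data_truck_enriched
```

With Delivery Guarantee = Best Effort, the producer does not wait for broker acknowledgment of each message, trading durability for throughput.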

Configuring PublishKafka_1_0: TrafficData

PublishKafka_1_0 – Receives FlowFiles from the ConvertRecord – TrafficData processor and sends each FlowFile's content as a message to the Kafka topic trucking_data_traffic using the Kafka Producer API.

PublishKafka_TrafficData

Right-click on the processor and select the Configure option to see the different configuration tabs and their parameters. In each tab, you will see the following configurations:

Setting Tab

| Setting | Value |
| --- | --- |
| Automatically Terminate Relationships | failure, success |

Scheduling Tab

Keep the default configurations.

Properties Tab

| Property | Value |
| --- | --- |
| Kafka Brokers | sandbox-hdf.hortonworks.com:6667 |
| Security Protocol | PLAINTEXT |
| Topic Name | trucking_data_traffic |
| Delivery Guarantee | Best Effort |
| Key Attribute Encoding | UTF-8 Encoded |
| Max Request Size | 1 MB |
| Acknowledgment Wait Time | 5 secs |
| Max Metadata Wait Time | 5 sec |
| Partitioner class | DefaultPartitioner |
| Compression Type | none |

Summary

Congratulations! You now know the role NiFi plays in the data pipeline of the Trucking IoT demo application and how to create and run a DataFlow.

