[Big Data Infrastructure] Getting Started with the StreamSets Basic Tutorial (Preview)

리그캣의 개발놀이터

리그캣 2019. 2. 25. 18:24

This time we will use the 'Preview' feature to see how data arrives from the origin, and how it is modified as it passes through our processing stages.

To get familiar with the data set and gather the details needed to configure the pipeline, you should preview the source data first.
The following are the key details needed to configure the pipeline.

When you access field data, you specify a field path. The field path depends on the complexity of the record: /<fieldname> for a simple record, and <path to field>/<fieldname> for a more complex record.

Because we are using the List-Map root field type, we will use /<fieldname>.
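As a rough illustration of the two field path forms, here is a minimal Python sketch that resolves a slash-separated path against a record modeled as nested dictionaries. This is not the StreamSets API; `get_field` and the field names (`fare_amount`, `trip`, `fare`, `amount`) are hypothetical and exist only for this example. It handles map fields only, not list indexes.

```python
def get_field(record, path):
    """Resolve a slash-separated field path against a nested dict.

    Hypothetical helper for illustration; not part of StreamSets.
    """
    if not path.startswith("/"):
        raise ValueError("field path must start with '/'")
    if path == "/":
        return record  # the root field itself
    value = record
    for name in path.strip("/").split("/"):
        value = value[name]  # raises KeyError if the field does not exist
    return value


# Simple record: every field sits directly under the root, so /<fieldname> is enough.
simple_record = {"fare_amount": "12.5", "payment_type": "CRD"}

# More complex record: the field is nested, so the path includes the parents.
nested_record = {"trip": {"fare": {"amount": "12.5"}}}

simple_value = get_field(simple_record, "/fare_amount")
nested_value = get_field(nested_record, "/trip/fare/amount")
print(simple_value, nested_value)
```

With a flat record, as in this tutorial, every field is reachable with the short /<fieldname> form.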

To start data preview, all stages must be connected and all required properties must be defined. Since the origin is the only stage configured so far, the pipeline should already be previewable.
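The two preconditions can be sketched as a small check. This is a toy model under my own assumptions, not how Data Collector validates a pipeline: the `Stage` class, `preview_problems`, the stage names, the property names, and the directory path are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Toy stand-in for a pipeline stage (hypothetical, for illustration)."""
    name: str
    is_origin: bool = False
    inputs: list = field(default_factory=list)    # names of upstream stages
    required: dict = field(default_factory=dict)  # required property -> value


def preview_problems(stages):
    """Return a list of reasons why the pipeline could not be previewed."""
    names = {s.name for s in stages}
    problems = []
    for s in stages:
        # Every stage except the origin needs an upstream stage that exists.
        if not s.is_origin and not any(i in names for i in s.inputs):
            problems.append(f"{s.name}: not connected to an upstream stage")
        # Every required property must have a non-empty value.
        for prop, value in s.required.items():
            if value in (None, ""):
                problems.append(f"{s.name}: required property '{prop}' is not set")
    return problems


# Only the origin is configured, with its required property filled in
# (the path is a made-up placeholder).
origin = Stage("origin", is_origin=True, required={"files_directory": "/tmp/tutorial/origin"})
print(preview_problems([origin]))

# A dangling processor with a missing property is reported twice.
dangling = Stage("processor", required={"expression": ""})
print(preview_problems([origin, dangling]))
```

In this model a pipeline that contains only a fully configured origin reports no problems, which matches the situation at this point in the tutorial.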

Click Preview.

Fill in the Preview Configuration dialog using the following table as a reference.

Data Preview Property	Description
Preview Source	Use the default Configured Source to use the sample source data.
Write to Destinations and Executors	By default, this property is not selected. In general, you should keep this property clear to avoid writing data to destination systems or triggering tasks in destination systems.
Show Field Type	By default, this property is selected. Keep it selected to see the data type of fields in the record.
Remember the Configuration	Select to use these properties each time you run data preview. When you select this option, the UI enters data preview without showing this dialog box again.

In other words, select the options as described above.

Alternatively, choose whatever suits your needs. Once I got used to it, I selected all of them.
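To show why Write to Destinations and Executors is usually left cleared, here is a small Python sketch of the idea: records flow through the processing steps, but nothing reaches the destination unless the option is turned on. This is a conceptual model only, not StreamSets code; `run_preview`, `to_float`, `sink`, and the sample record are hypothetical.

```python
def run_preview(records, processors, destination, write_to_destinations=False):
    """Pass records through the processors; write to the destination only on request."""
    output = []
    for record in records:
        for process in processors:
            record = process(record)
        output.append(record)
    if write_to_destinations:
        destination.extend(output)
    return output


def to_float(record):
    """Example processor: convert the fare from a string to a float."""
    return {**record, "fare_amount": float(record["fare_amount"])}


sink = []  # stands in for a destination system
records = [{"fare_amount": "12.5"}]

preview = run_preview(records, [to_float], sink)
print(preview)  # the transformed records are visible in the preview
print(sink)     # the destination has received nothing
```

Calling `run_preview(records, [to_float], sink, write_to_destinations=True)` instead would also append the transformed records to `sink`, which is the side effect the default setting avoids.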

Click 'Run Preview' and you can preview the data (provided no errors occur).

You can stop the preview by clicking the stop icon.

'Preview' is a genuinely helpful feature.
