Configuring Spark on YARN

Hello, Habr! Yesterday, at the Apache Spark meetup hosted by the folks at Rambler&Co, attendees asked quite a few questions about configuring this tool. We decided to follow their lead and share our own experience. The topic is not a simple one, so please share your experience in the comments too; it may well be that we misunderstand or misuse something ourselves.

First, a short introduction to how we use Spark. We run a three-month program called "Big Data Specialist", and our participants work with this tool throughout the second module. Our job as organizers is therefore to prepare the cluster for this use case.

What makes our usage unusual is that the number of people working on Spark at the same time can equal the whole group, for example in a seminar, when everyone tries something at once, repeating after the instructor. And that is no small number: sometimes up to 40 people. There are probably not many companies in the world that face this kind of use case.

ืœืื—ืจ ืžื›ืŸ, ืืกืคืจ ืœืš ื›ื™ืฆื“ ื•ืžื“ื•ืข ื‘ื—ืจื ื• ืคืจืžื˜ืจื™ื ืžืกื•ื™ืžื™ื ืฉืœ ืชืฆื•ืจื”.

ื‘ื•ืื• ื ืชื—ื™ืœ ืžื”ื”ืชื—ืœื”. ืœ-Spark ื™ืฉ 3 ืืคืฉืจื•ื™ื•ืช ืœืจื•ืฅ ืขืœ ืืฉื›ื•ืœ: ืขืฆืžืื™, ื‘ืืžืฆืขื•ืช Mesos ื•ืฉื™ืžื•ืฉ ื‘-YARN. ื”ื—ืœื˜ื ื• ืœื‘ื—ื•ืจ ื‘ืืคืฉืจื•ืช ื”ืฉืœื™ืฉื™ืช ื›ื™ ื–ื” ื ืจืื” ืœื ื• ื”ื’ื™ื•ื ื™. ื›ื‘ืจ ื™ืฉ ืœื ื• ืืฉื›ื•ืœ Hadoop. ื”ืžืฉืชืชืคื™ื ืฉืœื ื• ื›ื‘ืจ ืžื›ื™ืจื™ื ื”ื™ื˜ื‘ ืืช ื”ืืจื›ื™ื˜ืงื˜ื•ืจื” ืฉืœื•. ื‘ื•ืื• ื ืฉืชืžืฉ ื‘-YARN.

spark.master=yarn

Now it gets more interesting. Each of these three deployment options has two deploy modes: client and cluster. Judging by the documentation and various links on the internet, client mode suits interactive work, for example through a Jupyter notebook, while cluster mode is better suited to production solutions. We were interested in interactive work, so:

spark.submit.deployMode=client

At this point Spark will more or less work on YARN, but that was not enough for us. Since ours is a big data program, participants sometimes needed more than they got from an even split of resources. Then we found an interesting feature: dynamic resource allocation. In short, if you have a heavy job and the cluster is free (in the morning, say), Spark can give you additional resources with this option enabled. The demand is calculated by a clever formula. We won't go into the details; it works well.

spark.dynamicAllocation.enabled=true

We set this parameter, and Spark crashed on startup. Fair enough: we should have read the documentation more carefully. It says that for this to work, one more parameter has to be enabled as well.

spark.shuffle.service.enabled=true

Why is it needed? When our job no longer needs so many resources, Spark should return them to the common pool. The most time-consuming stage of almost any MapReduce job is the shuffle stage. This parameter enables an external shuffle service that keeps the data produced at that stage available, so the executors themselves can be released. An executor, in turn, is the process that does all the computation on a worker; it has a certain number of CPU cores and a certain amount of memory.

With this parameter added, everything seemed to work. Participants visibly got more resources when they needed them. But another problem appeared: at some point other participants woke up and wanted to use Spark too, but everything was already taken, and they were unhappy. Understandably so. We went back to the documentation and found a number of other parameters that influence the process. For example, if an executor is idle, how long should it be before its resources can be taken away?

spark.dynamicAllocation.executorIdleTimeout=120s

ื‘ืžืงืจื” ืฉืœื ื•, ืื ื”ืžื•ืฆื™ืื™ื ืœืคื•ืขืœ ืฉืœืš ืœื ืขื•ืฉื™ื ื›ืœื•ื ื‘ืžืฉืš ืฉืชื™ ื“ืงื•ืช, ืื– ื‘ื‘ืงืฉื” ื”ื—ื–ืจ ืื•ืชื ืœื‘ืจื™ื›ื” ื”ืžืฉื•ืชืคืช. ืื‘ืœ ื”ืคืจืžื˜ืจ ื”ื–ื” ืœื ืชืžื™ื“ ื”ืกืคื™ืง. ื”ื™ื” ื‘ืจื•ืจ ืฉื”ืื“ื ืœื ืขืฉื” ื“ื‘ืจ ื‘ืžืฉืš ื–ืžืŸ ืจื‘, ื•ืžืฉืื‘ื™ื ืœื ื”ืชืคื ื•. ื”ืชื‘ืจืจ ืฉื™ืฉ ื’ื ืคืจืžื˜ืจ ืžื™ื•ื—ื“ - ืื—ืจื™ ืื™ื–ื” ืฉืขื” ืœื‘ื—ื•ืจ executors ื”ืžื›ื™ืœื™ื ื ืชื•ื ื™ื ื‘ืžื˜ืžื•ืŸ. ื›ื‘ืจื™ืจืช ืžื—ื“ืœ, ื”ืคืจืžื˜ืจ ื”ื–ื” ื”ื™ื” ืื™ื ืกื•ืฃ! ืชื™ืงื ื• ืืช ื–ื”.

spark.dynamicAllocation.cachedExecutorIdleTimeout=600s
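To make the interplay between the two idle timeouts concrete, here is a minimal Python sketch. It is an illustration only, not Spark's code: the function names are made up for this example, and `None` stands in for the "infinity" default.

```python
from typing import Optional


def idle_timeout_seconds(has_cached_data: bool,
                         executor_idle_timeout: float = 120.0,
                         cached_executor_idle_timeout: Optional[float] = 600.0
                         ) -> Optional[float]:
    """Pick the timeout that applies to an executor (None means 'never')."""
    # Executors holding cached data are governed by the separate, longer timeout.
    if has_cached_data:
        return cached_executor_idle_timeout
    return executor_idle_timeout


def can_release(idle_seconds: float,
                has_cached_data: bool,
                executor_idle_timeout: float = 120.0,
                cached_executor_idle_timeout: Optional[float] = 600.0) -> bool:
    """Return True if an idle executor may go back to the common pool."""
    timeout = idle_timeout_seconds(has_cached_data,
                                   executor_idle_timeout,
                                   cached_executor_idle_timeout)
    # With an infinite timeout the executor is never released, however long it idles.
    return timeout is not None and idle_seconds >= timeout
```

With the values used in this post, an executor without cached data becomes releasable after 120 idle seconds, while one with cached data is kept for 600 seconds. Passing `cached_executor_idle_timeout=None` reproduces the behaviour we ran into with the default.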

ื›ืœื•ืžืจ, ืื ื”ืžื•ืฆื™ืื™ื ืœืคื•ืขืœ ืฉืœื›ื ืœื ืขื•ืฉื™ื ื›ืœื•ื ื‘ืžืฉืš 5 ื“ืงื•ืช, ืชื ื• ืื•ืชื ืœื‘ืจื™ื›ื” ื”ืžืฉื•ืชืคืช. ื‘ืžืฆื‘ ื–ื”, ืžื”ื™ืจื•ืช ื”ืฉื—ืจื•ืจ ื•ื”ื ืคืงืช ืžืฉืื‘ื™ื ืขื‘ื•ืจ ืžืกืคืจ ืจื‘ ืฉืœ ืžืฉืชืžืฉื™ื ื”ืคื›ื” ื”ื’ื•ื ื”. ื›ืžื•ืช ืื™ ืฉื‘ื™ืขื•ืช ื”ืจืฆื•ืŸ ื™ืจื“ื”. ืื‘ืœ ื”ื—ืœื˜ื ื• ืœืœื›ืช ืจื—ื•ืง ื™ื•ืชืจ ื•ืœื”ื’ื‘ื™ืœ ืืช ื”ืžืกืคืจ ื”ืžืจื‘ื™ ืฉืœ ืžื‘ืฆืขื™ื ืœื›ืœ ืืคืœื™ืงืฆื™ื” - ื‘ืขืฆื ืœื›ืœ ืžืฉืชืชืฃ ื‘ืชื•ื›ื ื™ืช.

spark.dynamicAllocation.maxExecutors=19

Now, of course, there are unhappy people on the other side ("the cluster is idle, and I only get 19 executors"), but what can you do? We need some kind of reasonable balance. You can't make everyone happy.
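As a rough illustration of how demand and the cap interact, the sketch below sizes an executor request from the number of outstanding tasks and then clamps it to the configured maximum. This is a simplified reading of dynamic allocation, not Spark's actual implementation (which, among other things, ramps requests up gradually). The function name and the default core counts are assumptions made for this example.

```python
import math


def target_executors(pending_tasks: int,
                     running_tasks: int,
                     executor_cores: int = 1,
                     task_cpus: int = 1,
                     min_executors: int = 0,
                     max_executors: int = 19) -> int:
    """Estimate how many executors an application would ask for."""
    # How many tasks one executor can run at the same time.
    tasks_per_executor = max(1, executor_cores // task_cpus)
    # Executors needed to run every outstanding task concurrently.
    needed = math.ceil((pending_tasks + running_tasks) / tasks_per_executor)
    # Clamp the demand to the configured bounds (maxExecutors=19 in this post).
    return max(min_executors, min(needed, max_executors))
```

For example, `target_executors(200, 0, executor_cores=2)` would need 100 executors but is capped at 19, which is exactly the complaint quoted above, while `target_executors(10, 0, executor_cores=2)` stays at 5.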

One more short story tied to the specifics of our case. Once, several people were late for a hands-on class, and for some reason Spark would not start for them. We checked the amount of free resources: there seemed to be enough, so Spark should have started. Fortunately, by then the documentation had sunk in somewhere deep, and we remembered that at startup Spark looks for a port to bind to. If the first port in the range is busy, it moves on to the next one; if that one is free, it takes it. There is a parameter that sets the maximum number of such attempts, and its default is 16. That is fewer than the number of people in our group in class. So after 16 attempts Spark gave up and reported that it could not start. We changed this parameter.

spark.port.maxRetries=50
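The port search can be sketched in a few lines of Python. This is a simplified model, not Spark's code: it assumes the port in question is the driver UI port (4040 by default), that all drivers start on the same host, and that Spark tries the starting port plus up to `max_retries` further ports. The function name is invented for the example.

```python
from typing import Optional, Set


def find_free_port(start_port: int,
                   busy_ports: Set[int],
                   max_retries: int = 16) -> Optional[int]:
    """Return the first free port, or None if every attempt hits a busy port."""
    # Try the starting port, then each following port, up to max_retries extra tries.
    for offset in range(max_retries + 1):
        candidate = start_port + offset
        if candidate not in busy_ports:
            return candidate
    # All attempts exhausted: this is the point where startup fails.
    return None


# Seventeen participants already hold ports 4040..4056.
busy = set(range(4040, 4057))
print(find_free_port(4040, busy))                  # default limit: no port found
print(find_free_port(4040, busy, max_retries=50))  # raised limit: next free port
```

Under these assumptions the default limit is exhausted once the first seventeen ports are taken, so a class of 40 cannot all start a driver on one machine until the limit is raised.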

Next, a few settings that are not particularly tied to the specifics of our case.

To make Spark start faster, it is recommended to archive the jars folder located in the SPARK_HOME directory and put the archive on HDFS. Spark then won't spend time uploading these jars to the workers on every launch.

spark.yarn.archive=hdfs:///tmp/spark-archive.zip
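One possible way to build such an archive is sketched below in Python. The function name and the paths are hypothetical. The sketch places the jar files at the root of the archive, which is the layout the Spark documentation describes for `spark.yarn.archive`.

```python
import zipfile
from pathlib import Path
from typing import List


def build_spark_archive(jars_dir: str, archive_path: str) -> List[str]:
    """Zip every *.jar from jars_dir into archive_path and return their names."""
    jars = sorted(Path(jars_dir).glob("*.jar"))
    # Jars are already compressed, so they are stored without recompression.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        for jar in jars:
            # arcname=jar.name puts each jar at the root of the archive.
            archive.write(jar, arcname=jar.name)
    return [jar.name for jar in jars]
```

A call such as `build_spark_archive("/path/to/spark/jars", "spark-archive.zip")` produces the file, which can then be copied to HDFS, for example with `hdfs dfs -put spark-archive.zip /tmp/`, so that it matches the path in the setting above.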

It is also recommended to use Kryo as the serializer for faster operation. It is better optimized than the default one.

spark.serializer=org.apache.spark.serializer.KryoSerializer

There is also a long-standing problem with Spark: it often crashes with out-of-memory errors. This frequently happens at the moment when the workers have finished computing and are sending the result to the driver. We increased the parameter that limits the size of that result. The default is 1 GB; we raised it to 3 GB.

spark.driver.maxResultSize=3g
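For reference, the Spark settings discussed so far can be gathered into a single `spark-defaults.conf`. The small Python sketch below renders them. The helper name is invented for this example; the deploy mode is written with the `spark.submit.deployMode` property name and the result size with an explicit unit (`3g`), both of which are choices made for this sketch.

```python
from typing import Dict

# Values taken from this post; keys are standard Spark property names.
SETTINGS: Dict[str, str] = {
    "spark.master": "yarn",
    "spark.submit.deployMode": "client",
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.executorIdleTimeout": "120s",
    "spark.dynamicAllocation.cachedExecutorIdleTimeout": "600s",
    "spark.dynamicAllocation.maxExecutors": "19",
    "spark.port.maxRetries": "50",
    "spark.yarn.archive": "hdfs:///tmp/spark-archive.zip",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.driver.maxResultSize": "3g",
}


def render_spark_defaults(settings: Dict[str, str]) -> str:
    """Render settings as spark-defaults.conf lines: key, whitespace, value."""
    return "".join(f"{key} {value}\n" for key, value in settings.items())


# Print the file contents; redirect to conf/spark-defaults.conf if desired.
print(render_spark_defaults(SETTINGS), end="")
```

Keeping the values in one dictionary makes it easy to review them together or to generate per-class variants, for example with a different `maxExecutors` for a smaller group.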

And finally, for dessert: how to upgrade Spark to version 2.1 on the Hortonworks distribution, HDP 2.5.3.0. This HDP version ships with Spark 2.0 preinstalled, but we had concluded that Spark is developing quite actively and that each new release fixes some bugs and adds features, including for the Python API, so we decided an upgrade was in order.

ื”ื•ืจื“ ืืช ื”ื’ืจืกื” ืžื”ืืชืจ ื”ืจืฉืžื™ ืฉืœ Hadoop 2.7. ืคืชื— ืื•ืชื• ื•ื”ื›ื ืก ืœืชื™ืงื™ื™ืช HDP. ื”ืชืงื ื• ืืช ื”ืกืžืœื™ื ืœืคื™ ื”ืฆื•ืจืš. ืื ื—ื ื• ืžืฉื™ืงื™ื ืืช ื–ื” - ื–ื” ืœื ืžืชื—ื™ืœ. ื›ื•ืชื‘ ืฉื’ื™ืื” ืžืื•ื“ ืœื ื‘ืจื•ืจื”.

java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig

ืœืื—ืจ ื—ื™ืคื•ืฉื™ื ื‘ื’ื•ื’ืœ, ื’ื™ืœื™ื ื• ืฉืกืคืืจืง ื”ื—ืœื™ื˜ื” ืœื ืœื—ื›ื•ืช ืขื“ ืฉื”ืื“ื•ื•ืค ืชื™ื•ื•ืœื“, ื•ื”ื—ืœื™ื˜ื” ืœื”ืฉืชืžืฉ ื‘ื’ืจืกื” ื”ื—ื“ืฉื” ืฉืœ ื’'ืจื–ื™. ื”ื ืขืฆืžื ืžืชื•ื•ื›ื—ื™ื ื–ื” ืขื ื–ื” ืขืœ ื”ื ื•ืฉื ื”ื–ื” ื‘-JIRA. ื”ืคืชืจื•ืŸ ื”ื™ื” ืœื”ื•ืจื™ื“ ื’ืจืกืช ื’'ืจื–ื™ 1.17.1. ืžืงื ืืช ื–ื” ื‘ืชื™ืงื™ื™ืช jars ื‘-SPARK_HOME, ื“ื—ืก ืื•ืชื• ืฉื•ื‘ ื•ื”ืขืœื” ืื•ืชื• ืœ-HDFS.

We got past that error, but a new and rather vague one appeared.

org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master

Meanwhile, when we tried to run version 2.0, everything was fine. Try to guess what was going on. We dug into the logs of this application and saw something like this:

/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar

Basically, hdp.version was for some reason not being resolved. After some googling we found a solution: go to the YARN settings in Ambari and add a parameter to the custom yarn-site:

hdp.version=2.5.3.0-37

This magic helped, and Spark took off. We tested several of our Jupyter notebooks. Everything works. We are ready for the first Spark class on Saturday (tomorrow)!

UPD. Another problem surfaced during the class. At some point YARN stopped handing out containers for Spark. We had to change a YARN parameter that defaulted to 0.2:

yarn.scheduler.capacity.maximum-am-resource-percent=0.8

That is, only 20% of the resources could be used by application masters, which capped the number of applications running at once. After changing the parameter, we restarted YARN. The problem was solved, and the remaining participants were able to start a Spark context as well.
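A back-of-the-envelope calculation shows why this setting matters with a full classroom. The sketch below is only an approximation: the cluster size and the application master container size are hypothetical numbers chosen for illustration, the function name is made up, and the real Capacity Scheduler also accounts for queues, per-user limits, and vcores.

```python
def max_concurrent_apps(queue_memory_mb: int,
                        am_resource_percent: float,
                        am_container_mb: int) -> int:
    """Estimate how many application masters fit into the allowed share."""
    # Memory that application masters are allowed to occupy in total.
    am_budget_mb = int(queue_memory_mb * am_resource_percent)
    # Each running application needs exactly one application master container.
    return am_budget_mb // am_container_mb


# Hypothetical cluster: 100 GB of YARN memory, 1 GB per application master.
print(max_concurrent_apps(102400, 0.2, 1024))
print(max_concurrent_apps(102400, 0.8, 1024))
```

With these made-up numbers the 0.2 setting admits about 20 simultaneous applications and the 0.8 setting about 80, so a group of 40 fits only after the change.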

Source: www.habr.com
