Habrastatistics: exploring the most and least visited sections of the site

Hi Habr.

In the previous part, Habr's traffic was analyzed by its main parameters: number of articles, views, and ratings. The popularity of the site's sections, however, was left unexamined. It seemed interesting to look at this more closely and find the most popular and the least popular hubs. At the end I will take a closer look at the "geektimes effect" and finish with a new selection of the best articles based on the new rankings.


For those curious about what came of it, the continuation is under the cut.

Let me remind you once again that these statistics and ratings are unofficial and that I have no insider information. There is also no guarantee that I have not made a mistake or missed something somewhere. Still, I think the result turned out interesting. We will start with the code; those who are not interested can skip the first sections.

Data collection

The first version of the parser recorded only the number of views, the number of comments, and the rating of each article. That is already useful, but it does not allow more complex queries. Now it is time to analyze the thematic sections of the site, which opens up some interesting research, such as seeing how the popularity of the "C++" section has changed over several years.

The article parser has been improved: it now also returns the hubs the article belongs to, as well as the author's nickname and rating (a lot of interesting things can be done with those too, but more on that later). The data is saved in a csv file that looks like this:

2018-12-18T12:43Z,https://habr.com/ru/post/433550/,"Мессенджер Slack — причины выбора, косяки при внедрении и особенности сервиса, облегчающие жизнь",votes:7,votesplus:8,votesmin:1,bookmarks:32,views:8300,comments:10,user:ReDisque,karma:5,subscribers:2,hubs:productpm+soft
...
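As an illustration of this layout, here is a minimal sketch of how one such row could be turned into a dictionary. The helper name `parse_row` and the choice of which fields to convert to integers are assumptions made for this example, and the sample row uses a placeholder title; the article does not show how the author reads the "key:value" fields.

```python
import csv
import io

# Sample row modeled on the one above, with a placeholder title that contains a comma.
SAMPLE = ('2018-12-18T12:43Z,https://habr.com/ru/post/433550/,'
          '"Example title, with a comma",'
          'votes:7,votesplus:8,votesmin:1,bookmarks:32,'
          'views:8300,comments:10,user:ReDisque,karma:5,subscribers:2,'
          'hubs:productpm+soft\n')

def parse_row(row):
    """Turn one CSV row into a dict (hypothetical helper, not the author's code)."""
    record = {"datetime": row[0], "link": row[1], "title": row[2]}
    # Every remaining field has the form "key:value".
    for field in row[3:]:
        key, _, value = field.partition(":")
        record[key] = value
    # Counters become ints; the hub list becomes a Python list.
    for key in ("votes", "votesplus", "votesmin", "bookmarks", "views", "comments"):
        record[key] = int(record[key])
    record["hubs"] = record["hubs"].split("+")
    return record

# csv.reader handles the quoted title, so its comma does not split the field.
record = parse_row(next(csv.reader(io.StringIO(SAMPLE))))
print(record["views"], record["hubs"])
```

The standard `csv` module is used here instead of a plain `split(',')` because titles may contain commas.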

Let's get the list of the site's main thematic hubs.

import requests

def get_as_str(link: str) -> Str:
    # Download a page; on any error return an empty Str so the caller can carry on
    try:
        r = requests.get(link)
        return Str(r.text)
    except Exception:
        return Str("")

def get_hubs():
    hubs = []
    # Walk through the pages of the hub catalogue
    for p in range(1, 12):
        page_html = get_as_str("https://habr.com/ru/hubs/page%d/" % p)
        # page_html = get_as_str("https://habr.com/ru/hubs/geektimes/page%d/" % p)  # Geektimes
        # page_html = get_as_str("https://habr.com/ru/hubs/develop/page%d/" % p)  # Develop
        # page_html = get_as_str("https://habr.com/ru/hubs/admin/page%d" % p)  # Admin
        for hub in page_html.split("media-obj media-obj_hub"):
            info = Str(hub).find_between('"https://habr.com/ru/hub', 'list-snippet__tags')
            # Thematic hubs carry a "*" mark next to their name
            if "*</span>" in info:
                hub_name = info.find_between('/', '/"')
                if 0 < len(hub_name) < 32:
                    hubs.append(hub_name)
    print(hubs)

The find_between function and the Str class extract the string between two tags; I used them in the previous part. Thematic hubs are marked with "*", so they are easy to pick out, and you can uncomment the corresponding lines to get the sections of other categories.
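The Str class itself is not listed in this part. A minimal sketch of what such a helper might look like is given below; it is a hypothetical stand-in written for illustration, not the author's implementation, and the HTML fragment is made up to resemble the markup that get_hubs expects.

```python
class Str(str):
    """Hypothetical stand-in for the article's Str helper: a str with one extra method."""

    def find_between(self, start: str, end: str) -> "Str":
        # Find the first occurrence of `start`, then the first `end` after it.
        begin = self.find(start)
        if begin < 0:
            return Str("")
        begin += len(start)
        stop = self.find(end, begin)
        if stop < 0:
            return Str("")
        return Str(self[begin:stop])

# Made-up fragment imitating one hub entry of the catalogue page.
snippet = Str('<a href="https://habr.com/ru/hub/python/" class="x">'
              'Python <span>*</span></a> list-snippet__tags')
info = snippet.find_between('"https://habr.com/ru/hub', 'list-snippet__tags')
hub_name = info.find_between('/', '/"')
print(hub_name)  # python
```

Returning an empty Str when a marker is missing keeps the calling code free of exception handling, which matches how get_hubs checks `len(hub_name) > 0`.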

The output of get_hubs is a rather impressive list, which we save as a set. I am deliberately giving the whole list so that you can appreciate its size.

hubs_profile = {'infosecurity', 'programming', 'webdev', 'python', 'sys_admin', 'it-infrastructure', 'devops', 'javascript', 'open_source', 'network_technologies', 'gamedev', 'cpp', 'machine_learning', 'pm', 'hr_management', 'linux', 'analysis_design', 'ui', 'net', 'hi', 'maths', 'mobile_dev', 'productpm', 'win_dev', 'it_testing', 'dev_management', 'algorithms', 'go', 'php', 'csharp', 'nix', 'data_visualization', 'web_testing', 's_admin', 'crazydev', 'data_mining', 'bigdata', 'c', 'java', 'usability', 'instant_messaging', 'gtd', 'system_programming', 'ios_dev', 'oop', 'nginx', 'kubernetes', 'sql', '3d_graphics', 'css', 'geo', 'image_processing', 'controllers', 'game_design', 'html5', 'community_management', 'electronics', 'android_dev', 'crypto', 'netdev', 'cisconetworks', 'db_admins', 'funcprog', 'wireless', 'dwh', 'linux_dev', 'assembler', 'reactjs', 'sales', 'microservices', 'search_technologies', 'compilers', 'virtualization', 'client_side_optimization', 'distributed_systems', 'api', 'media_management', 'complete_code', 'typescript', 'postgresql', 'rust', 'agile', 'refactoring', 'parallel_programming', 'mssql', 'game_promotion', 'robo_dev', 'reverse-engineering', 'web_analytics', 'unity', 'symfony', 'build_automation', 'swift', 'raspberrypi', 'web_design', 'kotlin', 'debug', 'pay_system', 'apps_design', 'git', 'shells', 'laravel', 'mobile_testing', 'openstreetmap', 'lua', 'vs', 'yii', 'sport_programming', 'service_desk', 'itstandarts', 'nodejs', 'data_warehouse', 'ctf', 'erp', 'video', 'mobileanalytics', 'ipv6', 'virus', 'crm', 'backup', 'mesh_networking', 'cad_cam', 'patents', 'cloud_computing', 'growthhacking', 'iot_dev', 'server_side_optimization', 'latex', 'natural_language_processing', 'scala', 'unreal_engine', 'mongodb', 'delphi',  'industrial_control_system', 'r', 'fpga', 'oracle', 'arduino', 'magento', 'ruby', 'nosql', 'flutter', 'xml', 'apache', 'sveltejs', 'devmail', 'ecommerce_development', 'opendata', 'Hadoop', 'yandex_api', 'game_monetization', 
'ror', 'graph_design', 'scada', 'mobile_monetization', 'sqlite', 'accessibility', 'saas', 'helpdesk', 'matlab', 'julia', 'aws', 'data_recovery', 'erlang', 'angular', 'osx_dev', 'dns', 'dart', 'vector_graphics', 'asp', 'domains', 'cvs', 'asterisk', 'iis', 'it_monetization', 'localization', 'objectivec', 'IPFS', 'jquery', 'lisp', 'arvrdev', 'powershell', 'd', 'conversion', 'animation', 'webgl', 'wordpress', 'elm', 'qt_software', 'google_api', 'groovy_grails', 'Sailfish_dev', 'Atlassian', 'desktop_environment', 'game_testing', 'mysql', 'ecm', 'cms', 'Xamarin', 'haskell', 'prototyping', 'sw', 'django', 'gradle', 'billing', 'tdd', 'openshift', 'canvas', 'map_api', 'vuejs', 'data_compression', 'tizen_dev', 'iptv', 'mono', 'labview', 'perl', 'AJAX', 'ms_access', 'gpgpu', 'infolust', 'microformats', 'facebook_api', 'vba', 'twitter_api', 'twisted', 'phalcon', 'joomla', 'action_script', 'flex', 'gtk', 'meteorjs', 'iconoskaz', 'cobol', 'cocoa', 'fortran', 'uml', 'codeigniter', 'prolog', 'mercurial', 'drupal', 'wp_dev', 'smallbasic', 'webassembly', 'cubrid', 'fido', 'bada_dev', 'cgi', 'extjs', 'zend_framework', 'typography', 'UEFI', 'geo_systems', 'vim', 'creative_commons', 'modx', 'derbyjs', 'xcode', 'greasemonkey', 'i2p', 'flash_platform', 'coffeescript', 'fsharp', 'clojure', 'puppet', 'forth', 'processing_lang', 'firebird', 'javame_dev', 'cakephp', 'google_cloud_vision_api', 'kohanaphp', 'elixirphoenix', 'eclipse', 'xslt', 'smalltalk', 'googlecloud', 'gae', 'mootools', 'emacs', 'flask', 'gwt', 'web_monetization', 'circuit-design', 'office365dev', 'haxe', 'doctrine', 'typo3', 'regex', 'solidity', 'brainfuck', 'sphinx', 'san', 'vk_api', 'ecommerce'}

For comparison, the geektimes sections look more modest:

hubs_gt = {'popular_science', 'history', 'soft', 'lifehacks', 'health', 'finance', 'artificial_intelligence', 'itcompanies', 'DIY', 'energy', 'transport', 'gadgets', 'social_networks', 'space', 'futurenow', 'it_bigraphy', 'antikvariat', 'games', 'hardware', 'learning_languages', 'urban', 'brain', 'internet_of_things', 'easyelectronics', 'cellular', 'physics', 'cryptocurrency', 'interviews', 'biotech', 'network_hardware', 'autogadgets', 'lasers', 'sound', 'home_automation', 'smartphones', 'statistics', 'robot', 'cpu', 'video_tech', 'Ecology', 'presentation', 'desktops', 'wearable_electronics', 'quantum', 'notebooks', 'cyberpunk', 'Peripheral', 'demoscene', 'copyright', 'astronomy', 'arvr', 'medgadgets', '3d-printers', 'Chemistry', 'storages', 'sci-fi', 'logic_games', 'office', 'tablets', 'displays', 'video_conferencing', 'videocards', 'photo', 'multicopters', 'supercomputers', 'telemedicine', 'cybersport', 'nano', 'crowdsourcing', 'infographics'}

The remaining hubs were saved in the same way. Now it is easy to write functions that tell whether an article belongs to geektimes or to a profile hub.

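from typing import List

def is_geektimes(hubs: List) -> bool:
    # True if at least one of the article's hubs is a geektimes hub
    return len(set(hubs) & hubs_gt) > 0

def is_profile(hubs: List) -> bool:
    # True if at least one of the article's hubs is a profile hub
    return len(set(hubs) & hubs_profile) > 0

def is_geektimes_only(hubs: List) -> bool:
    # Geektimes hubs only, with no profile hub among them
    return is_geektimes(hubs) and not is_profile(hubs)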

Similar functions were written for the other sections ("development", "administration", and so on).
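Since all of these checks follow the same pattern, they could also be generated from the hub sets. The sketch below is one possible way to do that; the factory name `make_section_checker` and the two tiny demo sets are assumptions for illustration (their membership in the real "admin" and "develop" sections has not been verified), and the article itself does not show how is_admin and is_develop are written.

```python
from typing import Callable, Iterable, Set

def make_section_checker(section_hubs: Set[str]) -> Callable[[Iterable[str]], bool]:
    """Return a predicate: does an article's hub list touch this section?"""
    def check(hubs: Iterable[str]) -> bool:
        # isdisjoint() is False as soon as one hub is shared.
        return not section_hubs.isdisjoint(hubs)
    return check

# Tiny illustrative subsets; the real sets would come from get_hubs().
hubs_admin_demo = {"sys_admin", "devops", "nginx"}
hubs_develop_demo = {"python", "cpp", "webdev"}

is_admin = make_section_checker(hubs_admin_demo)
is_develop = make_section_checker(hubs_develop_demo)

print(is_admin(["devops", "python"]), is_develop(["history"]))  # True False
```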

Processing

It is time to start the analysis. We load the dataset and process the hub data.

import datetime
from typing import List

import pandas as pd

def to_list(s: str) -> List[str]:
    # "hubs:popular_science+astronomy" => [popular_science, astronomy]
    return s.split(':')[1].split('+')

def to_date(dt: datetime.datetime) -> datetime.date:
    return dt.date()

# on_bad_lines='error' (pandas 1.3+) replaces the removed error_bad_lines=True: malformed rows raise an error
df = pd.read_csv("habr_2019.csv", sep=',', encoding='utf-8', on_bad_lines='error', quotechar='"', comment='#')
dates = pd.to_datetime(df['datetime'], format='%Y-%m-%dT%H:%MZ')
# Shift the "Z" timestamps by three hours before cutting them down to calendar dates
dates += datetime.timedelta(hours=3)
df['date'] = dates.map(to_date, na_action=None)
hubs = df["hubs"].map(to_list, na_action=None)
df['hubs'] = hubs
# One boolean flag per section, computed from the hub list of each article
df['is_profile'] = hubs.map(is_profile, na_action=None)
df['is_geektimes'] = hubs.map(is_geektimes, na_action=None)
df['is_geektimes_only'] = hubs.map(is_geektimes_only, na_action=None)
df['is_admin'] = hubs.map(is_admin, na_action=None)
df['is_develop'] = hubs.map(is_develop, na_action=None)
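The three-hour shift matters for articles published late in the evening, because it can move them to the next calendar day. The small sketch below shows the effect with the standard library only. The helper name `local_date` is made up for this example, and reading the offset as a conversion from UTC to Moscow time is an assumption; the article does not explain it.

```python
from datetime import datetime, timedelta

def local_date(stamp: str, offset_hours: int = 3):
    """Parse a 'YYYY-MM-DDTHH:MMZ' stamp and return the calendar date after the shift."""
    moment = datetime.strptime(stamp, "%Y-%m-%dT%H:%MZ")
    return (moment + timedelta(hours=offset_hours)).date()

print(local_date("2018-12-18T12:43Z"))  # 2018-12-18
print(local_date("2018-12-18T22:30Z"))  # 2018-12-19, the shift crosses midnight
```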

Now we can group the data by day and plot the number of publications for the different hubs.

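# Group the articles by publication date
g = df.groupby(['date'])
days_count = g.size().reset_index(name='counts')
year_days = days_count['date'].values
# Summing the boolean flags counts the articles of each section per day;
# numeric_only=True keeps text and list columns out of the sum
grouped = g.sum(numeric_only=True).reset_index()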
profile_per_day_avg = grouped['is_profile'].rolling(window=20, min_periods=1).mean()
geektimes_per_day_avg = grouped['is_geektimes'].rolling(window=20, min_periods=1).mean()
geektimesonly_per_day_avg = grouped['is_geektimes_only'].rolling(window=20, min_periods=1).mean()
admin_per_day_avg = grouped['is_admin'].rolling(window=20, min_periods=1).mean()
develop_per_day_avg = grouped['is_develop'].rolling(window=20, min_periods=1).mean()
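To make the smoothing step concrete, here is a small sketch of what `rolling(..., min_periods=1).mean()` does, using a toy series and a window of 3 instead of 20. The numbers are invented for the example.

```python
import pandas as pd

daily = pd.Series([10, 20, 30, 40, 50])

# With min_periods=1 the first values are averaged over however many points exist so far.
smooth = daily.rolling(window=3, min_periods=1).mean()
print(smooth.tolist())

# Without min_periods the first window-1 values are NaN.
strict = daily.rolling(window=3).mean()
print(int(strict.isna().sum()))  # 2
```

With `min_periods=1` the curve starts at the first day of the year instead of the twentieth, at the cost of noisier values at the very beginning.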

We plot the number of published articles using Matplotlib:
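The plotting code is not listed in the article. A minimal sketch of how such a chart could be drawn is shown below; the function name `plot_sections`, the axis label, and the demo numbers are all assumptions made for illustration.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt

def plot_sections(days, series_by_label):
    """Draw one line per section and return the figure (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(8, 6))
    for label, values in series_by_label.items():
        ax.plot(days, values, label=label)
    ax.set_ylabel("articles per day (rolling mean)")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    fig.tight_layout()
    return fig

# Invented values for one week, only to exercise the function.
demo_days = [datetime.date(2019, 1, d) for d in range(1, 8)]
fig = plot_sections(demo_days, {
    "profile": [40, 42, 45, 43, 47, 30, 28],
    "geektimes": [15, 14, 16, 18, 17, 12, 11],
})
```

In the article's setting, `days` would be `year_days` and the series would be the `*_per_day_avg` values computed above; `fig.savefig(...)` or `plt.show()` can then be used to display the result.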

[Chart: number of published articles per day, by section]

In the chart I separated the articles into "geektimes" and "geektimes only", because an article can belong to both sections at once (for example, "DIY" + "microcontrollers" + "C++"). I used the label "profile" to mark the site's profile articles, although the English word "profile" is perhaps not entirely accurate for this.

In the comments to the previous part, readers asked about the "geektimes effect" connected with the change in payment rules for geektimes articles that took effect this summer. Let's plot the geektimes articles separately.

# Keep only the articles that belong to geektimes hubs and to no profile hub
df_gt = df[df['is_geektimes_only']]
group_gt = df_gt.groupby(['date'])
days_count_gt = group_gt.size().reset_index(name='counts')
# numeric_only=True keeps text and list columns out of the sum
grouped = group_gt.sum(numeric_only=True).reset_index()
year_days_gt = days_count_gt['date'].values
view_gt_per_day_avg = grouped['views'].rolling(window=20, min_periods=1).mean()

The result is interesting. The ratio of views of geektimes articles to the total is roughly 1:5. But while the total number of views fluctuated noticeably, the views of "entertainment" articles stayed at about the same level.

[Chart: total views per day, all articles compared with geektimes-only articles]

You can also see that the total number of views of articles in the "geektimes" section did fall after the rule change, but, judging by eye, by no more than 5% of the total.

It is interesting to look at the average number of views per article:

[Chart: average views per article, all articles compared with geektimes-only articles]

For "entertainment" articles it is about 40% above the average, which is probably not surprising. The dip early in one of the months [month number lost] is unclear to me: maybe that is how it really was, maybe it is some kind of parsing error, or maybe one of the geektimes authors went on vacation ;).

By the way, the chart also shows two noticeable peaks in article views: the New Year holidays and the May holidays.
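The per-article average discussed above is the daily sum of views divided by the number of articles published that day. The code for this step is not listed, so the following is only a sketch on invented data; the helper name `views_per_article` is an assumption.

```python
import pandas as pd

# Invented data: five articles over two days.
demo = pd.DataFrame({
    "date": ["2019-04-01", "2019-04-01", "2019-04-01", "2019-04-02", "2019-04-02"],
    "views": [1000, 3000, 8000, 2000, 6000],
    "is_geektimes_only": [False, False, True, False, True],
})

def views_per_article(frame):
    """Daily total of views divided by the number of articles published that day."""
    g = frame.groupby("date")["views"]
    return g.sum() / g.size()

avg_all = views_per_article(demo)
avg_gt = views_per_article(demo[demo["is_geektimes_only"]])

# Ratio of the geektimes-only average to the overall average, per day.
print((avg_gt / avg_all).round(2).tolist())  # [2.0, 1.5]
```

On the real dataset the two averages would then be smoothed with the same 20-day rolling mean before plotting.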

Hubs

Let's move on to the promised analysis of hubs. We will list the top 20 hubs by number of views.

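import matplotlib.pyplot as plt

# hubs_all is not defined in the article; presumably it is the union of the hub sets collected earlier
hubs_info = []
for hub_name in hubs_all:
    # Select the articles whose hub list contains this hub
    mask = df['hubs'].apply(lambda x: hub_name in x)
    df_hub = df[mask]

    count, views = df_hub.shape[0], df_hub['views'].sum()
    hubs_info.append((hub_name, count, views))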

# Draw hubs
hubs_top = sorted(hubs_info, key=lambda v: v[2], reverse=True)[:20]
top_views = list(map(lambda x: x[2], hubs_top))
top_names = list(map(lambda x: x[0], hubs_top))

plt.rcParams["figure.figsize"] = (8, 6)
plt.bar(range(0, len(top_views)), top_views)
plt.xticks(range(0, len(top_names)), top_names, rotation=90)
plt.ticklabel_format(style='plain', axis='y')
plt.tight_layout()
plt.show()

Result:

[Chart: top 20 hubs by total views]

Surprisingly, the most popular hub by views turned out to be "Information Security"; the top 5 also includes "Programming" and "Popular science".

The anti-top is occupied by Gtk and Cocoa.

[Chart: least visited hubs by total views]
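The anti-top can be obtained from the same hubs_info tuples by sorting in ascending order. The sketch below uses invented numbers, and the decision to skip hubs with no articles at all is an assumption; the article does not say how empty hubs were handled.

```python
# (hub_name, article_count, total_views): same tuple layout as hubs_info above.
# All numbers are invented for the example.
hubs_info_demo = [
    ("infosecurity", 120, 900_000),
    ("gtk", 1, 1_200),
    ("cocoa", 2, 2_500),
    ("fortran", 0, 0),
    ("python", 80, 400_000),
]

def anti_top(info, n):
    """Return the n least viewed hubs that have at least one article."""
    published = [h for h in info if h[1] > 0]  # skip hubs with no articles
    return sorted(published, key=lambda v: v[2])[:n]

print([name for name, _, _ in anti_top(hubs_info_demo, 2)])  # ['gtk', 'cocoa']
```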

I'll let you in on a secret: the top hubs can also be seen here, although the number of views is not shown there.

Rating

And finally, the promised rating. Using the hub analysis data, we can list the most popular articles in the most popular hubs for this year, 2019.
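The selection code is not shown, and the article does not state which metric the rating uses. As a sketch under the assumption that articles are ranked by views within a hub, the selection could look like this (the helper name `top_articles` and the demo records are invented):

```python
from heapq import nlargest

# Invented records with the same fields the parser collects.
articles_demo = [
    {"title": "A", "views": 8300, "hubs": ["productpm", "soft"]},
    {"title": "B", "views": 15000, "hubs": ["infosecurity"]},
    {"title": "C", "views": 4200, "hubs": ["infosecurity", "python"]},
    {"title": "D", "views": 27000, "hubs": ["infosecurity", "soft"]},
]

def top_articles(articles, hub, n=3, key="views"):
    """Return the n articles of a hub with the largest value of `key`."""
    in_hub = (a for a in articles if hub in a["hubs"])
    return nlargest(n, in_hub, key=lambda a: a[key])

print([a["title"] for a in top_articles(articles_demo, "infosecurity", n=2)])  # ['D', 'B']
```

Passing a different `key`, such as a rating or bookmark count, would give a different ordering from the same data.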

Information Security

Programming

Popular science

Career

IT legislation

Web development

GTK

And finally, so that nobody feels left out, here is the rating for the least visited hub, "gtk". Over the year exactly one article was published in it, so it "automatically" takes first place in the rating.

Conclusion

There will be no conclusion. Happy reading, everyone!

Source: habr.com
