The Guardian (UK): Can you predict who will love a song?
Published: 2019-05-11


Competitors used EMI data in an attempt to predict the ratings given to songs by listeners Photograph: Judith Collins / Alamy/Alamy

As finales go, it couldn't have been much more tense. With the finish tantalisingly in sight, the relatively unknown frontrunner held a clear and seemingly unbreakable lead, only to find a veteran champion breaking through. And then as the two grappled for first place, in a true Cinderella story, a third darted in from nowhere in the final moments to steal it from them both and claim the victory.

But this nailbiting finish had nothing to do with the Tour de France, the Olympics, or any other kind of traditional sporting event for that matter. Instead, it involved a battle between hundreds of data scientists around the world racing to help shape the future of the music industry. Their task: to develop an algorithm capable of predicting if a listener will love a new song.

Not that long ago such a pursuit would have been considered utter folly and best left to soothsayers and astrologers. Thanks to the sheer scale and quality of data that's now becoming available, and to the development of better algorithms through events such as this, it is now not only quite feasible but rapidly becoming a way of doing business in many industries.

This event is clear evidence of that, because it involved the music giant EMI Music sharing its highly prized EMI dataset for the very first time. This is a vast and uniquely rich dataset compiled from 20-minute interviews with 800,000 music lovers from 25 different countries, recording their interests, attitudes, behaviours, and their familiarity and appreciation of music. For the data science community in London and those further afield – through Kaggle's online platform – this was a chance to show just what can be achieved when the right kind of data meets the right minds.

Held in partnership with Data Science London, EMI Music, EMC, and Kaggle, the challenge was to use this dataset to predict the rating someone would give a song based on their demographic, the artist and track ratings, their answers to questions about musical preferences and the words they use to describe EMI artists.

With a prize fund of £6,500, we saw more than 1,300 entries submitted by 138 different teams. Some of these attended the event in person, while the rest were made up of Kaggle's online community of 45,000 data scientists. We saw a broad range of approaches, from generalised boosted methods to random forests, singular value decomposition to matrix factorisation and collaborative filtering, with no one class of model outperforming all the others.
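For readers unfamiliar with the techniques named above, here is a toy sketch of one of them, matrix factorisation, applied to rating prediction. It is not the method of any competing team, and it does not use the EMI data: the ratings, the 1-to-5 scale, and the names `factorise`, `predict` and `rmse` are all made up for illustration.

```python
import random

# Hypothetical toy data: (listener, track, rating on a made-up 1-5 scale).
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 1), (3, 2, 4), (3, 3, 5),
]


def factorise(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=1000, seed=0):
    """Fit rating ~ mean + P[u] . Q[i] by stochastic gradient descent."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in ratings) / len(ratings)
    # Small random starting vectors for every listener and every track.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mean + sum(p * q for p, q in zip(P[u], Q[i])))
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                # Move both vectors to shrink the error, with a small penalty
                # (reg) that keeps them from growing without bound.
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return mean, P, Q


def predict(model, u, i):
    """Predicted rating of track i by listener u."""
    mean, P, Q = model
    return mean + sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(model, ratings):
    """Root mean squared error of the model on a list of ratings."""
    total = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


if __name__ == "__main__":
    model = factorise(RATINGS, n_users=4, n_items=4)
    print("training RMSE:", round(rmse(model, RATINGS), 3))
    # Listener 0 never rated track 3; the model still produces a guess.
    print("listener 0, track 3:", round(predict(model, 0, 3), 2))
```

The idea is that each listener and each track gets a short vector of learned "taste" numbers, and a rating is guessed from how well the two vectors line up. A real entry would also have to use the demographic and survey answers described above, which this sketch ignores.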

The , both in terms of quality and quantity of algorithms. However, in the end there was a very clear winning team, which came from a tech incubator based in Shanghai and Beijing that is a rising star in the Kaggle community. As in several previous Kaggle and Data Science London collaborations, the winners' code and algorithms will be open sourced.

But besides showing that it is possible to make these kinds of predictions, this event also uncovered some other nice gems, such as how women tended to be generally more positive than men, using words like "current", "edgy" and "cool" to describe songs, as opposed to "cheap", "unoriginal" and "superficial". Retired people tended to rate songs higher, while students and unemployed people often gave lower ratings. And it was interesting to see correlations between the words people used to describe the same song, often seemingly at odds with each other.

The pairing of "noisy" and "uplifting" is one example. And similarly one person's "superficial" is another's "playful". Another consistent theme was that the characteristics commonly used by the music industry to inform its marketing, such as "age" and "gender", turned out not to be the most powerful predictors after all.

Perhaps the loudest message to take from this is how very qualitative data sets – extremely subjective survey questions about people, their relationship with the music they like, and the words they associate with different tracks – can be mined. It's a great reminder that collaboration, bright minds, and machine learning can be used to understand even a very non-technical question such as "Will you like a new song?"

 is president and chief scientist at Kaggle, a platform for competitive data science, specialising in predictive modelling.

http://www.guardian.co.uk/news/datablog/2012/jul/28/music-data-science-emi-predict-song-preferences?INTCMP=SRCH

Reposted from: http://limrb.baihongyu.com/
