TPRC has ended
Sunday, September 29 • 11:10am - 11:45am
Search Concentration, Bias & Parochialism: Lessons from a Comparative Study of Google, Baidu & Jike's Search Results in China

Research Objectives, Importance & Novelty

Are search engines making the rich richer, the poor poorer by driving Web traffic to well-established sites while punishing the lesser known? Do search engines intentionally favor their own content while demoting others? How parochial or cosmopolitan are search engines in directing traffic to sites beyond the user's national borders?

These questions are crucial not only to websites and users, but also to the well-being of the entire Internet ecosystem that has become search-centered. As information gateways, search engines play a central role in influencing user attention, directing web traffic, and arbitrating advertising dollars. Search giants have also become increasingly vertically integrated, functioning as search engine/advertising agency/ratings system simultaneously. Their status raises concerns over search quality, competition, and openness. The stakes are high.

Finding evidence to start answering the above questions, however, is difficult, not least because search engines are complex and proprietary. This paper suggests that carefully executed comparative information retrieval research can provide much-needed empirical evidence to start probing questions of search concentration, bias and parochialism, particularly in international search markets like China (450 million search engine users, $900 million market size), where little to no independent research has been conducted on such critical issues.

Methodology, Data & Preliminary Results

Comparative search results research evaluates search quality and search engine properties by querying different search engines with a small sample of keywords to detect unique results patterns (e.g., results overlap, search ranking and filtering patterns). In this study, "search concentration" refers to the degree to which search results are concentrated in a few dominant websites. "Search bias" is defined as a combination of "own-content bias" (favored inclusion and ranking of the search company's own content) and "other-content bias" (exclusion from and lowered ranking of rivals' content). "Search parochialism" denotes the degree to which search engines include results from overseas sites.
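The definitions above can be illustrated with a minimal Python sketch. It is not the study's actual coding procedure: the domain list, the function names, and the rule that treats `.cn` hosts and listed sites as domestic are all assumptions made here for illustration.

```python
from urllib.parse import urlparse

# Assumed domains for the dominant sites named in the abstract (illustrative only).
TOP_SITES = {"baidu.com", "sina.com.cn", "qq.com", "sohu.com", "163.com", "ifeng.com"}


def site_of(url, known_sites):
    """Return the known site whose domain matches the URL's host, or None."""
    host = (urlparse(url).hostname or "").lower()
    for site in known_sites:
        if host == site or host.endswith("." + site):
            return site
    return None


def concentration(urls, top_sites):
    """Share of results that point to any of the dominant sites."""
    if not urls:
        return 0.0
    hits = sum(1 for u in urls if site_of(u, top_sites) is not None)
    return hits / len(urls)


def own_content_count(urls, own_domains):
    """Number of results that point to the search company's own domains."""
    return sum(1 for u in urls if site_of(u, own_domains) is not None)


def overseas_share(urls, domestic_sites):
    """Share of results treated as overseas.

    Crude assumption: a host is domestic if it ends in ".cn" or matches
    one of the listed domestic sites; everything else counts as overseas.
    """
    if not urls:
        return 0.0
    overseas = 0
    for u in urls:
        host = (urlparse(u).hostname or "").lower()
        domestic = host.endswith(".cn") or site_of(u, domestic_sites) is not None
        if not domestic:
            overseas += 1
    return overseas / len(urls)


# Hypothetical first-page results for one query on one engine.
sample = [
    "http://baike.baidu.com/view/1.htm",
    "http://news.sina.com.cn/a",
    "http://en.wikipedia.org/wiki/X",
    "http://example.cn/page",
]
```

With the hypothetical `sample` list, `concentration(sample, TOP_SITES)` measures search concentration, `own_content_count(sample, {"baidu.com"})` approximates Baidu's own-content references, and `overseas_share(sample, TOP_SITES)` gives a rough parochialism indicator. Ranking-based measures would additionally need each result's position.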

This study compares longitudinal search results data collected inside China in August 2011 and August 2012 from three search engines: Google, Baidu, and Jike (state-sponsored). A total of 35 keywords were used as the sample: 20 came from a government-sanctioned report that ranked the top 20 Internet events in 2010 (e.g., the Tencent vs. 360 dispute, the Shanghai Expo, the Foxconn suicides), and 15 were general terms (e.g., transportation, medicine, news). Measures were taken to minimize search personalization and other variables (e.g., disabling cookies, using Internet Explorer as the default browser). Data were gathered from the same location roughly a year apart with the same set of keywords. The webpages containing the first 10 search results (textual only) were saved, yielding a total of 2,100 textual hyperlinks for analysis.
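The reported total can be checked with a short calculation. This sketch assumes that every query returned a full first page of 10 textual results on each engine in each collection wave; the variable names are illustrative.

```python
# Sample design as described in the abstract.
keywords = 20 + 15                       # Internet-event terms + general terms
results_per_query = 10                   # first page of textual results
engines = ["Google", "Baidu", "Jike"]
waves = ["August 2011", "August 2012"]

# Assumes a complete first page for every keyword, engine, and wave.
total_links = keywords * results_per_query * len(engines) * len(waves)
links_per_engine_per_wave = keywords * results_per_query

print(total_links)                # 2100
print(links_per_engine_per_wave)  # 350
```

Under that assumption, each engine contributes 350 links per wave, which is the denominator for per-engine counts such as own-content references in a single year.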

Preliminary analysis finds grounds for concern in the Chinese search market. A high percentage of search results (up to nearly 50%) came from five or six top sites in China: Baidu, Sina, Tencent, Sohu, 163.com and Phoenix (ifeng.com). Search concentration is more pronounced in Baidu and Jike than in Google. There was little change over time. Moreover, Baidu consistently referenced its own content significantly more often, and ranked it higher, than the other two search engines did. For instance, Baidu referenced its own content 52 times in the 2012 data set, compared to 12 times in Google and 25 times in Jike. Baidu and Jike included few search results from overseas sites, while Google was more likely to do so.

Although the study's sample size is small and the results are not easily generalizable, this project may serve as a valuable baseline for future research and makes a start in raising and tackling these important research questions in the Chinese search engine market.

Speakers
Min Jiang

Associate Professor, UNC Charlotte
I research and publish in the areas of Chinese Internet technologies (search engines, microblogging, big data) and Internet politics and policies.


Sunday September 29, 2013 11:10am - 11:45am PDT
GMUSL Room 221