Digging Data in China: A Growing Industry

In spite of tight censorship and a fundamental shortage of information access, data journalism is growing in China. Since 2012, the fledgling field has flourished, generating new and creative models for Chinese reporters.

At the Uncovering Asia conference in Seoul this week, three Chinese data experts shared their experience and varied approaches with hundreds of regional and global muckrakers.

The Paper: From Traditional Media to a Digital Platform

Lv Yan, managing editor of The Paper, said she cannot overstate data’s role in the transition from a traditional to a digital platform.

Before the state-owned media platform rebranded as a digital-only outlet, it was known as the Oriental Morning Post. China’s news providers once drew their content mainly from other media organizations and newspapers. The Paper wanted to distinguish itself with independent news-gathering, original content, and reporting on trending topics and breaking news.

“I would like to point out an interesting thing,” Lv said. “In the newspaper era, digital journalists such as graphics designers worked under the video team, they played a role as a support unit. But after The Paper came out, they have started to become a news production team.”

In short, The Paper’s data journalists evolved from making maps, to getting technical, to telling stories.

CBN Weekly’s Rising Lab: Ranking China’s Cities

“Chinese people like rankings,” said Jolle Shen with a laugh. The Rising Lab project leader has successfully steered the data research project to YiCai Global, a financial and economic media platform. The Lab’s rankings of China’s new first-tier cities were a hit on social media.

“It was because we included enough cities to make everyone feel relevant,” Shen said of the wide reach and engagement. The team analyzed data from 338 cities in China.

In their rankings, the Lab scored the capital of the southwestern Sichuan province, Chengdu, right next to the long-established first-tier cities: Beijing, Shanghai, Guangzhou and Shenzhen. The Lab’s algorithm marked Chengdu as the best emerging first-tier city.

Faced with limited government information, the team sourced big data from Internet giants, including payment platform Alipay and e-commerce company JD.com. This provided a wider view of city economies and allowed for better reporting.

Government officials were paying attention. According to Shen, the mayor of Chengdu became extremely interested in the annual list. He wanted to know the city’s scores in every aspect analyzed, and the reasons for any rise or fall in the rankings.

The data analyses soon turned into policy stories. In one case, the Lab visualized bus data in Chengdu, which showed that buses were very busy and overcrowded at night. “We suggested to the government that current bus lines are not enough to support commuters; after they saw our report, they added 12 new bus lines,” Shen said.

There has been skepticism about the Lab’s impartiality in the rankings, but Shen said the biggest problem the team has faced is figuring out how to make the Lab profitable.

“I’m always concerned about media business models right now, like how to convince investors that we can turn a profit out of the project?” Shen asked.

Dataworks: Encourage, Exchange, Educate

Cui Zheng is the chief editor of Dataworks, a commercial platform that provides services such as data scraping, data mining, and data visualization to media outlets and corporations on a contract basis.

Cui said her colleagues, and the data field in general, are facing a rapidly changing journalism environment, a lack of data professionals, and the challenge of accessing data. While technology improves day by day, training a data expert can take years, she added.

Cui highlighted three ongoing Dataworks initiatives:

  • Encourage: Dataworks has hosted data journalism competitions for the past three years, generating over 300 pieces of submitted work. According to Cui, many awardees have developed into professional data journalists.
  • Exchange: The team invites international journalists to exchange perspectives with local journalists through workshops.
  • Educate: Cui said the company has given more than 100 lectures and over 20 workshops.

So, how do these experts get their hands on data in a notoriously inaccessible country, with tight controls on information?

Cui Zheng (Dataworks): We usually find the statistics in yearbooks on the Internet, or in libraries of small cities. I remember we bought old documents off Taobao [an online shopping website] when we were doing a project once. In addition, we collaborate with tech companies. If, for example, we want some real estate prices, we can look for real data from online real estate agencies. They feel honored when they are mentioned by a platform like ours.

Also, we can see a growth in the commercial data market — Internet companies and data collectors. Mobile service providers also have methods to get data, and they are reliable and often willing to share data with the media.

Lv Yan (The Paper): Company yearbooks are a good way to find open data. Most importantly, however, we work with beat reporters [and sources who can] get us to the data quickly. Additionally, when working with Internet companies, we try to pitch topics, and not be pushed by them.

Information from the government or environmental organizations includes official reports. The Supreme Court files judgement papers, but they are written in a certain style. With Python, we can scrape the data, but it is very detailed.

Jolle Shen (YiCai): Try to cooperate with tech companies. We try not to aggravate their competitive aspects. For example, when collecting the data of Alibaba and JD.com at the same time, you need to know whether they are giving you the real or the fake numbers, because of the competition.

Additionally, I have one other thing to share: We are currently working on some projects for the government. They have built a big data council to help us collect data, but it will take some time to build up the pool of information.


Lizzy Huang is a writer, translator and journalist based in Hong Kong. She is currently Chinese assistant editor for the Global Investigative Journalism Network. Previously, she was an editorial intern at Initium Media covering Chinese culture and society.