That doesn't mean it actually was hit by that many attacks, or that China is actually the scapegoat for some other international sponsor of cyberespionage rather than the source of much of it.
It does mean much of the evidence against China is circumstantial.
Making it concrete – being able to say for sure who stole a piece of data, and to trace it from its source to the storage sites where it is eventually stashed – would shed a comparatively blinding amount of light on who commits data theft, where the data go and what is done with them.
If it were possible to build a full-scale validation attribution framework, every unit of data – a record, a gigabyte, a chunk of storage space that could live in a metadata container able to contain data of many types – would have a beacon that would activate like a homing device if it were stolen, Oquendo suggests.
A data container able to identify when it had been moved from an approved location, and to send a stream of self-identifying packets back home, would not require that much intelligence or additional storage overhead.
It wouldn't if the number of containers were relatively small. That, however, means the containers themselves would have to be really big, which raises the possibility that they could be cracked, the data removed and the beacon function lost.
If the containers were small enough for single documents, the potential for huge increases in storage and bandwidth requirements might pose such huge cost penalties as to kill any project before it got started. (More globally, we probably don't want to give all those Word and Excel files more power to talk to each other, anyway. They cause enough trouble with the little they're already able to say.)
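To make the beacon idea concrete, here is a minimal sketch of a container that stays silent while it sits in its approved location and emits one self-identifying packet when it finds itself somewhere else. It is an illustration only, not Oquendo's design: the `BeaconContainer` class, its field names and the `send` callback are all hypothetical, and how a real container would learn its current location or reach home over a network is left out entirely.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BeaconContainer:
    """Hypothetical wrapper: a payload plus the place it is allowed to live."""
    payload: bytes
    approved_location: str
    # Random identifier that lets the owner recognize the container later.
    container_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def check_in(self, current_location: str,
                 send: Callable[[bytes], None]) -> Optional[bytes]:
        """Send one self-identifying packet if the container has been moved."""
        if current_location == self.approved_location:
            return None  # still at home, so stay silent
        packet = json.dumps({
            "container_id": self.container_id,
            "approved_location": self.approved_location,
            "observed_location": current_location,
        }, sort_keys=True).encode("utf-8")
        send(packet)  # the transport is the caller's problem (assumption)
        return packet


if __name__ == "__main__":
    outbox = []  # stands in for the network in this sketch
    box = BeaconContainer(b"quarterly report", approved_location="fileserver-01")
    box.check_in("fileserver-01", outbox.append)  # no packet
    box.check_in("unknown-host", outbox.append)   # one packet queued
    print(len(outbox), "packet(s) sent home")
```

Even in this toy form, every container carries an identifier and a home address and produces a packet of its own when moved, which is where the storage and bandwidth costs come from once there is one container per document.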
Oquendo's suggestion is more refined, and it is also more widely deployed already.
Not only would adding beacons to every document be expensive, it would cause a huge shift in the signal/noise ratio of the Internet as well as capacity problems as millions of documents try continually to send notes home.
A more elegant approach is to add what he calls "loaded cookies" that can attach themselves permanently to files and identify them when they're found.
source: Honeynet Project