Discussion:
As a species we excel at information organization and dissemination. We are rare in that we are capable of mirroring behavior we have not physically seen but instead visualized through analysis of abstract information. The historic correlation between new methods of information dispersal and social "progress" is well accepted, e.g. the advent of writing and the creation of the printing press, the telegraph, radio and television. These new technologies have, over the centuries, allowed progressively more information to be made accessible, and with modern digital communication we are now able to disseminate vast amounts of information quickly and easily.
Humanity is the only species known to encode and transmit information through abstract symbolism, i.e. writing, and a healthy amount of our current understanding has already been built on archeodata. Modern archaeology and anthropology focus heavily on the recovery and study of ancient archeodata, while many of the modern "hard" sciences owe significant breakthroughs to the recovery and synthesis of the same. For example, during the 1854 Broad Street cholera outbreak Dr John Snow tracked cases of the disease using a standard dot map/Voronoi diagram, then famously used the data to identify the source of the outbreak as the public well on Broad Street. Afterwards, officials rejected his assertion that water was responsible for bearing the disease and his data was abandoned until 1866, when his information was used to combat a similar outbreak in Bromley. These studies were of minor interest to the medical community at the time, but several decades later were of great interest to Pasteur, Koch and Lister as they established modern germ theory. More recently, there is much debate on the ethics of using data from the infamous Nazi freezing experiments, which remain some of our only data on death from exposure. Conversely, after the death of Nikola Tesla many of his notes were initially seized by the US government, and after declassification showed theories applicable to modern plasma torches, radar and wireless networks.
The issue of privacy does not apply to true archeodata because it has, by nature, been abandoned or lost, and is thus assumed by laypersons to possess no value. Information is only considered sensitive or private when its dispersal could potentially impact one's freedoms, but this obviously does not apply to what has been discarded. For example, online fetish communities often include a clause in their membership agreement that members cannot use any information about other members obtained through any means for any purpose; this is done with the stated intention of creating a "safe space" or judgement-free community where members can explore interests without social repercussions. Likewise, government surveillance of citizens is a hotly debated topic with similar arguments for and against, whereas examining the sexuality of various historic cultures is as widely accepted as our poring over ancient journals and entering tombs. A defining hallmark of archeodata is that the information holds no value to whoever, if anyone, is aware of it.
Much data already exists, but in addition to organization it also requires verification. For example, until the recovery and translation of Homer's epic cycle the existence of the city of Troy had been forgotten. The city was found only after centuries of searching for evidence to verify what the texts had implied. Conversely, while the existence of Atlantis or Camelot has been implied by various recovered sources, there is much more evidence against their existence than for it.
Archeodata is not limited to information or statistics. A fantastic amount of software code has been written that is considered largely obsolete, ranging from machine-specific drivers to video games, and occasionally this type of information proves useful, or at least entertaining. Conversely, the rate at which software and digital hardware develop can make recovering this type of data difficult: after going out of business, the contractor that built the US military's inventory of A-10 Thunderbolts simply threw out its schematics, forcing the US Air Force to scavenge existing parts until they learned how to build suitable replacements. Similarly, NASA engineers attempting to access old Apollo mission schematics found contemporary hardware incompatible with the older storage media, while the original computers were completely inoperable. Likewise, ancient music has been the subject of much curiosity, but while many ancient instruments have been unearthed, relatively few cultures throughout history developed a system of music notation, and many of the ancient systems that survive we do not know how to read.
There also comes the unfortunate truth that at some point, data that is of interest to us now will also lose relevance. Our intense desire to analyze our environment is matched only by our desire to preserve our individual analyses, and it is impossible for anyone to predict all the ways in which information can be used. Many groups intentionally store archeodata in many forms, ranging from humble time capsules to massive national archives. Perhaps the ur-example of the intentional preparation of archeodata is Wikipedia's Terminal Event Management Policy: should a "non-localized event... render the continuation of Wikipedia in its current form untenable" occur, a series of protocols has been developed to increase the chances of the Wikimedia Foundation's data banks being preserved. The "worst-case scenario", with ten minutes or less until failure, involves broadcasting the entire database, compressed, into space via radio telescopes around the world. Conversely, since 1983 the US Department of Energy has been struggling to figure out how to label nuclear waste disposal sites in such a way that their contents will be recognizable as dangerous for the length of their existence, or about 10,000 years. It feels safe to assume that in that span of time our language and culture may be lost while the artifacts remain; leaving the correct archeodata in an accessible form might therefore be our only responsible option.
Data is much like a physical tool in that it can be applied to achieve desired results from the natural world, and in that sense finding new data is sort of like finding a strange tool: you recognize it for what it is, even if you don't know what to do with it, until that perfect moment comes along when everything "clicks" and you see exactly how it can be used. The key is remembering that even if you can use something as a wrench, that doesn't mean you won't be able to use it later on as a screwdriver or a hammer.