Paradata were used long before the term, which describes data generated automatically as a by-product of data collection, was coined. For example, many survey organizations have long monitored interviewer response rates during data collection. A more recent development, driven largely by responsive (and, more generally, adaptive) designs, has been the graphical presentation of such paradata over the course of the data collection period. That allows us to see changes in performance over time.
These are not raw paradata; they have already been transformed to produce, in this example, interviewer-level response rates. However, they are not as useful as they could be, because they do not account for many of the relevant factors driving differences in interviewer response rates. For example, the best interviewers may work predominantly on cases that have previously refused, so in a simple report or graph they would appear to be among the worst performers. Interviewers in general would also appear to perform worse toward the end of data collection, as overall interview rates decline.
The next logical step for paradata monitoring, I believe, is the more extensive use of explicit statistical models that help identify problems with data collection without so many caveats. To address this need, we have started running programs on a daily (or rather, nightly) basis that estimate predicted interview rates by interviewer and compare them to each interviewer's empirical interview rate, accounting for differences in the cases (i.e., their case histories) that each interviewer has worked. Users of this report can therefore look at the interviewers with the worst performance on this metric and know that it is not due to extraneous factors such as the particular cases they happen to have been assigned so far. The paradata become more "actionable," it seems. My next goal is to construct confidence intervals around these model-adjusted metrics.
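The idea of comparing empirical to predicted rates can be sketched in a few lines. The snippet below is only an illustration, not the actual production program: it adjusts for case history by stratifying on a single hypothetical case-type variable (a logistic regression on many covariates would be the natural generalization), and the `records` data, the `adjusted_performance` function, and the interviewer labels are all invented for the example.

```python
from collections import defaultdict

# Hypothetical paradata: one record per case attempt, giving the interviewer,
# a case-history category (fresh case vs. prior refusal), and whether the
# attempt ended in an interview (1) or not (0).
records = [
    ("A", "fresh", 1), ("A", "fresh", 1), ("A", "fresh", 0), ("A", "fresh", 1),
    ("B", "refusal", 0), ("B", "refusal", 1), ("B", "refusal", 0), ("B", "fresh", 1),
]

def adjusted_performance(records):
    """Compare each interviewer's empirical interview rate to the rate
    predicted from the mix of case types they were assigned."""
    # Overall interview rate per case type. Stratification stands in here
    # for a fitted model of interview propensity given case history.
    type_totals = defaultdict(lambda: [0, 0])  # case_type -> [interviews, attempts]
    for _, ctype, done in records:
        type_totals[ctype][0] += done
        type_totals[ctype][1] += 1
    type_rate = {t: k / n for t, (k, n) in type_totals.items()}

    # Per interviewer: observed interviews, expected interviews (sum of
    # predicted rates over their caseload), and total attempts.
    per_int = defaultdict(lambda: [0, 0.0, 0])
    for who, ctype, done in records:
        per_int[who][0] += done
        per_int[who][1] += type_rate[ctype]
        per_int[who][2] += 1
    return {
        who: {"observed": k / n, "expected": e / n, "difference": (k - e) / n}
        for who, (k, e, n) in per_int.items()
    }

result = adjusted_performance(records)
```

In this toy example, interviewer B has a lower raw rate than A (0.50 vs. 0.75) but worked mostly prior refusals; after adjustment, B comes out above expectation and A slightly below, which is exactly the reversal the raw report would hide.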