Publicly available labelled data sets are necessary for evaluating anomaly-based Intrusion Detection Systems (IDS). However, existing data sets are often not up-to-date or not yet published because of privacy concerns. This paper identifies requirements for good data sets and proposes an approach for their generation. The key idea is to use a test environment and emulate realistic user behaviour with parameterised scripts on the clients. Comprehensive logging mechanisms provide additional information which may be used for a better understanding of the inner dynamics of an IDS. Finally, the proposed approach is used to generate the flow-based CIDDS-002 data set.