1 Introduction
1.1 Preface
This is the user's manual for \(\tau\)-Argus version 4.1. \(\tau\)-Argus is a software tool designed to assist a data protector in producing safe tables. This manual describes the first Open Source version of \(\tau\)-Argus. After a long history of development at Statistics Netherlands (CBS) as closed software, CBS has decided to convert \(\tau\)-Argus to Open Source. This process coincides with the formal retirement of the main developer, Anco Hundepool. With the financial support of Eurostat we have been able to carry out this transformation, and we hope that the future of \(\tau\)-Argus is secured. The main aim of this transition project was to port the current version (3.5) of \(\tau\)-Argus to an open environment, so version 4.1 does not contain many new extensions. The whole user interface has been rewritten in Java, replacing the old Visual Basic version. The aim of this transition is to work in an open environment and to be platform independent, so a Unix version is now possible as well.
Nevertheless, with respect to the previous release of \(\tau\)-Argus we have made a few steps forward: \(\tau\)-Argus now has facilities to protect tables via Controlled Tabular Adjustment (CTA). The routines for this have been developed by Jordi Castro of the Polytechnic University of Catalonia.
We have also added the option to use a free open solver (SoPlex) in addition to the classical commercial solvers such as CPLEX and Xpress. However, we expect that these commercial solvers will still be very much needed when we want to protect large, real-world tables.
The purpose of \(\tau\)-Argus is to protect tables against the risk of disclosure, i.e. the accidental or deliberate disclosure of information related to individuals from a statistical table. This is achieved by modifying the table so that it contains less detailed information. \(\tau\)-Argus allows for several modifications of a table: a table can be redesigned, meaning that rows and columns can be combined; sensitive cells can be suppressed, and additional cells to protect these can be found in some optimum way (secondary cell suppression). Rounding and CTA can also be used to protect sensitive tables.
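To see why suppressing only the sensitive cells is usually not enough, consider the minimal sketch below. It is an illustration only and not part of \(\tau\)-Argus: the function name `recover_by_subtraction`, the cell labels and the figures are all hypothetical. It shows that a single suppressed cell in a row can be recomputed from the published row total, whereas this simple subtraction no longer works once a second cell in the same row is suppressed.

```python
def recover_by_subtraction(cells, total):
    """Try to recompute a suppressed cell from the published row total.

    cells: dict mapping a cell label to its published value,
           or None if the cell is suppressed.
    total: the published row total (assumed not suppressed).
    Returns (label, value) if exactly one cell is suppressed, else None.
    """
    hidden = [name for name, value in cells.items() if value is None]
    if len(hidden) != 1:
        # No suppressed cell, or several: subtraction alone gives no single value.
        return None
    known = sum(value for value in cells.values() if value is not None)
    return hidden[0], total - known


# Hypothetical row with published total 40; cell B is sensitive.
primary_only = {"A": 12, "B": None, "C": 25}
print(recover_by_subtraction(primary_only, 40))    # ('B', 3)

# Suppressing a second cell (A) blocks the exact recalculation.
with_secondary = {"A": None, "B": None, "C": 25}
print(recover_by_subtraction(with_secondary, 40))  # None
```

Note that in the second case the sum of A and B can still be derived from the total, so an intruder can bound both values. Choosing secondary suppressions that leave sufficiently wide bounds at minimal information loss is the optimisation problem that the solvers mentioned above are used for.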
The purpose of the present manual is to give potential users enough information to understand the general principles on which \(\tau\)-Argus is based, and to allow them to use the package. It therefore contains both general background information and detailed program information. For a more in-depth theoretical background we refer to the handbook “Statistical Disclosure Control” by Anco Hundepool, Josep Domingo-Ferrer, Luisa Franconi, Sarah Giessing, Eric Schulte Nordholt, Keith Spicer and Peter-Paul de Wolf (ISBN: 978-1-119-97815-2, Wiley, 2012).
\(\tau\)-Argus is one of a twin set of disclosure control packages. For the protection of microdata, \(\mu\)-Argus has been developed; it is the twin brother of \(\tau\)-Argus [1]. \(\mu\)-Argus has also been ported to Open Source.
1.2 About the name ARGUS
Somewhat jokingly, the name Argus can be interpreted as the acronym of ‘Anti-Re-identification General Utility System’ [2]. As a matter of fact, the name Argus was inspired by a myth of the ancient Greeks. In this myth Zeus has a girlfriend named Io. Hera, Zeus’ wife, did not approve of this relationship and turned Io into a cow. She let the monster Argus guard Io. Argus seemed to be particularly well qualified for this job, because it had a hundred eyes that could watch over Io. When it fell asleep, only two of its eyes closed at a time, which left plenty of eyes to watch Io. Zeus was eager to find a way to get Io back. He hired Hermes, who could make Argus fall asleep with the enchanting music of his flute. When Hermes played his flute to Argus this indeed happened: all its eyes closed, one by one. Once Hermes had succeeded in making Argus fall asleep, Argus was decapitated. Argus’ eyes were planted onto a bird’s tail, a type of bird that we now know as the peacock. That explains why a peacock has these eye-shaped marks on its tail. It also explains the picture on the cover of this manual: a copper plate engraving by Gerard de Lairesse (1641-1711) depicting the eyes of Argus being removed and placed on the peacock’s tail [3].
Like the mythological Argus, the software is supposed to guard something, in this case data. This is where the similarity between the myth and the package is supposed to end, as we believe that the package is a winner and not a loser as the mythological Argus is.
1.3 Contact
Feedback from users will help improve future versions of \(\tau\)-Argus and is therefore greatly appreciated. The authors of this manual can be contacted directly, in writing or otherwise, with suggestions that may lead to improved versions of \(\tau\)-Argus; e-mail messages can also be sent to argus@cbs.nl.
1.4 Open Source
In the open source world the responsibility for the software is different. The idea behind open source is that the software code is no longer owned by one institute (Statistics Netherlands), but that the source is available to anybody. Anybody can also contribute to the code and make their own extensions. Nevertheless, we do not want to have many different, diverging versions of the software. Therefore there will always be one official version of \(\tau\)-Argus. In order to achieve this we need a body to make decisions about further developments and extensions for the official \(\tau\)-Argus. This responsibility will be in the hands of a small committee, which will be a sub-group of the Eurostat technical working group on Statistical Confidentiality. The committee will decide whether a new extension or correction is allowed in the official version of \(\tau\)-Argus, and will also make recommendations for future extensions.
Nevertheless, the above-mentioned e-mail address (argus@cbs.nl) will remain open for questions.
1.5 Acknowledgments
\(\tau\)-Argus was started as part of the EU \(4^{th}\) framework SDC-project and became a mature software tool as part of the CASC project, which was partly sponsored by the EU under contract number IST-2000-25069. This support is highly appreciated. The CASC (Computational Aspects of Statistical Confidentiality) project is part of the Fifth Framework of the European Union. The main part of \(\tau\)-Argus has been developed at Statistics Netherlands by Aad van de Wetering and Ramya Ramaswamy (who wrote the kernel) and Anco Hundepool (who wrote the interface). However, this software would not have been possible without the contributions of several others, both partners in the CASC-project and outsiders. Recent extensions of \(\tau\)-Argus have been made possible during the European CENEX-SDC-project (grant agreement 25200.2005.001-2005.619), the ESSNet-SDC project (grant agreement 25200.2005.003-2007.670) and the ESSnet SDC harmonisation (61102.2010.004-2010.579).
The Open Source transition was supported by a Eurostat grant (61102.2012.001-2012.102).
The German partners at Statistisches Bundesamt (Sarah Giessing and Dietz Repsilber) have contributed the GHMITER software, which offers a solution for secondary cell suppression based on hypercubes. Peter-Paul de Wolf has built a search algorithm based on non-hierarchical optimal solutions. This algorithm breaks down a large hierarchical table into small non-hierarchical subtables, which are then individually protected. A team led by JJ Salazar of the University of La Laguna, Tenerife, Spain, has developed the optimisation routines. Additionally, Jordi Castro, Universitat Politècnica de Catalunya, Barcelona, has developed a solution based on networks. Jordi Castro also developed the CTA solution. The controlled rounding procedure has been developed by JJ Salazar in a project sponsored by ONS. In order to enhance usability, \(\tau\)-Argus can now also handle SPSS-system files. For using \(\tau\)-Argus in combination with SAS, several reports have been produced during the ESSnet projects. These reports and the SAS-tools are available from the CASC/ESSNet website. The audit routine was first developed by Karl Luhn of the University of Ilmenau and further developed by Destatis.
For solving these optimisation problems, \(\tau\)-Argus traditionally uses commercial LP-solvers, in particular Xpress. This package is kindly made available to users of \(\tau\)-Argus under a special agreement between the \(\tau\)-Argus team and FICO, the developers of Xpress. Alternatively, \(\tau\)-Argus can use the CPLEX package. Users can choose either solver to link to \(\tau\)-Argus (provided, of course, that they purchase a licence for the solver chosen). Users who already have a licence for one of these packages for other applications can use their current licence for \(\tau\)-Argus as well. Starting with this Open Source version, a free and open solver (SoPlex) can also be used to solve the optimisation models behind cell suppression, rounding and CTA.
1.6 Latest improvements
The latest extensions in version 4.1 of \(\tau\)-Argus are:
New structure of the interface, making the table itself the central window.
Controlled Tabular Adjustment.
Rewritten Open Source code in Java.
C++ DLLs for data manipulation and the modular approach have been adapted for the Open Source compilers.
The use of free Open Solvers complementary to the commercial solvers.
1.7 The structure of this manual
The remaining part of this manual consists of four chapters and an index. In Chapter 2 we give a short introduction to the theory and methodology. For a more fundamental description we refer to the Wiley handbook on Statistical Disclosure Control [4]. This handbook is the result of the joint work of SDC specialists in Europe who have been working together for a long period.
In Chapter 3 a short tour of \(\tau\)-Argus is given as a first impression of the program. Chapter 4 is the reference manual of \(\tau\)-Argus. It describes the program in detail and is organized by the menu items of \(\tau\)-Argus. Chapter 5 gives details of the files used by \(\tau\)-Argus. The manual is concluded with an index (Chapter 6).
[1] See Anco Hundepool et al. 2014, \(\mu\)-Argus version 5.1 user’s manual, Statistics Netherlands, The Hague, The Netherlands.
[2] This interpretation is due to Peter Kooiman, former head of the methodology department at Statistics Netherlands.
[3] The original copy of this engraving is in the collection of ‘Het Leidsch Prentenkabinet’ in Leiden, The Netherlands.
[4] Anco Hundepool, Josep Domingo-Ferrer, Luisa Franconi, Sarah Giessing, Eric Schulte Nordholt, Keith Spicer, Peter-Paul de Wolf (2012), Statistical Disclosure Control, ISBN: 978-1-119-97815-2, Wiley.