CMS open data and legacy data, even though still exciting and full of potential, are already a few years old. Because technologies evolve rapidly, the computing environments that were used to analyze these data are already outdated compared to current ones.
Therefore, in order to maintain our ability to study these data, we have to rely on technologies that help us preserve adequate computing environments. One way of doing this is by using virtual machines.
In simple words, a virtual machine is an emulation of a computer system that can run within another system. The latter is usually known as the host.
Open data releases, CMSSW versions and operating systems
CMS open data from our 2010 release can be studied using CMSSW_4_2_8, a version of the CMSSW software that ran under the Scientific Linux CERN 5 (slc5) operating system. Likewise, open data from our 2011/2012 release use CMSSW_5_3_32 and those from the 2015 release use CMSSW_7_6_7, both under Scientific Linux CERN 6 (slc6).
The virtual machines that are used to analyze these data, therefore, must reproduce the combination of CMSSW version and operating system that matches each release.
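The pairing of releases, CMSSW versions and operating systems can be written down as a small lookup table. The following is only a sketch for illustration: the names `Environment`, `ENVIRONMENTS` and `environment_for` are hypothetical, and it reads the slc6 operating system as applying to both the 2011/2012 and the 2015 releases.

```python
from typing import NamedTuple


class Environment(NamedTuple):
    """CMSSW version and operating system needed for one data release."""
    cmssw: str
    os: str


# Release year -> environment, taken from the text above.
# 2011 and 2012 share one release and therefore one environment.
ENVIRONMENTS = {
    2010: Environment("CMSSW_4_2_8", "slc5"),
    2011: Environment("CMSSW_5_3_32", "slc6"),
    2012: Environment("CMSSW_5_3_32", "slc6"),
    2015: Environment("CMSSW_7_6_7", "slc6"),
}


def environment_for(release_year: int) -> Environment:
    """Return the environment for a release year, or raise ValueError."""
    try:
        return ENVIRONMENTS[release_year]
    except KeyError:
        raise ValueError(
            f"no known environment for release {release_year}"
        ) from None


if __name__ == "__main__":
    for year in sorted(ENVIRONMENTS):
        env = environment_for(year)
        print(f"{year}: {env.cmssw} on {env.os}")
```

A table like this makes it explicit that choosing a data release also fixes the software version and the operating system that the virtual machine has to provide.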
Virtual machine images
In practical terms, a virtual machine image is a computer file that has all the right ingredients to create a virtual computer inside a given host. This file, however, needs to be interpreted by a program that runs on the host machine, usually known as a hypervisor. One of the most famous hypervisors is Oracle's VirtualBox.
CMS virtual images
The most current images for CMS open data usage are described separately on the CERN Open Data Portal site for 2010 and 2011/2012/2015. They come equipped with the ROOT framework, CMSSW and CVMFS access.
When installing a CMS virtual machine (following the instructions below), always use the latest image file available for 2010 or 2011/2012/2015 data.
Detailed instructions on how to install the CERN virtual machines can be found in the 2010, 2011/2012 and 2015 virtual machine installation guides on the CERN Open Data Portal. Follow the guide that matches the data release you will be working on.
In summary, the basic steps are as follows:
- Download and install the latest (or even better, the latest tested) version of VirtualBox. Note that it is available for a wide range of platforms.
- Download the latest CMS virtual image file. Choose between 2010 or 2011/2012/2015, depending on the data release of interest. Once downloaded, import the image file into VirtualBox. Always use the latest image file available; older ones are usually deprecated.
- Test the environment, following the guide for 2010, 2011/2012 or 2015, depending on the release.
- Finally, check for any known issues or limitations (2010, 2011/2012, 2015).
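The import step can also be done from the command line with `VBoxManage`, the command-line front end that ships with VirtualBox. The sketch below is an illustration rather than the procedure from the installation guides: it assumes the downloaded image is an appliance file that `VBoxManage import` accepts, the file name `cms-open-data-vm.ova` is hypothetical, and the helper names are made up for this example. When run as a script it only reports what it would do and does not create a virtual machine.

```python
import shutil
import subprocess
from typing import List, Optional

VBOXMANAGE = "VBoxManage"  # VirtualBox command-line front end


def find_vboxmanage() -> Optional[str]:
    """Return the full path to VBoxManage, or None if it is not on PATH."""
    return shutil.which(VBOXMANAGE)


def build_import_command(image_path: str, dry_run: bool = False) -> List[str]:
    """Build the VBoxManage call that imports an appliance file."""
    command = [VBOXMANAGE, "import", image_path]
    if dry_run:
        # Only report what the import would do, without creating the VM.
        command.append("--dry-run")
    return command


def import_image(image_path: str, dry_run: bool = False) -> None:
    """Run the import; raises CalledProcessError if VBoxManage fails."""
    subprocess.run(build_import_command(image_path, dry_run), check=True)


if __name__ == "__main__":
    image = "cms-open-data-vm.ova"  # hypothetical file name
    location = find_vboxmanage()
    if location is None:
        print("VBoxManage not found; install VirtualBox first.")
    else:
        print("VBoxManage found at:", location)
    # Show the command instead of running it, so the script has no side effects.
    print("Would run:", " ".join(build_import_command(image, dry_run=True)))
```

Calling `import_image(path)` with the path of the real downloaded file performs the actual import. Using the VirtualBox graphical interface, as described in the installation guides, achieves the same result.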