Practical Work with GUMS and GOG
To set up the Virtual Machine on your own laptop, please follow the instructions written for the previous GREAT ITN School, held in January 2012 in Leiden:
http://great.ast.cam.ac.uk/Greatwiki/GreatItn/ItnSchoolJan2012/setup
Once the Virtual Machine is up and running, please have a look at another GREAT ITN School wiki page (http://great.ast.cam.ac.uk/Greatwiki/GreatItn/ItnSchoolJan2012/HistogramGenerationFramework) for the rationale of the framework. It is built on top of the Map Reduce Big Data processing paradigm and is meant to make it easy to generate n-dimensional histograms. All the examples on those pages use the GUMS10 dataset (an "ideal", error-free catalogue, up to G=20). The examples dealing with GOG (an actual catalogue simulation including Gaia-like observations, up to G=17) are shown below.
A third page worth mentioning (http://great.ast.cam.ac.uk/Greatwiki/GreatItn/ItnSchoolJan2012/RunningOnAmazonEMR) explains how to deploy these histogram generation workflows on Amazon Elastic Map Reduce in order to process the whole datasets (binary, compressed, and hundreds of GB in size).
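The Map Reduce idea behind the histogram framework can be illustrated with a minimal Python sketch. This is not the framework's actual code: the function names, the bin widths and the sample values are all made up for illustration. The map step assigns each source to a bin, and the reduce step sums the counts per bin.

```python
import math
from collections import defaultdict

def map_source(values, bin_widths):
    """Map phase: turn one source's values into a (bin key, count) pair."""
    key = tuple(int(math.floor(v / w)) for v, w in zip(values, bin_widths))
    return key, 1

def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each bin key."""
    histogram = defaultdict(int)
    for key, count in pairs:
        histogram[key] += count
    return dict(histogram)

# Hypothetical (colour, magnitude) values, invented for this example.
sources = [(0.82, 4.9), (0.85, 4.7), (1.43, 7.2)]
bin_widths = (0.1, 0.5)

histogram = reduce_counts(map_source(s, bin_widths) for s in sources)
```

Because the key is a tuple, the same two functions work for any number of dimensions; only the length of `bin_widths` changes.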
Examples for GOG dataset
Two examples have been developed for the workshop. Both of them can be run against a small subset of the GOG17 data, which consists of the astrometry and photometry (CU3 and CU5 information respectively) of 1/5 of the single stars (chosen randomly) contained in the GOG17 dataset. Whenever the examples are run locally on the laptop, this subset is the one used.
The full dataset (again astrometry and photometry only) is available at Amazon, so running against the entire GOG17 requires going to Amazon (there are also scripts for that, although this should be done by one of the instructors).
Colour-Magnitude Histogram
This is a two-dimensional histogram whose X axis is the colour GBP - GRP and whose Y axis is the absolute G magnitude, MG = G + 5 log10(Pi) + 5 - AbsG, with the parallax Pi expressed in arcseconds. The AbsG term (the absorption in G) is neglected: the information is not available in the subset, and we are looking at nearby stars. Three parameters can be tuned to change which stars are processed:
- PI_MAX_ERR_PERCENTAGE. Maximum relative parallax error SigmaPi/Pi, expressed as a percentage (typically 1%, 5% or 10%).
- MAX_PARALLAX. Maximum parallax (typically 10 mas).
- MAX_G. Maximum G (GOG17 is limited to 17).
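As a rough illustration of the formula and the three cuts, here is a minimal Python sketch. It is not the code in AstroPhotoColourMagnitudeHistBuilder.java: the function names are hypothetical, the AbsG term is neglected, and the assumption that each parameter acts as an inclusive upper limit is ours.

```python
import math

# Values taken from the typical settings quoted in the list above.
PI_MAX_ERR_PERCENTAGE = 10.0  # percent
MAX_PARALLAX = 10.0           # mas
MAX_G = 17.0                  # mag

def absolute_g(g_mag, parallax_mas):
    """M_G = G + 5 log10(Pi) + 5 with Pi in arcsec; AbsG term neglected."""
    parallax_arcsec = parallax_mas / 1000.0
    return g_mag + 5.0 * math.log10(parallax_arcsec) + 5.0

def passes_cuts(g_mag, parallax_mas, parallax_err_mas):
    """Assumed semantics: every parameter is an inclusive upper limit."""
    if parallax_mas <= 0.0:
        return False  # log10 and the relative error are undefined here
    rel_err_percent = 100.0 * parallax_err_mas / parallax_mas
    return (rel_err_percent <= PI_MAX_ERR_PERCENTAGE
            and parallax_mas <= MAX_PARALLAX
            and g_mag <= MAX_G)
```

For instance, a star with G = 15 and a parallax of 10 mas (0.01 arcsec) gets MG = 15 - 10 + 5 = 10.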
To tune these parameters, open Eclipse within the VM (there is an icon on the desktop). Once it is open, navigate in the package explorer (left-side panel) to GreatWorkshop -> src -> gaia.cu9.mock.test package and open AstroPhotoColourMagnitudeHistBuilder.java. The three parameters are at the top of the file and can be changed as you wish. Once the changes are saved (Eclipse compiles automatically), the example can be rerun with the scripts shown below.
Locally against GOG17 subset
For creating the histogram (it may take 15-20 minutes or so depending on your laptop):
run_histogram.sh /opt/hadoop/tests/props/astrophoto-colourmagnitude.properties
For collecting results:
hadoop dfs -copyToLocal /output/astrophoto-colourmagnitude/part-r-00000 $HOME/astrophoto-colourmagnitude.out
For creating the plot:
plot2DHistogramFromHadoopOutput.py -g --title "Colour-Magnitude Diagram" --xLabel "GBP - GRP" --yLabel "Absolute G Magnitude" --flipY $HOME/astrophoto-colourmagnitude.out
For visualizing the plot:
gimp 2dhistogram.png
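If you prefer to chain the first three steps, the sketch below builds the same commands as Python argument lists. The commands and paths are copied from the steps above; the helper names `colour_magnitude_commands` and `run_all` are hypothetical, and the sketch assumes the scripts are on the PATH exactly as they are in the VM shell.

```python
import subprocess

def colour_magnitude_commands(home):
    """Return the local colour-magnitude workflow as argument lists."""
    out_file = home + "/astrophoto-colourmagnitude.out"
    return [
        ["run_histogram.sh",
         "/opt/hadoop/tests/props/astrophoto-colourmagnitude.properties"],
        ["hadoop", "dfs", "-copyToLocal",
         "/output/astrophoto-colourmagnitude/part-r-00000", out_file],
        ["plot2DHistogramFromHadoopOutput.py", "-g",
         "--title", "Colour-Magnitude Diagram",
         "--xLabel", "GBP - GRP",
         "--yLabel", "Absolute G Magnitude",
         "--flipY", out_file],
    ]

def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.check_call(command)
```

Calling `run_all(colour_magnitude_commands(os.environ["HOME"]))` (after `import os`) would then run the whole local workflow, leaving only the visualization step.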
[Figure: colour-magnitude diagram obtained from the GOG17 subset]
Against the full GOG17 dataset (contact one of the instructors for the <job-workflow-id> identifier)
For launching the histogram job into Amazon Elastic Map Reduce:
run_histogram_amazon.sh /opt/hadoop/tests/props/astrophoto-colourmagnitude-amazon.properties <job-workflow-id>
For collecting results:
collect_results_amazon.sh /opt/hadoop/tests/props/astrophoto-colourmagnitude-amazon.properties $HOME/astrophoto-colourmagnitude-amazon.out
The plot is created in the same way as for the GOG17 subset:
plot2DHistogramFromHadoopOutput.py -g --title "Colour-Magnitude Diagram" --xLabel "GBP - GRP" --yLabel "Absolute G Magnitude" --flipY $HOME/astrophoto-colourmagnitude-amazon.out
Visualization is again done with gimp or any other image visualization software:
gimp 2dhistogram.png
[Figure: colour-magnitude diagram generated with the full GOG17 dataset]
Mean Radial Velocity Map
The second example creates a Healpix map (NSIDE to be specified, default is 32) whose pixel values represent the mean radial velocity of the sources contained in that pixel area. To change NSIDE, open Eclipse and, in the package explorer (left-side panel), go to GreatWorkshop -> src -> gaia.cu9.mock.test package and open MeanRadialVelMapHistBuilder.java. The constant is named NSIDE and is at the top of the file. Note that a filter is in place that removes the sources whose radial velocity is not available (NaN - Not a Number).
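The per-pixel averaging with the NaN filter can be sketched in Python as follows. This is only an illustration, not the code in MeanRadialVelMapHistBuilder.java: the function names are hypothetical, and the Healpix pixel index of each source is assumed to have been computed beforehand by a Healpix library.

```python
import math
from collections import defaultdict

NSIDE = 32  # default resolution quoted above

def number_of_pixels(nside):
    """A Healpix map at resolution NSIDE has 12 * NSIDE^2 pixels."""
    return 12 * nside * nside

def mean_radial_velocity_map(samples):
    """Average radial velocity per pixel, skipping NaN values.

    samples: iterable of (pixel_index, radial_velocity) pairs.
    Pixels without any valid radial velocity are absent from the result.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pixel, radial_velocity in samples:
        if math.isnan(radial_velocity):
            continue  # radial velocity not available for this source
        sums[pixel] += radial_velocity
        counts[pixel] += 1
    return {pixel: sums[pixel] / counts[pixel] for pixel in counts}
```

With the default NSIDE of 32 the map has 12 * 32 * 32 = 12288 pixels.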
Locally against GOG17 subset
For creating the map (it may take 15-20 minutes or so depending on your laptop):
run_histogram.sh /opt/hadoop/tests/props/astro-meanradvelmap.properties
For collecting results:
hadoop dfs -copyToLocal /output/astro-meanradvelmap/part-r-00000 $HOME/astro-meanradvelmap.out
For creating the plot:
hadoopOutputOnSky-NoLog.py $HOME/astro-meanradvelmap.out 32 -g --clabel 'Mean Radial Velocity' --title 'Mean Radial Velocity Healpix Map'
For visualizing the plot:
gimp skymap.png
[Figure: mean radial velocity Healpix map obtained from the GOG17 subset]
Against the full GOG17 dataset (contact one of the instructors for the <job-workflow-id> identifier)
For launching the histogram job into Amazon Elastic Map Reduce:
run_histogram_amazon.sh /opt/hadoop/tests/props/astro-meanradvelmap-amazon.properties <job-workflow-id>
For collecting results:
collect_results_amazon.sh /opt/hadoop/tests/props/astro-meanradvelmap-amazon.properties $HOME/astro-meanradvelmap-amazon.out
The plot is created in the same way as for the GOG17 subset:
hadoopOutputOnSky-NoLog.py $HOME/astro-meanradvelmap-amazon.out 32 -g --clabel 'Mean Radial Velocity' --title 'Mean Radial Velocity Healpix Map'
Visualization is again done with gimp or any other image visualization software:
gimp skymap.png
[Figure: mean radial velocity Healpix map generated with the full GOG17 dataset]