WP3 - Aspects of archive system design

Description
 
  • Lead beneficiary: UEDIN
  • Type of activity: RTD
The design and technology choices made will be motivated by the real user requirements identified by WP2 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.
 

T3.1 - Technical coordination [Months: 1-42]

UEDIN

In addition to managing the resources deployed on the other WP3 work packages, and producing reports on those activities, this work package oversees the design and specification of all work conducted under WP3, to ensure that it adequately addresses the requirements identified within the GENIUS project and from external sources such as the CU9 and GREAT. The key aim is to ensure maximum science return by enabling science exploitation through the appropriate use of information technologies.
 
This WP also includes ensuring compatibility with the deployment of the archive at ESAC. Since the Gaia archive will be designed and run in this centre, it is essential that the techniques and technologies prototyped in this project are consistent with what can ultimately be implemented there. An important aspect of T3.1 is to ensure that the relevant requirements are injected into the design and evaluation phases, and that all GENIUS system design work is tackled with full awareness of the constraints imposed by ESAC infrastructure and practice. A key deliverable is therefore a formal, documented co-ordination and interface agreement between GENIUS and the Science Archive Team (SAT) at ESAC through the CU9.
  This work will be undertaken by Hambly of UEDIN.

UEDIN

The Gaia mission will produce a wide variety of data products, leading to a complex archive. A crucial issue for the exploitability of the Gaia data set is therefore an archive interface that supports a sufficiently rich range of functionality and is sufficiently easy to use for users to do their science with it effectively. The task of this WP is to prototype archive interface components that meet these user requirements, as developed by the CU9 and GREAT. Since any candidate archive DBMS to be employed at ESAC supports access from Java via Java Database Connectivity (JDBC), it is possible to develop archive interface prototypes that are independent of the backend DBMS.

UEDIN has recently been prototyping the use of Web 2.0 technologies for the delivery of an intuitive but richly-functioned user interface to sky survey archives with a complicated schema, and this appears promising for Gaia: functionality like making schema information readily available to users as they develop their queries, and even using code completion to help write them, can make archive use much more effective.
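To make the JDBC point above concrete, the following is a minimal sketch rather than project code: the connection URL, credentials, and the gaia_source table with its columns are hypothetical placeholders, and switching the backend DBMS would only mean changing the URL and the driver on the classpath while the query code stays the same.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class ArchiveQuerySketch {
      public static void main(String[] args) throws Exception {
          // Hypothetical JDBC URL, here for a PostgreSQL backend; substituting
          // another driver/URL leaves the rest of the code untouched.
          String url = "jdbc:postgresql://archive.example.org:5432/gaia";
          try (Connection con = DriverManager.getConnection(url, "user", "secret");
               Statement stmt = con.createStatement();
               ResultSet rs = stmt.executeQuery(
                       "SELECT source_id, ra, dec, phot_g_mean_mag "
                     + "FROM gaia_source WHERE phot_g_mean_mag < 12.0")) {
              while (rs.next()) {
                  // Print a few columns of each row returned by the archive.
                  System.out.printf("%d %.6f %.6f %.3f%n",
                          rs.getLong("source_id"), rs.getDouble("ra"),
                          rs.getDouble("dec"), rs.getDouble("phot_g_mean_mag"));
              }
          }
      }
  }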
The interface is able to offer users the ability to explore data interactively: they can execute a query, generate summary plots (e.g. scatter plots, histograms, etc.), realise their query was not quite making the desired selection, and then easily tweak the query and execute it again. This reflects the iterative method of working that scientists naturally adopt, which is clearly revealed in analyses of the query logs from sky survey archives such as the WFCAM Science Archive [6], curated by UEDIN, and this iterative workflow can be made to run efficiently using a combination of client- and server-side technologies.
What is most important is that the functionality prototyped is that prioritised by scientists, and that any testbed developed here helps the user community to further refine their expressed requirements. For example, while GAP has successfully engaged the Gaia user community via a call for ‘usage scenarios’ under the auspices of GREAT (and these form the inputs to WP2), iteration of requirements with these key consumers has not been considered so far. This process will drive the further development of the user interface design – e.g. in determining which additional graphical capabilities to implement, and in assessing how sophisticated a caching mechanism is required to support the division of datasets between the client and the server – and we propose to use the interfaces developed by this WP in an initial deployment as a testbed for the community to further assess its requirements.

The work will be undertaken by Read (UEDIN).

INAF, CNRS, UEDIN, CSIC
The past decade has seen a huge amount of activity in defining, standardising and implementing the global ‘Virtual Observatory’. From the outset, large-scale mission data sets from ground and space were anticipated as being the cornerstone of the VO. This work has reached a level of maturity whereby most of the basic interoperability standards are in place (http://www.ivoa.net/Documents/) and it is possible to build project-specific services on top of them and to see where the further development of standards is needed in support of particular projects.
Our goal in T3.3 is a focused programme of VO consolidation and development work concerning server-side components (as opposed to client-side applications; see WP440) to provide the particular VO infrastructure required for Gaia exploitation. This will involve the following strands of work:
 
  1. Assessment of compliance with VO standards (Solano, 6 sm CSIC): to test and implement the Virtual Observatory standards and protocols necessary to make Gaia data fully VO compliant. We will define the list of VO standards applicable to Gaia data; implement VO standards in Gaia simulated data; and document using simulated data and IVOA standards and protocols as inputs. The main deliverable will be a specification for VO-compliant Gaia data.

  2. Deployment of specific web services (Berthier, 1.8 sm CNRS): the SkyBOT (http://vo.imcce.fr/webservices/skybot/) service suite will provide VO-compliant tools for the treatment of solar system bodies within Gaia data, while Miriade (http://vo.imcce.fr/webservices/miriade/) computes positional and physical ephemerides of known solar system bodies in a VO-compliant manner.

  3. VO-Dance (Smareglia, 18 sm INAF): the VO-Dance suite provides a lightweight method of publishing data to the VO. Its components can be distributed as disk images to be run on a virtual machine, so we shall assess its use as a means whereby users can integrate their own datasets with Gaia data.

  4. VOSpace (Voutsinas, 9 sm UEDIN): support for an extension to the current VOSpace functionality so that, in addition to providing users with file storage space addressable by VO access protocols, they can also have database storage space on the same basis. This will provide users with a personal database facility like the SDSS MyDB systems, which they are able to address in a VO-compliant manner. For example, a user will be able to direct the result set from one VO query into their personal database, and then use it as the target for a subsequent query, possibly also involving other datasets in the VO, using the TAP Factory system of T3.4 below (a sketch of this workflow follows the list).
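As an illustration of that workflow, the sketch below first runs a synchronous ADQL query against a TAP service and then stores the resulting VOTable in the user's personal database space. The TAP /sync request follows the standard IVOA pattern, but the service URL, the table and column names, and in particular the personal-database upload endpoint are hypothetical placeholders standing in for the VOSpace extension described above.

  import java.net.URI;
  import java.net.URLEncoder;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;
  import java.nio.charset.StandardCharsets;

  public class PersonalDbSketch {
      public static void main(String[] args) throws Exception {
          HttpClient client = HttpClient.newHttpClient();

          // 1. Synchronous query against a (hypothetical) TAP service using the
          //    standard IVOA TAP /sync endpoint and ADQL.
          String adql = "SELECT TOP 1000 source_id, ra, dec, parallax "
                      + "FROM gaia_source WHERE parallax > 10";
          String form = "REQUEST=doQuery&LANG=ADQL&QUERY="
                      + URLEncoder.encode(adql, StandardCharsets.UTF_8);
          HttpRequest query = HttpRequest.newBuilder(
                      URI.create("https://archive.example.org/tap/sync"))
                  .header("Content-Type", "application/x-www-form-urlencoded")
                  .POST(HttpRequest.BodyPublishers.ofString(form))
                  .build();
          HttpResponse<String> votable =
                  client.send(query, HttpResponse.BodyHandlers.ofString());

          // 2. Push the VOTable result into the user's personal database space.
          //    This endpoint is hypothetical: it stands in for the proposed
          //    VOSpace database extension (a MyDB-like facility).
          HttpRequest store = HttpRequest.newBuilder(
                      URI.create("https://archive.example.org/mydb/tables/nearby_stars"))
                  .header("Content-Type", "application/x-votable+xml")
                  .PUT(HttpRequest.BodyPublishers.ofString(votable.body()))
                  .build();
          client.send(store, HttpResponse.BodyHandlers.ofString());
      }
  }

The stored table could then be addressed as the target of a subsequent query, as envisaged above.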
 

T3.4 - Data Centre Collaboration [Months: 1-42]

UEDIN
With the Table Access Protocol (TAP; http://www.ivoa.net/Documents/TAP/) the VO provides a standard means of querying tabular data sets, and with the advent of the TAP Factory [8] it has become possible to execute multiple, distributed TAP queries. In a traditional IVOA TAP scenario, single TAP endpoints provide the means for VO clients to present the user with a data resource schema and then to service an ADQL query on that resource, but it is then up to further, separate client-end manipulations to join data for multiwavelength science. TAP Factory takes this further by combining TAP with the Open Grid Services Architecture Data Access Infrastructure (OGSA-DAI) middleware to provide a means of creating TAP endpoints on-the-fly, thereby facilitating the cross-querying of distributed resources by TAP clients.
Such a system supports one of the fundamental usage scenarios for the VO. A user can select a set of data resources published using TAP on which to execute a distributed query. From the metadata exposed by the individual TAP services, TAP Factory is able to create a new TAP endpoint on-the-fly for the distributed query and present the user with the metadata of the virtual data federation thus generated. The user can then pose a query against this virtual federation as if querying a single TAP service, and, when coupled with the MyDB-like personal database of T3.3, it enables users to create sophisticated sets of cross-catalogue queries, as required for the full exploitation of Gaia data. The key point here is that a data resource can be incorporated into a virtual federation without requiring any action on the part of the staff of the data centre that curate it; so, in the case of Gaia, it is possible for higher-level services like these to be developed and deployed without requiring any action from (or placing any obligations on) the staff at ESAC.
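For example (a sketch only: the schema-qualified table names and the choice of external catalogue are hypothetical), a query posed against a TAP Factory endpoint for a federation of the Gaia catalogue and an external survey could join the two by position in a single ADQL statement, exactly as if both tables lived in one service:

  public class FederatedQuerySketch {
      // Hypothetical tables exposed through a TAP Factory federated endpoint.
      // The query would be submitted to that endpoint with the same doQuery/ADQL
      // mechanism used for any single TAP service.
      static final String FEDERATED_ADQL =
            "SELECT g.source_id, g.phot_g_mean_mag, w.w1mpro "
          + "FROM gaia.gaia_source AS g "
          + "JOIN allwise.allwise_source AS w "
          + "  ON 1=CONTAINS(POINT('ICRS', g.ra, g.dec), "
          + "                CIRCLE('ICRS', w.ra, w.dec, 1.0/3600.0)) "
          + "WHERE g.parallax > 10";

      public static void main(String[] args) {
          System.out.println(FEDERATED_ADQL);
      }
  }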
A basic prototype of this system has been produced by UEDIN, but it needs further development in several related respects before it is capable of supporting the scientific exploitation of Gaia. Firstly, the efficiency with which the system can execute a distributed query over the virtual federation constructed by TAP Factory depends on the metadata available to OGSA-DAI’s Distributed Query Processor (DQP) for the purposes of constructing a good query execution plan. For example, if DQP knows the distribution of values of the attributes used in join clauses in the distributed query, it can make an informed decision about how best to move data in executing the query, and whether to perform any server-side pre-processing before doing so. Taking full advantage of these capabilities will require an extension to the TAP standard, to expand the range of metadata exposed by a TAP service, and this can best be progressed through the IVOA standardisation process by demonstrating powerful prototypes performing realistic science analyses.

The efficiency of the distributed queries can be improved further by collaboration between data centres. A naive spatial cross-match query executed between distributed multi-TB data sets will remain expensive, given network speeds, but several strategies exist that can ameliorate this situation, and this work package will assess, through quantitative analysis – and, where possible, direct experimentation – the optimal configuration of the multiwavelength datasets required for the scientific exploitation of Gaia. For example, it will determine which external catalogues should be co-located with a copy of the Gaia archive, for which "cross-neighbour" tables should be precomputed to facilitate queries between data sets that remain geographically separated, and for which cross-matches can be performed on-the-fly with sufficient speed.
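To make the distinction concrete (again a sketch with hypothetical table names), where a cross-neighbour table has been precomputed and co-located with one of the catalogues, the same science question reduces to cheap identifier joins instead of evaluating positional geometry across geographically separated services at query time:

  public class NeighbourJoinSketch {
      // Hypothetical precomputed cross-neighbour table linking Gaia and AllWISE
      // source identifiers; joining on identifiers avoids an on-the-fly positional
      // match across the network for every query.
      static final String NEIGHBOUR_JOIN_ADQL =
            "SELECT g.source_id, g.phot_g_mean_mag, w.w1mpro "
          + "FROM gaia_source AS g "
          + "JOIN gaia_allwise_neighbour AS n ON n.gaia_source_id = g.source_id "
          + "JOIN allwise_source AS w ON w.allwise_id = n.allwise_id "
          + "WHERE g.parallax > 10";

      public static void main(String[] args) {
          System.out.println(NEIGHBOUR_JOIN_ADQL);
      }
  }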
The work will be undertaken by Read and Voutsinas of UEDIN.

Research environments such as that provided by CADC with CANFAR (http://canfar.phys.uvic.ca/) represent state-of-the-art solutions to the large and growing range of research and data mining demands being placed upon astronomical archives. CANFAR offers scientists a rich, yet bounded, environment based on virtual machines (VMs), within which a scientist can deploy the software they need for their individual research and have it run in a manner that does not risk the stability of the archive or the research of other scientists. VM images can be created and stored by individual scientists or research consortia, and deployed when, and in the numbers, necessary for the job at hand, so that the available data analysis hardware can be employed effectively, but with the flexibility needed to match the differing needs of multiple user groups.
As archives increase in size and complexity, data analysis will shift to the data centre, and the CANFAR initiative is showing how this can work in practice. Of particular relevance to this project is the recent work (https://sites.google.com/site/nickballastronomer/research/canfar_skytree) deploying the Skytree scalable data mining software within the CANFAR cloud, which has demonstrated how the provision within a data centre of such a virtualized environment can support the large-scale data mining analyses envisaged for Gaia by WP4. CANFAR is the pioneer in this domain, but further R&D work is needed to shape a system that will be suitable for Gaia: e.g. further integration with VO protocols (see T3.3 above), and creation of a more sophisticated packaging system for deployable software.
 
The work of T3.5 will centre on prototyping the deployment, configuration and enhancement of a virtualized data analysis environment for Gaia. Starting with the existing CANFAR system, it will identify best practice and requirements for further development, some of which can be prototyped within T3.5. Comparison with other solutions for Gaia analysis within the data centre will be undertaken and conclusions reported.

This work will be undertaken by Read at UEDIN.

Participants

  • Manager: N. Hambly (UEDIN)

  • Partners:
    • UEDIN: Mike Read, Stelios Voutsinas, Mark Holliman, Dave Morris
    • CSIC: Enrique Solano, Luis Sarro
    • CNRS: Jérôme Berthier
    • INAF: Riccardo Smareglia, Marco Molinaro

Meetings

European_Flag.png

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7-SPACE-2013-1) under grant agreement n°606740.

 
META FILEATTACHMENT attachment="cu9_rgm_030910.pdf" attr="h" comment="CU9 bon mots from Bob/Nige" date="1319534773" name="cu9_rgm_030910.pdf" path="D:\cu9_rgm_030910.pdf" size="45581" stream="D:\cu9_rgm_030910.pdf" tmpFilename="/usr/tmp/CGItemp39311" user="NigelHambly" version="1"

Revision 92014-07-21 - NigelHambly

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Aspects of archive system design

Line: 10 to 10
 
  • Lead beneficiary: UEDIN
  • Type of activity: RTD
Changed:
<
<
The design and technology choices made will be motivated by the real user requirements identified by WP 200 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.
>
>
The design and technology choices made will be motivated by the real user requirements identified by WP 200 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.
 

T3.1 - Technical coordination [Months: 1-42]

UEDIN

Changed:
<
<
In addition to managing the resources deployed on the other WP-300 work packages, and producing reports on those activities, this work package oversees the design and specification of all work conducted under WP-300, to ensure that it adequately addresses the requirements identified within the GENIUS project and from external sources, such as the CU9 and GREAT. The key thing here is to ensure maximum science return by enabling science exploitation through appropriate use of information technologies.

This WP also includes the assurance of compliance with the deployment of the archive at ESAC. Since the Gaia archive will be designed and run at in this centre, it is essential that the techniques and technologies prototyped in this project are consistent with what can be ultimately implemented there. An important aspect of WP-310 is to ensure the injection of the relevant requirements for this in the design and evaluation phases, and that all GENIUS system design work is tackled with full awareness of the constraints imposed by ESAC infrastructure and practice. A key deliverable is therefore a formal, documented co-ordination and interface agreement between GENIUS and the Science Archive Team (SAT) at ESAC through the CU9.

>
>
In addition to managing the resources deployed on the other WP-300 work packages, and producing reports on those activities, this work package oversees the design and specification of all work conducted under WP-300, to ensure that it adequately addresses the requirements identified within the GENIUS project and from external sources, such as the CU9 and GREAT. The key thing here is to ensure maximum science return by enabling science exploitation through appropriate use of information technologies.

This WP also includes the assurance of compliance with the deployment of the archive at ESAC. Since the Gaia archive will be designed and run at in this centre, it is essential that the techniques and technologies prototyped in this project are consistent with what can be ultimately implemented there. An important aspect of WP-310 is to ensure the injection of the relevant requirements for this in the design and evaluation phases, and that all GENIUS system design work is tackled with full awareness of the constraints imposed by ESAC infrastructure and practice. A key deliverable is therefore a formal, documented co-ordination and interface agreement between GENIUS and the Science Archive Team (SAT) at ESAC through the CU9.

  This work will be undertaken by Hambly of UEDIN.
Line: 37 to 26
  UEDIN
Changed:
<
<
The Gaia mission will produce a wide variety of data products, leading to a complex archive. A crucial issue for the exploitability of the Gaia data set is, therefore, an archive interface that supports a sufficiently rich range of functionality and is sufficiently easy to use for users to do their science with it effectively. The task of this WP is to prototype archive interface components that meet these user requirements, as developed by the CU9 and GREAT. Since any candidate archive DBMSs to be employed at ESAC support access from Java via Java Database Connectivity (JDBC), it is possible to develop archive interface prototypes independent of the backend DBMS.

UEDIN has recently been prototyping the use of Web 2.0 technologies for the delivery of an intuitive, but richlyfunctioned user interface to sky survey archives with a complicated schema, and this appears promising for Gaia: functionality like making schema information readily available to users as they develop their queries, and, even, using code completion to help write them, can make archive use much more effective.

The interface is able to offer users the ability to explore data interactively: they can execute a query, generate summary plots (e.g. scatter plots, histograms, etc), realise their query was not quite making the desired selection, and then easily tweaking the query and executing it again. This reflects the iterative method of working that scientists naturally adopt, which is clearly revealed in analyses the query logs from sky survey archives such as the WFCAM Science Archive [6], curated by UEDIN, and this iterative workflow can be made to run efficiently using a combination of client- and server-side technologies.

What is most important is that the functionality prototyped is that prioritised by scientists, and that any testbed developed here helps the user community to further refine their expressed requirements. For example, while GAP has successfully engaged the Gaia user community via a call for ‘usage scenarios’ under the auspices of GREAT (and these form the inputs to WP200), iteration of requirements with these key consumers has not been considered so far. This process will drive the further development of user interface design – e.g. in determining which additional graphical capacities to implement, and to assess how sophisticated a caching mechanism is required to support the division of datasets between the client and the server – and we propose to use the interfaces developed by this WP for an initial deployment as a testbed for the community to further assess its requirements.

>
>
The Gaia mission will produce a wide variety of data products, leading to a complex archive. A crucial issue for the exploitability of the Gaia data set is, therefore, an archive interface that supports a sufficiently rich range of functionality and is sufficiently easy to use for users to do their science with it effectively. The task of this WP is to prototype archive interface components that meet these user requirements, as developed by the CU9 and GREAT. Since any candidate archive DBMSs to be employed at ESAC support access from Java via Java Database Connectivity (JDBC), it is possible to develop archive interface prototypes independent of the backend DBMS.

UEDIN has recently been prototyping the use of Web 2.0 technologies for the delivery of an intuitive, but richlyfunctioned user interface to sky survey archives with a complicated schema, and this appears promising for Gaia: functionality like making schema information readily available to users as they develop their queries, and, even, using code completion to help write them, can make archive use much more effective.

The interface is able to offer users the ability to explore data interactively: they can execute a query, generate summary plots (e.g. scatter plots, histograms, etc), realise their query was not quite making the desired selection, and then easily tweaking the query and executing it again. This reflects the iterative method of working that scientists naturally adopt, which is clearly revealed in analyses the query logs from sky survey archives such as the WFCAM Science Archive [6], curated by UEDIN, and this iterative workflow can be made to run efficiently using a combination of client- and server-side technologies.

What is most important is that the functionality prototyped is that prioritised by scientists, and that any testbed developed here helps the user community to further refine their expressed requirements. For example, while GAP has successfully engaged the Gaia user community via a call for ‘usage scenarios’ under the auspices of GREAT (and these form the inputs to WP200), iteration of requirements with these key consumers has not been considered so far. This process will drive the further development of user interface design – e.g. in determining which additional graphical capacities to implement, and to assess how sophisticated a caching mechanism is required to support the division of datasets between the client and the server – and we propose to use the interfaces developed by this WP for an initial deployment as a testbed for the community to further assess its requirements.

  The work will be undertaken by Read (UEDIN)
Line: 72 to 40
  INAF, CNRS, UEDIN, CSIC
Changed:
<
<
The past decade has seen a huge amount of activity in defining, standardising and implementing the global ‘Virtual Observatory’. From the outset, large–scale mission data sets from ground and space were anticipated as being the cornerstone of the VO. This work has reached a level of maturity whereby most of the basic interoperability standards are in place (http://www.ivoa.net/Documents/ ) and it is possible to build project-specific services on top of them and to see where the further development of standards is needed in support of particular projects.

Our goal in WP-330 is a focused programme of VO consolidation and development work concerning server– side components (as opposed to client–side applications; see WP440) to provide the particular VO infrastructure required for Gaia exploitation. This will involve the following strands of work:

>
>
The past decade has seen a huge amount of activity in defining, standardising and implementing the global ‘Virtual Observatory’. From the outset, large–scale mission data sets from ground and space were anticipated as being the cornerstone of the VO. This work has reached a level of maturity whereby most of the basic interoperability standards are in place (http://www.ivoa.net/Documents/ ) and it is possible to build project-specific services on top of them and to see where the further development of standards is needed in support of particular projects.

Our goal in WP-330 is a focused programme of VO consolidation and development work concerning server– side components (as opposed to client–side applications; see WP440) to provide the particular VO infrastructure required for Gaia exploitation. This will involve the following strands of work:

 
  1. Assessment of compliance with VO standards (Solano, 6 sm CSIC): to test, and implement the Virtual
Changed:
<
<
Observatory standards and protocols necessary to make Gaia data fully VO compliant. We will define the list of VO standards applicable to Gaia data; implement VO standards in Gaia simulated data; and document using simulated data and IVOA standards and protocols as inputs. The main deliverable will be a specification for VO–compliant Gaia data.
>
>
Observatory standards and protocols necessary to make Gaia data fully VO compliant. We will define the list of VO standards applicable to Gaia data; implement VO standards in Gaia simulated data; and document using simulated data and IVOA standards and protocols as inputs. The main deliverable will be a specification for VO–compliant Gaia data.
 
  1. Deployment of specific web services (Berthier, 1.8 sm CNRS): the SkyBOT (http://vo.imcce.fr/webservices/
Changed:
<
<
skybot/) service suite will provide VO-compliant tools for the treatment of solar system bodies within Gaia data, while Miriade (http://vo.imcce.fr/webservices/miriade/) computes positional and physical ephemerides of known solar system bodies in a VO-compliant manner.
>
>
skybot/) service suite will provide VO-compliant tools for the treatment of solar system bodies within Gaia data, while Miriade (http://vo.imcce.fr/webservices/miriade/) computes positional and physical ephemerides of known solar system bodies in a VO-compliant manner.
 
  1. VO-Dance (Smareglia, 18 sm INAF):The VO-Dance suite provides a lightweight method of publishing data to
Changed:
<
<
the VO. Its components can be distributed as disk images to be run on a virtual machine, so we shall assess its use as a means whereby users can integrate their own datasets with Gaia data.
>
>
the VO. Its components can be distributed as disk images to be run on a virtual machine, so we shall assess its use as a means whereby users can integrate their own datasets with Gaia data.
 
  1. VOSpace (Voutsinas, 9 sm UEDIN): Support for an extension to the current VOSpace functionality so that,
Changed:
<
<
in addition to providing users with file storage space addressable by VO access protocols, they can also have database storage space on the same basis. This will provide users with a personal database facility like the SDSS MyDB systems, which they are able to address in a VO-complicant manner. For example, a user will be able to direct the result set from one VO query into their personal database, and then use it as the target for a subsequent query, possibly also involving other datasets in the VO, using the TAP Factory system of WP-340 below
>
>
in addition to providing users with file storage space addressable by VO access protocols, they can also have database storage space on the same basis. This will provide users with a personal database facility like the SDSS MyDB systems, which they are able to address in a VO-complicant manner. For example, a user will be able to direct the result set from one VO query into their personal database, and then use it as the target for a subsequent query, possibly also involving other datasets in the VO, using the TAP Factory system of WP-340 below
 

T3.4 - Data Centre Collaboration [Months: 1-42]

UEDIN

Changed:
<
<
With the Table Access Protocol (TAP http://www.ivoa.net/Documents/TAP/) the VO provides a standard means of querying tabular data sets, and with the advent of the TAP factory [8] it has become possible to execute multiple, distributed TAP queries. In a traditional IVOA TAP scenario, single TAP endpoints provide the means for VO clients to present the user with a data resource schema and then to service an ADQL query on that resource, but it is then up to further, separate client–end manipulations to join data for multiwavelength science. TAP Factory takes this further by combining TAP with the Open Grid Service Architecture Data Access Infrastructue (OGSA–DAI) middleware to provide a means of creating TAP end-points on–the–fly, and, thereby, facilitating the cross-querying of distributed resources by TAP clients.

Such a system supports one of the fundamental usage scenarios for the VO. A user can select a set of data resources published using TAP on which to execute a distributed query. From the metadata exposed by the individual TAP services, TAP Factory is able to create a new TAP endpoint on–the–fly for the distributed query and present the user with the metadata of the virtual data federation thus generated. The user can then pose a query against this virtual federation as if querying a single TAP service, and, when coupled with the MyDB–like personal database of WP-330, it enables users to create sophisticated sets of cross–catalogue queries, as required for the full exploitation of Gaia data. The key point here is that a data resource can be incorporated into a virtual federation without requiring any action on the part of the staff of the data centre that curate it; so, in the case of Gaia, it is possible for higher level services like these to be developed and deployed, without requiring any action from (or placing any obligations on) the staff at ESAC.

A basic prototype of this system has been produced by UEDIN, but it needs further development in several related regards before it is capable of supporting the scientific exploitation of Gaia. Firstly, the efficiency with which the system can execute a distributed query over the virtual federation constructed by TAP Factory depends on the metadata available to OGSA-DAI’s Distributed Query Processor (DQP) for the purposes of constructing a good query execution plan. For example, if DQP knows the distribution of values of the attributes used in join clauses in the distributed query, it can make an informed decision about how best to move data in executing the query, and whether to perform any server-side pre-processing before doing so. Taking full advantage of these capabilities will require an extension to the TAP standard, to expand the range of metadata exposed by a TAP service, and this can be best progressed through the IVOA standardisation process by the demonstration of powerful prototypes performing realistic science analyses.

The efficiency of the distributed queries can be improved further by collaboration between data centres. A naive spatial cross-match query executed between distributed multi–TB data sets will remain expensive, given network speeds, but several strategies exist that can ameliorate this situation and this work package will assess, through quantitative analysis – and, where possible, direct experimentation – the optimal configuration of the multiwavelength datasets required for the scientific exploitation of Gaia. For example, to determine which external catalogues should be co-located with a copy of the Gaia archive, for which should “cross-neighbour” tables be precomputed to facilitate queries between data sets that remain geographically separated, and for which can crossmatches be performed on-the-fly with sufficient speed.

>
>
With the Table Access Protocol (TAP; http://www.ivoa.net/Documents/TAP/) the VO provides a standard means of querying tabular data sets, and with the advent of TAP Factory [8] it has become possible to execute queries that span multiple, distributed TAP services. In a traditional IVOA TAP scenario, single TAP endpoints provide the means for VO clients to present the user with a data resource schema and then to service an ADQL query on that resource, but it is then up to further, separate client–end manipulations to join data for multiwavelength science. TAP Factory takes this further by combining TAP with the Open Grid Services Architecture Data Access Infrastructure (OGSA–DAI) middleware to provide a means of creating TAP endpoints on–the–fly and, thereby, facilitating the cross-querying of distributed resources by TAP clients.

Such a system supports one of the fundamental usage scenarios for the VO. A user can select a set of data resources published using TAP on which to execute a distributed query. From the metadata exposed by the individual TAP services, TAP Factory is able to create a new TAP endpoint on–the–fly for the distributed query and present the user with the metadata of the virtual data federation thus generated. The user can then pose a query against this virtual federation as if querying a single TAP service, and, when coupled with the MyDB–like personal database of WP-330, this enables users to create sophisticated sets of cross–catalogue queries, as required for the full exploitation of Gaia data. The key point here is that a data resource can be incorporated into a virtual federation without requiring any action on the part of the staff of the data centre that curates it; so, in the case of Gaia, it is possible for higher-level services like these to be developed and deployed without requiring any action from (or placing any obligations on) the staff at ESAC.

A basic prototype of this system has been produced by UEDIN, but it needs further development in several related regards before it is capable of supporting the scientific exploitation of Gaia. Firstly, the efficiency with which the system can execute a distributed query over the virtual federation constructed by TAP Factory depends on the metadata available to OGSA-DAI’s Distributed Query Processor (DQP) for the purposes of constructing a good query execution plan. For example, if DQP knows the distribution of values of the attributes used in join clauses in the distributed query, it can make an informed decision about how best to move data in executing the query, and whether to perform any server-side pre-processing before doing so. Taking full advantage of these capabilities will require an extension to the TAP standard, to expand the range of metadata exposed by a TAP service, and this can be best progressed through the IVOA standardisation process by the demonstration of powerful prototypes performing realistic science analyses.

The efficiency of the distributed queries can be improved further by collaboration between data centres. A naive spatial cross-match query executed between distributed multi–TB data sets will remain expensive, given network speeds, but several strategies exist that can ameliorate this situation, and this work package will assess, through quantitative analysis – and, where possible, direct experimentation – the optimal configuration of the multiwavelength datasets required for the scientific exploitation of Gaia. For example, it will determine which external catalogues should be co-located with a copy of the Gaia archive, for which “cross-neighbour” tables should be precomputed to facilitate queries between data sets that remain geographically separated, and for which cross-matches can be performed on-the-fly with sufficient speed.

  The work will be undertaken by Read and Voutsinas of UEDIN.
Line: 151 to 71
  UEDIN
Changed:
<
<
Research environments such as that provided by CADC with CANFAR (http://canfar.phys.uvic.ca/) represent state-of-the-art solutions to the large and growing range of research and data mining demands being placed upon astronomical archives. CANFAR offers scientists a rich, yet bounded, environment based on virtual machines (VMs), within which a scientist can deploy the software they need for their individual research and have it run in a manner that does not risk the stability of the archive or the research of other scientists. VM images can be created and stored by individual scientists or research consortia, and deployed when, and in the numbers, necessary for the job at hand, so that the available data analysis hardware can be employed effectively, but with the flexibility needed to match the differing needs of multiple user groups.

As archives increase in size and complexity, data analysis will shift to the data centre, and the CANFAR initiative is showing how this can work in practice. Of particular relevance to this project is the recent work (https://sites.google.com/site/nickballastronomer/research/canfar_skytree) deploying the Skytree scalable data mining software within the CANFAR cloud, which has demonstrated how the provision of such a virtualized environment within a data centre can support the large-scale data mining analyses envisaged for Gaia by WP-400. CANFAR is the pioneer in this domain, but further R&D work is needed to shape a system that will be suitable for Gaia: e.g. further integration with VO protocols (see WP-330 above), and creation of a more sophisticated packaging system for deployable software.

The work of WP-350 will centre on prototyping the deployment, configuration and enhancement of a virtualized data analysis environment for Gaia. Starting with the existing CANFAR system, it will identify best practice and requirements for further development, some of which can be prototyped within WP-350. Comparison with other solutions for Gaia analysis within the data centre will be undertaken and conclusions reported.

>
>
Research environments such as that provided by CADC with CANFAR (http://canfar.phys.uvic.ca/) represent state-of-the-art solutions to the large and growing range of research and data mining demands being placed upon astronomical archives. CANFAR offers scientists a rich, yet bounded, environment based on virtual machines (VMs), within which a scientist can deploy the software they need for their individual research and have it run in a manner that does not risk the stability of the archive or the research of other scientists. VM images can be created and stored by individual scientists or research consortia, and deployed when, and in the numbers, necessary for the job at hand, so that the available data analysis hardware can be employed effectively, but with the flexibility needed to match the differing needs of multiple user groups.

As archives increase in size and complexity, data analysis will shift to the data centre, and the CANFAR initiative is showing how this can work in practice. Of particular relevance to this project is the recent work (https://sites.google.com/site/nickballastronomer/research/canfar_skytree) deploying the Skytree scalable data mining software within the CANFAR cloud, which has demonstrated how the provision of such a virtualized environment within a data centre can support the large-scale data mining analyses envisaged for Gaia by WP-400. CANFAR is the pioneer in this domain, but further R&D work is needed to shape a system that will be suitable for Gaia: e.g. further integration with VO protocols (see WP-330 above), and creation of a more sophisticated packaging system for deployable software.

The work of WP-350 will centre on prototyping the deployment, configuration and enhancement of a virtualized data analysis environment for Gaia. Starting with the existing CANFAR system, it will identify best practice and requirements for further development, some of which can be prototyped within WP-350. Comparison with other solutions for Gaia analysis within the data centre will be undertaken and conclusions reported.

  This work will be undertaken by Read at UEDIN.

Participants

Line: 183 to 86
 
    • CNRS
    • INAF
Added:
>
>

Meetings

 
META FILEATTACHMENT attachment="cu9_rgm_030910.pdf" attr="h" comment="CU9 bon mots from Bob/Nige" date="1319534773" name="cu9_rgm_030910.pdf" path="D:\cu9_rgm_030910.pdf" size="45581" stream="D:\cu9_rgm_030910.pdf" tmpFilename="/usr/tmp/CGItemp39311" user="NigelHambly" version="1"

Revision 8 2014-04-29 - LolaBalaguer

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Aspects of archive system design

Description

Changed:
<
<
The objective of this workpackage is to design, prototype and develop aspects of the archive infrastructure needed for the scientific exploitation of Gaia data. The design and technology choices made will be motivated by the real user requirements identified by WP 200 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.
>
>
The objective of this workpackage is to design, prototype and develop aspects of the archive infrastructure needed for the scientific exploitation of Gaia data.
 
Added:
>
>
  • WP3 - Aspects of archive system design [Months: 1-42]
  • Lead beneficiary: UEDIN
  • Type of activity: RTD

The design and technology choices made will be motivated by the real user requirements identified by WP 200 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.

T3.1 - Technical coordination [Months: 1-42]

UEDIN

In addition to managing the resources deployed on the other WP-300 work packages, and producing reports on those activities, this work package oversees the design and specification of all work conducted under WP-300, to ensure that it adequately addresses the requirements identified within the GENIUS project and from external sources, such as the CU9 and GREAT. The key aim is to maximise the science return by enabling science exploitation through appropriate use of information technologies.

This WP also includes the assurance of compliance with the deployment of the archive at ESAC. Since the Gaia archive will be designed and run at this centre, it is essential that the techniques and technologies prototyped in this project are consistent with what can be ultimately implemented there. An important aspect of WP-310 is to ensure the injection of the relevant requirements for this in the design and evaluation phases, and that all GENIUS system design work is tackled with full awareness of the constraints imposed by ESAC infrastructure and practice. A key deliverable is therefore a formal, documented co-ordination and interface agreement between GENIUS and the Science Archive Team (SAT) at ESAC through the CU9.

This work will be undertaken by Hambly of UEDIN.

T3.2 - Aspects of archive interface design [Months: 1-42]

UEDIN

The Gaia mission will produce a wide variety of data products, leading to a complex archive. A crucial issue for the exploitability of the Gaia data set is, therefore, an archive interface that supports a sufficiently rich range of functionality and is sufficiently easy to use for users to do their science with it effectively. The task of this WP is to prototype archive interface components that meet these user requirements, as developed by the CU9 and GREAT. Since all candidate archive DBMSs to be employed at ESAC support access from Java via Java Database Connectivity (JDBC), it is possible to develop archive interface prototypes independently of the backend DBMS.
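
To make the DBMS-independence point concrete, the following is a minimal sketch of the shape such a prototype could take; the connection URL, credentials and the gaia_source table and column names are illustrative assumptions rather than the actual archive schema, and only the standard JDBC API is relied upon. Swapping the backend would, in principle, require only a different JDBC driver and connection URL.

<verbatim>
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ArchiveQueryPrototype {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL; any DBMS with a JDBC driver on the classpath would do.
        String url = "jdbc:postgresql://archive.example.org/gaia";
        String sql = "SELECT source_id, ra, dec, phot_g_mean_mag FROM gaia_source "
                   + "WHERE phot_g_mean_mag < ?";
        try (Connection con = DriverManager.getConnection(url, "reader", "secret");
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setDouble(1, 12.0); // magnitude limit for the selection
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%d %.6f %.6f %.3f%n",
                            rs.getLong("source_id"), rs.getDouble("ra"),
                            rs.getDouble("dec"), rs.getDouble("phot_g_mean_mag"));
                }
            }
        }
    }
}
</verbatim>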

UEDIN has recently been prototyping the use of Web 2.0 technologies for the delivery of an intuitive but richly functional user interface to sky survey archives with a complicated schema, and this appears promising for Gaia: functionality like making schema information readily available to users as they develop their queries, and even using code completion to help write them, can make archive use much more effective.
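
As an illustration of where such schema metadata could come from, the sketch below queries the TAP_SCHEMA tables that the TAP standard requires every service to expose; the service URL is a placeholder, not a real endpoint, and the response is simply printed rather than parsed.

<verbatim>
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SchemaBrowserFeed {
    public static void main(String[] args) throws Exception {
        // Hypothetical TAP synchronous endpoint for the archive.
        String tapSync = "http://archive.example.org/tap/sync";
        // TAP_SCHEMA is mandated by the TAP standard; a schema browser or
        // code-completion widget could be fed directly from this query.
        String adql = "SELECT table_name, description FROM TAP_SCHEMA.tables";
        String form = "REQUEST=doQuery&LANG=ADQL&FORMAT=votable&QUERY="
                + URLEncoder.encode(adql, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(tapSync))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // VOTable listing the queryable tables
    }
}
</verbatim>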

The interface is able to offer users the ability to explore data interactively: they can execute a query, generate summary plots (e.g. scatter plots, histograms, etc.), realise their query was not quite making the desired selection, and then easily tweak the query and execute it again. This reflects the iterative method of working that scientists naturally adopt, which is clearly revealed in analyses of the query logs from sky survey archives such as the WFCAM Science Archive [6], curated by UEDIN, and this iterative workflow can be made to run efficiently using a combination of client- and server-side technologies.

What is most important is that the functionality prototyped is that prioritised by scientists, and that any testbed developed here helps the user community to further refine their expressed requirements. For example, while GAP has successfully engaged the Gaia user community via a call for ‘usage scenarios’ under the auspices of GREAT (and these form the inputs to WP200), iteration of requirements with these key consumers has not been considered so far. This process will drive the further development of user interface design – e.g. in determining which additional graphical capabilities to implement, and in assessing how sophisticated a caching mechanism is required to support the division of datasets between the client and the server – and we propose to use the interfaces developed by this WP for an initial deployment as a testbed for the community to further assess its requirements.

The work will be undertaken by Read (UEDIN).

T3.3 - VO infrastructure [Months: 1-42]

INAF, CNRS, UEDIN, CSIC

The past decade has seen a huge amount of activity in defining, standardising and implementing the global ‘Virtual Observatory’. From the outset, large–scale mission data sets from ground and space were anticipated as being the cornerstone of the VO. This work has reached a level of maturity whereby most of the basic interoperability standards are in place (http://www.ivoa.net/Documents/) and it is possible to build project-specific services on top of them and to see where the further development of standards is needed in support of particular projects.

Our goal in WP-330 is a focused programme of VO consolidation and development work concerning server–side components (as opposed to client–side applications; see WP440) to provide the particular VO infrastructure required for Gaia exploitation. This will involve the following strands of work:

1. Assessment of compliance with VO standards (Solano, 6 sm CSIC): to test and implement the Virtual Observatory standards and protocols necessary to make Gaia data fully VO compliant. We will define the list of VO standards applicable to Gaia data; implement VO standards in Gaia simulated data; and document using simulated data and IVOA standards and protocols as inputs. The main deliverable will be a specification for VO–compliant Gaia data.

2. Deployment of specific web services (Berthier, 1.8 sm CNRS): the SkyBOT (http://vo.imcce.fr/webservices/skybot/) service suite will provide VO-compliant tools for the treatment of solar system bodies within Gaia data, while Miriade (http://vo.imcce.fr/webservices/miriade/) computes positional and physical ephemerides of known solar system bodies in a VO-compliant manner.

3. VO-Dance (Smareglia, 18 sm INAF): The VO-Dance suite provides a lightweight method of publishing data to the VO. Its components can be distributed as disk images to be run on a virtual machine, so we shall assess its use as a means whereby users can integrate their own datasets with Gaia data.

4. VOSpace (Voutsinas, 9 sm UEDIN): Support for an extension to the current VOSpace functionality so that, in addition to providing users with file storage space addressable by VO access protocols, they can also have database storage space on the same basis. This will provide users with a personal database facility like the SDSS MyDB systems, which they are able to address in a VO-compliant manner. For example, a user will be able to direct the result set from one VO query into their personal database, and then use it as the target for a subsequent query, possibly also involving other datasets in the VO, using the TAP Factory system of WP-340 below (a sketch of this two-step usage follows this list).
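
The sketch below illustrates the envisaged two-step usage. The user_jdoe schema, the twomass.psc table name and the mechanism by which the first result set is materialised server-side are all assumptions made for illustration only; just the plain ADQL syntax is standard.

<verbatim>
public class MyDbWorkflowSketch {
    public static void main(String[] args) {
        // Step 1: a selection whose result set is (assumed to be) stored server-side
        // by the personal-database service as user_jdoe.bright_sample.
        String step1 =
            "SELECT source_id, ra, dec, phot_g_mean_mag " +
            "FROM gaia_source WHERE phot_g_mean_mag < 12";

        // Step 2: a later query can then treat that stored table like any other
        // TAP-published resource, e.g. joining it to another VO dataset.
        String step2 =
            "SELECT b.source_id, t.jmag " +
            "FROM user_jdoe.bright_sample AS b " +
            "JOIN twomass.psc AS t " +
            "ON 1=CONTAINS(POINT('ICRS', b.ra, b.dec), " +
            "              CIRCLE('ICRS', t.ra, t.dec, 2.0/3600.0))";

        System.out.println(step1);
        System.out.println(step2);
    }
}
</verbatim>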

T3.4 - Data Centre Collaboration [Months: 1-42]

UEDIN

With the Table Access Protocol (TAP; http://www.ivoa.net/Documents/TAP/) the VO provides a standard means of querying tabular data sets, and with the advent of TAP Factory [8] it has become possible to execute queries that span multiple, distributed TAP services. In a traditional IVOA TAP scenario, single TAP endpoints provide the means for VO clients to present the user with a data resource schema and then to service an ADQL query on that resource, but it is then up to further, separate client–end manipulations to join data for multiwavelength science. TAP Factory takes this further by combining TAP with the Open Grid Services Architecture Data Access Infrastructure (OGSA–DAI) middleware to provide a means of creating TAP endpoints on–the–fly and, thereby, facilitating the cross-querying of distributed resources by TAP clients.

Such a system supports one of the fundamental usage scenarios for the VO. A user can select a set of data resources published using TAP on which to execute a distributed query. From the metadata exposed by the individual TAP services, TAP Factory is able to create a new TAP endpoint on–the–fly for the distributed query and present the user with the metadata of the virtual data federation thus generated. The user can then pose a query against this virtual federation as if querying a single TAP service, and, when coupled with the MyDB–like personal database of WP-330, this enables users to create sophisticated sets of cross–catalogue queries, as required for the full exploitation of Gaia data. The key point here is that a data resource can be incorporated into a virtual federation without requiring any action on the part of the staff of the data centre that curates it; so, in the case of Gaia, it is possible for higher-level services like these to be developed and deployed without requiring any action from (or placing any obligations on) the staff at ESAC.
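
A hedged sketch of how such a federated query might be posed: the endpoint URL, the federation identifier and the table names (gaia.gaia_source, wise.allwise) are hypothetical, but the submission itself uses only the standard TAP synchronous query parameters, exactly as for a single TAP service.

<verbatim>
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class FederatedTapQuery {
    public static void main(String[] args) throws Exception {
        // Hypothetical TAP endpoint created by TAP Factory for one virtual federation.
        String federationSync = "http://factory.example.org/federation-42/tap/sync";
        String adql =
            "SELECT g.source_id, g.parallax, x.w1mag " +
            "FROM gaia.gaia_source AS g " +   // resource published by one data centre
            "JOIN wise.allwise AS x " +       // resource published by another
            "ON 1=CONTAINS(POINT('ICRS', g.ra, g.dec), " +
            "              CIRCLE('ICRS', x.ra, x.dec, 1.0/3600.0)) " +
            "WHERE g.parallax > 10";
        String url = federationSync + "?REQUEST=doQuery&LANG=ADQL&QUERY="
                + URLEncoder.encode(adql, StandardCharsets.UTF_8);
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // VOTable with the joined rows
    }
}
</verbatim>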

A basic prototype of this system has been produced by UEDIN, but it needs further development in several related regards before it is capable of supporting the scientific exploitation of Gaia. Firstly, the efficiency with which the system can execute a distributed query over the virtual federation constructed by TAP Factory depends on the metadata available to OGSA-DAI’s Distributed Query Processor (DQP) for the purposes of constructing a good query execution plan. For example, if DQP knows the distribution of values of the attributes used in join clauses in the distributed query, it can make an informed decision about how best to move data in executing the query, and whether to perform any server-side pre-processing before doing so. Taking full advantage of these capabilities will require an extension to the TAP standard, to expand the range of metadata exposed by a TAP service, and this can be best progressed through the IVOA standardisation process by the demonstration of powerful prototypes performing realistic science analyses.
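
The toy sketch below does not use OGSA-DAI's actual interfaces; it simply illustrates the kind of data-movement decision that such statistics would enable: if the filtered rows on one side of a join are few, it is cheaper to ship them to the site holding the other table than to move bulk data or partition both sides.

<verbatim>
public class JoinPlacementSketch {

    enum Strategy { SHIP_LEFT_TO_RIGHT, SHIP_RIGHT_TO_LEFT, HASH_PARTITION_BOTH }

    /** Estimated rows surviving each side's local WHERE clause, as would be
     *  reported by (hypothetical) extended TAP metadata. */
    static Strategy choose(long leftRowsAfterFilter, long rightRowsAfterFilter) {
        final long broadcastThreshold = 5_000_000L; // tunable, illustrative only
        if (leftRowsAfterFilter <= broadcastThreshold
                && leftRowsAfterFilter <= rightRowsAfterFilter) {
            return Strategy.SHIP_LEFT_TO_RIGHT;
        }
        if (rightRowsAfterFilter <= broadcastThreshold) {
            return Strategy.SHIP_RIGHT_TO_LEFT;
        }
        return Strategy.HASH_PARTITION_BOTH; // neither side is cheap to move whole
    }

    public static void main(String[] args) {
        // e.g. a bright-star Gaia selection joined against a much larger external catalogue
        System.out.println(choose(2_000_000L, 500_000_000L)); // SHIP_LEFT_TO_RIGHT
    }
}
</verbatim>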

The efficiency of the distributed queries can be improved further by collaboration between data centres. A naive spatial cross-match query executed between distributed multi–TB data sets will remain expensive, given network speeds, but several strategies exist that can ameliorate this situation, and this work package will assess, through quantitative analysis – and, where possible, direct experimentation – the optimal configuration of the multiwavelength datasets required for the scientific exploitation of Gaia. For example, it will determine which external catalogues should be co-located with a copy of the Gaia archive, for which “cross-neighbour” tables should be precomputed to facilitate queries between data sets that remain geographically separated, and for which cross-matches can be performed on-the-fly with sufficient speed.
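
The contrast can be made concrete with two formulations of the same science query (table and column names are illustrative, not the actual archive schema): an on-the-fly positional cross-match, and the equivalent join through a precomputed cross-neighbour table.

<verbatim>
public class CrossMatchStrategies {
    // (a) On-the-fly positional cross-match: flexible, but expensive when the
    //     two tables live in different data centres.
    static final String ON_THE_FLY =
        "SELECT g.source_id, e.jmag " +
        "FROM gaia_source AS g JOIN ext.catalogue AS e " +
        "ON 1=CONTAINS(POINT('ICRS', g.ra, g.dec), " +
        "              CIRCLE('ICRS', e.ra, e.dec, 1.0/3600.0))";

    // (b) Precomputed "cross-neighbour" table: the pairing has been done once,
    //     server-side, so the science query reduces to cheap key joins.
    static final String VIA_NEIGHBOUR_TABLE =
        "SELECT g.source_id, e.jmag " +
        "FROM gaia_source AS g " +
        "JOIN gaia_ext_neighbourhood AS n ON n.gaia_source_id = g.source_id " +
        "JOIN ext.catalogue AS e ON e.ext_id = n.ext_id";

    public static void main(String[] args) {
        System.out.println(ON_THE_FLY);
        System.out.println(VIA_NEIGHBOUR_TABLE);
    }
}
</verbatim>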

The work will be undertaken by Read and Voutsinas of UEDIN.

T3.5 - Cloud-based research and data mining environments [Months: 1-42]

UEDIN

Research environments such as that provided by CADC with CANFAR (http://canfar.phys.uvic.ca/) represent state-of-the-art solutions to the large and growing range of research and data mining demands being placed upon astronomical archives. CANFAR offers scientists a rich, yet bounded, environment based on virtual machines (VMs), within which a scientist can deploy the software they need for their individual research and have it run in a manner that does not risk the stability of the archive or the research of other scientists. VM images can be created and stored by individual scientists or research consortia, and deployed when, and in the numbers, necessary for the job at hand, so that the available data analysis hardware can be employed effectively, but with the flexibility needed to match the differing needs of multiple user groups.

As archives increase in size and complexity, data analysis will shift to the data centre, and the CANFAR initiative is showing how this can work in practice. Of particular relevance to this project is the recent work (https://sites.google.com/site/nickballastronomer/research/canfar_skytree) deploying the Skytree scalable data mining software within the CANFAR cloud, which has demonstrated how the provision of such a virtualized environment within a data centre can support the large-scale data mining analyses envisaged for Gaia by WP-400. CANFAR is the pioneer in this domain, but further R&D work is needed to shape a system that will be suitable for Gaia: e.g. further integration with VO protocols (see WP-330 above), and creation of a more sophisticated packaging system for deployable software.

The work of WP-350 will centre on prototyping the deployment, configuration and enhancement of a virtualized data analysis environment for Gaia. Starting with the existing CANFAR system, it will identify best practice and requirements for further development, some of which can be prototyped within WP-350. Comparison with other solutions for Gaia analysis within the data centre will be undertaken and conclusions reported.

This work will be undertaken by Read at UEDIN.

 

Participants

  • Manager: N. Hambly (Edinburgh)
  • Partners:

Revision 7 2013-01-30 - SurinyeOlarte

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Aspects of archive system design

Description

Changed:
<
<
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the exploration of technologies, development and implementation of demonstration elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the demonstration system(s) should support the advanced tools and activities produced in the rest of the work packages. Specifically, the design should support the Grand Challenges outlined in WP 200 that will require complex and massive queries. The activity should culminate in a full working implementation of a Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users, always keeping in mind the final goal of installing the Gaia archive at ESAC. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
>
>
The objective of this workpackage is to design, prototype and develop aspects of the archive infrastructure needed for the scientific exploitation of Gaia data. The design and technology choices made will be motivated by the real user requirements identified by WP 200 – in particular, the massive, complex queries defined by the Grand Challenges – and by other initiatives, such as the GREAT project, and will be made with full recognition of the constraints imposed by the ESAC archive system, with which it must interface effectively. Prototypes will be prepared and tested in cooperation with the end user community and with the ESAC science archive team through the DPAC CU9. A core principle will be the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with the other external datasets needed to release the full scientific potential of Gaia.
 

Participants

  • Manager: N. Hambly (Edinburgh)
  • Partners:
    • UEDIN
Changed:
<
<
    • INTA
>
>
    • CSIC
 
    • CNRS
Deleted:
<
<
    • CESCA
 
    • INAF
Changed:
<
<
    • FFCUL
>
>
 
META FILEATTACHMENT attachment="cu9_rgm_030910.pdf" attr="h" comment="CU9 bon mots from Bob/Nige" date="1319534773" name="cu9_rgm_030910.pdf" path="D:\cu9_rgm_030910.pdf" size="45581" stream="D:\cu9_rgm_030910.pdf" tmpFilename="/usr/tmp/CGItemp39311" user="NigelHambly" version="1"

Revision 6 2011-11-24 - XaviLuri

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<

300 - Archive System Design

>
>

300 - Aspects of archive system design

 
Changed:
<
<
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the study of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of the work packages. Specifically, its design should support the Grand Challenges outlined in WP 200, which will require complex and massive queries. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
>
>

Description

 
Changed:
<
<

Workpackage Breakdown Structure (draft)

>
>
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the exploration of technologies, development and implementation of demonstration elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the demonstration system(s) should support the advanced tools and activities produced in the rest of the work packages. Specifically, the design should support the Grand Challenges outlined in WP 200 that will require complex and massive queries. The activity should culminate in a full working implementation of a Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users, always keeping in mind the final goal of installing the Gaia archive at ESAC. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
 
Changed:
<
<

310 Management

320 Database systems evaluation

330 Query interface design

340 VO layer

350 HW considerations

  • Virtualisation/Cloud
  • Mirroring

360 Data Centre issues

  • Complementary (local) datasets: both ground-based and space-based mission data

Proposal materials

The following key points should be borne in mind in preparing the proposal case for this workpackage:

  • GAP (precursor to DPAC's CU9) have already started benchmarking DB systems and hardware configurations for a database system from the point of view of offline processing (AGIS) and also possible real-time processing (IDT), presumably with the assumption that the same solution is appropriate for an end-user archive system;
  • The GENIUS proposal must complement and enhance GAP/CU9 work, not duplicate nor replace;
  • It is way beyond the scope of GENIUS (or GAP/CU9 for that matter) to exhaustively study all possible hardware/system configurations;
  • A likely hook on which to hang this section of the proposal is to point to the success of relational technology in serving legacy surveys to the user community (SDSS, UKIDSS, VISTA...), the value of exposing standard SQL interfaces directly to the consumer, and the need to find scalable relational solutions for datasets of tens of billions of rows
  • A single one-size-fits-all solution for online/offline data processing AND end-user access is unlikely to be successful; for the end-user system, a separate and phased approach starting with a tried-and-tested solution may be more appropriate. This is where GENIUS can contribute, since DPAC/CU9 may necessarily be too focussed on the data processing.

Draft general justification text (from Bob and Nige)

(The following is taken from a proposal to the UK funding agency for CU9 work, but was eventually removed from the proposal and is unfunded)

>
>

Participants

  • Manager: N. Hambly (Edinburgh)
  • Partners:
    • UEDIN
    • INTA
    • CNRS
    • CESCA
    • INAF
    • FFCUL
 
META FILEATTACHMENT attachment="cu9_rgm_030910.pdf" attr="h" comment="CU9 bon mots from Bob/Nige" date="1319534773" name="cu9_rgm_030910.pdf" path="D:\cu9_rgm_030910.pdf" size="45581" stream="D:\cu9_rgm_030910.pdf" tmpFilename="/usr/tmp/CGItemp39311" user="NigelHambly" version="1"

Revision 5 2011-10-27 - NigelHambly

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Archive System Design

Line: 29 to 29
 
  • The GENIUS proposal must complement and enhance GAP/CU9 work, not duplicate nor replace;
  • It is way beyond the scope of GENIUS (or GAP/CU9 for that matter) to exhaustively study all possible hardware/system configurations;
  • A likely hook on which to hang this section of the proposal is to point to the success of relational technology in serving legacy surveys to the user community (SDSS, UKIDSS, VISTA...), the value of exposing standard SQL interfaces directly to the consumer, and the need to find scalable relational solutions for datasets of tens of billions of rows
Added:
>
>
  • A single one-size-fits-all solution for online/offline data processing AND end-user access is unlikely to be successful; for the end-user system, a separate and phased approach starting with a tried-and-tested solution may be more appropriate. This is where GENIUS can contribute, since DPAC/CU9 may necessarily be too focussed on the data processing.
 

Draft general justification text (from Bob and Nige)

Revision 4 2011-10-25 - NigelHambly

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Archive System Design

Deleted:
<
<
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the selection of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of the work packages. Specifically, its design should support the Grand Challenges outlined in WP 200, which will require complex and massive queries. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
Added:
>
>
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the study of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of the work packages. Specifically, its design should support the Grand Challenges outlined in WP 200, which will require complex and massive queries. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.

Workpackage Breakdown Structure (draft)

310 Management

320 Database systems evaluation

330 Query interface design

340 VO layer

350 HW considerations

  • Virtualisation/Cloud
  • Mirroring

360 Data Centre issues

  • Complementary (local) datasets: both ground-based and space-based mission data

Proposal materials

The following key points should be borne in mind in preparing the proposal case for this workpackage:

  • GAP (precursor to DPAC's CU9) have already started benchmarking DB systems and hardware configurations for a database system from the point of view of offline processing (AGIS) and also possible real-time processing (IDT), presumably with the assumption that the same solution is appropriate for an end-user archive system;
  • The GENIUS proposal must complement and enhance GAP/CU9 work, not duplicate nor replace;
  • It is way beyond the scope of GENIUS (or GAP/CU9 for that matter) to exhaustively study all possible hardware/system configurations;
  • A likely hook on which to hang this section of the proposal is to point to the success of relational technology in serving legacy surveys to the user community (SDSS, UKIDSS, VISTA...), the value of exposing standard SQL interfaces directly to the consumer, and the need to find scalable relational solutions for datasets of tens of billions of rows

Draft general justification text (from Bob and Nige)

(The following is taken from a proposal to the UK funding agency for CU9 work, but was eventually removed from the proposal and is unfunded)

META FILEATTACHMENT attachment="cu9_rgm_030910.pdf" attr="h" comment="CU9 bon mots from Bob/Nige" date="1319534773" name="cu9_rgm_030910.pdf" path="D:\cu9_rgm_030910.pdf" size="45581" stream="D:\cu9_rgm_030910.pdf" tmpFilename="/usr/tmp/CGItemp39311" user="NigelHambly" version="1"

Revision 3 2011-10-13 - XaviLuri

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Archive System Design

A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the selection of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of the work packages. Specifically, its design should support the Grand Challenges outlined in WP 200, which will require complex and massive queries. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.

Deleted:
<
<
<--  
-->

Revision 2 2011-10-12 - XaviLuri

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

300 - Archive System Design

Changed:
<
<
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the selection of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of work packages. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
>
>
A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the selection of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of the work packages. Specifically, its design should support the Grand Challenges outlined in WP 200, which will require complex and massive queries. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.
 

Revision 1 2011-10-10 - XaviLuri

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

300 - Archive System Design

A database and query engine will be at the core of the Gaia archive system. This work package is devoted to the selection of technologies, development and implementation of these elements for Gaia. The technology choices and the design of the systems should be carefully based on the real user needs, as explored and defined in WP 200. Furthermore, the designed system should support the advanced tools and activities produced in the rest of work packages. The activity should culminate in a full working implementation of the Gaia archive system. In the process, prototypes will be prepared and tested in cooperation with the end users. A relevant point is that the system should be Virtual Observatory compliant and therefore should include a VO layer and the relevant metadata.

<--  
-->
 