
IT infrastructure (hardware and software)

Effective data management cannot be achieved without attention to the provision of IT infrastructure.  Institutions vary greatly in their infrastructure needs, reflecting both the degree to which each institution is engaged in research and the disciplinary spread of its researchers.  Institutions will therefore have to judge which of the elements listed below apply to them.

In some cases there is a fine line to be drawn between meeting the needs of data management and enabling and encouraging researchers in their work.  For example, provision of good support for collaboration makes research more efficient; an environment which supports data discovery and reuse enables research which would not otherwise have been possible.  A well-developed data management framework around these and other activities can simplify the conduct of research and increase the benefits it brings.

Discussion of the different elements below refers to ANDS publications and other information resources. In addition, because it is not possible for ANDS to prescribe what institutions should provide, we offer a number of questions about each element.  These are designed to stimulate thinking around institutional needs and how these might be met. The questions are not exhaustive, and we encourage those creating and maintaining a data management framework to generate their own questions to assess their own institutional environment.

Storage for data and metadata

Adequate storage for both data and metadata is essential, whether it is offered by the institution itself, provided by a discipline, or outsourced to one of the many suppliers available.  Critical to any decision about storage is the need to ensure that metadata is adequately supported.
The ANDS Guide to Storage is a general introduction likely to be of interest to researchers, their support staff, data centre and repository staff and the general public.

One example of an institutional store is the Large Research Data Storage (LaRDS) at Monash University. 

Suggested questions to assess storage for data and metadata:

  • Do you have adequate storage for any given dataset? If so, is it permanent storage or temporary?
  • Are datasets backed up or archived in multiple locations?
  • If datasets and their metadata are separate, are they stored together or could they become separated? How closely are the data and metadata linked?
  • When new data storage technology becomes available, will you be able to migrate datasets to it?
  • Do datasets rely on compression or data transformation tools and standards for basic access?  If so, will there be continued access to those tools and standards?
  • Do datasets rely on proprietary tools or formats for basic access? If so, will there be continued access to those tools or formats?
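
One way to approach the question about backups in multiple locations is to compare checksums of the copies. The following is a minimal sketch only, assuming the copies are ordinary files reachable from one machine; the function names `sha256_of` and `copies_match` are hypothetical and not part of any ANDS tool.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """Return True if every listed copy has the same SHA-256 checksum."""
    checksums = {sha256_of(path) for path in paths}
    return len(checksums) == 1
```

A matching checksum only shows that the copies are identical to each other; it does not show that they are stored in independent locations, which remains a matter for institutional policy.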

Identity management, authentication & access

Suggested questions to assess identity management, authentication & access:

  • Can the IT infrastructure record and support the access restrictions needed for all datasets, e.g. passwords, AAF identity, ethics and access agreements and contracts?
  • Can the IT infrastructure support collaboration on any given dataset with researchers wherever they are located?
  • Can the IT infrastructure support the registrations needed to track and identify authenticated access?
  • Is it possible to change any access restrictions for a given dataset at some time in the future (e.g. after an embargo period has passed)?
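
The embargo question above can be illustrated with a small sketch of a date-based access rule. This is an assumption-laden example rather than a description of any real access system: the `AccessPolicy` fields and the `may_access` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AccessPolicy:
    # Hypothetical fields; a real system would record many more conditions.
    embargo_until: Optional[date] = None  # None means no embargo applies
    requires_agreement: bool = False      # e.g. an ethics or access agreement


def may_access(policy, today, has_signed_agreement=False):
    """Return True if access is allowed on the given date."""
    if policy.embargo_until is not None and today < policy.embargo_until:
        return False  # still under embargo
    if policy.requires_agreement and not has_signed_agreement:
        return False  # agreement required but not yet signed
    return True
```

Because the embargo is stored as data rather than hard-coded, the restriction lapses without anyone having to edit the dataset record when the date passes, and it can be changed later by updating a single field.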

Internal & external network connectivity

Suggested questions to assess internal & external network connectivity:

  • If a dataset is too large or expensive to transfer over internal or external networks, is there a fallback transfer mechanism (e.g. hard drives sent via courier)?
  • What data transfer costs does your institution levy on internal data transfers? What about external transfers to/from other institutions?
  • If a given dataset attracts high demand, can the IT infrastructure cope with the access and copying requests expected?
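
Whether a dataset is "too large" to transfer over the network can be estimated with simple arithmetic. The sketch below is illustrative only: the function names, the 70% link efficiency and the 48-hour courier time are assumptions, not measured or recommended values.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.7):
    """Estimate hours needed to move size_gb (decimal gigabytes) over a link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually achieved; 0.7 is a placeholder, not a measured value.
    """
    bits = size_gb * 8 * 1000**3
    seconds = bits / (link_mbps * 1_000_000 * efficiency)
    return seconds / 3600


def prefer_courier(size_gb, link_mbps, courier_hours=48.0):
    """Return True if the estimated network transfer is slower than a courier."""
    return transfer_hours(size_gb, link_mbps) > courier_hours
```

Under these assumptions, a 10 TB dataset (10,000 GB) over a 100 Mbps link would take roughly 317 hours, so a courier would be faster; over a 10 Gbps link the same transfer would take a little over 3 hours.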

Access to discipline specific tools to support analysis

Suggested questions to assess access to discipline specific tools to support analysis:

  • Does your institution make available a sufficient range of analytical tools to meet all disciplinary needs?  Can all those who need them get access?
  • If a dataset requires access to a particular tool in order to analyse it, has that tool (e.g. statistical analysis package) been archived?  Have any configuration or customization scripts developed for that tool been archived?
  • If a dataset requires access to a commercial data analysis tool, are there alternative open-source tools available for other researchers without access to the commercial tool?
  • If a dataset requires access to a specific instrument or piece of equipment and this in turn requires a specific configuration, is there provision for this to be specified in the dataset metadata?

Software development

Suggested questions to assess software development:

  • Does the institution have the capacity to develop or modify software as required? 
  • Does the institution have the expertise to recommend alternatives to time-intensive custom development or modification?
  • If a dataset is generated by custom-developed software, have that software and the documentation needed to build it been archived with the dataset?
  • If a dataset is generated by commercial or off-the-shelf software, have any customization or configuration settings been archived with the dataset? Are there alternative open-source tools available for other researchers without access to commercial software?
  • If a dataset requires custom or off-the-shelf software for basic access (e.g. decompression), have any settings or customizations needed to access the data been specified with the metadata?
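
As one illustration of specifying access software with the metadata, the settings could be written to a small machine-readable record stored alongside the dataset. This is a sketch under stated assumptions: the field names, the `build_access_metadata` helper, the dataset identifier and the example tool settings are all hypothetical and do not follow any particular metadata standard.

```python
import json


def build_access_metadata(dataset_id, tool, version, settings):
    """Build a record describing the software needed for basic access.

    All field names here are illustrative, not drawn from a metadata schema.
    """
    return {
        "dataset_id": dataset_id,
        "access_software": {
            "name": tool,
            "version": version,
            "settings": settings,
        },
    }


# Example record for a hypothetical dataset compressed with gzip.
record = build_access_metadata("example-001", "gzip", "1.12", {"level": 9})

# Serialise with sorted keys so the stored record is stable across runs.
text = json.dumps(record, indent=2, sort_keys=True)
```

The same pattern could carry the tool, visualisation, collaborative environment or HPC requirements raised in the other question lists, provided the institution's metadata schema has somewhere to hold them.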

Visualisation

Suggested questions to assess visualisation:

  • Does the institution make available a sufficient range of visualisation tools, or access to them, to meet all requirements?
  • If a dataset requires supporting visualisations, are the visualisation assets and tools archived with the dataset?
  • Are there facilities to record any visualisation settings or configuration values with the dataset metadata?
  • If the supporting visualisations require access to high performance computing (HPC) resources to repeat or view, are there facilities to note the relevant requirements in the dataset metadata?

Collaborative environments

Suggested questions to assess collaborative environments:

  • Does the institution provide adequate collaborative environments, or access to them, to meet all needs?
  • If a dataset is stored within a collaborative environment or requires access to a collaborative environment in order to access or analyse it, are there facilities to store the environment setup and configuration in the dataset metadata?  Can any customized software components needed to use the collaborative environment be archived with it?
  • If a dataset requires access to a commercial collaborative environment, are there alternative open-source tools available for other researchers without access to the commercial environment?
  • Is it possible to move a dataset to a newer or different collaborative environment if required for future additions or expansions?

High performance computing (HPC)

Suggested questions to assess high performance computing:

  • If a dataset requires access to HPC facilities in order to access or analyse it, is it possible to specify the setup and configuration of those facilities in the dataset metadata?
  • Is it possible to archive any customized software components needed to use the HPC facilities?
  • If a dataset requires access to a commercial HPC tool, are there alternative open-source tools available for other researchers without access to the commercial tool?
  • As desktop system capacities grow, are there mechanisms to allow data to be shifted from HPC facilities to desktop systems?

Feedback

ANDS is keen to have feedback on these materials.  Please address any comments or suggestions to guides@ands.org.au.