Significant Environment Information (SEI)


This section is extracted from deliverable D4.1, “Initial version of environment information extraction tools” [Corubolo, F. et al. (2014)]:

http://goo.gl/9Uzv9W

Please refer to the further reading list for the sources cited in square brackets [ ].


Definitions

Environment Information for a source digital object (DO) is the set of dependencies of the source DO, together with their target DOs, for any type of use.

Depending on the situation, the target of a dependency can be an existing DO, a new DO representing some extracted information, a reference to such objects, or a type of object. In every case, the target should be considered part of the environment information.

Purpose (or intended use) is one specific use or activity applied to the source DO, by a given user community.

It is possible to imagine a hierarchy of purposes, where a higher-level purpose (for example, ‘render with faithful appearance’) leads to a set of detailed purposes (such as ‘accurate colour reproduction’, ‘accurate font reproduction’, etc.).
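As an illustration of such a hierarchy, the following minimal Python sketch stores each purpose with its detailed sub-purposes and expands a high-level purpose into its most detailed ones. The dictionary layout and the function name `detailed_purposes` are assumptions made for this example and do not come from the deliverable.

```python
# Hypothetical purpose hierarchy: each purpose maps to its detailed sub-purposes.
# The purpose names are the examples used in the text above.
PURPOSE_HIERARCHY = {
    "render with faithful appearance": [
        "accurate colour reproduction",
        "accurate font reproduction",
    ],
}


def detailed_purposes(purpose, hierarchy=PURPOSE_HIERARCHY):
    """Return the most detailed purposes reachable from `purpose`.

    A purpose with no recorded sub-purposes is returned as-is.
    """
    children = hierarchy.get(purpose, [])
    if not children:
        return [purpose]
    leaves = []
    for child in children:
        leaves.extend(detailed_purposes(child, hierarchy))
    return leaves
```

Calling `detailed_purposes("render with faithful appearance")` would return the two detailed purposes listed in the hierarchy.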

Significance weight, with respect to a purpose, is a value expressing the importance of each environment information dependency for that particular purpose.

The significance weight is a property of each dependency between the source DO and the DO expressing the environment information, for a specific purpose.

The definition of SEI follows from the above:

Significant Environment Information (SEI) for a source DO, with respect to one or more given purposes, is the set of environment information qualified with significance weights.

This includes both the dependency relationship (with purpose and weights) and the information that is the target of the dependency. Less formally, the aim is to determine “more or less all you need to have” when interacting with a DO for a specific purpose, and the relative significance of each of these information units (dependencies).

Once SEI is determined for a collection of DOs, the dependencies form a graph structure, as illustrated in the figure below. DOs in the collection can be related to each other, when one DO in the collection depends on another for a specific purpose.
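The definitions above can be sketched as a small data model. This is only an illustrative sketch: the class name `Dependency`, the function `sei_for`, the identifiers in the sample data and the 0 to 1 weight scale are all assumptions made for the example, not part of the deliverable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One weighted, purpose-qualified dependency of a source DO."""
    source: str    # identifier of the source DO
    target: str    # identifier of the target (a DO, a reference, or a type)
    purpose: str   # the purpose this dependency is relevant for
    weight: float  # significance weight; the 0..1 scale is an assumption


def sei_for(source, purpose, dependencies):
    """Return the SEI of `source` for `purpose`, most significant first."""
    selected = [
        d for d in dependencies
        if d.source == source and d.purpose == purpose
    ]
    return sorted(selected, key=lambda d: d.weight, reverse=True)


# Hypothetical sample data, for illustration only.
deps = [
    Dependency("report.pdf", "font:Garamond", "accurate font reproduction", 0.9),
    Dependency("report.pdf", "icc:sRGB", "accurate colour reproduction", 0.8),
    Dependency("report.pdf", "viewer:any-pdf-reader", "accurate font reproduction", 0.4),
]

font_sei = sei_for("report.pdf", "accurate font reproduction", deps)
```

Here `font_sei` holds the two font-related dependencies, ordered by decreasing weight, which corresponds to the informal “all you need to have” for that purpose.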

 


Figure: Example of a dependency graph

SEI and preservation

This graph of significant information from the environment can serve as the basis for appraising the set of DOs that should be maintained, together with the relationships in the SEI, in order to support the use of the DO in the future (for example, by applying a simple threshold to the significance weight). Weights can be assigned both to the data and to the SEI. Combining the DO weights and propagating them through the dependency weights should make it possible to determine an optimal set of DOs to appraise.
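The text does not specify how weights are propagated, so the following Python sketch should be read as one possible interpretation only. It assumes that weights lie between 0 and 1, that a DO's weight is multiplied by the dependency weight when it is passed to the target, and that a target keeps the highest score it receives. The function names `propagate` and `appraise` and the sample identifiers are hypothetical.

```python
def propagate(do_weights, edges):
    """Propagate DO weights along weighted dependency edges.

    do_weights: dict mapping a DO identifier to its own weight.
    edges: list of (source, target, weight) tuples, weight in [0, 1].
    Assumed rule: score(target) = max(own weight, score(source) * edge weight).
    """
    for _, _, weight in edges:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("dependency weights must lie in [0, 1]")
    scores = dict(do_weights)
    changed = True
    while changed:  # repeat until no score can be increased any further
        changed = False
        for source, target, weight in edges:
            candidate = scores.get(source, 0.0) * weight
            if candidate > scores.get(target, 0.0):
                scores[target] = candidate
                changed = True
    return scores


def appraise(scores, threshold):
    """Select the DOs whose propagated score reaches the threshold."""
    return {do for do, score in scores.items() if score >= threshold}


# Hypothetical sample graph, for illustration only.
scores = propagate(
    {"report.pdf": 1.0},
    [
        ("report.pdf", "font", 0.9),
        ("font", "font-licence", 0.5),
        ("report.pdf", "viewer", 0.3),
    ],
)
selected = appraise(scores, threshold=0.4)
```

With this sample graph and a threshold of 0.4, the selection keeps the report, the font and the font licence, and drops the viewer. Other propagation rules (for instance summing contributions instead of taking the maximum) would be equally compatible with the text.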

Comparing this definition to existing definitions of significant properties (SP) of the environment, for example the one described in [PREMIS 2008], we note that our definition is aimed at collecting SEI for a DO to support the different purposes a user can have with respect to it, while the PREMIS definition concerns the significant properties of an environment in itself. The information we aim to collect consists of qualified relationships to other DOs, as opposed to properties of the environment.

One could say that SEI is a specialisation of the dependency definition, taking into account the purpose and the weight of the dependency. This means that SEI is a specialised subset of the dependencies for a DO. As mentioned before, the perspective we take is that of observing the current use of a DO in its use environment, which is often before it enters a Digital Preservation system. This makes it easier to determine which dependencies, and which targets of the dependency relationship (be they information or services), are significant for the uses of the DO.

Knowing the significant information necessary to support current uses will make it possible to cover, or at least anticipate more precisely, the long-term needs of user communities as well.