This is a static archive of the previous Open Grid Forum Redmine content management system saved from host redmine.ogf.org file /projects/ur-wg/wiki/UseCases/annotate/1 at Fri, 04 Nov 2022 15:15:38 GMT UR WG - Open Grid Forum

UseCases

Version 1 (Jon Kerr Nilsen, 08/08/2012 04:05 AM)

h1. Use Cases and Requirements

h2. Use Cases:

# I would like to see the number of accesses to a given file/collection during a time window so that I can determine whether:
#* A: The file/collection is hot and could be a candidate for replication.
#* B: The file/collection is dead and as such is a candidate for deletion (if a replica) or archiving (if on disk and not on tape).
#* All: For the moment, the only part that should be accounted for is remote reading (as Amazon does), but that is network accounting, which is left out of scope for this first implementation of the UR.
#* Needs more discussion.
# I would like to be able to see where files/collections are being accessed from.
#* - This would allow me to identify files/collections which would be candidates for replication.
#* - This may also help when I need to justify the existence of my storage resource (if there is heavy public access, I can show that my data is of wide interest, etc.).
#* All: This seems to be more a matter of monitoring and might be left out of scope.
# If I am a VO and I reserve a block of storage, then:
#* that block is viewed by the resource provider as used;
#* that block is viewed as available for the VO to use.
#** When a user places data in that storage block, the space is used twice:
#*** once in the context of the user using the VO's space;
#*** once in the context of the VO using the Resource's space.
#* We need to be able either to describe this resource as a virtual resource or to indicate the leaseholder of the physical resource.
#* All: If space is reserved, it cannot be used by other users and should be marked as used, but the sensor must be careful not to double count the resources already used (used and reserved resources should not overlap and be counted twice). Reserved space is also subtracted from the free resources.
# I would like to view Distributed Storage as a whole and also as distinct parts.
#* Consider a system in which storage is distributed across several physical resources (locations), similar to the NorduGrid distributed dCache.
#* A set of URs is generated that can be used to account for storage on each distinct resource.
#* A single UR for the whole storage resource could be formed from an aggregate or summary record of these distinct resources, or could be formed standalone as a separate UR.
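
The reservation and distributed-storage use cases above can be illustrated with a minimal Python sketch. It is not part of any UR specification: the @SiteUsage@ class, its field names and the sample figures are assumptions made purely for illustration. It counts a reserved block once from the provider's point of view, subtracts it from the free space, and then sums per-site figures into one summary for the whole distributed store.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SiteUsage:
    """Hypothetical per-site storage figures, all in bytes."""
    site: str
    capacity: int             # total space on the physical resource
    reserved: int             # block leased to a VO
    used_in_reservation: int  # VO user data placed inside the reserved block
    used_outside: int         # data stored outside any reservation

    @property
    def provider_used(self) -> int:
        # The whole reservation counts as used once; data placed inside it
        # is not added again, so nothing is double counted.
        return self.used_outside + self.reserved

    @property
    def free(self) -> int:
        # Reserved space is subtracted from the free resources.
        return self.capacity - self.provider_used

    @property
    def vo_available(self) -> int:
        # What the VO can still place inside its own reservation.
        return self.reserved - self.used_in_reservation


def summarise(sites: list[SiteUsage]) -> dict[str, int]:
    """Build one summary for the whole distributed store from per-site figures."""
    return {
        "capacity": sum(s.capacity for s in sites),
        "provider_used": sum(s.provider_used for s in sites),
        "free": sum(s.free for s in sites),
        "vo_available": sum(s.vo_available for s in sites),
    }


TB = 10**12
sites = [
    SiteUsage("site-a", capacity=100 * TB, reserved=40 * TB,
              used_in_reservation=10 * TB, used_outside=20 * TB),
    SiteUsage("site-b", capacity=50 * TB, reserved=0,
              used_in_reservation=0, used_outside=30 * TB),
]
summary = summarise(sites)
```

Keeping the per-site figures and deriving the summary from them means both views (distinct parts and the whole) come from the same data; a standalone summary record, as the last bullet allows, would need its own measurement instead.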

---

h2. ALL OK:

---

h2. Requirements:

h3. General

# Each UR must have a unique identity.
#* This holds even if the same information is queried at a different point in time.
#* All: yes (done)
# Each UR must have a timestamp (creation time).
#* All: yes (done)
# The UR should provide a means to identify the system concerned (CREAM-pbs, CREAM-LSF, dCache, StoRM, etc.).
#* All: yes
# The UR should identify the author of the XML record, i.e. the sensor or aggregator that produced it (if the records are aggregated, system and sensor properties might be lost).
#* All: yes
# The UR should be usable in a global and/or a local context.
#* All: yes
# The UR should allow the user/project to be identified in a global and/or local context.
#* All: yes
# The UR should allow the subject identity to be defined at a level of granularity that reflects that of the user/project wishing to consume/use the record.
#* - This will require a good understanding of the storage system and the AAI in place.
#* This should be coverable using a profile; nothing stops you from creating a record per file.
#** <sr:GroupAttribute sr:attributeType="authority">
#** /O=Grid/OU=example.org/CN=host/auth.example.org
#** </sr:GroupAttribute>
#** With this definition we only need a GlobalGroupAttribute.
#* All: yes.
# We must define how Aggregate and Summary records can be produced from the URs.
#* - These need to provide aggregates/summaries across storage systems, users, groups...
#* All: yes
# The UR must allow the system on which the resources were consumed to be identified (e.g. hostname, URI).
#* All: yes
# The UR must allow the software with which the resources were consumed to be identified (e.g. batch system or storage system name and version).
#* All: yes
# The UR sensor or summariser should be identified (e.g. sensor name and version).
#* All: yes
# The UR producer or summariser should be identified (e.g. sensor instance URI).
#* All: yes
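
As a rough illustration of the identification requirements above, the following Python sketch builds a record header in which every record gets a fresh identity and creation timestamp, even when the same system is queried twice. The @RecordIdentity@ class, its field names and the example values are hypothetical and are not taken from the UR schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class RecordIdentity:
    """Hypothetical identification block for a usage record."""
    system: str        # where the resources were consumed, e.g. hostname or URI
    software: str      # e.g. storage system name and version
    sensor: str        # e.g. sensor name and version
    producer_uri: str  # e.g. sensor instance URI
    # A new identity and creation time are generated for every record.
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    create_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


# Placeholder values, for illustration only.
common = dict(
    system="se01.example.org",
    software="dCache (version placeholder)",
    sensor="example-sensor 0.1",
    producer_uri="https://se01.example.org/sensor",
)
first = RecordIdentity(**common)
second = RecordIdentity(**common)
```

Here @first@ and @second@ describe the same system but carry different @record_id@ values, which is the behaviour the first requirement asks for.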

h3. Storage

# The UR should be able to provide information about storage usage during a given time window (a best estimate of the mean over the time window).
#* - As such, a start-time and an end-time would be required.
#* All: yes. Start-time, end-time and creation time cover this.
# The UR should be able to provide information about storage groups.
#* - Collections of storage resources which may be spread over multiple sites and which are grouped into a logical unit.
#* JG: Not necessarily a record thing
#* RMP: Can't see any obstacle in the current record for this
#* All: yes
# The UR should be able to provide information about allocated resources, quotas, etc.
#* All: yes
# The UR should be able to provide information on the number of files (number of directories/collections) which correspond to the produced record.
#* All: No. This is out of scope because the UR is aimed at space occupancy, not at the number of files present.
# The UR should allow for file access to be reported.
#* - Number of times a file (or set of files in a directory/collection) was accessed.
#* - Location of the service/user accessing the file (equivalent in some senses to "submitHost" in the Compute record).
#* All: left out for the moment
# The UR should provide information about both logical and physical storage usage.
#* - Logical: storage volume when just file/object size is considered.
#* - Physical: storage volume when all replicas etc. are considered.
#* All: yes
# The UR should be able to provide information about the type/class of stored data.
#* - Precious, temporary, replica, pinned, etc.
#* All: yes
# The UR should be able to provide information about the directory path/collection/data set.
#* All: yes
# The UR should, wherever possible, aim to be compatible with the Compute UR, as we aim for a UR 2.0 solution.
#* All: yes, whenever possible
# The UR should be able to provide point-in-time (snapshot) information about storage usage.
#* All: This is possible using TimeDuration=0, although it might be a misuse because it is really monitoring.
# The UR should allow us to distinguish between different storage media.
#* - Disk, Tape, Compound (disk cache in front of tape)
#* All: Already in (it is down to the profile to define the details).
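
The first storage requirement asks for a best estimate of the mean usage over a time window. One possible way to compute such an estimate is sketched below in Python, assuming that the sensor has a series of timestamped usage samples and that each value holds until the next sample is taken. The function name, the sample format and the figures are assumptions for illustration only.

```python
from datetime import datetime


def time_weighted_mean(samples, start, end):
    """Mean usage over [start, end) from (timestamp, value) samples.

    Each sample value is assumed to hold until the next sample (a step
    function). At least one sample must lie at or before `start`.
    """
    samples = sorted(samples)
    if not samples or samples[0][0] > start:
        raise ValueError("need a sample at or before the window start")
    if end <= start:
        raise ValueError("window must have positive duration")
    total = 0.0
    for i, (t, value) in enumerate(samples):
        # Clip the interval covered by this sample to the window.
        seg_start = max(t, start)
        next_t = samples[i + 1][0] if i + 1 < len(samples) else end
        seg_end = min(next_t, end)
        if seg_end > seg_start:
            total += value * (seg_end - seg_start).total_seconds()
    return total / (end - start).total_seconds()


start = datetime(2012, 8, 1)
end = datetime(2012, 8, 8)
# 10 TB for the first four days, then 24 TB for the remaining three.
samples = [(datetime(2012, 8, 1), 10), (datetime(2012, 8, 5), 24)]
mean_tb = time_weighted_mean(samples, start, end)  # (4*10 + 3*24) / 7 = 16.0
```

Weighting by duration rather than averaging the raw samples keeps the result independent of how often the sensor happens to sample.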

h2. Use Cases:

# Gather storage usage information with a view to producing accounting/billing records.
#* - This should be doable for a resource and/or project and/or user.
#* - This would possibly require storage usage information such as 4 TB-weeks.
#* - With respect to accounting/billing, this may need a charge field (something equivalent to the charge field in the Compute accounting record).
#* All: yes
# Gather storage usage information and combine it with compute usage information with a view to producing accounting/billing records.
#* - This should be doable for a resource and/or project and/or user.
#* - This would possibly require storage usage information such as 4 TB-weeks.
#* - With respect to accounting/billing, this may need a charge field (something equivalent to the charge field in the Compute accounting record).
#* All: yes
# I would like to gather point-in-time storage usage information at several points in time, which would then allow me to predict future usage.
#* - This would allow me to plan for new storage purchases etc. (or to delete old data).
#* All: yes, the usage record might be used for that, but it should not be a requirement.
#* AC: The UR should be independent and include information over a period of time, not only at a point in time.
# As a project I would like to be able to view the used and unused storage space that I have on a storage resource.
#* - Thus I can see how much headroom I have.
#* All: yes
# As a project I would like to be able to view the requested storage I have on a specific resource and the allocated/reserved resources I have on that resource.
#* - Thus I can see that I asked for 100TB and currently have only 80TB at my disposal (of which I am using 50TB).
#* All: yes
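
The figures quoted in these use cases can be worked through with a short Python sketch. The function names and dictionary keys are hypothetical; the sketch only shows the arithmetic behind "headroom" (allocated minus used), the gap between requested and allocated space, and a usage figure expressed in TB-weeks (mean usage multiplied by the length of the window in weeks).

```python
from datetime import datetime, timedelta


def allocation_view(requested_tb, allocated_tb, used_tb):
    """Return how far the allocation falls short and how much is still unused."""
    return {
        "shortfall_tb": requested_tb - allocated_tb,  # asked for but not granted
        "headroom_tb": allocated_tb - used_tb,        # granted but not yet used
    }


def tb_weeks(mean_used_tb, start, end):
    """Usage as resource-time: mean TB held, multiplied by weeks elapsed."""
    weeks = (end - start) / timedelta(weeks=1)
    return mean_used_tb * weeks


# The example from the last use case: 100TB requested, 80TB allocated, 50TB used.
view = allocation_view(requested_tb=100, allocated_tb=80, used_tb=50)

# A mean of 2 TB held for 14 days corresponds to 4 TB-weeks.
window_start = datetime(2012, 8, 1)
usage = tb_weeks(2, window_start, window_start + timedelta(days=14))
```

A charge field, if one is added, could then be derived from the TB-weeks figure and a rate agreed between provider and project; the rate itself is outside the scope of this sketch.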